Overview
Starting today, our JupyterHub service is configured to allow you to build and run Docker images within your individual JupyterLab servers.
This offers several advantages for MUSES module developers:
- Your JupyterLab server continues running when you close your web browser, so long-running image builds continue remotely while you are offline.
- The software environment is consistent for everyone. The base environment is set by our (customized) JupyterLab server image, so collaborators are more likely to be able to reproduce each other's code execution.
- We have shared file storage mounted at `/home/jovyan/jupyter_share` that allows developers to share files seamlessly.
This feature should be considered a beta release. There are certainly bugs and misconfigurations we will discover, as well as ways we can improve and optimize the system to best serve our development needs. Please be vocal about your experiences and any problems you encounter so we can fine-tune it.
Technical notes
There are some quirks related to the use of rootless-docker.
UID mapping
One quirk of rootless-docker is the confusing system user ID (UID) situation. For example, suppose you want to share a folder with your container. You map the volume using the `-v` option as shown below, then try to create a text file in the `output/` folder that is shared with the container:
```
jovyan@jupyter-andrew-2emanning$ docker run --rm -it \
    -v $(pwd)/output:/home/flavor_equil/output \
    feq:dev \
    touch output/$(date +"%Y%m%d%H%M%S").txt
touch: cannot touch 'output/20230525151352.txt': Permission denied
```
This fails because the owner UIDs of mounted files and folders are not remapped the way process UIDs are. The UID of the host user running the `docker run` command is mapped to root (UID 0) inside the container. Thus, if the host user is UID 1000 and owns the files mounted into the container, those files appear to be owned by root inside the container.
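Concretely, the remapping can be sketched as a small function. The host user UID (1000) and the start of the subordinate-UID range (100000) are illustrative assumptions here; the real values come from your host user and the `/etc/subuid` entry on the host:

```shell
#!/bin/sh
# Sketch of rootless-docker's UID remapping (illustrative values only).
# Assumes the host user is UID 1000 and /etc/subuid grants a range
# starting at 100000, e.g. "jovyan:100000:65536".
host_uid() {
  cuid=$1
  host_user_uid=1000
  subuid_start=100000
  if [ "$cuid" -eq 0 ]; then
    # Container root maps to the unprivileged host user...
    echo "$host_user_uid"
  else
    # ...while container UID n maps into the subordinate range.
    echo $((subuid_start + cuid - 1))
  fi
}

host_uid 0      # container root corresponds to host UID 1000
host_uid 1000   # a container user with UID 1000 is host UID 100999, not 1000
```

This is why a file owned by host UID 1000 looks root-owned inside the container, while an in-container user with UID 1000 has no special rights to it.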
You might think the fix is to run the container as root using the `--user` option so that the container processes own the mounted files. The problem is that if your container image is configured to run as a specific UID, running it as a different UID may break it. For example, if your `Dockerfile` does something like
```
FROM ubuntu:22.04
# pip is not included in the base image
RUN apt-get update && apt-get install -y python3-pip
RUN useradd --uid=1000 --create-home clu
USER clu
RUN pip install --user some-package
```
then if you run the container with UID 0 using `docker run --user=0`, Python will complain that the package is not installed, because it was installed only under the home directory of the `clu` user (`/home/clu/.local`), which root's Python does not search.
The only way around this dilemma is to ensure that the directory shared with the container has sufficiently open write permissions (e.g. `chmod a+rwX /shared/path`) before running the container.
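As a sketch of that workaround (using a hypothetical `shared_output` directory in place of your real shared path), `a+rwX` grants read and write to all users, and execute only where it makes sense, i.e. on directories:

```shell
#!/bin/sh
# Open up a directory before mounting it into the container.
# "shared_output" stands in for the real shared path.
mkdir -p shared_output
chmod a+rwX shared_output
# The directory is now world-writable (mode 777), so whatever UID the
# container process runs as can create files in it:
stat -c '%a' shared_output   # prints 777
# docker run --rm -v "$(pwd)/shared_output:/home/flavor_equil/output" feq:dev ...
```

The capital `X` matters when applied recursively: it adds execute permission to directories (needed to traverse them) without making every regular file executable.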