Docker, part 2: The Pointless Sequel

My last blog post on Docker was about configuring a Docker image for the main system (the web server and database). This one is (mostly) about configuring a container to run user scripts.

My main revelation since I wrote my last post was learning that Docker containers are disposable things, not intended for data storage. For data persistence you need volumes: files and folders mounted into a container from outside. So I added a named volume to store the database, Django migrations and the secret key. Using a named volume means we're not dependent on the existence of a particular path on the host system to store the data, which makes it easier to run on different operating systems.
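The difference is easiest to see side by side in a compose file. This fragment is illustrative only (the host path is made up), contrasting the two mount styles:

```yaml
    services:
      web:
        image: einstein:latest
        volumes:
          # bind mount: depends on this exact path existing on the host
          - /home/alice/einstein_data:/einstein
          # named volume: Docker manages the storage location itself,
          # so the same file works on any host OS
          - einstein_storage:/einstein
    volumes:
      einstein_storage:
```

With the named-volume form, Docker creates and tracks the storage itself (`docker volume create`, `docker volume ls`), so nothing in the compose file ties you to a particular host filesystem layout.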

Running user-submitted scripts was always going to be the trickiest part of designing Einstein 2.0, so it always made sense to run them in a separate Docker container from the main one. Our original plan was to control one or more script containers from the main container using Docker's command line interface.

In practice the complexity of coordinating more than one Docker container was too much, and the overhead of running a script via docker run or docker exec was too high for speedy script execution, so an alternative method had to be found. I considered remote-shell approaches such as rsh or ssh, but ultimately opted to write an execution daemon which lives on the "dirty" container and executes user scripts directly. The execution daemon acts like a shell but communicates in JSON. The other channel of communication between the two containers is a shared temporary directory, in which the script daemon can write files for the execution daemon to execute.
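To give a flavour of the idea, here is a minimal sketch of one request/response cycle in such a daemon. The field names (`argv`, `stdin`, `timeout`) and the use of `subprocess` are my assumptions for illustration, not Einstein's actual protocol:

```python
import json
import subprocess

def handle_request(line: str) -> str:
    """Parse one JSON request, run the named script, return a JSON reply."""
    req = json.loads(line)
    proc = subprocess.run(
        req["argv"],                       # e.g. ["python3", "/einstein_temp/script.py"]
        input=req.get("stdin", ""),        # text fed to the script's stdin
        capture_output=True,
        text=True,
        timeout=req.get("timeout", 10),    # kill runaway user scripts
    )
    # Reply with everything the clean container needs to grade the run
    return json.dumps({
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "status": proc.returncode,
    })
```

A real daemon would loop over a socket or pipe, reading one JSON line per request; the key point is that the script runs in-process on the dirty container, avoiding the per-invocation cost of docker exec.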

The now single "dirty" container is much simpler than the main container. It contains just enough to run the execution daemon and whatever languages students are asked to write in - at the moment only Python and Java. Like the main container, it uses an Alpine Linux image.

    FROM alpine:3

    # install the script languages and the daemon's runtime
    RUN apk add --no-cache python3 openjdk17-jdk

    # copy scripts
    COPY *.sh *.py /usr/local/bin/
    RUN chmod 755 /usr/local/bin/*.sh /usr/local/bin/*.py

    STOPSIGNAL SIGINT

    # entry point (exec form, so SIGINT reaches init.sh
    # rather than a wrapping /bin/sh)
    CMD ["init.sh"]

Most of the action happens in the compose.yaml file, which sets the TCP port and volume options for both containers:

    services:
      clean:
        image: einstein:latest
        container_name: bohr
        volumes:
          - einstein_tmp:/einstein_temp
          - einstein_run:/var/run/execution_daemon
          - einstein_storage:/einstein
        ports:
          - 8000:80
        environment:
          - EINSTEIN_TEMP=/einstein_temp/
        restart: "no"
      dirty:
        image: einstein_script_server:latest
        container_name: niels
        volumes:
          - einstein_tmp:/einstein_temp
          - einstein_run:/var/run/execution_daemon
        environment:
          - EINSTEIN_TEMP=/einstein_temp/
        restart: "no"
    volumes:
      einstein_storage:
        external: true
      einstein_tmp:
      einstein_run:

I also use named volumes for the shared folders, einstein_tmp and einstein_run. Ideally these would be some kind of disposable temporary directory. Unfortunately, while Docker does support tmpfs mounts, they are only available on Linux and they can't be shared between containers.
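For comparison, a tmpfs mount in compose looks like the fragment below (a sketch, not part of Einstein's actual config) - but each tmpfs is private to the service that declares it, which is exactly why it can't replace the shared einstein_tmp volume:

```yaml
    services:
      dirty:
        image: einstein_script_server:latest
        tmpfs:
          # in-memory, wiped when the container stops,
          # and visible only to this one container
          - /einstein_temp
```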
