Persistent storage for local Docker containers

In my previous post we looked at what Docker is, and how to install it and run a test container locally.

If you also read the Jupyter Notebooks post, you will have noted the example of how to start up a Jupyter Notebook container on your local machine.

What happens, however, when you shut down the container and it is removed? Any data you created within it is lost! Why is that?

The data loss is actually intentional. Docker containers are designed to be ephemeral: when a container is deleted, the data stored inside it is deleted along with it.
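If you would like to see this for yourself, here is a small sketch. It assumes the busybox image; the helper name ephemeral_demo and the container name scratch-demo are made up for illustration. The first container writes a file inside itself and is then removed, and a second container started from the same image should no longer have that file.

```shell
# Sketch: data written inside a container disappears once the container is removed.
# "scratch-demo" is a hypothetical container name chosen for this example.
ephemeral_demo() {
    # Write a file into the container's own file system (no volume or bind mount).
    docker run --name scratch-demo busybox sh -c 'echo temporary > /tmp/note.txt'
    # Remove the container; everything written inside it goes with it.
    docker rm scratch-demo
    # A fresh container from the same image starts without /tmp/note.txt.
    docker run --rm busybox sh -c 'ls /tmp/note.txt 2>/dev/null || echo "note.txt is gone"'
}
```

Define the function in your shell and then run ephemeral_demo to try it.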

But what if we need some of that data to persist?

If you just want to see the commands for running a Jupyter Notebook container with persistent storage, skip to the Bind Mounts section at the end.

Docker Storage Types

As it happens, there are two primary options for containers to store files on the host machine, so that the files persist even after the container stops: bind mounts and volumes.

No matter which type of mount you choose to use, the data looks the same from within the container. It is exposed as either a directory or an individual file in the container’s file system.

Bind Mounts

  • Mount a specific path on the host machine to the container
  • Not portable, dependent on the host machine's file system and directory structure

Volumes

  • Stores data on the host file system but the storage location is managed by Docker
  • More portable
  • Can mount the same volume to multiple containers
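
As a rough side-by-side, the two options differ mainly in the type and source passed to the --mount flag. The helper names, the host path /srv/data and the volume name shared-data below are illustrative assumptions, not part of the setup used in this post.

```shell
# Bind mount: the source is a path on the host, which must already exist.
run_with_bind_mount() {
    docker run --rm --mount type=bind,source="$1",destination=/data busybox ls /data
}

# Volume: the source is a volume name; Docker decides where it lives on the host.
run_with_volume() {
    docker run --rm --mount type=volume,source="$1",destination=/data busybox ls /data
}

# Example calls (hypothetical path and volume name):
#   run_with_bind_mount /srv/data
#   run_with_volume shared-data
```

Either way, the container simply sees a /data directory, which is the point made above about the data looking the same from inside.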

Volumes

Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. While bind mounts are dependent on the directory structure of the host machine, volumes are completely managed by Docker.

Create and manage volumes

Unlike a bind mount, you can create and manage volumes outside the scope of any container.

Create a volume:

$ docker volume create my-docker-volume

List volumes:

$ docker volume ls
DRIVER    VOLUME NAME
local     my-docker-volume

Inspect a volume:

$ docker volume inspect my-docker-volume
[
    {
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/snap/docker/common/var-lib-docker/volumes/my-docker-volume/_data",
        "Name": "my-docker-volume",
        "Options": {},
        "Scope": "local"
    }
]

Remove a volume:

$ docker volume rm my-docker-volume

Run a container with a mounted volume

If you start a container with a volume that does not yet exist, Docker creates the volume for you.

In the example below we tell Docker to create a container from the busybox image, with a volume called my-volume mounted at the /root folder within the container.

Within the container we run the Bourne shell (sh) to execute an echo command that writes our text to the hello.txt file within our volume, and then displays the contents of hello.txt on the screen.

$ docker run --mount type=volume,source=my-volume,destination=/root busybox sh -c 'echo Hello Cloud Commanders! > /root/hello.txt && cat /root/hello.txt'
Hello Cloud Commanders!

So where on our local host can we find the hello.txt we just created?

Let's run the inspect command to find out:

$ docker volume inspect my-volume
[
    {
        "CreatedAt": "2019-08-13T12:51:28+01:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/snap/docker/common/var-lib-docker/volumes/my-volume/_data",
        "Name": "my-volume",
        "Options": null,
        "Scope": "local"
    }
]

We can see that the mount point within our local host is at /var/snap/docker/common/var-lib-docker/volumes/my-volume/_data
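
If you only need that one path rather than the whole JSON document, docker volume inspect accepts a --format template. The sketch below wraps it in a helper; the function name volume_mountpoint is just an illustrative choice.

```shell
# Print only the host path of a volume, using a Go template with --format.
# $1 = volume name
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Example: list the files stored in my-volume
# (sudo is still needed to read the directory itself).
#   sudo ls "$(volume_mountpoint my-volume)"
```

This saves copying the path by hand, which is handy because the location differs between installations.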

So let's see what files are available there. We can run the ls command to list them, but be sure to run it with sudo, as you will not have access to that folder by default.

$ sudo ls /var/snap/docker/common/var-lib-docker/volumes/my-volume/_data
hello.txt  jovyan

Hopefully you are now able to see the hello.txt file we created. Let's have a look at its contents.

$ sudo cat /var/snap/docker/common/var-lib-docker/volumes/my-volume/_data/hello.txt
Hello Cloud Commanders!
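
Because the same volume can be mounted into more than one container, another way to check the file, without needing sudo on the host, is to attach my-volume to a second throwaway container. This is only a sketch: the helper name read_from_volume and the /data destination are arbitrary choices.

```shell
# Mount an existing volume read-only into a temporary container and print a file from it.
# $1 = volume name, $2 = file name inside the volume
read_from_volume() {
    docker run --rm --mount type=volume,source="$1",destination=/data,readonly busybox cat "/data/$2"
}

# Example:
#   read_from_volume my-volume hello.txt
```

The --rm flag removes the temporary container as soon as it exits, so it will not keep the volume marked as in use.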

Don’t forget to remove any containers and volumes you no longer require.

This command will list all available volumes:

$ docker volume ls

And you can remove them like this:

$ docker volume rm my-volume

You might, however, encounter an error stating that the volume is in use, in which case you need to remove the container that uses it first.

Run the below command to list all containers, both running and stopped.

$ docker ps -a
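
If that list is long, you can narrow it down to the containers that actually reference the volume by using the volume filter of docker ps. A minimal sketch, with a hypothetical helper name:

```shell
# List only the containers (running or stopped) that reference a given volume.
# $1 = volume name
containers_using_volume() {
    docker ps -a --filter "volume=$1"
}

# Example:
#   containers_using_volume my-volume
```

Any container shown here needs to be removed before docker volume rm will succeed.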

If, like me, you end up with a very long list of containers and don't want to remove them manually, you might consider the command below. It tries to remove every container; running containers are refused unless you force it, so in practice it deletes all stopped containers. Use with caution!

$ docker rm $(docker ps -a -q)

If you previously had any problems removing a volume, you should now be able to proceed.
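
Docker also ships dedicated clean-up subcommands, which can be a little tidier than composing the list of IDs by hand. A minimal sketch, wrapped in a hypothetical helper called cleanup_docker:

```shell
# Remove all stopped containers, then prune unused volumes.
# -f skips the confirmation prompt, so treat this with the same caution as above.
# Note: newer Docker releases limit "volume prune" to anonymous volumes
# unless --all is added, so a named volume such as my-volume may survive.
cleanup_docker() {
    docker container prune -f
    docker volume prune -f
}
```

Leave off the -f flags if you would rather be asked before anything is deleted.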

Bind Mounts

The other form of persistent storage for containers is a bind mount in which a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full or relative path on the host machine.

By contrast, when you use a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.

We will be using bind mounts in our example of providing persistent storage to a Docker container hosting a Jupyter Notebook.

You might ask why we prefer a bind mount over a volume in this instance. That has to do with folder permissions on the host directory accessed by the container. By default, the processes within the container do not have write permission on that directory. We need to give the container the required permissions, but with a volume the host directory is managed by Docker, not by us.

By using a bind mount we can set the required folder permissions.

First, create a folder on your host machine which will be mounted into the container.

$ sudo mkdir /home/cloudcommander/docker/jupyter-volume

Then we need to set the appropriate folder permissions by giving ownership of the folder to user ID 1000:

$ sudo chown 1000 /home/cloudcommander/docker/jupyter-volume
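
The 1000 here is assumed to be the user ID of the default jovyan user inside the Jupyter images. If you would rather confirm that than take it on trust, a quick check along these lines should print the ID the image runs as by default (the helper name image_uid is made up, and some image versions may print extra start-up messages around it):

```shell
# Print the numeric user ID that an image runs as by default.
# $1 = image name
image_uid() {
    docker run --rm "$1" id -u
}

# Example:
#   image_uid jupyter/tensorflow-notebook
```

If the number differs on your image, use that value in the chown command instead.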

With the folder permissions now set, we can start our container with the bind mount!

$ docker run -v /home/cloudcommander/docker/jupyter-volume:/home/jovyan/work -p 8888:8888 jupyter/tensorflow-notebook

You should now see a URL appear on your screen that you can use to access the Jupyter notebook within your container, which now has full write access to the mounted folder.

Any files you place within the /home/cloudcommander/docker/jupyter-volume location will be accessible within your container.
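
For comparison with the volume example earlier, the same bind mount can also be written with the more explicit --mount flag instead of -v. This is a sketch of an equivalent form rather than a required change; the helper name run_jupyter_with_bind_mount is hypothetical.

```shell
# Same bind mount as the -v form above, written with the --mount flag.
# $1 = host directory (must already exist)
run_jupyter_with_bind_mount() {
    docker run --mount type=bind,source="$1",destination=/home/jovyan/work -p 8888:8888 jupyter/tensorflow-notebook
}

# Example:
#   run_jupyter_with_bind_mount /home/cloudcommander/docker/jupyter-volume
```

One practical difference is that --mount reports an error if the host directory is missing, whereas -v quietly creates it, owned by root, which would bring back the permissions problem described above.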
