Slide Deck


Exercises

Interactive look at Docker container

In this exercise, we will:

  • run a container based on the rocker/tidyverse:4.4.1 image
  • explore the file system of the container
  • take note of the /home/rstudio directory
  • open R
  • exit the container

Before you start, make sure that your terminal emulator is in your project’s root directory by using pwd. If you are not in the correct folder, use cd to navigate to the root directory of your folder.

At the command line, execute the following command.

docker run -it rocker/tidyverse:4.4.1 bash
  • -it means we want to run an interactive container
  • bash at the end means we want to open a bash terminal in the container (see below)

If you are on a Windows machine, you may need to prefix the call with winpty:

winpty docker run -it rocker/tidyverse:4.4.1 bash

If you are on a Mac that has an Apple Silicon chip (e.g., M1/M2/M3), you may see a warning similar to the following:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

If you see this, you need to add the --platform linux/amd64 option to your docker run command (and to all subsequent docker run commands below). So your command would look like:

docker run --platform linux/amd64 -it rocker/tidyverse:4.4.1 bash

When you execute this command, Docker will check whether the image already exists on your computer. The first time you run this command, you likely will not have downloaded the rocker/tidyverse:4.4.1 image from Docker Hub. As a result, the command will automatically download the image for you. While the image downloads, you will see various status updates that look like this:

Unable to find image 'rocker/tidyverse:4.4.1' locally
4.4.1: Pulling from rocker/tidyverse
7646c8da3324: Pull complete 
2a532d0e6beb: Pull complete

After the image has finished downloading, you will notice that your command prompt has changed to look something like this:

root@1d09e8db1e4d:/#

You have opened a terminal emulator inside a running container!

The first thing we will do is explore the file system. You should see that when you run pwd your working directory is /, i.e., the root directory.

If you use ls to list the contents of / you will see:

bin   etc   lib    libexec  mnt   rocker_scripts  sbin  tmp
boot  home  lib32  libx32   opt   root            srv   usr
dev   init  lib64  media    proc  run             sys   var

This is the file system of the container, not of your computer. None of these directories are stored in your computer’s own file system; they exist only inside the container.

We will care about one directory in particular for future exercises: the /home/rstudio directory. Confirm that you can navigate to this directory by executing:

cd /home/rstudio

Use ls -a to confirm that the directory is empty.

Run the following command to create a file named foo in the /home/rstudio directory.

touch /home/rstudio/foo

Confirm that the file was created using ls again. Once you have confirmed the existence of the file, exit the container by typing exit and pressing Enter.

Exercise: Start a new container using exactly the same syntax as above:

docker run -it rocker/tidyverse:4.4.1 bash

(Remember to add winpty if you are on Windows and using git bash)

Check for the existence of the file you created above, /home/rstudio/foo. Can you find the file?

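You will not be able to find the file!

Each docker run command starts a brand-new container from the image, so files created inside an earlier container do not appear in the new one, and they are lost for good once that earlier container is removed. Thus, if we would like to run analyses in containers, we need a strategy for retrieving our results before we exit the container!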


Running R Studio in a container

In this exercise, we will see how to

  • run R Studio server inside a container
  • set a password for login using the -e option
  • open a port so we can view R Studio using the -p option
  • sign into an R Studio session in container
  • check working directory of R Studio

To run an interactive R Studio session, run the following command:

docker run -e PASSWORD="SECRET123" -p 8787:8787 rocker/tidyverse:4.4.1

(Remember to add winpty if you are on Windows and using git bash and to add --platform linux/amd64 if you needed to use that option earlier)
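If you would rather not remember which extras your machine needs, the reminder above can be sketched as a small script. This is only a rough sketch: the variable names prefix and platform_flag are made up for illustration, and it assumes that Git Bash reports a kernel name starting with MINGW or MSYS and that any ARM host wants the --platform option. It prints the command rather than running it, so you can review it first.

```shell
# Pick the optional pieces of the docker run command for this machine.
prefix=""          # becomes "winpty" under Git Bash on Windows
platform_flag=""   # becomes "--platform linux/amd64" on ARM hosts

# Git Bash reports a kernel name starting with MINGW or MSYS (assumption).
case "$(uname -s)" in
  MINGW*|MSYS*) prefix="winpty" ;;
esac

# Apple Silicon reports arm64; ARM Linux reports aarch64.
case "$(uname -m)" in
  arm64|aarch64) platform_flag="--platform linux/amd64" ;;
esac

# Dry run: print the command so you can review it before running it yourself.
# The variables are left unquoted on purpose so that empty ones disappear.
echo $prefix docker run $platform_flag -e PASSWORD="SECRET123" -p 8787:8787 rocker/tidyverse:4.4.1
```

Copy the printed line into your terminal to start the container.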

You will see some printed messages. You will know that your container is successfully running if you see:

[services.d] done.

Before opening R Studio, let’s pause to describe the options that we included in the docker run call:

  • -e PASSWORD="SECRET123" sets an environment variable named PASSWORD inside the container; it specifies the password that we will need to log in to our session of R Studio in the container
  • -p 8787:8787 publishes port 8787 of the container on port 8787 of your computer, which will allow us to connect to R Studio running inside the container
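To make the two options more concrete, here is a small sketch of how their values are put together. The -p option has the form HOST_PORT:CONTAINER_PORT, so the two numbers do not have to match. The host port 8888 below is a hypothetical choice (say, because 8787 is already in use on your computer), and the variable names are made up for illustration. The script only prints the command; it does not start a container.

```shell
# -p takes HOST_PORT:CONTAINER_PORT.
host_port=8888        # hypothetical free port on your computer
container_port=8787   # the container port used in the command above
publish_arg="${host_port}:${container_port}"

# -e takes NAME=value and sets an environment variable inside the container.
env_arg="PASSWORD=SECRET123"

# Dry run: print the command and the URL that would go with it.
echo "docker run -e ${env_arg} -p ${publish_arg} rocker/tidyverse:4.4.1"
echo "browse to http://localhost:${host_port}"
```

With this mapping you would point your browser at port 8888 instead of 8787, because the browser talks to the host side of the mapping.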

To log into R Studio, open a web browser and go to http://localhost:8787.

When prompted, your user name is rstudio and your password is SECRET123 (or whatever you passed to the -e option).

Once you log in, you should see a normal R Studio session.

In the R console, type getwd() to see what the “default working directory” is for R Studio in the container.

When you are finished, return to your terminal emulator and press ctrl + c to close the container.


Mounting project directory

In this exercise, we will:

  • run a container with a mounted directory
  • attempt to build our report in the container
  • install R packages in a container

Before we start, confirm that your working directory is your project’s root directory, using pwd and cd as needed.

Once you have confirmed your working directory is your project directory, if you are on a Mac, you can execute:

docker run -e PASSWORD="SECRET123" \
-p 8787:8787 \
-v $(pwd):/home/rstudio \
rocker/tidyverse:4.4.1

(Remember to include --platform linux/amd64 if needed.)

If you are on Windows, you can execute:

winpty docker run -e PASSWORD="SECRET123" \
-p 8787:8787 \
-v /$(pwd):/home/rstudio \
rocker/tidyverse:4.4.1

There are two new things about the above commands:

  • we are using \ (i.e., an escape character) to break the command over multiple lines (so it’s easier for you to see on the website)
  • we added the -v option

The -v option mounts our current working directory (i.e., our project directory) at /home/rstudio inside the container, so the two are exactly the same directory. In this way, all our project files will be available immediately in R Studio!

  • If you care about the details, $(pwd) is using bash command substitution, which means that the expression pwd is evaluated prior to the docker run command and is replaced by its evaluation (i.e., your current working directory).
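You can watch command substitution happen without starting a container. The short sketch below builds the value that the -v option receives and prints it; mount_arg is just an illustrative variable name.

```shell
# The shell evaluates $(pwd) first and substitutes the result into the string.
mount_arg="$(pwd):/home/rstudio"

# Print the -v option exactly as docker would receive it.
echo "-v ${mount_arg}"
```

If your project path contains spaces, wrap the substitution in double quotes (as in -v "$(pwd)":/home/rstudio) so that the shell passes the path to docker as a single argument.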

Exercise: Sign in via your browser to the R Studio session that is running. Confirm that you can see project related files and attempt to knit the report. What happens?

To log into R Studio, open a web browser and go to http://localhost:8787.

When prompted, your user name is rstudio and your password is SECRET123 (or whatever you passed to the -e option).

Once R Studio is open, click on the Folder in the upper left corner to browse files. You should see all project related files now.

Open the R Markdown file used to create the analysis report and attempt to knit the file.

You should see an error related to packages not being installed.

At this point, you could use install.packages to install the appropriate packages. However, this is not a great solution, as these packages will not be there the next time you start a container from the image. After all, R packages are just files installed in the container’s file system, so they are lost along with the container.

At this point, you can close the container by returning to your terminal emulator and pressing ctrl + c.

See the next section for a better solution!


Building your own image

To build a new image that includes the required R packages, we can create a Dockerfile.

Open a new text file in R Studio: Select File: New: Text file. Save the file with name Dockerfile by clicking File: Save As.

Add the following contents to this file:

FROM rocker/tidyverse:4.4.1

# install R packages from the command line
RUN Rscript -e "install.packages('here')"
RUN Rscript -e "install.packages('RSocrata')"
RUN Rscript -e "install.packages('forcats')"
# add additional R packages required here
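If you prefer the terminal to the R Studio editor, the same file can be written with a shell heredoc. The sketch below is one possible way to do it, not the only one. It writes to a scratch directory created with mktemp (an assumption made here so that nothing in your project is overwritten), and workdir and run_count are illustrative names. To use it for real, write to the Dockerfile in your project root instead.

```shell
# Scratch directory (hypothetical) so nothing in your project is overwritten.
workdir="$(mktemp -d)"

# The quoted 'EOF' stops the shell from expanding anything inside the heredoc.
cat > "$workdir/Dockerfile" <<'EOF'
FROM rocker/tidyverse:4.4.1

# install R packages from the command line
RUN Rscript -e "install.packages('here')"
RUN Rscript -e "install.packages('RSocrata')"
RUN Rscript -e "install.packages('forcats')"
EOF

# Quick check: count the RUN instructions that were written.
run_count="$(grep -c '^RUN ' "$workdir/Dockerfile")"
echo "Wrote $workdir/Dockerfile with $run_count RUN instructions"
```

Note that each RUN line closes the double quote that it opens. An unbalanced quote is an easy mistake to make when adding more packages.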

When you are finished use the following command to build the image:

docker build -t rstudio-plus .

Here, we have given the image a “tag” (i.e., name) with the -t option. We can now run this image using:

docker run -e PASSWORD="SECRET123" \
-p 8787:8787 \
-v $(pwd):/home/rstudio \
rstudio-plus

(Remember to add winpty if you are on Windows and using git bash)

Note that this is the same syntax as above, except we have changed the name of the image that we are using to rstudio-plus.

Exercise: Log into the R Studio session and attempt to build the report. If the attempt does not work, modify the Dockerfile, rebuild the image and try again until it works.

A working Dockerfile should look roughly like this:

FROM rocker/tidyverse:4.4.1

# install R package from the command line
RUN Rscript -e "install.packages('RSocrata')"

Once the report is successfully built, note that the final report is accessible even after the container has stopped due to the use of directory mounting (-v option).