This Docker image is meant to isolate and stabilize that environment, and should be portable across multiple operating systems, as long as Docker is available.
To learn more about the use of containers for research reproducibility, see Carpentries' docker-introduction. For commercial services running containers, see codeocean.com, gigantum, or any of the cloud service providers. For an academic project using containers, see Whole Tale.
NOTE: The image created by these instructions contains binary code that is © Stata. Permission was granted by Stata to Lars Vilhuber to post these images, without the license. A valid license is necessary to build and use these images.
You need a Stata license to run the image. If you are rebuilding the image, you may also need a Stata license to build it.
In the documentation below, we use a (bash) environment variable to abstract from the actual location of the Stata license. This has been tested on macOS and Linux, and it should work using Git Bash on Windows. Comments welcome.
The Dockerfile contains the build instructions. A few things of note:
Set TAG and MYIMG accordingly. VERSION should be the Stata version.
VERSION=18
TAG=$(date +%F)
MYHUBID=dataeditors
MYIMG=stata${VERSION}
Basic image building is easy:
DOCKER_BUILDKIT=1 docker build . \
-t $MYHUBID/${MYIMG}:$TAG
This will generate a lot of output, and may take a while:
[+] Building 4.7s (17/17) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
...
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:2dc159dee0413040c99b02f885eb7a6559b647cd6e86a 0.0s
=> => naming to docker.io/dataeditors/stata17:2022-02-09 0.0s
List your images:
docker images
output:
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 52e8f83a14f8 25 seconds ago 665MB
<none> <none> fb095c3f9ade 31 minutes ago 670MB
<none> <none> a919483dbe22 34 minutes ago 107MB
The resulting Docker image can be uploaded to Docker Hub, if desired, or to any other container registry.
docker push $MYHUBID/${MYIMG}:$TAG
You can browse the provided images at https://hub.docker.com/u/dataeditors.
In the following, we assume that your project has the directory structure below, which simplifies the directory mounts:
project/
    data/
    code/
        01_preparedata.do
        02_runanalysis.do
    results/
    main.do
    setup.do
setup.do installs all required Stata ado packages via the necessary commands (ssc install, net install pkg, from(url)). main.do is the main controller script, which here resides in the root directory of the project. Other options are possible. We will run all commands with the working directory set to /path/to/project. When NOT using Docker, you can achieve the same goal by double-clicking main.do, which will automatically set the working directory to where main.do resides.
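As a rough illustration of what setup.do might contain, the shell snippet below writes a minimal version of the file. The helper name write_setup_do and the package names (estout, outreg2) are assumptions made for this example only; replace the packages with the ones your project actually uses.

```shell
# write_setup_do is a hypothetical helper: it writes an illustrative
# setup.do to the given path, and refuses to overwrite an existing file.
write_setup_do() {
    target="$1"
    if [ -e "$target" ]; then
        echo "$target already exists; not overwriting" >&2
        return 0
    fi
    cat > "$target" <<'EOF'
// setup.do: install all ado packages needed by the project
// (example packages only; list your own here)
ssc install estout, replace
ssc install outreg2, replace
EOF
    echo "Wrote $target"
}

# Create setup.do in the current (project) directory.
write_setup_do "${SETUP_FILE:-setup.do}"
```

Keeping all installs in one file means the same list can be reused both interactively and when building a derived image.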
Using a pre-built image on Docker Hub to run a program.
NOTE: because Stata is proprietary software, we need to mount a license file.
NOTE: We are using a working directory of "/project" here - check the Dockerfile for the precise location.
For all the subsequent docker run commands, we will use similar environment variables:
VERSION=18
TAG=2024-02-14
MYHUBID=dataeditors
MYIMG=stata${VERSION}
and either
STATALIC="$(pwd)/stata.lic.${VERSION}"
or
STATALIC="$(find $HOME/Dropbox/ -name stata.lic.$VERSION | tail -1)"
where again, the various forms of STATALIC are meant to capture the location of the stata.lic file (in my case, it is called stata.lic.18, but in your case, it might be simply stata.lic).
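Because STATALIC can end up empty (for instance, when find matches nothing) or can point to a path that does not exist, it may be worth checking it before calling docker run. With -v, Docker typically creates a missing host path as a directory, which then fails as a license file. The helper below is a minimal sketch; check_license is a hypothetical name, not part of the image or of Docker.

```shell
# check_license: hypothetical helper that verifies the license path
# before it is handed to "docker run -v".
check_license() {
    if [ -z "$1" ]; then
        echo "STATALIC is empty: no license file was found" >&2
        return 1
    fi
    if [ ! -f "$1" ] || [ ! -r "$1" ]; then
        echo "License file missing or not readable: $1" >&2
        return 1
    fi
    echo "Using Stata license: $1"
}

# Example use, before any docker run command:
#   check_license "${STATALIC}" || exit 1
```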
docker run -it --rm \
-v "${STATALIC}":/usr/local/stata/stata.lic \
-v "$(pwd)":/project \
$MYHUBID/${MYIMG}:${TAG}
The above builds and runs the container using Docker. While there is a free Community Edition of Docker, some users may prefer other container management software, such as Podman or Singularity. For instance, in Singularity, the following works:
singularity run \
-B ${STATALIC}:/usr/local/stata/stata.lic \
-B $(pwd):/project \
-H $(pwd) \
docker://$MYHUBID/${MYIMG}:${TAG}
We have also converted the Docker image to a Singularity Image File (SIF),
sudo singularity build stata${VERSION}.sif docker-daemon://${MYHUBID}/${MYIMG}:${TAG}
and uploaded the resultant SIF file to the Sylabs.io servers (library/vilhuberlars/dataeditors/stata17), so it can be used directly in a way similar to DockerHub:
SYLABSID=vilhuberlars
singularity run \
-B ${STATALIC}:/usr/local/stata/stata.lic \
-B $(pwd):/project \
-H $(pwd) \
library://$SYLABSID/$MYHUBID/${MYIMG}:${TAG}
without the need to first convert it.
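Podman, mentioned above as another alternative, accepts largely the same run options as Docker, so a small wrapper can make the container engine configurable. This is an untested sketch: run_stata and ENGINE are hypothetical names, the image is referenced by its fully qualified docker.io name because Podman may not resolve short names, and rootless Podman may need extra options (such as --userns=keep-id, or a :Z suffix on the mounts under SELinux) that are not shown here.

```shell
# Assumption: ENGINE names a Docker-compatible CLI ("podman" or "docker").
ENGINE="${ENGINE:-podman}"

# run_stata: hypothetical wrapper; all arguments are passed on to Stata,
# for example: run_stata -b main.do
run_stata() {
    "$ENGINE" run -it --rm \
        -v "${STATALIC}":/usr/local/stata/stata.lic \
        -v "$(pwd)":/project \
        -w /project \
        "docker.io/$MYHUBID/${MYIMG}:${TAG}" "$@"
}
```

Called without arguments, the wrapper should start an interactive Stata session, matching the plain docker run command shown earlier.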
The Docker image has an ENTRYPOINT defined, which means it will act as if you were running Stata:
cd /path/to/project
docker run -it --rm \
-v "${STATALIC}":/usr/local/stata/stata.lic \
-v "$(pwd)":/project \
-w /project \
$MYHUBID/${MYIMG}:${TAG} -b program.do
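In batch mode, Stata writes its output to a log file named after the do-file (here, program.log in the working directory), and the exit status of the container may not reflect a failed do-file. One common workaround is to scan the log for Stata return codes of the form r(601);. The helper below is a sketch of that idea; check_stata_log is a hypothetical name.

```shell
# check_stata_log: hypothetical helper that looks for Stata error
# return codes such as "r(601);" at the start of a line in a batch log.
check_stata_log() {
    log="$1"
    if [ ! -f "$log" ]; then
        echo "No log file found: $log" >&2
        return 2
    fi
    if grep -E -q '^r\([0-9]+\);' "$log"; then
        echo "Stata reported an error in $log" >&2
        return 1
    fi
    echo "No Stata error codes found in $log"
}

# Example use, after the docker run command above:
#   check_stata_log program.log || exit 1
```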
Your program, of course, should reference the data and results directories, ideally in a location-agnostic manner:
// we start in the rootdir
// Note: we could double-check that we are in the right directory:
// confirm file "main.do"
global rootdir : pwd
global data "${rootdir}/data"
global results "${rootdir}/results"
// all subsequent use references the globals
use "${data}/mydata.dta"
graph export "${results}/figure1.png"
- Adjust the setup.do file: list all packages you want installed permanently.
- Remember to have your Stata license file (for example, stata.lic.18) available.
- Start your Dockerfile with (adjust the tag):
# syntax=docker/dockerfile:1.2
FROM dataeditors/stata18:2024-02-14
# this runs your setup code
COPY code/setup.do /setup.do
RUN --mount=type=secret,id=statalic,dst=/usr/local/stata/stata.lic /usr/local/stata/stata-mp do /setup.do
USER statauser:stata
VOLUME /project
WORKDIR /project
# Stata is the entry point; pass the do-file at run time (for example, -b main.do)
ENTRYPOINT ["stata-mp"]
build, and then run this Docker image with
docker run \
-v "${STATALIC}":/usr/local/stata/stata.lic \
-v "$(pwd)":/project \
larsvilhuber/greatpaper:2021-06-08 -b main.do
and the results of running the code (in code) on the data (in data) will show up in the results folder, which is local to your workstation, with no need to install any additional Stata packages.
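The build step for this derived image is not spelled out above. Because the Dockerfile mounts the license as a BuildKit secret with id=statalic, the build command needs to supply that secret. The following is a sketch under the assumption that STATALIC holds the full path to the license file; build_image is a hypothetical helper, and the image name is simply the one used in the run command above.

```shell
# Image name reused from the run command above; adjust to your own.
IMAGE="larsvilhuber/greatpaper:2021-06-08"
# Assumption: STATALIC holds the full path to the license file.
STATALIC="${STATALIC:-$(pwd)/stata.lic}"

# build_image: hypothetical helper wrapping "docker build".
build_image() {
    # The secret id must match "id=statalic" in the Dockerfile's
    # RUN --mount=type=secret instruction.
    DOCKER_BUILDKIT=1 docker build . \
        --secret "id=statalic,src=${STATALIC}" \
        -t "${IMAGE}"
}

# Only attempt the build when Docker and the license file are present.
if command -v docker >/dev/null 2>&1 && [ -f "${STATALIC}" ]; then
    build_image
else
    echo "Docker or license file (${STATALIC}) not found; skipping build" >&2
fi
```

Passing the license as a secret is intended to make it available during the build without storing it in an image layer.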