Dockerfile Layout Good Practices

Lessons & Guidelines Learned

Docker is everywhere, and there are many ways to write Dockerfiles. Examples abound, but most are very simple and few are really good. And the good ones are rarely annotated to explain why they are good or what you should do.

So, this article covers the standards and guidelines I’ve found useful in creating more complex Dockerfiles.

General Concepts

First, there are some general guidelines, mostly centered around long-standing good engineering practices that also apply to Dockerfiles. They apply here even more so, as a Dockerfile can be a very dynamic file that changes often and over a long time, necessitating serious efforts to make it very high quality at all times.

Syntax Block

The syntax block, if you have one, must be the first line, which is kinda painful as we can’t start the file with a nice title and comment block, but we have no choice. Few systems need a syntax block, but it might be required for various experimental options, and might look like this:

# syntax = docker/dockerfile:experimental

Title & Comment Block

The real start of the file should include a real title, purpose, owner, and the other usual things that document a project. A Dockerfile is often a top-level file that mostly needs to stand alone, so titles and info here are much more important than in general source files in a large project.

You should also include various assumptions, issues, and complexities that future users and maintainers might need to know, including Docker versions, how this might interact with Docker Compose, Kubernetes, and other relevant things.

You might also point out how this file and resulting containers interact with the larger software system you are building, including any dev, test, or production-related elements.
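
As a rough sketch of what such a block might look like, here is a comment-only header. The field names and placeholder text are illustrative assumptions, not taken from the real file, so adapt them to your own project:

```dockerfile
# ELKman Application Container (example title)
#
# Purpose: Builds the PHP/Apache application image (hypothetical wording)
# Owner:   <team or person responsible>
#
# Assumptions & Issues:
#   - Docker version: <minimum version this was tested with>
#   - Orchestration:  <notes on Compose, Kubernetes, etc.>
#   - Environments:   <how dev, test, and production builds differ>
```

If the file has a syntax block, this header goes right after it.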

TODO

You always need a TODO section, and while you can sprinkle things throughout the file, there are often some bigger or meta things to do, too.

Build-Time Arguments

We don’t use this, but if you think you ever will, add a commented-out section with notes, so it’s clear where it goes and what it’s for.
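
A minimal sketch of such a placeholder, assuming you might one day parameterize the base image version (the BASE_VERSION name is hypothetical):

```dockerfile
# Build-Time Arguments - Not used for now
# An ARG declared before FROM is only visible to FROM lines,
# so this is where a base image version would go.
# ARG BASE_VERSION=7.3.16
# FROM php:${BASE_VERSION}-alpine
```

Keeping it commented out costs nothing, and the next maintainer can see where it goes and why.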

FROM

The all-important FROM section, with any notes, history, or issues. It’s also very important to record why this image and version were chosen, if the choice is unusual in any way.

In this case, we note why we’re using a specific base version, so a future developer doesn’t change this randomly and break things.

# Based on official PHP container
# Note below on assumptions from that base
# Use 7.3 for now as no mod_php yet via php7-apache2 on Alpine
FROM php:7.3.16-alpine

Global Arguments

We don’t use this much, but if you think you ever will, add a commented-out section with notes, so it’s clear where it goes (in this case it must be after the FROM), and what it’s for.

# Global Args from Docker/BuildKit must be added here after FROM
ARG TARGETPLATFORM

ONBUILD

We don’t use this, but if you think you ever will, add a commented-out section with notes, so it’s clear where it goes and what it’s for.

# ONBUILD used to run things in downstream builds
# Such as single layer copies
# Not used for now
# ONBUILD

Labels

Labels are a very important, yet diverse, set of things that have a myriad of uses, for the build and deployment process, for lifecycle management, and more.

In this case, we only use them for OCI labels for our application, which is good for general metadata and management.

# OCI Annotations
LABEL org.opencontainers.image.maintainer="Steve.Mushero@ELKman.io" \
      org.opencontainers.image.authors="Steve.Mushero@ELKman.io" \
      org.opencontainers.image.title="ELKman"

# org.opencontainers.image.revision="" FROM git
# org.opencontainers.image.created="2020-05-01T01:01:01.01Z"

Base Container Info

You’ll always be using a base container, and often one with some things already installed, such as a base Apache, PHP, Java, or whatever container.

These all have assumptions, paths, and already-defined ENV variables that you should be aware of. It’s good to investigate and document them here (even though they can change), so we can be sure we are doing the right things in the right places in this file. More complex base containers will have more of these, which can easily confuse later builders.

You can also override any you use to make sure they never change, though it’s probably better to keep the inherited ones intact.

# Official PHP Apache container defaults & assumptions
# From base Dockerfile:
# User: www-data
# WORKDIR: /var/www/html (Note we change this)
# php.ini: In /usr/local/etc/php (Note we update this via sed)
# Apache Conf: In /etc/apache2 (Note we update this via sed)
# Packages: LOTS of dev like gcc, g++, make, etc. probably remove

ENV Variables

Here you set various users, paths, etc. Be sure to set these as ENV and not hard-code further down in the file, which makes future changes very difficult. While changing one of these invalidates the cache, it’s usually better to use variables for everything you can —otherwise, it’s so easy to have very hard-to-find errors, especially over time as things get changed, copied, etc.

ENV MAINWORKDIR /var/www
ENV MAINUSER root
ENV MAINGROUP root
ENV APACHEUSER www-data
ENV APACHEGROUP www-data
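
For comparison, here is a small sketch of the same idea using the key=value form of ENV, which the Dockerfile reference also documents and which allows several variables in one instruction. The WORKDIR line is only there to show a later instruction using the variable instead of a literal path:

```dockerfile
# Same variables in the key=value form, several per instruction
ENV MAINWORKDIR=/var/www \
    MAINUSER=root \
    MAINGROUP=root

# Later instructions reference the variables, never the literal values
WORKDIR ${MAINWORKDIR}
```

Either form works; the point is that the path appears in exactly one place.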

Install & Repo Setup

Set up your yum, apt, etc. repositories as needed, building caches if you need them, and generally get ready to install things. This area might also include notes on installers, options, etc., especially as they are often needed during container builds to avoid caches, minimize space, etc.

# apk supports --virtual to group & later remove packages
# RUN apk add --no-cache --virtual .build-deps gcc
# RUN apk del --no-cache .build-deps
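
One note worth keeping next to those comments: files deleted in a later layer still take up space in the earlier layer, so the add and the del only save space when they happen in the same RUN. A minimal sketch, where the echo line is just a stand-in for whatever actually needs the compiler:

```dockerfile
# Add, use, and remove build packages within one RUN, so they
# never land in a layer of their own
RUN apk add --no-cache --virtual .build-deps gcc make && \
    echo "build steps that need the compiler go here" && \
    apk del .build-deps
```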

ENV Install Tools

Put the basic OS tools in an ENV var like this, which makes it much easier to manage and cleanly change over time. This list will often be long to start, to aid troubleshooting and early deployments, then get shorter as things can be removed (or added, as needed).

# Lists of tools - will shrink over time to reduce size
# Alpha order, please
# Telnet not available on alpine
ENV INSTALL_TOOLS \
bash \
busybox-extras \
curl \
less

Install Basic Tools

Install the basic tools using variables, so you never have to touch the actual install line again, which makes it much easier to ensure you have the right installer options, etc. without having to repeat and edit these lines:

# Update Repo Info & Install Basic Packages
RUN apk update --no-cache && \
    apk add --no-cache --clean-protected ${INSTALL_TOOLS}

Install Specialized Packages

Separate out & install unusual or non-OS packages that might need special sequences, processes, or options to install. That makes them obvious and easier to manage.

# Install Specialized Packages
# We need SQLite for Telescope & other uses
ENV EXTRA_PACKAGES sqlite3
RUN apk update --no-cache && \
apk add --no-cache ${EXTRA_PACKAGES}

Remove Useless Stuff

Include a section that removes things you don’t need to reduce attack surfaces and make your containers smaller. For example, lots of base images include gcc which you never need at run-time, so get rid of it.

This assumes you’ll do a multi-stage build or use the Squash option, both of which do a final copy and flatten to only include the active files among all the layers.

# Stuff to remove for smaller size
# Packages: Some images have dev stuff like gcc, g++, make, etc.
ENV REMOVE_PACKAGES gcc
# Note ${...} here: $(...) would be run as a shell command
RUN apk del ${REMOVE_PACKAGES}
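
For reference, the multi-stage approach mentioned above can be sketched roughly like this. This is not the layout of our actual file; the stage name build and the copied path are assumptions for illustration:

```dockerfile
# Stage 1: full toolchain available for building
FROM php:7.3.16-alpine AS build
WORKDIR /var/www
COPY . .
# ... build steps that need gcc, npm, composer, etc. go here ...

# Stage 2: fresh base, only the finished files are copied in
FROM php:7.3.16-alpine
WORKDIR /var/www
COPY --from=build /var/www /var/www
```

Anything installed or deleted in the first stage never reaches the final image, which is why the removals above only pay off with this or a squash.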

Section Markers

It’s good to mark your sections to keep the file organized, to find things easily, and to keep future pieces from being added randomly in the wrong places. This is an important part of Dockerfile hygiene.

##### End of OS Items #####

Service Items Section

Services inside a container can vary wildly, from Apache or Nginx to large code bases, to larger data systems like MySQL or Elasticsearch. All have their own requirements and complexities, and most are fairly simple, but you may need lots of details to deploy them.

Generally, it’s best to use their dedicated containers, but sometimes you need to include them in your container, such as Apache in a PHP Laravel application container. In that case, there are a lot more details to deal with.

These sections are often mini versions of the above, with package lists, the install step, and often configuration file copying or editing in place.

This one is for Apache, starting with an ENV variable with the list of packages we need to install, then installing them.

##### Apache Items #####

# Install Apache & PHP Modules
# php7-apache2 installs much of PHP & Apache,
ENV PHP_PACKAGES php7 php7-apache2 php7-json php7-phar php7-iconv \
php7-openssl php7-curl php7-mbstring php7-fileinfo \
php7-tokenizer php7-dom php7-session php7-pdo php7-pdo_sqlite \
php7-xml php7-simplexml php7-xmlwriter php7-zip
RUN apk update --no-cache && \
apk add --no-cache --clean-protected ${PHP_PACKAGES}

Specialized Configurations

There are many ways to set config files, and you should separate and document them clearly. In this case, we want to retain nearly all the defaults, so rather than make copies of the base files, we just edit them in place to adjust a few things.

Basically, we set the variables then run sed to make changes in the various files. Note in the first part we used to also just copy over an artifact file to start with, but later moved to using the included base image file.

# Using default Alpine Apache configs and modifying from there
# Then we override, which lets us use unmodified official files
ENV APACHECONFFILE /etc/apache2/httpd.conf
ENV APACHECONFDDIR /etc/apache2/conf.d
ENV APACHEVHOSTCONFFILE ${APACHECONFDDIR}/default.conf
ENV APACHESECURITYFILE ${APACHECONFDDIR}/security.conf
# Copy over PHP file from PHP-Apache
# Skipping as seems the Alpine version has one: php7-module.conf
# COPY /deploy/apache/docker-php.conf ${APACHECONFDDIR}/docker-php.conf
RUN echo && \
# Remove stuff we don't want nor need for security, etc.
rm /etc/apache2/conf.d/userdir.conf && \
rm /etc/apache2/conf.d/info.conf && \
#
# Apache main config overrides
#
sed -ri -e 's/^#ServerName.*$/ServerName elkman/g' ${APACHECONFFILE} && \
sed -ri -e 's/^ServerTokens.*$/ServerTokens Prod/g' ${APACHECONFFILE} && \
sed -ri -e 's/^ServerSignature.*$/ServerSignature Off/g' ${APACHECONFFILE}
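
One risk with editing in place is that sed exits successfully even when its pattern matches nothing, for example after a base image update changes the file. A possible guard, sketched here as an optional extra step that is not in our actual file, is to grep for the expected results so the build fails loudly. It reuses the APACHECONFFILE variable set above:

```dockerfile
# Fail the build if any of the overrides did not apply
RUN grep -q '^ServerName elkman' ${APACHECONFFILE} && \
    grep -q '^ServerTokens Prod' ${APACHECONFFILE} && \
    grep -q '^ServerSignature Off' ${APACHECONFFILE}
```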

Other Services

Then you follow with other services and configurations, in this case for PHP. For this base, php is already installed, so we just have to deal with configs, via copying and adjusting, plus cleaning up the base and removing things to make sure it’s all clear.

##### PHP Items #####

# PHP Configs - Complicated as there're two PHP on Alpine 7.3
# Some PHP containers use date-specific extension dir in php.ini
# On Alpine, careful of which php is used for CLI
# vs. mod_php to verify their paths - Very confusing

# Disable default php so can't get confused on configs, modules, etc.
# Then the one we want works fine in path
RUN mv /usr/local/bin/php /usr/local/bin/php.bad
# For Alpine 7.3 we use /usr/bin/php and /usr/etc/php
ENV PHP_INI_DIR /etc/php7
ENV PHPEXTDIR "/usr/lib/php7/modules/"
# Use the default prod configuration from php:7.4.4-apache (php.ini-development also exists)
COPY deploy/php/php.ini-production $PHP_INI_DIR/php.ini
# Copy overrides
COPY deploy/php/php-override-prod.ini $PHP_INI_DIR/conf.d/
COPY deploy/php/php-sourceguardian.ini $PHP_INI_DIR/conf.d/
# Install composer & prestissimo for parallel downloads if needed
RUN curl -sS https://getcomposer.org/installer | \
php -- --install-dir=/usr/local/bin --filename=composer && \
composer global require hirak/prestissimo --no-plugins --no-scripts

Add Your Code

Now that you have services installed, add your code, in this case from the build environment. You can also pull from git, install as a package, etc., but our build environment has already pulled all the code, artifacts, build scripts, Dockerfile, etc., so copying is easiest.

The COPY commands are very specific and the result of lots of testing. Note also the comments on .dockerignore, permissions, etc. as this has to be consistent and well-understood. These are often the result of many, many hours of work, so they have to be clear for all time.

#### Add Code ####

# Need to change WORKDIR as Apache default is /var/www/html
WORKDIR ${MAINWORKDIR}

# Copy files from VM
# Copy App Directories - Not setting owners here, it's done later
# Note will ignore the .dockerignore things, so tune that, too
# Currently we depend on git to create/ignore all the dirs we need, especially in storage
# We do this because later we want to git clone into container as part of build
COPY app app
COPY config config
COPY resources resources
COPY routes routes
COPY bootstrap bootstrap
COPY database database
COPY storage storage
COPY public public
COPY tests tests
# Copy Specific Files
COPY artisan ./
COPY composer.json ./
COPY composer.lock ./
COPY package.json ./
COPY package-lock.json ./
COPY webpack.mix.js ./

Building & Compiling Things

After you have some code, you often need to build or otherwise work on it — especially for Javascript things, but in our case also running PHP Composer as part of the container build.

As always, make the documentation, purpose, and special issues crystal clear, as this is often the results of days or weeks of work and testing.

In this case we run PHP Composer inside the container build process to get and set all the right libraries. This is messy, and we also use a prior cache for performance, though this was the result of a lot of trial and error.

# Run Composer install
#   ENV COMPOSER_CACHE_DIR - Can set if needed, now using default
#   Cannot use RUN mount here as we need a cache dir, and mount only supports files (as far as I can tell)
#   Copy in composer cache, use and remove

COPY /composer-cache/files /root/.composer/cache/files
# Note: Have to run 'composer dump-autoload' for some reason here; seems install not fully doing it
RUN composer install --no-dev --classmap-authoritative --no-ansi \
    --no-scripts --no-interaction --no-suggest && \
    composer dump-autoload && \
    rm -rf /root/.composer/cache
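
As an alternative to copying the cache in and deleting it, BuildKit cache mounts do accept a directory as the target. The sketch below is untested against this project and assumes BuildKit is enabled and that Composer’s cache lives in the default /root/.composer/cache location used above:

```dockerfile
# syntax=docker/dockerfile:1
# (the syntax line above must be the first line of the real file)

# Cache mount: the directory persists between builds on the same
# builder and is never written into an image layer
RUN --mount=type=cache,target=/root/.composer/cache \
    composer install --no-dev --classmap-authoritative --no-ansi \
        --no-scripts --no-interaction
```

With this form there is nothing to COPY in or rm afterwards, though the cache only helps on a builder that has run the step before.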

Then we have npm stuff to run to get Vue.js and all the Javascript in the right places.

# NPM Stuff & Webpack (part of dev script)
# RUN npm install --no-optional
# Moving to ci instead of install (ci uses lock file)
RUN npm ci --no-optional
RUN npm run prod

Here’s an example of going one way then another as we find the best way to do things. Managing JavaScript things is especially challenging, but we retain older stuff to avoid reinventing the wheel later and trying broken methods again.

# Move public artifacts to doc root - do this after npm run
# Get .htaccess, too
# We missing anything in the standard html?
# Not moving as better to point Doc Root to our public
# RUN mv public/* html/ && mv public/.htaccess html/

Data & Things

Once we have all the services and all the code ready, we can turn to the data, in this case touching the empty database needed by the application (which will be seeded in a later setup step).

# Move DB file from source tree to writable storage area
# For now, touch empty file - we initialize this DB later
# Later we can copy a default DB if we wish
# RUN mv database/db.sqlite storage/database/
RUN touch storage/database/db.sqlite

Environment Setup

Once all the services, code, and data are ready, we can set up our environment files, which are needed at runtime, but also for some final setup steps.

# .env File - Need to copy for production
COPY .env.production .env

# Copy dusk env for now for testing
COPY .env.dusk.testing .env.dusk.testing

Setup System

Now we’re ready to set up some parts of the system itself. For Laravel, this means running a bunch of Laravel commands to set up PHP configs and keys, and to build the starting database structures. This kind of thing changes a lot, so it’s important to include good comments.

# Setup configs & code; may later do as other user, fixed UID, etc.
# Generate a new key each time (though we also need on install)
RUN php artisan key:generate
# Optimize & cache; do before we migrate or run other artisan jobs
RUN php artisan optimize
# Seed tables, Telescope, etc. data into DB
# Run after keygen, before other artisan cmds
RUN php artisan migrate
# Update DB version to app code version; this for container's initial DB only
RUN php artisan elkman:update

Remove Logs

Clean up the logs from all the above setup, both so we start clean and to reduce space. Always remember to purge any logs created during the build process (in part as they may have stuff you don’t want users or customers to see).

# Remove log file so we start clean (and with right log file owner)
RUN rm -rf storage/logs/*

File Permissions

Setting file permissions in Docker can be very messy, due to all the copies and command actions, and especially so for complex run-time environments like Laravel.

So be sure to decide on a clean way to do it and document it, as in our case this is the result of lots of experience and testing. Here we do them all at once in one place, so it’s easy to see and make changes or exceptions.

# Permissions carefully managed here
# Set all directory permissions
# Set global owner & read perms, then set for writable, exec, etc.
ENV READPERM 440
ENV WRITEPERM 660
RUN chown -R ${MAINUSER}:${APACHEGROUP} ./ && \
chmod -R ${READPERM} ./ && \
chmod -R ${WRITEPERM} storage && \
# Set all dirs to be executable so we can get into them
# Do after any chmods above
find ./ -type d -print0 | xargs -0 chmod ug+x

Final Purging

A final phase of cleanup of all the things to reduce space for smaller images.

### Data Purge
# Need to purge & cleanup
# rm composer & caches
# rm npx & caches
# rm any man pages, etc.
# vendor cleanup

RUN rm -rf /tmp/*

# End of apk installs, we can clean
# As apk cache clean seems useless
RUN rm -rf /var/cache/apk/*

Docker Runtime Items

Finally, everything is set and we can specify the run-time options.

We first repeat any defaults from the base image so we know what it sets and what we might need to override. We also add a final note on how to actually run this container, as a reminder, especially if there are special ports or options involved.

#### Docker Run Time Items ####

# Official PHP Apache container defaults & assumptions, from base
# EXPOSE 80
# ENTRYPOINT ["docker-php-entrypoint"]
# CMD ["apache2-foreground"]

# TODO: Set USER ?

# Override base container port to specify TCP
# And make it easier to change
EXPOSE 80/TCP

# Careful to separate each argument, e.g. like "-f" and the file:
CMD ["/usr/sbin/httpd", "-f", "/etc/apache2/httpd.conf", "-DFOREGROUND"]

# Docker Run command:
# docker run -d -p 8000:80 elkmanio/elkman:latest

That’s it. A fairly good Dockerfile, by example.

CEO of ChinaNetCloud & — Global Entrepreneur in Shanghai & Silicon Valley