Module 06 — Containerising Tooling
Type 9 · Tool-Build — package a security tool (trufflehog) as a reusable, hardened container image others can run; the deliverable is the published image with a non-root user, minimal base, and a clean ENTRYPOINT. (Secondary: Build-&-Operate.)
Last reviewed: 2026-06
Security Automation — package a security tool once, as an image others run, and it runs the same way everywhere — pinned, non-root, with a clear --help.
In 60 seconds
A security tool is only useful if it runs the same way for whoever picks it up next — and an
unpinned install (curl | bash, FROM ubuntu:latest) is "run arbitrary binary data and hope"
dressed as convenience; the docker123321 campaign rode exactly that, ~5M pulls of backdoored
images. The fix is to package the tool as a reusable image you build and control: pin the base
and the tool version (verify the checksum), run non-root, set a clean ENTRYPOINT with no
CMD, and never bake a secret into a layer. The deliverable is judged like a tool — clear entry,
safe by default, reproducible by someone else.
Why this matters
A security tool is only useful if it runs — reliably, the same way, for whoever picks it up next.
Un-containerised tooling fails exactly there. volatility3 wants one Python, the IDS wants a
specific libpcap, the scanner wants an OpenSSL that conflicts with the one your distro shipped,
and the next analyst's laptop has none of it. The result is the most familiar failure in security
engineering: "works on my machine." The same tool gives different results on a laptop, a CI
runner, and an incident-response box — or simply won't install — and the time goes into dependency
archaeology instead of the investigation.
There's a sharper edge for security tools specifically: an unpinned install is a supply-chain
exposure. The curl | bash install that grabs "latest" from a CDN, or a FROM ubuntu:latest
that resolves to a different image every build, means the thing you ran today is not provably the
thing you ran last month. When the tool's job is to find secrets and vulnerabilities, you want it
to be the most boring, reproducible, auditable artifact in your kit — not the one piece you can't
account for.
This is not hypothetical. Between mid-2017 and 2018 an account named docker123321 published
17 backdoored images on Docker Hub,
named to look like ordinary tomcat, mysql, and cron images. They sat on the public registry
for roughly a year and were pulled around 5 million times — some images over a million each —
quietly running XMRig Monero miners and embedded reverse shells on whoever docker pulled them.
Fortinet and Kromtech traced the campaign back to the single account, and Docker removed the images
in May 2018; the miner alone netted about $90,000. The lesson is the one this module turns on:
docker pull <something> is "run arbitrary binary data and hope for the best" unless you control
and pin what goes in — which is exactly the discipline you apply when you build the image yourself.
This module's product is that artifact: a security tool packaged as a reusable image — something an engineer pulls and runs without installing Go, resolving versions, or trusting an unpinned script. The deliverable is judged the way a tool is judged: does it have a clear entrypoint and usage, does it run safely by default, can someone else build it and get the same result.
The core idea
A Dockerfile is the recipe for a tool other people run, so the design judgments are tool design judgments, not just packaging. Four of them carry the weight.
The mental model
A tool image is a product with users, not a packaging chore. What you pin, what you keep out, what the entrypoint exposes, and what you refuse to bake in are the same decisions you'd make designing any reusable tool — they just happen to be written in a Dockerfile.
The base image is a dependency you're choosing — pin it and keep it small. FROM ubuntu:latest
pulls whatever is current on build day and ships everything Ubuntu installs by default; every one
of those packages is attack surface and a potential CVE in your tool's image. FROM ubuntu:22.04
pins the OS release so the build is far more repeatable (a tag can still be re-pushed; pinning by digest fixes the exact image). A minimal base — cgr.dev/chainguard/static, distroless,
or FROM scratch for a static binary — gives you almost nothing but the tool itself: far fewer
packages, far fewer CVEs, a far smaller artifact to push and pull. For a security tool the rule of
thumb is unambiguous: less in the image is better. Pinning isn't only the OS — pin the tool
version too (v3.88.1, not "latest") and verify its checksum before you install it, so the binary
in the image is provably the one you intended.
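A minimal sketch of the pinning described above, assuming a Debian-style base where curl and sha256sum are available. The release URL layout and the build arguments TRUFFLEHOG_VERSION and TRUFFLEHOG_SHA256 are illustrative assumptions rather than something this module specifies; confirm the asset name and the published checksum on the trufflehog releases page before relying on it.

```dockerfile
# Sketch only: the download URL pattern is an assumption to verify
# against the trufflehog releases page.
FROM ubuntu:22.04

# Pin the tool version. The expected SHA-256 deliberately has no default,
# so a build that omits --build-arg TRUFFLEHOG_SHA256=... fails at the check.
ARG TRUFFLEHOG_VERSION=3.88.1
ARG TRUFFLEHOG_SHA256

# Install only what the download step needs, then drop the apt lists.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*

# Download, verify, and only then unpack the binary.
RUN curl -fsSL -o /tmp/trufflehog.tar.gz \
      "https://github.com/trufflesecurity/trufflehog/releases/download/v${TRUFFLEHOG_VERSION}/trufflehog_${TRUFFLEHOG_VERSION}_linux_amd64.tar.gz" \
 && echo "${TRUFFLEHOG_SHA256}  /tmp/trufflehog.tar.gz" | sha256sum -c - \
 && tar -xzf /tmp/trufflehog.tar.gz -C /usr/local/bin trufflehog \
 && rm /tmp/trufflehog.tar.gz
```

Because the checksum comes from the person building the image rather than from the download itself, a tampered or swapped archive stops the build at the sha256sum step instead of ending up in a layer.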
Run as non-root, always. This is the single most important hardening step and the one a model
most often skips. RUN useradd -m scanner followed by a separate USER scanner instruction, both before the entrypoint, means a
compromised container starts with far less to work with; a root container can, in some
configurations, escape to the host. It's a CIS Docker Benchmark item and a checkov check —
which is to say it's the kind of thing a reviewer (human or scanner) will hold the tool to.
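As a sketch of the non-root step, assuming the same ubuntu:22.04 base and the scanner account name used in the text (the WORKDIR is an illustrative choice, not a requirement):

```dockerfile
FROM ubuntu:22.04

# Create an unprivileged account with a home directory.
RUN useradd -m scanner

# USER is its own Dockerfile instruction, not a shell command, so it cannot
# be chained onto RUN with &&. Every later RUN step and the container's
# main process run as scanner from here on.
USER scanner
WORKDIR /home/scanner
```

Anything that needs root, such as installing packages, has to come before the USER line.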
ENTRYPOINT is the tool; CMD is its default arguments — and the distinction is a usability
decision. A tool image should set ENTRYPOINT ["/usr/local/bin/trufflehog"] and no CMD, so
docker run image <args> behaves exactly like running the tool with those args — and running it
with no args prints the tool's own usage rather than doing something surprising. That's what makes
the image feel like a command, not a black box: the caller's --help reaches the real tool, and
there are no dangerous baked-in defaults.
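The entrypoint choice can be sketched as follows. It assumes, purely for illustration, that a trufflehog binary already sits in the build context next to the Dockerfile; how the binary gets there is outside this fragment.

```dockerfile
FROM ubuntu:22.04

# Hypothetical layout: a prebuilt trufflehog binary in the build context.
COPY trufflehog /usr/local/bin/trufflehog

# Exec form, so arguments given to `docker run` are appended to the tool.
# Setting ENTRYPOINT also resets any CMD inherited from the base image,
# and no CMD is added here, so there is no baked-in default action.
ENTRYPOINT ["/usr/local/bin/trufflehog"]
```

With this in place, docker run --rm <image> --help runs /usr/local/bin/trufflehog --help inside the container, so the caller sees the tool's own usage text.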
Never put a secret in a layer — not even one you delete. Docker keeps every layer; a key in
RUN echo "key=x" is visible in docker history --no-trunc even if a later layer removes the file.
Secrets are passed at runtime (env vars, secret mounts, a read-only volume), never baked at build
time. Layer ordering is the cheaper companion rule: put what changes rarely near the top
(apt-get install ca-certificates) and what changes often near the bottom (COPY . /app), so the
cache does the work and rebuilds stay fast.
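A small sketch of both rules together, using the same example commands as the text. The anti-pattern is left commented out so the file stays safe to build; the /tmp/key path is hypothetical.

```dockerfile
FROM ubuntu:22.04

# Changes rarely: keep near the top so the cached layer is reused.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Anti-pattern, do not do this: the first step writes the key into a layer
# and records the command in the image history; the second step only hides
# the file in a newer layer, it does not remove the older one.
# RUN echo "key=x" > /tmp/key
# RUN rm /tmp/key

# Changes often: keep near the bottom so an edit rebuilds only from here.
COPY . /app
```

At run time the caller supplies the secret instead, for example docker run --rm -e API_KEY <image> <args>, where API_KEY is a hypothetical variable name read from the caller's environment.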
The gotcha
Two traps look harmless and aren't. A secret in RUN echo … survives in docker history even
after a later layer deletes the file: the deletion only adds a new layer on top, and the earlier
layer that holds the secret still ships with the image. And FROM
ubuntu:latest (or an unpinned tool version) means you cannot prove that the thing you ran today is
the same as last month's, which is fatal for a tool whose entire job is being the boring, auditable
artifact in your kit.
AI caveat
A model hands you a functional Dockerfile fast, and the gap to a production tool image is this
module's checklist. The omission to watch for is the checksum verification — models reliably
skip it, and it's the supply-chain control that makes the binary provably the one you intended.
Close the loop like a reviewer: checkov -f Dockerfile on the draft and the hardened version,
and confirm the hardened one clears the HIGH findings (non-root, etc.).
Learn (~2 hrs)
Dockerfile best practices (~1 hr)
- Building best practices — Docker docs — read the full page; covers layer ordering, pinning, multi-stage builds, and keeping images small. The reference for why the Dockerfile is shaped the way it is.
- Docker security — non-root, secrets, read-only — Docker docs — read the "Docker daemon attack surface" and run-as-non-root guidance; this is the runtime side of the hardening the lab applies.
- CIS Docker Benchmark — Center for Internet Security — skim sections 4 (container images) and 5 (container runtime); each benchmark item maps to a concrete Dockerfile instruction, so it doubles as your hardening checklist.
Minimal base images & the tool (~1 hr)
- Chainguard Images — "Why distroless?" — Chainguard — read the "Why distroless?" framing for the concrete argument that fewer packages means fewer CVEs; this is the case for the minimal-base stretch goal.
- 17 backdoored Docker Hub images removed — BleepingComputer (~10 min) — the docker123321 case (Fortinet/Kromtech): legit-looking images, ~5M pulls, cryptominers + reverse shells. Read it for why you pin and build, not blindly pull.
- trufflehog — GitHub README — the tool you're packaging; read the README's usage and the git/filesystem source sections so your image's entrypoint and --help expose the right surface.
Key concepts
- docker pull is running someone else's binary: the docker123321 images (17 backdoored, ~5M pulls, miners + reverse shells) are why you pin and build rather than trust a legit-looking name.
- Pin everything reproducible: the base image (ubuntu:22.04, not latest) and the tool version (v3.88.1) — verify the binary's checksum before install.
- Minimal base = fewer CVEs: prefer distroless/scratch/Chainguard where the tool allows it; less software is less attack surface.
- Non-root, always: useradd + USER before the entrypoint — CIS Docker Benchmark, a checkov HIGH if missing.
- ENTRYPOINT is the tool, no CMD: the image behaves like the command; --help reaches the real tool; no surprising default action.
- No secrets in any layer: a file deleted in a later layer still exists in the earlier one, and docker history shows the command that wrote it; pass secrets at runtime, never at build.
- Order layers stable→volatile: rare changes up top for cache, frequent changes (COPY) at the bottom.
AI acceleration
Ask a model to write the Dockerfile and it will hand you a functional one — and a revealing one,
because the gap between its first draft and a production tool image is the checklist this module
teaches. Then ask it to harden: pin the base and the tool version, add the checksum verification,
add a non-root user, drop unneeded packages, fix the entrypoint. The diff between the two drafts is
the list of things the model didn't consider by default; the omission to watch for is the checksum
step — models reliably skip it, and it is exactly the supply-chain control that makes the image
trustworthy. Then close the loop the way a reviewer would: run checkov -f Dockerfile on both
versions and confirm the hardened one clears the HIGH findings. AI drafts → you review every line
→ you own the image that ships.
Check yourself
- Why is FROM ubuntu:latest a supply-chain problem for a security tool specifically, not just a reproducibility nuisance?
- A teammate deletes a leaked key in a later RUN rm layer and says it's fixed. Why are they wrong, and what command proves it?
- Your image sets ENTRYPOINT ["/usr/local/bin/trufflehog"] and no CMD. What does docker run image --help do, and why is that the right behaviour for a tool image?