Module 10 — Container & Image Security¶
Type 1 · Concept Autopsy (+ Type 4 · Audit→Build→Verify): a working image tells you nothing about what's dormant in its layers; scan with trivy/grype and render a verdict on what's hidden. (Secondary, Audit→Build→Verify: triage CVEs by fixability, not count, and harden the image.)
Last reviewed: 2026-06
Cloud & Container Security — a container image is a tarball of someone else's decisions; "it works" and "it's clean" are orthogonal.
In 60 seconds
A container image is a tarball of someone else's decisions: the moment you write FROM, their CVEs
become your CVEs. "It works" and "it's clean" are orthogonal — a working image can hide a miner, a
reverse shell, or a secret baked into a layer that docker history reveals but running the container
never would. Scan with trivy and grype, but triage by fixability, not count: fixable
HIGH/CRITICAL = rebuild now, unfixed = track. The fix is a multi-stage rebuild to a minimal,
pinned base — ship the artifact, not the toolchain.
The case¶
Between mid-2017 and mid-2018, a Docker Hub account named docker123321 quietly published a set
of public images under the names of software people pull every day — cron, tomcat, mysql. They
worked. You could pull docker123321/cron, run it, and get a working cron. What you also got, baked
into the layers, was an XMRig Monero miner and, in cases Kromtech traced, an embedded reverse
shell. The campaign ran for roughly ten months before Fortinet published on it and Docker pulled
the images on 10 May 2018. By then docker123321/cron alone had been pulled over a million times,
the family over five million, and the actor had mined ~$90,000 in Monero on other people's
compute. Documented in
Kromtech's writeup
and Fortinet's threat research.
Three years later, the same logic ran in reverse. In the Codecov 2021 breach
(Codecov's own post-mortem), the attacker's entry
point was a credential left in a layer of one of Codecov's public Docker images. The attacker extracted it,
used it to alter the Bash Uploader, and exfiltrated the CI secrets of thousands of downstream projects. One
image leaked a key out; the docker123321 images smuggled malware in. Both are the same class of
failure, and both turned on a single fact: a container image is opaque, and "it runs" tells you
nothing about what else is dormant in its layers. So this module turns on one question:
This image runs your app perfectly in testing. What ELSE is in it?
Your job¶
By the end of this module you'll call what's hidden in a working image, then prove it — scan with
trivy and grype, triage CVEs by fixability (the load-bearing judgment, not the count), and audit
the Dockerfile for the hygiene failures a scan-for-CVEs misses. Then you do the half a scan-and-triage
skips: author a hardened multi-stage rebuild — minimal/distroless base, no build tools in the final
layer, secrets out of the image, base pinned by digest — and prove the rebuild measurably cuts the
attack surface. Finally you encode the verdict as a CI scan gate that fails the bad image and
passes the rebuild. That loop — what's dormant → prove it → rebuild it clean → make it un-recurrable —
is the exact motion of a container-security review.
Call it before you read on¶
Don't scroll. An image you pulled builds, the app's tests are green, it serves traffic. Write your gut answers — being wrong is the point.
Q1. You wrote zero of the vulnerable code; your FROM python:3.8-slim base "just works." Whose CVEs are you now shipping, and how many: roughly none, a handful, or dozens?
Q2. A trivy image scan comes back clean of HIGH/CRITICAL CVEs. Does that mean the image is safe to ship?
Q3. The image runs and passes every test. What can docker history or a layer scan still reveal that running the container never would?
The verdict, revealed¶
Hold your answers against these.
Q1 — you inherit everyone above you in the FROM chain. An image tagged python:3.8-slim is a
frozen snapshot of a Debian root filesystem plus a Python install plus their transitive packages —
curl, libssl, glibc, dozens more you never named. The moment you write FROM, their CVEs
become your CVEs, and you ship them whether you read them or not. Worse, tags are mutable; digests
are not. python:3.8-slim pulled today and the same tag pulled last year resolve to different
digests, and a container started from a cached old digest keeps running old, vulnerable bits until someone rebuilds it. Scanners
(trivy, grype) work by extracting the image's software bill of materials — every package and
version — and matching it against CVE feeds (NVD, GitHub Advisory, distro advisories). The naive guess
("I wrote clean code, so my image is clean") is exactly backwards: the supply-chain risk lives in the
base and package layers you didn't write, and a typical slim base will surface dozens of CVEs on
a fresh scan. You're not auditing your code; you're auditing someone else's decisions, inherited.
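To make the scanner mechanism concrete, here is a toy sketch of the matching step: an inventory of installed packages checked against an advisory feed. Everything in it is an assumption for illustration. The Advisory type, the purely numeric version scheme, and the CVE IDs and versions are invented; real scanners use distro-specific version comparison and much richer feeds.

```python
from typing import NamedTuple, Optional


class Advisory(NamedTuple):
    cve_id: str
    package: str
    fixed_in: Optional[str]  # None means no patched version exists yet


def parse_version(text: str) -> tuple:
    """Parse a purely numeric dotted version such as '1.2.13' (toy scheme)."""
    return tuple(int(part) for part in text.split("."))


def match_sbom(sbom: dict, feed: list) -> list:
    """Return the advisories that apply to the installed package versions."""
    hits = []
    for adv in feed:
        installed = sbom.get(adv.package)
        if installed is None:
            continue  # package is not in the image at all
        # Affected if no fix exists, or if the installed version predates the fix.
        if adv.fixed_in is None or parse_version(installed) < parse_version(adv.fixed_in):
            hits.append(adv)
    return hits


# Hypothetical inventory and feed: package names are real, versions and IDs are invented.
sbom = {"openssl": "1.1.1", "curl": "7.74.0", "zlib": "1.2.13"}
feed = [
    Advisory("CVE-0000-0001", "openssl", "1.1.2"),  # installed is older than the fix
    Advisory("CVE-0000-0002", "curl", None),        # no fix published
    Advisory("CVE-0000-0003", "zlib", "1.2.12"),    # installed is already past the fix
    Advisory("CVE-0000-0004", "nginx", "1.25.0"),   # package not installed
]
findings = match_sbom(sbom, feed)
print([f.cve_id for f in findings])
```

Notice that nothing here looks at your application code. The findings come entirely from packages the base image brought with it, which is the point of Q1.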
The mental model
An image is a stack of inherited decisions, not your code. You don't audit what you wrote — you audit
everything above you in the FROM chain, pinned to a digest so the bits can't drift out from under you.
Q2 — "no fixable HIGH/CRITICAL" is the answer, and even that isn't "safe." Two traps live here.
First, the count is a distraction; fixability is the verdict. A Critical CVE with no patch yet
available is something you track, not something you can fix today — gating your pipeline on it just
blocks every build for no action. A High with a fixed version in the distro's repo means rebuild
now. The right gate is severity-and-fixable (--exit-code 1 --severity HIGH,CRITICAL --ignore-unfixed),
not raw severity. Second, and this is the docker123321 lesson, CVE scanning answers
"are the packages patched," not "is anything malicious or wrong in here." A clean CVE report says
nothing about an embedded miner, a planted reverse shell, a secret in a layer, an extra binary, or a
container that runs as root with a debug port open. "It scanned clean" and "it's clean" are not the
same claim.
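The fixability triage can be sketched in a few lines over a scanner's JSON report. This is a minimal sketch, not the lab's gate: it assumes the report has the shape of Trivy's JSON output as I understand it (a Results list whose entries carry Vulnerabilities with Severity and an optional FixedVersion), so check those field names against your Trivy version. The sample findings are invented.

```python
GATED_SEVERITIES = {"HIGH", "CRITICAL"}


def triage(report: dict) -> tuple:
    """Split HIGH/CRITICAL findings into (rebuild_now, track) by fixability."""
    rebuild_now, track = [], []
    for result in report.get("Results", []):
        # A target with no findings may omit the key or carry null.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") not in GATED_SEVERITIES:
                continue
            # An empty or missing FixedVersion is treated as "no patch yet".
            if vuln.get("FixedVersion"):
                rebuild_now.append(vuln)
            else:
                track.append(vuln)
    return rebuild_now, track


def gate_exit_code(report: dict) -> int:
    """Fail (1) only when a fixable HIGH/CRITICAL exists; otherwise pass (0)."""
    rebuild_now, _ = triage(report)
    return 1 if rebuild_now else 0


# Hypothetical report: IDs and versions are placeholders, not real advisories.
sample_report = {
    "Results": [
        {
            "Target": "example-image (debian)",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "PkgName": "openssl",
                 "Severity": "HIGH", "FixedVersion": "3.0.99"},
                {"VulnerabilityID": "CVE-0000-0002", "PkgName": "glibc",
                 "Severity": "CRITICAL"},
                {"VulnerabilityID": "CVE-0000-0003", "PkgName": "curl",
                 "Severity": "LOW", "FixedVersion": "8.99.0"},
            ],
        },
        {"Target": "app/requirements.txt"},  # no findings for this target
    ]
}

rebuild_now, track = triage(sample_report)
print("rebuild now:", [v["VulnerabilityID"] for v in rebuild_now])
print("track:", [v["VulnerabilityID"] for v in track])
print("gate exit code:", gate_exit_code(sample_report))
```

The unfixed Critical lands in the track list and does not fail the gate; the fixable High does. A pass from this gate still says nothing about miners, secrets, or configuration, which is the second trap above.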
Q3 — the layers remember what the running container hides. This is the docker123321/Codecov
heart. A running container shows you the intended process; the image layers show you everything
ever added, including things deleted in a later layer (still recoverable) and credentials baked into an
ENV or ARG (readable by anyone who can pull the image, via docker history / docker inspect).
That's literally how Codecov's attacker got in: a key sitting in a published layer. So the second half
of image security isn't CVE counting at all — it's hygiene and provenance: does it run as a
non-root USER; is the base pinned to a digest you can audit, not latest; are secrets injected at
runtime instead of baked in; is the final layer free of build tools, package managers, and shells an
attacker could pivot through? trivy config and hadolint catch this configuration class that a CVE
scan walks right past.
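The hygiene questions above can be sketched as a small Dockerfile check. This is a deliberately naive illustration, not a substitute for hadolint or trivy config: the audit_dockerfile function and its secret-name pattern are hypothetical, it ignores line continuations, and it only sees the Dockerfile, so it cannot know whether a base image already sets a non-root default user.

```python
import re

# Hypothetical heuristic: variable names that often hold credentials.
SECRET_NAME = re.compile(r"(PASSWORD|SECRET|TOKEN|API_?KEY|CREDENTIAL)", re.IGNORECASE)


def audit_dockerfile(text: str) -> list:
    """Flag unpinned bases, secret-looking ENV/ARG names, and a root final stage."""
    findings = []
    stages = set()      # names declared with "FROM ... AS name"
    final_user = None   # USER in effect for the last stage seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        rest = parts[1] if len(parts) > 1 else ""
        if instruction == "FROM":
            final_user = None  # each stage starts over; only the last one ships
            tokens = [t for t in rest.split() if not t.startswith("--")]
            image = tokens[0] if tokens else ""
            is_local = image.lower() in stages or image.lower() == "scratch"
            if not is_local and "@sha256:" not in image:
                findings.append(f"unpinned base: {image}")
            if len(tokens) >= 3 and tokens[1].upper() == "AS":
                stages.add(tokens[2].lower())
        elif instruction in ("ENV", "ARG") and SECRET_NAME.search(rest):
            name = rest.split("=")[0].split()[0]
            findings.append(f"possible secret in {instruction}: {name}")
        elif instruction == "USER":
            final_user = rest.split(":")[0].strip()
    if final_user in (None, "", "root", "0"):
        findings.append("final stage runs as root")
    return findings


# Hypothetical bad Dockerfile; the token value is a placeholder, not a real secret.
BAD = """\
FROM python:3.8-slim
ENV API_TOKEN=example-not-a-real-token
RUN pip install flask
COPY . /app
CMD ["python", "/app/main.py"]
"""

for finding in audit_dockerfile(BAD):
    print(finding)
```

None of these three findings would appear in a CVE report, which is why the configuration pass is a separate step from the vulnerability scan.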
The gotcha
"It scanned clean" is not "it's clean." A CVE scan answers "are the packages patched" — it says nothing about an embedded miner, a planted reverse shell, a secret in a layer, or a container running as root with a debug port open. Counting CVEs while ignoring hygiene and provenance is the trap.
Go deeper: fixability is the verdict, not the count
A Critical with no patch yet is something you track, not fix — gating the pipeline on it just blocks
every build for no action. A High with a fixed version in the distro repo means rebuild now. So the
right gate is severity-and-fixable (--exit-code 1 --severity HIGH,CRITICAL --ignore-unfixed), and
one scanner is one opinion — run trivy and grype and reconcile.
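Reconciling two scanners can be as simple as a set comparison, sketched below. It assumes you have already reduced each tool's output to (CVE ID, package name) pairs; that extraction step and the sample pairs are hypothetical and not shown. The two tools may also name the same package differently, so treat a one-sided finding as a prompt to look, not as proof of disagreement.

```python
def reconcile(trivy: set, grype: set) -> dict:
    """Bucket (cve_id, package) pairs by which scanner reported them."""
    return {
        "both": sorted(trivy & grype),        # corroborated by two opinions
        "trivy_only": sorted(trivy - grype),  # investigate: feed or naming difference?
        "grype_only": sorted(grype - trivy),
    }


# Hypothetical normalized findings; the IDs are placeholders.
trivy_findings = {("CVE-0000-0001", "openssl"), ("CVE-0000-0002", "glibc")}
grype_findings = {("CVE-0000-0001", "openssl"), ("CVE-0000-0005", "zlib")}

for bucket, items in reconcile(trivy_findings, grype_findings).items():
    print(bucket, items)
```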
The fix is a rebuild, not a patch. You can't apt upgrade your way out of an inherited base — you
rebuild from a minimal, current, pinned base. A multi-stage build does the work in a fat "builder"
stage and copies only the artifact into a tiny final stage (distroless or -slim), so the build
tools, package manager, and shell that an attacker would use never ship. Fewer packages → smaller SBOM
→ fewer inherited CVEs → smaller blast radius, all at once. "It works" and "it's clean" are
orthogonal axes; the rebuild is how you move on the second one without losing the first.
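One rough way to show that a rebuild moved you along the second axis is to compare package inventories before and after. The sketch below assumes you have each image's package names as a set (for example, pulled from an SBOM); the surface_report function, the RISKY_TOOLS list, and both inventories are hypothetical choices for illustration.

```python
# Hypothetical list of tooling an attacker could pivot through if it ships.
RISKY_TOOLS = {"bash", "dash", "busybox", "apt", "dpkg", "apk-tools", "gcc", "make"}


def surface_report(before: set, after: set) -> dict:
    """Summarize what a rebuild removed and which risky tools still ship."""
    return {
        "packages_before": len(before),
        "packages_after": len(after),
        "removed": sorted(before - after),
        "tools_still_shipped": sorted(after & RISKY_TOOLS),
    }


# Hypothetical inventories for a fat single-stage image and its multi-stage rebuild.
fat_image = {"bash", "apt", "dpkg", "gcc", "make", "curl", "libssl3", "python3"}
rebuilt_image = {"libssl3", "python3"}

report = surface_report(fat_image, rebuilt_image)
for key, value in report.items():
    print(key, value)
```

A smaller inventory is evidence, not a verdict: rescan the rebuilt image, because the packages that remain can still carry fixable findings.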
AI caveat
A model is a strong first-pass triage of trivy --format json + docker history — it orders findings
by fixability and proposes the minimum base bump. But it sees the SBOM, not the call graph: it can't
tell you whether the vulnerable path is reachable in your app, and it won't flag a planted miner or a
secret-in-a-layer no CVE feed lists. Its ranking is a hypothesis; you own the reachability call.
Learn (~3 hrs)¶
These readings cover the scanner mechanism; the trust argument above is the spine. Read the case first.
The supply-chain anchor in the practitioner's words (~30 min)
- Kromtech — Cryptojacking invades the cloud (the docker123321 campaign) (~15 min) — the discovering researcher's writeup; read for how a working image hides a miner + reverse shell and the pull counts.
- Codecov — April 2021 post-mortem (~10 min) — first-party RCA; note the entry point was a credential in a Docker image layer, and that the remediation was "squash / convert public images to multistage." The fix is this module.
Scanning: SBOM, severity, and fixability (~1 hr)
- Trivy — Filtering scan results (~30 min) — official docs; read the "By Severity" and "By Status" (--ignore-unfixed) sections so the fixable-vs-unfixable judgment is concrete, not abstract.
- Anchore Grype — README & supported ecosystems (~15 min) — skim how Grype's DB sources differ from Trivy's; the lab runs both because one scanner is one opinion.
- NVD — CVSS v3.1 scoring (~15 min, orient) — read the Base-Score vectors (AV/AC/PR/UI) so you can say "Critical in theory, unreachable in this context."
Hardening the artifact (~1 hr)
- Docker — Multi-stage builds (~20 min) — the mechanism for shipping only the artifact, not the toolchain. This is the graded rebuild.
- GoogleContainerTools — distroless (~15 min) — read the README's rationale: no shell, no package manager, minimal CVE surface — why the final stage should be near-empty.
- OWASP — Docker Security Cheat Sheet (~20 min) — maps one-to-one onto the Dockerfile.bad findings: non-root USER, no secrets in layers, pinned base, no --privileged.
Key concepts¶
- FROM inherits the base's CVEs and its trust: you ship someone else's decisions whether you read them or not
- Tags are mutable, digests are immutable: pin @sha256:... and rebuild on a schedule, don't trust a tag
- Fixability, not count, is the verdict: fixable HIGH/CRITICAL = rebuild now; unfixed = track; gate on severity-and-fixable
- "Scanned clean of CVEs" ≠ "clean": miners, reverse shells, secrets-in-layers, and root/--privileged config all pass a CVE scan
- Image layers remember what the running container hides: docker history/inspect reveal baked-in secrets and deleted files
- The fix is a multi-stage rebuild to a minimal/distroless base, not a patch: ship the artifact, not the toolchain
- One scanner is one opinion: run trivy and grype and reconcile
- MITRE ATT&CK T1195.002, Supply Chain Compromise: Compromise Software Supply Chain
AI acceleration¶
Paste your trivy image --format json output and docker history into a model and ask it to rank the
findings by exploitability for your workload (e.g. internet-facing API, non-root). It's a strong
first-pass triage — good at ordering by fixability and proposing the minimum base upgrade that closes
the most CVEs. But it sees the SBOM, not the call graph: it cannot tell you whether the vulnerable
code path is actually reachable from your application, and it will not flag a planted miner or a
secret-in-a-layer that a CVE feed doesn't list. Treat its ranking as a hypothesis; you own the
reachability call before you file a remediation ticket, and you own the rebuild. AI drafts → you review
→ you ship it.
Check yourself
- Your FROM python:3.8-slim base "just works": whose CVEs are you shipping, and why doesn't writing clean code reduce that count?
- A trivy image scan is clean of fixable HIGH/CRITICAL. Name two dangerous things that report still does not rule out.
- Why is the fix a multi-stage rebuild rather than apt upgrade, and what does the final stage not contain?