Lab 09: Unpacking & Obfuscation
Setup
git clone https://github.com/plaintext-security/plaintext-labs
cd plaintext-labs/malware/09-unpacking-obfuscation
make up
make fetch-sample # pulls a real, genuinely UPX-packed sample from MalwareBazaar into the isolated container
make demo
⚠ This lab unpacks a live malware sample. Handle it accordingly, and never execute it.
- Static only. This module unpacks and reads; it never runs the sample. The tools are entropy, section names, strings, upx -d, and YARA. Unpacking is a static transform; do not detonate the binary or the payload it carries.
- Isolation. All work stays inside the isolated container; never copy the sample to your host.
- Hygiene. The sample is fetched at lab time (password-protected zip, password infected) and is never committed: .gitignore covers samples/. Running make fetch-sample requires a free abuse.ch Auth-Key (set MB_AUTH_KEY). The upx tag is the reliable way to guarantee a genuinely UPX-packed binary.
- Offline fallback. No key, or MalwareBazaar unreachable? Skip make fetch-sample; make demo falls back to the bundled synthetic target (compiled from the benign data/target.c and packed in-container with UPX) plus data/encoded_strings.py for the string-deobfuscation half.
Scenario
A triage queue drops a Windows executable that an EDR flagged as "high entropy, likely packed." Run strings on it and you get almost nothing: a short stub, the import table mostly gone, and a tell-tale UPX0 / UPX1 section pair. That is the point of packing: the real code is compressed and only decompresses in memory at runtime, so static tooling sees a near-empty shell. This is MITRE ATT&CK T1027.002 (Software Packing), and UPX is the canonical packer: because it ships a standard unpack stub, upx -d reverses it back to the original payload without ever running it.
Your job is to defeat two layers of obfuscation, statically. First, confirm the binary is packed (entropy above 7.0 bits per byte, the UPX0/UPX1 section names, the UPX banner), unpack it with upx -d, and re-run strings and the import dump to reveal everything packing hid. Second, decode a loader script's XOR-obfuscated strings to recover the IOCs hidden from naive string extraction. The real packed sample (make fetch-sample) is the primary artifact; the bundled synthetic target + encoded_strings.py are the offline fallback that the demo and steps below exercise identically.
Throughout, "the sample" means the real UPX-packed PE that make fetch-sample drops in samples/ (path printed by the target). In offline mode it is the bundled synthetic target, compiled and packed in-container by make demo.
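The 7.0 threshold used in this lab is Shannon entropy measured in bits per byte, which ranges from 0 (a single repeated byte value) to 8 (all 256 byte values equally likely). The sketch below shows that calculation. It is an illustration only and is not the lab's entropy_check.py, whose internals are not shown on this page; the function name shannon_entropy and the in-memory inputs are assumptions made for the example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum(-p * log2(p)) over the byte values that actually occur.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Synthetic inputs (assumptions for illustration, not lab data).
    uniform = bytes(range(256)) * 16  # every byte value equally often
    constant = b"A" * 4096            # one value only
    print(shannon_entropy(uniform))   # upper bound: 8.0
    print(shannon_entropy(constant))  # lower bound: 0.0
```

Compressed or encrypted data approaches the upper bound because compression removes the byte-level redundancy that ordinary code and text carry, which is why a packed section scores high.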
Do
The steps read against the sample: the real UPX-packed PE in /lab/samples/ after make fetch-sample (use the path it printed). If you skipped the fetch, work the offline fallback in the indented notes: a synthetic target you pack yourself, which exercises the identical mechanism.
- [ ] Confirm the sample is packed: entropy + section names. Run python3 /lab/data/entropy_check.py /lab/samples/<sha256> and record the entropy. A packed binary runs hot: sections (or the whole file) score above 7.0. Then dump the section names (strings /lab/samples/<sha256> | grep -iE 'UPX|This file is packed', or objdump -h for the PE section table) and find the UPX0/UPX1 pair and the UPX banner. Those two signals together, high entropy and UPX section names, are your "this is UPX-packed" verdict.
  Hint: entropy below ~6.5 is typical for unpacked code; a UPX-compressed section sits above 7.0. This is T1027.002.
  Offline fallback: start from the clean synthetic binary. Run sha256sum /lab/data/target and python3 /lab/data/entropy_check.py /lab/data/target (this records a low-entropy baseline), then cp /lab/data/target /tmp/target_packed && upx --best /tmp/target_packed to create your own packed copy. Re-run entropy_check.py on /tmp/target_packed and watch entropy cross 7.0; that delta is the teachable signal.
- [ ] Snapshot what packing hid: strings and imports before unpacking. Run strings and (for the PE) list imports (objdump -p, or pefile if you brought it) on the still-packed sample. You should see almost nothing of substance: a short stub and a near-empty import table. Save this "before" view; the contrast in step 4 is the lesson.
- [ ] Unpack with the UPX stub. Run upx -d /lab/samples/<sha256> -o /tmp/unpacked (the -o flag writes the result to a copy; without it, upx -d rewrites the file in place). UPX decompresses cleanly because it ships a standard unpack stub, so no execution is required. Re-run entropy_check.py on /tmp/unpacked: entropy should drop back toward normal, confirming the compressed layer is gone.
  Hint: if upx -d errors, the sample may be a UPX-lookalike or have had its header tampered with. Note it and try another sample from the tag.
  Offline fallback: upx -d /tmp/target_packed -o /tmp/target_unpacked, then sha256sum /tmp/target_unpacked and compare against the clean-binary hash from step 1. They must match, proving upx -d is loss-free. (For the real sample you have no clean reference hash, so the drop in entropy and the restored strings/imports are your proof instead.)
- [ ] Re-analyse the unpacked payload: strings and imports after. Re-run strings and the import dump on /tmp/unpacked. Compare against your step-2 "before" snapshot: the import table is now populated and real strings (URLs, paths, error messages) appear. This is why you unpack: the indicators packing concealed are now visible for triage.
- [ ] Decode the obfuscated loader strings. Packing hides a binary's strings; in-script obfuscation hides them inside source. Examine data/encoded_strings.py, an XOR-encoded byte array with a decode loop. Identify the key byte and the encoded blobs, then run python3 /lab/data/decode.py to recover the IOCs (a C2 URL, a registry key, a mutex). Record them.
- [ ] Map to ATT&CK. The packing maps to T1027.002. The string encoding maps to T1027.013. Record both IDs in your notes alongside the evidence (the entropy before/after, the UPX section names, the restored imports, and the recovered IOCs).
- [ ] Author a YARA rule on the UPX section names and prove it (the build half). Detecting the packer is only half the job; turn the indicator into a detection. Write upx-packed.yar so that it keys on the UPX strings you confirmed in step 1 (UPX0, UPX1, and the $Info: This file is packed with the UPX banner), requiring more than one of them in the condition so a stray substring can't trip it. Then prove the two-sided result. Run it against the packed sample (the real /lab/samples/<sha256>, or /tmp/target_packed in the fallback): it must match. Run it against the unpacked twin (/tmp/unpacked, or /tmp/target_unpacked): it must not match. Run it against /bin/ls (a benign, unpacked control): it must stay quiet too. That no-match on the unpacked twin is the whole point: the rule detects packing, not the payload, so unpacking the sample makes it disappear from the rule. If it fires on the unpacked binary, your strings leaked out of the UPX stub; narrow them to the section names only. Defeating the obfuscation and authoring the packer-detection rule that proves you recognised it are equal halves.
  Hint: yara /path/to/rule /path/to/file; no pe module is needed. These are plain section-name strings, so an any of them / 2 of them string condition is enough.
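The condition logic of the rule in the last step can be prototyped in plain Python before writing any YARA. The sketch below is not YARA and does not replace upx-packed.yar; it only mirrors the idea of requiring at least two distinct markers so that one stray substring cannot trigger a match. The marker list comes from the step text; the names count_upx_markers and looks_upx_packed, the minimum parameter, and the in-memory test blobs are assumptions made for illustration.

```python
# Byte markers named in the step: two section names and the UPX banner prefix.
UPX_MARKERS = (
    b"UPX0",
    b"UPX1",
    b"$Info: This file is packed with the UPX",
)


def count_upx_markers(data: bytes) -> int:
    """Count how many distinct markers occur at least once in data."""
    return sum(1 for marker in UPX_MARKERS if marker in data)


def looks_upx_packed(data: bytes, minimum: int = 2) -> bool:
    """Mimic a '2 of them' condition: require several distinct markers."""
    return count_upx_markers(data) >= minimum


if __name__ == "__main__":
    # Synthetic stand-ins (assumptions), not real binaries.
    packed_like = b"MZ\x00" + b"UPX0\x00\x00\x00\x00" + b"UPX1\x00\x00\x00\x00"
    stray_hit = b"log line mentioning UPX0 once"
    print(looks_upx_packed(packed_like))  # True: two distinct markers present
    print(looks_upx_packed(stray_hit))    # False: only one marker present
```

Running the same three-way check as the step (packed sample, unpacked twin, benign control) against this prototype is a quick way to sanity-check the marker choice before committing it to a rule.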
Success criteria: you're done when
- [ ] The packed sample is confirmed packed: entropy > 7.0 and the UPX0/UPX1 section names (or UPX banner) are present.
- [ ] upx -d succeeds and entropy drops back toward normal on the unpacked output (offline fallback: pre-pack and post-unpack SHA-256 hashes match).
- [ ] The before/after strings (and import) comparison shows the indicators packing had hidden.
- [ ] decode.py output shows the plaintext IOC strings.
- [ ] Your notes contain both ATT&CK technique IDs with evidence.
- [ ] upx-packed.yar matches the packed sample and does not match the unpacked twin (or /bin/ls): the build half, proven two-sided.
Deliverables
Commit to your portfolio repo:
- packing-notes.txt — entropy before/after, the confirmed UPX section names, the before/after strings/import comparison (offline fallback: pre-pack and post-unpack hashes with equivalence confirmed), recovered IOC strings, and ATT&CK IDs.
- upx-packed.yar — the authored packer-detection rule, with the match (packed sample) / no-match (unpacked twin, /bin/ls) proof recorded in packing-notes.txt.
- Do not commit the fetched sample, any unpacked payloads, compiled binaries, or entropy reports containing sample hashes.
Automate & own it
Required. Write triage_packed.sh — a shell script that accepts a binary path as its argument and:
1. Computes and prints the SHA-256 hash.
2. Runs strings and checks for UPX signatures, printing [PACKED] UPX detected or [CLEAN] No UPX signature.
3. If UPX is detected, attempts to unpack with upx -d, then re-hashes the result and prints both hashes with a match/mismatch verdict.
Draft the script with AI, then test it against both the packed and clean binary to confirm both code paths work. Commit triage_packed.sh.
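The deliverable itself is a shell script, but the hashing and match/mismatch logic can be prototyped in Python first, which makes it easy to cross-check your script's output against an independent implementation. This is a sketch under stated assumptions: the function names sha256_stream and hash_verdict and the [MATCH] / [MISMATCH] labels are invented for this example and are not prescribed by the lab.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_stream(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks so large samples are not loaded at once."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def hash_verdict(reference: str, candidate: str) -> str:
    """Compare two hex digests case-insensitively and return a verdict label."""
    return "[MATCH]" if reference.lower() == candidate.lower() else "[MISMATCH]"


if __name__ == "__main__":
    # In-memory stand-ins (assumptions) for the clean and unpacked binaries.
    clean = io.BytesIO(b"stand-in for the clean binary")
    unpacked = io.BytesIO(b"stand-in for the clean binary")
    before, after = sha256_stream(clean), sha256_stream(unpacked)
    print(before)
    print(after)
    print(hash_verdict(before, after))
```

For a real file, open it in binary mode and pass the handle, for example with open(path, "rb") as fh: sha256_stream(fh). The digest should equal what sha256sum prints for the same file.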
AI acceleration
Paste data/encoded_strings.py into a model and prompt: "This Python script XOR-encodes strings. Identify the key, decode all encoded byte arrays, and list the plaintext strings." Verify each decoded string manually by running decode.py and comparing. Note any discrepancy.
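When checking a model's answer by hand, it helps to know that single-byte XOR is its own inverse and that a known plaintext prefix reveals the key. The sketch below demonstrates both. It does not reproduce data/encoded_strings.py or decode.py: the key 0x5A, the placeholder URL, and the helper names xor_bytes and recover_key are all assumptions chosen for illustration, and the lab's script may use a different key or scheme.

```python
from typing import Optional


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; the same call encodes and decodes."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in data)


def recover_key(encoded: bytes, known_prefix: bytes) -> Optional[int]:
    """Derive the key from a guessed plaintext prefix; None if the guess is inconsistent."""
    if not encoded or not known_prefix:
        return None
    candidates = {e ^ p for e, p in zip(encoded, known_prefix)}
    return candidates.pop() if len(candidates) == 1 else None


if __name__ == "__main__":
    # Hypothetical key and placeholder plaintext, not taken from the lab files.
    key = 0x5A
    encoded = xor_bytes(b"http://c2.example.invalid/gate", key)
    guessed = recover_key(encoded, b"http")
    print(encoded.hex())
    print(guessed == key)
    print(xor_bytes(encoded, key).decode("ascii"))
```

If a model reports a key, feeding that key to a routine like xor_bytes and comparing against decode.py's output is a direct way to spot the discrepancies the exercise asks you to note.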
Connects forward
Module 10 covers the case where static unpacking fails — the packer is custom and only unpacks at runtime. You will use strace and LD_PRELOAD tricks to extract the payload from a running process without a debugger.
Marketable proof
"I can identify a packed binary by entropy signature, apply a static unpacker, verify payload integrity by hash, and decode XOR-obfuscated strings from loader scripts — producing ATT&CK-tagged evidence for an IR case file."
Stretch
- Modify the UPX section names with upx --force plus a hex-editor patch (rename UPX0/UPX1 to arbitrary 4-byte names, a trick real loaders use to defeat naive packer rules), then confirm your step-7 rule now misses the packed binary. Add a second condition to upx-packed.yar that keys on a packer-independent signal instead (the high .text-section entropy or the UPX decompressor-stub byte pattern) and show it re-catches the renamed sample. This is the arms race in miniature: string rules are brittle, structural ones are not.
- Extend triage_packed.sh to run your upx-packed.yar as part of its detection step, so the script both classifies (rule fires) and unpacks in one pass.
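To see why a structural signal survives a section rename, it helps to read the PE section table directly and score each section's raw data. The following stdlib-only sketch does that with struct, following the standard PE/COFF header layout. It assumes a well-formed PE image and does no hardening against truncated or malicious headers; a library such as pefile would be more robust. The function names entropy and pe_sections are hypothetical helpers for this example.

```python
import math
import struct
from collections import Counter
from typing import List, Tuple


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())


def pe_sections(image: bytes) -> List[Tuple[str, float]]:
    """Return (section name, entropy of its raw data) for each PE section."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header: NumberOfSections at +2, SizeOfOptionalHeader at +16.
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", image, coff + 2)
    (opt_size,) = struct.unpack_from("<H", image, coff + 16)
    table = coff + 20 + opt_size  # section table follows the optional header
    results = []
    for i in range(num_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = image[entry:entry + 8].rstrip(b"\x00").decode("ascii", "replace")
        # SizeOfRawData at +16, PointerToRawData at +20 within the header.
        raw_size, raw_ptr = struct.unpack_from("<II", image, entry + 16)
        results.append((name, entropy(image[raw_ptr:raw_ptr + raw_size])))
    return results
```

Called as pe_sections(open(path, "rb").read()) inside the container, this lists each section with its entropy. After a rename, the names change but the compressed section's entropy should stay high, which is the kind of signal the second rule condition is meant to capture.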