Lab 03 — Static Analysis — Strings & PE¶
Setup¶
git clone https://github.com/plaintext-security/plaintext-labs
cd plaintext-labs/malware/03-static-strings-pe
make up
make fetch-sample # pulls a real Agent Tesla sample from MalwareBazaar into the isolated container
make demo
⚠ This lab analyzes a live malware sample. Handle it accordingly.
- Static only. This module never executes the sample: it uses strings, pefile, entropy, and YARA. Do not run it.
- Isolation. All work stays inside the isolated container; never copy the sample to your host.
- Hygiene. The sample is fetched at lab time (password-protected zip, password "infected") and is never committed; .gitignore covers samples/. make fetch-sample needs a free abuse.ch Auth-Key (set MB_AUTH_KEY).
- Offline fallback. No key, or MalwareBazaar unreachable? Skip make fetch-sample; make demo falls back to the bundled synthetic loader.exe, and util.dll stays the benign control.
Scenario¶
A triage queue drops a sample an email gateway flagged as a likely Agent Tesla infostealer — the family FortiGuard dissects in the module's Learn path. Before anyone detonates it, the team wants a structured static metadata dump for the case ticket: imports, strings of interest, compile timestamp, and section entropy. Your job is to produce that dump from the real sample using strings, pefile, and analyze_pe.py, then write a detection rule from what you find — and confirm the strings/IAT line up with the keylogging + credential-theft + SMTP-exfil behaviour the FortiGuard writeup documents.
Throughout, the sample = the real Agent Tesla PE that make fetch-sample drops in samples/ (path printed by the target). In offline mode it is the bundled synthetic loader.exe; util.dll (or /bin/ls) remains the benign no-match control.
Do¶
- [ ] Run strings against the sample. Capture all printable strings (minimum length 6). Categorise the output into: file paths, registry keys, IP addresses or URLs, error messages, and "other interesting." Do you see the credential-store paths, the SMTP server / port 587, or a mutex? What does the string set tell you about the binary's intended behaviour?
- [ ] Dump the Import Address Table (IAT) for the sample (and util.dll as control). Use pefile in Python to list every imported DLL and every imported function. For each function you don't recognise, look it up on MalAPI.io. Flag the keylogging combination (SetWindowsHookEx / GetAsyncKeyState) and anything else in MalAPI's "suspicious" category.
- [ ] Parse the COFF timestamp. Use pefile to extract the TimeDateStamp from the COFF header and convert it to a human-readable date. Does the timestamp look plausible? (Check PE.FILE_HEADER.TimeDateStamp; convert with datetime.utcfromtimestamp(), or with datetime.fromtimestamp(ts, tz=timezone.utc) on Python 3.12+, where the former is deprecated.)
- [ ] Extract per-section entropy. For each PE section, calculate Shannon entropy and record the section name, raw size, and entropy. Flag any section above 7.0 as potentially packed or encrypted.
- [ ] Run analyze_pe.py and review the JSON output. Run make demo, which executes the analysis script against all samples. Verify the JSON output matches your manual findings from steps 1–4. Fix any discrepancy.
- [ ] Write the analysis note. In static-analysis-note.md, for loader.exe: summarise the imports, flag any suspicious API combinations (referencing MalAPI.io), note the timestamp, and give a verdict on whether this binary warrants dynamic analysis.
- [ ] Author a YARA rule from your findings and prove it (the build half). Reading the metadata is only half the job; now turn the highest-signal findings into a detection. Write a YARA rule static-strings-pe.yar that keys on what this sample actually exposes: the suspicious-API combination you flagged in step 2 (the import names as strings) plus one distinctive string from step 1 (a mutex, log path, or URL), gated on pe.is_pe. Then prove the two-sided result: yara static-strings-pe.yar loader.exe must match, and yara static-strings-pe.yar util.dll (the benign control from the same case) must not match. If no benign PE is at hand, point it at a known-good binary like /bin/ls; it must stay quiet there too. A rule that fires on the benign control is keyed on the wrong thing; narrow it until only loader.exe matches. Recording the verdict and authoring the rule that proves you understood it are equal halves. (Hint: yara /path/to/rule /path/to/file; use the pe module for the PE check and require all the chosen strings in the condition so a single shared import can't trigger it.)
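For cross-checking the manual findings in steps 1, 3, and 4, the three calculations can be sketched with the standard library alone. This is a minimal sketch, not the contents of analyze_pe.py: the function names and the CATEGORIES buckets are hypothetical, and the regexes are deliberately rough starting points.

```python
import math
import re
from collections import Counter
from datetime import datetime, timezone

# Printable ASCII runs of at least 6 bytes, mirroring `strings -n 6`.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{6,}")

# Hypothetical, rough buckets. First match wins, so order matters.
CATEGORIES = {
    "url_or_ip": re.compile(r"https?://|\b\d{1,3}(?:\.\d{1,3}){3}\b"),
    "registry_key": re.compile(r"\b(?:HKLM|HKCU|HKEY_[A-Z_]+)\\"),
    "file_path": re.compile(r"[A-Za-z]:\\|%[A-Za-z]+%"),
}


def extract_strings(data: bytes) -> list[str]:
    """Return every printable ASCII run of length >= 6 found in data."""
    return [m.group().decode("ascii") for m in ASCII_RUN.finditer(data)]


def categorise(s: str) -> str:
    """Assign a string to the first matching bucket, else 'other'."""
    for label, pattern in CATEGORIES.items():
        if pattern.search(s):
            return label
    return "other"


def coff_timestamp_utc(ts: int) -> str:
    """Convert a COFF TimeDateStamp (seconds since the epoch) to ISO 8601 UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())
```

To apply it, read the sample with pathlib.Path(path).read_bytes() inside the container and pass the bytes to extract_strings. For entropy, each pefile section exposes its bytes through get_data(), and pefile also provides a get_entropy() method on sections that can serve as a second opinion.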
Success criteria — you're done when¶
- [ ] IAT dump for both PE files is complete and every function is labelled (suspicious / benign / unknown).
- [ ] Compile timestamp is parsed and assessed.
- [ ] Section entropy table is complete; any high-entropy sections are flagged.
- [ ] static-analysis-note.md exists with a verdict and reasoning.
- [ ] analyze_pe.py runs cleanly and produces valid JSON output.
- [ ] static-strings-pe.yar matches loader.exe and does not match util.dll (or /bin/ls): the build half, proven two-sided.
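The first criterion (every imported function labelled suspicious / benign / unknown) could be scripted roughly as follows. This is a sketch under stated assumptions: list_imports relies on the third-party pefile package, the helper names are hypothetical, the two SUSPICIOUS entries are only the APIs this lab names, and the BENIGN entries are placeholders. The real lists are yours to build from MalAPI.io.

```python
# Illustrative lists only. SUSPICIOUS holds the two APIs named in the lab text;
# BENIGN holds hypothetical placeholder entries.
SUSPICIOUS = {"SetWindowsHookEx", "GetAsyncKeyState"}
BENIGN = {"GetLastError", "ExitProcess"}


def label_import(func: str, suspicious=SUSPICIOUS, benign=BENIGN) -> str:
    """Label one imported function as suspicious, benign, or unknown."""
    # Win32 exports often carry an A (ANSI) or W (Unicode) suffix, so also try
    # the name without it: SetWindowsHookExW -> SetWindowsHookEx.
    candidates = [func]
    if func[-1:] in ("A", "W"):
        candidates.append(func[:-1])
    if any(c in suspicious for c in candidates):
        return "suspicious"
    if any(c in benign for c in candidates):
        return "benign"
    return "unknown"


def list_imports(path: str) -> dict[str, list[str]]:
    """Map each imported DLL to its imported function names."""
    import pefile  # third-party; imported lazily so label_import works without it

    pe = pefile.PE(path)
    table: dict[str, list[str]] = {}
    # DIRECTORY_ENTRY_IMPORT is absent when the file has no import directory.
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        dll = entry.dll.decode(errors="replace")
        table[dll] = [
            imp.name.decode(errors="replace") if imp.name else f"ordinal_{imp.ordinal}"
            for imp in entry.imports
        ]
    return table
```

One caveat worth checking against your own dump: if the sample turns out to be a .NET assembly, the native import table may list little beyond mscoree.dll, and the API names of interest would then surface in the strings output rather than in the IAT.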
Deliverables¶
analyze_pe.py (see Automate & own it), static-analysis-note.md, static-strings-pe.yar (with the match/no-match proof recorded in the note). Commit all three.
Automate & own it¶
Required. analyze_pe.py is provided as a starting point in data/. Extend it to also: (1) output a list of strings matching any of the patterns in a simple patterns file (data/string-patterns.txt) — one regex per line — and (2) add a "verdict" key to the JSON that is "packed" if any section entropy >= 7.0, "suspicious" if any import is in a hardcoded list of high-risk APIs, and "benign" otherwise. AI can draft the pattern-matching extension; you write the high-risk API list yourself by hand after reviewing MalAPI.io.
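The two extensions can be prototyped as pure functions before wiring them into analyze_pe.py. This is a minimal sketch, assuming the patterns file holds one regex per line as described above. The function names are hypothetical, the high-risk list is passed in rather than defined here because you write it by hand, and the precedence (packed before suspicious before benign) is an assumption taken from the order in which the requirement lists the verdicts.

```python
import re


def load_patterns(text: str) -> list[re.Pattern]:
    """Compile one regex per line of the patterns file, skipping blank lines."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def matching_strings(strings, patterns):
    """Keep the strings that match at least one pattern, in original order."""
    return [s for s in strings if any(p.search(s) for p in patterns)]


def verdict(section_entropies, imports, high_risk):
    """Return 'packed', 'suspicious', or 'benign' for one binary."""
    # Entropy is checked first, so a packed binary is reported as packed
    # even if it also imports a high-risk API.
    if any(e >= 7.0 for e in section_entropies):
        return "packed"
    if any(name in high_risk for name in imports):
        return "suspicious"
    return "benign"
```

In the script itself, load_patterns would receive the text of data/string-patterns.txt, and the result of verdict would be stored under the "verdict" key before the JSON is serialised.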
AI acceleration¶
Give an AI your IAT dump and ask it to map each import to a MITRE ATT&CK technique. Cross-check five entries against MalAPI.io — any that are wrong or missing, note them and correct the AI's output in your analysis note. Attribution of technique to API is a skill, not a lookup.
Connects forward¶
The analyze_pe.py output feeds directly into Module 04 (capability detection with capa) — capa's JSON output supplements the per-function analysis with higher-level behaviour labels. In Module 07 you will correlate the imports you found here with the actual disassembly to see where each function is called.
Marketable proof¶
"I extract and interpret PE metadata — imports, strings, entropy, timestamp — and produce a structured analysis report from a binary without executing it."
Stretch¶
- Add Unicode string extraction (strings -el for UTF-16LE) and note whether any Unicode strings differ from the ASCII set.
- Detect if any section is named with non-standard characters or has an unusual combination of flags (e.g., a writable + executable section, a common packer characteristic).
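Both stretch items reduce to small checks that can be sketched in plain Python. The function names here are hypothetical, and the section-name test is one possible heuristic rather than a standard. The two flag values are the IMAGE_SCN_MEM_EXECUTE and IMAGE_SCN_MEM_WRITE bits from the PE format's section Characteristics field.

```python
import re

# ASCII-range text stored as UTF-16LE: a printable byte followed by a NUL,
# repeated at least 6 times (the same minimum length as the ASCII pass).
UTF16LE_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){6,}")

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def extract_utf16le_strings(data: bytes) -> list[str]:
    """Return UTF-16LE strings of length >= 6, restricted to ASCII characters."""
    return [m.group().decode("utf-16-le") for m in UTF16LE_RUN.finditer(data)]


def is_writable_executable(characteristics: int) -> bool:
    """True when a section is flagged both writable and executable."""
    wx = IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE
    return (characteristics & wx) == wx


def has_odd_name(raw_name: bytes) -> bool:
    """Heuristic: flag names that are empty or contain bytes outside [.A-Za-z0-9_$]."""
    name = raw_name.rstrip(b"\x00")  # section names are NUL-padded to 8 bytes
    return re.fullmatch(rb"[.\w$]{1,8}", name) is None
```

With pefile, the inputs would come from each entry in pe.sections: section.Characteristics for the flag check and section.Name for the name check. Comparing set(extract_utf16le_strings(data)) against the ASCII set shows which strings appear only in wide form.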