Lab 01: Map an Attack Surface
Setup
git clone https://github.com/plaintext-security/plaintext-labs
cd plaintext-labs/offensive/01-recon
make up # builds the Python recon harness
make demo # runs 4-step passive recon on example.com bundled data (offline)
make down
Bundled data: data/crt_sh.json (8 CT log entries), data/dns_enum.json
(12 resolved subdomains, NS/MX/SPF), data/tech_stack.json (8
fingerprinted hosts). The recon.py harness implements the four passive
recon steps — CT log parsing, DNS grouping, tech fingerprinting, priority
scoring — without live network calls, so the demo is deterministic.
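As a rough illustration of the first step (CT log parsing), the sketch below pulls unique hostnames out of crt.sh-style entries. It is not the code in recon.py: the function name extract_subdomains and the sample entries are assumptions, and only the name_value field (the one the live-mode jq command reads) is relied on.

```python
def extract_subdomains(crt_entries, domain):
    """Return sorted unique hostnames at or under `domain` from crt.sh-style entries."""
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for entry in crt_entries:
        # A single certificate can list several names, separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard entry still reveals its parent name
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Made-up sample entries, shaped like crt.sh JSON output (illustration only).
sample_entries = [
    {"name_value": "www.example.com\nexample.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "vpn.example.com"},
    {"name_value": "www.example.com"},
    {"name_value": "unrelated.example.org"},
]

print(extract_subdomains(sample_entries, "example.com"))
```

Deduplicating into a set matters because the same hostname usually appears in many certificates (renewals, precertificates).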
Authorization: this app is yours — attack it freely. The habit still matters everywhere else: only test systems you own or have explicit written permission to test (DVWA, PortSwigger Academy, targets you own).
Scenario
A client has asked for an external attack-surface assessment. Start with no credentials — only a domain name. Map every externally visible asset, fingerprint the stack, and identify priority targets for the next phase.
This lab is live-first. The primary path is real passive recon
(crt.sh + DNS) against a domain you control or one that is in
scope for a public bug-bounty program. That is how the work is
actually done, and it surfaces real, named CVEs on edge devices. The
bundled example.com dataset (a domain reserved by RFC 2606) is the
offline fallback: it keeps the demo deterministic and lets you validate
the harness before pointing it at a live target. Do the live run; fall
back to the bundled data only when you have no authorized domain handy.
Live mode: the primary path (authorized targets only)
Run real passive recon against a domain you control (your own site, a lab tenant) or a domain that is in scope for a public bug-bounty program (read the program scope first — only in-scope assets). This is the real workflow; do this before falling back to the bundled data.
# 1. Pull live crt.sh output for your authorized domain:
curl -s "https://crt.sh/?q=%.yourdomain.com&output=json" > data/crt_sh.json
# 2. Resolve each discovered host (requires the jq and host CLIs).
#    The sed step strips the leading "*." from wildcard certificate names.
for sub in $(jq -r '.[].name_value' data/crt_sh.json | sed 's/^\*\.//' | sort -u); do
  host "$sub" 2>/dev/null | grep "has address" | awk '{print $1, $NF}'
done
# 3. Fingerprint and score with the harness against your populated data:
make shell                  # opens a shell in the harness container
python3 recon.py --report   # run this inside that container shell
Authorization: only run live recon against assets you own or that are explicitly in scope (your domain, a written-permission engagement, a bug-bounty program's listed scope). The crt.sh query never touches the target, but the DNS lookups in step 2 can reach the target's authoritative name servers and be logged there.
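The resolution loop in step 2 can also be sketched in Python, which makes it easy to group hosts that share an address. This is a minimal sketch, not part of the harness: resolve_hosts and group_by_address are hypothetical names, and the resolver is injectable so the logic can be exercised offline. The default resolver, socket.gethostbyname_ex, performs real lookups and returns IPv4 addresses only.

```python
import socket
from collections import defaultdict


def resolve_hosts(hostnames, resolver=socket.gethostbyname_ex):
    """Map each hostname to its sorted IPv4 addresses; unresolvable names are skipped."""
    resolved = {}
    for name in hostnames:
        try:
            _canonical, _aliases, addresses = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            continue
        if addresses:
            resolved[name] = sorted(addresses)
    return resolved


def group_by_address(resolved):
    """Invert the host -> addresses map to find hosts that share an IP."""
    groups = defaultdict(list)
    for name, addresses in resolved.items():
        for address in addresses:
            groups[address].append(name)
    return {address: sorted(names) for address, names in groups.items()}
```

Calling resolve_hosts(["www.yourdomain.com"]) with the default resolver sends real DNS queries, so the same authorization rules apply.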
Do
Run steps 1–5 against your live target where you have one; use the
bundled example.com dataset as the offline fallback otherwise.
- [ ] Run the recon (live, or make demo for the bundled fallback). Read the priority ranking: what are the top three targets and why does each score high? Which CVE is the most critical?
- [ ] Add a new CT entry for a backup VPN host (e.g. vpn2.<domain>). Live, this surfaces naturally from crt.sh; offline, add it to data/crt_sh.json as vpn2.example.com. Re-run and confirm it appears in the subdomain list.
- [ ] If the SPF record includes a third-party relay (e.g. sendgrid.net), what does that mean from a phishing-simulation perspective? What check would confirm whether the target actually uses that relay for outbound email?
- [ ] The score_interest() function in recon.py uses hardcoded rules. Extend it: if the tech stack contains WordPress, add 15 points (WP has a large CVE surface and many discoverable plugins). Confirm a WordPress host (e.g. www.example.com in the bundled data) rises in the ranking.
- [ ] Run python3 recon.py --report (in the container shell via make shell) and read the generated recon-report.md. This is your deliverable template.
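One way to approach the WordPress rule is sketched below. The real signature and rules of score_interest() are not shown in this lab text, so everything here is an assumption: the host dictionary shape (a "tech" list of strings), the helper names, and the base score passed in from the existing rules. Only the 15-point bonus comes from the exercise.

```python
WORDPRESS_BONUS = 15  # the point value named in the exercise


def wordpress_bonus(tech_stack):
    """Return the bonus if any fingerprinted technology mentions WordPress."""
    if any("wordpress" in tech.lower() for tech in tech_stack):
        return WORDPRESS_BONUS
    return 0


def score_with_wordpress(host, base_score):
    """Hypothetical wrapper: base_score stands in for the harness's existing rules."""
    return base_score + wordpress_bonus(host.get("tech", []))


# Made-up hosts for illustration; the bundled tech_stack.json may be shaped differently.
hosts = [
    {"name": "static.example.com", "tech": ["nginx"]},
    {"name": "www.example.com", "tech": ["nginx", "WordPress 6.x"]},
]
ranked = sorted(hosts, key=lambda h: score_with_wordpress(h, base_score=10), reverse=True)
print([h["name"] for h in ranked])
```

The case-insensitive substring match is a deliberate choice so that versioned fingerprints such as "WordPress 6.x" still trigger the rule.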
Success criteria: you're done when
- [ ] You can explain why certificate transparency is more comprehensive than brute-force DNS enumeration for passive recon.
- [ ] The score_interest() WordPress extension fires correctly.
- [ ] You generated recon-report.md with the asset inventory and top-target rationale.
Deliverables
recon-report.md (generated by python3 recon.py --report): the scope
statement, full asset inventory with sources, and the top three priority
targets with CVE justification. This feeds directly into module 02's
scanning phase.
AI acceleration
Have a model interpret the tech-stack fingerprint and suggest CVEs to check — then validate each against NVD before including it in your report. Models hallucinate CVE numbers; always cross-reference the NVD entry.
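To make the cross-referencing step harder to skip, a small helper can turn model output into a checklist of NVD pages to open. This is a sketch with a hypothetical function name (nvd_checklist); it only checks that an ID is well formed and builds the NVD detail URL. A well-formed ID can still be nonexistent or unrelated to the fingerprinted product, so each link has to be read by a person.

```python
import re

# CVE IDs have the form CVE-<year>-<sequence>, with a sequence of four or more digits.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def nvd_checklist(model_output):
    """Return sorted (cve_id, nvd_url) pairs for every CVE ID mentioned in the text."""
    ids = sorted({match.upper() for match in CVE_PATTERN.findall(model_output)})
    return [(cve, f"https://nvd.nist.gov/vuln/detail/{cve}") for cve in ids]


suggestion = "Check CVE-2024-21762 on the VPN and cve-2023-22515; also CVE-2024-21762 again."
for cve_id, url in nvd_checklist(suggestion):
    print(cve_id, url)
```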
Automate & own it
Required. With AI drafting and you reviewing every line: extend
recon.py with a --live flag that calls the real crt.sh API
(https://crt.sh/?q=%.{domain}&output=json) and writes the response
to data/crt_sh.json before running the analysis. Commit the extended
script.
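A possible starting point for the --live flag is sketched below using only the standard library. How the flag fits into the real recon.py is not shown here, so these are assumptions: that --live takes the domain as its argument, the function names, and the injectable fetch parameter (included so the flow can be tested without network access). The URL builder percent-encodes the % wildcard as %25, which is the URL-safe form of the same query.

```python
import argparse
import json
import urllib.parse
import urllib.request
from pathlib import Path


def crt_sh_url(domain):
    """Build the crt.sh JSON query URL for every name under `domain`."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def fetch_crt_sh(domain, timeout=30):
    """Download and decode the crt.sh response (performs a real network call)."""
    with urllib.request.urlopen(crt_sh_url(domain), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def save_entries(entries, path):
    """Write the entries as JSON, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")


def build_parser():
    parser = argparse.ArgumentParser(description="passive recon harness (sketch)")
    parser.add_argument("--live", metavar="DOMAIN",
                        help="fetch CT entries for an authorized domain before analysis")
    parser.add_argument("--report", action="store_true",
                        help="generate recon-report.md")
    return parser


def main(argv=None, fetch=fetch_crt_sh, out_path="data/crt_sh.json"):
    """Refresh the CT data when --live is given; the existing analysis would run after."""
    args = build_parser().parse_args(argv)
    if args.live:
        save_entries(fetch(args.live), out_path)
    return args
```

Keeping the fetch separate from the analysis means the offline demo path stays untouched when --live is not passed.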
Connects forward
The priority targets from this lab — especially vpn.example.com
(FortiGate CVE-2024-21762) and jira.example.com (Confluence
CVE-2023-22515) — are the scope input for module 02 (active scanning).
The subdomains feed module 03 (vuln ID).
Marketable proof
"I map an external attack surface passively — CT logs, DNS enumeration, tech fingerprinting, and priority scoring — and deliver a structured recon report the way bug-bounty and red-team engagements actually start."
Stretch
- Add email-infrastructure recon: check the DMARC record (_dmarc.<domain>), parse the SPF include chain recursively, and assess the spoofing risk level.
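The stretch task can be sketched as follows. The Python standard library has no TXT-record lookup, so the sketch takes a lookup_txt callable (hypothetical; in practice it could wrap a DNS library or the bundled dns_enum.json) and is demonstrated against made-up records. The function names are assumptions, and the risk mapping is a rough heuristic chosen for illustration, not an established standard.

```python
def parse_spf_includes(domain, lookup_txt, seen=None):
    """Recursively collect domains referenced by include: mechanisms, without revisiting any."""
    seen = set() if seen is None else seen
    domain = domain.lower()
    if domain in seen:
        return []
    seen.add(domain)
    includes = []
    for record in lookup_txt(domain):
        if not record.lower().startswith("v=spf1"):
            continue
        for term in record.split()[1:]:
            term = term.lstrip("+-~?").lower()  # drop an optional qualifier
            if term.startswith("include:"):
                target = term.split(":", 1)[1]
                if target not in seen:
                    includes.append(target)
                    includes.extend(parse_spf_includes(target, lookup_txt, seen))
    return includes


def dmarc_policy(domain, lookup_txt):
    """Return the p= tag from the _dmarc TXT record, or None if there is no policy."""
    for record in lookup_txt(f"_dmarc.{domain}"):
        if not record.lower().startswith("v=dmarc1"):
            continue
        for part in record.split(";"):
            key, _, value = part.partition("=")
            if key.strip().lower() == "p":
                return value.strip().lower() or None
    return None


def spoofing_risk(policy):
    """Rough heuristic (an assumption): weaker or missing DMARC policy means higher risk."""
    return {"reject": "low", "quarantine": "medium"}.get(policy, "high")


# Made-up records for illustration only.
records = {
    "example.com": ["v=spf1 include:relay.example.net include:_spf.example.org ~all"],
    "relay.example.net": ["v=spf1 ip4:192.0.2.0/24 ~all"],
    "_spf.example.org": ["v=spf1 include:relay.example.net -all"],
    "_dmarc.example.com": ["v=DMARC1; p=none; rua=mailto:dmarc@example.com"],
}


def lookup(name):
    return records.get(name, [])


print(parse_spf_includes("example.com", lookup))
print(dmarc_policy("example.com", lookup), spoofing_risk(dmarc_policy("example.com", lookup)))
```

The seen set guards against include loops, which would otherwise recurse forever on misconfigured records.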