Lab 03: File Systems & Carving
Hands-on lab
Setup
This is a reference lab — its environment lives in the companion
plaintext-labs repo:
git clone https://github.com/plaintext-security/plaintext-labs
cd plaintext-labs/forensics/03-file-systems-carving
make up # build the SleuthKit + foremost container
make demo # run the worked example: parse the FAT32 image, recover deleted file
make shell # drop in to explore interactively
make down # stop when done
The container includes sleuthkit and foremost. data/disk.img is a committed FAT32 image of about 2 MB
with a planted "deleted" file. It is a tiny fallback, small enough to commit, so the worked
make demo runs with no network.
Real evidence (primary artifact). To carve a real drive instead of the synthetic image, use the
Digital Corpora M57-Patents scenario, which anchors this lab. It covers the first four weeks
(Nov 13–Dec 12, 2009) of M57 Patents, a fictional outsourced patent-search firm with a real, publicly
available evidence set whose investigations include data exfiltration and illegal activity. It is a
genuine, citable public forensic corpus (Garfinkel et al.; digitalcorpora.org). Run make fetch-data to
download a real M57 USB image, then run fls / icat / foremost against that.
- Dataset root: https://downloads.digitalcorpora.org/corpora/scenarios/2009-m57-patents/
- USB images (smallest suitable for carving): https://downloads.digitalcorpora.org/corpora/scenarios/2009-m57-patents/usb/
- Redacted full drive images (larger, richer carving): https://downloads.digitalcorpora.org/corpora/scenarios/2009-m57-patents/drives-redacted/
See PROVENANCE.md in this directory.
Fetching and hash validation are deferred to runner-validation: make fetch-data is wired but not yet executed here.
Everything runs locally against a bundled image you own (or the freely licensed M57 corpus). No external targets, no authorization needed.
Scenario
The affected organization's IR team recovered a USB drive from the compromised workstation
(BEACHHEAD-WS01). This mirrors the real Lunar Spider intrusion documented by The DFIR Report,
where a single click on a malicious Form_W-9.js led to a near-two-month compromise and Rclone
exfiltration. Initial triage shows a FAT32 filesystem with what appear to be normal files, but the
analyst suspects important files were deleted before the drive was seized. Your task is to parse the
filesystem metadata, enumerate deleted entries, and recover whatever was removed. The drive image is
data/disk.img (or the M57 image you fetched).
Only examine evidence you are authorised to handle. In a real case, this image would carry a hash-verified chain of custody from Module 01.
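A hash check before analysis is one way to tie the image you examine to the recorded chain of custody. The sketch below assumes GNU coreutils sha256sum is available (as it is on most Linux containers); verify_image is a hypothetical helper name and the expected hash is whatever your custody record states, neither comes from the lab repo.

```shell
#!/bin/sh
# Sketch: compare an image's SHA-256 with a value recorded elsewhere.
# verify_image is a hypothetical helper, not part of plaintext-labs.
verify_image() (
    image="$1"
    expected="$2"
    # sha256sum prints "<hash>  <path>"; keep only the hash field.
    actual="$(sha256sum "$image" 2>/dev/null | cut -d' ' -f1)"
    if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
        echo "OK: $image"
    else
        echo "MISMATCH: $image (got ${actual:-no hash})" >&2
        exit 1
    fi
)
```

Call it as verify_image data/disk.img <recorded-sha256> before running any SleuthKit tool, and stop if it reports a mismatch.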
Do
- [ ] Identify the filesystem. Run fsstat data/disk.img and record: filesystem type, cluster size, root directory inode, and total sectors. These are the basic parameters you'll reference throughout the analysis. Hint: fsstat is the SleuthKit tool that reads the superblock/BPB and reports volume-level metadata.
- [ ] List all files, including deleted ones. Run fls -r -d data/disk.img to walk the entire directory tree and list its deleted entries. Which files are marked with *? Note their inode numbers; you'll need them for recovery. Hint: -r is recursive; -d restricts the listing to deleted entries (drop it to see allocated files as well); * in the output flags a deleted entry.
- [ ] Examine a deleted file's metadata. Pick the deleted file's inode number from step 2 and run istat data/disk.img <inode>. What does the output tell you about the file? Note: allocated/unallocated status, data units (cluster numbers), and any timestamps present.
- [ ] Recover the deleted file's content. Use icat to extract the content of the deleted inode to a recovered file, for example: icat data/disk.img <inode> > /tmp/recovered-file. What is in the recovered file? Does the content make sense as evidence in the scenario (data exfiltration, mirroring the Lunar Spider / M57 exfil activity)?
- [ ] Carve unallocated space with foremost. Run foremost against the disk image to find any additional content in unallocated space, for example: foremost -i data/disk.img -o /tmp/foremost-out. What file types did foremost recover? Check /tmp/foremost-out/audit.txt for a summary. Are any recovered files different from what fls found? This illustrates the difference between inode-based recovery and carving.
- [ ] Document your findings. Write findings.md with:
  - fsstat summary (filesystem type, cluster size)
  - List of deleted files found by fls with their inode numbers
  - Content of recovered file (or a description)
  - What foremost recovered and how it compared to fls
  - Short paragraph: why can icat recover this content? What conditions would make recovery impossible?
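For the comparison between carved and inode-recovered files, hashing both sets is one quick first pass. The sketch below is an illustration under assumptions: unmatched_carves is a hypothetical helper, the first directory is wherever you saved your icat output, the second is the foremost output directory, and sha256sum is assumed to be GNU coreutils.

```shell
#!/bin/sh
# Sketch: list carved files whose SHA-256 matches no inode-recovered file.
# unmatched_carves is a hypothetical helper, not part of the lab repo.
unmatched_carves() (
    recovered_dir="$1"   # files extracted with icat
    carved_dir="$2"      # foremost output directory
    known="$(mktemp)"
    # Collect the hashes of everything recovered by inode.
    find "$recovered_dir" -type f -exec sha256sum {} + |
        cut -d' ' -f1 | sort -u > "$known"
    # Print each carved file (skipping foremost's audit log) with an unseen hash.
    find "$carved_dir" -type f ! -name audit.txt -exec sha256sum {} + |
        while read -r hash path; do
            grep -qx "$hash" "$known" || printf '%s\n' "$path"
        done
    rm -f "$known"
)
```

A mismatch is a prompt for manual review, not proof of new content: a carver cuts on headers and footers, so it can return the same file with a different length than icat does.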
Success criteria: you're done when
- [ ] fsstat output is documented (filesystem type, cluster size, volume parameters).
- [ ] At least one deleted file was found with fls -d.
- [ ] icat successfully extracted the deleted file's content.
- [ ] foremost was run and its output reviewed.
- [ ] findings.md explains why recovery was possible and what would prevent it.
Deliverables
Commit findings.md to your fork. Do not commit /tmp/recovered-file or /tmp/foremost-out/ — generated output stays out of the repo.
Automate & own it
Required. Write a Python or shell script carve-report.sh that:
1. Takes a disk image path as argument.
2. Runs fls -r -d and parses the output to list only deleted entries.
3. For each deleted entry, runs icat to extract the content and saves it to an output directory named by inode.
4. Produces a Markdown summary report of all recovered files.
Have a model draft the script; read every line and test it against data/disk.img before committing. This is the automation move for disk triage: you'd run this against every seized image to get a first pass of deleted-file candidates before manual review.
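The four steps above can be sketched as follows. This is a starting point to read and test, not a reference solution. It assumes fls prints each deleted entry as "<type> * <inode>[(realloc)]:" followed by a tab and the name, with "+" depth markers in front under -r. The FLS and ICAT overrides, the report.md file name, the .bin suffix, and the carve-out default directory are all hypothetical choices made for this illustration.

```shell
#!/bin/sh
# carve-report.sh: a sketch of the four steps above, not a reference solution.
# Assumption: fls prints each deleted entry as
#   [+ ...]<type> * <inode>[(realloc)]:<TAB><name>

FLS="${FLS:-fls}"     # overridable so the logic can be exercised with stubs
ICAT="${ICAT:-icat}"

carve_report() (
    image="$1"
    outdir="$2"
    report="$outdir/report.md"
    tab="$(printf '\t')"
    mkdir -p "$outdir" || exit 1

    printf '# Deleted entries recovered from %s\n\n' "$image" > "$report"
    printf '| Inode | Name | Bytes recovered |\n' >> "$report"
    printf '|---|---|---|\n' >> "$report"

    "$FLS" -r -d "$image" | while IFS="$tab" read -r meta name; do
        case "$meta" in
            *' * '*) ;;        # keep only entries carrying the deleted marker
            *) continue ;;
        esac
        inode="${meta##*\* }"          # drop depth markers, type, and "* "
        inode="${inode%:}"             # drop the trailing colon
        inode="${inode%'(realloc)'}"   # drop the optional realloc tag
        # One output file per inode; names containing "|" would break the table.
        if "$ICAT" "$image" "$inode" > "$outdir/$inode.bin" 2>/dev/null < /dev/null; then
            bytes="$(wc -c < "$outdir/$inode.bin" | tr -d ' ')"
        else
            bytes="icat failed"
        fi
        printf '| %s | %s | %s |\n' "$inode" "$name" "$bytes" >> "$report"
    done
)

# Run only when an image path is supplied on the command line.
if [ "$#" -ge 1 ]; then
    carve_report "$1" "${2:-carve-out}"
fi
```

Invoked as sh carve-report.sh data/disk.img /tmp/carve-out, it writes one file per deleted inode plus report.md. Compare the inode numbers in the report against your own fls run before trusting it.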
AI acceleration
Feed your fls output to a model and ask it to explain each line, identify which entries are directories vs. files, and flag which inodes might be most relevant to a data exfiltration investigation. Use it to draft the foremost configuration entry for a new file type (e.g., a custom log format). Verify every inode number and offset it names against the actual tool output — models confabulate specific numeric values.
Connects forward
The filesystem layer you work here is the foundation for Module 04 (Windows artifacts live in NTFS structures — $MFT, $LogFile, prefetch files) and Module 07 (plaso ingests disk images and parses filesystem timestamps into the super-timeline).
Marketable proof
"I recover deleted files from disk images using SleuthKit's inode-level tools and foremost carving — navigating the filesystem layer model to find what attackers thought they erased."
Stretch
- Research the $UsnJrnl on NTFS: what does it record, and how would you extract it with SleuthKit or MFTECmd?
- Try tsk_recover on the disk image: how does it differ from your manual icat approach? What does it miss?