Module 16 — Cloud Incident Response

Type 6 · Reconstruct (+ Type 5 · Detonate & Detect) — cloud IR is rebuilding the timeline from an immutable API log, not disk forensics; predict what the responders missed, then reconstruct the LastPass two-stage chain. (Secondary: Detonate & Detect — the track's payoff, run under pressure.) Go to the hands-on lab →

Last reviewed: 2026-06

Cloud & Container Security: the attacker left a trail in the API log, and cloud IR is reconstruction from an immutable record, not disk forensics. This is the payoff: everything the track taught, run under pressure.

Difficulty: Intermediate–Advanced · Estimated time: ~4.5–6.5 hrs (study + lab) · Prerequisites: Foundations · Module 14 — Cloud Attack Techniques · Module 15 — Logging & Detection

In 60 seconds

Cloud incident response has no disk to image and no memory to dump — the entire crime scene is an immutable, structured API log. The job is reconstruction: merge CloudTrail, flow logs, and findings on the one field they share (time), tag each event by kill-chain phase, and turn thousands of authenticated calls into a narrative a CISO or a court will accept. Two truths drive every verdict: containment is not eradication (exfiltrated data is a permanent capability gain — LastPass's first breach was the recon for its second), and encryption at rest is silent against a principal you authorized (the attacker steals the key-holder, not the cipher).

The case

In August 2022, LastPass disclosed that an attacker had breached its development environment via a compromised engineer's laptop and stolen source code and proprietary technical information. The company investigated, said the attacker's access had been contained, that no customer vault data was taken, and — by early September — declared the incident closed.

It was not closed. In a later, much longer disclosure, LastPass revealed a second intrusion that ran from roughly August into October 2022. The attacker had taken the technical information exfiltrated in the first incident — source code, internal documentation, knowledge of how the systems fit together — and used it as reconnaissance to target a specific senior DevOps engineer, one of only four people who held the keys to the company's encrypted backup storage. They compromised that engineer's personal home computer (via a vulnerable third-party media-player package), keylogged the master password to a corporate vault, and from there reached the decryption keys for the cloud-based backup buckets — and copied out customer vault backups and configuration data. (LastPass laid the full chain out in its security-incident updates.)

So the first incident was contained, in the narrow sense — the attacker was evicted from the dev environment. And yet it directly enabled the second. Before you read on, this is the question the whole module turns on:

The first breach looked "contained." What did the responders miss that let the same actor come back and, by October, reach the crown jewels?

Your job

By the end of this module you'll do the thing a cloud incident responder is actually paid for: take a raw CloudTrail export (and the flow logs beside it), reconstruct a defensible timeline — who did what, when, in what order — pull the IOCs, scope the blast radius, and contain in the right sequence. Then you'll do the part that makes it repeatable: extend a triage script so the reconstruction is automated — the super-timeline move, sorting heterogeneous events by their one shared key, time. Your deliverable is a real IR artifact: the timeline, the IOC set, and the automation that builds them.

Call it before you read on

Don't scroll. Commit a verdict — being wrong here is the teaching event, and you'll test it in the lab.

Q1. The first-incident responders removed the attacker's access to the dev environment and saw no further activity. Why was that not enough — what survives an eviction?

Q2. The crown-jewel data (the customer vault backups) lived in cloud buckets and was encrypted at rest. The attacker walked out with it anyway. (Where have you seen this exact twist before?) What was the actually-failed control?

Q3. You're handed a raw CloudTrail export of a cloud incident. What is the first question you ask of it — before you read a single event?

The reconstruction, revealed

Hold your answers against these.

Q1 — containment is not eradication; exfiltrated data is now attacker capability. The responders removed access, which is necessary and feels like the finish line. But the first breach's loot was information — source code and architecture docs — and you cannot revoke information once it's copied out. That stolen knowledge became the recon phase of the next intrusion: it told the attacker exactly which four engineers to target and which one's machine was the soft path to the backup keys. The mental model: once data leaves, treat it as a permanent increase in the adversary's capability, and scope your response to what that data enables, not just to where the attacker currently sits. "We evicted them" answers a smaller question than "what did they take, and what does taking it let them do next?" The LastPass first-incident response answered the small question. (This is the same containment≠eradication gap that lets ransomware actors who were "kicked out" return through a backdoor they planted — the eviction was real; the eradication was not.)

The gotcha

"We evicted the attacker, saw no further activity, closed the incident" answers a smaller question than the one that matters. You cannot revoke information once it's copied out, so scope your response to what the stolen data enables next, not to where the attacker currently sits. LastPass's first-incident team answered the small question; the stolen source code became the recon for a second intrusion that ran into October 2022.

Q2 — encryption at rest is silent against an authorized principal. You met this exact lesson in Module 01 with Capital One, and it recurs here because it is the most expensive misconception in cloud: that encryption at rest protects the data, when it only protects it from people without the key. The attacker didn't break the encryption; they stole the decryption keys by compromising an engineer who legitimately held them. To the storage layer, the reads looked authorized, because they were. The failed control was never the cipher. It was the blast radius of a single human's credentials reaching both a home machine and the production key material: identity and segmentation, the customer's side of the line. If you predicted "the encryption protected the backups," you just felt the misconception the track has hammered from module one.

Q3 — "is the trail intact?" This is the reconstruction discipline, and it's why cloud IR has a different shape than endpoint forensics. There is no disk to image and no memory to dump: the entire crime scene is an immutable, structured, per-event API log. CloudTrail (and its GCP/Azure equivalents) is your ground truth, so the first move is always to ask whether it's whole. A competent attacker calls StopLogging early, and the gap that creates is itself evidence: its start, end, and duration are data. Once you trust the trail, IR becomes a reconstruction problem: you have thousands of authenticated calls, and your job is to turn them into a narrative a court or a CISO would accept. The technique is the super-timeline. Take every heterogeneous source (control-plane CloudTrail, data-plane flow logs, later GuardDuty findings) and merge them on the one field they all share: time. Sort by timestamp, tag each event with its kill-chain phase, filter to the attacker's identity and source IPs, and the raw log becomes a readable order of operations: credential check → enumeration → role assumption → collection → exfil channel opened → trail stopped → persistence key minted. That ordered story is the incident report; the log records are the evidence under it. The methodology is exactly what timeline tools like Hayabusa do for Windows event logs (ingest, sort, tag, output a sorted timeline). The source changes; the move doesn't.

The mental model

The log is the crime scene. Cloud IR is reconstruction from an immutable API record, not disk forensics — so the first question is always "is the trail intact?" (a StopLogging gap is itself evidence: its start, end, and duration are data). Then build the super-timeline: merge every heterogeneous source on time, sort, tag by phase, filter to the attacker — that ordered narrative is the incident report, and the raw records are the evidence beneath it.
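The "is the trail intact?" check can be done mechanically before reading any single event. The sketch below is a minimal illustration, not a complete integrity check: `logging_gaps` and `parse_iso` are hypothetical helper names, and it assumes the export is a list of already-parsed CloudTrail records in which both the `StopLogging` and `StartLogging` events are present (for example because another trail or the event history captured them). Other ways to blind a trail, such as deleting or reconfiguring it, are not covered here.

```python
from datetime import datetime, timezone


def parse_iso(ts):
    """Parse a CloudTrail-style UTC timestamp, e.g. 2022-08-12T10:20:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def logging_gaps(records):
    """Return (start, end, duration_seconds) for each StopLogging -> StartLogging window.

    A gap that never closes is reported with end=None and duration=None.
    """
    gaps = []
    stopped_at = None
    for rec in sorted(records, key=lambda r: parse_iso(r["eventTime"])):
        when = parse_iso(rec["eventTime"])
        name = rec.get("eventName")
        if name == "StopLogging" and stopped_at is None:
            stopped_at = when  # the gap opens here
        elif name == "StartLogging" and stopped_at is not None:
            gaps.append((stopped_at, when, (when - stopped_at).total_seconds()))
            stopped_at = None  # the gap is closed
    if stopped_at is not None:
        gaps.append((stopped_at, None, None))  # logging was never restored
    return gaps
```

Each tuple is the gap's start, end, and duration, which is the data the timeline needs: anything the attacker did inside that window has to be recovered from another source, such as flow logs.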

flowchart LR
    CT["CloudTrail<br/>(control plane: who called what)"]
    FL["VPC flow logs<br/>(data plane: bytes out, to where)"]
    GD["GuardDuty findings"]
    M{{"merge on time, sort,<br/>tag by kill-chain phase"}}
    T(["super-timeline =<br/>defensible narrative"])
    CT --> M
    FL --> M
    GD --> M
    M --> T
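The merge in the diagram can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the lab's triage script: `PHASE_BY_EVENT`, `from_cloudtrail`, `from_flow`, and `super_timeline` are hypothetical names, the phase map is a deliberately tiny example, and the flow-log records are assumed to be already parsed into dicts keyed by the default field names (`start`, `srcaddr`, `dstaddr`, `bytes`, `action`). GuardDuty findings are left out; they would need a third normalizer of the same shape.

```python
from datetime import datetime, timezone

# Hypothetical, deliberately small map from API call to kill-chain phase.
PHASE_BY_EVENT = {
    "GetCallerIdentity": "credential-check",
    "ListBuckets": "enumeration",
    "AssumeRole": "role-assumption",
    "GetObject": "collection",
    "StopLogging": "defense-evasion",
    "CreateAccessKey": "persistence",
}


def parse_iso(ts):
    """Parse a CloudTrail-style UTC timestamp, e.g. 2022-08-12T10:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def from_cloudtrail(rec):
    """Normalize one control-plane record to the shared event shape."""
    return {
        "time": parse_iso(rec["eventTime"]),
        "source": "cloudtrail",
        "actor": rec.get("userIdentity", {}).get("arn", "unknown"),
        "ip": rec.get("sourceIPAddress"),
        "what": rec["eventName"],
        "phase": PHASE_BY_EVENT.get(rec["eventName"], "untagged"),
    }


def from_flow(rec):
    """Normalize one data-plane record; flow logs carry epoch seconds in `start`."""
    return {
        "time": datetime.fromtimestamp(int(rec["start"]), tz=timezone.utc),
        "source": "flowlog",
        "actor": rec["srcaddr"],
        "ip": rec["dstaddr"],
        "what": f"{rec['bytes']} bytes {rec['action']}",
        "phase": "exfiltration-candidate",
    }


def super_timeline(cloudtrail, flows):
    """Merge both sources on their one shared key, time, and sort."""
    events = [from_cloudtrail(r) for r in cloudtrail] + [from_flow(r) for r in flows]
    return sorted(events, key=lambda e: e["time"])
```

Normalizing every source to one event shape before sorting is the design choice that matters: once each record has a timezone-aware `time`, adding another source is one more normalizer, and filtering to the attacker's identity or IPs is a list comprehension over the result.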

Go deeper: why two planes beat one

Control plane (CloudTrail) tells you what API was called by whom; data plane (flow logs) tells you how many bytes left, to where. Either alone is a lead. An AssumeRole + mass GetObject in CloudTrail that lines up with a hundreds-of-megabyte outbound flow to an external IP is a defensible exfiltration verdict — corroboration across planes is what turns a hunch into a finding that holds up.

The two halves you'll reconstruct in the lab, the timeline from CloudTrail and the exfil corroboration from flow logs, are the two planes every cloud incident lives on: the control plane (what API was called, by whom) and the data plane (how many bytes left, to where). Either alone is a lead; together they're a verdict.
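One way to make the two-plane corroboration concrete is to pair a burst of `GetObject` calls with a large outbound transfer in the same time window. The sketch below is an assumption-laden illustration: `corroborate` is a hypothetical helper, the thresholds and the `vpc_cidr` default are placeholders to tune per environment, "external" simply means "outside the given VPC CIDR", and it assumes S3 data events were being logged (otherwise `GetObject` never appears in the trail).

```python
import ipaddress
from datetime import datetime, timedelta, timezone


def parse_iso(ts):
    """Parse a CloudTrail-style UTC timestamp, e.g. 2022-08-12T10:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def corroborate(cloudtrail, flows, vpc_cidr="10.0.0.0/16",
                window=timedelta(minutes=15),
                min_bytes=100_000_000, min_reads=50):
    """Return external destinations whose outbound volume lines up with mass GetObject."""
    vpc = ipaddress.ip_network(vpc_cidr)
    reads = [parse_iso(r["eventTime"]) for r in cloudtrail
             if r.get("eventName") == "GetObject"]

    # Data plane: total accepted bytes per destination outside the VPC.
    per_dst = {}
    for f in flows:
        if f["action"] != "ACCEPT":
            continue
        if ipaddress.ip_address(f["dstaddr"]) in vpc:
            continue  # internal traffic is not an exfil candidate here
        start, end = int(f["start"]), int(f["end"])
        agg = per_dst.setdefault(f["dstaddr"], {"bytes": 0, "start": start, "end": end})
        agg["bytes"] += int(f["bytes"])
        agg["start"] = min(agg["start"], start)
        agg["end"] = max(agg["end"], end)

    # Control plane: count reads that fall inside the (padded) transfer window.
    findings = []
    for dst, agg in per_dst.items():
        if agg["bytes"] < min_bytes:
            continue
        lo = datetime.fromtimestamp(agg["start"], tz=timezone.utc) - window
        hi = datetime.fromtimestamp(agg["end"], tz=timezone.utc) + window
        nearby = sum(1 for t in reads if lo <= t <= hi)
        if nearby >= min_reads:
            findings.append({"dst": dst, "bytes": agg["bytes"], "reads": nearby})
    return findings
```

A result from this function is still only a candidate: it shows that the two planes agree in time and volume, and the analyst decides whether the destination and the calling identity make it an exfiltration finding.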

AI caveat

A model is a useful first-pass tagger — map a sequence of API calls to ATT&CK-for-Cloud techniques, flag order anomalies. But IR judgment lives exactly where models are weak: temporal reasoning and attribution. A model will happily call a key used six hours later "the same session" when it's the persistence mechanism, and it can't tell an attacker covering tracks from a benign trail rotation. Draft the tags with it; own the sequencing, the gap analysis, and the verdict.

Learn (~3.5 hrs)

The track's last module before the capstone. The reading list runs a bit longer here, because IR pulls together identity (02/03), logging (15), and attacker TTPs (14) all at once.

The case — read the primary post-mortem (~45 min)

- LastPass — "Notice of Recent Security Incident" + the December update (~30 min) — the breached company's own disclosure of the two-incident chain. Read it as an IR artifact: notice how the first incident's stolen data is named as the second incident's recon. This is your anchor; the first-party RCA is the most credible "what failed" source there is.
- UpGuard — The LastPass Data Breach: timeline and key lessons (~15 min, skim) — the engineer's home machine, the Plex keylogger, the four key-holders, the backup decryption keys. The hop-by-hop the verdict rests on.

Cloud IR frameworks (~1 hr)

- AWS Security Incident Response Guide (~40 min) — read "Detection and Analysis" and the forensics workflow; skip the org sections. The primary AWS source for how a cloud IR engagement is structured.
- CloudTrail — userIdentity element reference (~20 min) — the most forensically rich field in a record. Learn the difference between IAMUser, AssumedRole, Root, and AWSService and what each implies — the IAMUser→AssumedRole transition is the privilege-escalation hop in the lab.
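The IAMUser→AssumedRole transition can be followed through the records themselves. In the sketch below, `escalation_hops` is a hypothetical helper, and the linkage it relies on is an assumption to verify against your own export: that the `AssumeRole` record carries the temporary key ID under `responseElements.credentials.accessKeyId`, and that later calls made with that session show the same value in `userIdentity.accessKeyId`.

```python
def escalation_hops(records):
    """Link each AssumeRole made by an IAMUser to the calls made with the session it minted."""
    hops = {}

    # Pass 1: find the hop itself, keyed by the temporary access key it returned.
    for rec in records:
        ident = rec.get("userIdentity") or {}
        if rec.get("eventName") == "AssumeRole" and ident.get("type") == "IAMUser":
            creds = (rec.get("responseElements") or {}).get("credentials") or {}
            temp_key = creds.get("accessKeyId")
            if temp_key:
                hops[temp_key] = {
                    "caller": ident.get("arn"),
                    "role": (rec.get("requestParameters") or {}).get("roleArn"),
                    "assumed_at": rec.get("eventTime"),
                    "actions": [],
                }

    # Pass 2: attribute AssumedRole activity back to the user who minted the session.
    for rec in records:
        ident = rec.get("userIdentity") or {}
        if ident.get("type") == "AssumedRole" and ident.get("accessKeyId") in hops:
            hops[ident["accessKeyId"]]["actions"].append(rec.get("eventName"))

    return list(hops.values())
```

The point of the two passes is attribution: activity that appears under a role name is tied back to the long-lived user credential that started the chain, which is the credential containment has to revoke.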

Timeline reconstruction (~1.5 hrs)

- Hayabusa — GitHub README (~30 min) — the canonical sort-tag-output timeline tool. It targets Windows event logs, but read it for the methodology (ingest → Sigma-tag → sorted timeline) — that's exactly what you port to CloudTrail in the lab. Don't get lost in the Windows specifics.
- The DFIR Report — pick one recent cloud/AWS intrusion writeup (~1 hr) — browse for an AWS-related case; the attack chains are real, the timelines are explicit, and you'll see the super-timeline discipline applied to a genuine incident. Read it asking "what's their join key, and where's their gap?"

Key concepts

Containment ≠ eradication: removing access doesn't undo exfiltration — stolen data is a permanent capability gain; scope to what it enables (LastPass incident-1 → incident-2)
The log is the crime scene: cloud IR is reconstruction from an immutable API log, not disk forensics — CloudTrail is ground truth, and "is the trail intact?" is the first question
The super-timeline: merge heterogeneous sources (CloudTrail + flow logs + findings) on their one shared key, time; sort, tag by phase, filter to the attacker — that ordered narrative is the report
Two planes corroborate: control-plane AssumeRole+GetObject lining up with a large data-plane outbound flow = a defensible exfiltration verdict
The StopLogging gap is evidence: its start/end/duration are data, not absence of data
Encryption at rest is silent against a principal you authorized — the attacker steals the key-holder, not the cipher (Capital One, again)
Containment order: revoke credentials → close exfil channels → restore logging → scope impact — and scope beyond the first key (persistence: second keys, new users, modified role trust)
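Scoping beyond the first key, the last item in the list above, can start as a simple filter over the trail. The sketch below is illustrative only: `persistence_candidates` is a hypothetical helper, and `PERSISTENCE_EVENTS` is a short, non-exhaustive assumption about which IAM calls are worth reviewing (second keys, new users, console passwords, modified role trust). The attacker ARNs and IPs are whatever your reconstructed timeline has already established.

```python
# Non-exhaustive, illustrative list of IAM calls that can leave a way back in.
PERSISTENCE_EVENTS = {
    "CreateAccessKey": "second key",
    "CreateUser": "new user",
    "CreateLoginProfile": "console password added",
    "UpdateAssumeRolePolicy": "role trust modified",
}


def persistence_candidates(records, attacker_arns, attacker_ips):
    """Return persistence-style calls made by a known attacker identity or source IP."""
    hits = []
    for rec in records:
        name = rec.get("eventName")
        if name not in PERSISTENCE_EVENTS:
            continue
        arn = (rec.get("userIdentity") or {}).get("arn")
        ip = rec.get("sourceIPAddress")
        if arn in attacker_arns or ip in attacker_ips:
            hits.append({
                "time": rec.get("eventTime"),
                "event": name,
                "meaning": PERSISTENCE_EVENTS[name],
                "actor": arn,
                "ip": ip,
                "params": rec.get("requestParameters"),
            })
    # Same-format ISO 8601 UTC timestamps sort correctly as strings.
    return sorted(hits, key=lambda h: h["time"])
```

Every hit is something the "revoke credentials" step has to cover in addition to the originally compromised key; an empty result only means none of these particular calls matched the identities and IPs you supplied.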

AI acceleration

Feed your reconstructed timeline to a model and ask it to map each event to an ATT&CK-for-Cloud technique and flag anything that breaks the expected order. Models are strong at pattern-matching a sequence of API calls to technique descriptions — a genuinely useful first-pass tagger. They are weak exactly where IR judgment lives: temporal reasoning and attribution. A model will happily call a key used six hours later "the same session" when it's the persistence mechanism, and it cannot tell you whether a logging gap is an attacker covering tracks or a benign trail rotation — that's the call you're paid to make. Use it to draft the phase tags and technique IDs; you own the sequencing, the gap analysis, and the verdict on what the stolen data enables next. The timeline you commit is your professional analysis, not the model's.

Check yourself

The first-incident responders removed the attacker's access and saw no further activity. Why was that not eradication — what survives an eviction?
You're handed a raw CloudTrail export of an incident. What is the first question you ask of it, before reading a single event, and why?
Why does an AssumeRole + mass GetObject in CloudTrail become a defensible exfiltration finding only when paired with a flow-log observation?
