Chapter 32  ·  Part VIII: Shipping

Retrospective and Learning

Running a retro that produces insight, not complaints. What to do with the findings. Building your personal execution playbook. And why the engineers who accelerate fastest are the ones who treat every project as a course they're taking on themselves.


You shipped it. Six months of work, three missed milestones, two organizational crises, and one week where it felt like the whole thing might fall apart — and now it's in production. Users are using it. Your manager is happy. The war room is disbanded.

Now what?

Most teams do one of two things. They either skip straight to the next project — there's always a next project — or they hold a retrospective where everyone says something polite, the notes go into a folder that nobody ever opens again, and three months later they're making the exact same mistakes on the next project.

Neither is learning. Both are just moving.

This chapter is about the gap between finishing a project and actually growing from it. That gap is where most engineers stay stuck. They accumulate years of experience without accumulating wisdom, because experience without reflection is just events — things that happened, not things that taught you something.

The engineers who become genuinely formidable — the ones who seem to have a sixth sense about where projects will go wrong, who can walk into a new situation and immediately see the landmines — didn't get that way from natural talent. They got that way from systematically extracting lessons from every project they touched, and then actually applying those lessons next time.

This chapter will show you how to do that.

Why Most Retros Fail

Before we talk about what a good retrospective looks like, let's be honest about what a bad one looks like, because bad retros are far more common — and most teams don't even realize theirs are bad.

There are two failure modes, and they look completely different on the surface but have the same root cause.

The Venting Session. Someone books a meeting. People come in frustrated from six months of hard work. The facilitator writes "what went well" on the left side of the whiteboard and "what could be better" on the right. Within twenty minutes, the right side is covered in sticky notes. "The design kept changing." "We didn't have enough engineers." "The infrastructure team was slow." "Nobody knew the timeline." People feel heard. The meeting ends. The whiteboard gets photographed and filed. Nothing changes. In the next project, the same problems emerge.

The Harmony Theater. The team likes each other. The project was hard but people are proud of what they shipped. Nobody wants to be the person who brings up an awkward topic in front of management. So the retro becomes a celebration. People say nice things. The action items are soft and vague. Everyone leaves feeling good. Nothing changes. In the next project, the same problems emerge.

The Venting Session at least surfaces real frustrations. The Harmony Theater doesn't even do that. But they share the same failure: they generate observations, not learning. They document what happened, not what to do differently.

A retrospective that only produces a list of what went wrong is not a retrospective. It's a complaint log. The test of a good retro is not what people said in the room — it's what changed on the next project.

The root cause of both failure modes is that most retros focus on events instead of patterns. An event is a specific thing that happened: "the API contract changed in week eight and broke our integration work." A pattern is a recurring tendency: "we always discover interface mismatches late because we don't validate contracts early enough." Events make for interesting stories. Patterns are what you can actually act on.

If your retro produces a list of events, you've documented your project's history. If it produces a list of patterns, you've improved your team's future.

What You Need Before the Room Meets

The biggest mistake people make with retrospectives is treating them as a meeting you walk into cold. You don't walk into a design review without reading the doc first. You don't walk into a stakeholder alignment meeting without knowing what each person cares about. But somehow teams walk into a retro expecting that an hour of open discussion will produce sharp insights about six months of complex work.

It won't. You need data.

Before the retro, the facilitator — and this should be someone other than the tech lead, if possible, because the tech lead has a stake in the narrative — needs to put together a project timeline. Not a timeline of milestones, but a timeline of key events: when did the scope change? When did you first realize a dependency was going to be late? When did you discover the architectural problem? When did the team velocity drop?

This matters because human memory is selective and self-serving. People remember the final few weeks vividly and the first few months hazily. They remember where they personally had trouble and forget where others did. They confuse the moment a problem became visible with the moment the problem actually started.

A written timeline corrects for this. It creates shared ground truth before you start interpreting. It's very hard to argue about what happened when it's written in front of everyone.

Prep Checklist — Before the Retro

Before anyone walks into the room, the facilitator should have in hand:
  • A written timeline of key events — scope changes, discoveries, slowdowns — circulated to the team in advance.
  • A log of the key decisions made during the project, with what was known at the time of each.
  • Silent written input: each participant writes down their observations before the meeting, before hearing anyone else's.

The silent input step is especially important and almost always skipped. When people write their thoughts before hearing others speak, you get uncontaminated individual perspectives. When you go straight to group discussion, the first person to speak sets a frame that everyone else unconsciously responds to. If that person is a senior engineer or a manager, the frame sticks hard.

Writing first levels the playing field. The quietest engineer and the most senior one have equal weight on the page.

Creating Safety

Here's an uncomfortable truth: most retrospectives don't surface the real problems because the people in the room don't feel safe saying them out loud.

This is not a character flaw. It's a rational response to incentives. If you say "I think the tech lead made poor architectural decisions early on" and the tech lead is sitting across the table, you're taking a social risk. If you say "leadership kept changing the priorities" and your manager is in the room, you're taking a career risk. People aren't paranoid for thinking this way. They've seen what happens when people say difficult things in rooms where it's not safe.

The facilitator's primary job before any insight-gathering happens is to create conditions where people can tell the truth. This is harder than running the meeting itself.

A few things that genuinely help:

Don't have managers in the room for the first part. If the team lead or manager is present, people will manage up rather than speak honestly. Run the team's retro first, then share a summary with leadership. The meeting where you reflect is different from the meeting where you report.

The facilitator speaks last. When the person running the meeting shares their view first, everyone else anchors to it. Speak last. Ask questions first.

Normalize disagreement explicitly. Say it out loud at the start: "We probably won't all agree on everything we discuss today, and that's not just okay — it's what we're here for. If we leave this room all saying the same thing, we haven't pushed hard enough." Most people have never been told this in a retro. It changes the atmosphere immediately.

Attack processes, not people. Establish this rule clearly: the question is never "who made the wrong call" but "what made it easy to make that call and hard to catch it." This isn't about protecting people's feelings — it's about getting to the actual cause. If the answer to every question is "Bob made a mistake," you've learned nothing. If the answer is "our code review process doesn't catch that class of issue," you've learned something you can fix.

The Five-Layer Framework

Here's the framework I've used on dozens of projects, and it consistently produces better retros than the standard "what went well / what could be better" format. The key is the layers. Most retros stop at layer two. The real value is in layers three through five.

Framework: The Five-Layer Retrospective

1. What happened? — Timeline reconstruction
Walk through the project timeline together. Not to assign blame, just to make sure everyone shares the same factual account. This is where the prep work pays off. You're anchoring the group in reality before you start interpreting it. Correct factual disagreements here. Save interpretations for later layers.

2. What surprised us? — The surprise register
What happened that you didn't predict? This is more specific than "what went wrong." Something can go wrong in a completely expected way — that's useful information but not a surprise. A surprise is a gap between your model of the project and reality. Each surprise is a clue that your mental model was incomplete.

3. What patterns do we see? — Theme extraction
This is the most important layer and the one teams most often skip. Look at the surprise register. Are multiple surprises related? Do they share an underlying cause? Maybe three separate problems were all caused by the same thing: unclear ownership. Or late integration testing. Or stakeholder involvement that was too infrequent. Name the patterns explicitly — not the events, the patterns. "We repeatedly discovered interface problems late" is a pattern. "The API broke in week eight" is an event.

4. What would we decide differently? — Decision review
Look at the key decisions log. With the benefit of hindsight, which decisions would you change? Be rigorous here: only count it as "would change" if you would have made the better decision with the information you had at the time, not just information you acquired later. Hindsight bias is powerful and seductive. The goal is not to punish past decisions — it's to identify whether your decision-making process needs improvement, or whether you simply didn't have enough information, which is a different problem.

5. What do we want to remember? — Institutional memory
If a new engineer joined the team tomorrow and was about to start a project similar to this one, what would you tell them? This is the layer where you convert everything you've learned into transferable knowledge. Not just "this project had problems with X" but "projects like this one tend to have problems with X, and here's why, and here's what to do about it." This is the seed of institutional memory.

You will not get through all five layers in one hour. Budget ninety minutes minimum. If the project was long or complex, two hours is not unreasonable. The time is worth it. You spent six months on this project. Spending two hours extracting what it taught you is not an extravagance — it's the highest-leverage thing you can do before the knowledge evaporates.

Separating Signal from Noise

Not everything that went wrong is worth fixing. This sounds obvious but it gets skipped constantly, and it's the reason retro action item lists grow to twenty-seven items and nothing gets done.

Some things went wrong because of one-time circumstances that won't repeat. The week your most experienced engineer was out sick during a critical integration. The partner team that had unusually high turnover this quarter. The external API that had an outage nobody could have predicted. These are real events. They're worth documenting. But they don't need a systemic fix, because the system didn't cause them.

Some things went wrong because of recurring patterns that will absolutely happen again. Your team consistently underestimates how long integration work takes. Your stakeholder communication tends to be too infrequent in the first half of the project and too panicked in the second half. You tend to make architectural decisions before requirements are stable enough to make them well. These patterns are expensive and they will repeat — on every future project — unless you change something.

The test for distinguishing them is simple: Would this happen again on a similar project if we did nothing differently? If yes, it's a pattern. If no, it's noise. Only patterns need fixes.

The Dangerous Middle

Watch out for events that look like one-time noise but are actually patterns in disguise. "The infrastructure team was unusually slow this quarter" sounds like noise — maybe they had a hard quarter, it won't happen again. But it might be masking a pattern: "we always underestimate infrastructure dependencies and schedule work in parallel that should be sequential." The event was a one-time circumstance. The pattern it revealed is real and recurring. Dig one layer deeper before you classify something as noise.

From Observations to Decisions: The Rule of Three

The most common thing I see in retrospective notes is a list of ten, fifteen, sometimes twenty observations. "Communication could have been better." "We needed more alignment earlier." "The architecture review happened too late." "Dependencies weren't tracked carefully enough."

These are not decisions. They are weather reports. They describe the climate of the project without telling you what coat to wear next time.

Every observation on your retro list needs to be converted into a decision: something specific that will actually be different on the next project. Not a vague intention — a concrete, verifiable change in behavior.

Observation (not actionable)
  • "Communication with stakeholders could have been better."
  • "We needed more clarity on requirements early."
  • "Dependencies weren't tracked carefully enough."
  • "The architecture was too complex from the start."
Decision (specific, verifiable)
  • "Starting now, the tech lead sends a written status update every Friday, even if there's nothing new."
  • "No implementation begins until the acceptance criteria are signed off in writing by the PM."
  • "All external dependencies are tracked in a shared doc, reviewed in every weekly sync."
  • "Every new service gets a 30-minute architecture review before any code is written."

Notice the difference. The decisions tell you exactly what will be different. You could check on them in three months and know whether they happened or not. The observations are unfalsifiable — there's no way to know whether "communication was better" on the next project unless you define what better means.

Now here's the hard part: the Rule of Three. If your retro generates more than three decisions, force a vote. Rank them. Pick the top three. Drop the rest.

This feels wrong. You worked hard to surface all these problems, and now you're deliberately ignoring most of them. But this is the correct move. Human capacity for changing habits is limited. A team that commits to three genuine changes and follows through is infinitely better off than a team that commits to twelve changes and follows through on none. The goal is not to write a comprehensive list. The goal is to actually improve.

The items that don't make the top three don't disappear forever. They go into a team backlog. On the next project, you review the list and see if any of the deferred items are still relevant. Often they're not — the situation changed, or another fix addressed them indirectly.

The Action Tracking Problem

Retro decisions die in the gap between the meeting room and the next project.

You leave the retro with three good decisions. Everyone agrees. The energy in the room is genuine. And then the next project starts, and there's immediately a crisis that needs attention, and people are in execution mode, and the retro decisions are in some notes document that nobody has opened in three weeks, and by the time the new project reaches the point where those decisions would have mattered, everyone has forgotten about them.

This is not a discipline failure. It's a systems failure. The retro generated decisions but didn't create a mechanism to enforce them.

Fix it the same way you'd fix any other tracking problem: put it in your actual tracking system. If your team uses Jira or Linear or GitHub Issues, create tickets for retro decisions the same day as the retro. Assign an owner. Set a due date — not "on the next project" but a specific date by which the process or practice change will be in place. Review the tickets in your next sprint planning.
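The shape of that tracking can be sketched in a few lines. This is a minimal illustration, not tied to Jira, Linear, or GitHub Issues; the fields, names, dates, and example decisions are all assumed:

```python
# Minimal sketch: track retro decisions like any other work item,
# with a named owner and a concrete due date, and surface the open,
# overdue ones at sprint planning. All names and dates are illustrative.
from dataclasses import dataclass
from datetime import date

@dataclass
class RetroDecision:
    title: str          # the concrete, verifiable change
    owner: str          # one named person, not "the team"
    due: date           # a specific date, not "on the next project"
    done: bool = False

def sprint_review_flags(decisions: list[RetroDecision], today: date) -> list[str]:
    """Return the decisions to raise in sprint planning: still open and past due."""
    return [d.title for d in decisions if not d.done and d.due < today]

decisions = [
    RetroDecision("Weekly written status update from tech lead", "avery", date(2024, 7, 5)),
    RetroDecision("Shared external-dependency doc, reviewed weekly", "sam", date(2024, 7, 19), done=True),
]

print(sprint_review_flags(decisions, today=date(2024, 7, 12)))
# -> ['Weekly written status update from tech lead']
```

The mechanism matters more than the tool: the decision shows up, by name, in a meeting that already happens, instead of waiting in a document for someone to remember it.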

The harder version of this problem: the best retro decisions aren't tasks to complete. They're habits to build. "Send a weekly status update" is not a one-time task. It's a practice that needs to become automatic. Tasks have a done state. Habits don't.

For habit-type decisions, you need a different mechanism: a calendar recurring event, a checklist in your project kickoff template, a standard agenda item in your weekly sync. Something that puts the new behavior in front of you at the moment you need it, not in a document you'll never read again.

Making Decisions Stick — The Three Mechanisms

  • A recurring calendar event that prompts the behavior on schedule, whether or not anyone remembers the retro.
  • A checklist item in your project kickoff template, so the change applies automatically to every future project.
  • A standing agenda item in your weekly sync, so the practice is reviewed in a meeting that already happens.

The Post-Mortem Is Different from the Retro

A retrospective is about your team's process. How did you execute? What would you do differently? The audience is your team, and the goal is your team's improvement.

A post-mortem is about a specific failure, and its audience is the entire organization. The goal is not just to improve your team — it's to make sure the same failure doesn't happen to any other team, anywhere in the company, ever again.

These are different documents with different purposes. Conflating them is a mistake that dilutes both.

The post-mortem that doesn't work looks like this: root cause analysis that ends at a person. "The engineer who deployed the change should have been more careful." Or: "The on-call rotation should have caught this sooner." These conclusions are useless, and not because they're mean. They're useless because they don't give you anything to change. The next time a deployment happens, a different engineer will make a different mistake that the process makes equally easy. You've explained the past without protecting the future.

The post-mortem that works starts from a different assumption: the people involved were competent and trying to do the right thing. Given that premise, the question becomes: what in the system made this mistake easy to make and hard to catch? Not who, but what.

Same Incident — Two Post-Mortem Approaches

Approach A (person-focused): "The engineer did not follow the deployment checklist, which led to a missing configuration flag, causing the outage. Going forward, engineers will be reminded to follow the checklist."

Approach B (system-focused): "The deployment checklist is a manual document that engineers must remember to open and follow under time pressure. The configuration flag that was missed has no automated validation. We will add a pre-deployment check that fails the deploy if the flag is not set. We will also explore why the engineer felt time pressure during this deployment — is our deployment process creating urgency that bypasses safety checks?"

Approach A explains the incident. Approach B prevents the next five incidents like it.
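The automated check that Approach B proposes can be as small as one script in the deploy pipeline. Here is a minimal sketch, assuming a JSON config file and made-up flag names — none of these details come from the incident itself:

```python
# Hypothetical pre-deployment gate: abort the deploy if required
# configuration flags are missing or unset. The config format and the
# flag names are assumptions for the sketch, not from any real system.
import json
import sys

REQUIRED_FLAGS = ["enable_connection_pooling", "max_db_connections"]

def missing_flags(config: dict) -> list[str]:
    """Return the required flags that are absent or unset in the config."""
    return [f for f in REQUIRED_FLAGS if config.get(f) in (None, "")]

def validate_or_abort(path: str) -> None:
    with open(path) as fh:
        config = json.load(fh)
    missing = missing_flags(config)
    if missing:
        # A non-zero exit fails the pipeline before rollout;
        # no engineer has to remember to open a checklist.
        sys.exit(f"Deploy blocked: missing config flags: {', '.join(missing)}")

if __name__ == "__main__" and len(sys.argv) > 1:
    validate_or_abort(sys.argv[1])
```

The particular check is beside the point. What matters is that the lesson now lives in the pipeline, where time pressure and turnover can't erase it.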

The Five Whys is a well-known technique for getting from surface explanation to root cause. Ask "why" five times in succession. "The service went down. Why? Because the database ran out of connections. Why? Because connection pooling wasn't configured. Why? Because the team didn't know the service needed it at that load level. Why? Because load testing wasn't part of the pre-launch checklist. Why? Because load testing is treated as optional on fast-moving projects."

Five levels deep, you've gone from "service went down" to "our organizational culture treats certain safety practices as optional." That's a real insight. The first level was just a description.

A warning about the Five Whys: it implies a single root cause, and real incidents almost never have a single cause. They have contributing factors — usually three to five — that all had to align for the failure to happen. If you remove any one of them, the incident might not have occurred. A good post-mortem maps all the contributing factors, not just the chain that led to the most proximate cause.

Blameless Culture Is Not Soft. It's Engineering Pragmatism.

Every organization says it values learning from failure. Very few of them have actually built the conditions for it.

The evidence from decades of studying high-reliability organizations — nuclear plants, aviation, hospital intensive care units — is unambiguous: teams that punish mistakes have worse safety records than teams that don't, even when the teams that don't punish mistakes appear "softer" on the surface. The reason is information flow. When mistakes are punished, people hide them, minimize them, or deflect blame. When mistakes are treated as data, people report them honestly, investigate them thoroughly, and learn from them systematically.

In software, the same dynamic plays out. Google's Site Reliability Engineering book codified this for the tech industry: a blameless post-mortem culture is not about protecting people from accountability. It's about protecting the information pipeline that lets the organization improve. If an engineer fears that admitting to an error will affect their performance review, they will find a way to describe the incident that doesn't implicate them. The true cause of the incident will remain hidden. The next incident will happen for the same reason.

Blameless does not mean consequence-free. If an engineer repeatedly ignores safety protocols or shows a consistent pattern of poor judgment, that's a performance issue addressed through performance management, not a post-mortem. But the post-mortem itself is not the mechanism for that. The post-mortem is a learning document. Keep these things separate or you'll corrupt both.

The Knowledge Transfer Problem

You run a great retrospective. You write a thorough post-mortem. You create a document that would genuinely help any engineer who reads it before starting a similar project.

And then the document sits in a Google Drive folder called "Project Retrospectives 2024" and nobody ever reads it again.

This is the knowledge transfer problem, and it's bigger than just retros. Institutional knowledge in most engineering organizations lives in documents that nobody reads, wikis that go stale, Confluence pages that haven't been opened since the person who wrote them left the company. The knowledge was captured. It just wasn't transferred.

Think of knowledge transfer as a ladder. Each rung is more effective than the one below it. Most post-mortems are stuck at the bottom rung.

1. Write it down
A document, a wiki page, a post-mortem report. Works for people who specifically go looking for it. Almost everyone else will never see it. Effective for exactly zero people who join the team after the doc was written and aren't aware it exists.

2. Share it in a talk or demo
A team presentation, an engineering all-hands, a tech talk. Works for the people in the room. Fades within weeks as the audience's memory of the specifics erodes. Gets you to maybe 30% retention at three months if the talk was really good.

3. Build it into a checklist or process
A production readiness review checklist, a project kickoff template, a standard agenda item for a recurring meeting. Works because it appears at the moment of relevance, not in a document graveyard. Future teams don't need to know about the incident — they just follow the process that the incident created.

4. Build it into a tool
Automated validation, static analysis, deployment checks, monitoring alerts. The highest rung because you've made the mistake structurally difficult. No amount of forgetting, team turnover, or time pressure can override it. The learning from the incident is now baked into the system. This is how organizations at scale prevent incidents from recurring — not by asking people to remember lessons, but by making the system remember them instead.

When you finish a post-mortem, ask explicitly: what's the highest rung we can reach with this learning? Writing it down is the floor, not the ceiling. If the learning can become a checklist item, make it one. If it can become an automated check, build it. The more you can offload to the system, the less you're depending on the fragile transmission of institutional memory through human minds.

Building Your Personal Execution Playbook

Everything so far has been about your team's learning. This section is about yours.

A personal execution playbook is not a document. It's a set of mental models — compressed representations of situations you've seen before that let you recognize them instantly when they appear again, even in disguise.

Every experienced engineer has some version of this, built up over years of projects, whether they're conscious of it or not. The engineer who, in the second week of a project, says "this feels like the situation we had at the old company where the customer kept changing requirements after design was done — we should get explicit sign-off now before it's too late" — that person is running a mental model. They've pattern-matched the current situation against something they experienced before.

The difference between engineers who build this kind of instinct fast and those who build it slowly is one habit: deliberate reflection after every major project.

It doesn't take long. Thirty minutes, alone, after the project ends. No meeting, no audience. Just you and a blank page. The prompt is specific:

The 30-Minute Reflection Prompt

Write your answer to this: "What would I tell myself on day one of this project, knowing what I know now?"

Not what went wrong — what you now understand about this type of project that you didn't understand before. Not events — patterns. Not "the API broke" — "API contracts between teams should be validated early and often, before either team has written significant code against them."

Keep the output short. One paragraph per insight. If you can't explain it in one paragraph, you haven't understood it well enough yet. Compress until it fits.

Over time, you accumulate a library of these paragraphs. They start to cluster into categories. You notice that most of your cross-team execution problems trace back to a few recurring causes. You notice that your stakeholder communication failures follow a pattern — too much in the first month, too little in the middle, too panicked at the end. You notice that the projects where you estimated accurately were the ones where you'd worked in that domain before and had concrete anchors, and the ones where you were way off were domains you were new to.

These are your patterns. They're worth more than any framework in any book, because they're calibrated to your specific experience, your specific context, and the specific types of problems you tend to encounter.

When Playbooks Become Instincts

A playbook only matters if you actually use it. And the problem with long documents is that you don't consult them in the moments you need them. You're in a meeting, something is happening fast, and your brain is not going to say "let me pause and check my notes." Your brain is going to pattern-match against what it knows, and react.

This is why the goal of the playbook is to internalize the patterns until they become instincts — things you notice automatically, without consciously consulting any reference. The way an experienced doctor looks at a patient and notices something that a resident would have missed, not because they're running a checklist in their head but because they've seen it enough times that it registers as a signal before they've consciously thought about it.

Getting from "written down" to "automatic" requires repetition and active recall. When you start a new project, before you dive in, read your playbook. Not to look things up, but to prime your pattern-recognition. You're pre-loading the relevant models so they're active and ready to fire when you encounter the situations.

Over years, you'll find that most new projects no longer feel new. They feel like variations of projects you've seen before. The category might be new — maybe you've never built a data platform before — but the execution dynamics are familiar. The stakeholder alignment challenges are familiar. The moment at which ambiguity crystallizes into clarity is familiar. The warning signs that a project is starting to drift are familiar. You've seen all of these in different shapes.

This is what fifteen years of experience actually looks like, when it's been accumulated deliberately. Not just a lot of projects, but a lot of learned patterns that you can apply to the next project before it makes its mistakes.

The difference between an engineer with fifteen years of experience and one with one year of experience fifteen times is not how many projects they've shipped. It's how many patterns they've extracted and kept.

Making Learning Part of the Culture, Not the Calendar

Everything in this chapter can be done individually — by one engineer, on their own. But the highest leverage comes when a team treats learning as a continuous practice rather than a post-project ceremony.

Teams that learn fast have a few things in common. They talk about what's working and what isn't during projects, not just after. They do short mid-project check-ins — not to fix everything, but to catch drift before it becomes expensive. They share learnings across teams, not just within them. And they treat the question "what are we learning?" as a standing agenda item, not something you only ask when a project fails spectacularly.

The best teams don't have long retrospectives. They have short, frequent reflections. A fifteen-minute conversation at the end of each sprint about what the team learned this week compounds faster than a two-hour retrospective at the end of each quarter. Because the learning is applied immediately, on the next sprint, while the project is still live. Not six months later, on a different project, by which point the specifics are hazy and the momentum is gone.

The Learning Flywheel

Reflect briefly and often, convert observations into a few concrete decisions, apply them immediately while the project is still live, review whether they stuck, and reflect again. Each turn of the loop is small. The compounding is not.

None of these meetings is expensive. Together they add up to maybe two hours per month per engineer. The return is an organization that visibly improves over time, where the same problems stop recurring, where new projects start faster because people know what to do, and where engineers feel like they're genuinely developing rather than just grinding.

The Compounding Return

Here is the closing argument for everything in this chapter.

An engineer who extracts one durable pattern from each major project, and works on three major projects a year, has thirty patterns after ten years. Each pattern lets them recognize a recurring situation faster and respond more effectively. The thirtieth project takes dramatically less time to get right than the first — not because they're working harder, but because they've already made most of the mistakes they need to make.

The engineer who ships projects without reflecting on them also has ten years of experience. But they've had roughly the same year ten times. They might be marginally better — experience does teach even without deliberate reflection, just more slowly — but they haven't compounded. They haven't built the library.

The gap between these two engineers widens every year. By the time they're senior staff or principal engineers, they're not even playing the same game. One of them can look at a new project proposal and immediately identify the three things that will make it hard. The other looks at the same proposal and sees what's in front of them: the written requirements, the stated timeline, the listed dependencies. What's missing, what's ambiguous, what's going to change — they'll discover those things as they happen, the same way they did on every previous project.

This is the difference reflection makes. Not on any single project. Over a career.

The retrospective is not the end of the project. It's the beginning of the next one. It's the moment you stop being someone who worked on this project and start being someone who learned from it. That transition doesn't happen automatically. You have to make it happen, deliberately, every time.

Do that, and every project you take on makes you better at the next one. Stop doing it, and projects are just things you survive — expensive, educational-in-theory, but not actually building the thing that compounds.

The fog of war never fully clears. There will always be ambiguity, misalignment, scope changes, and dependencies that are late. But the more projects you've genuinely learned from, the faster you navigate the fog. Not because the fog gets lighter, but because you've been in it before, and you know where the walls are.