Here is something that happened on a real project. A team spent four months building a new payment processing pipeline. They had weekly standups, detailed tickets, a Gantt chart that reached all the way to launch day. On week fifteen, two weeks before launch, someone asked a question nobody had asked before: "Does our payments vendor support the new API format we designed around?" Three emails later, they had the answer. The vendor had deprecated that API eight months ago. The team had been building to a spec that didn't exist anymore.
That wasn't a technical failure. It was a risk management failure. The assumption — that the vendor's API was stable — was load-bearing. The entire architecture rested on it. Nobody wrote it down. Nobody checked it early. And so four months of work had to be partially redesigned in two weeks.
Risk management, done right, would have caught that in week one. Not with a committee, not with a 40-row spreadsheet, not with a dedicated risk manager. With one simple question asked at the right moment: what are we assuming that we haven't verified?
This chapter is about how to build that kind of discipline into the way you run projects — without turning it into a bureaucratic tax that slows everything down and that nobody takes seriously after the first month.
Why Engineers Are Terrible at Risk Management By Default
Engineers are optimists. This is actually a strength most of the time. Optimism is what lets you commit to building something before you know exactly how to build it. Optimism is what keeps you debugging at 2am when anyone sensible would have given up. Optimism makes engineers productive.
But optimism has a shadow side when it comes to risk. The same mental habit that says "we can build this" also says "that probably won't happen" and "we'll deal with it if it comes up." Risks feel abstract and hypothetical. The task in front of you feels concrete and real. So engineers systematically under-invest in thinking about what could go wrong.
There's a second problem: the formal risk management frameworks that exist in most organizations are so heavy that using them feels like punishment. You've probably seen these. The risk register has seventeen columns. There's a scoring matrix. There are monthly risk review meetings. It takes ninety minutes to add a new risk to the system. The process is so painful that engineers stop adding risks, the document becomes stale in week three, and the whole thing becomes a compliance exercise — a box to check, not a tool to use.
The Trap
Heavy process kills the behavior you actually want
When risk management is burdensome, people stop doing it honestly. They add risks that are safe and generic ("delivery delays," "scope creep") and leave out the real ones that feel too politically sensitive to name, or too embarrassing to admit. You end up with a risk register full of risks nobody is actually worried about, and the dangerous stuff lives in someone's head or in a Slack thread from three months ago.
The goal is not to eliminate risk management. The goal is to make it so lightweight and so obviously useful that engineers do it naturally, keep it updated, and actually act on it. That means stripping it down to just the parts that work.
What Risk Actually Is (And What It Isn't)
Before we get into tools and techniques, let's be precise about what we mean when we say "risk." This matters because most project teams use the word loosely, and that looseness lets important things fall through.
A risk is something that might happen and would, if it did, change your plan. It's a combination of two things: the probability it happens at all, and the impact it would have if it did. A risk that's certain to happen isn't a risk — it's a fact about your project, and you should plan around it. A risk that would be completely harmless if it happened isn't worth tracking.
People confuse risks with several other things:
1. Issues. An issue is a risk that already happened. Once a risk materializes, it becomes a problem to solve, not a probability to monitor. Conflating the two muddles your risk register and makes it feel like busywork.
2. Assumptions. An assumption is something you're treating as true without having verified it. Assumptions are the parents of risks. If an assumption turns out to be wrong, you have a risk, or already an issue. The most powerful risk management practice is surfacing assumptions early and verifying them.
3. Constraints. A constraint is a fixed boundary you have to work within: launch date, budget, team size. Constraints aren't risks. They're the rules of the game. Treating them as risks ("what if we run out of time?") confuses the issue. You don't manage constraints with a risk register. You design your plan around them.
4. Dependencies. A dependency is work or a decision that belongs to someone else, which your work is waiting on. Dependencies can become risks: if the dependency is late, your plan breaks. But they have their own tracking mechanism, which we cover in Chapter 18. Don't mix them into your risk register.
Keeping these categories clean isn't pedantic. When everything lives in the same list, the list becomes impossible to act on. A risk requires mitigation planning. An issue requires someone to fix it today. An assumption requires verification. Mixing them means none of them get the right response.
The Only Two Axes That Matter
Every risk framework in the world reduces to two questions: how likely is it, and how bad would it be? Everything else is decoration.
Some frameworks use a 1-to-5 scale for each axis. Some use percentages. Some have elaborate weighted formulas. For most engineering projects, three levels are enough: low, medium, and high. Here's why three and not five: on a five-point scale, people agonize over whether a risk is a 3 or a 4. That debate consumes more energy than the risk itself. With three levels, the conversation is fast.
| | Low Probability (unlikely to happen) | Medium Probability (could go either way) | High Probability (more likely than not) |
|---|---|---|---|
| High Impact (project-altering) | Monitor: build contingency | Mitigate: active plan needed | Act Now: redesign around this |
| Medium Impact (painful but survivable) | Accept: note it, move on | Monitor: check periodically | Mitigate: active plan needed |
| Low Impact (minor inconvenience) | Accept: not worth tracking | Accept: note it, move on | Accept: plan a response |
The matrix gives you four responses: Act Now (this risk is dangerous enough that you should change your plan to avoid or contain it), Mitigate (build a specific plan for what you'll do if this happens), Monitor (check on it regularly and escalate if the probability or impact changes), and Accept (log it and move on — the cost of worrying about it exceeds the expected damage).
Notice that "accept" is a legitimate and often correct response. A lot of risk frameworks implicitly assume you should mitigate everything. That leads to paralysis. You cannot remove all risk from a complex project. If you try, you will spend so much time building safety nets that you never actually move. Deciding to accept a risk is not laziness — it's a judgment call that the cost of mitigation exceeds the expected damage, and that's completely valid.
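The matrix is simple enough to encode directly as a lookup. Here's a minimal sketch; the level and response names follow the matrix above, but the code itself is illustrative, not a prescribed tool:

```python
# The 3x3 probability/impact matrix as a dictionary lookup.
# Keys are (probability, impact); values are the four responses.
RESPONSES = {
    ("low", "high"): "act: monitor (build contingency)",
    ("medium", "high"): "act: mitigate (active plan needed)",
    ("high", "high"): "act: act_now (redesign around this)",
    ("low", "medium"): "act: accept (note it, move on)",
    ("medium", "medium"): "act: monitor (check periodically)",
    ("high", "medium"): "act: mitigate (active plan needed)",
    ("low", "low"): "act: accept (not worth tracking)",
    ("medium", "low"): "act: accept (note it, move on)",
    ("high", "low"): "act: accept (plan a response)",
}

def response(probability: str, impact: str) -> str:
    """Map a (probability, impact) pair to one of the four responses."""
    return RESPONSES[(probability.lower(), impact.lower())]
```

The point of keeping it this small: with three levels per axis there are only nine cells, and the conversation about which cell a risk belongs in takes seconds, not a meeting.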
The Key Insight
Risk management is an investment, not a tax
Every hour you spend on risk management is an hour you're not spending building. So you have to be selective. Focus on the risks that are genuinely dangerous — high probability and/or high impact. Accept the rest and move on. The goal is not zero risk. The goal is zero unpleasant surprises.
The Lightweight Risk Register
A risk register is just a list of risks, kept in one place, with enough information to act on each one. The reason people build overly complicated registers is that they're trying to satisfy a checklist rather than serve the team. Strip it down to what you actually need.
Here is a register that works. Seven columns. No more.
| Risk | Category | Probability | Impact | Response | Trigger | Owner |
|---|---|---|---|---|---|---|
| Payments vendor API deprecated before launch | Technical | Medium | High | Verify API support with vendor by end of week 2. If unsupported, evaluate fallback vendor in parallel. | No written confirmation from vendor by Fri week 2 | Alice |
| Mobile team can't absorb SDK changes in Q3 | Dependency | High | Medium | Meet with mobile lead by week 3 to lock a Q3 slot. If slot unavailable, evaluate self-serve rollout path. | No confirmed slot from mobile by week 3 | Bob |
| Key engineer leaves mid-project | People | Low | High | Cross-train a second engineer on the critical subsystem. Keep the architecture doc current. | Attrition event / extended PTO | Alice |
| Load test environment doesn't match prod at expected scale | Technical | Medium | Medium | Schedule prod-like load test at the 4-week mark. Identify discrepancies early. | Load test scheduled but not run by week 4 | Carol |
| Legal review of data retention policy takes longer than expected | Organizational | Medium | Medium | Submit legal review request in week 1. Build feature behind a flag so it can be held back if review delays. | No response from legal by week 3 | Bob |
Let's talk about each column.
Risk. Write it as a concrete event that happens, not a vague category. Not "technical uncertainty" — that means nothing. "Authentication service doesn't support OAuth 2.1 before our launch date" — that's a risk. Specificity forces clarity. If you can't state the risk concretely, you don't understand it well enough to manage it.
Category. Technical, organizational, people, dependency, external. Categories help you see patterns. If you have eight technical risks and zero organizational ones, that's information — you might be avoiding the uncomfortable organizational conversations.
Probability and Impact. Low, medium, high. Assess them independently. Don't average them. A risk that's highly probable but low impact is different from one that's low probability but catastrophically high impact — even if they score the same in some average. You respond to them differently.
Response. This is the most important column and the one most often left vague. "Monitor" is not a response. Neither is "talk to the team." A response is a specific action: who will do what, by when, to reduce either the probability or the impact of this risk. If you can't write a concrete response, you don't have a response — you have a wish.
Trigger. This column is the one that makes the register actually work. A trigger is the specific condition that tells you it's time to activate your response plan. Without a trigger, you'll check on risks when you remember to — which means you won't check until it's too late. With a trigger, the register manages itself. "If X doesn't happen by date Y, escalate Z." That's a system, not a worry.
Owner. One person. Not a team, not "Alice and Bob." One person is responsible for watching this risk, activating the response if the trigger fires, and keeping the entry up to date. Shared ownership means nobody owns it.
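A register this small fits comfortably in a plain data structure, which is one reason not to reach for heavyweight tooling. A hypothetical sketch, with field names mirroring the seven columns:

```python
from dataclasses import dataclass

# One row of the seven-column register. This shape is a suggestion
# for keeping the register in code or a script, not a prescribed schema.
@dataclass
class Risk:
    risk: str          # concrete event, not a vague category
    category: str      # technical / organizational / people / dependency / external
    probability: str   # low / medium / high
    impact: str        # low / medium / high
    response: str      # who does what, by when
    trigger: str       # observable condition that activates the response
    owner: str         # exactly one person, never a team

# Example entry, taken from the register above:
vendor_api = Risk(
    risk="Payments vendor API deprecated before launch",
    category="technical",
    probability="medium",
    impact="high",
    response="Verify API support with vendor by end of week 2; "
             "if unsupported, evaluate fallback vendor in parallel.",
    trigger="No written confirmation from vendor by Fri week 2",
    owner="Alice",
)
```

Whether the register lives in a spreadsheet, a wiki table, or a script like this matters far less than the discipline of filling in all seven fields, especially trigger and owner.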
The Five Categories of Risk That Kill Projects
After working on enough large projects, you start to see patterns in what actually goes wrong. The risks that kill projects don't usually come from the domain people spend the most time worrying about: the technical design. They come from five categories, listed here roughly in order of how often they're underestimated.
1. Unverified Assumptions
Every project rests on a set of assumptions. Some are technical — the library we're using supports this operation. Some are organizational — the other team has capacity to take this dependency. Some are about the external world — the vendor's SLA is what their documentation says it is. These assumptions are invisible load-bearing walls. When one is wrong, the whole thing can come down.
The discipline here is simple: every two weeks, ask your team to list the top three things the project is assuming that haven't been verified. Write them down. Assign someone to verify each one by a specific date. When you verify an assumption, remove it from the list. This takes about ten minutes in a standup. It has an outsized effect on project outcomes because it's pulling risks forward — finding out about problems when you still have time to respond, rather than when you're two weeks from launch.
War Story
A platform team at a large tech company spent six months building a data pipeline to replace a legacy system. Three weeks before cutover, they discovered the legacy system had been running with a subtle bug for two years: it was de-duplicating events in a way the new system didn't replicate. Downstream consumers had built their systems to compensate for this "feature." The assumption — that the new pipeline just needed to produce correct output — was wrong. Correct output broke things that had silently adapted to incorrect output.
Nobody had asked: "What behaviors does the old system have that consumers might have come to depend on?" That question, asked at week one, would have changed the design fundamentally and saved three weeks of emergency remediation.
2. People and Capacity Risks
The people risks that actually kill projects are not usually "the team is incompetent." They are: the person who owns the most critical part of the system takes a new job, or goes on parental leave, or gets pulled onto another fire. Or a team you depend on gets reorganized and your project is no longer their priority. Or the one person who knows how the legacy system works leaves three weeks before migration.
You cannot prevent these things. But you can reduce their impact. The mitigation for key-person risk is knowledge distribution — making sure the critical information doesn't live entirely in one person's head. That means code reviews, architecture documentation, cross-training, and pairing on the most complex parts. Not because the person is going to leave, but because bus factor is a real project risk.
The mitigation for capacity risk — other teams not having bandwidth — is the same as for any dependency: surface it early. The time to find out that the mobile team can't take your SDK change until Q4 is week one, not week twelve. Lock capacity commitments early, in writing, with the right people.
3. Organizational and Political Risks
These are the risks nobody wants to write down. The VP of the other org doesn't actually want this to succeed because it threatens their team's autonomy. The security team has informally decided to block anything that touches the auth layer until their headcount situation is resolved. The company is about to go through a reorg and your project's sponsorship is one exec change away from evaporating.
Writing these down feels uncomfortable. You're naming something political in a document that others might see. But unnamed risks are the most dangerous ones, because they can't be managed. The way to handle organizational risks is to write them honestly in a form that focuses on the event and the impact, not on the personalities. "Sponsorship of this project is concentrated in one executive — if leadership changes, project mandate may be unclear" is a professional way to document the risk that your sponsor might be managed out.
Organizational risks often have mitigations that look nothing like technical mitigations. They look like: build a second sponsor relationship in a different part of the org. Document the business case at the VP level before the review cycle. Get the security team involved at week two rather than presenting to them at week ten.
4. External and Third-Party Risks
Anything that lives outside your control — a vendor, a regulatory deadline, a third-party API, a cloud provider's roadmap — is a risk you can't eliminate, only hedge. The vendor might change their pricing. The API might get deprecated. The regulation might be finalized differently than the draft you designed around.
The mitigation for external risks is almost always the same: reduce coupling and build optionality. Don't design your architecture so tightly around a specific vendor that switching would require a rebuild. Design the interface first, then implement against it, so the implementation is swappable. Build in a scheduled check on external assumptions — especially long-running dependencies where the external world can change while you're heads-down.
5. Integration and Interface Risks
This category is underestimated more than any other. Teams build components in isolation and assume that when they put them together, they'll work. Sometimes they do. But interfaces are where assumptions from different teams collide. Team A assumed the payload size would never exceed 1MB. Team B has been building something that regularly produces 5MB payloads. Neither team was wrong in isolation. Together, they have an outage.
The mitigation is to test integrations as early as possible, not as late as possible. Don't save integration for the end. Define the interface contract at week two, build a stub of the integration at week four, and run an end-to-end test at week six — even if it's not production-ready. Every week you delay integration testing is a week you're accumulating assumptions that may be wrong.
The Pre-Mortem: The Most Underused Tool in Engineering
A post-mortem is a meeting you have after something goes wrong, to understand what happened and prevent it from happening again. A pre-mortem is a meeting you have at the beginning of a project, to imagine that something went wrong and figure out why.
The psychologist Gary Klein developed this technique after observing that teams were systematically overconfident about their plans. When you ask a team to brainstorm risks before a project starts, they hold back. It feels like you're being negative. It feels like you're doubting the team. People don't want to be seen as the one who assumes failure. So the session produces a polite list of minor risks that everyone already knew about.
The pre-mortem short-circuits this by starting from the premise that failure already happened. You don't ask "what could go wrong?" You say: "It is now six months from today. The project failed spectacularly. What happened?"
This reframe changes the psychology entirely. Instead of imagining failure as a hypothetical, you're treating it as a fact and working backwards. The defensiveness drops. People stop protecting their own areas. You get honest, specific, creative answers about the real ways this project could go badly — and often, those answers surface risks that nobody had articulated before.
How to Run a Pre-Mortem
A 45-minute session that pays for itself ten times over
Setup (5 min): Gather the team. Say: "Imagine it's six months from now. The project failed — completely, publicly, embarrassingly. I'm not asking you to think it will fail. I'm asking you to imagine it did."
Silent writing (8 min): Everyone writes down, individually, the top three to five reasons the failure happened. Silent and individual is critical — group brainstorming primes people to repeat what they already heard and suppresses novel ideas.
Round-robin (20 min): Go around the room. Each person reads one item from their list. No debate, no dismissal. Capture everything on a shared surface. Continue until lists are exhausted or time runs out.
Clustering (10 min): Group similar items. Look for themes. What category of failure keeps coming up? Which items surprised you?
Action (5 min): For the top three to five themes, assign an owner to investigate and report back within a week. These become the seeds of your risk register.
The pre-mortem is not a substitute for ongoing risk management. It's the kickoff. You run it at the beginning of the project when your mental model is fresh and your assumptions are most visible. The risks you surface go straight into your register.
Run a second pre-mortem at the midpoint of the project. This one is different — you now have six to eight weeks of concrete information about how things are going. The assumptions that have turned out to be wrong are visible. New risks have emerged that you couldn't have seen at the start. A midpoint pre-mortem catches the category of risk that only becomes visible once you're actually building.
Decision Triggers: Building a System That Manages Itself
The weakest part of most risk management is the point of action. You identified the risk. You wrote it down. You assigned a response. Then what? Most teams check on risks when they remember to — at weekly standup, maybe, or when the project manager sends a nudge. This is fine until it isn't.
The better approach is decision triggers. A decision trigger is a specific, observable condition that, when it occurs, automatically activates a pre-planned response. You define the trigger when you identify the risk, not when the risk materializes. This removes the most dangerous moment in risk management: the moment when a risk is starting to materialize and you have to decide in real time, under pressure, what to do.
Real-time decisions under pressure are bad decisions. You're missing information. You're stressed. You have conflicting pressures pulling you in different directions. You're tempted to assume the problem will resolve itself. If you've pre-defined your trigger and your response, you don't have to make that decision under pressure — you just follow the plan you made when you were calm and thinking clearly.
The Condition
A specific, observable event or state. Not "if things look bad" — that's subjective and requires judgment you might not want to apply in the moment. "If we haven't received written API documentation from the vendor by Friday, October 4th" — that's a trigger you can evaluate with a yes or no.
The Response
The action taken when the trigger fires. Ideally, this is a pre-authorized action — something you've already gotten sign-off on in advance, so you're not going back to ask permission in the moment of a near-crisis. "Alice begins evaluation of fallback vendor. Project manager escalates to VP by Monday."
The Owner
One person who monitors the trigger. In a weekly sync, they answer one question: has the trigger fired? This can take ten seconds. If the answer is yes, the pre-planned response kicks off immediately.
The Escalation
If the response doesn't resolve the situation, what happens next? Name the escalation path now, before you need it. "If vendor evaluation doesn't produce a viable fallback by October 15th, project scope adjustment is required and must be presented to the program director."
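Because the condition must be evaluable as a plain yes or no, a trigger can literally be written as a function. A sketch; the date, the names, and the pre-authorized response are illustrative, drawn from the running vendor example:

```python
from datetime import date

# A decision trigger: an observable yes/no condition paired with a
# pre-planned, pre-authorized response. All specifics here are
# illustrative assumptions, not part of any real system.
def trigger_fired(today: date, confirmation_received: bool) -> bool:
    """Fires if no written vendor confirmation by Friday, October 4th."""
    deadline = date(2024, 10, 4)
    return today > deadline and not confirmation_received

def check(today: date, confirmation_received: bool) -> str:
    if trigger_fired(today, confirmation_received):
        # Pre-authorized response: no in-the-moment decision required.
        return "Alice begins fallback vendor evaluation; PM escalates to VP"
    return "no action"
```

The owner's weekly job reduces to calling the check: a ten-second yes/no, with the response already decided back when everyone was calm.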
Decision triggers are powerful because they convert risk management from a passive, periodic review exercise into an active, real-time monitoring system. The register is no longer something you read. It's something that watches the project and tells you when to act.
There is also a psychological benefit. When you've pre-defined your triggers and responses, you stop worrying about risks continuously. You've offloaded the worry onto the system. You know that if a certain condition arises, a specific response will happen. You can focus on the work in front of you without that low-level background anxiety of "I hope nothing goes wrong."
The Probability Trap: Why High-Impact Low-Probability Risks Are Especially Dangerous
Human beings are bad at thinking about low-probability, high-impact events. We have this tendency to dismiss risks that feel unlikely, even when the consequences of them happening would be severe. Engineers are especially susceptible to this because we're trained to think about systems under normal conditions — the happy path, the expected load, the intended use case.
But the risks that end projects almost always come from the tail. The scenario you thought was too unlikely to plan for. The system that failed in a way nobody had imagined. The org change that nobody predicted.
Here's a useful mental reframe: instead of asking "how probable is this?", ask "what is the cost of a simple mitigation?" For many tail risks, the cost of a lightweight mitigation — a contingency plan, a design decision that preserves optionality, a conversation with a backup vendor — is trivially low compared to the damage if the risk materializes. When the cost of hedging is low and the damage is high, you should hedge even if the probability is low.
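The reframe is just an expected-value comparison. A toy illustration, with invented numbers purely to show the shape of the reasoning:

```python
# Toy expected-value check for a tail risk. All numbers are made up to
# illustrate the reframe: hedge when the hedge is cheap relative to
# probability * damage, even if the probability is low.
def should_hedge(probability: float, damage_weeks: float,
                 hedge_cost_weeks: float) -> bool:
    expected_damage = probability * damage_weeks
    return hedge_cost_weeks < expected_damage

# A 5% chance of a 12-week redesign has an expected cost of 0.6 weeks.
# Half a day spent on a backup plan (0.1 weeks) is clearly worth it:
should_hedge(0.05, 12.0, 0.1)
```

The arithmetic is crude, and real probabilities are guesses. But it makes the asymmetry visible: the comparison is between the hedge cost and the expected damage, not between the hedge cost and zero.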
The Investor Analogy
Smart investors don't spend most of their time on likely outcomes. They spend disproportionate attention on tail risks — the low-probability scenarios where everything goes wrong at once. Not because those scenarios are likely, but because they're the ones that end the game entirely.
Project execution works the same way. The risks that make you a week late are annoying but survivable. The risks that make you three months late, or require a full redesign, or cause a production outage that gets to the board — those are the ones worth spending serious time on, even when they feel remote.
How to Keep the Register Alive
The failure mode of every risk register is that it's created conscientiously at the start of the project and then gradually abandoned. After the first month, it's stale. After the second month, it's an artifact of history, not a live tool. This is so common that many engineers have concluded risk registers simply don't work. But the problem isn't the register — it's the maintenance discipline.
Here is the minimum viable maintenance routine:
1. Five minutes at the weekly sync. Go through the register. Each owner says: trigger fired or not fired? Any change in probability or impact? Any new risks to add? This is not a deep discussion. It's a status check. If a trigger has fired or a risk has escalated, you schedule a separate conversation; you don't resolve it in the standup.
2. New risk intake at any time. Risk identification shouldn't wait for the weekly sync. When an engineer discovers something concerning (a behavior they didn't expect, a conversation that revealed an assumption was wrong, an external signal), they should add it to the register immediately, even if it's rough. The owner can refine it at the next sync. A rough entry today is better than a polished entry next week after the damage is done.
3. Monthly deep review. Once a month, spend twenty minutes doing a real review. Retire risks that have passed their window or been fully mitigated. Reassess the probability and impact of persistent risks given what you've learned. Look for new risks in the categories you haven't checked recently, especially organizational and people risks, which change slowly and are easy to miss in a five-minute weekly check.
4. The midpoint pre-mortem. At the project midpoint, run another pre-mortem (see above). Merge the output into the existing register. By this point you have real project data: you know which parts are harder than expected, which teams are less cooperative than hoped, which technical assumptions have turned out to be wrong. The midpoint pre-mortem is often more valuable than the kickoff one.
The register works only if it is kept honestly. The biggest enemy of an honest register is political pressure to present a clean picture to stakeholders. If you know that surfacing a risk will cause uncomfortable questions, there's a temptation to downgrade its probability or impact until it falls off the register. Resist this. A sanitized risk register doesn't protect you from the risk — it just means you won't have a plan when it materializes.
Communicating Risk to Stakeholders
There's an art to discussing risk with people above you in the organization. Do it wrong and you either panic them unnecessarily, or you come across as someone who doesn't have their project under control, or — worst of all — you bury the risk and they find out about it later, at which point they're angry both about the risk and about the fact that you didn't tell them.
The rule is: communicate risk early, with a paired response. Never walk into a room and say "we have a serious risk" without also being able to say "and here is what we're doing about it." Stakeholders don't want you to tell them the project might fail. They want to know that the person running the project is aware of the dangers and has thought carefully about how to navigate them. Risk plus response demonstrates competence. Risk alone creates fear.
The Risk
State it concretely. "We depend on the payments vendor supporting API version 3. We haven't confirmed this yet." Not "there's some vendor uncertainty." Concrete risk is manageable. Vague risk is terrifying.
Why It Matters
Connect it to the project outcome the stakeholder cares about. "If the API is deprecated, we'll need to redesign the integration layer. Our current estimate is that would cost three to four weeks of additional work."
What We're Doing
Your mitigation and trigger. "Alice is meeting with the vendor account team on Thursday to get written confirmation. If we don't have confirmation by end of week, we'll begin evaluating a backup vendor in parallel."
What We Need
If you need something from the stakeholder — a decision, a relationship, a pre-approval to take a backup action — ask for it explicitly. Don't let the conversation end with you reporting a risk and them nodding. Leave with clear actions.
There is one class of risk you need to escalate immediately, without waiting for the next status update: risks that cross your authority level. If a risk requires a decision you can't make — reallocating budget, changing a deadline that's been communicated to customers, pulling engineers from another team — you need to surface it to the person who can make that decision as soon as you recognize it. The cost of early escalation is a slightly uncomfortable conversation. The cost of late escalation is a crisis.
When You Can't Know the Risk: Managing True Uncertainty
Everything so far assumes you know roughly what the risks are, and the work is tracking and responding to them. But on the most genuinely novel projects — things nobody has built before, migrations of systems nobody fully understands, projects with significant external dependencies on technology that's still evolving — you often don't know what you don't know.
Unknown unknowns are not irrational fears. They are real. And you can't put them in a risk register, because by definition you haven't identified them yet.
The best tool against unknown unknowns is not risk management — it's the spike. A spike is a short, time-boxed investigation specifically designed to surface what you don't know. You take the part of the project that feels most uncertain, the part where your confidence is lowest, and you spend two or three days just exploring it. Not building a production-quality implementation. Just digging in, trying things, reading the source code of the library you're depending on, talking to someone who's done this before.
The output of a spike is not code. The output is a set of risks that you couldn't have written down before. You now know what you were assuming that's wrong. You know what the hard parts actually are. You know what questions to ask. Spikes convert unknown unknowns into known unknowns — and known unknowns can go on the register.
The other tool against true uncertainty is building in slack — time and optionality in your plan to respond to things you can't predict. This is covered in depth in Chapter 10 on the Living Plan. The connection to risk management is this: if your plan is perfectly efficient, with every week fully packed and no room to absorb surprises, then the first unknown unknown that materializes will blow your schedule. Some slack is not inefficiency — it's insurance against the things you couldn't see coming.
The Deeper Purpose
This chapter has been practical — tools, frameworks, templates, meeting formats. But it's worth stepping back to name what risk management is actually for at a deeper level.
Risk management is fundamentally about maintaining your ability to choose. Every time a risk materializes unexpectedly, it removes options from you. You're forced into emergency mode. You're making decisions under pressure, with incomplete information, with stakeholders watching. Your ability to steer the project is replaced by the project steering you.
When you manage risk well, you preserve optionality. You find out about problems when they're still problems, not crises. You have time to evaluate alternatives. You make decisions with clear heads. You can tell stakeholders what's happening before they have to ask. You are in control — not because nothing goes wrong, but because when things go wrong, you have a plan.
That's the real payoff. Not a clean risk register. Not a successful pre-mortem. The actual payoff is that you spend the second half of the project executing, rather than firefighting. You don't arrive at launch week exhausted and surprised. You arrive having already dealt with the hard stuff, and you ship.
The purpose of risk management is not to prevent bad things from happening. It is to make sure that when bad things happen, you are not surprised, you are not unprepared, and you are not without options.
Chapter Summary
What You Should Take Away
Risk is probability × impact. Track only what clears both thresholds. Low-probability low-impact risks are not worth the overhead — accept them and move on.
Keep the register to seven columns: risk, category, probability, impact, response, trigger, owner. Every column does work. No column is decoration.
The five categories that kill projects: unverified assumptions, people and capacity risks, organizational and political risks, external and third-party risks, and integration and interface risks. Cover all five.
Run a pre-mortem at kickoff and at the midpoint. The kickoff pre-mortem surfaces assumptions. The midpoint pre-mortem surfaces the risks that only became visible once you were building.
Decision triggers convert passive monitoring into active management. Pre-define the specific condition that activates each response. Decide now what you'll do, so you don't have to decide under pressure later.
Communicate risk with a paired response. Never surface a risk without a mitigation. Risk alone creates fear. Risk plus response demonstrates competence.
Against unknown unknowns: run spikes and build in slack. Spikes convert the unknowable into the knowable. Slack gives you room to respond when reality deviates from plan.
The Principle in One Sentence
Identify your assumptions before they become surprises, assign every risk an owner and a trigger, and spend ninety percent of your worry on the five percent of risks that would actually end the project.
Three Questions for Your Next Project
- What are the top three things this project is assuming that we haven't verified — and who is responsible for verifying each one by when?
- For the highest-impact risks on the register, do we have a specific trigger defined — or are we relying on someone remembering to check?
- If I ran a pre-mortem right now and said "the project failed spectacularly — why?", what would my team say that isn't currently in the risk register?