The Harder Parts of Blameless Culture
I recently read a great piece on blameless culture that covers the fundamentals well—assume good intentions, focus on systems not people, reward those who speak up. If you haven’t read it, it’s worth your time.
But after years of running post-mortems, I’ve noticed two patterns that don’t get discussed as often. They’re subtler, and in some ways harder to fix.
Self-Blame Is Not Blameless
Most conversations about blameless culture focus on not blaming others. That makes sense—it’s the obvious failure mode. But there’s an opposite side that’s just as damaging: blaming yourself.
I once worked with a leader I deeply respected. A servant leader in the truest sense. When things went wrong, he’d shield the team from any finger-pointing. Great instinct. But he had a habit of turning that blame inward. “That was my fault,” he’d say. “I should have caught that.”
At first, it felt noble. Protective, even. Over time, I realized it was just as problematic as blaming someone else.
When someone takes blame—even voluntarily—the post-mortem shifts. It becomes about a person again, not a system. The team feels relieved (“well, at least we know whose fault it was”) and the harder questions go unasked. Why did the system allow this? What guardrails were missing? What would prevent this from happening to anyone?
Self-blame short-circuits the process. It gives everyone an easy out, and the real improvements never happen.
The best post-mortems I’ve been part of take a third-person view. They look at the system as if no one in the room was involved. It sounds cold, but it’s kinder: no one has to carry the weight, and the team gets solutions that actually work.
Post-Mortems Are Often Too Shallow
The second pattern is stopping at the first fix that sounds reasonable. Here’s an example. Years ago, I was facilitating a post-mortem for an incident I hadn’t been directly involved in. A database migration had taken an unexpectedly heavy lock on a table, freezing the production database for several minutes and impacting customers.
The team had a solution ready: document the migration pattern, add it to a list of things to avoid, share the learning. Move on.
It felt tidy. I was honestly ready to move on too.
Then someone on the team suggested a broader fix: stress testing migration code before it reaches production. The real problem wasn’t that someone wrote a bad migration. It was that no one had checked how it would behave under load, against real data volumes. The specific lock issue was just one symptom.
The room got quieter. That’s a much bigger thing to build than a wiki page. It takes time, tooling, process changes.
But that suggestion stuck with me, because that’s exactly the kind of fix that compounds. It doesn’t just prevent the incident you had. It prevents the next five incidents you haven’t had yet. The shallow fix addresses one symptom. The deeper fix addresses the category.
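To make the idea concrete, here is a minimal sketch of what such a check might look like. It is not the tooling that team built. Every name in it (max_stall_during, heavy_migration, batched_migration, probe_query, the 0.2-second budget) is hypothetical, and an in-process lock stands in for the table so the example runs anywhere. A real version would run the migration against a production-sized copy of the data and probe it with real queries.

```python
import threading
import time


def max_stall_during(migration, probe, interval=0.01):
    """Run `migration` in a background thread while calling `probe` in a loop.

    Returns the longest time (in seconds) a single probe call took, which
    approximates how long ordinary traffic would have been blocked.
    """
    done = threading.Event()

    def run_migration():
        try:
            migration()
        finally:
            done.set()  # stop probing even if the migration raises

    worker = threading.Thread(target=run_migration)
    worker.start()

    worst = 0.0
    while not done.is_set():
        started = time.monotonic()
        probe()
        worst = max(worst, time.monotonic() - started)
        time.sleep(interval)
    worker.join()
    return worst


# Stand-in for a table lock; a real harness would talk to a staging database.
table_lock = threading.Lock()


def heavy_migration():
    # Holds the lock for the whole rewrite, like the incident described above.
    with table_lock:
        time.sleep(0.5)


def batched_migration():
    # Same total work, but the lock is released between small batches.
    for _ in range(10):
        with table_lock:
            time.sleep(0.05)
        time.sleep(0.01)


def probe_query():
    # Simulates a customer query that needs the same table.
    with table_lock:
        pass


if __name__ == "__main__":
    budget = 0.2  # hypothetical: longest stall we are willing to accept
    for migration in (heavy_migration, batched_migration):
        stall = max_stall_during(migration, probe_query)
        verdict = "ok" if stall <= budget else "too slow"
        print(f"{migration.__name__}: worst stall {stall:.2f}s ({verdict})")
```

The point of the sketch is that the check measures what customers would feel (how long a normal query waits) rather than matching a list of known-bad migration patterns. That is why it can catch problems nobody has written down yet.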
I’ve noticed this pattern especially now, in the age of AI-assisted development, where it’s tempting to ask for a quick fix and move on. The easy answer is rarely the right one when it comes to systemic reliability.
Sometimes you have to encourage the team to go deeper. Not every fix needs to ship tomorrow. But the right fix, even if it takes longer, pays off in ways that a quick patch never will.
It’s Still Worth It
I don’t want to leave the impression that post-mortems are broken. Quite the opposite—I’m a big believer in blameless culture and structured post-mortems. I’ve been doing them for years and couldn’t imagine working on a living product any other way.
They just get better when we watch out for these subtler traps. Don’t let self-blame substitute for system analysis. Don’t let a tidy action item substitute for a real fix.
The goal isn’t a clean report. It’s a more resilient system.