May 21, 2026·5 min read

Why Wachd doesn't stop at the first plausible culprit

A recent deploy is always the first suspect. It's often right, but not always. When the first answer is wrong and an engineer has already acted on it, you've added 40 minutes to your MTTR and spent trust the tool had not yet earned.

The fastest thing an AI tool can do with a production alert is look at the most recent change and blame it. Deploy happened 6 minutes ago, error rate is up, the model says "probable cause: recent deployment." It sounds authoritative. It's often right.

The problem is what happens when it's not right. Your on-call engineer rolls back a perfectly good deploy, the error rate doesn't recover, and now they're 25 minutes into the incident with less confidence than when they started. The tool gave them a wrong answer presented as a correct one, and they acted on it.

This is the trust failure that kills AI tooling adoption on engineering teams. Not the tool getting it wrong — every tool gets it wrong sometimes. The failure is the tool presenting a guess with the same confidence as a finding.

Three signals, not one

When an alert fires, Wachd collects three independent sources of evidence before forming a conclusion: the most recent commits to the affected service, the error logs from the 30 minutes around the alert time, and the metric history for the same window.
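
As a rough illustration of the shared window, here is a minimal Python sketch. It assumes the 30 minutes are centred on the alert (15 before, 15 after), which the description above does not specify, and `evidence_window` is a hypothetical helper name, not part of Wachd.

```python
from datetime import datetime, timedelta, timezone


def evidence_window(alert_time: datetime,
                    span: timedelta = timedelta(minutes=30)) -> tuple[datetime, datetime]:
    """Return (start, end) of a window of length `span` centred on the alert.

    Assumption: the window is symmetric around the alert time.
    """
    half = span / 2
    return alert_time - half, alert_time + half


# Hypothetical alert time, used only to exercise the helper.
alert = datetime(2026, 5, 21, 12, 0, tzinfo=timezone.utc)
start, end = evidence_window(alert)
print(start.isoformat(), end.isoformat())
```

Using one window for both the log query and the metric query keeps the two signals comparable when they are later lined up against the commit timeline.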

Each signal is queried independently. The correlator then looks for convergence. If the log spike, the metric anomaly, and the commit timeline all point to the same change — that's a strong signal. If the logs and metrics point to an upstream dependency failure but the commit history shows nothing unusual — the conclusion is different.

The key word is "convergence." A single signal producing a plausible story is not the same as three independent signals converging on the same explanation. The difference is what separates a finding from a guess.
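
To make the finding-versus-guess distinction concrete, here is a small Python sketch. It is not Wachd's correlator: the `Signal` type, the `classify` function, the three labels, and the commit id `abc123` are all hypothetical, and the rule that every signal must name the same suspect is one assumed reading of "convergence."

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signal:
    source: str             # hypothetical labels: "commits", "logs", "metrics"
    suspect: Optional[str]  # what this signal implicates; None if nothing unusual


def classify(signals: list[Signal]) -> str:
    """Label a set of signals as 'finding', 'conflict', or 'guess'."""
    named = [s.suspect for s in signals if s.suspect is not None]
    if len(named) == len(signals) and len(set(named)) == 1:
        return "finding"   # every signal independently names the same cause
    if len(set(named)) > 1:
        return "conflict"  # signals disagree about the cause
    return "guess"         # at most a partial story, not enough to conclude


converged = [
    Signal("commits", "deploy abc123"),
    Signal("logs", "deploy abc123"),
    Signal("metrics", "deploy abc123"),
]
recent_deploy_only = [
    Signal("commits", "deploy abc123"),
    Signal("logs", None),
    Signal("metrics", None),
]
print(classify(converged))           # finding
print(classify(recent_deploy_only))  # guess
```

The second case is the "deploy happened 6 minutes ago" scenario: one signal tells a plausible story, but nothing else backs it up, so the sketch declines to call it a finding.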

What happens when signals conflict

Sometimes they don't converge. Suppose there is a metric anomaly in service A, the error logs point to a timeout in service B, and neither service has had a commit in the last two hours. That pattern looks like an external dependency or an infrastructure issue, not a code change.

When signals conflict, Wachd says so. It reports what each signal shows and flags that they don't point to the same cause. It doesn't pick the most confident-sounding story and present it as a conclusion.

This is more useful than false confidence. An engineer seeing "metric anomaly in checkout-api, but error logs show timeouts in payment-service, no recent deploys in either" knows to look upstream. That's a different investigation than chasing a deploy that wasn't the cause.
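
A report like that could be assembled along these lines. This is a sketch under assumptions: the `Observation` type, the `report` function, and the wording of the flag are hypothetical, and only the service names come from the example above.

```python
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    source: str                # which signal produced this
    summary: str               # what the signal shows, in plain words
    implicates: Optional[str]  # the component it points at, if any


def report(observations: list[Observation]) -> str:
    """List what each signal shows, and flag disagreement instead of hiding it."""
    lines = [f"{o.source}: {o.summary}" for o in observations]
    implicated = {o.implicates for o in observations if o.implicates is not None}
    if len(implicated) > 1:
        lines.append(
            "Signals do not point to the same cause: " + ", ".join(sorted(implicated))
        )
    return "\n".join(lines)


text = report([
    Observation("metrics", "anomaly in checkout-api", "checkout-api"),
    Observation("logs", "timeouts in payment-service", "payment-service"),
    Observation("commits", "no recent deploys in either service", None),
])
print(text)
```

Note that the sketch never chooses between the two implicated services. It reports both and leaves the decision about where to look with the engineer.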

Showing your work is not optional

The analysis Wachd delivers always includes the evidence it used. Not just the conclusion — the specific log lines that showed the error pattern, the commit hash that introduced the change, the metric values before and after the anomaly.

This lets engineers verify the reasoning. If the conclusion is wrong, they can see which part of the evidence led the analysis astray and correct it in under a minute. If the conclusion is right, they have everything they need to act without opening another tab.
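
One way to picture an analysis that carries its own evidence is a record like the one below. The field names, the `render` method, and the sample values are hypothetical and chosen for illustration; they are not Wachd's actual output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Analysis:
    conclusion: str
    log_lines: list[str]        # the specific lines that showed the error pattern
    commit_hash: Optional[str]  # the change implicated, if any
    metric_before: float        # metric value before the anomaly
    metric_after: float         # metric value after the anomaly

    def render(self) -> str:
        """Render the conclusion together with the evidence behind it."""
        parts = [f"Conclusion: {self.conclusion}"]
        parts.append(f"Metric: {self.metric_before} -> {self.metric_after}")
        if self.commit_hash is not None:
            parts.append(f"Commit: {self.commit_hash}")
        parts.extend(f"Log: {line}" for line in self.log_lines)
        return "\n".join(parts)


# Hypothetical values, used only to show the shape of the output.
analysis = Analysis(
    conclusion="error rate rose after a change to checkout-api",
    log_lines=["TimeoutError calling payment-service"],
    commit_hash="abc123",
    metric_before=0.2,
    metric_after=7.5,
)
rendered = analysis.render()
print(rendered)
```

Because every piece of evidence is a separate field, an engineer who disagrees with the conclusion can point at the exact log line or metric value that misled it.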

Tools that show only a conclusion are asking engineers to trust a black box. That works until it doesn't — and when it doesn't, the tool is the first thing to get turned off. Transparency is how you build the kind of trust that survives the incident where the AI was wrong.

Why this matters for teams, not just incidents

Over time, the incidents Wachd has seen accumulate into a history it can match against. The correlator can recognise when a new alert looks like a past incident, not just by alert title but by the shape of the evidence: same service, same metric pattern, similar log signature. When that happens, the analysis includes a link to the prior incident and what resolved it.
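
How the matching works is not described here, so the following is only a sketch of one plausible approach: compare service and metric pattern exactly, and compare log signatures with Jaccard similarity over token sets. The `Fingerprint` type, the pattern label, the tokens, and the 0.5 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    service: str
    metric_pattern: str         # hypothetical label, e.g. "latency_spike"
    log_tokens: frozenset[str]  # normalised tokens from the log signature


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Size of the intersection over size of the union (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def looks_similar(new: Fingerprint, past: Fingerprint, threshold: float = 0.5) -> bool:
    """Same service, same metric pattern, and a sufficiently similar log signature."""
    return (
        new.service == past.service
        and new.metric_pattern == past.metric_pattern
        and jaccard(new.log_tokens, past.log_tokens) >= threshold
    )


past = Fingerprint("checkout-api", "latency_spike",
                   frozenset({"timeout", "payment-service", "retry"}))
new = Fingerprint("checkout-api", "latency_spike",
                  frozenset({"timeout", "payment-service", "upstream"}))
print(jaccard(new.log_tokens, past.log_tokens))  # 0.5
print(looks_similar(new, past))                  # True
```

Matching on the evidence rather than the alert title means two alerts with different names can still be linked, and two alerts with the same name are not linked unless the evidence looks alike.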

That institutional memory only works if the underlying analysis is accurate. If Wachd is confidently wrong in incident 12, the similarity match to incident 47 six months later will route your team toward the same wrong conclusion. The quality of the diagnostic layer compounds over time — in both directions.

Getting the correlation right from the start, requiring convergence before calling something a finding, and surfacing conflicting evidence rather than hiding it — these are not edge cases in the product design. They are the foundation the whole thing sits on.