You have Datadog.
You still don't know why it fired.
Datadog and New Relic are good at collecting data and routing alerts. What they don't do is tell you what caused the incident — that part is still your job. Wachd is the layer that closes that gap: it receives their alerts and automatically produces a plain-English root cause before your on-call engineer picks up the phone.
Wachd does not replace Datadog or New Relic
Your observability stack is already doing its job: collecting metrics, storing logs, and firing alerts when thresholds are crossed. That part works.
The problem is what happens next. The alert reaches your on-call engineer with a title like HighErrorRate firing: checkout-api — and they spend the next 45 minutes switching between Datadog, GitHub, Slack, and a Confluence runbook trying to understand what actually happened.
Wachd runs that investigation automatically. It receives the alert from Datadog or New Relic, collects the relevant context from your stack, and produces a diagnosis before the page reaches anyone.
What you get with Wachd alongside your existing stack
| | Datadog / New Relic | + Wachd |
|---|---|---|
| Alert fires | ✓ Routes the alert to you | ✓ Routes the alert + explains why it fired |
| Log context | ✓ Stores logs in their platform | ✓ Pulls logs, correlates with alert time, summarises in plain English |
| Metric history | ✓ Charts available in dashboard | ✓ Fetches metric timeline around alert automatically — no manual digging |
| Commit context | – You open GitHub separately | ✓ Reads last N commits from affected service automatically |
| Plain-English diagnosis | – You write the summary yourself | ✓ AI produces a 2-sentence probable cause before you open a terminal |
| On-call routing with context | Partial — alert title only | ✓ SMS/voice/Slack with diagnosis already attached |
| Data stays in your cluster | – Sent to Datadog/New Relic cloud | ✓ Processed in-cluster; with Ollama, nothing is forwarded to a third party |
| Self-hosted | – | ✓ Runs on your Kubernetes cluster, Apache 2.0 |
| Pricing | Per-host / per-user fees | ✓ Free |
Connect your existing stack in minutes
Wachd receives alerts via webhook — the same mechanism your tools already support. No agents, no SDKs, no schema changes.
Create a Webhook notification channel in Datadog. Point it at Wachd's webhook URL for your team. Done.
Add a Webhook destination in New Relic Alerts. Paste the Wachd webhook URL. All alert conditions flow through.
Add a Webhook contact point in Grafana Alerting. Wachd validates the HMAC signature automatically.
Add a webhook_config to Alertmanager pointing at Wachd. Works with any Prometheus-compatible stack.
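For readers curious what HMAC validation of a webhook involves, here is a minimal sketch in Python. It is not Wachd's code: the function name `verify_signature`, the choice of SHA-256, and the hex encoding of the signature are all assumptions for illustration, and the header that carries the signature depends on how the sender is configured.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret.

    Hypothetical helper: the algorithm and encoding are assumptions,
    not a documented Wachd or Grafana contract.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = signature_hex.strip().lower()
    # compare_digest takes time independent of where the inputs differ,
    # so an attacker cannot learn the signature byte by byte.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The check must run against the raw request body exactly as received. Re-serialising the JSON before hashing can change whitespace or key order and make a valid signature fail.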
What the AI analysis actually produces
When an alert fires, Wachd automatically collects the last 10 commits from the affected service, 30 minutes of error logs, and metric history around the alert time. It strips PII, then runs correlation across all three signals to produce:
- A two-sentence probable cause in plain English
- The most likely contributing factor: recent deploy, config change, dependency failure, or external signal
- A suggested action: rollback, hotfix, escalate, or investigate a specific service
- A link to the most similar past incident, if one exists in your team's history
PII is stripped before the AI sees anything. Works with Ollama (in-cluster, no outbound calls), Claude, OpenAI, or Gemini — your choice.
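To make the "collect, then strip PII" step concrete, here is a rough sketch in Python of selecting a 30-minute log window and redacting it before it is handed to a model. Everything in it is hypothetical: the `LogLine` type, the `redact` and `logs_for_alert` helpers, the assumption that the window ends at the alert time, and the two regular expressions (emails and IPv4 addresses only) are illustrative and do not describe how Wachd detects PII.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Illustrative patterns only; real PII detection covers far more than this.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


@dataclass
class LogLine:
    timestamp: datetime
    message: str


def redact(text: str) -> str:
    """Replace email addresses and IPv4 addresses with placeholders."""
    text = EMAIL.sub("[email]", text)
    return IPV4.sub("[ip]", text)


def logs_for_alert(lines, alert_time, window=timedelta(minutes=30)):
    """Keep lines from the window ending at alert_time, redacted."""
    start = alert_time - window
    return [
        LogLine(line.timestamp, redact(line.message))
        for line in lines
        if start <= line.timestamp <= alert_time
    ]


alert_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    LogLine(alert_time - timedelta(minutes=45), "cache warm-up complete"),
    LogLine(alert_time - timedelta(minutes=5),
            "payment failed for jane@example.com from 10.0.0.7"),
]
for line in logs_for_alert(sample, alert_time):
    print(line.message)
```

Redacting before the prompt is built, rather than after the model responds, is what keeps raw identifiers out of a hosted model's request when one is used instead of Ollama.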
Add the missing layer to your existing stack
Keep Datadog or New Relic for what they do well. Add Wachd for what they don't: a plain-English explanation of why the alert fired, delivered before your on-call engineer starts digging. Deploys in under 30 minutes. Apache 2.0, free.
Questions? Email sales@wachd.io or ask on Discord.