"Something's wrong with our AI" can describe three different failures. They have different causes, different owners, and different fixes.

Quality. The model's output is wrong, biased, off-tone, or hallucinated. The API is healthy and latency is normal; the product just said something it shouldn't have. Customer-facing.

API. The provider's endpoint is erroring, timing out, or down. Whatever feature depends on that call — fraud detection, summarization, a support bot — breaks or degrades. Customer-facing.

Tooling. The AI tools your engineers use to build — Claude Code, Copilot, Cursor — are degraded. Production is fine; your team's output for the day quietly isn't. Developer-facing, and usually invisible to anyone outside engineering.

Here's how Seismo gives you visibility into the second and third.

The API layer: minutes, not status-page minutes

This is the most direct case, and it's the same shape as monitoring Stripe or Auth0: a vendor dependency with a probe-able endpoint and an availability signal.

Anthropic's own status page tells the story plainly: from June 1 to June 15, 2026, 12 of those 15 days had at least one "elevated errors" incident on a Claude model — 25 incidents in total. Most resolved within an hour, several within fifteen minutes. None were headline outages, but each was long enough to break whatever was calling that model at the time.

Seismo probes your endpoints every 60 seconds. When one of them degrades and an AI provider's API is declared as a dependency for it, Seismo checks that dependency in the same cycle. The alert that reaches you says whether the vendor is degraded globally, whether your own recent deploy is a more likely cause, and what to do next. A lengthy investigation becomes a 2-minute confirmation.
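The decision that paragraph describes can be sketched in a few lines. This is an illustration only, not Seismo's implementation: the names `ProbeResult`, `Cause`, and `triage`, the 30-minute deploy window, and the order in which the signals are checked are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Cause(Enum):
    VENDOR = "declared AI dependency is degraded upstream"
    DEPLOY = "a recent deploy is the more likely cause"
    UNKNOWN = "no upstream or deploy signal; look at the endpoint itself"


@dataclass
class ProbeResult:
    endpoint: str         # hypothetical endpoint URL
    healthy: bool         # outcome of this cycle's probe
    checked_at: datetime  # when the probe ran


def triage(
    probe: ProbeResult,
    dependency_healthy: bool,
    last_deploy_at: Optional[datetime],
    deploy_window: timedelta = timedelta(minutes=30),  # assumed threshold
) -> Optional[Cause]:
    """Classify a degraded probe using signals gathered in the same cycle."""
    if probe.healthy:
        return None  # nothing to alert on
    # Assumption: an upstream vendor problem is checked before a deploy.
    if not dependency_healthy:
        return Cause.VENDOR
    if last_deploy_at is not None:
        age = probe.checked_at - last_deploy_at
        if timedelta(0) <= age <= deploy_window:
            return Cause.DEPLOY
    return Cause.UNKNOWN


# Usage: the endpoint is failing and the declared AI dependency is degraded.
now = datetime(2026, 6, 3, 12, 0, tzinfo=timezone.utc)
probe = ProbeResult("https://example.test/summarize", healthy=False, checked_at=now)
verdict = triage(probe, dependency_healthy=False, last_deploy_at=None)
print(verdict.value)
```

The point of running both checks in one cycle is that the alert can carry the verdict with it, rather than leaving the on-call engineer to open a status page and a deploy log separately.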

The tooling layer: visibility you can turn on

Claude Code runs on these same models, so between June 1 and June 15, 2026, something Claude Code depends on was degraded on most of those days, even when the incident wasn't labeled "Claude Code." One day it was: on June 3rd, a status page incident specifically named Claude Code's security reviews, code reviews, routines, and web sessions as degraded, for roughly three hours overnight. Nothing in production broke, so no client endpoint degraded and no alert fired. The cost was real, just invisible: every engineer using those features that night had a slower night, and almost none of them connected the slowdown to the incident.

Claude Code's status is part of Seismo's global vendor monitoring — the same coverage that watches Anthropic's and OpenAI's APIs. That global view isn't pushed to a client's Slack by default, but it can be. Clients who want "is our slow morning the tooling or just Monday" answered can have it turned on.

Outages are normal. Diagnosis is the cost.

No system runs without failures. The table below covers June 1 to June 15, 2026 for each provider, and it isn't a list of things going wrong so much as a normal two weeks for any of them. Most of these incidents resolve in under an hour. The outages themselves aren't the expensive part.

Provider	Where to check	What you'll find
Anthropic	status.claude.com	One dedicated page with per-model detail. 25 "elevated errors" incidents across 12 days, June 1 to June 15, 2026.
OpenAI	status.openai.com	One dedicated page, but it mixes API and model errors with consumer ChatGPT issues like login, file uploads, and checkout. June 3rd alone logged three separate incidents.
Google Gemini	No single page	Gemini API status lives on a separate AI Studio page from Google Cloud's overall status, where Vertex Gemini incidents sit alongside hundreds of other GCP services. status.gemini.com — the page most people would guess first — is a cryptocurrency exchange.
GitHub Copilot	githubstatus.com	Bundled into GitHub's overall status. On June 8th, Copilot Chat and VS Code reported that Claude Opus 4.7 was unreliable because of a problem on the model provider's side — a cross-vendor dependency made visible only because GitHub chose to disclose it.

What's expensive is the time spent figuring out what's happening while it's happening: which of four different pages to check, whether the cause is upstream or in your own code, and in the Copilot/Claude case, whether the real cause is a step removed from where you're even looking. That diagnosis time is what adds up, incident after incident, across a team and a year.

Seismo's role is to shorten that step. When your endpoint degrades and an AI provider is declared as a dependency, Seismo checks it in the same cycle and isolates whether the cause is upstream — regardless of how many different status pages that upstream might otherwise span.

When a quality issue traces back to an infrastructure blip

Sometimes a "the AI seems off today" report has an infrastructure cause underneath it: a fallback to a secondary model or path, triggered by an endpoint degradation Seismo already flagged. When that happens, Seismo's alert about it already exists, timestamped, often in an engineering or on-call channel, while the quality report surfaces somewhere else — support, product, a QA review. Same incident, two audiences. Connecting them turns a multi-day investigation into a lookup.
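The lookup itself is simple once both sides carry timestamps. A minimal sketch, assuming alerts are available as records with a firing time: the `Alert` type, the `alerts_near` helper, and the six-hour lookback are hypothetical and stand in for whatever alert history a team actually has.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Alert:
    endpoint: str       # hypothetical endpoint name
    summary: str        # short description carried by the alert
    fired_at: datetime  # when the alert was sent


def alerts_near(
    report_time: datetime,
    alerts: List[Alert],
    lookback: timedelta = timedelta(hours=6),  # assumed window
) -> List[Alert]:
    """Return alerts that fired within `lookback` before a quality report,
    most recent first."""
    matches = [
        a for a in alerts
        if timedelta(0) <= report_time - a.fired_at <= lookback
    ]
    return sorted(matches, key=lambda a: a.fired_at, reverse=True)


# Usage: a support ticket says the summaries looked off at 15:00.
history = [
    Alert("summarize", "fell back to secondary model", datetime(2026, 6, 3, 13, 0)),
    Alert("summarize", "latency spike", datetime(2026, 6, 2, 13, 0)),
]
for alert in alerts_near(datetime(2026, 6, 3, 15, 0), history):
    print(alert.fired_at.isoformat(), alert.endpoint, alert.summary)
```

Only the first alert falls inside the window here; the one from the previous day is filtered out.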

The bottom line

For two of the three ways your AI stack can fail, Seismo is the difference between finding out from a status page (or a customer) and finding out ahead of one.

About Seismo

Seismo is a managed SRE platform built by Seismograph. It monitors endpoints, cloud infrastructure, CDN health, ISP quality signals, and SaaS dependencies — including AI provider APIs — and correlates signals across all of them to deliver trustworthy, actionable alerts before customers notice.

When something breaks, Seismo tells you whether it is your problem or a vendor's problem, whether a recent deployment is involved, and what to do about it — in the same alert, within minutes of detection.

seismograph.ai | hello@seismograph.ai

Three Ways an AI Stack Can Fail and Where Seismo Helps