Ship fast.
Break nothing.
Agents that find bugs before your users.
Bring your own LLM · Open-source coming soon
One command to start.
MCP server installed automatically. Use Kery directly from your editor or agent:
Kery Cloud
Everything you love about Kery, fully managed. No infra, no config — just tests that run.
Sound familiar?
You shipped. You have no idea if the happy path still works.
01 Kery crawls every route, runs every flow, and hands you a report with no test code written.
You spend more time fixing broken selectors than writing features.
02 Kery navigates by accessibility tree and intent, not CSS selectors. Nothing to maintain.
Every test runner dies at the login page.
03 Form login, Clerk, Supabase, and OAuth work natively. Kery authenticates as part of the run.
You click through the app before every release. An hour. You still miss things.
04 Point Kery at a URL. It maps the app, picks the flows, files a structured bug report.
"Something looks off on checkout." No URL, no screenshot, no repro steps.
05 Every finding: screenshot, bounding box, severity, and the exact step that triggered it.
You have no idea which routes have ever been exercised by a test.
06 Route dashboard shows every URL: clean, issues, stale, or untested.
Writing automated tests is a separate skill. Your team doesn't have bandwidth.
07 Describe what to test in plain English. Kery plans and executes the flow end-to-end.
The same false positive keeps appearing. Real bugs get buried in noise.
08 Agent memory tracks known false positives and bug patterns, so findings get sharper over time.
Your pipeline goes green. Users still find visual breaks in prod.
09 Review Agent and Filmstrip Reviewer catch visual, functional, and UX regressions, not just crashes.
Confusing copy, missing feedback states, and a11y gaps don't throw errors.
10 Kery's UX bug category catches unclear flows, missing affordances, and accessibility gaps.
Bugs surface in production, in support tickets, in Slack from your CEO.
11 Kery runs before you ship. 8 bug categories, 3 severity levels, evidence on every finding.
AI testing tools lock you into their cloud, their models, their pricing.
12 Bring your own LLM: OpenAI, Anthropic, Gemini, or OpenRouter. Every agent role configurable.
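The four route states from card 06 (clean, issues, stale, untested) could be derived along these lines. This is only a sketch: the `RouteRecord` fields, the `classifyRoute` helper, and the 14-day staleness threshold are assumptions made for illustration, not Kery's documented behavior.

```typescript
type RouteStatus = "clean" | "issues" | "stale" | "untested";

// Hypothetical record shape; field names are illustrative only.
interface RouteRecord {
  url: string;
  lastTestedAt: Date | null; // null when no run has ever touched the route
  openBugs: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// staleAfterDays is an assumed threshold, not a documented Kery setting.
function classifyRoute(
  route: RouteRecord,
  now: Date,
  staleAfterDays: number = 14
): RouteStatus {
  if (route.lastTestedAt === null) return "untested";
  const ageDays = (now.getTime() - route.lastTestedAt.getTime()) / MS_PER_DAY;
  // An old result is reported as stale even if it had open bugs,
  // because it may no longer reflect the deployed app.
  if (ageDays > staleAfterDays) return "stale";
  return route.openBugs > 0 ? "issues" : "clean";
}
```

Checking staleness before bug count is a design choice in this sketch; a dashboard could just as reasonably surface old bugs first.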
Four agents. One run.
Each agent has one job. Bugs from all sources are merged and deduplicated.
Navigator: Drives a real Chromium session. 15 action types. Flags broken UI, failed submissions, and state drift in real time.
Review Agent: Post-run pass over the full flow. Catches functional gaps and state mismatches the Navigator couldn't see mid-action.
Filmstrip Reviewer: Compares screenshots across the entire journey. Flags layout shifts, broken nav, and visual regressions.
Memory Curator: Extracts navigation paths and bug patterns after each run. Proposes memory entries with confidence scores.
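Merging and deduplicating bugs from several agents might look roughly like this. A minimal sketch, assuming a `Finding` shape, severity names, and a dedup key (route, step, category) that are all invented here for illustration; Kery's actual schema and matching rules are not described on this page.

```typescript
// Hypothetical severity names; the page only says there are three levels.
type Severity = "low" | "medium" | "high";

// Hypothetical finding shape, not Kery's schema.
interface Finding {
  source: string; // which agent reported it
  route: string;
  step: number;
  category: string;
  severity: Severity;
}

const rank: Record<Severity, number> = { low: 0, medium: 1, high: 2 };

// Merge batches from several agents, keeping one finding per
// (route, step, category) and preferring the higher severity.
function mergeFindings(...batches: Finding[][]): Finding[] {
  const byKey = new Map<string, Finding>();
  for (const batch of batches) {
    for (const finding of batch) {
      const key = `${finding.route}|${finding.step}|${finding.category}`;
      const existing = byKey.get(key);
      if (!existing || rank[finding.severity] > rank[existing.severity]) {
        byKey.set(key, finding);
      }
    }
  }
  return Array.from(byKey.values());
}
```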
Not pass/fail.
Actual bugs.
Screenshot, bounding box, step number, severity. Every finding is actionable.
8 bug categories · 3 severity levels
Buttons with no effect, forms that silently fail, broken state transitions, missing network calls.
Element overlap, layout shifts, broken responsive design, z-index stacking, missing images.
Dead click zones, missing loading states, empty states not handled, broken pagination.
No focus ring, missing aria labels, keyboard traps, interactive elements unreachable.
4xx / 5xx responses, CORS failures, auth token expiry, API timeouts surfaced silently.
Slow time-to-interactive, render-blocking requests, memory growth, excessive re-renders.
Wrong values rendered, stale cache shown, missing fields, type errors surfacing in the UI.
Uncaught exceptions, unhandled promise rejections, React warnings, silent errors logged.
Your login page is not a wall.
Most testing tools stop at login. Kery authenticates natively, including multi-factor, so you can test real app flows behind real auth.
Supports form-based login, Clerk, and Supabase Auth.
More providers coming soon
Short answers.
How does Kery find bugs without any test scripts?
The Navigator agent drives a real Playwright browser using the accessibility tree and screenshots, not CSS selectors. It maps your app, plans flows, executes them, and reports bugs. Nothing to write, nothing to maintain.
What happens when the DOM changes?
Stagehand self-healing kicks in. When an element moves or the layout shifts, Kery finds it by intent rather than a stale selector. Tests don't break when your UI changes.
Which LLM providers are supported?
OpenRouter (recommended), OpenAI, Anthropic, and Google Gemini. Navigator, Review, Auxiliary, and Stagehand roles are each configurable independently so you can optimize cost and quality per task.
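Per-role configuration could be modeled like this. The role and provider names come from the answer above; the `RoleConfig` shape, the `resolveRoles` helper, and the model strings are placeholders assumed for illustration and do not reflect Kery's real config format.

```typescript
type Provider = "openrouter" | "openai" | "anthropic" | "gemini";
type Role = "navigator" | "review" | "auxiliary" | "stagehand";

// Hypothetical config shape; model names below are placeholders.
interface RoleConfig {
  provider: Provider;
  model: string;
}

const defaults: RoleConfig = {
  provider: "openrouter",
  model: "example/default-model",
};

// Fill in any role the caller did not override with the default.
function resolveRoles(
  overrides: Partial<Record<Role, RoleConfig>>
): Record<Role, RoleConfig> {
  return {
    navigator: overrides.navigator ?? defaults,
    review: overrides.review ?? defaults,
    auxiliary: overrides.auxiliary ?? defaults,
    stagehand: overrides.stagehand ?? defaults,
  };
}

// Example: spend more on review, keep everything else on the default.
const roles = resolveRoles({
  review: { provider: "anthropic", model: "example/strong-model" },
});
```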
Does it work behind authentication?
Yes. Kery supports form-based login, Clerk, and Supabase Auth out of the box. Set credentials once in the dashboard and every run authenticates automatically.
Can I run Kery in CI?
Yes. Use the TypeScript client SDK or trigger runs via the REST API from any pipeline. Results come back as structured JSON you can parse, diff, or fail a build on.
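Failing a build on the JSON results might be as simple as the following. This sketch assumes a report shape (`bugs` with `severity` and `title`) and severity names that are not documented here; it deliberately leaves out how the report is fetched, since the SDK and REST endpoints are not shown on this page.

```typescript
// Hypothetical severity names and report shape, for illustration only.
type Severity = "low" | "medium" | "high";

interface ReportBug {
  severity: Severity;
  title: string;
}

interface RunReport {
  bugs: ReportBug[];
}

const order: Severity[] = ["low", "medium", "high"];

// Return the bugs at or above the chosen severity threshold.
function blockingBugs(reportJson: string, failOn: Severity): ReportBug[] {
  const report = JSON.parse(reportJson) as RunReport;
  const min = order.indexOf(failOn);
  return report.bugs.filter((bug) => order.indexOf(bug.severity) >= min);
}
```

In a pipeline step, a non-empty result from `blockingBugs` would translate into a non-zero exit code.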
What can I do with MCP?
Scan your app, run tests, pull bug reports, and browse route coverage directly from Claude, Cursor, or any MCP-compatible editor. More than 20 tools are available, including kery_scan, kery_get_bugs, and kery_list_routes.
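Under the hood, an MCP client invokes a tool with a JSON-RPC `tools/call` request. Editors build this for you, so the sketch below is only meant to show the shape. The tool name `kery_scan` comes from the list above; the `url` argument is an assumption, since the tool's real parameters are not listed here.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The `url` argument is hypothetical; check the tool's schema in your client.
const request = toolCall(1, "kery_scan", { url: "https://staging.example.com" });
```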
Does Kery get smarter over time?
Yes. The memory curator agent learns successful navigation paths and records known false positives after each run. Confidence scores decay over time, so stale entries fade out instead of piling up.
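One way to picture decaying confidence is an exponential half-life. The page does not say which decay function Kery uses, so the half-life model, the `MemoryEntry` shape, and the threshold below are all assumptions for illustration.

```typescript
// Hypothetical memory entry; field names are illustrative only.
interface MemoryEntry {
  note: string;
  confidence: number; // score assigned when the entry was recorded
  ageDays: number;
}

// Assumed model: confidence halves every `halfLifeDays`.
function decayedConfidence(
  initial: number,
  ageDays: number,
  halfLifeDays: number
): number {
  return initial * Math.pow(0.5, ageDays / halfLifeDays);
}

// Keep only entries whose decayed confidence is still above the threshold.
function freshEntries(
  entries: MemoryEntry[],
  halfLifeDays: number,
  minConfidence: number
): MemoryEntry[] {
  return entries.filter(
    (entry) =>
      decayedConfidence(entry.confidence, entry.ageDays, halfLifeDays) >=
      minConfidence
  );
}
```

With a 30-day half-life, an entry recorded at 0.8 would sit at 0.4 after 30 days unless a later run reinforces it.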
Where does my data live?
Kery is fully self-hosted. Screenshots, run history, and bug reports live in your own PostgreSQL instance. Nothing leaves your environment except the prompts sent to your LLM provider.
Is Kery open-source?
Open-source is coming soon. Watch the GitHub repo for updates.
Join the community
Ask questions, share workflows, follow what's being built, and shape where Kery goes next.
Join Discord