name: antithesis-triage
description: >
  Triage Antithesis test reports to understand what happened in a run: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. Load after a run completes or when investigating a failure.
compatibility: Requires snouty (https://github.com/antithesishq/snouty), agent-browser (https://github.com/vercel-labs/agent-browser), and jq.
metadata:
  version: "2026-04-29 1cd5f5a"
Antithesis Report Triage
Use this skill to read and triage Antithesis test reports.
Reference files: This skill's references/ directory contains detailed guides for specific tasks. Do NOT read them all up front — only read a reference file when you are told to. Each reference file is mentioned by name at the point where it is needed.
Prerequisites
- DO NOT PROCEED if snouty is not installed. See https://raw.githubusercontent.com/antithesishq/snouty/refs/heads/main/README.md for installation options.
- DO NOT PROCEED if agent-browser is not installed. See https://raw.githubusercontent.com/vercel-labs/agent-browser/refs/heads/main/README.md for installation options.
- DO NOT PROCEED if agent-browser is older than version v0.23.4. You can upgrade with agent-browser upgrade.
- DO NOT PROCEED if jq is not installed. See https://jqlang.org/download/ for installation options.
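The checks above could be scripted roughly as follows. This is a minimal sketch, not part of the skill: the helper names missing_tools and version_at_least are hypothetical, and the version comparison assumes a sort that supports -V (GNU coreutils and recent BSD sort do). How to read the installed agent-browser version is left to the caller, because this document does not specify the output format of any version command.

```shell
# Hypothetical helpers for the prerequisite checks; names are illustrative only.

# Print the names of any tools that are not on PATH; return non-zero if any are missing.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  [ -z "$missing" ] && return 0
  echo "missing:$missing"
  return 1
}

# Return 0 when version $1 is at least version $2 (a leading "v" is ignored).
# Assumes `sort -V` (version sort) is available.
version_at_least() {
  have="${1#v}"
  want="${2#v}"
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Usage sketch:
#   missing_tools snouty agent-browser jq || exit 1
#   version_at_least "$INSTALLED_VERSION" v0.23.4 || echo "run: agent-browser upgrade"
```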
Gathering user input
Before starting, collect the following from the user:
- Report URL or Tenant Name (required): A full triage report URL like https://TENANT.antithesis.com/... or just the tenant name. If neither is provided, check the $ANTITHESIS_TENANT environment variable. Only ask the user if you can't guess the tenant name.
- What they want to know: Are they investigating a specific failure? Getting a general overview? Comparing runs? This determines which workflow to follow.
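One way to resolve the tenant from whatever the user supplied is sketched below. The function name resolve_tenant is hypothetical. It assumes report URLs have the form https://TENANT.antithesis.com/... as described above, and it falls back to $ANTITHESIS_TENANT when no argument is given.

```shell
# Hypothetical helper: print the tenant name for a report URL, a bare tenant
# name, or (with no argument) the $ANTITHESIS_TENANT environment variable.
# Returns non-zero when nothing is available, which is the cue to ask the user.
resolve_tenant() {
  input="${1:-${ANTITHESIS_TENANT:-}}"
  [ -n "$input" ] || return 1
  case "$input" in
    https://*.antithesis.com*)
      host="${input#https://}"         # drop the scheme
      host="${host%%/*}"               # drop the path
      echo "${host%.antithesis.com}"   # drop the domain suffix
      ;;
    *)
      echo "$input"                    # already a bare tenant name
      ;;
  esac
}

# Usage sketch:
#   TENANT=$(resolve_tenant "$USER_INPUT") || echo "ask the user for the tenant"
```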
Session management with agent-browser
agent-browser has two session options:
- --session: the name of a unique, isolated browser instance
- --session-name: auto-save/restore cookies by name
Every triage run MUST use a unique --session value. Generate this value once and reuse it whenever you see $SESSION referenced by this skill.
SESSION="antithesis-triage-$(date +%s)-$$"
Use --session-name antithesis on the FIRST agent-browser command that references a new $SESSION. This creates the session and restores saved cookies. Subsequent commands for the same $SESSION do not need --session-name — the session already exists.
Make sure you close the unique live session when triage is complete.
agent-browser --session "$SESSION" close
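To make sure the session is closed even when triage stops early, the lifecycle could be wrapped as below. This is a sketch under assumptions: the helper names and the AGENT_BROWSER override variable are hypothetical and exist only so the commands can be stubbed. The close subcommand is the one shown above.

```shell
# Hypothetical override so the binary can be swapped out (for example in a dry run).
: "${AGENT_BROWSER:=agent-browser}"

# Generate a unique session name from the current time and the shell's PID.
new_session_name() {
  echo "antithesis-triage-$(date +%s)-$$"
}

# Close the live session with the given name.
close_session() {
  "$AGENT_BROWSER" --session "$1" close
}

# Usage sketch: create the name once, and close the session whenever the script exits.
#   SESSION=$(new_session_name)
#   trap 'close_session "$SESSION"' EXIT
```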
Authentication
Do NOT navigate to the home page just to check auth. Instead, navigate directly to your target URL (report, runs page, etc.) using the session-creation command:
agent-browser --session "$SESSION" --session-name antithesis open "$TARGET_URL"
agent-browser --session "$SESSION" wait --load networkidle
agent-browser --session "$SESSION" get url
If the URL starts with https://$TENANT.antithesis.com then you are authenticated. If it redirected to a login page, you need to authenticate — read references/setup-auth.md.
Runtime injection
The triage skill makes heavy use of an injected runtime API. Inject the runtime into the current page after navigation completes:
cat assets/antithesis-triage.js \
| agent-browser --session "$SESSION" eval --stdin
The runtime registers methods on window.__antithesisTriage. Call those methods with agent-browser eval.
Method call pattern:
agent-browser --session "$SESSION" eval \
"window.__antithesisTriage.report.getRunMetadata()"
agent-browser eval awaits Promises automatically, so async and sync methods
use the same call pattern.
Error handling: Runtime methods throw on error, which causes
agent-browser eval to return a non-zero exit code. Check the exit code
to detect failures — no output parsing required. The error message describes
what went wrong (e.g. wrong page, element not found, timeout).
If window.__antithesisTriage is missing, inject assets/antithesis-triage.js and retry the method call.
NEVER run agent-browser calls in parallel. The calls are stateful and have side effects, so parallel calls can break or return confusing results.
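The call pattern, the exit-code check, and the reinject-and-retry rule can be combined into one sequential wrapper. The sketch below makes several assumptions: triage_call, inject_runtime, AGENT_BROWSER, and TRIAGE_RUNTIME are hypothetical names, and the wrapper reinjects after any failed call rather than only after a missing-runtime error, because this document does not specify the error text to match on.

```shell
# Hypothetical overrides; the defaults match the commands used in this skill.
: "${AGENT_BROWSER:=agent-browser}"
: "${TRIAGE_RUNTIME:=assets/antithesis-triage.js}"

# Inject the runtime into the current page of $SESSION.
inject_runtime() {
  "$AGENT_BROWSER" --session "$SESSION" eval --stdin < "$TRIAGE_RUNTIME"
}

# triage_call 'report.getRunMetadata()'
# Runs one runtime method. On a non-zero exit code it reinjects the runtime
# and retries exactly once. Every step runs sequentially, never in parallel.
triage_call() {
  expr="window.__antithesisTriage.$1"
  "$AGENT_BROWSER" --session "$SESSION" eval "$expr" && return 0
  inject_runtime >/dev/null || return 1
  "$AGENT_BROWSER" --session "$SESSION" eval "$expr"
}
```

If the retry also fails, the wrapper returns the non-zero exit code of the second attempt, and the error message from that attempt is the one to act on.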
Navigation and loading
Each Antithesis page loads its content asynchronously. After navigation to any Antithesis page, follow this pattern:
First, wait for networkidle:
agent-browser --session "$SESSION" wait --load networkidle
Then, check the URL to see if you were redirected to an authentication page:
agent-browser --session "$SESSION" get url
If you hit an authentication page, stop and reauthenticate before continuing.
Then, inject the runtime:
cat assets/antithesis-triage.js \
| agent-browser --session "$SESSION" eval --stdin
Finally, eval the page-specific wait function to wait for all asynchronous chunks to finish loading:
- Report page: window.__antithesisTriage.report.waitForReady()
- Logs page: window.__antithesisTriage.logs.waitForReady()
- Runs page: window.__antithesisTriage.runs.waitForReady()
Each wait method polls for up to 60 seconds by default. On success it
returns { attempts, waitedMs }. On timeout, the method throws, causing
agent-browser eval to return a non-zero exit code.
Use the lower-level boolean checks when you need a one-shot probe:
- Report page: window.__antithesisTriage.report.loadingFinished()
- Logs page: window.__antithesisTriage.logs.loadingFinished()
- Runs page: window.__antithesisTriage.runs.loadingFinished()
If the page still does not become ready, inspect its loading status:
- Report page: window.__antithesisTriage.report.loadingStatus()
- Logs page: window.__antithesisTriage.logs.loadingStatus()
- Runs page: window.__antithesisTriage.runs.loadingStatus()
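The full navigation pattern (open, wait for networkidle, check the URL, inject, wait for ready) could be chained as in this sketch. The function name load_page and the AGENT_BROWSER and TRIAGE_RUNTIME variables are hypothetical. It assumes the session already exists, so it does not pass --session-name, and it assumes get url prints only the URL on stdout.

```shell
# Hypothetical overrides; the defaults match the commands used in this skill.
: "${AGENT_BROWSER:=agent-browser}"
: "${TRIAGE_RUNTIME:=assets/antithesis-triage.js}"

# load_page URL PAGE TENANT, where PAGE is report, logs, or runs.
# Returns 2 when the browser ended up outside the tenant (likely a login
# redirect), 1 on any other failure, and 0 once the page reports ready.
load_page() {
  url="$1"; page="$2"; tenant="$3"
  "$AGENT_BROWSER" --session "$SESSION" open "$url" || return 1
  "$AGENT_BROWSER" --session "$SESSION" wait --load networkidle || return 1
  current=$("$AGENT_BROWSER" --session "$SESSION" get url) || return 1
  case "$current" in
    "https://$tenant.antithesis.com"*) ;;
    *) echo "redirected to $current; reauthenticate first" >&2; return 2 ;;
  esac
  "$AGENT_BROWSER" --session "$SESSION" eval --stdin < "$TRIAGE_RUNTIME" >/dev/null || return 1
  "$AGENT_BROWSER" --session "$SESSION" eval "window.__antithesisTriage.$page.waitForReady()"
}
```

On success the function prints whatever waitForReady() returned. If it fails with exit code 1 at the final step, the matching loadingStatus() method is the next thing to inspect.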
Handling error reports
After every report waitForReady() call, check result.error. If it is
present, read references/error-reports.md for the error report workflow.
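A jq-based check for result.error might look like the sketch below. It assumes that agent-browser eval prints the method's return value as JSON on stdout, which this document does not state explicitly. The function name has_report_error is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) when the waitForReady() result passed
# as $1 has a non-null "error" field. Output that is not a JSON object makes
# jq fail, so it is treated as "no error field".
has_report_error() {
  printf '%s' "$1" | jq -e '.error != null' >/dev/null 2>&1
}

# Usage sketch:
#   result=$(agent-browser --session "$SESSION" eval \
#     "window.__antithesisTriage.report.waitForReady()") || exit 1
#   if has_report_error "$result"; then
#     echo "error report: follow references/error-reports.md"
#   fi
```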
Workflows
Summarize recent runs
Read references/run-discovery.md to get a list of recent runs. Then summarize them in a report.
Looking up a specific run
To look up a specific run (report), read references/run-discovery.md. Then continue with other workflows as needed.
Make sure NOT to filter by text or status unless explicitly asked. If you are trying to find the most recent run for a project, just look at recent runs with any status first. Only filter by text or status if you can't find what you are looking for.
Triage a run
- Read references/run-info.md to load information on a run
- Read references/properties.md to load properties
- Cross-reference failed properties with findings, review passed/failed counts
- Build a detailed summary of the run including a review of all failures as well as flagging any new failures.
Investigate failed properties
- Read references/properties.md - use getPropertyExamples() to extract properties with their examples and learn how to download logs
- Read references/logs.md to learn how to understand logs
- For each property to investigate:
  a. Pick the first failing example
  b. Call getExampleLogsUrl(propertyName, index) to get the example's log URL
  c. Download the example's log using download-logs.sh
  d. Analyze the downloaded log locally
  e. If you aren't certain what caused the issue, consider downloading another example's log from the same property. Passing logs can be useful to compare against.
- Cross-reference the log against the source code of the system under test (SUT) if you have access to it.
- Deeply investigate the failure to develop an understanding of the timeline of events which led up to and potentially caused it.
- Report your findings.
Important: Make sure you download and review example logs and the source code of the SUT if you have access to it. The property status and assertion text alone are not sufficient — the logs provide the actual runtime context needed to understand the failure.
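Property names can contain quotes or other characters that break a hand-written eval string, so step b benefits from building the call expression with jq. The sketch below is illustrative only: js_string and build_example_logs_call are hypothetical helpers, and the first argument is the fully qualified method path, which this document does not spell out (take it from references/properties.md). The arguments of download-logs.sh are likewise documented there and are not shown here.

```shell
# Hypothetical helper: encode a shell string as a JSON (and therefore
# JavaScript) string literal, including the surrounding double quotes.
js_string() {
  printf '%s' "$1" | jq -Rs .
}

# build_example_logs_call METHOD_PATH PROPERTY_NAME INDEX
# Prints an expression such as METHOD_PATH("name", 0) for agent-browser eval.
build_example_logs_call() {
  printf '%s(%s, %s)' "$1" "$(js_string "$2")" "$3"
}

# Usage sketch ($METHOD_PATH is a placeholder for the documented method path):
#   expr=$(build_example_logs_call "$METHOD_PATH" "$PROPERTY_NAME" 0)
#   agent-browser --session "$SESSION" eval "$expr"
```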
Verify cascade vs independent failures
When you suspect a failure might be a cascade from an earlier failure (e.g.,
property X always fails after property Y), do not rely on a handful of
examples from the triage report. A few examples can mislead — use the
antithesis-query-logs skill to test the hypothesis across all timelines:
- Use antithesis-query-logs to count total failures of the target property
- Run a temporal query ("not preceded by" the suspected upstream failure)
- Compare counts: if the count drops, the difference is cascade failures; if it stays the same, the failures are independent
- Report the actual numbers — e.g., "53 total failures, 53 remain after filtering out upstream-X → failures are independent" or "53 total, 7 remain → 46 are cascades from upstream-X"
Do not generalize from a small sample. If you inspect 2-3 examples in the triage log viewer and they all show the same upstream failure, that does not mean all instances are cascades. The temporal query gives you the true count.
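The comparison itself is simple arithmetic once the two counts are known. The sketch below only formats the result. classify_cascade is a hypothetical helper, and both counts are assumed to come from antithesis-query-logs queries run beforehand.

```shell
# classify_cascade TOTAL REMAINING UPSTREAM
# TOTAL:     all failures of the target property
# REMAINING: failures left after the "not preceded by UPSTREAM" temporal query
classify_cascade() {
  total="$1"; remaining="$2"; upstream="$3"
  cascades=$((total - remaining))
  if [ "$cascades" -eq 0 ]; then
    echo "$total total failures, $remaining remain after filtering out $upstream: failures are independent"
  else
    echo "$total total, $remaining remain: $cascades are cascades from $upstream"
  fi
}

# Usage sketch with the numbers from the examples above:
#   classify_cascade 53 53 upstream-X
#   classify_cascade 53 7 upstream-X
```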
General guidance
- Always ensure you are authenticated first.
- Use disposable sessions. Generate a unique SESSION for each triage run.
- Inject the runtime after navigation. After every open, after link clicks that may change pages, and after reopening the report from a finding route, wait until networkidle, inject assets/antithesis-triage.js, then use the matching *.waitForReady() method before continuing.
- Never run agent-browser calls in parallel.
- Retry missing-runtime errors by reinjecting. If a command fails because window.__antithesisTriage is undefined or missing, inject the runtime and rerun the same method.
- Keep report evals on the main report view. If you click into another page by accident, reopen the original report URL before using report queries again.
- Download log files for local analysis. Whenever possible, download log files locally rather than using the web UI log viewer.
- Review logs before concluding on failures. When a failed property has example rows with log links, download and analyze the logs before declaring a root cause. Some properties have no examples or logs; for those, the status alone is the evidence.
- Prove cascade hypotheses with log queries, not samples. If you suspect a failure is a cascade from an earlier failure, use the antithesis-query-logs skill's temporal queries to determine the true scope. Do not conclude from a few triage examples: the Logs Explorer searches all timelines and gives exact counts.
- Present results clearly. When reporting property statuses, use a table or list. When reporting log findings, include the virtual timestamp, source, container, and log text.
Self-Review
Before declaring this skill complete, review your work against the criteria below. This skill's output is conversational (summaries, tables, analysis), so the review should happen in your current context. Re-read the guidance in this file, then systematically check each item below against the answers and analysis you produced.
Review criteria:
- Every property status reported (passed, failed, unfound) was extracted from the actual triage report, not inferred or assumed
- Findings reference specific data from the report — property names, assertion text, log lines, timestamps
- Failed properties with available logs include actionable context: the assertion text, relevant log lines, and timeline context. Conclusions about failures are grounded in log evidence when logs exist
- The summary distinguishes between what the report shows and what you interpret or recommend
- If comparing runs, differences are grounded in data from both reports, not just one