VYASA ASSURANCE · COMPREHENSION TESTING FOR AI

Before AI can be governed, comprehension must be measurable.

We run standardized tests on your AI models and agents to check whether they actually understand what they are doing. You get a report and a score.

01 / WHAT IT IS

Three tests. One score.

Three tests (AGT, CDCT, DDFT), run offline before deployment or live before an agent acts. You get a pass/fail, a score, and a report.

02 / WHO IT'S FOR

Anyone who has to prove it was checked.

If you have to sign a conformity declaration or answer for what an agent does on its own, this is your evidence.

03 / HOW IT WORKS

One test you can watch run.

Does the agent act when it understands, and hold when it doesn't? A jury of three models scores each run.

REQUEST → AGENT → COMPREHENSION GATE → ACTION ✓ or WITHHOLD ✕

REQUEST: “Approve this claim?”

AGENT: Reads, reasons, prepares to act

COMPREHENSION GATE: Scores the run out of 1.00

ACTION ✓: Decision executed

WITHHOLD ✕: Returns for clarification

AGT · MULTI-JUDGE JURY ACROSS THREE MODEL FAMILIES · κ 0.72
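
The gate flow above can be sketched in a few lines. This is a minimal illustration, not Vyasa's implementation: the page does not say how the three judges' scores are combined or what the threshold is, so the median aggregation, the `THRESHOLD` value, and the names `GateResult` and `comprehension_gate` are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical threshold for illustration only; the page says thresholds are
# versioned and disclosed but does not state a value.
THRESHOLD = 0.70


@dataclass(frozen=True)
class GateResult:
    score: float   # jury score on the 0.00 to 1.00 scale
    decision: str  # "ACTION" or "WITHHOLD"


def comprehension_gate(judge_scores, threshold=THRESHOLD):
    """Combine three judge scores and decide whether the agent may act."""
    if len(judge_scores) != 3:
        raise ValueError("expected exactly three judge scores")
    if any(not 0.0 <= s <= 1.0 for s in judge_scores):
        raise ValueError("scores must lie in [0.0, 1.0]")
    # Assumption: take the median so one outlying judge cannot flip the result.
    score = median(judge_scores)
    decision = "ACTION" if score >= threshold else "WITHHOLD"
    return GateResult(score=score, decision=decision)


print(comprehension_gate([0.91, 0.84, 0.40]))
print(comprehension_gate([0.35, 0.62, 0.88]))
```

A median is one plausible way to keep a single judge from deciding the outcome; a mean or a majority vote on per-judge pass/fail would fit the same flow.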

04 / WHAT EACH TEST CHECKS

Three tests, three failure modes.

Published, citable methods, not a black box.

AGENT OVERSIGHT

AGT

Does the agent withhold action when it doesn't understand? Peer-reviewed.

ROBUSTNESS FLOOR

CDCT

Does comprehension hold up under compression? A floor score, not a peak.

FABRICATION BOUND

DDFT

Does it fabricate under drill-down pressure? Bounds the hallucination risk.

05 / WHAT YOU GET

Two products, same tests.

Run the full battery offline before deployment, or a fast version live before every action.

TIER 1 · OFFLINE BATTERY

The Comprehension Conformity Report

Run before deployment or after a major change. This is the evidence you file.

TIER 2 · RUNTIME GATE

The comprehension gate

A lightweight scorer, called before every agent action fires.

06 / HOW IT'S DIFFERENT

We don't build the models we test.

Most eval tooling is built by the same vendors that sell the models. We aren't one of them, and we plug into your existing GRC stack.

→ Three-model jury, so no single model's bias decides the score (inter-judge agreement κ 0.69–0.75).

→ Versioned, disclosed thresholds. Reproducible, auditable.

→ A verification ID on every report and gate call.
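
For readers who want to see what a κ figure for a three-judge jury measures, here is a rough sketch. The page does not say which agreement statistic is reported, so Fleiss' kappa is an assumption here, and the function name and the pass/fail labels are illustrative only.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for items each rated by the same number of judges.

    ratings: one list of category labels per item, e.g. ["pass", "pass", "fail"].
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    categories = sorted({label for item in ratings for label in item})
    # counts[i][j] = number of judges who put item i in category j
    counts = [[item.count(c) for c in categories] for item in ratings]

    # Observed agreement: share of agreeing judge pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Three hypothetical judges rating four runs.
runs = [
    ["pass", "pass", "pass"],
    ["fail", "fail", "fail"],
    ["pass", "pass", "fail"],
    ["fail", "fail", "fail"],
]
print(round(fleiss_kappa(runs), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is a stricter figure than "the judges agreed on N% of runs".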

07 / REGULATORY FIT

Evidence for whichever regime applies to you.

There's no single global AI law. Same tests, different filing depending on where you operate.

EUROPEAN UNION

EU AI Act

Binding law. Most high-risk AI systems need no notified body, so you self-assess conformity. Maps to Annex III/IV and Articles 14, 15, and 72.

UNITED STATES

State laws + NIST AI RMF

No federal AI law, only a patchwork of state laws (Colorado, California, New York). Our reports support a NIST AI RMF-aligned program.

INDIA

MeitY AI Governance Guidelines

No standalone AI Act. AI is governed through the IT Act, the DPDP Act, and MeitY's voluntary guidelines, which point toward self-certification (ISO/IEC 42001).

TEST | EVIDENCE ARTIFACT | MAPS TO
AGT | Human-oversight effectiveness report | EU Art 14 · NIST RMF Govern
CDCT | Robustness-floor test report | EU Art 15 · NIST RMF Measure
DDFT | Accuracy & known-limitations record | EU Art 13·15 · State adverse-outcome notices
GATE | Per-decision attestation + monitoring feed | EU Art 72 · Ongoing monitoring, any regime

Get independent test evidence for your AI system.

Start with a free comprehension report on one model.

VYASA ASSURANCE · COMPREHENSION ASSURANCE · VYASALABS
