Live Benchmark

Benchmark: Can the analyzer tell good agents from bad?

Five real ERC-8004 agents on Base Sepolia, each evaluated live through the verdict API. The analyzer reads on-chain reputation, feedback history, and validation attestations to assign DELEGATE / WATCH / AVOID verdicts with evidence-backed reasoning.

DELEGATE (70+)
WATCH (40-69)
AVOID (<40)
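
The score bands above can be expressed as a small lookup. This is a minimal sketch, not the analyzer's actual code: the function name `verdict_for_score` is hypothetical, and the 0 to 100 range and the treatment of fractional scores (anything below 70 but at least 40 counts as WATCH) are assumptions that the page does not state.

```python
def verdict_for_score(score: float) -> str:
    """Map a trust score to a verdict using the bands listed above.

    Assumption: scores lie in the range 0..100 (not stated in the source).
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score out of assumed 0..100 range: {score}")
    if score >= 70:
        return "DELEGATE"
    if score >= 40:
        return "WATCH"
    return "AVOID"


if __name__ == "__main__":
    for s in (85, 55, 12):
        print(s, verdict_for_score(s))
```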

How It Works

  1. Each agent is queried via /api/verdict?agentId=N
  2. The API reads on-chain identity, feedback, and validations from Base Sepolia contracts
  3. Composite trust score is computed from quality, uptime, and accuracy dimensions
  4. Verdict is assigned: DELEGATE / WATCH / AVOID
  5. AI reasoning explains the verdict using concrete evidence
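
Steps 1 and 3 can be sketched roughly as follows. Only the path `/api/verdict` and the `agentId` parameter come from the page; the helper names, the base URL passed in by the caller, and above all the equal weighting of the three dimensions are assumptions made for illustration. The real weighting is not documented here.

```python
from urllib.parse import urlencode


def verdict_url(base_url: str, agent_id: int) -> str:
    """Build the verdict query URL for one agent (step 1).

    base_url is supplied by the caller; the host is not given in the source.
    """
    query = urlencode({"agentId": agent_id})
    return f"{base_url.rstrip('/')}/api/verdict?{query}"


def composite_score(quality: float, uptime: float, accuracy: float,
                    weights: tuple = (1.0, 1.0, 1.0)) -> float:
    """Combine the three dimensions into one score (step 3).

    ASSUMPTION: a weighted average with equal weights by default.
    The analyzer's actual formula may differ.
    """
    values = (quality, uptime, accuracy)
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * v for w, v in zip(weights, values)) / total_weight


if __name__ == "__main__":
    print(verdict_url("https://example.invalid", 3))
    print(composite_score(90, 60, 30))
```

Keeping the weights as a parameter makes the assumption easy to swap out once the real formula is known.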

All data is read live from Base Sepolia (chain 84532). No mock data. Results may vary as on-chain state changes.
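
A client that wants to confirm it is reading from the right network could compare the node's reported chain ID against 84532. The sketch below uses the standard `eth_chainId` JSON-RPC method; the helper names are hypothetical, and the RPC endpoint is left to the caller because the page does not name one.

```python
import json
from urllib.request import Request, urlopen

BASE_SEPOLIA_CHAIN_ID = 84532  # chain ID stated on this page


def is_base_sepolia(chain_id_hex: str) -> bool:
    """Return True if a hex chain ID (as returned by eth_chainId) is 84532."""
    return int(chain_id_hex, 16) == BASE_SEPOLIA_CHAIN_ID


def fetch_chain_id_hex(rpc_url: str, timeout: float = 10.0) -> str:
    """Ask a JSON-RPC node for its chain ID.

    rpc_url is an assumption supplied by the caller, not taken from the source.
    """
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_chainId", "params": [], "id": 1}
    ).encode("utf-8")
    request = Request(rpc_url, data=payload,
                      headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)["result"]
```

For example, `is_base_sepolia(fetch_chain_id_hex(rpc_url))` should hold before any benchmark result is trusted as coming from Base Sepolia.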