benchmarking
Past Benchmark Runs
Birdtest1
Started on Jan 26, 2026, 07:08:48
Model Endpoints: flageval-flagjudge
Cookbooks: hallucination
Percentage of Prompts sent: 1