HealthBench

By Jessica Hagen | 12:14 pm | May 15, 2025
The benchmark measures how well and how safely AI models handle realistic medical conversations, using physician-created rubrics and GPT-4.1 scoring.