Arora, Rahul K., et al. "HealthBench: Evaluating Large Language Models Towards Improved Human Health." arXiv preprint arXiv:2505.08775 (2025).
Obermeyer, Ziad, et al. "Dissecting racial bias in an algorithm used to manage the health of populations." Science 366.6464 (2019): 447-453. doi:10.1126/science.aax2342