Original Article
Performance of Large Language Models on Diagnostic Radiology Board–Style Questions: A Comparative Evaluation of GPT-4o, Perplexity AI, and OpenEvidence
Objective: To compare the diagnostic accuracy and internal consistency of GPT-4o (Generative Pre-Trained Transformer-4 omni), Perplexity AI (artificial intelligence), and OpenEvidence when applied to text-based, specialty-level radiology board questions. Methods: A total of 161 text-based multiple-choice questions from the American College of Radiology (ACR)…