New research reveals that AI chatbots can still give incorrect answers to medical questions.
Every day, millions of people use AI chatbots like Claude, Gemini, and ChatGPT to inquire about their health. However, obtaining accurate information from these chatbots can be more challenging than it seems, despite their confident responses. Recent studies highlight the unreliability of large language models.
One study found that chatbots struggled to detect health misinformation. In another study, researchers discovered that ChatGPT Health, launched in January, “under-triaged” over half of the cases presented, including emergencies requiring immediate care. Co-author Dr. Girish N. Nadkarni advises extreme caution when using chatbots for health advice.
Despite AI chatbots’ ability to pass medical exams, real interaction with users exposes unpredictability due to human-chatbot dynamics. To improve accuracy, experts recommend users test models with misinformation, provide precise cues, consider their expertise level, and verify responses with reliable sources.
OpenAI, however, insists its GPT-5 models can accurately refer emergency cases nearly 99% of the time. Nonetheless, recent findings suggest a need for AI to act more like a “good doctor,” engaging users to gather relevant information before diagnosing or advising actions.
This article is for educational purposes only and not intended as medical advice. Always consult a healthcare provider with medical questions.
