Medicine Defeated 'Dr. LLM'—But the AI Battle Isn't Over!
- Ozzie Paez
- Mar 19
- 2 min read
Medicine should take a well-earned bow for rejecting the notion that software could be trained to be almost human. It was the medical community, including many technologists, who stood firm in asserting that doctors are much more than trained large language models (LLMs). Meanwhile, LLM owners, including OpenAI, should feel grateful, if a bit humbled, that the medical community helped expose their models' limitations. After all, you can't improve a technology without first understanding its shortcomings.

Still, let's give credit where credit is due. AI systems capable of composing poetry in the styles of Shakespeare, Browning, and Dickinson marked an unprecedented leap in sophistication. But the real shock came when ChatGPT-4 outperformed doctors on the United States Medical Licensing Exam (USMLE). Some doctors saw it as a threat to their profession. Others viewed it as a potential solution to worsening physician and nursing shortages. Many dismissed it outright as inadequate, unproven, and risky for patients. Then, after the initial shock and exuberance, the pendulum began to swing back as problematic LLM behaviors, including hallucinations, became evident.
Now, two years after ChatGPT-4's release in March 2023, it's clear that both the early optimism and the later reactionary skepticism missed the mark. Relentless attacks on AI and LLMs, often from individuals who don't understand the characteristics of deep-learning architectures and artificial neural networks, are undermining much-needed innovation. That's unfortunate, because healthcare consistently ranks among the least innovative industries in the modern economy. It's an unsustainable trajectory that invites disruption from innovative outsiders willing to exploit AI, LLMs, and other cutting-edge technologies to deliver more compelling patient value and higher returns to investors.
Summary and Implications
The medical community prevailed in challenging the idea that ChatGPT-4 and other LLMs could replace doctors. The question is: What's next?
ChatGPT-4 was a wake-up call for an industry beset with legacy technologies, care delivery models, and conservative mindsets. Yet its current response is to focus these remarkable technologies on reducing clinicians' administrative and overhead burdens, a helpful but insufficient step toward addressing growing care delivery problems.
In the meantime, millions of patients continue using ChatGPT-4 as a surrogate clinician to evaluate and explain symptoms, diagnose health problems, and suggest treatment options. Patient surveys report that many judge ChatGPT-4 to be competent and more accessible, convenient, informative, and empathetic than their own doctors and care teams. Our evaluations concluded that relying on LLMs for medical advice is risky for patients who lack medical training.
The reality is stark: while the medical community prevailed in the technical arguments over LLMs replacing doctors, millions of patients aren't listening. Patient distrust of healthcare is growing, while ChatGPT-4 is perceived as competent and trustworthy. Patients celebrate its superior availability, accessibility, low cost, and ease of use, which contrast sharply with healthcare's frustratingly low quality of service. This is no longer a technical argument over LLMs' clinical limitations; it's a broader debate about who and what will shape the future of patient care.
Are you considering introducing AI technologies like ChatGPT-4 into your practice? What about complementary technologies like continuous patient monitoring? Please reach out to discuss their risks, benefits, and implications.