“It is much more important to know what sort of a patient has a disease than what sort of a disease the patient has.”
—Dr William Osler
True story. A young girl complained of right wrist pain following a motor vehicle accident. Unconvinced by an urgent care physician and radiologist that there was no fracture, the girl’s mom uploaded a photo of the x-ray to Grok (the X.com version of AI), which disagreed and did see a fracture. A subsequent visit to an orthopedic surgeon resulted in a cast.
How you feel about this story probably depends on who you are. Patients present it as proof that doctors are sometimes wrong, and the miracle of AI allows them to disintermediate us for fast, precise care. Physicians present it as proof that a person who diagnoses and treats himself with AI has a fool for a doctor. A thousand comments on X.com about this case show not only how passionately we all feel, but also how far apart we are. The findings on the x-ray are either of a growth plate, vindicating the urgentologist and radiologist (with an orthopod happy to cast it and free up a room for the next TKR), or of a Colles fracture, demonstrating how overworked doctors sometimes give incompetent care. I won’t say any more about it than I’d opine on whether the internet dress was white and gold. Decide for yourself.
But the question of whether AI is good for patients is moot: There would be no way to stop patients from using it even if it were not. As the articulate patient advocate Dave deBronkart says simply in his Substack of the same name, “Patients use AI.” He and other advocates are compiling and sharing stories of how patients are using tools such as ChatGPT, Perplexity, Claude, and Grok to diagnose themselves, summarize their data, track their outcomes, and participate in their care in an educated way. Clearly, generative AI is very good at connecting clinical dots. (Whether uploading personal health information into these systems is prudent is another question altogether.) It can also be powerful in a patient’s hands for finding relevant clinical studies and current best practices.
Perplexity AI, for example, when asked about abdominal pain, delivers differential diagnoses and best treatments like a med student cramming for boards. The danger isn’t inaccuracy; it’s absolutism. Algorithms deal in probabilities; bodies need nuance. Envision Perplexity summarizing the latest oncology trials as the physician gauges whether the widow before her has the heart to endure another round of chemo. Compiling data is great, but it isn’t the whole story.
We’ve also been warned of AI hallucinations; thankfully, those are becoming rare. Newer approaches in which models check their work as they go (test-time compute) and retrieve relevant data from outside their training (retrieval-augmented generation) have improved accuracy and reliability. This will only get better.
For example, even a year ago, AI-generated answers to medical questions were judged impressively good. The replies that fell short were usually incomplete rather than wrong. Interestingly, I re-ran some of those same queries and found the answers are better today: more complete and more concise.
Yet for every miraculous story of a patient who used OpenAI’s deep research to discover a novel treatment for a recurrent breast cancer, we have stories like that of a patient who used AI to prove her stretch marks were evidence of Bartonella, a diagnosis I don’t expect based on her history or exam. The salience and frustration of these few patients can make it feel like all patient-generated AI is junk, the “Dr Google” story reprised. That’s not the case.
AI is already about as good as physicians at making diagnoses, just as it is at writing code and solving PhD-level math. But making diagnoses is only part of what we do. We also bring knowledge of the patient seeking help. In our office this week, a patient wrote asking why we prescribed Efudix as opposed to Efudix plus Vectical. “The data clearly show the latter is more effective,” he argued based on his ChatGPT query. Yet we knew he had a lot of difficulty tolerating Efudix; adding Vectical would only cause more inflammation. “Got it.” (Score one for the doctor.) Additionally, AI struggles to ask the questions that yield crucial diagnostic information. It is clearly better at interpreting findings than at obtaining a good history. And patients love to diagnose themselves.
Okay, (could be a) true story. A young man walked into Dr William Osler’s office in 1888 and complained of a cancer on his abdomen, which he had diagnosed himself based on Dr A.W. Chase’s book Dr. Chase’s Recipes; or, Information for Everybody, published in 1863. The patient was drinking a tea of green sheep sorrel, as recommended in Chase’s health recipe book. Osler diagnosed him instead with a large cyst, which he incised and drained, resolving the patient’s complaint.
Almost 150 years later, not much has changed. Patients have always wanted and needed to actively participate in their care. What’s new is that the tools they use are much better.
Patient-generated AI work has the potential to help them and to reduce burdens for us. But we need some rapprochement first. Remember to put aside your indignation and respond to patients, as Osler advised in his 1889 lecture to University of Pennsylvania students, with “infinite patience and an ever-tender charity toward these fellow-creatures; have they not to exercise the same toward us?”
It has always been our job to be the doctor. We ought to let patients help. Fortunately for us, however accurate AI becomes, medicine still needs the Dasein of a real person to know just what sort of patient has the disease. That seems likely to always be true.
Jeffrey Benabio is chief of dermatology at Kaiser Permanente San Diego. The opinions expressed in this column are his own and do not represent those of Kaiser Permanente. Dr Benabio is @Dermdoc on X.