Internal medicine resident and AI healthcare researcher Dr. Peter Brodeur joins Dr. Michael Jerkins for a grounded, practical conversation on how artificial intelligence is actually showing up in medicine today and where it’s headed next. From AI scribes and automated workflows to clinical decision support, Dr. Brodeur breaks down what’s delivering real value versus what’s still more promise than payoff.
The discussion explores AI’s impact on physician burnout, workflow efficiency, and patient engagement, while unpacking critical concerns around safety, liability, automation bias, and the risk of deskilling clinicians. Dr. Brodeur also shares insights from recent research on where AI performs best, why “human-in-the-loop” systems matter, and how medical education may evolve alongside these tools.
Can AI truly improve care without compromising clinical judgment? And how should doctors think about adopting new technology without losing the human side of medicine?
Dr. Brodeur closes with a look at what responsible AI integration could mean for the future of healthcare and why cautious optimism, not blind adoption, is the path forward.
Here are 5 main takeaways from the discussion with Dr. Peter Brodeur:
1. The true power of AI in healthcare lies in augmenting human judgment, not replacing it
AI can streamline tasks like documentation, intake, and routine decision-making, but the core of medical practice—trust, relationship, and nuanced judgment—remains inherently human and irreplaceable.
2. AI implementation spans a spectrum, from incremental workflow improvements to the potential for systemic reinvention
Short-term gains focus on integrating AI into existing workflows for efficiency, but transformative change requires reimagining healthcare delivery models driven by AI as a foundational partner.
3. Human-AI collaboration doesn’t automatically fulfill the fundamental theorem of informatics
Humans plus machines outperform humans alone only under carefully designed interaction strategies; otherwise, AI can diminish human skill, highlighting the need for deliberate workflow integration.
4. AI’s impact on clinical skills is double-edged, risking both automation bias and de-skilling
Doctors tend to over-rely on AI outputs, reducing their diagnostic reasoning capacity; without safeguards, future clinicians may lose critical expertise.
5. Relationship and communication remain the most vital elements of effective healthcare
Despite AI’s capabilities, human connection—empathy, rapport, and trust—cannot be replicated by machines and is essential for patient-centered care.
Transcript
Peter Brodeur:
You could imagine these tools fitting into current workflows, whether it’s intake tools or helping you write notes and discharge summaries. The other lens, which is the one I’d like to take in my future career, is that this is probably an opportunity to use these technologies to reinvent the healthcare system.
Michael Jerkins:
Welcome back to another episode of The Podcast for Doctors (By Doctors). I’m your host, Dr. Michael Jerkins. Unfortunately, once again, Dr. Ned Palmer is absent today.
I’m especially excited about this conversation because today’s guest is not only a physician, but someone working at the bleeding edge of AI research and its application in healthcare. Everyone’s talking about AI right now. Beyond the Super Bowl ad wars between GPT and Claude and others, what really matters is this: What is AI’s impact on patients? What is its impact on doctors? Is all the anxiety around it justified? What are we missing? And what’s coming next?
Today on The Podcast for Doctors (By Doctors), we’re joined by Dr. Peter Brodeur, a current third-year internal medicine resident and rising cardiology fellow at Harvard Medical School’s Beth Israel Deaconess Medical Center. He attended the Warren Alpert Medical School of Brown University. His research focuses on human-computer interaction and large language model clinical reasoning. He’s a reviewer for Nature Medicine and NEJM AI, holds a master’s degree in economics, and previously worked as a life sciences strategy consultant.
Welcome to the podcast.
PB:
Thanks for having me. I appreciate it.
MJ:
That’s a lot. You’ve done a lot already. What’s your ultimate goal? After residency and cardiology fellowship, what do you want your life to look like?
PB:
As you mentioned, I’m heading into cardiology fellowship and definitely want to be an academic clinician. I’ll always be a doctor first. That’s why we went into medicine.
That said, my interests are broad. Right now, I’d love to see AI implemented into the practice of medicine from an evidence-based standpoint. That means continuing to do research and helping guide how these technologies are adopted in clinical care.
MJ:
Given your background, do you think the U.S. healthcare system is broken, or is it just misaligned?
PB:
When I worked as a strategy consultant, I gained insight into how complex the U.S. healthcare system really is. I focused a lot on drug pricing, rebates, and pharmacy benefit managers, and saw just how interconnected and complicated everything is.
Whether it’s broken or not, I think we’re at a crossroads. There are two ways to look at AI implementation. One is incremental—using AI within existing workflows for things like intake tools, notes, and discharge summaries.
The other lens, which interests me more, is using these technologies to reinvent how we deliver care. AI could function as an extension of care teams and fundamentally reshape healthcare delivery. So instead of labeling the system as broken or not, I’d say we have a real opportunity to change the landscape.
MJ:
It’s easy to build small AI tools that tweak workflow, but structural change is much harder. I recently saw data suggesting AI scribes may reduce burnout subjectively, but don’t significantly reduce documentation time. How should we think about that?
PB:
There have been several studies on AI scribes. An early one in JAMA looked at subjective burnout and documentation burden. Physicians reported feeling better with AI scribes.
Subsequent randomized studies in NEJM AI showed similar subjective improvements. However, objectively, the time savings were minimal—about 20 seconds per note, roughly equivalent to what human scribes can achieve.
So there’s a cognitive load benefit. Editing a draft note feels easier than writing one from scratch. But for meaningful objective gains, companies are now integrating workflow tasks—like handling prior authorizations in real time during visits instead of after the fact. That’s where we may see real impact.
MJ:
As a primary care med-peds physician, I’d argue primary care is one of the hardest jobs in healthcare. There simply aren’t enough hours to cover every USPSTF recommendation in a visit. How can AI help us spend more time with patients?
PB:
We’re already seeing some changes in encounter dynamics. AI scribes allow physicians to face patients rather than screens.
There was a study in China showing that AI intake tools improved specialty referral coordination and reduced consultation time. That’s a productivity gain.
Looking forward, AI could act as an extension of the care team. For example, Utah partnered with a company called Doctronic to automate prescription renewals. Many inbox tasks are routine and below the top of a physician’s license. Those could potentially be delegated safely to AI systems.
MJ:
Let’s talk about Doctronic. My understanding is that Utah launched a pilot allowing AI to automate refill requests for stable patients meeting certain criteria.
Reactions seem split. Some say it’s crossing a line. Others say it’s a necessary evolution. What’s your read?
PB:
It raises regulatory questions. States oversee medical licensure, while the FDA regulates medical devices. If an AI tool is effectively practicing medicine, whose jurisdiction does it fall under?
From an implementation standpoint, I assume they’re starting with low-risk medications and stable patients. They’ve argued this addresses physician shortages, especially in primary care.
The bigger concern is safety and scope. Where does it stop?
MJ:
I know primary care physicians who would welcome it. Imagine supervising multiple nurse practitioners and suddenly inheriting hundreds of refill requests daily. AI could meaningfully reduce that burden. But yes, the slippery slope concerns are real.
PB:
Exactly. Safety oversight is critical. Early deployments will likely require a “human in the loop,” meaning every AI action is reviewed by a physician.
That’s labor-intensive but probably necessary in the short term. Ultimately, we want AI to improve patient outcomes and physician well-being. And we must uphold the core ethical principle of medicine: do no harm. Getting safety right is essential as we integrate these tools into real clinical workflows.
MJ:
I totally agree with that. And speaking of harm, how should doctors think about the liability of these AI tools when they make mistakes? My understanding is that Doctronic was able to secure some sort of insurance to shore up some of their risk. How should I think about liability when AI is applied to clinical care delivery?
I’ll bring up a case from medical school as an example. This was long before AI was implemented into healthcare workflows. A fellow medical student saw a patient and, as we do, signed their note. The attending then attested it as a progress note. Wanting to show off their knowledge, the student had included a differential with about 30 different diagnoses. Within that list was a zebra diagnosis that the patient ultimately had—but it was missed. That later became a litigation issue because it was charted. The question became: if it was in the differential, why didn’t the attending consider it?
PB:
As I think about AI, one of the challenges is that it’s often very verbose. If you’ve used these systems, you know they can generate long differentials. If that becomes integrated into the chart, and it’s documented in the patient record, I think the physician will have some liability for what the AI is saying.
At least in the near term, I think it will be similar to radiology. Radiologists don’t get sued because the CT scanner malfunctioned. Ultimately, the physician is still responsible. I suspect AI will be viewed similarly—we’ll remain liable for the final decision-making.
MJ:
The biggest use I see among practicing clinicians right now is with referencing and information-gathering tools like OpenEvidence, Doc Sembode’s model, and I believe even UpToDate has something similar. I couldn’t afford the subscription anymore.
But now there are also more direct-to-consumer models like GPT Health and Claude for Healthcare.
What models do you see doctors using—especially in your hospital—and how effective are they at influencing and improving care?
PB:
The first thing I’ll say is that I don’t endorse putting patient data into these systems. They are not HIPAA-compliant environments. From that standpoint, you shouldn’t be entering private patient information. However, if you’re providing hypothetical information or asking a general question—like, “What’s the evidence on which medication to start first in a patient with HFrEF?”—tools like OpenEvidence can provide solid answers with references to guidelines.
If you came to our hospital and saw our internal medicine admitting room, you’d notice we have two screens. One is Epic. The other is typically something like OpenEvidence. In previous years, it may have been UpToDate, but that’s shifted over time. I see people getting information faster with OpenEvidence. UpToDate can feel static and overwhelming. You might open an article that’s extremely long, scroll down to the summary and recommendations, and still not find exactly what you need. Today’s tools offer more tailored responses to specific questions.
As for ChatGPT, I generally try to think first as a physician. I develop my assessment, plan, and differential diagnosis. Then—without using real patient information—I might create a hypothetical case and input it into ChatGPT to see if it brings up something I hadn’t considered. I use it almost like a second opinion. I think that’s where some physicians are finding it most helpful.
We could go deeper into human-computer interaction as a field, but that’s a whole separate discussion.
MJ:
What’s funny is that before UpToDate, it was Epocrates. There’s always another resource that becomes more efficient. I’m curious—are you finding that patients are also coming in having used these direct-to-consumer models and asking questions based on them?
PB:
Yes. When we talk about patient-facing AI, I’m actually a big proponent of patients coming in with thoughtful questions about their condition. These tools allow patients to take more ownership of their health. They can engage more deeply because they’re able to get answers on their own.
We’ve all had the experience of leaving an appointment and thinking, “I should have asked that.” Now, patients can input that question into ChatGPT and get an answer. With the advent of GPT Health, there’s more endorsement of that concept.
That said, there was a study in the New England Journal of Medicine AI that showed patients couldn’t reliably distinguish between responses from a large language model and responses from a physician. Whether the advice was good or bad, patients often couldn’t tell. That’s concerning. We can’t assume patients can provide oversight in this process.
So we need to ensure these systems are safe. You could argue that some products may be premature in being marketed as healthcare tools that can answer urgent medical questions.
MJ:
Right. But at the same time, if patients come in with more informed questions, we can have deeper discussions. Instead of spending time explaining basic medication side effects—something they could have looked up—we can discuss more nuanced topics, like comparing two medications rather than just reviewing one.
Is there evidence yet that doctors are getting worse clinically because of AI use?
PB:
There are two related concepts. The first is automation bias. That’s when doctors defer to the AI’s reasoning and abandon their own thought process. There was a study where physicians were asked to perform diagnostic reasoning. Researchers inserted incorrect reasoning traces into some of the AI outputs. Physicians exposed to incorrect reasoning performed worse than those using correct outputs. That suggests automation bias is real, and we need to be cautious.
The second concept is de-skilling. Whether that’s good or bad is debatable. There are probably tasks we’re comfortable giving up. A good analogy is GPS. Twenty years ago, most people could navigate without one. Now, I probably couldn’t get to a supermarket two miles away without GPS. That’s de-skilling.
There have been studies in gastroenterology, particularly in endoscopy, looking at adenoma detection rates before and after AI implementation. Interestingly, in the group that had been exposed to AI but wasn’t actively using it in certain cases, there was about a six percent absolute reduction in adenoma detection rates. That suggests some degree of skill degradation. It highlights the importance of carefully optimizing human-computer interaction.
MJ:
Are you worried about de-skilling the physician workforce further?
PB:
From my standpoint, I’m less concerned about current attending physicians.
I’d say I’m much more worried about the next wave of physicians who are exposed to these products. With social media, our attention spans are already so short. I worry about automation bias because it creates a shortcut—if you put every difficult case into ChatGPT, it gives you an answer and you run with it. But did you actually learn anything?
Some of the smartest doctors I’ve come across are primary care physicians and internists who have a high threshold for consulting other services. They push themselves to figure things out rather than immediately relying on a consultant. If medical students start using ChatGPT for every challenging case, you can imagine how their clinical reasoning skills might be blunted ten years from now. That worries me.
MJ:
Walk me through the head-to-head studies. I know they’re limited, but in terms of clinical diagnostic accuracy—AI alone, AI plus human, human alone—how should I think about it? There have been a lot of studies. Is there a summary?
PB:
There are generally two ways we study this: diagnostic reasoning and management reasoning. We separate them because management reasoning is more nuanced. You can imagine two physicians managing a patient with type 2 diabetes very differently, yet both appropriately. Diagnostic reasoning, on the other hand—whether someone has diabetes or not—is more binary. So those reasoning skills are distinct.
In management reasoning, there was a study in Nature Medicine comparing physicians using GPT-4 versus physicians using conventional resources like UpToDate. In that study, physicians using GPT-4 outperformed the group using traditional resources.
The same type of study was done from a diagnostic reasoning standpoint by a group out of Stanford, and they didn’t see that same signal. That raised questions about whether modifying the human-computer interaction could improve outcomes. Ideally, you’d want a complementary effect—human ability plus AI ability. I know when I’m wrong. I know when AI is wrong. In theory, we should complement each other. But that wasn’t consistently true in the literature.
Researchers have tried to modulate the interaction. For example, the human goes first, then GPT-4 provides a critical assessment of the physician’s reasoning trace. In a follow-up study from the same Stanford group, that approach did show the expected improvement. So there may be ways to structure the interaction that produce better outcomes than humans alone.
Another interesting concept is the fundamental theorem of informatics, which states that humans plus machines should outperform either humans alone or machines alone. That’s what we hope to achieve. But in some of the management reasoning studies comparing humans plus GPT-4, humans plus conventional resources, and GPT-4 alone, it didn’t always turn out that the combination outperformed the model alone. That challenges the fundamental theorem. A lot of my research focuses on modulating those interactions to achieve a more complementary state.
MJ:
What are some of the modulating factors you’re exploring to get closer to that ideal combination?
PB:
I don’t think we fully know the answer yet—that’s why we’re studying it. Physicians, especially PCPs, already face high burnout rates. If we add another tool in the EMR that increases alert fatigue or forces them to use an external system, that’s too much. There has to be a better way to introduce these technologies without worsening burnout.
One key question is timing. When is the right time to interrupt a physician? If you’re using clinical decision support, is it after the history and physical? After the differential is formed? There was a large study last year at Penda Health with OpenAI where an AI system ran in the background and provided stoplight-style alerts if it thought the physician’s reasoning, documentation, or differential might be off. The big discussion is: where in the workflow should AI intervene—if at all? That’s an area I’m particularly interested in.
MJ:
Is there any part of a typical outpatient or inpatient visit that you think AI will make obsolete?
PB:
AI is already quite good at history-taking. Google has tools in development, and there are studies out of the K Health telehealth system. Pre-intake history-taking may become partially obsolete in terms of triage. These models are good at gathering structured histories and presenting information to physicians.
Note-writing is already becoming less burdensome with ambient scribes. I also hope to see administrative and workflow tasks reduced—things like discharge summaries. Many of those integrations are already happening. AI models are especially strong at text-based tasks, so anything documentation-heavy is likely to be streamlined.
MJ:
I agree with that, but especially in outpatient primary care, so much of care is based on the relationship between two human beings. If you remove the history and jump straight into decision-making, it’s like going on a date and skipping the first three dates—you start talking about getting engaged before establishing rapport. If I walk in and immediately say, “You need an MRI for your headache,” based only on an AI intake, we may lose something important. How do we balance efficiency with the humanness of those interactions?
PB:
I’ll clarify that AI intake tools aren’t widely implemented yet. I just see research signals suggesting they could evolve in that direction.
On the flip side, AI could allow for deeper discussions. Instead of spending five minutes asking broad questions like “Tell me about your symptoms,” you could say, “I saw that you reported rhinorrhea with your headaches. Can you tell me more about that?” That allows you to focus on more meaningful conversation during an already busy visit. But getting that balance right will require careful research and human oversight.
In terms of relationship-building, I do think that’s where AI has limitations. AI can be an extension of the care team, but I don’t think it will replace physicians for that reason. Patients may think they want the convenience of AI-based care, but much of medicine involves telling another human how you feel and having that person relate to you. That shared human experience—being understood and empathized with—is something I don’t think AI will fully replace.
MJ:
Yeah, it’s a deeply human experience. You’re already seeing a bit of this with how accessible we’re making direct-to-consumer clinical services—online labs, pharmacies, and other platforms that cut out much of the human interaction. Anecdotally, and I believe there’s some data to support this, it may actually drive more people into the health system. They get a flood of information they don’t know how to interpret, and they can’t access someone who understands them and can contextualize it. So they come to me with 400 lab results from an online service and ask what to do next. Do you see that too?
PB:
What comes to mind is patient expectation-setting. I can imagine a scenario where an AI system tells someone with a headache that they need a stat CT scan. They then show up saying, “I need a CT right now.” You can de-escalate that, but it may create friction. AI could set expectations that the healthcare system can’t meet—whether because of cost, resource limitations, or clinical appropriateness.
Right now, we have siloed environments. AI doesn’t understand the context of the healthcare system, your specific practice, or your relationship with the patient. It may recommend something you simply can’t fulfill. As we integrate these systems into our teams, we have to make sure they’re not escalating patient expectations in ways that overburden the system. Of course, if someone needs a scan, you want to provide it—but you don’t want unnecessary strain on resources.
MJ:
I want to switch gears for a minute because we also have dental listeners and veterinary colleagues. Do you see impactful AI applications in clinical decision-making for dentists?
PB:
It’s not my primary area, but as a journal reviewer, I’ve reviewed work in dental radiology—looking at enamel erosion and other dental conditions, and whether AI could augment decision-making around procedures. Radiology is a major use case across the board—dentistry, veterinary medicine, and human medicine. Imaging is one area where AI seems broadly applicable and where I’ve personally seen promising research.
MJ:
I’ve talked to dental colleagues about tools like OpenEvidence, and there doesn’t seem to be a clear equivalent for dentistry yet—nothing like an UpToDate specifically tailored to them. There’s extensive academic research in dentistry, and practice patterns vary widely, especially in private practice. But I don’t see a standout evidence-based tool yet—at least not one that’s gained traction.
PB:
That may reflect broader trends. In medicine, surgery often lags behind internal medicine in AI research. Internal medicine covers a wide variety of conditions, so we may need clinical decision support more urgently. Subspecialists tend to have a narrower and deeper focus and may not require as much broad decision support.
There’s also the documentation piece. Internists joke that we judge our work by the length of our notes. Surgeons don’t write long notes, and some specialties—like ophthalmology—use documentation styles that feel like another language. So AI tools that focus on note-writing and broad decision support may naturally gravitate toward internal medicine first.
MJ:
That’s a good point. Given what you’ve mentioned about learners, what do you think AI might make obsolete in medical education?
PB:
Studies have already looked at student sentiment around lectures and textbooks versus learning with large language models. Medical students haven’t been consistently attending lectures for years, so that shift was somewhat predictable. Students appreciate AI tools because of the interactive and asynchronous nature—they can access them anytime.
This raises questions about how we teach clinical reasoning. Educators should be cautious but proactive. AI should be integrated thoughtfully into reasoning education. We don’t want students taking shortcuts on every difficult case.
Medical education already leans heavily on board exams and multiple-choice questions, which don’t always correlate with real-world, open-ended clinical practice. There’s a concept called the “multiple biopsy” approach—using OSCEs, multiple-choice exams, script concordance testing, and real clinical clerkships to assess reasoning from multiple angles. AI should be integrated into that framework rather than treated as an afterthought.
MJ:
That makes sense. Education already looks very different from when I was in school. It’ll be interesting to see whether clinical reasoning deteriorates—or perhaps improves—if learners get more simulated reps and structured case exposure.
Before we go into rapid fire, I have one more question. You review a lot of research. What’s an application of AI that most people aren’t expecting but is closer than we think?
PB:
We’ve talked a lot about diagnostic and management reasoning, but what excites me most right now is risk stratification. There are multimodal models that can interpret imaging—echocardiograms, CT scans—and not only generate diagnoses but also predict who will respond to treatment or who is at risk for cancer recurrence.
Epic has a model called Comet that tokenizes medical events rather than words. Instead of predicting the next word in a sentence, it predicts the next medical event in a patient’s timeline. With Epic’s vast dataset, they’ve built highly accurate next-event generators. That opens the door to personalized medicine in a way that’s much more dynamic than traditional risk calculators.
Historically, we’ve had tools like sepsis predictors, but many haven’t been widely adopted. If these newer models become highly accurate, they may be too powerful to ignore—especially when it comes to screening decisions and treatment selection.
MJ:
That could significantly impact screening guidelines, making them more personalized rather than broad, one-size-fits-all recommendations.
PB:
Exactly. There was a large mammography study last year involving hundreds of thousands of women. Radiologists could review a safety-net AI read when they initially interpreted a study as normal but the AI flagged it as suspicious. That approach increased breast cancer detection rates without increasing false positives or unnecessary biopsies. That’s a powerful signal—improving accuracy without adding harm.
MJ:
That is super interesting. It’ll also be interesting from a business standpoint. The large EMRs that own massive amounts of data will likely build superior predictive models, which could widen their competitive moat. It’s a little uncomfortable to think about, but it’s part of the reality.
Let’s do a rapid fire. I’ll give you a statement—true or false—and then we’ll wrap up.
First: Medical training will be shortened because of AI’s effect on education.
PB:
Right now, false. Even with AI, I would still want physicians trained to reason independently and not take shortcuts. That ties back to de-skilling. Just because AI exists doesn’t mean doctors should train less or know less. We still need physicians who can practice autonomously. Training is already long, and there are arguments for shortening it for other reasons—but AI isn’t strong enough to justify that yet.
MJ:
I tend to agree.
Next: Doctors will see more patients per day in five years because of AI and technological advances.
PB:
Probably true. Some studies—like a recent one out of China looking at specialty intake tools—suggest efficiency gains. Even if they didn’t directly show increased patient volume, you can imagine that greater efficiency could open room for an additional patient.
That said, I hope it doesn’t worsen burnout. If AI filters out the simpler cases—like the quick URI visits that sometimes give you a breather—and leaves you with only complex cases all day, that could be exhausting.
MJ:
Next: Elon Musk was right, and people should not go to medical school.
PB:
False. For the human aspect alone, it’s an incredible privilege to care for patients and connect heart-to-heart. I don’t think AI will replace that experience. Beyond that, we love the science, the detective work, the reasoning—that’s why I chose internal medicine. Even if AI can do parts of it better, if you love it, you should pursue it.
MJ:
Now a few more. By the end of your lifetime, would you feel confident riding in a self-driving car?
PB:
Yes. I haven’t yet, but I’d feel confident doing so.
MJ:
Would you feel confident receiving prescription refills from an autonomous AI agent?
PB:
Yes, definitely.
MJ:
Would you feel confident having a wound sutured by an autonomous surgical robot?
PB:
Probably, yes. I’m not deeply involved in that field, but we’ve already seen robots perform things like carotid ultrasounds very well. Fine motor precision is clearly possible.
MJ:
Last one: Would you feel confident receiving a diagnosis from an autonomous AI model without a human in the loop?
PB:
Yes, eventually. I’m not sure which conditions that would apply to yet, but with enough human oversight and validation, we’ll likely identify specific scenarios that can be safely managed autonomously and others that should remain human-led.
MJ:
We’ll close with the question we ask every guest: What’s one thing you’ve recently changed your mind about?
PB:
I’ll give you two.
First—very much a resident answer—you don’t need eight hours of sleep to function effectively.
MJ:
You need nine?
PB:
Maybe. But residency pushes you to your limits. We’ve talked so much about AI, but humans are remarkable too—what we can do on little sleep, how we respond to rapidly changing information. That’s something we still do better than AI. I’ve realized I don’t need quite as much sleep as I once thought.
MJ:
What’s ideal for a resident?
PB:
I feel good at six hours.
MJ:
Six feels like the realistic optimal answer.
PB:
We’ve both had nights well over 24 hours awake.
MJ:
Absolutely. So what’s your second answer?
PB:
It’s more scientific. The concept of AGI—artificial general intelligence—and superintelligence. There’s ongoing debate about whether we’ve reached AGI. I recently read a Nature editorial arguing that we may have, depending on how you define it.
If AGI means performing at expert level across every domain, that’s one definition. But maybe it’s about adaptability—being able to move between domains and perform at a baseline human level. Just because models improve yearly doesn’t mean we should constantly redefine intelligence to preserve human exceptionalism.
There may also be an emotional or evolutionary discomfort with the idea that we might not be the most intelligent entities. So perhaps we shift the goalposts. I’ve changed my mind a bit—I’m more open to the idea that, by some definitions, we may already be there.
MJ:
That’s a whole other podcast.
PB:
Ultimately, it may not matter. Like calculators—I study AI, but I don’t necessarily care what’s underneath as long as it’s reliable and improves patient outcomes. Whether we call it AGI or not is secondary to whether it’s safe and effective.
MJ:
Did you like all the AI commercials during the Super Bowl?
PB:
It’s interesting. The big labs are all competing, each with different strengths and weaknesses. At this point, they’re roughly at parity. Whether you use Anthropic, ChatGPT, or Gemini, it’s mostly preference. I wouldn’t strongly endorse one over another.
Some of the naming conventions, though—version 2, 3.5—it’s confusing. But they make good commercials.
MJ:
Where can people follow your research?
PB:
I’m part of the ARISE Network, a Stanford–Harvard collaboration. We recently published a State of Clinical AI report reviewing over 100 studies from the past year—covering scribes, human-computer interaction, medical event prediction models, and more.
You can search for the ARISE Network or “State of Clinical AI 2026” to find the report and follow our work. We’ll be releasing more research this year.
MJ:
Very cool. Thanks so much for your time—this was awesome.
PB:
This was great. Thanks for having me.
MJ:
You can catch The Podcast for Doctors (By Doctors) on Apple, Spotify, YouTube, and all major platforms. If you enjoyed this episode, please rate and subscribe. Next time you see a doctor, maybe prescribe this podcast. See you next time.
Check it out on Spotify, Apple, Amazon Music, and iHeart.
Have guest or topic suggestions?
Send us an email at [email protected].