Guide · Safety & Ethics · Beginner · Essential

What About Patient Safety? What Every Clinician Needs to Know About AI Trust

Ramez Kouzy, MD · 5 min read

What you'll learn

  • Never put PHI into consumer AI tools without a BAA
  • AI outputs always need human verification
  • Clinical AI tools vs. consumer AI assistants: different standards
  • How to check and follow institutional AI policies
  • The responsible adoption gradient: low-stakes to clinical

The Stakes Are Different Here

Risk | What It Looks Like | How to Mitigate
Hallucination | Fabricated citations, invented statistics | Always verify against primary sources
Data privacy | PHI entered into public models | Never enter identifiable patient data
Automation bias | Trusting AI without checking | Treat AI as a first draft, not a final answer
Sycophancy | AI agreeing with your wrong assumption | Ask it to challenge your reasoning

In most industries, an AI mistake means a bad email draft or a wrong spreadsheet formula. In medicine, an AI mistake can mean a wrong dose, a missed diagnosis, or a treatment plan built on fabricated evidence. The stakes are categorically different, and our approach to AI adoption needs to reflect that.

This does not mean avoiding AI. It means adopting it responsibly, with the same rigor you apply to any new tool in the clinic.

The principles that matter:

Rule 1: Never Put PHI Into Consumer AI Tools

This is the non-negotiable starting point. ChatGPT, Claude, Gemini - these are consumer products. Unless your institution has a Business Associate Agreement (BAA) with the provider, any patient data you enter is potentially being processed, stored, and used in ways that violate HIPAA.

This means:

  • Do not paste clinical notes with patient identifiers into ChatGPT
  • Do not upload imaging reports with names and MRNs into Claude
  • Do not describe a patient's case with identifiable details in any consumer AI tool

If you want to use AI for clinical reasoning, de-identify first. Change the name, remove the MRN, alter non-essential demographics. Or use AI tools that your institution has vetted and approved for clinical use with appropriate data agreements in place.
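To make "de-identify first" concrete, here is a minimal Python sketch of a pattern-based scrub for the most obvious identifiers. It is a toy illustration under stated assumptions (a hypothetical MRN format, US-style dates and phone numbers), not a validated de-identification method: names and free-text details routinely slip past simple rules, so a manual read-through and your institution's de-identification standards still apply.

```python
# Toy illustration only: scrub a few obvious identifiers before sharing text
# with an AI tool. NOT a validated de-identification method; always review
# the output manually and follow your institution's standards.
import re

def scrub_obvious_identifiers(text: str) -> str:
    # Hypothetical MRN format (7-8 digits); adjust to your institution's pattern
    text = re.sub(r"\bMRN[:#]?\s*\d{7,8}\b", "MRN [REDACTED]", text, flags=re.IGNORECASE)
    # Common numeric date formats (e.g., 03/14/2024 or 2024-03-14)
    text = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", text)
    text = re.sub(r"\b\d{4}-\d{2}-\d{2}\b", "[DATE]", text)
    # US-style phone numbers
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)
    return text

note = "Jane Doe, MRN: 1234567, seen 03/14/2024, call 555-867-5309 with results."
print(scrub_obvious_identifiers(note))
# The name still comes through untouched -- exactly why manual review is required.
```

Notice that the patient name in the example passes straight through. Simple rules catch the easy identifiers; the human review is what catches the rest.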

Some platforms offer enterprise tiers with BAAs - OpenAI's Enterprise plan, for example, or institutional deployments of Claude. Check with your IT and compliance teams about what is available and approved at your institution.

This is not paranoia. This is the same standard you already apply to texting about patients, emailing clinical details, and discussing cases in public spaces. AI tools are no different.

Beyond PHI: Protecting Research and Institutional Data

PHI gets all the attention, but it's not the only sensitive data you handle. When you use a public LLM, the company potentially has access to everything you input. For clinical work, that's a HIPAA issue. For research work, it's an intellectual property and competitive advantage issue.

What you should never paste into public LLMs:

  • Unpublished research findings and preliminary data
  • Grant proposals before submission (your innovative ideas are intellectual property)
  • Manuscripts under peer review (yours or others')
  • Proprietary institutional methods or protocols
  • Confidential institutional information

Many public LLM providers say they don't train on consumer inputs, or let you opt out, but their privacy policies typically still allow human review of conversations for safety and quality purposes. That means a contract worker could theoretically read your pre-submission R01 grant. More importantly, policies change. Trust is not a data protection strategy.

For sensitive research work, use local LLMs:

  • Ollama (free, open source): Run models like Llama 3 or Mistral entirely on your computer. Nothing leaves your machine.
  • LM Studio (free): User-friendly interface for local models. Good option if you're not technical.

Local models aren't as capable as GPT-4 or Claude Sonnet, but they're sufficient for drafting, editing, and brainstorming with sensitive content. Use public LLMs for generic tasks. Use local LLMs when confidentiality matters.
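If you want a sense of how low the barrier to entry is, here is a minimal Python sketch that sends a prompt to a locally running Ollama server over its HTTP API. It assumes Ollama is installed and running on its default port (11434) and that you have already pulled a model (for example, llama3); the model name and prompt are placeholders.

```python
# Minimal sketch: send a prompt to a locally running Ollama server.
# Nothing in this request leaves your machine.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    json={
        "model": "llama3",                   # any model you have pulled locally
        "prompt": "Suggest a tighter title for this abstract: ...",
        "stream": False,                     # ask for one complete response
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

LM Studio can expose a similar local server, so the same pattern carries over if you prefer its interface.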

Rule 2: AI Is a Co-Pilot, Not an Autopilot

Every AI-generated output in a clinical context needs human verification. Every single one. There are no exceptions to this rule, regardless of how good the model is or how confident the output sounds.

This applies to:

  • Treatment recommendations: verify against guidelines and your clinical judgment
  • Literature summaries: check that the cited studies actually exist and say what the model claims (a quick citation-existence check is sketched after this list)
  • Calculations: confirm the math independently
  • Clinical documentation drafts: review for accuracy before signing
  • Patient-facing materials: read everything before sharing
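
For the citation check in particular, part of the verification can be automated. Here is a minimal Python sketch that asks Crossref whether a DOI actually resolves. A hit only tells you the reference exists, not that it supports the claim, so reading the paper is still on you; swap in whatever DOI the model cited.

```python
# Minimal sketch: check whether a DOI from an AI-generated citation exists.
# A 200 from Crossref means the DOI is registered; it does NOT mean the paper
# says what the model claims it says -- you still have to read it.
import requests

def doi_exists(doi: str) -> bool:
    r = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    return r.status_code == 200

# Replace with the DOI(s) the model cited
print(doi_exists("10.1056/NEJMoa2034577"))
```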

The analogy I find most useful: AI is like a very well-read resident presenting a case. They might give you an excellent summary and thoughtful plan. They might also miss something critical or get a key detail wrong. You would never sign off on a treatment plan from a resident without reviewing it yourself. Apply the same standard to AI.

The evidence base for this caution is well-documented. The paper "One Shot at Trust: Leveraging AI Responsibly in Health Care" makes a compelling case that medicine gets essentially one opportunity to establish public trust in clinical AI tools. A high-profile failure - a wrong diagnosis, a harmful recommendation - could set the field back years. We should be early adopters, not reckless adopters.

Rule 3: Understand the Two Different Worlds of Clinical AI

There are two very different categories of AI in medicine, and they operate under different standards:

Consumer AI assistants (ChatGPT, Claude, Gemini) are general-purpose tools that you use for thinking, drafting, and brainstorming. They are not regulated as medical devices. They have no clinical validation. Their makers accept no liability for how you use them clinically. They are productivity tools, and you are responsible for everything you do with their output.

Clinical AI tools (auto-contouring software, dose prediction systems, FDA-cleared diagnostic aids) are purpose-built for specific clinical tasks. They have undergone validation, may have regulatory clearance, and are integrated into clinical workflows with appropriate oversight.

These are different things requiring different levels of trust. Using Claude to help draft a manuscript introduction is fundamentally different from using an AI system to generate a treatment plan. The former is a productivity tool; the latter is a clinical decision support system that should meet a much higher evidence bar.

Do not conflate them. A model that writes excellent prose is not necessarily a model you should trust for dose calculations.

Rule 4: Check Your Institutional Policies

Before you integrate any AI tool into your clinical or research workflow, find out what your institution allows. Many academic medical centers and hospital systems now have formal AI policies covering:

  • Which AI tools are approved for use
  • Whether enterprise agreements with BAAs are in place
  • What data can and cannot be entered into AI systems
  • Requirements for disclosing AI use in publications or clinical documentation
  • IRB considerations for AI-assisted research

These policies exist for good reason. They protect patients, protect you, and protect the institution. Following them is not optional, even if you think the risk is low.

If your institution does not have an AI policy yet, that is worth raising with your leadership. The absence of a policy does not mean anything goes - it means the guardrails have not been built yet.

Rule 5: Start Low-Stakes, Build Confidence Gradually

The responsible adoption path is not zero-to-clinical-deployment overnight. It is a gradual escalation:

Start with personal productivity. Use AI to summarize papers, draft emails, brainstorm research ideas, organize your thinking. These are low-stakes tasks where errors are easily caught and consequences are minimal.

Move to education and training. Use AI to help prepare lectures, generate study questions, explain complex concepts. The output gets reviewed before reaching learners, and the risk profile is manageable.

Then explore clinical support - carefully. Use AI as a sounding board for clinical reasoning, but always with your own expertise as the final check. Never let AI output reach a patient without your explicit review and approval.

Stay current on institutional tools. As validated, regulated AI tools enter your clinical workflow (auto-contouring, plan quality checks, decision support), engage with them from a position of understanding rather than blind trust or blind skepticism.

This gradient - from low-stakes personal use to higher-stakes clinical integration - lets you build genuine competence with the technology while maintaining patient safety at every step.

The Bottom Line

AI safety in medicine is not about fear. It is about the same professional discipline you already practice every day. You verify drug doses. You double-check treatment plans. You get second opinions on complex cases. Apply those same habits to AI output.

The clinicians who adopt AI responsibly - starting with low-stakes tasks, building expertise, maintaining verification habits, and following institutional guidelines - will be the ones who capture the genuine productivity benefits of these tools without putting patients at risk.

Do not be afraid of AI. Be rigorous with it. That is what your patients deserve.
