
My AI Toolkit: What I Actually Use Every Day

Ramez Kouzy, MD | 12 min read

What you'll learn

  • Which AI to use for everyday tasks vs deep reasoning vs web search
  • Practical tools for dictation, presentations, image generation, and coding
  • Medical-specific AI: OpenEvidence, MedNet AI, Doximity AI
  • The Council of AI: why you should never rely on a single model
  • How Notion + AI becomes a physician's second brain

Last updated: February 2026

The landscape of artificial intelligence moves so quickly that "best" is always a temporary title. Every month brings a new flagship model, a fresh set of benchmarks, and a wave of hype that can make it impossible to know where to actually invest your time. I have spent the last year testing nearly every tool that hits the market, and what I have learned is that the most effective workflow isn't built on a single platform. Instead, it is a curated stack of specialized tools, each chosen because it solves a specific problem better than anything else.

A rule of thumb: the newest, largest models from the three big labs (Anthropic, OpenAI, and Google) are usually the best, and those three dominate the space right now. Some people also like Grok; I'm personally not a fan (see the recent sagas online).

The Foundation: Never Trust One Model

The Multi-Tool Approach

No single AI tool does everything well. I use different tools for different tasks -- Claude for writing and coding, ChatGPT for deep research, Perplexity for cited answers, NotebookLM for deep paper analysis. The skill is knowing which to reach for.

Before looking at the specific tools, it is important to understand the core principle that governs how I use them: I never trust a single model with a high-stakes clinical or research question. Instead, I pose the same problem to Claude, GPT, and Gemini simultaneously. This is what Andrej Karpathy calls the "Council of AI." The consensus is where I find confidence, but the disagreements are where the real insight happens. Think of it as consulting multiple specialists, each with their own strengths.
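To make the council idea concrete, here is a minimal Python sketch of the bookkeeping: ask every model the same question, then separate the majority answer from the dissent. Everything in it is an assumption made for illustration. The function names (normalize, convene_council) are hypothetical, the "models" are stand-in callables rather than real API clients, and exact-match grouping only works for short, constrained answers (yes/no, a drug name, a number). For free-text answers you would still read the responses side by side.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def normalize(answer: str) -> str:
    """Collapse case and whitespace so trivially different answers compare equal."""
    return " ".join(answer.lower().split())


def convene_council(question: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, object]:
    """Ask every model the same question and group the models by the answer they gave."""
    if not models:
        raise ValueError("The council needs at least one model.")

    groups: Dict[str, List[str]] = defaultdict(list)
    for name, ask in models.items():
        groups[normalize(ask(question))].append(name)

    # Largest group first; sorted() is stable, so ties keep the order models were asked in.
    ranked = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
    consensus_answer, supporters = ranked[0]
    return {
        "consensus": consensus_answer,
        "supporters": supporters,
        "unanimous": len(ranked) == 1,
        "dissent": {answer: names for answer, names in ranked[1:]},
    }


# Hypothetical stand-ins: in practice each callable would wrap a real chat tool or API client.
stub_models = {
    "model_a": lambda q: "Yes",
    "model_b": lambda q: "yes ",
    "model_c": lambda q: "No",
}

result = convene_council("Is the trial adequately powered?", stub_models)
print(result["consensus"], result["supporters"], result["dissent"])
```

The "dissent" entry is the part worth reading closely: it tells you which model disagreed and what it said, which is where the follow-up questions come from.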


1. The Daily Driver: Claude (Anthropic)

Claude is my daily driver. It feels the most mature for the kind of work I do - scientific writing, logic, research synthesis. While other models are flashier, Claude feels the most capable of handling nuance, especially in clinical contexts. In my experience, its ability to follow complex, multi-layered instructions (like maintaining a specific clinical tone while tightening statistical reporting) is unmatched.

What I really like about Claude is the ability to set up Projects, Skills, and Connectors. This is how you elevate your game, because these features are how you stop re-prompting the same context every single time. You set up a project once, with your instructions, your documents, and your preferences, and every conversation within that project inherits that context automatically. If you're only going to learn one advanced feature, make it this one. For a deeper dive on setting up projects and using context effectively, check out our guide on prompt engineering for clinicians.

Try this: Build a Manuscript Reader project that critiques your drafts. It takes 10 minutes to set up and permanently changes how you review papers.

The other feature worth highlighting is Extended Thinking. For complex logical problems (study design, statistical reasoning, multi-step analysis), Claude's extended thinking mode allows the model to "pause" and work through a reasoning chain before outputting an answer. When a problem requires deep logic, this mode is currently the gold standard.

For models: I use Sonnet 4.5 for about 90% of my daily tasks. It's fast, it's sharp, and it handles most things well. I switch to Opus 4.5 for any coding tasks or anything that requires much more raw power.

2. The Power Researcher: ChatGPT (OpenAI)

GPT-5.2 is really good now. It lets you toggle between a faster mode and a deeper-thinking mode depending on the task, which is great: you can switch between speed and depth without leaving the conversation.

I mostly use ChatGPT for deep research and agent mode. When I need to search a topic exhaustively, agent mode is still pretty good, and clearly better than a standard search. In its "Thinking Mode," it is exceptional at solving multi-step computational problems.

Another useful feature is Canvas, where you co-write with the model. It gives you a document-style workspace, much like a Word document, where you can highlight sections and ask the AI to work on particular parts while you write other parts. This is genuinely useful for manuscripts and long-form writing.

Voice is where ChatGPT really shines. Of all the premium subscriptions I've tried, it has the best voice model available right now. It feels natural in conversation, whereas Gemini is honestly not that good for voice. There's also a recording button for stream-of-consciousness dictation if you don't want to use a separate app like SuperWhisper. You just click on the microphone and talk, and it uses Whisper to transcribe your speech and build up your thoughts into structured text.

ChatGPT also has the best iPhone app of the three, and it's the one I use most, largely because it connects to Apple Shortcuts. It is the most "connected" tool in my kit, serving as the interface for my mobile workflow. You can trigger complex workflows, like summarizing a case from a voice note or drafting an email, directly from your lock screen or home screen.

3. The Big Context Specialist: Gemini (Google)

I use Gemini quite a bit. For information that requires grounding in the real world - latest clinical trial results, updated guidelines, or verifying a source - Gemini 2.5 Pro is my primary choice. By integrating Google's search infrastructure directly into the model, it provides real-time referencing that is significantly more reliable than its competitors. It offers citations that actually check out, making it an indispensable medical search engine.

My hunch is that it has more world knowledge than Claude or GPT because of the training data that Google has access to and other labs don't. The sheer scale of what Google indexes gives it a different kind of knowledge.

Gemini 2.5 Flash is really good for data extraction and data wrangling in general. One nice thing about Gemini: if you have a .edu email, you can get a subscription for free. You can use it to summarize YouTube videos if you don't want to watch a full lecture, or to summarize emails, documents, whatever you need.

The context window (the amount of text an AI can process in a single conversation) on Gemini is much larger, so if you are working with larger files, larger PDFs, or larger documents, it's much better to use Gemini compared to ChatGPT or Claude. This matters when you're trying to synthesize across multiple papers or work with long datasets.
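A quick way to judge whether a stack of papers will fit in one conversation is to estimate tokens from character counts. The Python sketch below assumes the common rule of thumb of roughly four characters per token for English prose; real tokenizers vary, so treat the result as a ballpark. The function names, the 2,000-token reserve, and the two window sizes are illustrative assumptions, not the limits of any particular model. Check your provider's documentation for the actual numbers.

```python
from typing import Dict

CHARS_PER_TOKEN = 4  # rough heuristic for English prose; an assumption, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_window(documents: Dict[str, str], window_tokens: int, reserve_tokens: int = 2000) -> bool:
    """Return True if all documents, plus a reserve for the prompt and reply, fit in the window."""
    total = sum(estimate_tokens(body) for body in documents.values())
    return total + reserve_tokens <= window_tokens


# Placeholder text standing in for extracted PDF contents.
papers = {
    "paper_1": "a" * 40_000,   # about 10,000 estimated tokens
    "paper_2": "b" * 120_000,  # about 30,000 estimated tokens
}

# Hypothetical window sizes, chosen only to show both outcomes.
print(fits_in_window(papers, window_tokens=32_000))   # False
print(fits_in_window(papers, window_tokens=200_000))  # True
```

If the estimate lands near the limit, assume it will not fit: tables, references, and non-English text tend to use more tokens than the heuristic predicts.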

Additionally, the same subscription gives you access to NotebookLM, which is one of the best AI products I've used consistently over the past couple of years. You can create notebooks based on your own PDF documents and chat with them, grounded in YOUR materials, not the internet. You can pull together relevant resources, build study guides, and even generate audio overviews for learning on the go. More on how I use NotebookLM here.

I think everybody should be experimenting with NotebookLM by now. It has a ton of new features and is one of the Google team's clear wins.

4. The Input Revolution: Dictation

This is a paradigm shift, not just another tool category. I rarely type anymore.

SuperWhisper is on all of my devices, and it's my main form of communication with computers nowadays. It runs OpenAI's Whisper model locally on my machine, allowing me to dictate into any application - emails, drafts, or quick notes - at three to four times the speed of typing. I don't use it for clinical notes (check your institution's policies), but I use it for everything else. You can set up modes: for example, an email mode where you just talk and it transcribes with the formatting done for you, or a note mode for quick thoughts. The ability to just speak and have polished text appear is genuinely transformative once you commit to it.

An alternative worth mentioning is Wispr Flow, which does similar things if SuperWhisper isn't your style.

I also use Granola fairly often for meetings. It records and summarizes so that you can stay present and actually engage in the conversation instead of frantically taking notes.

5. The Second Brain: Notion

This one is a little more intermediate to advanced, but Notion is one of my main daily drivers for project management and serves as my second brain. Everything lives here: research projects, deadlines, meeting notes, my personal knowledge base, and task tracking.

I've been converted on the use of Notion AI, which honestly felt like a pre-baked, marketing-driven feature originally. But it now really has a lot of context and can do a lot of things. The synergy between Notion and Claude is a particular force multiplier, allowing me to pipe project context directly into a conversation for deeper analysis. Especially if it's linked to Opus 4.5: I'm able to add pages, dump stream-of-consciousness notes, and have it rearrange and clean them for me. It builds databases and structures that are intuitive. The combination of Notion as an organizational system plus AI that actually understands your workspace is powerful.

6. Clinical Pillars

For clinical decision support, I've shifted away from general-purpose search entirely. Purpose-built tools for physicians are significantly more reliable.

OpenEvidence is one of my pillars. I've stopped using any other way of finding papers for clinical questions. It gives clinical answers grounded in the medical literature, with traceable citations you can actually trust.

Honorable mentions: Scholar Labs by Google, which seems to have better search in some cases, and Doximity GPT, which does quite a good job with access to non-open-access papers, the kind behind paywalls that general AI tools can't reach.

Both Doximity and OpenEvidence now have scribe options. My strong recommendation: don't use either of them if you're in an academic practice. Even if they say they are HIPAA compliant, they might not comply with your institution's specific rules. HIPAA compliance and institutional compliance are not the same thing. Check twice. Ask your institution what resources are actually available to you before you start recording clinical encounters with any AI tool.

7. Presentations and Visual Content

For presentations, I use Gamma a lot. You describe what you want, or drop in an outline, and within minutes it creates a polished slide deck with layout, imagery, and structure. It has replaced my struggle with PowerPoint. Gamma for slides plus Gemini or NotebookLM for content synthesis is an incredibly powerful combination: use NotebookLM to synthesize your source material, then feed that synthesis into Gamma to generate the deck. It is genuinely one of the most efficient workflows I've found.

For image generation, Nano Banana Pro is my go-to right now. It's phenomenal, but it needs extensive prompting: you can't just say "make me a diagram." The trick is to match your prompt detail to the complexity of what you're trying to create. The more vivid and specific your description, the better the output.

One workflow I use: dictate what I want into SuperWhisper, describing the image as vividly as possible (for example, "a BioRender-style infographic showing different types of MRI sequences with labeled anatomy and color-coded signal characteristics"). Then pass that description through Gemini to refine and expand it into a proper image generation prompt. This two-step process, describe then refine, consistently produces better results than trying to write the prompt from scratch.
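The "refine" step above amounts to wrapping the dictated description in a short set of instructions before handing it to a text model. Here is a minimal Python sketch of that wrapper. The function name (build_refinement_request) and the instruction wording are my own illustrative assumptions, not a fixed recipe; the returned string is simply what you would paste into Gemini or any other chat model.

```python
def build_refinement_request(spoken_description: str, style: str = "BioRender-style infographic") -> str:
    """Wrap a rough, dictated description in instructions asking a text model to expand it."""
    # Dictation often leaves stray line breaks and double spaces; collapse them first.
    description = " ".join(spoken_description.split())
    return (
        "Rewrite the description below as a detailed image generation prompt.\n"
        f"Target style: {style}.\n"
        "Spell out layout, labels, color coding, and anything that should be left out.\n"
        "Return only the finished prompt.\n\n"
        f"Description: {description}"
    )


dictated = """different types of MRI sequences   with labeled anatomy
and color-coded signal characteristics"""

print(build_refinement_request(dictated))
```

Keeping the instruction block fixed and changing only the description makes it easier to see which part of a prompt is responsible when an image comes out wrong.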

A few tips from experience:

  • Beware of quality degradation. If you keep iterating back and forth with the model ("change this, now change that"), the image quality degrades over time. Get your prompt right first, then generate.
  • Generate components, not final products. Instead of asking for one complex image, generate individual icons, diagrams, or subsets of images. Save those and import them into Canva or BioRender where you have more editing control over layout and annotation.
  • It's a starting point, not a finished product. The AI gives you 80% of the way there. The last 20% (layout, labels, positioning) is often better done in a proper design tool.

8. Developer Tools

For the technical side of my work, Cursor and Claude Code have transformed how I build clinical tools, allowing me to write and debug code using natural language. These tools make it possible to create custom solutions for research and clinical problems without being a professional programmer.


Quick Reference

Core Assistants

  • Writing and Logic: Claude (Sonnet 4.5) with Projects/Skills for persistent context, Extended Thinking for hard problems
  • Coding and Raw Power: Claude (Opus 4.5) when Sonnet isn't enough
  • Deep Research and Agents: ChatGPT (GPT-5.2) agent mode, Canvas for co-writing
  • Voice and Mobile: ChatGPT iPhone app, best voice model, Apple Shortcuts integration
  • Data Extraction: Gemini (2.5 Flash) for data wrangling and structured extraction
  • Large Documents: Gemini (2.5 Pro) with the biggest context window, YouTube summaries, free with .edu
  • Document Synthesis: NotebookLM to chat with your own PDFs, audio overviews

Specialization

  • Dictation: SuperWhisper for voice-first input with formatting modes (3-4x typing speed)
  • Meetings: Granola to record and summarize while you stay present
  • Second Brain: Notion + AI (Opus 4.5) for project management and knowledge base
  • Clinical Search: OpenEvidence for evidence-grounded paper finding with citations
  • Paywalled Papers: Doximity GPT for access to non-open-access literature
  • Image Generation: Nano Banana Pro with extensive prompting, import components to Canva/BioRender
  • Presentations: Gamma plus Gemini/NotebookLM for content-to-slides pipeline
  • Coding: Cursor and Claude Code for AI-powered clinical tool development

The landscape changes monthly. This is what's working for me right now. Experiment, find what fits your workflow, and don't get locked into one tool. The best stack is the one you'll actually use.

New to all of this? Start with our Beginner's Path.
