
AI for Research Ideation and Brainstorming

Ramez Kouzy

The hardest part of research isn't running experiments or writing papers. It's asking the right question.

Coming up with novel, feasible, important research questions requires creativity constrained by reality. You need to understand the literature deeply enough to identify gaps, but think divergently enough to see connections others miss.

AI helps with this. Not because it has original ideas — it doesn't. But because it forces you to articulate your thinking, challenges your assumptions, and surfaces alternatives you wouldn't have considered.

I use AI as a brainstorming partner. I come with the domain knowledge and judgment. AI comes with pattern recognition across millions of papers and the ability to generate variations endlessly.

AI suggests, you evaluate. Every idea needs your critical judgment.

For a complete overview of using AI across the research lifecycle, see the LLM Research Guide.

When AI Brainstorming Works (and When It Doesn't)

AI Is Excellent For:

  1. Generating variations — Given one idea, produce 20 related approaches
  2. Identifying gaps — Spotting unstudied populations, settings, or interventions
  3. Connecting concepts — Linking ideas from different fields
  4. Study design feedback — Finding methodological weaknesses
  5. Hypothesis generation — Turning observations into testable questions
  6. Literature-based ideation — Synthesizing themes into new directions

AI Is Useless For:

  1. Judging importance — Can't tell meaningful questions from trivia
  2. Assessing feasibility — Doesn't know your resources, access, skills
  3. Original insight — Recombines existing ideas, doesn't create fundamentally new ones
  4. Domain-specific judgment — Can't evaluate clinical relevance without your input
  5. Knowing cutting-edge work — Training data is old, misses recent breakthroughs

Rule: Use AI to generate possibilities. You supply judgment, feasibility assessment, and importance ranking.

The Rubber Duck Debugging Approach

In programming, "rubber duck debugging" means explaining your code to an inanimate rubber duck. The act of articulating the problem often reveals the solution.

AI works the same way for research questions. Explaining your idea to AI forces clarity. AI's questions reveal assumptions you haven't examined.

Workflow:

  1. Describe research idea to AI (be specific)
  2. AI asks clarifying questions
  3. You refine based on questions
  4. AI suggests alternatives or concerns
  5. You iterate

The value isn't AI's answers — it's the process of articulation and iteration.
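
If you prefer scripting this loop over a chat window, here's a minimal sketch using the Anthropic Python SDK. The model name and system prompt are illustrative assumptions; any chat interface works just as well.

import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

SYSTEM = (
    "You are a research brainstorming partner. When given a research idea, "
    "ask clarifying questions, point out unexamined assumptions, and suggest "
    "alternatives or concerns. Do not invent citations."
)

messages = [{"role": "user",
             "content": input("Describe your research idea (be specific): ")}]

for _ in range(5):  # a handful of refinement rounds is usually enough
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model name
        max_tokens=1024,
        system=SYSTEM,
        messages=messages,
    )
    text = reply.content[0].text
    print(f"\nAI:\n{text}\n")
    answer = input("Your refinement (press Enter to stop): ").strip()
    if not answer:
        break
    messages += [{"role": "assistant", "content": text},
                 {"role": "user", "content": answer}]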

Practical Workflow 1: From Observation to Hypothesis

You observe: Patients treated with concurrent medication X seem to have worse outcomes.

Prompt to Claude or ChatGPT-4:

I'm a clinical researcher. I've noticed that patients receiving Drug X 
while undergoing Treatment Y seem to have worse outcomes than patients not 
receiving Drug X.

Help me develop this observation into a testable research hypothesis:
1. What are plausible biological mechanisms?
2. What confounders might explain this association?
3. What would a well-designed study look like?
4. What outcomes should I measure?
5. What's already known about Drug X and Treatment Y interactions?

AI will generate:

  • Potential mechanisms (you evaluate plausibility)
  • Confounders you might not have considered (you assess relevance)
  • Study design options (you judge feasibility)
  • Outcome suggestions (you pick clinically meaningful ones)

Critical: AI doesn't know if Drug X-Treatment Y interaction has been studied. Verify literature yourself.

What you do next:

  1. Search literature on Drug X-Treatment Y (use AI literature tools; a PubMed search sketch follows below)
  2. If understudied: refine hypothesis with AI
  3. If well-studied: pivot to understudied aspect (different population, setting, mechanism)
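
For step 1, a quick sanity check against PubMed via NCBI's free E-utilities API takes a few lines; the search terms below are placeholders for your actual drug and treatment.

import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "pubmed",
    "term": '"drug x"[Title/Abstract] AND "treatment y"[Title/Abstract]',  # placeholders
    "retmode": "json",
    "retmax": 20,
}

result = requests.get(ESEARCH, params=params, timeout=30).json()["esearchresult"]
print(result["count"], "PubMed records found")
print("PMIDs:", result.get("idlist", []))
# Few or no hits suggests the interaction is understudied; many hits means
# pivot to an understudied population, setting, or mechanism (step 3).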

Practical Workflow 2: Identifying Research Gaps

You have: 10-15 key papers in an area

Prompt:

I'm researching [TOPIC]. I've reviewed these key papers:

[List 10-15 citations with 1-sentence summaries]

Based on this literature:
1. What questions remain unanswered?
2. What populations or settings are understudied?
3. What methodological approaches haven't been tried?
4. What contradictions or inconsistencies appear?
5. What adjacent fields might provide relevant insights?

For each gap, suggest a specific, feasible research question.

AI excels at pattern recognition across papers. It will identify:

  • Populations excluded from studies
  • Outcomes measured inconsistently
  • Mechanistic questions left unaddressed
  • Design limitations repeated across studies

Example output (edited):

Gap 1: Most studies focus on high-resource settings. Low-resource 
implementation is understudied.
→ Research question: "Can this intervention be adapted for 
resource-limited settings with comparable efficacy?"

Gap 2: Studies measure short-term outcomes (30-90 days) but long-term 
durability is unknown.
→ Research question: "Does the intervention effect persist at 1-year 
and 2-year follow-up?"

Gap 3: Mechanisms remain speculative. No studies directly measure 
hypothesized pathway.
→ Research question: "Does the intervention affect [biomarker] as 
hypothesized?"

You evaluate:

  • Which gaps are important (not all are)
  • Which are feasible with your resources
  • Which align with your expertise and interest

Practical Workflow 3: Designing a Study

You have: A research question you want to study

Prompt:

I want to study: "[SPECIFIC RESEARCH QUESTION]"

Help me design a rigorous study:

1. What study design is most appropriate (RCT, cohort, case-control, 
cross-sectional)?
2. What are the key inclusion/exclusion criteria?
3. What's a reasonable sample size and how should I justify it?
4. What are potential confounders I need to measure or control?
5. What are the biggest threats to validity?
6. What outcomes should I measure (primary and secondary)?
7. What's the analysis plan?

For each choice, explain the rationale and trade-offs.

AI provides:

  • Design options with pros/cons (you choose based on feasibility)
  • Potential confounders (you assess which matter in your context)
  • Validity threats (you decide which are addressable)

What AI gets wrong:

  • Sample size calculations (often oversimplified; see the power-calculation sketch below)
  • Feasibility (doesn't know your setting)
  • Clinical relevance of outcomes (you must judge)
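
Whatever sample size AI suggests, rerun the numbers yourself before building a protocol around them. A minimal sketch with statsmodels, assuming a two-arm comparison of proportions with illustrative event rates:

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

p_control, p_treatment = 0.30, 0.20            # assumed event rates
effect = proportion_effectsize(p_treatment, p_control)

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect,
    alpha=0.05,
    power=0.80,
    ratio=1.0,                                 # equal allocation
    alternative="two-sided",
)
print(f"~{n_per_arm:.0f} participants per arm")

Even then, confirm the assumptions (event rates, dropout, multiplicity) with a statistician before committing.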

Iteration:

Option 1 (RCT) isn't feasible because we can't randomize ethically. 
Option 2 (prospective cohort) would take 3 years and we don't have funding.

Given these constraints, refine the study design. Consider:
- Retrospective cohort using existing data
- Case-control design to reduce sample size needs

AI adapts to constraints you specify. You provide domain knowledge and feasibility assessment.

Practical Workflow 4: Grant Idea Development

You have: A vague grant idea

Prompt:

I want to write a grant studying [BROAD TOPIC]. I'm interested in 
[specific angle] but haven't crystallized a specific question yet.

Help me develop this into a fundable grant idea:
1. What are 3-5 specific aims I could pursue?
2. What's the clinical/scientific significance?
3. What innovation does this represent?
4. What preliminary data would strengthen the proposal?
5. What are potential pitfalls and how could I address them?

Consider that I have: [YOUR RESOURCES: patient population, collaborators, 
equipment, data, etc.]

AI generates a structured grant outline:

  • Multiple specific aims (you pick most compelling)
  • Significance arguments (you refine for relevance)
  • Innovation angles (you verify they're actually novel)

Critical: AI doesn't know what's been funded recently. Check NIH RePORTER or NSF awards database to verify novelty.
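
A rough sketch of that novelty check against the NIH RePORTER v2 API (the payload field names reflect the public API docs as I recall them; verify against the current documentation):

import requests

URL = "https://api.reporter.nih.gov/v2/projects/search"  # NIH RePORTER v2 API

payload = {
    "criteria": {
        "advanced_text_search": {
            "operator": "and",
            "search_field": "projecttitle,terms,abstracttext",
            "search_text": "your grant topic keywords here",  # placeholder
        }
    },
    "limit": 25,
}

projects = requests.post(URL, json=payload, timeout=30).json().get("results", [])
for p in projects:
    print(p.get("fiscal_year"), p.get("project_title"))
# If the top hits read like your Aim 1, your innovation section needs work.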

Iteration:

Aim 2 has already been studied (citation). Aim 3 isn't feasible with our 
sample size. Refine the grant around Aim 1 and suggest a new complementary aim.

Practical Workflow 5: Hypothesis Generation from Data

You have: An unexpected finding in preliminary data

Prompt:

In preliminary analysis, I found [UNEXPECTED RESULT]. 

For example: "Patients with higher baseline biomarker X had worse outcomes, 
but only in the treatment group, not in controls."

Help me generate hypotheses that could explain this:
1. What biological mechanisms could explain this pattern?
2. What confounders might create this spurious association?
3. How could I test each hypothesis?
4. What additional data would I need?
5. What's the clinical significance if this finding holds up?

AI brainstorms mechanisms:

  • Plausible biology (you evaluate with domain knowledge)
  • Confounders (you check if measured in your data)
  • Testable predictions (you assess feasibility)

This workflow helps you: Move from observation → hypothesis → study design

Practical Workflow 6: Borrowing Ideas Across Fields

You're stuck: Your field seems saturated, all questions seem studied

Prompt:

I work in [YOUR FIELD]. I'm looking for novel approaches. 

What methodological or conceptual innovations from other fields could 
apply to [SPECIFIC PROBLEM]?

Consider approaches from:
- Machine learning / AI
- Economics / behavioral science
- Engineering / signal processing
- Social sciences / network analysis
- Other medical specialties

For each, explain how it could be adapted and what new questions it 
would enable.

AI has broad pattern recognition: It can connect ideas across disciplines you might not be familiar with.

Example:

In social network analysis, researchers use centrality measures to 
identify influential nodes. This could be applied to identify "hub" 
proteins in signaling pathways or "hub" clinicians in referral networks.

You evaluate:

  • Is the analogy meaningful or superficial?
  • Is the method feasible to implement?
  • Does it answer a question anyone cares about?

Prompts That Actually Work

For Generating Variations

I'm studying [TOPIC]. Give me 15 variations on this research question 
that change:
- Population (different age, disease stage, setting)
- Intervention (different dose, timing, combination)
- Outcome (different measurement, timepoint, definition)
- Comparison (different control, active comparator)

For each variation, note what new insight it would provide.

For Challenging Assumptions

I plan to study [RESEARCH QUESTION]. 

What assumptions am I making that might be wrong? For each assumption, 
suggest how I could test it or design around it.

For Methodological Refinement

Here's my study design: [DESCRIBE DESIGN]

What are the three biggest threats to validity? How could I strengthen 
the design to address each?

For Positioning in Literature

My research question is: [QUESTION]

This builds on [KEY PAPERS]. How is my question different from what's 
been done? What unique angle or contribution does it make?

What Good Brainstorming Looks Like

You: "I want to study AI in radiology"

AI: "That's broad. What specific question about AI in radiology?"

You: "Whether AI can reduce false positives in lung cancer screening"

AI: "That's better. Have you considered: (1) Does AI reduce false positives without missing cancers? (2) Does AI-reduced false positive rate decrease unnecessary biopsies? (3) Do radiologists agree with AI recommendations? Which outcome matters most for your clinical context?"

You: "Clinical outcome is fewer unnecessary biopsies. But I don't have prospective intervention data."

AI: "Could you use retrospective data to estimate potential impact? Compare: (1) historical biopsy rate, (2) what rate would have been if AI recommendations were followed, (3) validate on held-out set. Then project clinical impact."

You: "That's feasible. What confounders should I measure?"

AI: "Consider: radiologist experience, patient smoking history, nodule size/characteristics, scanner type, patient age. Also examine if AI performance differs by subgroup — you don't want AI that works for some patients but not others."

Productive. You're refining, AI is challenging and suggesting. You maintain control.

What Bad Brainstorming Looks Like

You: "Give me research ideas in oncology"

AI: [Generates 20 generic ideas]

You: "Tell me more about idea #7"

AI: [Elaborates vaguely]

You: "Write a grant proposal on that"

Useless. No constraints, no iteration, no judgment. You're outsourcing thinking instead of enhancing it.

Tool Selection: Claude vs ChatGPT vs Perplexity

Claude Opus (Best for Deep Brainstorming)

Strengths:

  • More thoughtful, nuanced responses
  • Better at following complex constraints
  • Less likely to hallucinate citations
  • Longer responses with structured thinking

Use Claude when: You want deep exploration of an idea, need careful reasoning, or have complex constraints

ChatGPT-4 (Best for Rapid Iteration)

Strengths:

  • Faster responses
  • Good at generating many variations quickly
  • More creative (sometimes too creative)
  • Better for lists and structured brainstorming

Use ChatGPT when: You want volume of ideas, need fast iteration, or want divergent thinking

Perplexity Pro (Best for Literature-Grounded Brainstorming)

Strengths:

  • Cites sources while brainstorming
  • Good for "what's known about X?" questions
  • Searches recent literature as part of ideation

Use Perplexity when: You want ideas grounded in recent literature or need citations for background

My workflow: Start with ChatGPT for volume → refine with Claude for depth → verify with Perplexity for literature grounding
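
A rough sketch of that volume-then-depth handoff as a script (model names are placeholders; the Perplexity verification step I still do by hand):

from openai import OpenAI
import anthropic

openai_client = OpenAI()               # OPENAI_API_KEY in the environment
claude_client = anthropic.Anthropic()  # ANTHROPIC_API_KEY in the environment

topic = "reducing 30-day readmissions in heart failure"  # illustrative topic

# Step 1: volume. Ask for many quick variations.
variations = openai_client.chat.completions.create(
    model="gpt-4o",  # placeholder
    messages=[{"role": "user",
               "content": f"Give me 15 distinct research questions about {topic}, "
                          "varying population, intervention, outcome, and comparison."}],
).choices[0].message.content

# Step 2: depth. Hand the list to Claude for critique and ranking.
critique = claude_client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder
    max_tokens=2000,
    messages=[{"role": "user",
               "content": "Critique these candidate research questions for novelty, "
                          "feasibility, and likely confounders. Rank the top three:\n\n"
                          + variations}],
).content[0].text

print(critique)
# Step 3: take the survivors to Perplexity or PubMed to check the recent
# literature, then to human collaborators.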

The "Yes, And" vs "Yes, But" Mindset

In brainstorming phase: "Yes, and"

  • AI suggests idea → you build on it
  • Generate volume of possibilities
  • Don't evaluate too early
  • Suspend judgment temporarily

In evaluation phase: "Yes, but"

  • Apply feasibility constraints
  • Assess importance and novelty
  • Reality-check assumptions
  • Choose what to pursue

Common mistake: Evaluating too early. Generate broadly, then narrow.

Journaling Your Brainstorming

Keep a research ideas log. When AI generates interesting angles, save them.

Format I use:

Date: 2026-02-02
Topic: AI for lung cancer screening
Brainstorm session: ChatGPT
Key insights:
- Could focus on reducing false positives specifically
- Subgroup analysis by nodule characteristics
- Economic angle: cost of unnecessary biopsies
Questions raised:
- What's baseline false positive rate in our institution?
- Do we have retrospective data with AI scores?
Follow-up:
- [ ] Check literature on AI false positive rates
- [ ] Talk to radiology about data access
- [ ] Consider grant feasibility
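
If you want the log to stay greppable, here is a minimal sketch that appends entries in the same format (the filename and fields are just my conventions):

from datetime import date
from pathlib import Path

LOG = Path("research_ideas_log.txt")  # hypothetical filename

def log_session(topic, tool, insights, questions, followups):
    entry = [
        f"Date: {date.today().isoformat()}",
        f"Topic: {topic}",
        f"Brainstorm session: {tool}",
        "Key insights:",
        *[f"- {item}" for item in insights],
        "Questions raised:",
        *[f"- {item}" for item in questions],
        "Follow-up:",
        *[f"- [ ] {item}" for item in followups],
        "",
    ]
    with LOG.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(entry) + "\n")

log_session(
    topic="AI for lung cancer screening",
    tool="ChatGPT",
    insights=["Could focus on reducing false positives specifically"],
    questions=["What's our baseline false positive rate?"],
    followups=["Check literature on AI false positive rates"],
)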

Over time, patterns emerge. Ideas you dismissed months ago become feasible with new collaborations or resources.

Limitations and Warnings

  1. AI doesn't know what's important — You judge clinical/scientific significance
  2. AI doesn't know what's feasible — You assess resources, access, skills
  3. AI doesn't know what's been done recently — Training data lags, verify novelty
  4. AI generates plausible-sounding nonsense — Critical evaluation required
  5. AI can't replace domain expertise — Brainstorming requires your knowledge as foundation

When to Brainstorm with Humans Instead

AI is useful, but humans are essential. Use human brainstorming for:

  • Early-stage ideation — Colleagues know context, resources, politics
  • Interdisciplinary questions — Humans bring genuine expertise from other fields
  • Feasibility assessment — Mentors know what's realistic
  • Importance judgment — Peers know what the field needs

Best approach: Brainstorm with AI individually, then bring refined ideas to human collaborators for feedback.

Once you have a solid research question, use AI to help with literature review and grant writing.

Practical Example: Complete Brainstorming Session

Starting point: "I want to improve outcomes for heart failure patients"

Round 1: Clarify focus

Prompt: "I want to improve outcomes for heart failure patients. This is 
too broad. Help me narrow to specific angles worth exploring."

AI response: 
- Medication adherence interventions
- Transitions of care (hospital to home)
- Remote monitoring and early intervention
- Palliative care integration
- Caregiver support programs
Which resonates with your clinical experience and access to patients?

Round 2: Deepen on chosen angle

You: "Transitions of care. We have high 30-day readmission rates."

Prompt: "Focus on reducing 30-day readmissions in heart failure. What are 
evidence-based intervention components? What's understudied?"

AI response:
- Structured discharge planning (well-studied)
- Early post-discharge follow-up (well-studied)
- Medication reconciliation (well-studied)
- Patient education (well-studied)
- Caregiver involvement (understudied, especially for elderly patients)
- Addressing social determinants (understudied)

Gap: Most interventions assume patients can attend clinic. What about 
patients with transportation barriers?

Round 3: Design study

You: "That's our population — elderly with transportation barriers."

Prompt: "Design a pragmatic trial testing a home-based transition 
intervention for elderly heart failure patients with transportation 
barriers. Include: study design, intervention components, outcomes, 
feasibility considerations."

AI response: [Detailed study design]

Round 4: Assess feasibility

You: "Home visits by nurses won't be reimbursed. Refine with telemedicine 
instead."

Prompt: "Revise using video visits and phone calls instead of home visits."

AI response: [Revised design]

Total time: 30 minutes. You now have a feasible, novel study design targeting an understudied gap.

Next steps (human required):

  • Discuss with heart failure team for buy-in
  • Check if telemedicine infrastructure exists
  • Estimate sample size with statistician
  • Draft preliminary grant or pilot proposal

Key Takeaways

  • AI is a brainstorming partner, not an idea generator — you provide judgment and domain knowledge
  • Use AI to generate variations, not original ideas — it recombines, you evaluate
  • The "rubber duck" effect is real — articulating ideas to AI clarifies thinking
  • Claude for depth, ChatGPT for volume — choose tool based on brainstorming phase
  • Generate broadly ("yes, and"), then evaluate critically ("yes, but")
  • AI identifies gaps in literature — but you judge importance and feasibility
  • Study design feedback from AI is useful — but verify assumptions and consult statisticians
  • Keep a research ideas log — capture insights for later
  • AI doesn't replace human collaboration — use both, iteratively
  • Every AI-generated idea requires your critical evaluation — plausibility ≠ validity
