
Aug 29, 2025
From Prompts to Practice: What Multi-Institutional Studies Teach
Radiology has no shortage of data. Every day, millions of CTs, MRIs, and X-rays are captured and reported on. The real challenge isn’t volume; it’s curation. How do we turn narrative radiology reports into structured, labeled data that can drive model training, clinical decision support, and patient-facing tools?
A recent multi-institutional study led by Mayo Clinic and colleagues across UCSF, MGH, Emory, UCI, and Moffitt Cancer Center set out to test whether large language models (LLMs) could be used for this task. Instead of training custom models, the team focused on prompt engineering—creating carefully structured instructions to guide off-the-shelf LLMs in annotating reports.
Why Prompt Engineering Matters
Traditional natural language processing (NLP) approaches often fail when confronted with the variability of radiology reports. Some institutions rely heavily on structured templates, while others lean toward free-text narratives. Even within structured environments, radiologists frequently insert personal style or nuance that complicates downstream analysis.
LLMs, by contrast, are inherently better at adapting to this variability. The study showed that with a well-optimized prompt, models like Llama 3.1 70B could achieve impressive accuracy across sites. In some cases, performance approached human-level accuracy, particularly for well-defined findings such as cervical spine fractures.
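To make that concrete, here is a minimal sketch of what such a prompt-engineered annotation call could look like. The prompt wording, the cervical spine fracture question, the local Ollama-style endpoint at localhost:11434, and the model tag are illustrative assumptions, not the study’s actual configuration.

```python
# A minimal sketch of a prompt-engineered yes/no annotation call.
# The prompt wording, the local Ollama-style endpoint, and the model tag
# are illustrative assumptions, not the study's actual setup.
import json
import urllib.request

PROMPT_TEMPLATE = """You are annotating radiology reports.
Answer with exactly one word, "yes" or "no", and nothing else.
Question: Does this report describe an acute cervical spine fracture?

Report:
{report}

Answer:"""

def annotate(report_text: str, model: str = "llama3.1:70b") -> str:
    """Send a single report to a locally hosted model and return its raw reply."""
    payload = json.dumps({
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(report=report_text),
        "stream": False,
    }).encode("utf-8")
    request = urllib.request.Request(
        "http://localhost:11434/api/generate",  # assumed local inference server
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()

print(annotate("FINDINGS: Nondisplaced fracture of the C5 vertebral body ..."))
```

Pinning the output to a single word is what makes downstream scoring automatic; it is also, as described below, where the models most often slipped.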
What stood out was how much the nature of the finding affected performance. Conditions that hinge on clear radiologic evidence, like fractures, were reliably captured. More ambiguous findings, like pneumonia—where radiologists often hedge with “possible” or “probable”—were less consistent. This highlights that the limitation isn’t just the model, but also the inherent uncertainty in medical communication.
Structured Reporting vs Narrative Style
One of the study’s side observations was that structured reporting does help, but it isn’t a silver bullet. Centers with strong template use tended to show higher accuracy, but radiologists often still added narrative phrasing that threw off results. This suggests that while templates improve extractability, they won’t fully solve variability.
Interestingly, prompts engineered at one site sometimes performed better at another. This reflects how local style and institutional context play as much of a role as the model itself.
Lessons on AI Errors
Another important finding was the behavior of models when they got things wrong. Instead of simply answering “yes” or “no” as instructed, models sometimes generated paragraphs of explanation. Even when the content was technically correct, these outputs were counted as failures because they ignored the prompt format.
The team also noted that chat-based models struggled more than instruction-tuned versions. Chat interfaces often aim to be conversational, which led to irrelevant elaborations instead of concise answers. Instruction-tuned prompts yielded far more reliable outputs.
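Here is a rough sketch, under the assumption that the prompt asked for a one-word answer, of how that strict scoring might be applied; the helper name and example replies are invented for illustration.

```python
# A minimal sketch of strict format checking, assuming the prompt asked for a
# one-word "yes"/"no" answer. Replies that deviate from that format are treated
# as failed annotations, even when a longer explanation happens to be correct.
from typing import Optional

def parse_binary_answer(raw_reply: str) -> Optional[str]:
    """Return 'yes' or 'no' when the reply follows the instructed format, else None."""
    answer = raw_reply.strip().lower().rstrip(".")
    if answer in ("yes", "no"):
        return answer
    return None  # verbose, conversational, or off-topic reply: discarded

print(parse_binary_answer("Yes."))                                    # -> yes
print(parse_binary_answer("The opacity could represent pneumonia."))  # -> None
```

The same check covers the hallucination cases discussed next: when a model drifts off topic, its reply simply never parses, so the annotation is discarded rather than trusted.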
The Hallucination Problem
One of the most persistent challenges the researchers faced was hallucination. When uncertain, the models often abandoned the simple yes/no format and produced verbose, off-topic replies. These outputs couldn’t be trusted, so they were discarded from the results.
This is a reminder that LLMs don’t always “fail gracefully.” Instead of signaling uncertainty, they can overcompensate with confident but irrelevant text. For clinical applications, this kind of behavior is not just inconvenient—it’s potentially dangerous.
In the podcast conversation, Dr. Mana Moassefi explains in more detail how her team managed these hallucinations.
Why This Matters
The implications go beyond annotation. If LLMs can reliably label reports, they can create massive, curated datasets that fuel the next generation of AI imaging models. They could also be extended into patient portals, offering lay-friendly summaries of findings while flagging when clinical correlation is required.
Looking further ahead, the concept of agentic AI—where multiple specialized models collaborate, such as one extracting diagnoses, another quantifying uncertainty, and another communicating risk—could reshape how radiology findings are shared with both clinicians and patients.
Transparency will remain a challenge. AI reasoning doesn’t mirror human logic, and sometimes validation will matter more than interpretability. But as this study shows, careful design can move us closer to trustworthy, reproducible outcomes.
The Future of AI in Radiology
Radiologists often worry about AI as a replacement. A more constructive framing is that AI will shift the field from simply diagnosing toward improving screening, consistency, and patient communication. Humans remain essential, but tools like LLMs can extend their reach.
This work demonstrates that big insights can come not from building bigger models, but from designing better prompts, running multi-institutional collaborations, and tackling real-world data variability head-on.
If you want to hear a deeper dive into the study and its broader implications, you can check out my conversation with Dr. Mana Moassefi on Imaging Informatics Unplugged.