Grounded AI That Knows Its Ground: A New OCT-Powered Coach Elevates PCI Planning Beyond General Models

CA-GPT anchors AI reasoning to OCT images and uses Retrieval-Augmented Generation (RAG) to continuously assess PCI decisions in real time. This post distills the COMPARE study, showing CA-GPT outperforming ChatGPT-5 and junior operators across tasks, with the biggest gains in complex scenarios.

Percutaneous coronary intervention (PCI) is a life-saving procedure, but getting it right hinges on reading inside the artery with optical coherence tomography (OCT) and translating that image into precise device choices. That’s a tall order even for seasoned operators, and it’s exactly where artificial intelligence (AI) is stepping in as a real-time decision helper. A new study introduces a domain-specific, RAG-augmented AI-OCT system named CA-GPT, designed to plan and assess OCT-guided PCI. The headline: CA-GPT outperforms a general-purpose model (ChatGPT-5) and junior operators across a range of decision tasks, especially in complex scenarios.

If you’ve ever wondered how AI could be tamed to work with the fast-moving, high-stakes world of interventional cardiology, this study is a compelling read. Here’s what it means, in plain language, and what it could mean for the practice and training of PCI going forward.


What’s the big idea here?

OCT gives a detailed, high-resolution cross-sectional view of plaque, calcium, and stent fit inside the artery. But interpreting OCT isn’t trivial. There’s notable variability among readers, especially for less-experienced clinicians, and that variability can affect the quality of PCI — from device sizing to ensuring the stent sits squarely and fully expands.

Enter CA-GPT: a purpose-built AI-OCT system that combines two key ideas:

  • A small, specialized OCT analysis layer that does concrete measurement tasks (lumen segmentation, plaque characterization, stent apposition, OCT-based FFR computation, etc.).
  • A large-language-model (DeepSeek-R1) layer that reasons over OCT outputs and guidelines, but with a crucial twist: it uses retrieval-augmented generation (RAG). RAG grounds AI reasoning in a curated knowledge base that includes current guidelines and tens of thousands of annotated PCI cases, reducing “hallucinations” and keeping outputs evidence-based.

In short: CA-GPT is designed to be an OCT intuition engine, plus a grounded knowledge backbone that keeps recommendations aligned with real-world guidelines and data.


How the study was set up (the quick version)

  • Setting: A single center (Tangdu Hospital, Fourth Military Medical University, China) analyzed 96 patients who underwent OCT-guided PCI, covering 160 lesions.
  • Comparators:
    • CA-GPT: The domain-specific AI-OCT system.
    • ChatGPT-5: A general-purpose AI model used as a baseline comparator.
    • Junior physicians: Interventional cardiologists with 1–5 years of PCI experience, interpreting OCT on their own.
  • Reference standard: The actual, expert-operated procedural records adjudicated by senior PCI experts.
  • What was measured: Ten predefined decision metrics split evenly between the pre-PCI (planning) and post-PCI (assessment) phases, with each metric scored 0 or 1. This gave a total agreement score of 0–5 per phase for each case and each method.
  • Primary endpoint: Overall agreement score against the expert standard.
  • What was compared: The agreement scores across CA-GPT, ChatGPT-5, and junior physicians, plus performance on individual metrics and subgroup analyses (e.g., lesion location, ischemia by OCT-FFR, ACS vs SCAD, calcium severity).

Key point: This is retrospective and single-center, so it’s a strong signal about potential, not a definitive multi-center validation yet.
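To make the scoring scheme concrete, here is a minimal sketch of the per-case 0/1 agreement tally described above. The metric names are hypothetical placeholders for illustration, not the study's exact labels:

```python
# A minimal sketch of the 0/1 agreement scoring described above.
# Metric names are illustrative placeholders, not the study's exact labels.

PRE_PCI_METRICS = [
    "pretreatment_device_type",
    "pretreatment_device_sizing",
    "stent_diameter",
    "stent_length",
    "landing_zone",  # hypothetical fifth planning metric
]

def agreement_score(method: dict, expert: dict, metrics: list) -> int:
    """Award 1 point per metric where the method matches the expert record (max 5)."""
    return sum(
        1 for m in metrics
        if m in method and m in expert and method[m] == expert[m]
    )

expert = {"pretreatment_device_type": "NC balloon", "pretreatment_device_sizing": 3.0,
          "stent_diameter": 3.0, "stent_length": 28, "landing_zone": "distal"}
plan = {"pretreatment_device_type": "NC balloon", "pretreatment_device_sizing": 3.0,
        "stent_diameter": 3.0, "stent_length": 24, "landing_zone": "distal"}
print(agreement_score(plan, expert, PRE_PCI_METRICS))  # 4 of 5 metrics agree
```

Medians and interquartile ranges of these per-case totals are what the headline results below report.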


The headline results (what actually happened)

Pre-PCI planning (the “let’s plan this case” phase)

  • CA-GPT led the pack: median total pre-PCI agreement score of 5 (IQR 3.75–5).
  • ChatGPT-5: median 3 (IQR 2–4).
  • Junior physicians: median 4 (IQR 3–4).
  • Statistically, CA-GPT outperformed both comparators (P < 0.001 for CA-GPT vs both others).
  • Specific metrics where CA-GPT shined:
    • Pretreatment device type: 73.6% agreement for CA-GPT vs 37.5% for ChatGPT-5 and 61.1% for juniors.
    • Pretreatment device sizing: 70.8% vs 40.3% (ChatGPT-5) and 61.1% (juniors).
    • Stent diameter: 90.3% agreement for CA-GPT vs 63.9% (ChatGPT-5) and 72.2% (juniors).
    • Stent length: 80.6% vs 54.2% (ChatGPT-5) and 52.8% (juniors).

Post-PCI assessment (the “how did we do after the work is done?” phase)

  • CA-GPT again led in overall agreement: median 5 (IQR 4.75–5).
  • ChatGPT-5: median 4 (IQR 4–5).
  • Junior physicians: median 5 (IQR 4–5) — still solid, but CA-GPT was significantly higher than ChatGPT-5 and showed superiority over juniors in certain metrics.
  • Notable metrics:
    • Minimum stent area (MSA): 100% agreement across CA-GPT and ChatGPT-5; juniors were at 95.5%.
    • Stent expansion: CA-GPT 78.4% vs ChatGPT-5 33.0% (significant difference) vs juniors 84.1% (comparable to CA-GPT for this metric).
    • Stent apposition: CA-GPT 93.2% vs juniors 76.1% (CA-GPT outperformed juniors; no difference vs ChatGPT-5).
    • Severe dissection and significant tissue prolapse: CA-GPT performed at very high levels, comparable to or better than both comparators.

Subgroups (where CA-GPT’s strengths were most evident)

  • Across subgroups, CA-GPT’s superiority over ChatGPT-5 persisted in pre-PCI planning.
  • Compared to juniors, CA-GPT’s advantage was more pronounced in LCx/RCA lesions, ischemia-defined lesions (OCT-FFR ≤ 0.80), ACS presentations, and mildly calcified lesions.
  • In post-PCI assessments, CA-GPT’s edge over juniors was most evident in LCx/RCA lesions; in LAD or ACS subgroups, the difference was smaller or not statistically significant, but CA-GPT still outperformed ChatGPT-5.

Representative case

  • The paper includes a case where CA-GPT’s integrated plan (pre-treatment strategy, device sizing, and post-dilation steps) matched the expert procedure across all metrics, illustrating how the system can coordinate OCT findings with guideline-grounded decision logic in a real-world scenario.

What this all means in plain terms: CA-GPT isn’t just faster; it consistently aligns with expert choices on a range of important PCI decisions, and it does so more reliably than a general AI model and better than junior clinicians in several challenging situations.


Why CA-GPT seems to perform better (the guts of the approach)

Three core ideas power this performance:

1) Domain-specific design
- CA-GPT isn’t a generic “talking brain.” It has a dedicated OCT analysis layer that executes 13 core tasks, including precise lumen segmentation and stent appraisal, plus an OCT-FFR calculation. This is the hands-on, image-reading side of things.

2) RAG grounding
- The retrieval-augmented generation framework ties the AI’s reasoning to a knowledge base that includes current guidelines and a large library of annotated PCI cases. This grounding helps prevent the AI from making up facts or misapplying guidelines, a known risk with plain LLMs.

3) End-to-end, workflow-aligned decision support
- The system is designed to produce structured, evidence-backed recommendations for each PCI stage, not just a vague summary. It’s built to slot into the real-time decision-making rhythm of a cath lab: read the OCT, compare to guideline anchors, and suggest concrete device choices and optimization steps.
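The RAG pattern in idea 2 can be sketched generically in a few lines. The knowledge-base entries and the naive word-overlap retriever below are hypothetical stand-ins for CA-GPT's curated guideline base and its real retriever, shown only to illustrate the grounding step:

```python
# Generic retrieval-augmented generation loop, sketched in plain Python.
# The knowledge-base entries and ranking scheme are hypothetical illustrations,
# not CA-GPT's actual implementation.

KNOWLEDGE_BASE = [
    "Guideline: size stents to the distal reference lumen diameter on OCT.",
    "Guideline: treat stent expansion below target with post-dilation.",
    "Case note: heavily calcified lesions may require atherectomy before stenting.",
]

def retrieve(query: str, k: int = 2) -> list:
    """Rank knowledge-base entries by naive word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_grounded_prompt(query: str) -> str:
    """Prepend retrieved evidence so the model reasons over cited sources."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return f"Evidence:\n{context}\n\nQuestion: {query}\nAnswer using only the evidence above."

print(build_grounded_prompt("What stent diameter should be used for this OCT lumen?"))
```

The key design choice is that the model never answers from its parameters alone: every query is first paired with retrieved evidence, which is what keeps outputs anchored and auditable.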

In contrast, ChatGPT-5, while linguistically fluent, is a general-purpose model that isn’t tuned to OCT specifics or anchored to a live, domain-specific knowledge base. The study also notes that ChatGPT-5’s emphasis on Western guidelines and English-language sources could create gaps in a setting shaped by Chinese expert consensus and locally practiced decision pathways.

The take-home here is not that “AI is bad.” It’s that for high-stakes, image-driven procedures like OCT-guided PCI, a grounded, domain-specialized AI with a reliable knowledge backbone can do a better job translating data into reliable, guideline-consistent decisions.


What this could mean for practice, training, and workflow

1) Enhanced consistency and reduced learning curve
- For junior operators, a system like CA-GPT can serve as an authoritative guide, offering transparent reasoning tied to guidelines and real cases. This could shorten the steep learning curve of OCT interpretation and PCI decision-making.

2) Time savings and smoother workflows
- The study notes that AI-driven outputs could reduce interpretation time substantially, from minutes to seconds in some settings. That speed matters in busy cath labs and can free up clinicians to focus on patient-specific nuance and immediate procedural steps.

3) Safer, more standardized care in complex cases
- The strongest gains appeared in complex lesion contexts (e.g., significant calcification, multivessel considerations, ACS presentations). In these settings, AI-grounded guidance can help align decisions with evidence when human readers might diverge due to experience gaps or cognitive load.

4) Educational value and feedback loops
- CA-GPT’s explainable chain of reasoning, supported by RAG, offers a transparent basis for feedback. Trainees can see why a recommendation was made and how it aligns with guidelines, which is valuable for competency-based training.

5) Potential roadmap for integration
- The authors frame this as an end-to-end decision support system, which means future steps could include real-time integration into cath-lab information systems, multicenter validation, and prospective outcome studies to see if standardized decisions translate into fewer complications or better long-term results.

Of course, it’s important to temper enthusiasm with realism: this study is retrospective and single-center. External validation across diverse patient populations, imaging systems, and operator teams is needed before broad clinical rollout. Long-term outcomes (MACE, mortality) weren’t reported here, as the focus was on decision-making consistency.


A few practical prompts and prompt-design tips (for those curious about prompting AI in this space)

If you’re a clinician or a researcher thinking about using RAG-based AI for OCT-guided PCI or similar tasks, here are ideas inspired by how CA-GPT is structured:

  • Grounded-task prompts

    • “Given the OCT outputs [list specific measurements], provide a pretreatment plan including device type, sizing, and justification anchored to the Chinese Expert Consensus on OCT in PCI.”
    • “Assess post-PCI images for MSA, expansion, and apposition. Flag any parameters that fall outside guideline targets and propose a post-dilation strategy with rationale.”
  • Evidence-backed outputs

    • “Cite the guideline clause or evidence source for each recommendation.” Encourage the model to return sources from the knowledge base used in RAG.
    • “If a discrepancy arises with the expert record, explain the difference and how guideline anchors would resolve it.”
  • Uncertainty handling

    • “If the data are borderline or imaging quality is suboptimal, provide a confidence score and suggest additional imaging or tests to resolve uncertainty.”
  • Subgroup awareness

    • “Stratify recommendations by lesion location (LAD vs LCx/RCA), calcification severity, and ACS vs SCAD presentation to tailor guidance to the patient’s context.”
  • Educational mode

    • “Explain each decision in plain language and show how it aligns with a specific guideline—intended for trainee review.”
  • Prompt hygiene

    • Keep prompts concise, structured, and anchored to quantifiable outputs (numbers, thresholds, and clearly defined metrics) to minimize ambiguity.

These are not “one-size-fits-all” prescriptions, but they illustrate how robust, explainable, guideline-grounded AI outputs can be designed to be useful in the cath lab or training room.
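Pulling those tips together, here is one way such a structured, metric-anchored prompt could be assembled. The field names, threshold requests, and wording are illustrative examples, not CA-GPT's actual templates:

```python
# Illustrative prompt template applying the tips above: structured inputs,
# quantified metrics, a guideline anchor, and an uncertainty clause.
# Field names and phrasing are hypothetical examples, not study artifacts.

def build_post_pci_prompt(msa_mm2: float, expansion_pct: float, apposition_ok: bool) -> str:
    measurements = (
        f"- Minimum stent area: {msa_mm2:.1f} mm^2\n"
        f"- Stent expansion: {expansion_pct:.0f}%\n"
        f"- Apposition adequate: {'yes' if apposition_ok else 'no'}"
    )
    return (
        "Assess this post-PCI OCT result.\n"
        f"Measurements:\n{measurements}\n"
        "For each parameter, state whether it meets guideline targets, "
        "cite the guideline clause used, and propose a post-dilation strategy "
        "with rationale. If any value is borderline, report a confidence score "
        "and suggest additional imaging."
    )

print(build_post_pci_prompt(msa_mm2=4.6, expansion_pct=78, apposition_ok=False))
```

Feeding the model explicit numbers and a named guideline anchor, rather than free-form description, is what keeps the output checkable against the evidence base.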


Limitations to keep in mind

  • Single-center, retrospective design: The findings are promising but may not generalize to all centers, systems, or patient populations.
  • Not long-term outcomes yet: The study focused on decision agreement, not on clinical endpoints like MACE or mortality.
  • OCT-imaging gaps: Not every patient had both pre- and post-PCI OCT imaging, which can influence the completeness of the metrics.
  • Real-world integration: How CA-GPT performs in real-time, daily practice across multiple operators and equipment will require broader testing and workflow integration.

The authors themselves call for multicenter, prospective studies to validate these results and to explore long-term outcomes and broader implementation strategies.


Real-world implications: what changes, if any, could we expect?

  • If validated broadly, RAG-grounded, domain-specific AI-OCT systems could become a standard assistant in OCT-guided PCI, particularly helping less experienced operators reach expert-level decision quality more consistently.
  • Training programs might incorporate AI-driven feedback loops, letting trainees compare their decisions to guideline-based AI recommendations and learn from discrepancies.
  • Hospitals could see more standardized PCI planning and post-procedural assessment, potentially reducing variability in device sizing and post-PCI optimization.

All of this would be aimed at safer, faster, and more consistent care for patients with complex coronary disease.


Key Takeaways

  • Domain-specific, grounded AI (CA-GPT) paired with OCT analysis and a retrieval-augmented generation framework outperformed a general-purpose AI (ChatGPT-5) and junior operators in pre-PCI planning and post-PCI assessment across ten PCI decision metrics in a retrospective single-center study.
  • The CA-GPT system combines a dedicated OCT analysis layer with a knowledge base anchored to current guidelines and thousands of PCI cases, reducing AI hallucinations and increasing evidence-based recommendations.
  • In pre-PCI planning, CA-GPT showed higher agreement with expert records, particularly in device type, sizing, stent diameter, and stent length. In post-PCI assessment, CA-GPT achieved higher agreement on stent expansion and apposition, with universally strong performance on MSA.
  • Subgroup analyses suggest CA-GPT’s advantages are most pronounced in LCx/RCA lesions, ischemia-defined lesions, ACS presentations, and mildly calcified lesions; gains in other subgroups were still evident but more modest.
  • While the results are promising, they come from a single center and retrospective design without long-term outcomes. Multicenter validation and prospective studies are needed before broad clinical adoption.
  • Practical implications include potential improvements in consistency, efficiency, and training for OCT-guided PCI, with careful attention to integration into clinical workflows and ongoing validation.
  • If you’re designing prompts for similar AI-augmented imaging tools, grounding outputs in guidelines, providing explicit metrics, and citing sources can help improve reliability and educational value.

And finally: for clinicians and educators, the big idea here is not to replace expertise with machines, but to use a grounded AI system as a high-fidelity, explainable partner that standardizes core decisions, highlights uncertainties, and accelerates learning. That combination could help ensure that more patients receive precise, guideline-consistent PCI planning and evaluation — even in busy, high-pressure cath lab environments.


