By Dave DeFusco
A Katz School study accepted to the prestigious computer vision conference CVPR 2026 in June is tackling a problem that has long limited the usefulness of artificial intelligence in medicine: not just seeing what is in a medical image, but explaining where it is in a way doctors can trust.
The paper, “CG-Reasoner: Centroid-Guided Positional Reasoning Segmentation for Medical Imaging with a Robust Visual-Text Consistency Metric,” introduces a system designed to do two things at once. First, it identifies areas of concern in medical images, such as tumors or lesions. Second, it explains their location in clear, human-like language. This development marks an important shift in how AI can support healthcare.
Today’s AI systems are already very good at analyzing images like X-rays, MRIs and CT scans. They can outline suspicious areas with impressive accuracy; however, these systems usually stop there. They highlight pixels but do not explain their reasoning in a way that matches how doctors think and communicate.
“In clinical practice, doctors don’t just point to a spot. They describe where it is, how it relates to nearby structures and why it matters,” said Lakshmikar Polamreddy, lead author of the study and a student in the Department of Graduate Computer Science and Engineering. “Most AI models ignore that kind of spatial reasoning. Our goal was to bridge that gap.”
The new system, called CG-Reasoner, is designed to combine visual understanding with language. It uses a type of AI known as a multimodal model, meaning it can process both images and text together. This allows it not only to detect a lesion, but to describe it by noting, for example, that a tumor is located in the upper left region of a lung.
A key innovation in the system is something called a “Text2Centroid” module, which is a component that connects words to precise locations. When the AI generates a description, it also predicts a central point, like a set of coordinates, that anchors the explanation to the actual image.
“This helps ensure the explanation isn’t just fluent, but accurate,” said Polamreddy. “The text and the image are tied together through geometry, so the reasoning reflects the real position of the lesion.”
The researchers also introduced a new way to measure how well the system performs. Traditional metrics focus only on how closely an AI’s outlined region matches the true area in an image, but they do not evaluate whether the explanation is correct.
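To make the idea of tying words to geometry concrete, here is a minimal sketch in Python. It is not the paper's Text2Centroid module, whose internals the article does not describe. It only illustrates the underlying notion: compute the central point of an outlined region and translate it into a coarse location phrase. The function names, the 3x3 grid and the wording of the phrases are all assumptions made for illustration.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 = lesion pixel, 0 = background


def mask_centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (row, col) centroid of all foreground pixels."""
    row_sum = col_sum = count = 0
    for r, line in enumerate(mask):
        for c, value in enumerate(line):
            if value:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        raise ValueError("mask has no foreground pixels")
    return row_sum / count, col_sum / count


def describe_position(centroid: Tuple[float, float], height: int, width: int) -> str:
    """Map a centroid to a coarse 3x3 grid phrase such as 'upper left'.

    Hypothetical helper: 'left' and 'right' refer to the image as displayed,
    not to the patient's anatomical left or right.
    """
    r, c = centroid
    vertical = ["upper", "middle", "lower"][min(int(3 * r / height), 2)]
    horizontal = ["left", "center", "right"][min(int(3 * c / width), 2)]
    if vertical == "middle" and horizontal == "center":
        return "center"
    return f"{vertical} {horizontal}"


# Toy 6x6 image with a 2x2 "lesion" in the top-left corner.
mask = [[0] * 6 for _ in range(6)]
for r, c in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    mask[r][c] = 1

center = mask_centroid(mask)
print(center, describe_position(center, height=6, width=6))
```

In this toy case the centroid falls in the top-left cell of the grid, so the phrase is "upper left." A real system would learn this mapping rather than hard-code it, but the example shows why a predicted point gives a description something checkable to stand on.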
To solve this, the team created PRScore, or Positional-Reasoning Score. This metric checks both the visual accuracy of the outlined region and the quality of the explanation, essentially asking: does the AI not only find the right spot, but also describe it correctly?
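The article does not give the formula for PRScore, so the sketch below should not be read as its definition. It is a toy illustration, under stated assumptions, of what it means to score the outline and the explanation together: a standard Dice overlap for the mask, an exact-match check on a location phrase for the text, and a weighted average of the two. The function `toy_consistency_score`, its `weight` parameter and the exact-match rule are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = lesion pixel, 0 = background


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient: 2 * overlap / (size of pred + size of truth)."""
    overlap = size_pred = size_truth = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            overlap += 1 if (p and t) else 0
            size_pred += 1 if p else 0
            size_truth += 1 if t else 0
    total = size_pred + size_truth
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * overlap / total


def toy_consistency_score(
    pred: Mask,
    truth: Mask,
    predicted_phrase: str,
    reference_phrase: str,
    weight: float = 0.5,
) -> float:
    """Hypothetical combined score in [0, 1]; NOT the paper's PRScore.

    Blends mask overlap with a crude check that the location phrase
    matches the reference phrase (case-insensitive exact match).
    """
    text_ok = 1.0 if predicted_phrase.strip().lower() == reference_phrase.strip().lower() else 0.0
    return weight * dice(pred, truth) + (1 - weight) * text_ok


truth = [[1, 1], [0, 0]]
pred = [[1, 0], [0, 0]]

print(toy_consistency_score(pred, truth, "upper left", "upper left"))
print(toy_consistency_score(pred, truth, "lower right", "upper left"))
```

The two calls use the same predicted outline, yet the second scores lower because the description points to the wrong place. An overlap-only metric would rate both identically, which is the gap the researchers describe.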
In tests across six different types of medical imaging, including X-rays, MRIs and ultrasounds, the system achieved state-of-the-art results. It was not only highly accurate in identifying problem areas, but also more consistent in aligning its explanations with those areas.
Assistant Professor Ming Ma, the paper’s corresponding author, said the work addresses a critical barrier to the adoption of AI in healthcare.
“Accuracy alone is not enough,” said Ma. “Doctors need systems they can understand and trust. By combining segmentation with clear, spatially grounded reasoning, we are making AI outputs more interpretable and clinically meaningful.”
Another advantage of the system is efficiency. Despite its advanced capabilities, CG-Reasoner uses a relatively lightweight design, meaning it can run without the massive computing resources often required by cutting-edge AI models.
The implications could be significant. In the future, systems like CG-Reasoner could help automate parts of medical reporting, assist doctors in making faster decisions and reduce the risk of misinterpretation.
“As AI becomes more integrated into healthcare, it has to communicate in ways that align with how clinicians think,” said Polamreddy. “This is about making AI not just powerful, but useful in real clinical settings.”