Medical Imaging AI: Radiology, Segmentation, and Digital Pathology
The application of deep learning to medical imaging has progressed from radiologist-level diagnostic classifiers to foundation models trained on whole-slide pathology images, with segmentation architectures like U-Net and nnU-Net defining the modern clinical toolkit.
1. Introduction
Deep learning for medical imaging has advanced substantially over the past decade. This article surveys the models and techniques used in medical image analysis across radiology, segmentation, and digital pathology, and closes with the problem of clinical validation.
2. Radiology
2.1 Chest X-Rays
CheXNet (Rajpurkar et al., 2017)[19] demonstrated that a deep learning algorithm can detect pneumonia on chest x-rays at radiologist-level performance. However, distribution shift can significantly degrade performance when such a model is deployed at external hospitals[26].
2.2 Mammography
AI systems such as that of McKinney et al. (2020)[21] have shown promising results for breast cancer screening. The system was reported to reduce both false positives and false negatives relative to human readers, which could improve the efficiency and accuracy of mammography screening.
2.3 Retinal Imaging
In ophthalmology, Gulshan et al. (2016)[20] trained a CNN to detect referable diabetic retinopathy from fundus photographs at specialist-level sensitivity and specificity. More recently, RETFound (Zhou et al., 2023)[38] applied masked autoencoder self-supervision to 1.6 million unlabelled retinal images, producing a foundation model that generalises to downstream tasks ranging from sight-threatening eye disease to prediction of systemic events such as heart failure.
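The masked autoencoder self-supervision mentioned above works by hiding most patches of an image and training the network to reconstruct them. The following is a minimal sketch of the masking step only, in plain Python. The 75% mask ratio is a common default for masked autoencoders and the 196-patch grid (14 by 14) is purely illustrative; neither value is taken from the RETFound paper, and `random_patch_mask` is a hypothetical helper name.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=0):
    """Split patch indices into visible and masked sets.

    Assumption: patches are masked uniformly at random, as in the
    generic masked autoencoder recipe. This is not RETFound's code.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)               # local RNG so the split is reproducible
    order = list(range(num_patches))
    rng.shuffle(order)                      # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])     # hidden from the encoder
    visible = sorted(order[num_masked:])    # the only patches the encoder sees
    return visible, masked


# Illustrative 14 x 14 patch grid (an assumed size, not a RETFound setting).
visible, masked = random_patch_mask(196)
print(len(visible), len(masked))
```

In a full pipeline only the visible patches would be encoded, and a lightweight decoder would be trained to reconstruct the masked ones; no labels are needed at this stage.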
| Model | Achievements |
|---|---|
| CheXNet | Radiologist-level pneumonia detection on chest x-rays with deep learning (Rajpurkar et al., 2017) |
| McKinney Model | Reduces false positives and negatives in mammography screening (McKinney et al., 2020) |
3. Segmentation Architectures
Medical image segmentation involves identifying and delineating structures within images, which is essential for accurate diagnosis and treatment planning. Common architectures include U-Net and related designs such as Attention U-Net and 3D U-Net/V-Net, along with nnU-Net, TransUNet, and MedSAM.
3.1 U-Net
U-Net (Ronneberger et al., 2015)[22] pairs a contracting encoder with an expanding decoder and uses skip connections to pass high-resolution encoder features directly to the decoder, which improves segmentation of fine details. Variants like Attention U-Net add attention gates that suppress irrelevant background features. nnU-Net[23] provides a self-configuring framework that has set strong baselines across many biomedical segmentation tasks, while MedSAM[24] adapts the Segment Anything foundation model to the medical domain.
| Architecture | Innovation | Use Case |
|---|---|---|
| U-Net | Encoder–decoder + skip connections | Histology, cell segmentation |
| 3D U-Net / V-Net | Volumetric convolutions | Organ segmentation in CT/MRI |
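To make the skip-connection idea concrete, here is a toy sketch using plain Python lists. It is one-dimensional, has no learned convolutions, and uses hypothetical function names; it only illustrates why concatenating encoder features with the upsampled decoder path preserves detail that pooling discards.

```python
def avg_pool(channel):
    """Downsample one channel by averaging adjacent pairs (encoder step)."""
    return [(channel[i] + channel[i + 1]) / 2 for i in range(0, len(channel) - 1, 2)]


def upsample(channel):
    """Upsample one channel by repeating each value twice (decoder step)."""
    return [value for value in channel for _ in range(2)]


def skip_block(features):
    """Run one encoder/decoder level and concatenate the skip path.

    features: list of channels, each an even-length list of floats.
    """
    coarse = [avg_pool(channel) for channel in features]   # resolution is lost here
    restored = [upsample(channel) for channel in coarse]   # size returns, detail does not
    return features + restored                             # channel-wise concatenation


signal = [[1.0, 3.0, 5.0, 7.0]]
print(skip_block(signal))  # [[1.0, 3.0, 5.0, 7.0], [2.0, 2.0, 6.0, 6.0]]
```

The second output channel is a blurred version of the input, while the first (the skip path) still carries the original values. In an actual U-Net, convolutions applied after the concatenation learn how to combine the two.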
4. Digital Pathology
The field of digital pathology is advancing rapidly, driven by whole-slide imaging (WSI). Foundation models illustrate the value of large-scale pre-training (as in UNI) and promptable segmentation (as in MedSAM) across downstream applications.
4.1 Multiple-Instance Learning (MIL)
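MIL techniques treat each WSI as a bag of patches that shares a single slide-level label, which lets models learn from WSIs without requiring pixel-level annotations. Attention-based MIL[25] improves interpretability by assigning each patch a learned weight, highlighting the regions that drive the slide-level prediction.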
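The attention pooling step can be sketched as follows. This is a minimal illustration in plain Python, assuming a commonly used scoring form in which each patch embedding `h` receives the score `w · tanh(V h)` and the scores are normalised with a softmax. The parameters `V` and `w` are hypothetical here; in practice they are learned jointly with the classifier.

```python
import math


def attention_mil_pool(instances, V, w):
    """Pool patch embeddings into one slide-level embedding.

    instances: list of K patch feature vectors, each of length D
    V: L x D projection matrix (list of rows), assumed learned elsewhere
    w: length-L scoring vector, assumed learned elsewhere
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * tj for wj, tj in zip(w, hidden)))

    # Numerically stable softmax over the K patch scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of patch embeddings gives the bag representation.
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Two toy patches; the second scores higher and so dominates the bag.
bag, weights = attention_mil_pool([[0.0, 0.0], [2.0, 0.0]], [[1.0, 0.0]], [1.0])
print(weights)
```

The returned weights sum to one, so they can be overlaid on the slide as a heatmap of how much each patch contributed to the prediction.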
| Model | Description |
|---|---|
| UNI (Chen et al., 2024) | A large pathology foundation model trained on WSIs, showing transfer learning benefits across various downstream tasks. |
5. Clinical Validation
While deep learning models have achieved impressive results in medical imaging, their performance often declines when deployed at external sites due to distribution shift[26]. Strategies like multi-site training and domain adaptation help mitigate this issue.
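One simple way to surface this kind of degradation is to report metrics per site instead of pooled across sites. The sketch below computes sensitivity and specificity for each site at a fixed threshold. The function names, the 0.5 threshold, and the records are all illustrative assumptions; the data is synthetic and does not represent results from any study.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) for binary labels at a score threshold."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def per_site_metrics(records, threshold=0.5):
    """Group (site, label, score) records by site and score each site separately."""
    by_site = {}
    for site, label, score in records:
        labels, scores = by_site.setdefault(site, ([], []))
        labels.append(label)
        scores.append(score)
    return {
        site: sensitivity_specificity(labels, scores, threshold)
        for site, (labels, scores) in by_site.items()
    }


# Synthetic records for illustration only: (site, true label, model score).
records = [
    ("internal", 1, 0.9), ("internal", 1, 0.8),
    ("internal", 0, 0.2), ("internal", 0, 0.1),
    ("external", 1, 0.9), ("external", 1, 0.4),
    ("external", 0, 0.6), ("external", 0, 0.1),
]
for site, (sens, spec) in per_site_metrics(records).items():
    print(site, sens, spec)
```

A gap between internal and external figures at the same threshold is the kind of signal that would motivate multi-site training, domain adaptation, or recalibrating the operating point.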