1. Introduction

Deep learning for medical imaging has advanced significantly over the past decade. This article gives an overview of the models and techniques used in medical image analysis, focusing on radiology and pathology.

2. Radiology

2.1 Chest X-Rays

CheXNet (Rajpurkar et al., 2017)[19] reported that a deep convolutional network could detect pneumonia on chest x-rays at a level comparable to practising radiologists. However, distribution shift can substantially degrade performance when such a model is deployed at hospitals other than those that supplied its training data[26].

2.2 Mammography

The system of McKinney et al. (2020)[21] has shown promising results for breast cancer screening. In retrospective evaluation it produced fewer false positives and fewer false negatives than the human readers it was compared against, which suggests potential gains in the efficiency and accuracy of mammography screening.
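False-positive and false-negative rates of the kind compared in screening studies come directly from confusion-matrix counts. The sketch below shows the arithmetic only; the function name `screening_error_rates` is hypothetical and the counts are invented for illustration, not taken from McKinney et al. or any other study.

```python
def screening_error_rates(tp, fp, tn, fn):
    """Return (false-positive rate, false-negative rate) from confusion counts."""
    # False-positive rate: share of cancer-free cases that were flagged.
    fpr = fp / (fp + tn)
    # False-negative rate: share of cancers that were missed.
    fnr = fn / (fn + tp)
    return fpr, fnr


# Illustrative counts only; these are NOT results from any published study.
reader = screening_error_rates(tp=80, fp=100, tn=900, fn=20)
model = screening_error_rates(tp=85, fp=80, tn=920, fn=15)

# Absolute reduction in each error rate (reader minus model).
fpr_reduction = reader[0] - model[0]
fnr_reduction = reader[1] - model[1]
print(f"FPR reduction: {fpr_reduction:.3f}, FNR reduction: {fnr_reduction:.3f}")
```

Reporting both rates matters because a system can trade one against the other by moving its decision threshold; a claim of improvement is strongest when both fall at the same operating point.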

2.3 Retinal Imaging

In ophthalmology, Gulshan et al. (2016)[20] trained a CNN to detect referable diabetic retinopathy from fundus photographs at specialist-level sensitivity and specificity. More recently, RETFound (Zhou et al., 2023)[38] applied masked autoencoder self-supervision to 1.6 million unlabelled retinal images, producing a foundation model that generalises to downstream tasks ranging from sight-threatening eye disease to prediction of systemic events such as heart failure.
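In masked autoencoder pre-training, an image is cut into patches, most patches are hidden, and the network is trained to reconstruct the hidden ones from the visible ones. The sketch below shows only the random masking step. The function name `random_patch_mask`, the 16x16 patch size and the 0.75 mask ratio are assumptions chosen for illustration, not settings confirmed by the text above.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists by random selection."""
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# A 224x224 image cut into 16x16 patches gives a 14x14 grid of 196 patches.
# Patch size and mask ratio are assumed values, for illustration only.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # prints: 49 147
```

Because no labels are involved, this step can be applied to any unlabelled image collection; the labelled data is needed only later, when the pre-trained encoder is fine-tuned on a downstream task.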

Model | Achievements
CheXNet | Radiologist-level pneumonia detection on chest x-rays with deep learning (Rajpurkar et al., 2017)
McKinney Model | Reduces false positives and negatives in mammography screening (McKinney et al., 2020)

3. Segmentation Architectures

Medical image segmentation identifies and delineates structures within images, which is essential for accurate diagnosis and treatment planning. Common architectures include U-Net and its variants (Attention U-Net, 3D U-Net/V-Net), the self-configuring nnU-Net, the transformer-based TransUNet, and MedSAM.

3.1 U-Net

U-Net (Ronneberger et al., 2015)[22] pairs a contracting encoder with an expanding decoder and links them with skip connections, which pass high-resolution encoder features directly to the decoder so that fine details are preserved in the segmentation. Variants like Attention U-Net add attention gates to the skip connections to suppress irrelevant background features. nnU-Net[23] provides a self-configuring framework that has set strong baselines across many biomedical segmentation tasks, while MedSAM[24] adapts the Segment Anything foundation model to the medical domain.
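The data flow of an encoder-decoder with skip connections can be traced on a one-dimensional toy signal. This is a deliberately simplified sketch, not U-Net itself: it has no convolutions or learned weights, pair-averaging stands in for pooling, value repetition stands in for up-convolution, and all function names are hypothetical. It assumes the signal length is divisible by 2 raised to `depth`.

```python
def downsample(x):
    """Halve the length by averaging neighbouring pairs (stand-in for pooling)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the length by repeating each item (stand-in for up-convolution)."""
    return [v for v in x for _ in range(2)]


def toy_unet_1d(signal, depth=2):
    """Trace encoder-decoder data flow with skip connections; no learned weights."""
    skips = []
    x = signal
    # Encoder: remember the features at this resolution, then reduce resolution.
    for _ in range(depth):
        skips.append(x)
        x = downsample(x)
    # Bottleneck: one channel per position.
    x = [[v] for v in x]
    # Decoder: restore resolution, then concatenate the matching skip feature
    # as an extra channel at every position.
    for skip in reversed(skips):
        up = upsample(x)
        x = [channels + [s] for channels, s in zip(up, skip)]
    return x


features = toy_unet_1d([1, 2, 3, 4, 5, 6, 7, 8])
print(len(features), features[0])
```

Each output position ends up with one coarse channel from the bottleneck plus one channel per skip connection, the last of which is the untouched input value. That is the sense in which skip connections carry fine detail past the low-resolution bottleneck.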

Architecture | Innovation | Use Case
U-Net | Encoder–decoder + skip connections | Histology, cell segmentation
3D U-Net / V-Net | Volumetric convolutions | Organ segmentation in CT/MRI

4. Digital Pathology

Digital pathology is advancing rapidly, driven by whole-slide imaging (WSI). Foundation models in this area show the value of large-scale pre-training and of promptable segmentation across a range of applications.

4.1 Multiple-Instance Learning (MIL)

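MIL techniques enable models to learn from WSIs using only slide-level labels, without requiring pixel-level annotations: each slide is treated as a bag of patches, and the label applies to the bag as a whole. Attention-based MIL[25] improves interpretability by assigning each patch a weight, which highlights the regions that contribute most to the slide-level prediction.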
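One common formulation of attention pooling scores each patch embedding h_k as w^T tanh(V h_k), applies a softmax over the patches of a slide, and uses the resulting weights to form a weighted sum. The sketch below implements that formulation in plain Python. It may differ in detail from the method cited above; the function name is hypothetical, and V and w are hand-picked here, whereas in practice they would be learned.

```python
import math


def attention_mil_pool(instances, V, w):
    """Pool instance embeddings into one bag embedding using attention weights.

    instances: list of K embeddings, each a list of D floats (one per patch).
    V: L x D projection matrix; w: length-L vector (both learned in practice).
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ti for wi, ti in zip(w, hidden)))
    # Softmax over instances, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag embedding: attention-weighted sum of the instance embeddings.
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Hypothetical 2-D patch embeddings and parameters, for illustration only.
patches = [[0.1, 0.0], [0.0, 0.2], [2.0, 1.5]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
bag, weights = attention_mil_pool(patches, V, w)
print([round(a, 3) for a in weights])
```

The weights sum to one, so they can be mapped back onto the slide as a heatmap. In this toy input the third patch has the largest embedding and therefore receives the largest weight.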

Model | Description
UNI (Chen et al., 2024) | A large pathology foundation model trained on WSIs, showing transfer learning benefits across various downstream tasks.

5. Clinical Validation

While deep learning models have achieved impressive results in medical imaging, their performance often declines when deployed at external sites due to distribution shift[26]. Strategies like multi-site training and domain adaptation help mitigate this issue.
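A first step in detecting this kind of decline is to report metrics per site instead of pooling all test cases together. The sketch below does that for plain accuracy. The function name, the site names and the predictions are all made up for illustration; a real validation study would typically use metrics such as AUC, sensitivity and specificity with confidence intervals.

```python
from collections import defaultdict


def accuracy_by_site(records):
    """Compute accuracy separately for each site.

    records: iterable of (site, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for site, y_true, y_pred in records:
        total[site] += 1
        correct[site] += int(y_true == y_pred)
    return {site: correct[site] / total[site] for site in total}


# Made-up predictions; the site names are hypothetical.
records = [
    ("internal", 1, 1), ("internal", 0, 0), ("internal", 1, 1), ("internal", 0, 0),
    ("external", 1, 0), ("external", 0, 0), ("external", 1, 1), ("external", 0, 1),
]
per_site = accuracy_by_site(records)
gap = per_site["internal"] - per_site["external"]
print(per_site, f"gap={gap:.2f}")
```

A large gap between internal and external sites is a signal of distribution shift that a single pooled score would hide.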