A Foundation Model for Clinical-Grade Computational Pathology and Rare Cancers Detection
Jul 22, 2024
Eugene Vorontsov
Equal contribution
Alican Bozkurt
Equal contribution
Adam Casson
Equal contribution
George Shaikovski
Equal contribution
Michal Zelechowski
Equal contribution
Kristen Severson
Equal contribution
Eric Zimmermann
James Hall
Neil Tenenholtz
Nicolo Fusi
Ellen Yang
Philippe Mathieu
Alexander van Eck
Donghun Lee
Julian Viret
Eric Robert
Yi Kan Wang
Jeremy D. Kunz
Matthew C. H. Lee
Jan H. Bernhard
Ran A. Godrich
Gerard Oakley
Ewan Millar
Matthew Hanna
Hannah Wen
Juan A. Retamero
William A. Moye
Razik Yousfi
Christopher Kanan
David S. Klimstra
Brandon Rothrock
Siqi Liu
Thomas J. Fuchs
Abstract
The analysis of histopathology images with artificial intelligence aims to enable clinical decision support systems and precision medicine. The success of such applications depends on the ability to model the diverse patterns observed in pathology images. To this end, we present Virchow, the largest foundation model for computational pathology to date. In addition to the evaluation of biomarker prediction and cell identification, we demonstrate that a large foundation model enables pan-cancer detection, achieving 0.95 specimen-level area under the curve across nine common and seven rare cancers. Furthermore, we show that with less training data, the pan-cancer detector built on Virchow can achieve similar performance to tissue-specific clinical-grade models in production and outperform them on some rare variants of cancer. Virchow’s performance gains highlight the value of a foundation model and open possibilities for many high-impact applications with limited amounts of labeled training data.
Publication
In Nature Medicine 30(10), 2924–2935 (2024)

Authors
AI Scientist
I am an AI Scientist at Paige AI. I did my Ph.D. with Jennifer Dy, Dana Brooks, and Jan-Willem van de Meent at Northeastern University. My main research interests are in machine learning, with an emphasis on probabilistic programming, deep neural networks, and their applications in biomedical image processing. I am one of the developers of Probabilistic Torch, a library for deep generative models that extends PyTorch. I am also one of the maintainers of the PyTorch distributions module.