Virchow: A Million-Slide Digital Pathology Foundation Model
Sep 14, 2023
Eugene Vorontsov
Alican Bozkurt
Adam Casson
George Shaikovski
Michal Zelechowski
Siqi Liu
Kristen Severson
Eric Zimmermann
James Hall
Neil Tenenholtz
Nicolo Fusi
Philippe Mathieu
Alexander van Eck
Donghun Lee
Julian Viret
Eric Robert
Yi Kan Wang
Jeremy D. Kunz
Matthew C. H. Lee
Jan Bernhard
Ran A. Godrich
Gerard Oakley
Ewan Millar
Matthew Hanna
Juan Retamero
William A. Moye
Razik Yousfi
Christopher Kanan
David Klimstra
Brandon Rothrock
Thomas J. Fuchs
Abstract
The use of artificial intelligence to enable precision medicine and decision support systems through the analysis of pathology images has the potential to revolutionize the diagnosis and treatment of cancer. Such applications will depend on models' ability to capture the diverse patterns observed in pathology images. To address this challenge, we present Virchow, a foundation model for computational pathology. Virchow is a vision transformer with 632 million parameters, trained by self-supervised learning with the DINOv2 algorithm on 1.5 million hematoxylin and eosin stained whole slide images from diverse tissue and specimen types, which is orders of magnitude more data than previous works used. Virchow enables the development of a pan-cancer detection system that achieves 0.949 overall specimen-level AUC across 17 different cancer types and 0.937 AUC on 7 rare cancer types. Virchow also sets the state of the art on internal and external image tile-level benchmarks and on slide-level biomarker prediction tasks.
Publication
arXiv preprint arXiv:2309.07778 (2023)

Authors
AI Scientist
I am an AI Scientist at Paige AI. I did my Ph.D. with Jennifer Dy, Dana Brooks, and Jan-Willem van de Meent at Northeastern University. My main research interests are machine learning with emphasis on probabilistic programming, deep neural networks, and their applications in biomedical image processing. I am one of the developers of Probabilistic Torch, a library for deep generative models that extends PyTorch. I am also one of the maintainers of the PyTorch distributions module.