A joint study by US health group Providence, Microsoft and the University of Washington has been published in Nature. It details the creation of a novel AI-powered pathology model intended to lay the foundation for improving patient outcomes and advancing individually tailored care.
Built on large-scale machine learning and pretrained on real-world data managed by Providence, this approach has the potential to transform cancer diagnostics by capturing global patterns across the whole slide, allowing for improved mutation prediction and more effective cancer subtyping.
“This transformative work is the result of focused efforts to overcome three major challenges that have stymied previous computational pathology models from widely being applied in the clinical setting: shortage of real-world data, inability to incorporate whole-slide modelling and lack of accessibility,” said Ari Robicsek MD, Chief Analytics and Research Officer at Providence. “Our paper being published in Nature describes a transformational solution to these challenges, exponentially reducing the effort needed to build digital diagnostic tools in the future.”
The model, Prov-GigaPath, is undergirded by the largest pretraining effort to date with whole-slide modelling, performed on 1.3 billion pathology image tiles obtained from 171,189 digital whole-slides provided by Providence. This is five to 10 times larger than other established pretraining datasets such as The Cancer Genome Atlas (TCGA). The slides come from more than 30,000 patients and cover 31 major tissue types. All computation was conducted within Providence’s tenant and approved by the Providence Institutional Review Board (IRB), adhering to appropriate standards of privacy and compliance.
By combining diverse data from Providence with a novel pathology-specific adaptation of Microsoft’s LongNet, which allows for long-context modelling of whole-slide images, Prov-GigaPath attained state-of-the-art performance on 25 out of 26 digital pathology tasks, with significant improvement over the next best model on 18 of them.
With the model now globally available, Prov-GigaPath’s capacity for holistic whole-slide modelling promises to unlock new approaches to studying the tumour microenvironment, with potential downstream applications in cancer diagnostics and prognostics, such as assisting clinicians in treatment selection. It may also have broader biomedical impact in the future.
“The rich data in pathology slides can, through AI tools like Prov-GigaPath, uncover novel relationships and insights that go beyond what the human eye can discern,” said Carlo Bifulco MD, Chief Medical Officer of Providence Genomics and Associate Member and Medical Director of Translational Molecular Pathology, Earle A. Chiles Research Institute. “Recognising the potential of this model to significantly advance cancer research and diagnostics, we felt strongly about making it widely available to benefit patients globally. It’s an honour to be part of this groundbreaking work.”
- Xu H, Usuyama N, Bagga J, et al. A whole-slide foundation model for digital pathology from real-world data. Nature. Published online May 22, 2024. doi:10.1038/s41586-024-07441-w.