1. Introduction: The Dawn of a New Era in Pathology
Pathology, a discipline anchored in the meticulous interpretation of tissue morphology, now stands at the cusp of a radical transformation. The shift from traditional microscopy to digital pathology workflows has paved the way for the entry of artificial intelligence (AI) as an indispensable support tool (1). In this context, a new class of algorithms has emerged, one that promises not only to optimize existing tasks but also to fundamentally redefine the potential of diagnosis: foundation models.
Unlike conventional AI models, which are trained for narrow and specific tasks (such as detecting a particular cancer type or counting cells within a given field of view), foundation models (FMs) are large-scale machine learning systems (2). Trained on vast and diverse datasets that are either unlabeled or minimally annotated, these models acquire a broad, foundational understanding of the domain. This allows them to adapt to a wide variety of downstream tasks with only minimal additional data (2). The concept was popularized by language models such as GPT-4, but its scope extends far beyond text to include computer vision, genomics, and, crucially, digital pathology. This versatility and capacity for knowledge transfer are precisely what distinguish foundation models, positioning them as an enabling technology for scalability, multimodal integration, and the clinical support demanded by modern pathology (3).
This article seeks to provide a comprehensive and nuanced analysis of this technological revolution. It will examine the promises and transformative benefits of foundation models, what can be described as “The Good.” At the same time, rigorous analysis requires addressing the technical limitations and adoption challenges that hinder widespread implementation, “The Bad.” Finally, it will delve into the broader ethical, political, and strategic implications, the systemic risks, and even the existential questions this technology raises for the global health ecosystem, what we call “The Ugly.”
The ultimate goal is to offer a holistic perspective that serves as a guide for pathologists, oncologists, researchers, and healthcare leaders, navigating between technical depth and pragmatic strategic vision for the future of pathology.
2. The Good: Promises and Transformative Benefits
The potential of foundation models in digital pathology can be summarized in three fundamental pillars that directly address the field’s bottlenecks: scalability, reduced dependence on manual annotation, and integration of heterogeneous information. These advances not only improve operational efficiency but also open new frontiers in precision medicine.
2.1. Scalability and Unprecedented Generalization
The intrinsic strength of foundation models lies in their ability to scale. Unlike earlier deep learning models, which were typically trained on small, task-specific datasets, FMs are designed to assimilate and learn from massive and diverse volumes of data (3).
For example, the Virchow model, developed by Paige AI, represents a revolutionary step forward as the first foundation model in digital pathology trained on a dataset of one million whole-slide images (WSIs), a feat requiring significant computational and engineering resources (4). This scale, 5 to 10 times larger than traditional research repositories such as TCGA, enables the models to capture the full spectrum of morphological variations across tissue types, staining protocols, and pathological conditions (5).
This expansive knowledge base provides foundation models with remarkable generalization capabilities. By learning universal histopathological representations, they can transfer this knowledge to specific tasks and cancer types, including rare malignancies with very limited training data (6). Virchow, for instance, demonstrated exceptional performance in pan-cancer detection, achieving an area under the curve (AUC) of 0.95 across nine common and seven rare cancer types. In some cases, it even outperformed tissue-specific models trained on clinical datasets, underscoring the inherent advantage of scalability and generalization within this architecture (6).
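In practice, this kind of generalization is most often exploited through a "linear probe": embeddings are extracted once from the frozen pretrained encoder and a lightweight classifier is trained on top of them for the downstream task. The sketch below illustrates the idea with synthetic embeddings and labels standing in for real foundation-model features (Virchow's actual weights are not assumed here), reporting performance as AUC in the spirit of the pan-cancer results above.

```python
# Minimal "linear probe" sketch: frozen foundation-model embeddings + a lightweight
# classifier, evaluated with AUC. The embeddings below are synthetic stand-ins; in
# practice they would come from a pretrained pathology encoder applied to WSI tiles.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

n_slides, embed_dim = 500, 1024              # one aggregated embedding per slide
X = rng.normal(size=(n_slides, embed_dim))   # placeholder for frozen FM embeddings
y = rng.integers(0, 2, size=n_slides)        # placeholder labels: 1 = cancer present
X[y == 1, :16] += 0.4                        # inject a weak signal so the toy AUC beats chance

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Only this small classifier is trained; the encoder that produced X stays frozen.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

auc = roc_auc_score(y_test, probe.predict_proba(X_test)[:, 1])
print(f"Linear-probe AUC on held-out slides: {auc:.3f}")
```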
2.2. The Self-Supervised Learning (SSL) Revolution
One of the most persistent obstacles in developing AI for pathology has been the need for vast amounts of annotated data from pathologists—a process that is notoriously laborious, costly, and often subjective. Foundation models address this “data cliff problem” by leveraging self-supervised learning (SSL) techniques during pretraining (4).
SSL enables models to learn meaningful structures and patterns directly from WSIs without the need for manual labels. Methods such as contrastive learning (SimCLR, DINO) or masked image modeling have proven effective in pretraining pathology models. These algorithms extract rich, semantically meaningful features from unannotated gigapixel pathology slides (7). Once the model has acquired these high-level representations, it can be fine-tuned for specific tasks (classification, segmentation, detection) with far fewer labeled samples (7).
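To make the mechanism concrete, the following PyTorch sketch shows the core of a SimCLR-style contrastive objective on unlabeled patches: two augmented views of each patch are embedded and pulled together, while views of other patches in the batch are pushed apart. The tiny encoder, the random "patches", and the naive augmentation are placeholders; production pathology FMs rely on ViT backbones, carefully designed augmentations, and far larger batches.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """SimCLR-style loss: matching views attract, all other patches in the batch repel."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                          # 2N x D stacked embeddings
    sim = z @ z.t() / temperature                           # pairwise cosine similarities
    mask = torch.eye(sim.size(0), dtype=torch.bool)         # exclude self-similarity
    sim = sim.masked_fill(mask, float("-inf"))
    n = z1.size(0)
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])  # i-th view1 matches i-th view2
    return F.cross_entropy(sim, targets)

# Tiny encoder standing in for a large ViT backbone.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU(), nn.Linear(128, 64))

# Random tensors standing in for two augmented views of a batch of unlabeled patches.
view1 = torch.rand(32, 3, 64, 64)
view2 = view1 + 0.05 * torch.randn_like(view1)              # crude stand-in for real augmentations

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
optimizer.zero_grad()
loss = nt_xent_loss(encoder(view1), encoder(view2))
loss.backward()
optimizer.step()
print(f"contrastive loss on one toy batch: {loss.item():.3f}")
```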
This reduction in labeling requirements drastically cuts development time and effort, while simultaneously producing more robust and efficient AI tools. By harnessing the vast supply of unlabeled slides generated daily in pathology labs, SSL fundamentally shifts the economics and feasibility of AI in pathology (5).
2.3. Multimodal Integration for Precision Medicine
Clinical pathology has always gone beyond the interpretation of a single WSI. Pathologists routinely integrate data from multiple sources, including radiology images (CT, MRI), genomic and molecular profiles, and electronic health records (EHRs) (8). Foundation models are now at the forefront of replicating, and extending, this integrative capacity.
The multimodal approach fuses these diverse data streams into a holistic patient profile, enabling more precise differential diagnosis and risk stratification (8). Projects such as CONCH (CONtrastive learning from Captions for Histopathology) and Paige PRISM2 (9) exemplify this trend, combining histopathology images with biomedical text, such as clinical reports and diagnostic notes. This not only improves accuracy in classification and segmentation but also unlocks novel functionalities: retrieving images from textual descriptions or generating draft pathology reports automatically (9).
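The mechanism underlying such vision-language functionality is typically CLIP-style contrastive alignment: an image tower and a text tower are trained so that matching (image, caption) pairs end up close in a shared embedding space, after which text-to-image retrieval reduces to a similarity lookup. The sketch below is a generic, toy illustration of that principle, not the actual CONCH or PRISM2 code; the towers, dimensions, and random inputs are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Generic CLIP-style alignment sketch: project histology patches and report snippets
# into a shared space so that matching pairs have the highest cosine similarity.
image_tower = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 64))
text_tower = nn.Sequential(nn.Linear(300, 256), nn.ReLU(), nn.Linear(256, 64))  # 300-d text features as placeholder

images = torch.rand(8, 3, 32, 32)     # stand-in for histology patches
texts = torch.rand(8, 300)            # stand-in for embedded report snippets, paired 1:1 with images

img_emb = F.normalize(image_tower(images), dim=1)
txt_emb = F.normalize(text_tower(texts), dim=1)

# Symmetric contrastive loss: the i-th image should match the i-th caption, and vice versa.
logits = img_emb @ txt_emb.t() / 0.07
targets = torch.arange(len(images))
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
print(f"toy alignment loss: {loss.item():.3f}")

# Once aligned, retrieval is a similarity lookup: embed a textual query and rank patches.
query = F.normalize(text_tower(torch.rand(1, 300)), dim=1)   # stand-in for an embedded query sentence
best_match = (query @ img_emb.t()).argmax().item()
print(f"patch retrieved for the query: index {best_match}")
```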
In research, models like CLOVER (Contrastive Learning for Omics-guided Whole-slide Visual Embedding Representation) show how imaging data can be integrated with multi-omics datasets (genomics, transcriptomics) to predict prognosis and therapy response, paving the way toward predictive and personalized pathology (10).
The ability of FMs to correlate morphological patterns with genetic or radiological signatures marks a conceptual leap. These models can uncover complex relationships beyond the reach of the human eye, enabling the discovery of new biomarkers and the prediction of clinical outcomes (11). This higher-order analysis goes beyond mere task automation, repositioning pathology at the very core of precision medicine.
2.4. Tangible Clinical and Operational Benefits
The adoption of foundation models translates into tangible improvements for daily clinical practice and hospital operations:
Reduced workload and improved efficiency: Pathologists face rising case volumes amid a global shortage of specialists. AI can automate repetitive, time-intensive tasks such as Ki-67 cell counting or micrometastasis detection in lymph nodes (a toy sketch of such automated counting appears after this list). This not only saves time but also allows pathologists to prioritize complex cases, freeing resources for deeper clinicopathological correlation (12).
Standardization and reduced variability: AI algorithms provide quantitative, reproducible assessments of tissue features, reducing inter-observer variability and ensuring more consistent, standardized cancer diagnoses (8).
Support for underserved hospitals: Enabled by telepathology and remote analysis, digital pathology allows hospitals in resource-limited regions to access high-level expertise worldwide. Because FMs require fewer annotated datasets for adaptation, they are particularly well suited to these contexts, helping bridge quality gaps in patient care (7).
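As referenced in the workload item above, a simplistic version of automated Ki-67 quantification can be sketched with classical image analysis: colour deconvolution to isolate the DAB stain, a threshold, and connected-component counting. The example below uses scikit-image on a random tile standing in for a real patch; the threshold and object sizes are illustrative only, and clinical-grade tools are considerably more sophisticated.

```python
import numpy as np
from skimage.color import rgb2hed
from skimage.filters import threshold_otsu
from skimage.measure import label
from skimage.morphology import remove_small_objects

# Crude Ki-67 counting sketch: separate the DAB (brown) stain by colour deconvolution,
# threshold it, and count nucleus-sized connected blobs. A random RGB tile stands in
# for a real image patch cropped from the WSI.
tile = np.random.rand(512, 512, 3)

dab = rgb2hed(tile)[..., 2]                               # channel 2 of HED space ~ DAB intensity
positive = dab > threshold_otsu(dab)                      # Otsu threshold as a naive positivity cut
positive = remove_small_objects(positive, min_size=30)    # drop specks smaller than a nucleus

n_positive = label(positive).max()                        # connected components ~ candidate positive nuclei
print(f"candidate Ki-67-positive nuclei in this tile: {n_positive}")
```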
3. The Bad: Immediate Challenges and Technical Limitations
Despite their promise, the path toward widespread adoption of foundation models in digital pathology is fraught with significant technical and operational challenges. These “bads” are not mere inconveniences, but rather barriers that demand substantial investment and strategic reorientation.
3.1. The High Cost of Training and Infrastructure Concentration
The very scale that makes foundation models so powerful is also their greatest obstacle. Training these models is prohibitively expensive, with costs for the most advanced systems running into the tens or even hundreds of millions of dollars; Google’s Gemini Ultra, for instance, reportedly cost an estimated $191 million to train (13). This massive investment is driven by the need for specialized computing infrastructure: tens of thousands of high-end GPUs (such as NVIDIA A100s or H100s) operating for months at a time (13).
This nearly insurmountable entry barrier leads to an inevitable concentration of power. Only Big Tech, Big Pharma, or venture-capital-backed startups can realistically afford to train a foundation model from scratch (14). The result is an ecosystem where technological and data control consolidates into the hands of a few major players, illustrated by Tempus’ acquisition of Paige for $81.25 million, a strategic move to gain access to Paige’s vast pathology dataset (15).
Such concentration poses a risk to open innovation and the democratization of technology. Smaller startups and academic institutions may be effectively excluded from this race, forced instead to rely on proprietary solutions (16).
3.2. The Challenge of Explainability and the “Black Box” Problem (XAI)
By their very nature, with billions of parameters, foundation models operate as opaque “black boxes” (5). While they may deliver highly accurate predictions, it is often difficult (if not impossible) for pathologists and developers to fully understand how the algorithm reached its conclusion (17).
In a clinical context, this lack of transparency creates fundamental distrust. A pathologist cannot simply accept an AI prediction, however precise it may appear, without understanding the logic behind it. Explainable AI (XAI) is therefore critical for clinical adoption (18). For an AI tool to be trusted, pathologists must be able to audit its decisions and validate that they are grounded in meaningful histopathological patterns rather than spurious correlations (18).
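One widely used way to provide this kind of auditability is attention-based multiple-instance learning (MIL), in which the model assigns an explicit weight to every tile contributing to a slide-level prediction; the weights can be mapped back to slide coordinates and rendered as a heatmap for review. The sketch below, with random embeddings and coordinates standing in for real data, shows the basic pattern and is illustrative only, not tied to any specific commercial model.

```python
import torch
import torch.nn as nn

# Attention-based MIL head: exposes which tiles drove a slide-level prediction.
# Embeddings and tile coordinates below are random placeholders.
class AttentionMIL(nn.Module):
    def __init__(self, dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.classifier = nn.Linear(dim, 1)

    def forward(self, tiles):                               # tiles: (n_tiles, dim)
        weights = torch.softmax(self.attn(tiles), dim=0)    # one weight per tile, sums to 1
        slide_embedding = (weights * tiles).sum(dim=0)      # attention-weighted pooling
        logit = self.classifier(slide_embedding)
        return logit, weights.squeeze(-1)

tiles = torch.rand(1000, 512)                   # stand-in for frozen FM tile embeddings
coords = torch.randint(0, 40000, (1000, 2))     # stand-in for tile (x, y) positions on the slide

model = AttentionMIL()
logit, attention = model(tiles)

top = attention.topk(5).indices                 # the tiles a pathologist would want to inspect
print("slide-level probability:", torch.sigmoid(logit).item())
print("coordinates of the 5 most influential tiles:\n", coords[top])
```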
The absence of clear frameworks for auditability and accountability becomes a critical barrier for hospital adoption. This is not merely a technical challenge, but also one of confidence: if a pathologist cannot defend an AI-assisted diagnosis in a tumor board or to a clinical colleague, the tool’s clinical utility is severely compromised.
3.3. The Translational Gap and Incomplete Generalization
Although foundation models are designed for generalization, transferring a model trained in a research environment into real-world clinical practice remains a considerable challenge, commonly referred to as the “translational gap” (19).
For instance, a model trained on a large American cohort (e.g., data from Memorial Sloan Kettering Cancer Center in Paige’s case (15)) may perform suboptimally in European, Asian, or Latin American populations, due to differences in genetics, demographics, staining protocols, or sample preparation (5).
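Part of this variability stems from staining and scanner differences, and a common (if crude) first mitigation is to normalize the colour appearance of incoming tiles toward a reference from the training cohort. The sketch below uses simple histogram matching from scikit-image on random arrays standing in for H&E tiles; dedicated stain-normalization methods such as Macenko or Vahadane are generally preferred in practice, and no normalization step removes the need for local validation.

```python
import numpy as np
from skimage.exposure import match_histograms

# Crude stain-appearance normalization: match each incoming tile's colour histogram
# to a reference tile representative of the cohort the model was trained on.
# Random arrays stand in for real H&E tiles.
reference_tile = np.random.rand(256, 256, 3)   # tile from the training cohort
incoming_tile = np.random.rand(256, 256, 3)    # tile from a new lab with different staining

normalized = match_histograms(incoming_tile, reference_tile, channel_axis=-1)
print(f"mean intensity before/after: {incoming_tile.mean():.3f} / {normalized.mean():.3f}")
```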
The problem goes beyond generalization: it is also a disconnect between AI developers’ goals and clinicians’ needs. A study published in Arch Pathol Lab Med emphasized that data scientists often focus on computational performance metrics (AUC, F1-score) that, while impressive on paper, may be “clinically meaningless” if misaligned with workflow priorities and clinical decision-making (19).
This communication and understanding gap between pathologists and data scientists is a fundamental obstacle that must be bridged for AI tools to become truly useful and safe in patient care settings (19).
4. The Ugly: Ethical, Political, and Strategic Implications
The promises and challenges of foundation models intertwine with deeper, systemic implications. These “uglies” are complex and multifaceted, touching on issues of governance, equity, and power that extend far beyond the raw performance of any algorithm.
4.1. Data Governance and Privacy Risks
Digital pathology relies on the collection, storage, and analysis of enormous volumes of sensitive data. Managing whole-slide images (WSIs) and their associated metadata poses significant risks for patient privacy. Regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States and the General Data Protection Regulation (GDPR) in Europe impose strict requirements on the handling of protected health information (PHI) and personal data (20).
Yet anonymizing WSIs is technically complex and anything but trivial (21). Personal identifiers (such as case codes or barcodes) are often printed directly on glass slide labels and subsequently captured in the digital image (21). Removing this information without compromising file integrity requires specialized tools.
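A minimal safeguard, assuming the openslide-python bindings are installed and that the scanner stores the slide label and macro photograph as "associated images", is to flag files that still carry them before any sharing or upload. The sketch below only detects the risk; actually stripping those images without corrupting the file requires a format-aware anonymization tool (21).

```python
import sys
import openslide

# Pre-ingestion check (assumes openslide-python is installed): flag WSIs whose container
# still carries the slide "label" or "macro" photo, which often shows case numbers or
# barcodes. Detection only; removal must be done with a format-aware anonymization tool.
RISKY_KEYS = {"label", "macro"}

def flag_identifying_images(path: str) -> None:
    slide = openslide.OpenSlide(path)
    found = RISKY_KEYS & set(slide.associated_images.keys())
    if found:
        print(f"{path}: contains {sorted(found)} - review before sharing")
    else:
        print(f"{path}: no label/macro images found")
    slide.close()

if __name__ == "__main__":
    for wsi_path in sys.argv[1:]:     # e.g., python check_wsi_labels.py case_001.svs (hypothetical usage)
        flag_identifying_images(wsi_path)
```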
Moreover, storing these massive datasets, which can reach petabyte-scale per hospital (22), demands robust governance frameworks to ensure security both “at rest” (stored data) and “in transit” (shared data) (23). The lack of standardized governance models, combined with differences in national privacy laws (e.g., GDPR’s explicit consent requirement vs. HIPAA’s allowance for certain disclosures (24)), complicates international collaboration and hinders the creation of truly global data consortia (20).
4.2. The Global Health Equity Gap
If not managed proactively, the deployment of foundation models risks exacerbating existing health inequities rather than reducing them (16). The high cost of computing infrastructure, the requirement for high-end scanners, and the dependence on robust internet connectivity (22) restrict adoption primarily to the wealthiest hospitals and healthcare systems.
Low- and middle-income countries—those that could benefit the most from telepathology to offset specialist shortages—often lack the infrastructure required to transfer and store massive WSIs (22).
This challenge is compounded by algorithmic bias. Foundation models are trained on available datasets, which overwhelmingly originate from high-income countries and predominantly Caucasian patient populations (25). Models trained on these privileged cohorts may perpetuate and even amplify biases inherent to clinical practice and research, performing suboptimally in populations with different genetic backgrounds, ethnicities, or lifestyles (25).
The result is a vicious cycle: the most advanced technologies disproportionately benefit those already well-served by high-quality care, while underrepresented populations remain at a disadvantage (26).
4.3. Regulatory and Reimbursement Barriers
Technological innovation advances at an exponential pace, while regulatory frameworks evolve only incrementally. Agencies such as the FDA and EMA are in the process of drafting guidelines to evaluate the safety and efficacy of AI-based medical devices. However, the traditional validation model, requiring full revalidation for every software modification, does not align well with the adaptive, continuously learning nature of foundation models (27).
This mismatch creates regulatory uncertainty that slows both investment and commercialization. To date, only a very small number of pathology AI tools have received FDA clearance, a stark contrast with radiology, where hundreds of AI solutions have already been approved (28).
In addition, the reimbursement landscape represents a critical challenge to widespread adoption. Many payers, particularly in the United States, have yet to assign CPT (Current Procedural Terminology) codes specific to AI-based pathology analysis. Without a clear mechanism for billing and reimbursement, hospitals struggle to justify the significant investments required in digital infrastructure (scanners, storage, cloud computing) (29).
This business challenge discourages investment and hampers the implementation of digital pathology in daily clinical practice, perpetuating a cycle of innovation that too often remains confined to research, without reaching patients (29).
5. Toward a Future Vision: Closing Thoughts and Strategic Proposals
The analysis of foundation models in digital pathology reveals a landscape of immense promise, significant technical hurdles, and profound ethical implications. “The Good” points to a future of faster, more precise diagnoses, free from the bottleneck of manual annotation and empowered by multimodal integration. “The Bad” reminds us of formidable entry barriers, algorithmic opacity, and the risk of uneven performance in real-world settings. And “The Ugly” confronts us with a sobering possibility: that this technology could exacerbate existing inequities and consolidate power in the hands of a few unless we act proactively.
The defining question for the future of digital pathology is this: Will we choose a closed ecosystem, dominated by monopolistic actors, or will we build an open and collaborative future for the common good?
To secure a sustainable and equitable future, several strategic solutions are imperative:
5.1. Collaboration and Open Innovation
To counteract power centralization and mitigate algorithmic bias, large-scale collaboration is essential. Building international data consortia, funded through public-private partnerships, can democratize access to training datasets. Initiatives such as the NCI Cancer Research Data Commons in the United States and the ECRIN network in Europe already serve as models, enabling access to vast cancer datasets for research and validation while adhering to strict security and privacy standards.
Such consortia must ensure that datasets reflect true global diversity, producing models that are not only powerful but equitable. Promoting open-source foundation models, such as H-optimus-0, empowers the global community to build upon existing progress without being locked into proprietary solutions.
5.2. Investment in Education and Digital Literacy
The translational gap and the challenge of explainability cannot be solved without a prepared workforce. It is crucial to invest in the education of the next generation of pathologists. The pathology curriculum must evolve to include informatics, machine learning, biostatistics, and AI ethics.
A digitally literate pathologist will not only know how to use AI tools but will also understand their limitations, interpret outputs critically, and serve as a strategic co-pilot in diagnosis. Continuous education programs and professional initiatives such as the Digital Pathology Topic Center of the College of American Pathologists (CAP) are essential to ensure practicing pathologists keep pace with rapid technological advances.
5.3. Agile Regulatory Frameworks and Fair Reimbursement
Regulation must evolve in step with innovation. Regulators need to design frameworks that allow foundation models to be updated without requiring full re-approval for every incremental change. The FDA, for instance, has begun exploring this through Predetermined Change Control Plans.
At the same time, healthcare systems must establish reimbursement models that recognize the clinical and operational value of AI in pathology. By incentivizing adoption and allowing hospitals to recoup their digital infrastructure investments, we can create an environment where innovation moves from research labs into clinical practice where it truly matters.
5.4. The Future of the Pathologist in the Age of AI
Ultimately, the future of pathology is not one of replacement, but of augmentation. Foundation models are not here to displace pathologists, but to amplify their expertise and unlock their full potential. AI will increasingly assume quantitative and repetitive tasks (pattern detection, triaging cases) while pathologists will be free to focus on the most complex and rewarding aspects of their profession: clinicopathological correlation, the interpretation of ambiguous cases, the discovery of novel biomarkers, translational research, and leadership within multidisciplinary teams.
Pathology is evolving from a purely diagnostic discipline into one that is truly predictive and prescriptive. In this future, the pathologist becomes nothing less than an architect of precision care.
Foundation models are not the end of the story, but the foundation upon which the next chapter of pathology will be built. A chapter defined by the symbiotic collaboration between human expertise and artificial intelligence, designed not for the technology itself, but for the ultimate benefit of patients worldwide.
References
Quirónsalud [Internet]. [cited 2025 Sept 5]. Patología digital. Available from: https://www.quironsalud.com/es/tecnologia-punta/patologia-digital
Parole. Modelos Fundacionales: Todo lo Que Necesitas Saber [Internet]. Parole Developers. 2024 [cited 2025 Sept 5]. Available from: https://paroledevs.ai/modelos-fundacionales/
F5, Inc. [Internet]. [cited 2025 Sept 5]. Modelos básicos de IA. Available from: https://www.f5.com/es_es/glossary/foundational-ai-models
Paige.ai [Internet]. [cited 2025 Sept 5]. How Foundation Models Can Transform Pathology. Available from: https://www.paige.ai/blog/how-foundation-models-can-transform-pathology
Yoo S. The State of Foundation Models in Computational Pathology: A Comprehensive Survey [Internet]. Medium. 2025 [cited 2025 Sept 5]. Available from: https://medium.com/@loomy.sjyoo/the-state-of-foundation-models-in-computational-pathology-a-comprehensive-survey-ac4fd2a62ff0
Vorontsov E, Bozkurt A, Casson A, Shaikovski G, Zelechowski M, Severson K, et al. A foundation model for clinical-grade computational pathology and rare cancers detection. Nature medicine. 2024 Oct;30(10):2924–35.
Adem AM, Kant R, Sonia, Kumar K, Mittal V, Jain P. Exploring Self-Supervised Learning for Disease Detection and Classification in Digital Pathology: A Review. Biomedical and Pharmacology Journal. 2025 Feb 20;18(March Spl Edition):59–72.
doctorsapiens.io. IA en Hospitales: La Revolución Silenciosa del NHS en las Altas Médicas - doctorsapiens.io [Internet]. 2025 [cited 2025 Sept 5]. Available from: https://doctorsapiens.io/ia-en-hospitales-la-revolucion-silenciosa-del-nhs-en-las-altas-medicas/
Paige.ai [Internet]. [cited 2025 Sept 5]. Paige Releases PRISM2: A Whole-Slide Foundation Model for Multimodal AI in Pathology and Cancer Care. Available from: https://www.paige.ai/press-releases/paige-releases-prism2-a-whole-slide-foundation-model-for-multimodal-ai
Yu S, Kim Y, Kim H, Lee S, Kim K. Contrastive Learning for Omics-guided Whole-slide Visual Embedding Representation [Internet]. bioRxiv; 2025 [cited 2025 Sept 5]. p. 2025.01.12.632280. Available from: https://www.biorxiv.org/content/10.1101/2025.01.12.632280v1
Simon BD, Ozyoruk KB, Gelikman DG, Harmon SA, Türkbey B, et al. The future of multimodal artificial intelligence models for integrating imaging and clinical metadata: a narrative review. Diagnostic and Interventional Radiology [Internet]. 2025 July 8 [cited 2025 Sept 5]; Available from: https://dirjournal.org/articles/the-future-of-multimodal-artificial-intelligence-models-for-integrating-imaging-and-clinical-metadata-a-narrative-review/dir.2024.242631
¿Qué es la patología digital? | Leica Biosystems [Internet]. [cited 2025 Sept 5]. Available from: https://www.leicabiosystems.com/es/knowledge-pathway/digital-pathology/
Poole EO Richard. CUDO Compute. 2025 [cited 2025 Sept 5]. What is the cost of training large language models? Available from: https://www.cudocompute.com/blog/what-is-the-cost-of-training-large-language-models
PAIGE - 2025 Company Profile & Team - Tracxn [Internet]. 2025 [cited 2025 Sept 5]. Available from: https://tracxn.com/d/companies/paige/__GHxt9EXIE3bkX1RWvhDqvOIYp3JqqrWJniNvebDk7AY
Hale C. Tempus claims AI pathology developer Paige in $81M deal [Internet]. 2025 [cited 2025 Sept 5]. Available from: https://www.fiercebiotech.com/medtech/tempus-claims-ai-pathology-developer-paige-81m-deal
Combatiendo la desigualdad con inteligencia artificial responsable | IDRC - International Development Research Centre [Internet]. 2023 [cited 2025 Sept 5]. Available from: https://idrc-crdi.ca/es/historias/combatiendo-la-desigualdad-con-inteligencia-artificial-responsable
Inteligencia artificial en salud: retos éticos y científicos. Barcelona: Fundació Víctor Grifols i Lucas; 2023.
¿Qué es la IA explicable (XAI)? | IBM [Internet]. 2023 [cited 2025 Sept 5]. Available from: https://www.ibm.com/es-es/think/topics/explainable-ai
Gu Q, Patel A, Hanna MG, Lennerz JK, Garcia C, Zarella M, et al. Bridging the Clinical-Computational Transparency Gap in Digital Pathology. Arch Pathol Lab Med. 2025 Mar 1;149(3):276–87.
HIPAA vs. GDPR compliance: what’s the difference? [Internet]. [cited 2025 Sept 5]. Available from: https://www.onetrust.com/blog/hipaa-vs-gdpr-compliance/
Bisson T, Franz M, O I, Romberg D, Jansen C, Hufnagl P, et al. Anonymization of whole slide images in histopathology for research and education. DIGITAL HEALTH. 2023 May 9;9:205520762311714.
Alexandrov A. Overcoming Challenges in Digital Pathology: Why Do On-premises-first Hybrid Workflows Matter? [Internet]. Tiger Technology. 2023 [cited 2025 Sept 5]. Available from: https://www.tiger-technology.com/overcoming-challenges-in-digital-pathology-why-do-on-premises-first-hybrid-workflows-matter/
Three Data Management Considerations for Digital Pathology - Akoya [Internet]. 2020 [cited 2025 Sept 5]. Available from: https://www.akoyabio.com/blog/three-data-management-considerations-for-digital-pathology/
Arjan D. HIPAA & GDPR - What are the differences? [Internet]. 2023 [cited 2025 Sept 5]. Available from: https://blog.medicai.io/en/hipaa-gdpr-what-are-the-differences/
Amaya-Santos S, Jiménez-Pernett J, Bermúdez-Tamayo C. ¿Salud para quién? Interseccionalidad y sesgos de la inteligencia artificial para el diagnóstico clínico. Anales del Sistema Sanitario de Navarra [Internet]. 2024 Aug [cited 2025 Sept 5];47(2). Available from: https://scielo.isciii.es/scielo.php?script=sci_abstract&pid=S1137-66272024000200001&lng=es&nrm=iso&tlng=es
AAMC [Internet]. [cited 2025 Sept 5]. Provide Equal Access to AI. Available from: https://www.aamc.org/about-us/mission-areas/medical-education/principles-ai-use/provide-equal-access-ai
Center for Devices and Radiological Health. Artificial Intelligence in Software as a Medical Device. FDA [Internet]. 2025 [cited 2025 Sept 5]; Available from: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-software-medical-device
Bruns V. Digital Pathology AI Companies -- List of Commercial AIs (CE-IVDR, FDA, RUO) [Internet]. SMART SENSING insights. 2025 [cited 2025 Sept 5]. Available from: https://websites.fraunhofer.de/smart-sensing-insights/digital-pathology-ais/
Mitchell A. Key Considerations When Adding Digital Pathology [Internet]. Lighthouse Lab Services. 2024 [cited 2025 Sept 5]. Available from: https://www.lighthouselabservices.com/key-considerations-when-adding-digital-pathology/