AI in Pathology: The Case for Augmentation, Not Automation

Every few months, a new headline announces that AI will automate pathologists' jobs. The story is predictable: a model achieves performance matching a panel of pathologists on a specific classification task, and the coverage extrapolates from "matches performance on this narrow benchmark" to "will replace pathologists." Both the extrapolation and the underlying premise are wrong, and they do genuine harm to the clinical adoption of tools that would actually benefit patients.

I want to be precise about this, because I am in the business of building AI tools for pathology, and I have a stake in getting the framing right.

What Pathologist Expertise Actually Is

The role of a pathologist in cancer diagnostics is not primarily counting cells. It is not, at its core, a quantification task at all. It is a synthesis task: integrating morphological evidence (cell architecture, nuclear grade, mitotic activity, stromal response), immunohistochemical evidence (marker expression patterns, heterogeneity, co-expression), clinical context (patient history, prior treatment, radiographic findings), and a large body of pattern-matching experience built over years of subspecialty training, to arrive at a diagnostic interpretation that has meaning for the specific patient in front of you.

IHC biomarker quantification — counting Ki-67-positive nuclei, measuring HER2 membrane staining intensity, estimating the proportion of PD-L1-positive tumor cells — is one input into that synthesis. It is an important input. It is also a task that is tedious, time-consuming, and, as the inter-observer variability data shows, not reproducible enough when done manually to be a reliable foundation for treatment decisions.

An algorithm that automates the quantification task is not automating pathologist expertise. It is automating the measurement step that produces one input to the synthesis. The synthesis — what the score means for this patient, in this tumor, with this clinical context — remains entirely within the pathologist's domain, and there is no plausible near-term technical path to automating it.

The Dangerous Framing Problem

When AI tools are framed as replacements for pathologists, two bad things happen. First, pathologists resist adopting tools they perceive as a threat to their professional role. This is not irrational conservatism; it is a reasonable response to the threat framing, which pushes pathologists to approach clinical AI with adversarial rather than evaluative skepticism. The result is that genuinely useful tools face adoption barriers built from a miscommunication, not from evidence that the tools don't work.

Second, the replacement framing sets up an evaluation framework — "does the AI perform as well as a pathologist?" — that asks the wrong question. The right question is "does the AI, used as a tool by a pathologist, produce better diagnostic outcomes than the pathologist working without it?" These are different questions, and the answer to the second is often yes even when the answer to the first is mixed. A pathologist who reviews a pre-computed algorithmic Ki-67 hotspot score with a supporting spatial heatmap is making a different and often better-informed decision than a pathologist estimating the same score manually under time pressure — even if the algorithm's standalone performance on a benchmark dataset is slightly below the expert panel average.

I am not saying that AI performance benchmarks are unimportant. They are important: they are the evidence that the algorithm can be trusted as an input. What I am saying is that benchmarks measure algorithm performance in isolation, and clinical utility requires the algorithm to work in a workflow context alongside a pathologist who brings judgment that the algorithm cannot replicate.

What Automation Should Actually Target

The tasks in IHC pathology review that are legitimate targets for automation share specific characteristics: they are well-defined (the scoring criteria are published and explicit), they are repetitive (the same visual counting task applied to hundreds of cells per slide), they are time-consuming relative to their cognitive demand (a pathologist spending 20 minutes counting Ki-67-positive nuclei in a hotspot region is consuming expert time on a data collection task), and they are a source of reproducibility problems (manual counting introduces intra- and inter-observer variability that is structurally avoidable).

Manual IHC quantification meets all four criteria. Diagnostic synthesis — interpreting what the score means, integrating it with the histomorphology, making the final diagnostic call — meets none of them. The automation target is narrow and specific, and staying within that boundary is both more honest and more practically effective than claiming a broader scope.
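To make concrete how narrow the measurement step is, here is a minimal sketch of the arithmetic behind a Ki-67 labeling index, assuming the positive and negative nuclei in a hotspot have already been detected and counted. The function name and signature are illustrative, not any product's API, and the hard part in practice (detecting and classifying the nuclei) is not shown.

```python
def ki67_labeling_index(positive_nuclei: int, negative_nuclei: int) -> float:
    """Return the percentage of counted nuclei that are Ki-67 positive.

    Illustrative helper (hypothetical name): it performs only the final
    arithmetic and says nothing about what the number means for a patient.
    """
    if positive_nuclei < 0 or negative_nuclei < 0:
        raise ValueError("nuclei counts must be non-negative")
    total = positive_nuclei + negative_nuclei
    if total == 0:
        raise ValueError("no nuclei were counted in the region")
    return 100.0 * positive_nuclei / total


# Example: 150 positive nuclei out of 500 counted in a hotspot.
index = ki67_labeling_index(150, 350)
print(f"Ki-67 labeling index: {index:.1f}%")
```

Everything that requires expertise sits outside this function: which region counts as the hotspot, whether the stain is interpretable, and what the resulting percentage implies for this tumor.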

The Accountability Architecture

In a well-designed clinical workflow, the algorithmic IHC score is an input that the pathologist reviews, evaluates, and incorporates into — or overrides in — the final report. The pathologist signs off. The diagnostic report is the pathologist's, not the algorithm's. This is not a limitation of the tool; it is the correct architecture for a clinical decision support tool in a high-stakes medical context.

The CLIA lab framework, FDA's digital health guidance, and the professional standards of pathology all converge on this point: software produces measurements, pathologists make diagnoses. A tool designed around this architecture — where the algorithm provides precise, reproducible, supporting data and the pathologist provides clinical judgment — is more likely to be adopted, more likely to be trusted, and more likely to be used correctly than one designed around the fantasy of full automation.

At Synthia, we built around this architecture from the start. The scoring output includes the discrete grade, the continuous underlying probability, the spatial cell-level heatmap, and the count of positive and negative nuclei — everything a pathologist needs to verify and evaluate the score, not just accept it. The pathologist's sign-off is a required step in every workflow we design for. That is not a concession to regulation. It is a design principle, because a tool that pathologists trust and actively engage with will do more good than one that they tolerate and work around.
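One way to picture that principle in code is the sketch below. It is a simplified illustration under stated assumptions, not Synthia's actual data model: the class names, field names, and the `release` method are all hypothetical. It shows a score that carries its supporting evidence and a report that cannot be released until a pathologist has signed it, with an override recorded when the pathologist disagrees.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AlgorithmicScore:
    """Measurement output plus the evidence needed to verify it (hypothetical schema)."""
    grade: str                           # discrete grade, e.g. "2+"
    probability: float                   # continuous value underlying the grade
    heatmap: Sequence[Sequence[float]]   # coarse spatial map of cell-level positivity
    positive_nuclei: int
    negative_nuclei: int


@dataclass
class CaseReport:
    """The report belongs to the pathologist; the score is only an input."""
    score: AlgorithmicScore
    final_grade: Optional[str] = None
    signed_by: Optional[str] = None

    def sign_off(self, pathologist: str, override_grade: Optional[str] = None) -> None:
        # The pathologist either accepts the algorithmic grade or overrides it.
        if not pathologist.strip():
            raise ValueError("a named pathologist must sign the report")
        self.final_grade = override_grade if override_grade is not None else self.score.grade
        self.signed_by = pathologist

    def release(self) -> dict:
        # Sign-off is a hard requirement, not an optional step.
        if self.signed_by is None:
            raise PermissionError("report cannot be released without pathologist sign-off")
        return {
            "final_grade": self.final_grade,
            "algorithmic_grade": self.score.grade,
            "overridden": self.final_grade != self.score.grade,
            "signed_by": self.signed_by,
        }


score = AlgorithmicScore("2+", 0.71, [[0.1, 0.8], [0.6, 0.9]], 140, 360)
report = CaseReport(score)
report.sign_off("Dr. Example", override_grade="3+")
print(report.release())
```

Keeping the algorithmic grade alongside the final grade in the released record means overrides stay visible, which is the kind of active engagement the workflow is meant to encourage.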

Pathology AI that makes pathologists more capable of doing the work that requires their expertise is worth building. Pathology AI that tries to substitute for that expertise is, at best, premature — and the hype cycle around it costs us real clinical adoption of real tools that really help.