HER2 2+ Equivocal Cases: How Automated Flagging Reduces Unnecessary ISH Reflex Testing

In breast and gastric cancer pathology, the HER2 2+ equivocal designation occupies a clinically and operationally uncomfortable space. It exists precisely because the visual evidence does not clearly support either a positive (3+) or a negative (0/1+) call: staining is circumferential and complete, but the intensity sits in a range that the ASCO/CAP guidelines acknowledge as not clearly positive and not clearly negative. The standard response — reflex ISH testing — is the right answer for the ambiguous population as a whole. But within the 2+ population, there is a meaningful subgroup of cases where the IHC staining pattern is statistically more likely to reflect underlying gene amplification, and another subgroup where it is more likely to reflect polysomy or other non-amplification causes.

If we could reliably distinguish these subgroups on IHC alone — or at least meaningfully stratify the probability — we could make ISH reflex testing decisions more intelligently, reducing unnecessary testing in low-probability cases while ensuring that high-probability cases are appropriately escalated.

The 2+ Reflex Testing Rate and Its Costs

ISH reflex testing (most commonly FISH for HER2 gene amplification, using dual-probe assays for HER2 and CEP17) adds measurable cost and turnaround time to every case it is ordered for. The reagent and technical cost of a FISH assay at academic centers commonly falls in the range of $300–600 per case depending on institution and probe vendor. Turnaround time for FISH, from slide receipt to enumeration and interpretation, is typically 3–5 business days — representing a substantial addition to the time before HER2 status is definitively determined and treatment planning can proceed.

The reflex rate at any given institution reflects both the population's underlying HER2 2+ prevalence and the local staining and scoring practices. Estimates from published quality reports and pathology department data suggest that 10–20% of all invasive breast cancer IHC cases are scored 2+, triggering ISH reflex. Of those 2+ cases referred for ISH, typically 15–25% are ultimately confirmed as amplified, meaning that 75–85% of ISH reflex testing is performed on cases that will ultimately be reported as HER2-negative. These proportions vary by institution and population, but the direction is consistent: most ISH reflex testing on 2+ cases returns a non-amplified result.
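The arithmetic behind these proportions can be made concrete with a short sketch. The function name and the batch size of 1,000 cases are illustrative assumptions; the rates and per-test costs are the ranges quoted above, not measurements from any specific institution.

```python
def reflex_workload(n_cases, frac_2plus, frac_amplified, cost_per_fish):
    """Estimate ISH reflex volume and spend for a batch of IHC cases.

    All inputs are assumptions supplied by the caller; nothing here is
    derived from real institutional data.
    """
    n_reflex = n_cases * frac_2plus            # cases scored 2+ and sent to ISH
    n_amplified = n_reflex * frac_amplified    # reflex cases confirmed amplified
    n_non_amplified = n_reflex - n_amplified   # reflex cases reported non-amplified
    return {
        "reflex_tests": n_reflex,
        "amplified": n_amplified,
        "non_amplified": n_non_amplified,
        "spend_on_non_amplified": n_non_amplified * cost_per_fish,
    }


# Two ends of the quoted ranges, per 1,000 invasive breast cancer IHC cases.
low_end = reflex_workload(1000, frac_2plus=0.10, frac_amplified=0.25, cost_per_fish=300)
high_end = reflex_workload(1000, frac_2plus=0.20, frac_amplified=0.15, cost_per_fish=600)

for label, result in (("low end", low_end), ("high end", high_end)):
    print(label, {key: round(value) for key, value in result.items()})
```

Under these assumed inputs, between 75 and 170 of every 1,000 IHC cases receive an ISH test that returns a non-amplified result, at a reagent and technical cost of roughly $22,500 to $102,000.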

Reducing the reflex rate on clearly low-probability 2+ cases (those where the IHC staining pattern strongly predicts a non-amplified ISH result) would reduce cost and turnaround time without increasing the risk of missing amplified cases, but only if the stratification is accurate. That condition carries the whole argument: a stratifier that places amplified cases in the low-probability group trades ISH savings for missed HER2-positive diagnoses.
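One way to check whether a stratification is accurate enough is to measure, on cases with known ISH results, how many tests a given score cutoff would have avoided and how many amplified cases it would have missed. The sketch below is a validation measurement, not a decision rule. The function name, the cutoff values, and the ten-case data set are all invented for illustration and do not come from any study.

```python
def evaluate_deferral_threshold(cases, threshold):
    """Summarize what deferring ISH below `threshold` would have done.

    cases: list of (score, is_amplified) pairs with known ISH outcomes.
    Cases scoring below `threshold` are treated as deferral candidates.
    """
    deferred = [amplified for score, amplified in cases if score < threshold]
    total_amplified = sum(1 for _, amplified in cases if amplified)
    missed = sum(1 for amplified in deferred if amplified)
    return {
        "tests_avoided_fraction": len(deferred) / len(cases),
        "amplified_missed": missed,
        # Share of amplified cases that would still have gone to ISH.
        "sensitivity": (total_amplified - missed) / total_amplified
        if total_amplified else None,
        # Share of deferred cases that really were non-amplified.
        "npv_of_deferral": (len(deferred) - missed) / len(deferred)
        if deferred else None,
    }


# Made-up validation set for illustration only; not real patient data.
toy_cases = [
    (0.41, False), (0.45, False), (0.48, False), (0.50, True),
    (0.52, False), (0.55, False), (0.58, False), (0.63, True),
    (0.68, False), (0.72, True),
]

conservative = evaluate_deferral_threshold(toy_cases, threshold=0.50)
aggressive = evaluate_deferral_threshold(toy_cases, threshold=0.55)
print(conservative)
print(aggressive)
```

On this toy set the lower cutoff misses no amplified case, while the higher cutoff avoids more tests but misses one. That is the trade-off a real validation study would need to quantify on a much larger cohort.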

What the IHC Staining Pattern Can and Cannot Predict

Published studies examining the relationship between IHC staining characteristics and ISH amplification status in HER2 2+ cases have identified several IHC features that are statistically associated with amplification probability. The first is membrane staining completeness: circumferential staining that is continuous across essentially 100% of positive cells correlates with higher amplification rates than intermittent circumferential staining. The second is DAB optical density at the membrane relative to background, where higher relative intensity within the 2+ range is associated with higher amplification probability. The third is the proportion of cells with strong circumferential staining within a technically 2+ case: a case where 30% of cells show near-3+ intensity staining is a different case from one where 30% of cells show uniform moderate-intensity staining.

None of these features, individually or in combination, provides diagnostic certainty. An algorithmic confidence score for HER2 amplification based on IHC features alone cannot replace ISH for definitive amplification status. What it can provide is a probability estimate that quantifies how close to the 2+/3+ boundary a given case falls — and that probability estimate is actionable information even without replacing the ISH test.

How Algorithmic Confidence Scoring Works in Practice

An algorithmic approach to 2+ equivocal flagging operates by reporting not just the discrete 2+ category assignment but a continuous underlying score reflecting the model's position on the spectrum from clearly-low-end-2+ to clearly-high-end-2+. A case with a continuous score of 0.72 (on a 0–1 scale anchored at 0 = unambiguous 0/1+ and 1 = unambiguous 3+) tells the reviewing pathologist something different from a case with a score of 0.54, even if both are formally categorized as 2+.
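A minimal sketch of reporting the category together with the underlying score follows. The two band boundaries (0.40 and 0.75) are hypothetical placeholders, since the text does not say where the 2+ band starts and ends on the 0–1 scale. The `IhcResult` type and `describe_ihc` function are likewise illustrative names, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical category boundaries on the 0-1 scale (assumed, not from the text).
TWO_PLUS_LOWER = 0.40
TWO_PLUS_UPPER = 0.75


@dataclass
class IhcResult:
    category: str                      # discrete call: "0/1+", "2+", or "3+"
    score: float                       # continuous score on the 0-1 scale
    position_in_band: Optional[float]  # 0 = low end of 2+, 1 = 3+ boundary


def describe_ihc(score: float) -> IhcResult:
    """Map a continuous score to a category plus its position within 2+."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score < TWO_PLUS_LOWER:
        return IhcResult("0/1+", score, None)
    if score >= TWO_PLUS_UPPER:
        return IhcResult("3+", score, None)
    position = (score - TWO_PLUS_LOWER) / (TWO_PLUS_UPPER - TWO_PLUS_LOWER)
    return IhcResult("2+", score, round(position, 2))


high_2plus = describe_ihc(0.72)
mid_2plus = describe_ihc(0.54)
print(high_2plus)
print(mid_2plus)
```

Both example cases receive the same discrete 2+ call, but the band position separates the one that sits close to the 3+ boundary from the one near the middle of the band.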

The clinical use of this information is not to skip ISH reflex automatically on 2+ cases that score at the low end of the range. That would substitute algorithmic judgment for the pathologist's clinical decision authority in a way that is both medically inappropriate and inconsistent with the current investigational status of these algorithms. The appropriate use is to surface this information as part of the pathologist's decision context: a case scored 2+ with an underlying confidence score of 0.55 (squarely mid-range) and a case scored 2+ with a confidence score of 0.71 (high end, approaching the 3+ boundary) may both appropriately receive ISH reflex, but the second case deserves more careful attention in the ISH interpretation context, and the clinical team should know that the IHC was near-3+.

We are not saying that algorithmic confidence stratification eliminates the need for pathologist judgment in reflex testing decisions. We are saying that the pathologist making that decision should have the algorithmic confidence data available as part of the decision context, in the same way that they have the clinical history and the overall tumor grade available.

The Heterogeneous HER2 Complication

The 2018 ASCO/CAP update introduced specific guidance on HER2 heterogeneity — when a tumor shows different HER2 expression levels in different regions, the reporting requirements changed to mandate documentation of the spatial distribution of staining. A 2+ case with heterogeneous staining (some regions approaching 3+, others clearly 1+) is meaningfully different from a uniform 2+ case, and algorithmic analysis is well-positioned to characterize the spatial distribution precisely.

In a heterogeneous 2+ case, the proportion of the tumor area showing near-3+ versus low-2+ staining, the spatial extent of the high-staining regions, and whether those regions are contiguous or scattered are all features that manual pathologist review can estimate but not easily quantify. An algorithm operating across the full whole-slide image can produce a spatial staining density map that quantifies exactly these features: how much of the invasive tumor shows each staining intensity level, and where those regions are located.
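As a rough illustration of what such a map can yield, the sketch below assumes the whole-slide image has already been divided into tiles and each tumor tile has been given an intensity label. The tile grid, the label names, and both function names are assumptions for illustration; a production pipeline would work on far larger grids and use its own labeling scheme.

```python
from collections import Counter, deque


def intensity_fractions(grid):
    """Fraction of tumor tiles at each intensity label (None = non-tumor)."""
    counts = Counter(label for row in grid for label in row if label is not None)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


def connected_regions(grid, target):
    """Sizes of 4-connected regions of tiles labeled `target`, largest first."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited target tile.
            queue = deque([(r, c)])
            seen.add((r, c))
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == target and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            sizes.append(size)
    return sorted(sizes, reverse=True)


# Toy 3x4 tile grid, invented for illustration; None marks a non-tumor tile.
tiles = [
    ["near-3+", "near-3+", "low-2+", "1+"],
    ["near-3+", "low-2+", "low-2+", "1+"],
    ["low-2+", "low-2+", None, "near-3+"],
]

fractions = intensity_fractions(tiles)
high_regions = connected_regions(tiles, "near-3+")
print(fractions)
print(high_regions)
```

In the toy grid, four of the eleven tumor tiles are near-3+, and they form one contiguous region of three tiles plus one isolated tile. That is the kind of contiguous-versus-scattered summary that is hard to state precisely from visual review alone.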

This spatial characterization is particularly relevant for needle core biopsies, where tissue sampling may not represent the full tumor. A core biopsy showing heterogeneous 2+ staining with a high-intensity subregion may be sampling a tumor where the HER2-overexpressing clone is spatially concentrated — information that is relevant to clinical management even before ISH results are available.

Integration with the ISH Report

One underutilized opportunity is using the IHC algorithmic output to improve ISH interpretation in the reflex workflow. When an ISH result comes back for a 2+ case, the HER2/CEP17 ratio and the absolute HER2 copy number are interpreted in the context of the five ISH groups defined in the 2018 ASCO/CAP guidelines. For groups 2, 3, and 4 (where ISH amplification is not clear-cut), the concurrent IHC status is required as part of the interpretation algorithm.

Specifically, the 2018 guidelines require that a case with ISH group 2 (HER2/CEP17 ratio ≥ 2.0, average HER2 signals per cell < 4.0) be classified as positive only if the concurrent IHC is 3+. If the IHC is 2+, the case is reported as negative once a recount by a second, blinded observer confirms the ISH result. A continuous algorithmic confidence score for HER2 IHC, embedded in the case record, would provide the ISH interpreting pathologist with a more precise characterization of "how 2+" the IHC was when navigating these complex ISH-IHC interaction rules. A 2+ that is algorithmically near 3+ has different clinical implications in the ISH group 2 context than a 2+ that is solidly mid-range.
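The interaction between ISH group and concurrent IHC can be sketched as a small lookup that carries the continuous IHC score along as context. Only the Group 2 rule is taken from the text above. The numeric thresholds for Groups 1, 3, 4, and 5 are filled in from the 2018 guideline as it is commonly summarized and should be verified against the guideline itself. Groups 3 and 4 are deliberately left unresolved, and any recount step is not modeled. All function names are illustrative.

```python
def ish_group(ratio, her2_copies):
    """Assign an ISH group from HER2/CEP17 ratio and mean HER2 signals per cell.

    Thresholds other than Group 2 are assumptions to verify against the guideline.
    """
    if ratio >= 2.0:
        return 1 if her2_copies >= 4.0 else 2
    if her2_copies >= 6.0:
        return 3
    if her2_copies >= 4.0:
        return 4
    return 5


def needs_concurrent_ihc(group):
    """Groups 2, 3, and 4 are interpreted together with the IHC result."""
    return group in (2, 3, 4)


def interpret_case(ratio, her2_copies, ihc_category, ihc_score=None):
    """Return the ISH group, a provisional call, and the IHC context."""
    group = ish_group(ratio, her2_copies)
    if group == 1:
        call = "positive"
    elif group == 5:
        call = "negative"
    elif group == 2:
        # Group 2 rule as described in the text: positive only with IHC 3+.
        call = "positive" if ihc_category == "3+" else "negative"
    else:
        call = "refer to guideline"  # Groups 3 and 4 are not encoded here
    return {
        "ish_group": group,
        "needs_concurrent_ihc": needs_concurrent_ihc(group),
        "call": call,
        "ihc_category": ihc_category,
        "ihc_score": ihc_score,  # shown to the pathologist; does not alter the call
    }


near_3plus = interpret_case(2.3, 3.5, "2+", ihc_score=0.71)
mid_range = interpret_case(2.3, 3.5, "2+", ihc_score=0.55)
print(near_3plus)
print(mid_range)
```

The two example cases receive the same provisional call. The score is carried in the record purely as context for the interpreting pathologist, which matches the supporting role described above.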

The goal of all of this is straightforward: provide the pathologist with more precise characterization of the IHC signal so that the clinical decisions built on top of that signal are better informed. The pathologist's judgment does not disappear from this workflow — it is engaged with better data.