
Architecture & Methodology

The Science Behind Synthia

Deep learning for IHC biomarker quantification — trained on pathologist-annotated whole-slide images from academic cancer centers.

MODEL PIPELINE

WSI INPUT
20× / 40× magnification
TILE EXTRACTION
256×256px patches
CNN + ATTENTION
ResNet-50 + Transformer
SCORE AGGREGATION
Per ASCO/CAP guidelines

Convolutional + Attention Architecture for WSI Analysis

Whole-slide images present a unique computational challenge: a single scan at 40× magnification can exceed 100,000 × 80,000 pixels. Direct processing is computationally intractable and unnecessary — relevant diagnostic information is localized at the cellular level.
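To make the scale concrete, the arithmetic can be worked through in a few lines of Python. This is a back-of-the-envelope sketch, not part of Synthia's code. It assumes uncompressed 8-bit RGB, non-overlapping tiles, and that the 20× level is a 2× downsample of the 40× scan. The function name `tile_grid` is hypothetical.

```python
def tile_grid(width_px, height_px, tile_px=256):
    """Return (columns, rows, total) of non-overlapping tiles covering the image."""
    cols = -(-width_px // tile_px)   # ceiling division
    rows = -(-height_px // tile_px)
    return cols, rows, cols * rows

# Example slide size quoted above (40x scan).
width_40x, height_40x = 100_000, 80_000

# Assumption: uncompressed 8-bit RGB, i.e. 3 bytes per pixel.
raw_gigabytes = width_40x * height_40x * 3 / 1e9

tiles_40x = tile_grid(width_40x, height_40x)
# Assumption: the 20x level halves each dimension of the 40x scan.
tiles_20x = tile_grid(width_40x // 2, height_40x // 2)

print(f"raw size at 40x: {raw_gigabytes:.0f} GB")
print(f"256 px tiles at 40x: {tiles_40x[2]:,}")
print(f"256 px tiles at 20x: {tiles_20x[2]:,}")
```

Under these assumptions the slide is about 24 GB uncompressed and splits into 122,383 tiles at 40× or 30,772 tiles at 20×, before any background tiles are discarded.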

Synthia's architecture decomposes the problem into two stages. First, a tile extraction and tissue segmentation pipeline identifies IHC-relevant tissue regions, excluding glass background, fat, and stroma. Second, a hybrid convolutional-attention model processes each 256×256 pixel tile at 20× magnification, performing simultaneous cell detection and staining intensity quantification.
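The first stage can be illustrated with a deliberately simple stand-in. The sketch below is not Synthia's segmentation model: it treats any pixel darker than a brightness cutoff as tissue, which separates glass background only and cannot distinguish fat or stroma. The function name, the cutoff of 220, and the 50% tissue fraction are all illustrative assumptions.

```python
def tissue_tiles(gray, tile=256, white_threshold=220, min_tissue_fraction=0.5):
    """Return (x, y) origins of tiles that contain enough non-background pixels.

    gray is a 2D list of 8-bit grayscale values (0 = black, 255 = white).
    Glass background is close to white, so darker pixels are counted as tissue.
    """
    height, width = len(gray), len(gray[0])
    kept = []
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            tissue_pixels = sum(
                1
                for row in gray[y:y + tile]
                for value in row[x:x + tile]
                if value < white_threshold
            )
            if tissue_pixels / (tile * tile) >= min_tissue_fraction:
                kept.append((x, y))
    return kept

# Toy 4x4 "slide": left half stained tissue, right half bare glass.
toy = [[100, 100, 255, 255] for _ in range(4)]
print(tissue_tiles(toy, tile=2))  # [(0, 0), (0, 2)]
```

Only the two tiles on the stained side survive, so the downstream model never sees the empty glass.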

The backbone is a ResNet-50 encoder pre-trained on histopathology data, fine-tuned on our IHC-specific annotation corpus. Transformer-based attention heads provide long-range spatial context — critical for identifying IHC hotspots in Ki-67 proliferation scoring and distinguishing membrane vs. cytoplasmic staining in HER2 assessment.
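The idea of long-range context can be shown with the core operation of a Transformer, scaled dot-product self-attention, applied to per-tile feature vectors. This is a minimal sketch under strong assumptions: a single head, no learned query/key/value projections, and no positional encoding, whereas a real model would have all three. It only shows how every tile's representation becomes a weighted mix of all other tiles.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention with identity projections.

    features is a list of equal-length vectors, one per tile.
    Each output vector is a convex combination of all input vectors,
    weighted by similarity to the query tile.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in features]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, features)) for j in range(dim)]
        )
    return outputs

# Three toy tile embeddings; the first two are similar, the third is not.
tiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
for row in self_attention(tiles):
    print([round(x, 3) for x in row])
```

Because the weights come from a softmax, distant tiles contribute to every output regardless of where they sit on the slide, which is the property the text relies on for hotspot detection.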

Score aggregation follows biomarker-specific logic: for HER2, membrane-positive cell counting per the modified H-score system; for PD-L1, separate tumor proportion score (TPS) and combined positive score (CPS) pathways; for Ki-67, global percentage and spatial hotspot localization.
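The standard definitions behind these scores are short enough to write out. The sketch below uses the textbook formulas for TPS, CPS, the Ki-67 index, and the conventional H-score. It is not Synthia's aggregation code, and the "modified" H-score variant mentioned above is not specified in this summary, so only the conventional form is shown. All function names are hypothetical.

```python
def tumor_proportion_score(positive_tumor_cells, viable_tumor_cells):
    """TPS: percentage of viable tumor cells with PD-L1 staining."""
    return 100 * positive_tumor_cells / viable_tumor_cells

def combined_positive_score(positive_tumor_cells, positive_immune_cells, viable_tumor_cells):
    """CPS: PD-L1 positive tumor plus immune cells per 100 viable tumor cells, capped at 100."""
    raw = 100 * (positive_tumor_cells + positive_immune_cells) / viable_tumor_cells
    return min(100, raw)

def ki67_index(positive_nuclei, negative_nuclei):
    """Ki-67 index: percentage of tumor nuclei that stain positive."""
    return 100 * positive_nuclei / (positive_nuclei + negative_nuclei)

def h_score(cells_by_intensity):
    """Conventional H-score (0 to 300): sum of intensity * percent of cells at that intensity.

    cells_by_intensity maps intensity class (0, 1, 2, 3) to a cell count.
    """
    total = sum(cells_by_intensity.values())
    return sum(
        intensity * 100 * count / total
        for intensity, count in cells_by_intensity.items()
    )

print(tumor_proportion_score(30, 200))            # 15.0
print(combined_positive_score(30, 20, 200))       # 25.0
print(ki67_index(45, 255))                        # 15.0
print(h_score({0: 50, 1: 20, 2: 20, 3: 10}))      # 90.0
```

Note that TPS and CPS share a denominator (viable tumor cells) but differ in the numerator, which is why the text describes them as separate pathways.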

WSI PROCESSING PIPELINE

STAGE 1 — TILING
WSI → tissue mask → 256px patches
STAGE 2 — FEATURE EXTRACTION
ResNet-50 encoder (IHC pre-trained)
STAGE 3 — CELL DETECTION
DAB-positive detection + intensity class
STAGE 4 — ATTENTION CONTEXT
Transformer spatial heads → hotspot map
STAGE 5 — SCORE AGGREGATION
HER2 / PD-L1 / Ki-67 clinical output
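To illustrate what a hotspot map is used for in Stage 4 and Stage 5, the sketch below finds the region with the highest Ki-67 positive fraction using a plain sliding window over per-tile cell counts. This is a simple stand-in named for what it is, not the Transformer-based approach described above. The function name, window size, and minimum cell count are illustrative assumptions.

```python
def find_hotspot(positive, total, window=2, min_cells=1):
    """Return (fraction, row, col) of the window with the highest positive fraction.

    positive[r][c] and total[r][c] are per-tile counts of Ki-67 positive
    nuclei and all detected nuclei. Windows with fewer than min_cells
    nuclei are skipped. Returns None if no window qualifies.
    """
    rows, cols = len(positive), len(positive[0])
    best = None
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            cells = [(i, j) for i in range(r, r + window) for j in range(c, c + window)]
            pos = sum(positive[i][j] for i, j in cells)
            tot = sum(total[i][j] for i, j in cells)
            if tot < min_cells:
                continue
            fraction = pos / tot
            if best is None or fraction > best[0]:
                best = (fraction, r, c)
    return best

# Toy 3x3 tile grid with 10 nuclei per tile; proliferation peaks bottom-right.
positive = [[1, 2, 0],
            [3, 9, 8],
            [0, 7, 9]]
total = [[10] * 3 for _ in range(3)]
fraction, row, col = find_hotspot(positive, total)
print(f"hotspot at tile ({row}, {col}), Ki-67 fraction {fraction:.3f}")
```

In this toy grid the global Ki-67 fraction is 39/90, about 43%, while the hotspot window reaches 82.5%, which is why global percentage and hotspot localization are reported separately.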

Annotation Methodology and Dataset Scale

Every training annotation went through a multi-reader adjudication protocol. Agreement below the consensus threshold triggered mandatory adjudication by a third reader.

Pathologist Consensus Annotation Protocol

All training annotations were produced by board-certified anatomic pathologists with subspecialty expertise in oncologic pathology. Each whole-slide image received two independent reads; cases with inter-reader kappa below 0.70 were escalated to a third reader for adjudication.
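The escalation rule can be sketched as a kappa computation followed by a threshold check. The dataset summary reports weighted kappa but does not state the weighting scheme, so linear weights are an assumption here, as are the function names and the choice to compute kappa over per-cell grades within one slide.

```python
def weighted_kappa(reader_a, reader_b, n_categories=4):
    """Linearly weighted Cohen's kappa for two readers' ordinal grades.

    Grades are integers 0 .. n_categories - 1 (for example HER2 0, 1+, 2+, 3+).
    """
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must grade the same non-empty set of items")
    n = len(reader_a)
    k = n_categories

    def weight(i, j):
        # Disagreement weight grows linearly with the distance between grades.
        return abs(i - j) / (k - 1)

    # Marginal grade frequencies for each reader.
    freq_a = [reader_a.count(g) / n for g in range(k)]
    freq_b = [reader_b.count(g) / n for g in range(k)]

    observed = sum(weight(a, b) for a, b in zip(reader_a, reader_b)) / n
    expected = sum(
        freq_a[i] * freq_b[j] * weight(i, j) for i in range(k) for j in range(k)
    )
    if expected == 0:
        # Both readers used one identical grade throughout; kappa is
        # undefined, and this sketch treats it as full agreement.
        return 1.0
    return 1.0 - observed / expected

def needs_third_reader(kappa, threshold=0.70):
    """Escalate to adjudication when agreement falls below the threshold."""
    return kappa < threshold

first_read = [0, 0, 1, 1, 2, 2, 3, 3]
second_read = [0, 1, 1, 2, 2, 3, 3, 0]
kappa = weighted_kappa(first_read, second_read)
print(round(kappa, 2), needs_third_reader(kappa))  # 0.4 True
```

In this toy example half of the grades differ, including one 3+ versus 0 disagreement, so kappa falls to 0.4 and the case would be escalated.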

Annotation was performed at cell-level resolution for HER2 membrane staining (0/1+/2+/3+ per cell), pixel-level DAB thresholding for PD-L1 (tumor cell vs. immune cell positivity), and nucleus segmentation for Ki-67 (positive/negative count per field).

Cases were drawn from externally procured, de-identified tissue archives covering breast, gastric, lung, and bladder cancer histology types. Institution names are not disclosed in this summary.

Training Dataset Summary

Biomarker   Training WSI   Annotation Type        Reader Agreement
HER2        n = 4,200      Membrane cell-level    κ = 0.82
PD-L1       n = 3,100      TPS + CPS regions      κ = 0.78
Ki-67       n = 2,800      Nuclear positive/neg   ICC = 0.87

Internal training set; data on file. Figures are approximate. Kappa (κ) is the weighted kappa between the two primary readers prior to adjudication. ICC is the intraclass correlation coefficient.

The Model Scores the Same Slide the Same Way. Every Time.

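Unlike human scoring, Synthia's grade assignment is deterministic: the same WSI submitted twice returns the same classification. Continuous scores can vary slightly between runs because of stochastic tile sampling, which is why the test-retest ICCs below are high but not exactly 1.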

0.94

HER2 Test-Retest ICC

The same WSI was submitted twice, 7 days apart. The intraclass correlation coefficient (ICC) measures between-run consistency. Internal validation, n = 50 WSI.

0.92

Ki-67 Test-Retest ICC

Nuclear detection consistency across repeat analyses. Includes natural variation from stochastic tile sampling. All runs exceeded the ICC 0.90 threshold.

100%

Classification Stability

0 / 1+ / 2+ / 3+ grade assignment from the same WSI is deterministic: there is no stochastic classification drift between identical submissions.
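For readers who want to reproduce a test-retest ICC on their own repeat runs, the sketch below computes one common form from a slides × runs table of scores. This summary does not state which ICC form was used, so ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, and the function name is hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table where scores[i][j] is slide i scored on run j.

    Requires at least two slides and two runs, and some variation
    between slides (otherwise the ratio is undefined).
    """
    n = len(scores)        # slides
    k = len(scores[0])     # repeat runs
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    slide_means = [sum(row) / k for row in scores]
    run_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_slides = k * sum((m - grand_mean) ** 2 for m in slide_means)
    ss_runs = n * sum((m - grand_mean) ** 2 for m in run_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_slides - ss_runs

    ms_slides = ss_slides / (n - 1)
    ms_runs = ss_runs / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_slides - ms_error) / (
        ms_slides + (k - 1) * ms_error + k * (ms_runs - ms_error) / n
    )

identical_runs = [[10, 10], [20, 20], [30, 30]]
shifted_runs = [[10, 12], [20, 22], [30, 32]]
print(round(icc_2_1(identical_runs), 3))  # 1.0
print(round(icc_2_1(shifted_runs), 3))    # 0.98
```

Because this form measures absolute agreement, even a constant offset between runs pulls the ICC below 1, so a fully deterministic pipeline would score exactly 1.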

Full Validation Methodology →

Evaluate the Science on Your Data

Request a pilot study with your own de-identified WSI set. We'll deliver scored results and a full validation report.

Request Pilot Access