I work at the boundary where machine learning meets real-world deployment — particularly in high-stakes domains where model confidence matters as much as accuracy. Current focus: taking published research and making it production-ready.
Extending EAGLE with Monte Carlo Dropout for Robust EGFR Prediction
Extended EAGLE — a GigaPath-based pathology foundation model published in Nature Medicine — by adding uncertainty quantification via Monte Carlo Dropout. Built and deployed the production inference pipeline on 24 H100 GPUs at a major cancer center. EAGLE detects EGFR mutations in lung adenocarcinoma from whole-slide pathology images; the uncertainty layer flags low-confidence predictions for pathologist review instead of making autonomous decisions.
A model that's confidently wrong in medical diagnostics is more dangerous than one that says 'I don't know.' Standard neural networks output a prediction without any signal about how certain they are. In cancer biomarker detection, a false positive triggers unnecessary treatment; a false negative delays lifesaving intervention. The challenge was adding calibrated uncertainty to an already-trained model — teaching it to express what it doesn't know.
Monte Carlo Dropout turns a trained neural network into an approximate Bayesian model. Run the same input through the model N times with dropout active — the variance across runs is your uncertainty estimate. High variance = low confidence = flag for human review. This principle applies everywhere: any AI system making high-stakes decisions should surface confidence intervals alongside outputs. This is why I teach safety-first AI agent design.
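The N-pass procedure can be sketched in a few lines of NumPy. This is a toy two-layer network with random stand-in weights, not the EAGLE model, and the review threshold of 0.15 is illustrative rather than anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weights for a two-layer network (stand-ins; the real model is far larger).
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(8, 1))

def forward_with_dropout(x, p_drop=0.5):
    """One stochastic forward pass with dropout kept active at inference time."""
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop      # fresh Bernoulli dropout mask each pass
    h = h * mask / (1.0 - p_drop)            # inverted-dropout scaling
    logit = h @ W2
    return 1.0 / (1.0 + np.exp(-logit))      # sigmoid probability

def mc_dropout_predict(x, n_samples=100):
    """Run N stochastic passes; the mean is the prediction, the std the uncertainty."""
    preds = np.stack([forward_with_dropout(x) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)

x = rng.normal(size=(1, 16))
mean, std = mc_dropout_predict(x)
flag_for_review = std[0, 0] > 0.15  # illustrative threshold: high variance -> pathologist review
```

In a framework like PyTorch the same effect is achieved by leaving the dropout layers in training mode at inference, so no retraining is needed — which is what makes the technique attractive for an already-published model.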
EGFR mutations are found in 15–30% of non-small cell lung cancers and play a crucial role in guiding targeted therapies. We build on the EAGLE pathology model by adding Monte Carlo Dropout to quantify prediction uncertainty. Tested on 205 slides from TCGA-LUAD, the pipeline processes each slide in ~8 minutes on A100 GPUs and provides calibrated confidence scores — enabling assisted clinical workflows where uncertain predictions are flagged for pathologist review.
The best research questions come from hitting real walls. I ran production inference on 24 H100 GPUs before I understood why uncertainty mattered. You learn different things from papers than from deployed systems.
A model that knows what it doesn't know is more useful than one that's always confident. I apply this to every AI system I build: if it can't surface a confidence interval, it shouldn't be making autonomous decisions.
The best AI systems are built by people who understand both the model and the domain. Pathologists and engineers have to be in the same room. This is true for fintech, legal, education, and anywhere else AI is deployed in high-stakes environments.
Every research project should be reproducible. Open weights, open code, documented training runs. Science that can't be replicated isn't science — it's a press release.