OncoNeo400 - New AI Confidence Interval feature


What is one of the main aspects that can bring a statistical advantage to an AI model? Improving individual predictions and modeling a confidence interval for each of them. This way we are not just saying that our model is a certain percent accurate at the model level; we also have a confidence level for each individual prediction. That is a huge step forward in making AI models more robust.

I'd like to say special thanks to Geoffrey Johnson and Aleksander Molak for discussions on this topic: improving predictions with Frequentist confidence intervals in Predictive AI Systems.

Now a bit on OncoNeo400, where this experimental feature is being implemented as I write this article.

The first AI model, PEP92, was based on point estimates, as most AI models are today. Note: PEP92 is part of OncoNeo400 (available at bioaiworks.com), a Precision Oncology AI system for predicting neoepitopes / neoantigens on the surface of tumor cells and their immunogenicity with respect to MHC I, T cell and B cell responses.

PEP92 v2 is now made up of 50 tree-based ML models and can construct a Frequentist confidence interval around every AI point estimate (prediction probability). Task: MHC I vs Neoantigen (max length = 15).
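To make the idea concrete, here is a minimal sketch of how an interval could be built around one prediction from the outputs of an ensemble. It is not the PEP92 v2 implementation: the function names, the normal-approximation interval around the ensemble mean, and the rule that a prediction is "confident" when its interval excludes the 0.5 decision threshold are all assumptions made for illustration.

```python
import math
import statistics


def ensemble_confidence_interval(probs, level=0.95):
    """Mean probability and a normal-approximation CI for the ensemble mean.

    `probs` holds one predicted probability per ensemble member for a
    single sample. Assumption: members are treated as independent draws,
    and a z quantile is used (a t quantile would be slightly wider).
    """
    n = len(probs)
    mean = statistics.fmean(probs)
    sem = statistics.stdev(probs) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    lower = max(0.0, mean - z * sem)  # clip to the valid probability range
    upper = min(1.0, mean + z * sem)
    return mean, lower, upper


def is_confident(lower, upper, threshold=0.5):
    """Hypothetical rule: confident if the interval excludes the threshold."""
    return lower > threshold or upper < threshold


# Toy example: 50 hypothetical member outputs for one peptide.
member_probs = [0.9] * 25 + [0.8] * 25
mean, lower, upper = ensemble_confidence_interval(member_probs)
print(f"p = {mean:.3f}, CI = [{lower:.3f}, {upper:.3f}], "
      f"confident = {is_confident(lower, upper)}")
```

With a rule like this, predictions whose members disagree around the decision threshold get wide intervals and are flagged as non-confident, which is the kind of filtering that the confident-prediction results rely on.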


The results speak for themselves:

Standard PEP92 (1 Deep Learning model), all test data predictions (300,000+)

Test Sensitivity: 0.9164

Test Specificity: 0.7917


Confidence-based PEP92 v2 (Ensemble of Ensembles of 50 tree-based ML models), confident test data predictions only (100,000+)

Test Sensitivity (TPR): 0.9791

Test Specificity (TNR): 0.9649


It should be noted that a direct comparison between these two systems is not strictly valid, because the second set of results excludes the non-confident data. In the first case (1 Deep Learning model) all predictions are included, while in the second only the confident predictions are counted and the non-confident ones are disregarded. The comparison is still useful, because in the end users will want the more confident predictions. PEP92 v2 also benefits from the strong balancing power of the tree-based ensemble.
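The effect of excluding non-confident predictions can be sketched as follows. This is an illustrative example on made-up toy labels, not the OncoNeo400 test data, and the function names and the `confident` flags are hypothetical. It reports sensitivity and specificity on the confident subset together with coverage (the fraction of predictions kept), which is the number needed to put the two sets of results side by side.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def evaluate_confident_subset(y_true, y_pred, confident):
    """Metrics on confident predictions only, plus the coverage."""
    kept = [(t, p) for t, p, c in zip(y_true, y_pred, confident) if c]
    coverage = len(kept) / len(y_true)
    sens, spec = sensitivity_specificity(
        [t for t, _ in kept], [p for _, p in kept]
    )
    return sens, spec, coverage


# Toy data for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
confident = [True, True, False, True, True, False]

print("all predictions:", sensitivity_specificity(y_true, y_pred))
print("confident only: ", evaluate_confident_subset(y_true, y_pred, confident))
```

In this toy case both errors fall among the non-confident predictions, so the metrics on the confident subset rise while coverage drops to four out of six.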

As a next step, PEP92 v2 will be introduced into OncoNeo400, and in my opinion more confident predictions will become one of its main advantages over competitors in the Oncology Antigen prediction space. In the end, Biotech Research requires reliable and more confident predictions. Even though reliability is best tested with de novo lab experiments, testing on 100,000+ predictions against existing (a posteriori) experimental data is the only feasible large-scale test for an AI system like this.

Thinking further, PEP92 v3 will probably have 100+ to 1000+ AI Ensembles of Ensembles of models, and with further hyperparameter tuning it may gain about one more percent, in my guesstimate. The accuracy is already very high, so every additional percent is now very difficult to achieve.



OncoNeo400 will be developed further in partnership with Oyanalytika.

BioAIWorks platform creator - Darko Medin



