Quantifying uncertainty in machine learning models

Samuel Rochette

Abstract

We will see why and how to quantify uncertainty in inferential statistics and in predictive machine learning models, and why it matters.

1) Deep dive into random forests. A random forest naturally gives us an estimate of the predictive distribution for each sample, thanks to the bagging technique.

2) Generalisation to regression. The quantile loss can be used to compute prediction intervals for any regression model. It is, however, computationally costly; certain losses, such as the (log-)cosh loss, can help mitigate this drawback.

3) What about classification? In classification, the predicted probability is a measure of uncertainty... but does every model give us good probabilities? Let's plot some reliability curves to check whether we need to calibrate the output with a sigmoid or an isotonic regression!
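The sketch below illustrates point 1). It is a minimal example, assuming scikit-learn's RandomForestRegressor and a synthetic dataset (both illustrative choices, not specified in the abstract), of how the per-tree predictions produced by bagging yield an empirical distribution, and hence an interval, for each sample.

```python
# Minimal sketch: per-sample predictive distribution from the trees of a bagged forest.
# RandomForestRegressor and make_regression are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Collect one prediction per tree for a few samples: shape (n_trees, n_samples).
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])

# The spread of the per-tree predictions is an empirical distribution, from which
# we can read off, for example, an approximate 90% interval for each sample.
lower, upper = np.percentile(per_tree, [5, 95], axis=0)
print(np.c_[lower, forest.predict(X[:5]), upper])
```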

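For point 2), here is a minimal sketch of the quantile (pinball) loss and of building a prediction interval by fitting one model per quantile. The use of scikit-learn's GradientBoostingRegressor and the `log_cosh_loss` helper are assumptions for illustration, not the author's exact setup.

```python
# Minimal sketch: pinball loss, a cosh-based smooth alternative, and a prediction
# interval obtained by fitting one quantile model per bound (hence the extra cost).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: asymmetric penalty whose minimiser is the q-th quantile."""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1.0) * diff))

def log_cosh_loss(y_true, y_pred):
    """Log-cosh loss, one common cosh-based smooth surrogate (quadratic near zero,
    linear in the tails); computed as logaddexp(d, -d) - log(2) for numerical stability."""
    diff = y_true - y_pred
    return np.mean(np.logaddexp(diff, -diff) - np.log(2.0))

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

# One model per quantile: this is the computational cost mentioned in the abstract.
models = {q: GradientBoostingRegressor(loss="quantile", alpha=q).fit(X, y)
          for q in (0.05, 0.5, 0.95)}

lower = models[0.05].predict(X[:5])
upper = models[0.95].predict(X[:5])
print(np.c_[lower, models[0.5].predict(X[:5]), upper])     # ~90% prediction interval
print(pinball_loss(y, models[0.5].predict(X), q=0.5))
print(log_cosh_loss(y, models[0.5].predict(X)))
```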
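For point 3), a minimal sketch of a reliability curve and of sigmoid / isotonic calibration with scikit-learn. GaussianNB is used only as an example of a typically miscalibrated classifier, and the synthetic dataset is an assumption.

```python
# Minimal sketch: reliability curve of a raw classifier, then recalibration of its
# probabilities with Platt scaling (sigmoid) and with isotonic regression.
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = GaussianNB().fit(X_train, y_train)

# Reliability curve: observed frequency of positives per bin vs. mean predicted probability.
frac_pos, mean_pred = calibration_curve(y_test, base.predict_proba(X_test)[:, 1], n_bins=10)
print("raw      ", list(zip(mean_pred.round(2), frac_pos.round(2))))

# If the curve is far from the diagonal, calibrate on cross-validation folds.
for method in ("sigmoid", "isotonic"):
    calibrated = CalibratedClassifierCV(GaussianNB(), method=method, cv=5).fit(X_train, y_train)
    frac_pos, mean_pred = calibration_curve(y_test, calibrated.predict_proba(X_test)[:, 1], n_bins=10)
    print(method.ljust(9), list(zip(mean_pred.round(2), frac_pos.round(2))))
```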