On Calibration of Modern Neural Networks on ShortScience.org

On Calibration of Modern Neural Networks
Chuan Guo and Geoff Pleiss and Yu Sun and Kilian Q. Weinberger
arXiv e-Print archive - 2017 via Local arXiv
Keywords: cs.LG

Summary by David Stutz 5 years ago

Guo et al. study calibration of deep neural networks as a post-processing step. Here, calibration means correcting the predicted confidence scores, which are commonly overconfident in recent deep networks. They consider several state-of-the-art post-processing methods for calibration, but show that a simple linear mapping of the logits, or even a single scalar scaling, works surprisingly well. So if $z_i$ are the logits of the network, then (the network being fixed) a parameter $T$ is found such that

$\sigma(\frac{z_i}{T})$

is calibrated, that is, $T$ minimizes the negative log-likelihood (NLL) on a held-out validation set. Here, $\sigma$ denotes the softmax, and the temperature $T$ either softens ($T > 1$) or sharpens ($T < 1$) the probability distribution over classes. Interestingly, fitting $T$ with the same loss that was used for training (the NLL) is enough to reduce over-confidence.
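
To make the procedure concrete, here is a minimal sketch of fitting $T$ on held-out logits in plain Python. The toy data, the function names, the search bounds and the use of a golden-section search are all assumptions for illustration; the summary only states that $T$ is chosen to minimize the NLL on a validation set, not how the optimization is carried out.

```python
import math

def log_softmax(logits, T=1.0):
    """Log of softmax(z / T), computed stably via the log-sum-exp trick."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]

def softmax(logits, T=1.0):
    """Temperature-scaled class probabilities sigma(z / T)."""
    return [math.exp(lp) for lp in log_softmax(logits, T)]

def nll(logits_batch, labels, T):
    """Mean negative log-likelihood of the true labels at temperature T."""
    total = sum(-log_softmax(z, T)[y] for z, y in zip(logits_batch, labels))
    return total / len(labels)

def fit_temperature(logits_batch, labels, t_min=0.05, t_max=20.0, iters=60):
    """Golden-section search over log(T); the bounds are assumed, not from the paper.

    The NLL is convex in 1/T, so it has a single minimum along T and a
    bracketing search is sufficient for this one-parameter problem.
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(t_min), math.log(t_max)
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if nll(logits_batch, labels, math.exp(c)) < nll(logits_batch, labels, math.exp(d)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return math.exp((a + b) / 2)

# Hypothetical validation logits from a fixed, overconfident 3-class network:
# every prediction is made with a large margin, but the last one is wrong.
val_logits = [[8.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 7.0], [10.0, 0.0, 0.0]]
val_labels = [0, 1, 2, 1]

T = fit_temperature(val_logits, val_labels)
print("fitted T:", T)
print("NLL at T=1:", nll(val_logits, val_labels, 1.0))
print("NLL at fitted T:", nll(val_logits, val_labels, T))
```

Because dividing all logits by the same positive $T$ does not change their ordering, the predicted class, and hence the accuracy, stays the same; only the confidence scores change.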

Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
