"Why Should I Trust You?": Explaining the Predictions of Any Classifier on ShortScience.org


"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Marco Tulio Ribeiro and Sameer Singh and Carlos Guestrin
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.LG, cs.AI, stat.ML

Summaries/Notes 2

Summary by Martin Thoma 7 years ago

This paper describes how to find local interpretable model-agnostic explanations (LIME) of why a black-box model $m_B$ came to a classification decision for one sample $x$. The key idea is to evaluate $m_B$ on many more samples around $x$ (local) and fit an interpretable model $m_I$ to these samples and their predictions. The way of sampling and the kind of interpretable model depend on the problem domain.
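
The key idea can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: samples are drawn as Gaussian perturbations of $x$, they are weighted by an exponential kernel on their distance to $x$, and $m_I$ is an ordinary weighted least-squares linear model (the paper fits a sparse linear model on an interpretable representation instead). The names `explain_locally`, `solve_linear_system`, `scale` and `kernel_width` are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def explain_locally(m_B, x, n_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a weighted linear surrogate m_I to m_B around x.

    Returns [intercept, w_1, ..., w_d]; w_i is the local weight of feature i.
    """
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Assumption: perturb x with isotropic Gaussian noise.
        z = [xi + rng.gauss(0.0, scale) for xi in x]
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        rows.append([1.0] + z)          # leading 1.0 models the intercept
        targets.append(m_B(z))          # query the black box
        weights.append(math.exp(-dist2 / kernel_width ** 2))
    # Weighted normal equations: (A^T W A) w = A^T W y
    k = len(x) + 1
    ata = [[sum(wt * r[i] * r[j] for wt, r in zip(weights, rows))
            for j in range(k)] for i in range(k)]
    aty = [sum(wt * r[i] * y for wt, r, y in zip(weights, rows, targets))
           for i in range(k)]
    return solve_linear_system(ata, aty)


# Toy black box (hypothetical): only the first feature influences the output.
def m_B(z):
    return 1.0 / (1.0 + math.exp(-4.0 * z[0]))


coefs = explain_locally(m_B, [0.2, -1.0])
```

In this toy setup the coefficient for the first feature should dominate the one for the second, which is the kind of local explanation $m_I$ is meant to provide.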

For computer vision / image classification, the image $x$ is divided into superpixels. Single superpixels are made black, and the new image $x'$ is evaluated by the black-box model: $p' = m_B(x')$. This is done multiple times.
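
The perturb-and-evaluate step described above might look roughly like the following sketch. It covers only the blacking-out and re-evaluation, not the fitting of $m_I$. Several things here are assumptions for illustration: the image is a grayscale list of lists, square grid blocks stand in for a real superpixel segmentation, and `toy_m_B` is a made-up black box. All function names are hypothetical.

```python
def grid_superpixels(height, width, block):
    """Assign each pixel to a square block; a stand-in for real superpixels."""
    blocks_per_row = (width + block - 1) // block
    return [[(r // block) * blocks_per_row + (c // block) for c in range(width)]
            for r in range(height)]


def black_out(image, segments, segment_id):
    """Return a copy x' of the image with one superpixel set to 0 (black)."""
    return [[0.0 if segments[r][c] == segment_id else image[r][c]
             for c in range(len(image[0]))]
            for r in range(len(image))]


def superpixel_scores(m_B, image, segments):
    """For each superpixel, compute p - p' where p' = m_B(x')."""
    p = m_B(image)
    ids = sorted({s for row in segments for s in row})
    return {s: p - m_B(black_out(image, segments, s)) for s in ids}


# Toy black box (hypothetical): the "class probability" is the mean brightness
# of the top-left 2x2 corner, so only that region should matter.
def toy_m_B(image):
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


image = [[1.0] * 4 for _ in range(4)]            # 4x4 all-white image
segments = grid_superpixels(4, 4, block=2)       # four 2x2 "superpixels"
scores = superpixel_scores(toy_m_B, image, segments)
```

A large drop `p - p'` for a superpixel suggests that the black box relied on that region for its decision.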

The paper is also explained in [this YouTube video](https://www.youtube.com/watch?v=KP7-JtFMLo4) by Marco Tulio Ribeiro.

A very similar idea already appears in the [Zeiler & Fergus paper](http://www.shortscience.org/paper?bibtexKey=journals/corr/ZeilerF13#martinthoma).

## Follow-up Papers

* June 2016: [Model-Agnostic Interpretability of Machine Learning](https://arxiv.org/abs/1606.05386)
* November 2016:
  * [Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance](https://arxiv.org/abs/1611.05817)
  * [An unexpected unity among methods for interpreting model predictions](https://arxiv.org/abs/1611.07478)


Summary by Apoorva Shetty 4 years ago

Although machine learning models have been widely accepted as the next step towards simplifying complex problems, their inner workings are often still unclear. Understanding these details can increase trust in a model's predictions and in the model itself.

**Idea:** A good explanation system that can justify a classifier's predictions and help diagnose the reasoning behind a model can substantially raise one's trust in the predictive model.

**Solution:** This paper proposes a local explanation method called LIME, which fits a linear model that approximates the classifier in the neighborhood of a data point. The paper outlines desired characteristics for explainers and discusses how LIME meets them: 1) interpretable, 2) locally faithful, 3) model-agnostic, and 4) able to provide a global perspective. The paper also explores the fidelity-interpretability trade-off: the more complex a model is, the less interpretable a completely faithful explanation would be, so a balance needs to be struck between interpretability and fidelity. The paper describes in detail how LIME works for different types of classifiers. LIME generates random data points around a test data point, queries the model on them, and fits a linear explanation to these perturbed points. LIME thus rests on the rather strong assumption that every complex model is approximately linear in a small enough neighborhood. Although strong, this assumption seems justified for most models, but local explanations can be misleading when used to analyze a complex model as a whole.
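
For reference, the fidelity-interpretability trade-off can be written as an optimization problem. The following uses the paper's notation, which does not appear in the summaries on this page: $f$ is the black-box classifier, $g$ an interpretable model from a class $G$, $\pi_x$ a proximity measure around $x$, $\Omega(g)$ a complexity penalty, and $z'$ the interpretable representation of a perturbed sample $z$.

$$\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$$

$$\mathcal{L}(f, g, \pi_x) = \sum_{z, z' \in \mathcal{Z}} \pi_x(z) \, \bigl(f(z) - g(z')\bigr)^2, \qquad \pi_x(z) = \exp\!\left(-\frac{D(x, z)^2}{\sigma^2}\right)$$

The first term rewards local fidelity (samples close to $x$ get a larger weight $\pi_x(z)$), while $\Omega(g)$ keeps the explanation simple, for example by limiting the number of non-zero weights of a linear $g$.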
