Towards A Rigorous Science of Interpretable Machine Learning on ShortScience.org

Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez and Been Kim
arXiv e-Print archive - 2017 via Local arXiv
Keywords: stat.ML, cs.AI, cs.LG
Summary by Apoorva Shetty 4 years ago

For a machine learning model to be trusted and used, one needs confidence that it can handle every scenario it may encounter. Exhaustive unit tests would provide that confidence, but for complex, global problems they are costly and often nearly impossible to write.

**Idea**: We need a basic guideline that researchers and developers can adhere to when defining problems and outlining solutions, so that model interpretability can be defined accurately in terms of the problem statement.

**Solution**: This paper outlines the basics of machine learning interpretability, what it means for different users, and how to classify evaluations of it into understandable categories. It argues that the need for interpretability arises from *incompleteness*, either in the problem statement or in knowledge of the problem domain. It then proposes three main categories for evaluating a model and the interpretations it provides:
- *Application Grounded Evaluation*: These evaluations are the most costly. Real humans evaluate the real tasks that the model would take on, so the evaluators need domain knowledge.
- *Human Grounded Evaluation*: These evaluations are simpler than application grounded ones. The complex task is reduced to a simplified one, and humans evaluate the simplified task. Domain knowledge is not necessary.
- *Functionally Grounded Evaluation*: No humans are involved. Explanation quality is measured by a formal definition of interpretability used as a proxy, typically on models that have already been validated and are then tweaked to optimize that measure.
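
As a rough illustration of functionally grounded evaluation, the sketch below uses the number of non-zero weights in a linear model as a stand-in for a formal definition of interpretability. The proxy itself, the function names, the tolerance, and the candidate models with their accuracies are all assumptions made up for illustration; the paper does not prescribe this particular metric.

```python
from typing import Dict, Sequence, Tuple


def nonzero_weight_count(weights: Sequence[float], tol: float = 1e-8) -> int:
    """Hypothetical interpretability proxy: fewer active features = simpler model."""
    return sum(1 for w in weights if abs(w) > tol)


def pick_most_interpretable(
    candidates: Dict[str, Tuple[float, Sequence[float]]],
    max_accuracy_drop: float = 0.01,
) -> str:
    """Among models close to the best accuracy, return the one with the lowest proxy."""
    best_accuracy = max(acc for acc, _ in candidates.values())
    eligible = {
        name: weights
        for name, (acc, weights) in candidates.items()
        if acc >= best_accuracy - max_accuracy_drop
    }
    return min(eligible, key=lambda name: nonzero_weight_count(eligible[name]))


# Illustrative (invented) candidates: name -> (accuracy, linear weights).
candidates = {
    "dense": (0.91, [0.5, -0.2, 0.1, 0.3]),
    "sparse": (0.905, [0.7, 0.0, 0.0, -0.4]),
    "tiny": (0.80, [1.0, 0.0, 0.0, 0.0]),
}
chosen = pick_most_interpretable(candidates)
```

No human is consulted anywhere in this loop, which is what makes it cheap. It is also why such a proxy is only as trustworthy as the earlier evidence that the proxy tracks real explanation quality.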

This paper also outlines open issues with the three evaluation approaches above: certain questions need answering before we can pick an evaluation method and metric.
- To uncover the factors of interpretability, the paper proposes a data-driven approach: record how the various methods perform on each task, then analyze those results to see which factors matter most across tasks and methods.
- We are introduced to the term *latent dimensions of interpretability*, i.e. dimensions that are inferred rather than observed. These are divided into task-related and method-related latent dimensions, each a long list of factors that are task specific or method specific.
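
One way to picture the data-driven approach is a table with one row per task and one column per method, where each entry is how well that method did on that task; factorizing the table then yields inferred (latent) dimensions. The sketch below is a minimal, pure-Python version under strong assumptions: the scores are invented toy numbers, the table is non-zero, and only a single latent dimension is extracted, using power iteration. `leading_factor` and the variable names are hypothetical, and a real analysis would use more factors and a numerical library.

```python
import math
from typing import List, Tuple


def leading_factor(
    matrix: List[List[float]], iterations: int = 200
) -> Tuple[float, List[float], List[float]]:
    """Rank-1 factorization by power iteration: matrix ~ sigma * u * v^T.

    u holds one loading per task (row), v one loading per method (column).
    Assumes the matrix is not all zeros.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # uniform unit-length start
    sigma = 0.0
    for _ in range(iterations):
        # Project onto the task side and normalize.
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # Project back onto the method side; its length is the singular value.
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v


# Invented toy scores: 3 tasks (rows) x 2 methods (columns).
scores = [
    [1.0, 0.5],
    [2.0, 1.0],
    [3.0, 1.5],
]
sigma, task_loadings, method_loadings = leading_factor(scores)

# Rebuild the table from the single latent dimension.
reconstructed = [
    [sigma * task_loadings[i] * method_loadings[j] for j in range(2)]
    for i in range(3)
]
```

In this toy table every row is a multiple of the first, so a single latent dimension reproduces it exactly. With real results the residual left after each factor would indicate how many dimensions are needed.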

Thus, this paper provides a basic taxonomy for how we should evaluate our models and how these evaluations differ from problem to problem. The ideal scenario it outlines is that researchers provide the information needed to evaluate their claims correctly, that is, correctly with respect to the domain and the problem scope.
