The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets
Paper summary: Carlini et al. propose several attacks to extract secrets from trained black-box models. Additionally, they show that state-of-the-art neural networks memorize secrets early during training. In particular, after inserting a secret of a specific format into the Penn Treebank, the authors validate that the secret can be identified based on the model's output probabilities alone (i.e., with black-box access). Several metrics based on the log-perplexity of the secret show that secrets are memorized early during training and that memorization happens for all popular architectures and training strategies; memorization also occurs when multiple secrets are inserted. Furthermore, the authors propose several attacks to extract secrets, most notably through shortest path search. Here, starting with an empty secret, the characters of the secret are identified sequentially so as to minimize log-perplexity. Using this attack, secrets such as credit card numbers are extractable from popular mail datasets. Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
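To make the shortest path idea concrete, here is a minimal sketch, not the authors' implementation. It assumes that log-perplexity is the sum of negative base-2 log-probabilities of each character given its prefix, and it uses a Dijkstra-style search as one way to realize the shortest path search described above. The function `toy_next_char_probs`, the digit alphabet, the memorized string "2815", and the probability values are all hypothetical stand-ins for a real black-box model.

```python
import heapq
import math

ALPHABET = "0123456789"  # assumed secret alphabet (digits only)


def toy_next_char_probs(prefix, memorized="2815"):
    """Hypothetical black-box model: returns P(next char | prefix).

    It puts most of the mass on the memorized continuation while the
    prefix matches the memorized secret, and is uniform otherwise.
    """
    if len(prefix) < len(memorized) and memorized.startswith(prefix):
        probs = {c: 0.01 for c in ALPHABET}
        probs[memorized[len(prefix)]] = 0.91  # 9 * 0.01 + 0.91 = 1.0
        return probs
    return {c: 1.0 / len(ALPHABET) for c in ALPHABET}


def log_perplexity(secret, next_char_probs):
    """Sum of -log2 P(char | prefix) over the characters of the secret."""
    return sum(
        -math.log2(next_char_probs(secret[:i])[secret[i]])
        for i in range(len(secret))
    )


def extract_secret(length, next_char_probs):
    """Dijkstra-style search over prefixes, starting from the empty secret.

    Each edge appends one character and costs -log2 of its probability,
    so the first full-length prefix popped has minimal log-perplexity.
    """
    heap = [(0.0, "")]
    while heap:
        cost, prefix = heapq.heappop(heap)
        if len(prefix) == length:
            return prefix, cost
        for char, p in next_char_probs(prefix).items():
            if p > 0:
                heapq.heappush(heap, (cost - math.log2(p), prefix + char))
    return None, math.inf


secret, cost = extract_secret(4, toy_next_char_probs)
print(secret, round(cost, 3))
```

Because every edge cost is non-negative, the search only expands prefixes whose accumulated log-perplexity is below that of the best completion, which is why a strongly memorized secret can be recovered with few model queries in this toy setting.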
Carlini, Nicholas and Liu, Chang and Kos, Jernej and Erlingsson, Úlfar and Song, Dawn
arXiv e-Print archive - 2018 via Local Bibsonomy
Keywords: dblp


Summary by David Stutz 5 months ago