_Objective:_ Reduce learning time for [DQN](https://deepmind.com/research/dqn/)-type architectures.

They introduce a new network element called a DND (Differentiable Neural Dictionary). It is essentially a dictionary that accepts any key (in particular, embeddings) and computes the value for a query by applying a kernel between the query and the stored keys. The lookup is also differentiable.

## Architecture:

The network works in two steps:

1. A classical CNN that computes an embedding for every image.
2. One DND per possible action (controller input), which stores the embedding as key and the estimated reward as value.

They also use a buffer to store all tuples (previous image, action, reward, next image), and a standard technique is used for training.

[![screen shot 2017-04-12 at 11 23 32 am](https://cloud.githubusercontent.com/assets/17261080/24951103/92930022-1f73-11e7-97d2-628e2f4b5a33.png)](https://cloud.githubusercontent.com/assets/17261080/24951103/92930022-1f73-11e7-97d2-628e2f4b5a33.png)

## Results:

Learning speed clearly improves, but in the end other techniques catch up and outperform it.
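
To make the DND idea from the architecture more concrete, here is a minimal sketch of a kernel-weighted dictionary lookup with one memory per action. It is an illustration, not the authors' implementation: the class name `DND`, the helper `greedy_action`, the smoothing constant `delta`, and the choice of an inverse squared-distance kernel are all assumptions made for this example. It is plain Python with no automatic differentiation; the point is only that the returned value is a smooth weighted average of the stored values, which is what makes such a lookup differentiable in a real framework.

```python
class DND:
    """Toy dictionary with kernel-weighted lookup (illustrative only)."""

    def __init__(self, delta=1e-3):
        self.keys = []      # stored embeddings (lists of floats)
        self.values = []    # stored scalar value estimates
        self.delta = delta  # assumed smoothing term, avoids division by zero

    def write(self, key, value):
        """Append a (key, value) pair to the memory."""
        self.keys.append(list(key))
        self.values.append(float(value))

    def _kernel(self, a, b):
        # Assumed kernel: inverse squared Euclidean distance.
        sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
        return 1.0 / (sq_dist + self.delta)

    def lookup(self, query):
        """Return the kernel-weighted average of all stored values."""
        if not self.keys:
            raise ValueError("lookup on an empty memory")
        weights = [self._kernel(query, k) for k in self.keys]
        total = sum(weights)
        return sum(w * v for w, v in zip(weights, self.values)) / total


def greedy_action(memories, embedding):
    """Pick the action whose memory gives the highest value estimate."""
    return max(memories, key=lambda action: memories[action].lookup(embedding))


# One memory per action, keyed by a (hypothetical) 2-d image embedding.
memories = {"left": DND(), "right": DND()}
memories["left"].write([0.0, 0.0], 1.0)
memories["right"].write([0.0, 0.0], 5.0)
best = greedy_action(memories, [0.1, 0.0])
```

In this sketch `best` is `"right"`, since that memory holds the higher value for a nearby key. A full system would obtain the embedding from the CNN and would limit the lookup to a handful of nearest neighbours rather than scanning every stored key.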