Hardware-oriented Approximation of Convolutional Neural Networks
Paper summary

The authors present a framework that quantizes Caffe models down to 8-bit and lower fixed-point precision, which lowers memory and energy consumption on embedded devices. The compression is an iterative algorithm: it gathers data statistics to determine the dynamic ranges of activations and parameters, and then conditionally optimizes the convolutional weights, fully connected weights, and activations, each given the compression applied to the other parts. This work focuses on taking models already trained at high numerical precision (32-bit floating point) and compressing them afterwards, as opposed to other work that trains directly with quantized operations.
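The core range-analysis step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the round-to-nearest policy, and the single global range are assumptions (the paper analyzes ranges per layer and per blob); it only shows how data statistics determine the split between integer and fractional bits in a fixed-point format.

```python
import numpy as np

def quantize_fixed_point(values, total_bits=8):
    """Simulate signed fixed-point quantization of float values.

    The integer length is derived from the data's dynamic range
    (as in the paper's statistics-driven range analysis); the
    remaining bits, minus one sign bit, become the fractional part.
    This is a hypothetical helper for illustration only.
    """
    max_abs = np.max(np.abs(values))
    # Bits needed for the integer part (sign bit excluded).
    int_bits = int(np.ceil(np.log2(max_abs))) if max_abs > 0 else 0
    frac_bits = total_bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    # Round to nearest representable value and saturate at the limits.
    q = np.clip(np.round(values * scale), qmin, qmax)
    # Return de-quantized floats so the effect can be simulated in Caffe.
    return q / scale

weights = np.array([0.75, -1.5, 0.1234, 3.2])
print(quantize_fixed_point(weights))  # → [ 0.75  -1.5    0.125  3.1875]
```

With a maximum magnitude of 3.2, two integer bits are needed, leaving five fractional bits in an 8-bit word, so values are snapped to multiples of 1/32 and saturated at the 8-bit limits.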
Hardware-oriented Approximation of Convolutional Neural Networks
Gysel, Philipp and Motamedi, Mohammad and Ghiasi, Soheil
arXiv e-Print archive - 2016 via Bibsonomy
Keywords: dblp