Hardware-oriented Approximation of Convolutional Neural Networks
Gysel, Philipp and Motamedi, Mohammad and Ghiasi, Soheil (2016)

Paper summary

The authors present a framework that quantizes Caffe models to 8-bit and lower fixed-point precision, which is useful for lowering memory and energy consumption on embedded devices. The compression is an iterative algorithm: it gathers data statistics to determine the ranges of activations and parameters that can be compressed, then conditionally optimizes the convolutional weights, the fully connected weights, and the activations, each given the compression of the other parts.
This work focuses on taking models already trained at high numerical precision (32-bit float) and compressing them, as opposed to other work that trains directly with quantized operations.

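The range-driven quantization step can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function name and details (symmetric saturation, one sign bit, round-to-nearest) are assumptions. The idea is that the statistics of a tensor determine how many bits must be spent on the integer part, and the remaining bits of the fixed-point word become the fractional part.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8):
    """Quantize a float array to a dynamic fixed-point representation.

    The integer length is chosen from the observed data range (mirroring
    the statistics-gathering step described above); the remaining bits,
    after one sign bit, form the fractional part.
    Returns the dequantized values and the number of fractional bits.
    """
    max_abs = float(np.max(np.abs(x)))
    # Integer bits needed to cover the largest magnitude.
    int_bits = max(0, int(np.ceil(np.log2(max_abs)))) if max_abs > 0 else 0
    frac_bits = total_bits - 1 - int_bits  # 1 bit reserved for the sign
    scale = 2.0 ** frac_bits
    # Round to the nearest representable value and saturate to the word range.
    q = np.round(x * scale)
    q = np.clip(q, -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1)
    return q / scale, frac_bits
```

Running this over, say, a layer's weight tensor gives an 8-bit approximation whose error is bounded by half a least-significant bit (plus any saturation error); the iterative part of the algorithm would then re-evaluate accuracy with this layer quantized while the others are held at their current precision.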