Fixed Point Quantization of Deep Convolutional Networks
Lin, Darryl Dexu and Talathi, Sachin S. and Annapureddy, V. Sreekanth (2015)

Paper summary (openreview)
This paper proposes layer-wise adaptive bit-depth quantization of deep convolutional networks (DCNs), giving a better trade-off between error rate and memory requirement than a fixed bit-width across all layers.
The authors formulate an optimization problem that determines the bit-width for each layer of a DCN, with the goal of reducing model size and required computation.
This paper builds on the line of research that represents neural network weights and outputs with lower bit-depths. Weights stored this way take less memory, and the lower precision can speed up implementations of NNs (on GPUs or more specialized hardware).
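
To make the memory argument concrete, here is a minimal sketch of signed fixed-point rounding and of how per-layer bit-widths change total weight storage. It is not the authors' method: the function names, the three-layer parameter counts, and the bit-widths are all hypothetical and picked by hand, whereas the paper chooses bit-widths by solving an optimization problem.

```python
def quantize_fixed_point(values, bit_width, frac_bits):
    """Round values to a signed fixed-point grid and saturate at the range ends.

    bit_width is the total number of bits (including sign); frac_bits is the
    number of bits after the binary point. Both are illustrative parameters.
    """
    step = 2.0 ** -frac_bits              # spacing between representable values
    q_min = -(2 ** (bit_width - 1))       # smallest integer code
    q_max = 2 ** (bit_width - 1) - 1      # largest integer code
    quantized = []
    for v in values:
        code = round(v / step)            # nearest integer code
        code = max(q_min, min(q_max, code))  # saturate out-of-range values
        quantized.append(code * step)
    return quantized


def model_size_bits(layer_param_counts, layer_bit_widths):
    """Total weight storage in bits when layer i uses layer_bit_widths[i] bits."""
    return sum(n * b for n, b in zip(layer_param_counts, layer_bit_widths))


# Hypothetical three-layer network (parameter counts are made up).
param_counts = [1_000, 50_000, 200_000]
uniform_bits = [8, 8, 8]      # same bit-width for every layer
per_layer_bits = [8, 6, 4]    # hand-picked per-layer bit-widths

uniform_size = model_size_bits(param_counts, uniform_bits)
mixed_size = model_size_bits(param_counts, per_layer_bits)

print(quantize_fixed_point([0.3, -1.26, 5.0], bit_width=4, frac_bits=2))
print(uniform_size, mixed_size)
```

In this toy setting the mixed assignment stores the weights in roughly 55% of the bits used by the uniform 8-bit assignment. Whether such an assignment keeps the error rate acceptable is exactly what the paper's optimization is meant to decide; the sketch says nothing about accuracy.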
