Exploring Accumulated Gradient-Based Quantization and Compression for Deep Neural Networks

Date

2020-05-29

Publisher

Virginia Tech

Abstract

The growing complexity of neural networks makes their deployment on resource-constrained embedded or mobile devices challenging. With millions of weights and biases, modern deep neural networks demand substantial memory, power and computation. In this thesis, we devise and explore three quantization methods (post-training, in-training and combined quantization) that quantize 32-bit floating-point weights and biases to lower-bit-width fixed-point parameters while also achieving significant pruning, leading to model compression. We use the total absolute gradient accumulated over the training process as the indicator of a parameter's importance to the network. The most important parameters are quantized the least. The post-training quantization method sorts and clusters the accumulated gradients of the full parameter set and then assigns a bit width to each cluster. The in-training quantization method sorts the accumulated gradients and divides them into two groups after each training epoch; the larger group, which contains the parameters with the lowest accumulated gradients, is quantized. The combined quantization method performs in-training quantization followed by post-training quantization. We assume that the quantized parameters are stored in compressed sparse row (CSR) format. On LeNet-300-100 (MNIST dataset), LeNet-5 (MNIST dataset), AlexNet (CIFAR-10 dataset) and VGG-16 (CIFAR-10 dataset), post-training quantization achieves 7.62x, 10.87x, 6.39x and 12.43x compression, in-training quantization achieves 22.08x, 21.05x, 7.95x and 12.71x compression, and combined quantization achieves 57.22x, 50.19x, 13.15x and 13.53x compression, respectively. Our methods quantize at the cost of accuracy, and we present our work in light of the accuracy-compression trade-off.
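
The core idea of the abstract, ranking parameters by accumulated absolute gradient and giving the least important ones the fewest bits, can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names, the equal-sized rank groups (standing in for the clustering step), the use of a 0-bit width to represent pruning, and the signed fixed-point format with all non-sign bits fractional are assumptions made here for illustration.

```python
def accumulate_abs_gradients(grad_history):
    """Sum |gradient| per parameter over all recorded training steps."""
    totals = [0.0] * len(grad_history[0])
    for step in grad_history:
        for i, g in enumerate(step):
            totals[i] += abs(g)
    return totals


def assign_bit_widths(importance, bit_widths):
    """Give each parameter a bit width based on its importance rank.

    Assumption: parameters are split into equal-sized rank groups, one per
    entry of bit_widths, ordered from least to most important. The thesis
    clusters the accumulated gradients instead; this is only a stand-in.
    """
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    bits = [0] * len(importance)
    n_groups = len(bit_widths)
    for rank, idx in enumerate(order):
        group = rank * n_groups // len(order)
        bits[idx] = bit_widths[group]
    return bits


def to_fixed_point(value, bits):
    """Round value to a signed fixed-point number with `bits` total bits.

    Assumption: one sign bit and bits - 1 fractional bits, so the
    representable range is [-1, 1). A width of 0 bits prunes the parameter.
    """
    if bits == 0:
        return 0.0
    scale = 2 ** (bits - 1)
    q = round(value * scale)
    q = max(-scale, min(scale - 1, q))  # saturate to the representable range
    return q / scale
```

For example, with accumulated gradients of roughly 0.9, 0.03, 0.3 and 0.0 for four parameters and candidate widths [0, 2, 4, 8], assign_bit_widths returns [8, 2, 4, 0]: the parameter with the largest accumulated gradient keeps the most bits, and the one that never received a gradient is pruned.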

Keywords

Deep Neural Networks, Quantization, Pruning, Fixed-Point
