SNACS: Slimming Neural Networks Using Adaptive Connectivity Scores
Published in TNNLS (Under Review), 2021
Summary: SNACS advances the state of the art in single-shot neural network pruning by focusing on three key aspects of the pruning pipeline: 1) faster computation of connectivity scores, which determine the importance of a weight, 2) guidelines that automate the definition of upper pruning percentage limits for all layers of a neural network, and 3) identification of sensitivity as a priority measure to determine which weights are protected or pruned.
Abstract: There are two broad approaches to deep neural network (DNN) pruning: 1) applying a deterministic constraint on the weight matrices, which takes advantage of their ease of implementation and underlying structure, and 2) using a probabilistic framework aimed at reducing the redundancy in the information content between layers while maintaining the flow of important information downstream. Each approach has its merits and limitations, and some practical issues affect both, e.g., the manual effort required to analyze the sensitivity of layers to pruning or to set the upper pruning percentage limits of layers, and the resources spent iteratively pruning DNNs. In this work, we propose a new single-shot pruning algorithm, called Slimming Neural networks using Adaptive Connectivity Scores (SNACS), that integrates the two common approaches to pruning and overcomes their respective limitations. Our proposed approach combines a probabilistic pruning framework with constraints on the underlying weight matrices at multiple levels to capitalize on the strengths of both approaches while addressing their deficiencies.
In SNACS, we propose a hash-based estimator of Adaptive Conditional Mutual Information (ACMI), which uses a weight-based scaling criterion, to evaluate the strength of the connectivity between filters of adjacent layers, and we use this measure to prune unimportant filters. To protect against excessive or irregular pruning, we identify a set of operating constraints that automate the definition of upper pruning percentage limits across all layers of a deep network. Finally, we take further advantage of the weight matrices by defining a novel sensitivity criterion for filters that measures the strength of their contributions to the succeeding layer and highlights critical filters that need to be protected from pruning. Through our experimental validation, we show that SNACS is over 17x faster than the nearest comparable method and is the state-of-the-art single-shot pruning method across three standard Dataset-DNN pruning benchmarks: CIFAR10-VGG16, CIFAR10-ResNet56 and ILSVRC2012-ResNet50.
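To make the pipeline concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of how a connectivity-score-driven, single-shot filter pruning pass might be structured on toy weight tensors. The `connectivity_score` and `sensitivity` functions are simple L1/max-based stand-ins for the paper's hash-based ACMI estimator and sensitivity criterion, and the per-layer upper pruning limit and protected fraction are assumed constants rather than the automatically derived constraints described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the weight tensors of two adjacent conv layers:
# layer l has 32 filters; layer l+1 consumes those 32 channels.
W_l = rng.normal(size=(32, 16, 3, 3))    # (out_channels, in_channels, kH, kW)
W_l1 = rng.normal(size=(64, 32, 3, 3))

def connectivity_score(w_next, filter_idx):
    """Stand-in connectivity score for filter `filter_idx` of layer l.

    SNACS estimates this with hash-based ACMI between adjacent layers; here we
    approximate 'how strongly the next layer depends on this filter' by the
    L1 mass of the next layer's weights that read from this channel.
    """
    return np.abs(w_next[:, filter_idx]).sum()

def sensitivity(w_next, filter_idx):
    """Stand-in sensitivity: strength of the filter's contribution downstream,
    taken here as its largest absolute outgoing weight (an assumption)."""
    return np.abs(w_next[:, filter_idx]).max()

num_filters = W_l.shape[0]
scores = np.array([connectivity_score(W_l1, i) for i in range(num_filters)])
sens = np.array([sensitivity(W_l1, i) for i in range(num_filters)])

# Assumed operating constraints (SNACS derives the upper limit automatically).
upper_prune_limit = 0.5     # never prune more than 50% of this layer
protect_fraction = 0.10     # protect the top-10% most sensitive filters

n_protect = int(np.ceil(protect_fraction * num_filters))
protected = set(np.argsort(sens)[-n_protect:])
max_prunable = int(upper_prune_limit * num_filters)

# Prune the lowest-scoring, unprotected filters up to the per-layer limit.
prune_order = np.argsort(scores)
to_prune = [i for i in prune_order if i not in protected][:max_prunable]

W_l[to_prune] = 0.0          # zero out pruned filters (single-shot, no retraining here)
W_l1[:, to_prune] = 0.0      # and the next layer's weights that consume them

print(f"Pruned {len(to_prune)}/{num_filters} filters in layer l")
```

In a real network this loop would run once per prunable layer, with the connectivity scores estimated from activations rather than weights alone; the sketch only illustrates how scores, sensitivity-based protection, and per-layer limits interact in a single shot.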
Recommended citation: Ganesh, M. R., Corso, J. J., and Sekeh, S. Y. SNACS: Slimming Neural Networks Using Adaptive Connectivity Scores.