Multi-iteration compression for deep neural networks
Abstract:
A multi-iteration method for compressing a deep neural network into a sparse neural network without degrading the accuracy is disclosed herein. In an example, the method includes determining a respective initial compression ratio for each of a plurality of matrices characterizing the weights between the neurons of the neural network, compressing each of the plurality of matrices based on the respective initial compression ratio, so as to obtain a compressed neural network, and fine-tuning the compressed neural network.
Public/Granted literature
Information query
Patent Agency Ranking
0/0