Deep Compression: compressing deep neural networks
speaker: Brahim
event: Papers We Love SG

** goal: bring deep neural networks to embedded systems by reducing power consumption and storage needed
** pruning: remove low-magnitude weights, retrain the remaining weights, repeat
** quantization: fewer bits per weight via scalar quantization (weights shared through centroids), then fine-tune the centroids
** Huffman encoding of the quantized weights
** design custom ASICs for the compressed network
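The pruning step can be sketched as magnitude pruning: zero out every weight below a threshold, keep a mask of the survivors, and use that mask during retraining so pruned positions stay zero. This is a minimal illustration, not the paper's exact training loop; the function name and threshold value are made up for the example.

```python
import numpy as np

def prune(weights, threshold):
    """Magnitude pruning sketch: zero out weights below the threshold.

    Returns the pruned weights and a boolean mask of surviving weights;
    during retraining, gradients would be multiplied by this mask so
    pruned connections stay removed.
    """
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Toy 2x3 weight matrix.
w = np.array([[0.8, -0.05, 0.3],
              [0.01, -0.9, 0.2]])
pruned, mask = prune(w, threshold=0.1)
# pruned: the two small weights (-0.05 and 0.01) are now zero.
```

After pruning, only the nonzero weights and their (sparse) indices need to be stored, which is what makes the later Huffman step pay off.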
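The quantization step clusters the surviving weight values so many weights share one centroid: only a small codebook plus a low-bit index per weight is stored (log2(n_clusters) bits instead of 32). A minimal 1-D k-means sketch, assuming linear initialization across the weight range; the function name and toy data are illustrative, and the paper's centroid fine-tuning (accumulating gradients per cluster) is not shown.

```python
import numpy as np

def kmeans_quantize(weights, n_clusters=4, n_iters=20):
    """Scalar quantization sketch: cluster weight values with 1-D k-means.

    Every weight is replaced by its nearest centroid, so the network
    stores a shared codebook plus one small index per weight.
    """
    flat = weights.ravel()
    # Linear initialization across the weight range.
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for k in range(n_clusters):
            members = flat[idx == k]
            if members.size:
                centroids[k] = members.mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, idx.reshape(weights.shape)

w = np.array([0.11, 0.09, -0.5, -0.48, 0.1, 0.92])
codebook, codes = kmeans_quantize(w, n_clusters=3)
quantized = codebook[codes]  # each weight snapped to its shared centroid
```

Fine-tuning then adjusts the centroids themselves rather than individual weights, so the compression is preserved while accuracy is recovered.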
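The final Huffman step losslessly shrinks the stream of quantization indices: frequent indices get short bit strings, rare ones longer. A self-contained sketch using the standard library; the toy index stream is made up for the example.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code table from symbol frequencies.

    Repeatedly merges the two least frequent subtrees, prefixing their
    codes with '0' and '1'; the result is a prefix-free variable-length
    code where common symbols get the shortest bit strings.
    """
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tiebreaker, partial code table).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, tick, merged))
        tick += 1
    return heap[0][2]

# Toy stream of quantization indices; index 0 dominates.
stream = [0, 0, 0, 0, 1, 1, 2, 0]
table = huffman_codes(stream)
encoded = "".join(table[s] for s in stream)
# 11 bits total vs 16 bits at a fixed 2 bits per index.
```

Because both the quantized weights and the sparse-index deltas have skewed distributions after pruning and quantization, Huffman coding buys a further size reduction at no accuracy cost.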