Summary: This is a non-exhaustive list of references for this component.
Table of Contents
2013
2015
2016
- Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
- EIE: Efficient Inference Engine on Compressed Deep Neural Network
- Dynamic Network Surgery for Efficient DNNs
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
- Learning Structured Sparsity in Deep Neural Networks
2017
- Soft Weight-Sharing for Neural Network Compression
- Variational Dropout Sparsifies Deep Neural Networks
- Structured Bayesian Pruning via Log-Normal Multiplicative Noise
- Bayesian Compression for Deep Learning
- ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
- To prune, or not to prune: exploring the efficacy of pruning for model compression
- A Survey of Model Compression and Acceleration for Deep Neural Networks
2018
- Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights
- Recent Advances in Efficient Computation of Deep Convolutional Neural Networks
- Bayesian Compression for Natural Language Processing
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
- Rethinking the Value of Network Pruning
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices
2019
- Stabilizing the Lottery Ticket Hypothesis
- Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
- Weight Agnostic Neural Networks
- The State of Sparsity in Deep Neural Networks
2020
- A Survey on Deep Neural Network Compression: Challenges, Overview, and Solutions
- An Overview of Neural Network Compression
2021
- Pruning and Quantization for Deep Neural Network Acceleration: A Survey
- Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
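For orientation, the sketch below shows one-shot unstructured magnitude pruning, the baseline that several of the papers above start from (e.g., "To prune, or not to prune"). The toy model, layer selection, and sparsity level are illustrative assumptions, not code from any of the cited works.

```python
# Hedged sketch: one-shot unstructured magnitude pruning in PyTorch.
# The model, target layers, and sparsity level are illustrative only.
import torch
import torch.nn as nn

def magnitude_prune_(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the smallest-magnitude weights of every Linear/Conv2d layer, in place."""
    for layer in model.modules():
        if isinstance(layer, (nn.Linear, nn.Conv2d)):
            weight = layer.weight.data
            k = int(sparsity * weight.numel())
            if k == 0:
                continue
            # k-th smallest absolute value becomes the pruning threshold
            threshold = weight.abs().flatten().kthvalue(k).values
            weight.mul_((weight.abs() > threshold).float())

# Hypothetical toy model, used only to demonstrate the call.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
magnitude_prune_(model, sparsity=0.8)

weights = [p for p in model.parameters() if p.dim() > 1]
total = sum(p.numel() for p in weights)
zeros = sum(int((p == 0).sum()) for p in weights)
print(f"weight sparsity: {zeros / total:.2f}")  # roughly 0.80; biases are left untouched
```

In practice the papers above typically prune gradually during training and follow up with fine-tuning; this one-shot version only illustrates the thresholding step.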
See NAS.
Note: Representation Learning Models (RLMs) may or may not be pretrained; in either case, their training (or reuse) takes place during the Preprocessing stage. RLMs can come from NLP, Computer Vision, or any other domain.
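To make the note above concrete, here is a minimal sketch of reusing a pretrained RLM during a preprocessing step to turn raw text into fixed-size features. The checkpoint name, the mean-pooling choice, and the Hugging Face `transformers` dependency are illustrative assumptions about tooling, not part of this component.

```python
# Hedged sketch: reusing a pretrained representation model during Preprocessing.
# Checkpoint and pooling strategy are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased"  # assumption: any pretrained encoder would do

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()  # reuse only: no fine-tuning happens during preprocessing

@torch.no_grad()
def preprocess(texts):
    """Map raw strings to mean-pooled sentence embeddings."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # ignore padding tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # (batch, dim)

embeddings = preprocess(["model compression", "network pruning"])
print(embeddings.shape)  # e.g., torch.Size([2, 768])
```

The same pattern applies to a non-pretrained RLM: the encoder would simply be trained during Preprocessing before being used to produce features.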
See also Awesome Efficient PLM Papers.
2019
2021
- NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
- Towards Efficient Post-training Quantization of Pre-trained Language Models
- Compression of Generative Pre-trained Language Models via Quantization
- Synergistic Self-supervised and Quantization Learning
- Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
2022