# Awesome Knowledge Distillation in Computer Vision

[TOC]

## Diffusion Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| A Comprehensive Survey on Knowledge Distillation of Diffusion Models | arXiv 2023 | Weijian Luo. [pdf] |
| Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed | arXiv 2021 | Eric Luhman, Troy Luhman. [pdf] |
| Progressive Distillation for Fast Sampling of Diffusion Models | ICLR 2022 | Tim Salimans, Jonathan Ho. [pdf] |
| On Distillation of Guided Diffusion Models | CVPR 2023 | Chenlin Meng, Robin Rombach, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans. [pdf] |
| TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation | arXiv 2023 | David Berthelot, Arnaud Autef, Jierui Lin, Dian Ang Yap, Shuangfei Zhai, Siyuan Hu, Daniel Zheng, Walter Talbott, Eric Gu. [pdf] |
| BK-SDM: Architecturally Compressed Stable Diffusion for Efficient Text-to-Image Generation | ICML 2023 | Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi. [pdf] |
| On Architectural Compression of Text-to-Image Diffusion Models | arXiv 2023 | Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi. [pdf] |
| Knowledge Diffusion for Distillation | arXiv 2023 | Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Chang Xu. [pdf] |
| SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds | arXiv 2023 | Yanyu Li, Huan Wang, Qing Jin, Ju Hu, Pavlo Chemerys, Yun Fu, Yanzhi Wang, Sergey Tulyakov, Jian Ren. [pdf] |
| BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping | arXiv 2023 | Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Lingjie Liu, Josh Susskind. [pdf] |
| Consistency Models | arXiv 2023 | Yang Song, Prafulla Dhariwal, Mark Chen, Ilya Sutskever. [pdf] |
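
Most of the entries above trade sampling steps for a distillation loss. As a rough illustration, the sketch below mimics the step-halving idea popularized by progressive distillation: the student is trained so that one of its update steps matches two consecutive teacher steps. The `ToyDenoiser`, the Euler-style `euler_step`, and the schedule are placeholder assumptions for readability, not the parameterization used in any of the papers above.

```python
import torch
import torch.nn as nn

# Toy denoiser predicting noise from (x_t, t); stands in for a U-Net.
class ToyDenoiser(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t[:, None]], dim=-1))

def euler_step(model, x, t, dt):
    # Illustrative deterministic update x_{t-dt} = x_t - dt * eps_hat.
    return x - dt * model(x, t)

teacher, student = ToyDenoiser(), ToyDenoiser()
student.load_state_dict(teacher.state_dict())        # warm-start from the teacher
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

dt = 0.05                                             # one student step = two teacher steps
for _ in range(10):                                   # toy training loop on random data
    x = torch.randn(8, 16)
    t = torch.rand(8) * (1 - 2 * dt) + 2 * dt         # sample a valid timestep
    with torch.no_grad():                             # two small teacher steps
        x_mid = euler_step(teacher, x, t, dt)
        x_target = euler_step(teacher, x_mid, t - dt, dt)
    x_pred = euler_step(student, x, t, 2 * dt)        # one big student step
    loss = torch.mean((x_pred - x_target) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
```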

## Knowledge Distillation for Semantic Segmentation

| Title | Venue | Note |
| --- | --- | --- |
| Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks | arXiv:1709.00513 | |
| Knowledge Distillation for Semantic Segmentation | | |
| Structured Knowledge Distillation for Semantic Segmentation | CVPR 2019 | |
| Intra-class Feature Variation Distillation for Semantic Segmentation | ECCV 2020 | |
| Channel-wise Knowledge Distillation for Dense Prediction | ICCV 2021 | |
| Double Similarity Distillation for Semantic Image Segmentation | TIP 2021 | |
| Cross-Image Relational Knowledge Distillation for Semantic Segmentation | CVPR 2022 | |
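
Several of the segmentation methods above distill dense prediction maps rather than per-image logits. Below is a minimal sketch of a channel-wise distillation loss in the spirit of "Channel-wise Knowledge Distillation for Dense Prediction": each channel is softened into a spatial distribution and matched with a KL term. The temperature and the choice to distill logits (rather than intermediate features) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def channelwise_kd_loss(feat_s, feat_t, tau=4.0):
    # Normalize each channel's activation map into a distribution over
    # spatial positions, then ask the student to match the teacher per channel.
    n, c, h, w = feat_s.shape
    s = F.log_softmax(feat_s.reshape(n, c, h * w) / tau, dim=-1)
    t = F.softmax(feat_t.reshape(n, c, h * w) / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (tau ** 2)

# Toy usage: segmentation logits of shape (batch, classes, H, W).
student_logits = torch.randn(2, 19, 64, 64, requires_grad=True)
teacher_logits = torch.randn(2, 19, 64, 64)
loss = channelwise_kd_loss(student_logits, teacher_logits)
loss.backward()
```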

## Knowledge Distillation for Object Detection

| Title | Venue | Note |
| --- | --- | --- |
| Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks | arXiv:1709.00513 | |
| Mimicking Very Efficient Network for Object Detection | CVPR 2017 | [pdf] |
| Distilling Object Detectors with Fine-grained Feature Imitation | CVPR 2019 | [pdf] |
| General Instance Distillation for Object Detection | CVPR 2021 | [pdf] |
| Distilling Object Detectors via Decoupled Features | CVPR 2021 | [pdf] |
| Distilling Object Detectors with Feature Richness | NeurIPS 2021 | [pdf] |
| Focal and Global Knowledge Distillation for Detectors | CVPR 2022 | [pdf] |
| Rank Mimicking and Prediction-guided Feature Imitation | AAAI 2022 | [pdf] |
| Prediction-Guided Distillation | ECCV 2022 | [pdf] |
| Masked Distillation with Receptive Tokens | ICLR 2023 | [pdf] |
| Structural Knowledge Distillation for Object Detection | NeurIPS 2022 | [OpenReview] |
| Dual Relation Knowledge Distillation for Object Detection | IJCAI 2023 | [pdf] |
| GLAMD: Global and Local Attention Mask Distillation for Object Detectors | ECCV 2022 | [ECVA] |
| G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation | ICCV 2021 | [CVF] |
| PKD: General Distillation Framework for Object Detectors via Pearson Correlation Coefficient | NeurIPS 2022 | [OpenReview] |
| MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection | ECCV 2020 | [ECVA] |
| LabelEnc: A New Intermediate Supervision Method for Object Detection | ECCV 2020 | [ECVA] |
| HEtero-Assists Distillation for Heterogeneous Object Detectors | ECCV 2022 | [HEAD] |
| LGD: Label-Guided Self-Distillation for Object Detection | AAAI 2022 | [LGD] |
| When Object Detection Meets Knowledge Distillation: A Survey | TPAMI | |
| ScaleKD: Distilling Scale-Aware Knowledge in Small Object Detector | CVPR 2023 | [ScaleKD] |
| CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection | arXiv:2306.11369 | [CrossKD] |
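
A recurring ingredient in the detectors listed above is masked feature imitation: the student copies teacher FPN features only in informative regions. The sketch below derives the mask from ground-truth boxes for simplicity; the actual papers use anchor-IoU, attention, or prediction-quality masks, so treat this as a stand-in rather than any single method.

```python
import torch

def masked_feature_imitation(feat_s, feat_t, boxes, stride=8):
    # MSE between student and teacher feature maps inside ground-truth boxes.
    n, c, h, w = feat_t.shape
    mask = torch.zeros(n, 1, h, w, device=feat_t.device)
    for i, img_boxes in enumerate(boxes):                  # boxes in image pixels
        for x1, y1, x2, y2 in img_boxes:
            mask[i, :, int(y1 // stride):int(y2 // stride) + 1,
                       int(x1 // stride):int(x2 // stride) + 1] = 1.0
    # Assumes student features were already projected to teacher channels.
    loss = ((feat_s - feat_t) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)
    return loss

feat_t = torch.randn(1, 256, 64, 64)
feat_s = torch.randn(1, 256, 64, 64, requires_grad=True)
loss = masked_feature_imitation(feat_s, feat_t, boxes=[[(32, 40, 200, 180)]])
loss.backward()
```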

## Knowledge Distillation in Vision Transformers

| Title | Venue | Note |
| --- | --- | --- |
| Training Data-efficient Image Transformers & Distillation through Attention | ICML 2021 | |
| Co-advise: Cross Inductive Bias Distillation | CVPR 2022 | |
| TinyViT: Fast Pretraining Distillation for Small Vision Transformers | arXiv:2207.10666 | |
| Attention Probe: Vision Transformer Distillation in the Wild | ICASSP 2022 | |
| DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers | CVPR 2022 | |
| Efficient Vision Transformers via Fine-Grained Manifold Distillation | NeurIPS 2022 | |
| Cross-Architecture Knowledge Distillation | arXiv:2207.05273 | |
| MiniViT: Compressing Vision Transformers with Weight Multiplexing | CVPR 2022 | |
| ViTKD: Practical Guidelines for ViT Feature Knowledge Distillation | arXiv 2022 | [code] |
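
For reference, the hard-label distillation objective introduced with DeiT ("Training Data-efficient Image Transformers & Distillation through Attention") can be written in a few lines, assuming the model already exposes separate class-token and distillation-token logits:

```python
import torch
import torch.nn.functional as F

def deit_hard_distillation_loss(cls_logits, dist_logits, teacher_logits, targets):
    # Class token learns from ground truth; distillation token learns from the
    # teacher's hard predictions (the "hard" variant described in the paper).
    teacher_labels = teacher_logits.argmax(dim=-1)
    return 0.5 * F.cross_entropy(cls_logits, targets) + \
           0.5 * F.cross_entropy(dist_logits, teacher_labels)

# Toy usage with a 10-class problem.
cls_logits = torch.randn(4, 10, requires_grad=True)
dist_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
deit_hard_distillation_loss(cls_logits, dist_logits, teacher_logits, targets).backward()
```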

## Knowledge Distillation for Teacher-Student Gaps

| Title | Venue | Note |
| --- | --- | --- |
| Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher | AAAI 2020 | |
| Search to Distill: Pearls are Everywhere but not the Eyes | CVPR 2020 | |
| Reducing the Teacher-Student Gap via Spherical Knowledge Distillation | arXiv 2020 | |
| Knowledge Distillation via the Target-aware Transformer | CVPR 2022 | |
| Decoupled Knowledge Distillation | CVPR 2022 | [code] |
| Prune Your Model Before Distill It | ECCV 2022 | [code] |
| Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again | NeurIPS 2022 | |
| Weighted Distillation with Unlabeled Examples | NeurIPS 2022 | |
| Respecting Transfer Gap in Knowledge Distillation | NeurIPS 2022 | |
| Knowledge Distillation from A Stronger Teacher | arXiv:2205.10536 | |
| Masked Generative Distillation | ECCV 2022 | [code] |
| Curriculum Temperature for Knowledge Distillation | AAAI 2023 | [code] |
| Knowledge Distillation: A Good Teacher is Patient and Consistent | CVPR 2022 | |
| Knowledge Distillation with the Reused Teacher Classifier | CVPR 2022 | |
| Scaffolding a Student to Instill Knowledge | ICLR 2023 | |
| Function-Consistent Feature Distillation | ICLR 2023 | |
| Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation | ICLR 2023 | |
| Supervision Complexity and its Role in Knowledge Distillation | ICLR 2023 | |
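
Many entries here address the capacity gap between a large teacher and a small student. The sketch below illustrates the simplest remedy on this list, a teacher assistant (as in "Improved Knowledge Distillation via Teacher Assistant"): distill the teacher into a mid-sized assistant, then the assistant into the student, reusing a standard KD loss. Models, data, and hyperparameters are toy placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, tau=4.0, alpha=0.5):
    soft = F.kl_div(F.log_softmax(student_logits / tau, dim=-1),
                    F.softmax(teacher_logits / tau, dim=-1),
                    reduction="batchmean") * tau ** 2
    return alpha * soft + (1 - alpha) * F.cross_entropy(student_logits, targets)

def distill(teacher, student, loader, epochs=1):
    opt = torch.optim.SGD(student.parameters(), lr=0.01)
    teacher.eval()
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():
                t_logits = teacher(x)
            loss = kd_loss(student(x), t_logits, y)
            opt.zero_grad(); loss.backward(); opt.step()
    return student

# Toy models of decreasing capacity and random data standing in for a dataset.
teacher   = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
assistant = nn.Sequential(nn.Linear(32, 64),  nn.ReLU(), nn.Linear(64, 10))
student   = nn.Sequential(nn.Linear(32, 16),  nn.ReLU(), nn.Linear(16, 10))
loader = [(torch.randn(8, 32), torch.randint(0, 10, (8,))) for _ in range(5)]

distill(teacher, assistant, loader)   # step 1: teacher -> assistant
distill(assistant, student, loader)   # step 2: assistant -> student
```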

## Logits Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| Distilling the Knowledge in a Neural Network | arXiv:1503.02531 | |
| Deep Model Compression: Distilling Knowledge from Noisy Teachers | arXiv:1610.09650 | |
| Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data | ICLR 2017 | |
| Knowledge Adaptation: Teaching to Adapt | arXiv:1702.02052 | |
| Learning from Multiple Teacher Networks | KDD 2017 | |
| Mean Teachers are Better Role Models: Weight-averaged Consistency Targets Improve Semi-supervised Deep Learning Results | NIPS 2017 | |
| Training Deep Neural Networks in Generations: A More Tolerant Teacher Educates Better Students | arXiv:1805.05551 | |
| Moonshine: Distilling with Cheap Convolutions | NIPS 2018 | |
| Positive-Unlabeled Compression on the Cloud | NIPS 2019 | |
| Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework | arXiv:1910.12061 | |
| Preparing Lessons: Improve Knowledge Distillation with Better Supervision | arXiv:1911.07471 | |
| Adaptive Regularization of Labels | arXiv:1908.05474 | |
| Learning Metrics from Teachers: Compact Networks for Image Embedding | CVPR 2019 | |
| Diversity with Cooperation: Ensemble Methods for Few-Shot Classification | ICCV 2019 | |
| Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher | arXiv:1902.03393 | |
| MEAL: Multi-Model Ensemble via Adversarial Learning | AAAI 2019 | |
| Revisit Knowledge Distillation: a Teacher-free Framework | CVPR 2020 | [code] |
| Ensemble Distribution Distillation | ICLR 2020 | |
| Noisy Collaboration in Knowledge Distillation | ICLR 2020 | |
| Self-training with Noisy Student Improves ImageNet Classification | CVPR 2020 | |
| QUEST: Quantized Embedding Space for Transferring Knowledge | CVPR 2020 | |
| Meta Pseudo Labels | ICML 2020 | |
| Subclass Distillation | ICML 2020 | |
| Boosting Self-Supervised Learning via Knowledge Transfer | CVPR 2018 | |
| Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation from a Blackbox Model | CVPR 2020 | [code] |
| Regularizing Class-wise Predictions via Self-knowledge Distillation | CVPR 2020 | [code] |
| Rethinking Data Augmentation: Self-Supervision and Self-Distillation | ICLR 2020 | |
| What it Thinks is Important is Important: Robustness Transfers through Input Gradients | CVPR 2020 | |
| Role-Wise Data Augmentation for Knowledge Distillation | ICLR 2020 | [code] |
| Distilling Effective Supervision from Severe Label Noise | CVPR 2020 | |
| Learning with Noisy Class Labels for Instance Segmentation | ECCV 2020 | |
| Self-Distillation Amplifies Regularization in Hilbert Space | arXiv:2002.05715 | |
| MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers | arXiv:2002.10957 | |
| Hydra: Preserving Ensemble Diversity for Model Distillation | arXiv:2001.04694 | |
| Teacher-Class Network: A Neural Network Compression Mechanism | arXiv:2004.03281 | |
| Learning from a Lightweight Teacher for Efficient Knowledge Distillation | arXiv:2005.09163 | |
| Self-Distillation as Instance-Specific Label Smoothing | arXiv:2006.05065 | |
| Self-supervised Knowledge Distillation for Few-shot Learning | arXiv:2006.09785 | |
| Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation | arXiv:2007.01951 | |
| Few Sample Knowledge Distillation for Efficient Network Compression | CVPR 2020 | |
| Learning What and Where to Transfer | ICML 2019 | |
| Transferring Knowledge across Learning Processes | ICLR 2019 | |
| Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval | ICCV 2019 | |
| Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation | arXiv:1911.05329 | |
| Progressive Knowledge Distillation For Generative Modeling | ICLR 2020 | |
| Few Shot Network Compression via Cross Distillation | AAAI 2020 | |
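
Most works in this section build on the softened-logit objective of "Distilling the Knowledge in a Neural Network": a KL term between temperature-scaled teacher and student distributions mixed with the usual cross-entropy. A minimal sketch:

```python
import torch
import torch.nn.functional as F

def hinton_kd_loss(student_logits, teacher_logits, targets, tau=4.0, alpha=0.9):
    # Soft targets: KL between temperature-softened distributions.
    soft = F.kl_div(F.log_softmax(student_logits / tau, dim=-1),
                    F.softmax(teacher_logits / tau, dim=-1),
                    reduction="batchmean") * tau ** 2   # tau^2 keeps gradient scale
    # Hard targets: standard cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(16, 100, requires_grad=True)
teacher_logits = torch.randn(16, 100)
targets = torch.randint(0, 100, (16,))
hinton_kd_loss(student_logits, teacher_logits, targets).backward()
```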

## Intermediate Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| FitNets: Hints for Thin Deep Nets | arXiv:1412.6550 | |
| Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer | ICLR 2017 | |
| Knowledge Projection for Effective Design of Thinner and Faster Deep Neural Networks | arXiv:1710.09505 | |
| A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning | CVPR 2017 | |
| Paraphrasing Complex Network: Network Compression via Factor Transfer | NIPS 2018 | |
| Knowledge Transfer with Jacobian Matching | ICML 2018 | |
| Like What You Like: Knowledge Distill via Neuron Selectivity Transfer | CVPR 2018 | |
| An Embarrassingly Simple Approach for Knowledge Distillation | MLR 2018 | |
| Self-supervised Knowledge Distillation Using Singular Value Decomposition | ECCV 2018 | |
| Learning Deep Representations with Probabilistic Knowledge Transfer | ECCV 2018 | |
| Correlation Congruence for Knowledge Distillation | ICCV 2019 | |
| Similarity-Preserving Knowledge Distillation | ICCV 2019 | |
| Variational Information Distillation for Knowledge Transfer | CVPR 2019 | |
| Knowledge Distillation via Instance Relationship Graph | CVPR 2019 | |
| Knowledge Distillation via Route Constrained Optimization | ICCV 2019 | |
| Stagewise Knowledge Distillation | arXiv:1911.06786 | |
| Distilling Object Detectors with Fine-grained Feature Imitation | CVPR 2019 | |
| Knowledge Squeezed Adversarial Network Compression | AAAI 2020 | |
| Knowledge Distillation from Internal Representations | AAAI 2020 | |
| Knowledge Flow: Improve Upon Your Teachers | ICLR 2019 | |
| LIT: Learned Intermediate Representation Training for Model Compression | ICML 2019 | |
| A Comprehensive Overhaul of Feature Distillation | ICCV 2019 | |
| Residual Knowledge Distillation | arXiv:2002.09168 | |
| Knowledge Distillation via Adaptive Instance Normalization | arXiv:2003.04289 | |
| Channel Distillation: Channel-Wise Attention for Knowledge Distillation | arXiv:2006.01683 | |
| Matching Guided Distillation | ECCV 2020 | |
| Differentiable Feature Aggregation Search for Knowledge Distillation | ECCV 2020 | |
| Local Correlation Consistency for Knowledge Distillation | ECCV 2020 | |
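
The common template in this section goes back to FitNets: pick an intermediate "hint" layer in the teacher, map the student's corresponding feature map to the same width with a small regressor, and minimize an L2 distance. A minimal sketch with illustrative channel sizes:

```python
import torch
import torch.nn as nn

class HintLoss(nn.Module):
    # FitNets-style hint learning: a 1x1 conv adapts the student's feature map
    # to the teacher's channel width before matching with mean-squared error.
    def __init__(self, student_channels=64, teacher_channels=256):
        super().__init__()
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, feat_s, feat_t):
        return torch.mean((self.regressor(feat_s) - feat_t) ** 2)

hint = HintLoss()
feat_s = torch.randn(2, 64, 28, 28, requires_grad=True)    # student hint layer
feat_t = torch.randn(2, 256, 28, 28)                        # teacher guided layer
loss = hint(feat_s, feat_t)
loss.backward()
```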

## Online Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| Deep Mutual Learning | CVPR 2018 | |
| Born-Again Neural Networks | ICML 2018 | |
| Knowledge Distillation by On-the-fly Native Ensemble | NIPS 2018 | |
| Collaborative Learning for Deep Neural Networks | NIPS 2018 | |
| Unifying Heterogeneous Classifiers with Distillation | CVPR 2019 | |
| Snapshot Distillation: Teacher-Student Optimization in One Generation | CVPR 2019 | |
| Deeply-supervised Knowledge Synergy | CVPR 2019 | |
| Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation | ICCV 2019 | |
| Distillation-Based Training for Multi-Exit Architectures | ICCV 2019 | |
| MSD: Multi-Self-Distillation Learning via Multi-classifiers within Deep Neural Networks | arXiv:1911.09418 | |
| FEED: Feature-level Ensemble for Knowledge Distillation | AAAI 2020 | |
| Stochasticity and Skip Connection Improve Knowledge Transfer | ICLR 2020 | |
| Online Knowledge Distillation with Diverse Peers | AAAI 2020 | |
| Online Knowledge Distillation via Collaborative Learning | CVPR 2020 | |
| Collaborative Learning for Faster StyleGAN Embedding | arXiv:2007.01758 | |
| Feature-map-level Online Adversarial Knowledge Distillation | ICML 2020 | |
| Knowledge Transfer via Dense Cross-layer Mutual-distillation | ECCV 2020 | |
| MetaDistiller: Network Self-boosting via Meta-learned Top-down Distillation | ECCV 2020 | |
| ResKD: Residual-Guided Knowledge Distillation | arXiv:2006.04719 | |
| Interactive Knowledge Distillation | arXiv:2007.01476 | |
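
Online methods dispense with a fixed, pretrained teacher. The sketch below follows the spirit of Deep Mutual Learning: two peer networks are trained together, each matching the other's soft predictions in addition to the ground-truth loss. The joint update (rather than the paper's alternating updates) and the toy MLPs are simplifications.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two peer students of similar size; there is no fixed teacher.
net1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
net2 = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(list(net1.parameters()) + list(net2.parameters()), lr=0.01)

def mutual_step(x, y):
    z1, z2 = net1(x), net2(x)
    # Each network treats the other's (detached) predictions as soft targets.
    kl_1 = F.kl_div(F.log_softmax(z1, dim=-1), F.softmax(z2.detach(), dim=-1),
                    reduction="batchmean")
    kl_2 = F.kl_div(F.log_softmax(z2, dim=-1), F.softmax(z1.detach(), dim=-1),
                    reduction="batchmean")
    loss = F.cross_entropy(z1, y) + F.cross_entropy(z2, y) + kl_1 + kl_2
    opt.zero_grad(); loss.backward(); opt.step()
    return loss

for _ in range(5):  # toy loop on random data
    mutual_step(torch.randn(8, 32), torch.randint(0, 10, (8,)))
```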

## Understanding Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| Do Deep Nets Really Need to be Deep? | NIPS 2014 | |
| When Does Label Smoothing Help? | NIPS 2019 | |
| Towards Understanding Knowledge Distillation | AAAI 2019 | |
| Harnessing Deep Neural Networks with Logical Rules | ACL 2016 | |
| Adaptive Regularization of Labels | arXiv:1908.05474 | |
| Knowledge Isomorphism between Neural Networks | arXiv:1908 | |
| Understanding and Improving Knowledge Distillation | arXiv:2002.03532 | |
| The State of Knowledge Distillation for Classification | arXiv:1912.10850 | |
| Explaining Knowledge Distillation by Quantifying the Knowledge | CVPR 2020 | |
| DeepVID: Deep Visual Interpretation and Diagnosis for Image Classifiers via Knowledge Distillation | IEEE Trans. 2019 | |
| On the Unreasonable Effectiveness of Knowledge Distillation: Analysis in the Kernel Regime | arXiv:2003.13438 | |
| Why Distillation Helps: A Statistical Perspective | arXiv:2005.10419 | |
| Transferring Inductive Biases through Knowledge Distillation | arXiv:2006.00555 | |
| Does Label Smoothing Mitigate Label Noise? | ICML 2020 | Michal Lukasik et al. |
| An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation | arXiv:2006.03810 | |
| Does Adversarial Transferability Indicate Knowledge Transferability? | arXiv:2006.14512 | |
| On the Demystification of Knowledge Distillation: A Residual Network Perspective | arXiv:2006.16589 | |
| Teaching To Teach By Structured Dark Knowledge | ICLR 2020 | |
| Inter-Region Affinity Distillation for Road Marking Segmentation | CVPR 2020 | [code] |
| Heterogeneous Knowledge Distillation using Information Flow Modeling | CVPR 2020 | [code] |
| Local Correlation Consistency for Knowledge Distillation | ECCV 2020 | |
| Few-Shot Class-Incremental Learning | CVPR 2020 | |
| Unifying Distillation and Privileged Information | ICLR 2016 | |

## Knowledge Distillation with Pruning, Quantization, and NAS

| Title | Venue | Note |
| --- | --- | --- |
| Accelerating Convolutional Neural Networks with Dominant Convolutional Kernel and Knowledge Pre-regression | ECCV 2016 | |
| N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning | ICLR 2018 | |
| Slimmable Neural Networks | ICLR 2018 | |
| Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy | NIPS 2018 | |
| MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning | ICCV 2019 | |
| LightPAFF: A Two-Stage Distillation Framework for Pre-training and Fine-tuning | ICLR 2020 | |
| Pruning with Hints: An Efficient Framework for Model Acceleration | ICLR 2020 | |
| Knapsack Pruning with Inner Distillation | arXiv:2002.08258 | |
| Training Convolutional Neural Networks with Cheap Convolutions and Online Distillation | arXiv:1909.13063 | |
| Cooperative Pruning in Cross-Domain Deep Neural Network Compression | IJCAI 2019 | |
| QKD: Quantization-aware Knowledge Distillation | arXiv:1911.12491 | |
| Neural Network Pruning with Residual-Connections and Limited-Data | CVPR 2020 | |
| Training Quantized Neural Networks with a Full-precision Auxiliary Module | CVPR 2020 | |
| Towards Effective Low-bitwidth Convolutional Neural Networks | CVPR 2018 | |
| Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations | arXiv:1908.04680 | |
| Paying More Attention to Snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation | arXiv:2006.11487 | |
| Knowledge Distillation Beyond Model Compression | arXiv:2007.01493 | |
| Teacher Guided Architecture Search | ICCV 2019 | |
| Distillation Guided Residual Learning for Binary Convolutional Neural Networks | ECCV 2020 | |
| MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution | ECCV 2020 | |
| Improving Neural Architecture Search Image Classifiers via Ensemble Learning | arXiv:1903.06236 | |
| Blockwisely Supervised Neural Architecture Search with Knowledge Distillation | arXiv:1911.13053 | |
| Towards Oracle Knowledge Distillation with Neural Architecture Search | AAAI 2020 | |
| Search for Better Students to Learn Distilled Knowledge | arXiv:2001.11612 | |
| Circumventing Outliers of AutoAugment with Knowledge Distillation | arXiv:2003.11342 | |
| Network Pruning via Transformable Architecture Search | NIPS 2019 | |
| Search to Distill: Pearls are Everywhere but not the Eyes | CVPR 2020 | |
| AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks | ICML 2020 | [code] |
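
Papers in this group combine distillation with pruning, quantization, or architecture search in different orders. As a generic illustration only (the listed methods differ in whether the teacher or the student is pruned, and in how sparsity is chosen), the sketch below prunes a copy of a network with `torch.nn.utils.prune` and then recovers it with KD fine-tuning:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

# Toy prune-then-distill pipeline; sizes, sparsity, and data are placeholders.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student.load_state_dict(teacher.state_dict())

for module in student:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.7)  # 70% sparsity

opt = torch.optim.SGD(student.parameters(), lr=0.01)
for _ in range(5):                                    # KD fine-tuning of the pruned net
    x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
    with torch.no_grad():
        t = teacher(x)
    s = student(x)
    kd = F.kl_div(F.log_softmax(s / 4, dim=-1), F.softmax(t / 4, dim=-1),
                  reduction="batchmean") * 16         # tau=4, scaled by tau^2
    loss = 0.5 * kd + 0.5 * F.cross_entropy(s, y)
    opt.zero_grad(); loss.backward(); opt.step()
```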

## Application of Knowledge Distillation

| Sub | Title | Venue |
| --- | --- | --- |
| Graph | Graph-based Knowledge Distillation by Multi-head Attention Network | arXiv:1907.02226 |
| | Graph Representation Learning via Multi-task Knowledge Distillation | arXiv:1911.05700 |
| | Deep Geometric Knowledge Distillation with Graphs | arXiv:1911.03080 |
| | Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification | IJCAI 2018 |
| | Distilling Knowledge from Graph Convolutional Networks | CVPR 2020 |
| Face | Face Model Compression by Distilling Knowledge from Neurons | AAAI 2016 |
| | MarginDistillation: Distillation for Margin-based Softmax | arXiv:2003.02586 |
| ReID | Distilled Person Re-Identification: Towards a More Scalable System | CVPR 2019 |
| | Robust Re-Identification by Multiple Views Knowledge Distillation | ECCV 2020 [code] |
| Detection | Learning Efficient Object Detection Models with Knowledge Distillation | NIPS 2017 |
| | Distilling Object Detectors with Fine-grained Feature Imitation | CVPR 2019 |
| | Relation Distillation Networks for Video Object Detection | ICCV 2019 |
| | Learning Lightweight Face Detector with Knowledge Distillation | IEEE 2019 |
| | Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection | ICCV 2019 |
| | Learning Lightweight Lane Detection CNNs by Self Attention Distillation | ICCV 2019 |
| | A Multi-Task Mean Teacher for Semi-Supervised Shadow Detection | CVPR 2020 [code] |
| | Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer | ECCV 2020 |
| | Temporal Self-Ensembling Teacher for Semi-Supervised Object Detection | IEEE 2020 [code] |
| | Uninformed Students: Student-Teacher Anomaly Detection with Discriminative Latent Embeddings | CVPR 2020 |
| | Distilling Knowledge from Refinement in Multiple Instance Detection Networks | arXiv:2004.10943 |
| | Enabling Incremental Knowledge Transfer for Object Detection at the Edge | arXiv:2004.05746 |
| Pose | DOPE: Distillation Of Part Experts for Whole-body 3D Pose Estimation in the Wild | ECCV 2020 |
| | Fast Human Pose Estimation | CVPR 2019 |
| | Distill Knowledge From NRSfM for Weakly Supervised 3D Pose Learning | ICCV 2019 |
| Segmentation | ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes | CVPR 2018 |
| | Knowledge Distillation for Incremental Learning in Semantic Segmentation | arXiv:1911.03462 |
| | Geometry-Aware Distillation for Indoor Semantic Segmentation | CVPR 2019 |
| | Structured Knowledge Distillation for Semantic Segmentation | CVPR 2019 |
| | Self-similarity Student for Partial Label Histopathology Image Segmentation | ECCV 2020 |
| | Knowledge Distillation for Brain Tumor Segmentation | arXiv:2002.03688 |
| Low-Vision | Lightweight Image Super-Resolution with Information Multi-distillation Network | ICCVW 2019 |
| | Collaborative Distillation for Ultra-Resolution Universal Style Transfer | CVPR 2020 [code] |
| Video | Efficient Video Classification Using Fewer Frames | CVPR 2019 |
| | Relation Distillation Networks for Video Object Detection | ICCV 2019 |
| | Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection | ICCV 2019 |
| | Progressive Teacher-student Learning for Early Action Prediction | CVPR 2019 |
| | MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization | arXiv:1910.12295 |
| | AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation | ICCV 2019 |
| | Dynamic Kernel Distillation for Efficient Pose Estimation in Videos | ICCV 2019 |
| | Online Model Distillation for Efficient Video Inference | ICCV 2019 |
| | Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer | ECCV 2020 |
| | Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition | ECCV 2020 |
| | Object Relational Graph with Teacher-Recommended Learning for Video Captioning | CVPR 2020 |
| | Spatio-Temporal Graph for Video Captioning with Knowledge Distillation | CVPR 2020 [code] |
| | TA-Student VQA: Multi-Agents Training by Self-Questioning | CVPR 2020 |

## Data-free Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| Data-Free Knowledge Distillation for Deep Neural Networks | NIPS 2017 | |
| Zero-Shot Knowledge Distillation in Deep Networks | ICML 2019 | |
| DAFL: Data-Free Learning of Student Networks | ICCV 2019 | |
| Zero-shot Knowledge Transfer via Adversarial Belief Matching | NIPS 2019 | |
| Dream Distillation: A Data-Independent Model Compression Framework | ICML 2019 | |
| Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion | CVPR 2020 | |
| Data-Free Adversarial Distillation | CVPR 2020 | |
| The Knowledge Within: Methods for Data-Free Model Compression | CVPR 2020 | |
| Knowledge Extraction with No Observable Data | NIPS 2019 | |
| Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN | CVPR 2020 | |
| DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier | arXiv:1912.11960 | |
| Generative Low-bitwidth Data Free Quantization | arXiv:2003.03603 | |
| This Dataset Does Not Exist: Training Models from Generated Images | arXiv:1911.02888 | |
| MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation | arXiv:2005.03161 | |
| Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data | ECCV 2020 | |
| Billion-scale Semi-supervised Learning for Image Classification | arXiv:1905.00546 | |
| Data-free Parameter Pruning for Deep Neural Networks | arXiv:1507.06149 | |
| Data-Free Quantization Through Weight Equalization and Bias Correction | ICCV 2019 | |
| DAC: Data-free Automatic Acceleration of Convolutional Networks | WACV 2019 | |
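
A common recipe in this section (e.g., DAFL and Data-Free Adversarial Distillation) replaces the training set with a generator: the student imitates the teacher on synthesized inputs, while the generator is pushed toward inputs where the two disagree. The sketch below is a heavily simplified version of that loop with toy MLPs; the real methods add activation, entropy, or batch-norm statistic losses on the generator.

```python
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
generator = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 32))

for p in teacher.parameters():
    p.requires_grad_(False)            # the teacher is fixed throughout

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
teacher.eval()

for _ in range(10):
    z = torch.randn(16, 8)

    # Student step: imitate the teacher on generated data (minimize L1 gap).
    x = generator(z).detach()
    loss_s = (student(x) - teacher(x)).abs().mean()
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    # Generator step: seek inputs where student and teacher disagree
    # (maximize the same gap, i.e. minimize its negative).
    x = generator(z)
    loss_g = -(student(x) - teacher(x)).abs().mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```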

## Cross-modal Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| SoundNet: Learning Sound Representations from Unlabeled Video | ECCV 2016 | SoundNet Architecture |
| Cross Modal Distillation for Supervision Transfer | CVPR 2016 | |
| Emotion Recognition in Speech Using Cross-modal Transfer in the Wild | ACM MM 2018 | |
| Through-Wall Human Pose Estimation Using Radio Signals | CVPR 2018 | |
| Compact Trilinear Interaction for Visual Question Answering | ICCV 2019 | |
| Cross-Modal Knowledge Distillation for Action Recognition | ICIP 2019 | |
| Learning to Map Nearly Anything | arXiv:1909.06928 | |
| Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval | ICCV 2019 | |
| UM-Adapt: Unsupervised Multi-Task Adaptation Using Adversarial Cross-Task Distillation | ICCV 2019 | |
| CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency | CVPR 2019 | |
| XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings | | |
| Effective Domain Knowledge Transfer with Soft Fine-tuning | arXiv:1909.02236 | |
| ASR is All You Need: Cross-modal Distillation for Lip Reading | arXiv:1911.12747 | |
| Knowledge Distillation for Semi-supervised Domain Adaptation | arXiv:1908.07355 | |
| Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition | arXiv:2001.01798 | |
| Cluster Alignment with a Teacher for Unsupervised Domain Adaptation | ICCV 2019 | |
| Attention Bridging Network for Knowledge Transfer | ICCV 2019 | |
| Unpaired Multi-modal Segmentation via Knowledge Distillation | arXiv:2001.03111 | |
| Multi-source Distilling Domain Adaptation | arXiv:1911.11554 | |
| Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | CVPR 2020 | |
| Improving Semantic Segmentation via Self-Training | arXiv:2004.14960 | |
| Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation | arXiv:2005.08213 | |
| Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation | arXiv:2005.07839 | |
| Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge | CVPR 2020 | |
| Large-Scale Domain Adaptation via Teacher-Student Learning | arXiv:1708.05466 | |
| Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data | IJCAI 2020 | |
| Distilling Cross-Task Knowledge via Relationship Matching | CVPR 2020 | [code] |
| Modality Distillation with Multiple Stream Networks for Action Recognition | ECCV 2018 | |
| Domain Adaptation through Task Distillation | ECCV 2020 | |
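
The canonical setup here, going back to "Cross Modal Distillation for Supervision Transfer", trains a student on an unlabeled second modality by matching a teacher that sees the paired first modality. A minimal sketch, with toy MLPs standing in for the RGB and depth encoders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Teacher sees modality A (e.g. RGB); student sees only paired modality B (e.g. depth).
rgb_teacher   = nn.Sequential(nn.Linear(48, 64), nn.ReLU(), nn.Linear(64, 10))
depth_student = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(depth_student.parameters(), lr=1e-3)

for p in rgb_teacher.parameters():
    p.requires_grad_(False)

for _ in range(5):
    rgb, depth = torch.randn(8, 48), torch.randn(8, 16)   # paired, unlabeled data
    with torch.no_grad():
        t = rgb_teacher(rgb)
    s = depth_student(depth)
    loss = F.kl_div(F.log_softmax(s / 2, dim=-1), F.softmax(t / 2, dim=-1),
                    reduction="batchmean") * 4             # tau=2, scaled by tau^2
    opt.zero_grad(); loss.backward(); opt.step()
```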

## Adversarial Knowledge Distillation

| Title | Venue | Note |
| --- | --- | --- |
| Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks | arXiv:1709.00513 | |
| KTAN: Knowledge Transfer Adversarial Network | arXiv:1810.08126 | |
| KDGAN: Knowledge Distillation with Generative Adversarial Networks | NIPS 2018 | |
| Adversarial Learning of Portable Student Networks | AAAI 2018 | |
| Adversarial Network Compression | ECCV 2018 | |
| Cross-Modality Distillation: A Case for Conditional Generative Adversarial Networks | ICASSP 2018 | |
| Adversarial Distillation for Efficient Recommendation with External Knowledge | TOIS 2018 | |
| Training Student Networks for Acceleration with Conditional Adversarial Networks | BMVC 2018 | |
| DAFL: Data-Free Learning of Student Networks | ICCV 2019 | |
| MEAL: Multi-Model Ensemble via Adversarial Learning | AAAI 2019 | |
| Exploiting the Ground-Truth: An Adversarial Imitation Based Knowledge Distillation Approach for Event Detection | AAAI 2019 | |
| Adversarially Robust Distillation | AAAI 2020 | |
| GAN-Knowledge Distillation for One-stage Object Detection | arXiv:1906.08467 | |
| Lifelong GAN: Continual Learning for Conditional Image Generation | arXiv:1908.03884 | |
| Compressing GANs using Knowledge Distillation | arXiv:1902.00159 | |
| Feature-map-level Online Adversarial Knowledge Distillation | ICML 2020 | |
| MineGAN: Effective Knowledge Transfer from GANs to Target Domains with Few Images | CVPR 2020 | |
| Distilling Portable Generative Adversarial Networks for Image Translation | AAAI 2020 | |
| GAN Compression: Efficient Architectures for Interactive Conditional GANs | CVPR 2020 | |
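
The adversarial variants above add a discriminator that tries to tell teacher outputs (or features) from student ones, so the student gets an extra "fool the discriminator" signal on top of a standard imitation loss. A minimal sketch with toy modules; the exact losses and where the discriminator is attached vary per paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
disc    = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
for p in teacher.parameters():
    p.requires_grad_(False)

for _ in range(10):
    x = torch.randn(16, 32)
    with torch.no_grad():
        t_logits = teacher(x)

    # Discriminator step: real = teacher outputs, fake = student outputs.
    d_real = disc(t_logits)
    d_fake = disc(student(x).detach())
    loss_d = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Student step: fool the discriminator and stay close to the teacher.
    s_logits = student(x)
    adv = F.binary_cross_entropy_with_logits(disc(s_logits), torch.ones_like(d_real))
    loss_s = adv + F.mse_loss(s_logits, t_logits)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```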
