Tensor Decomposition

A curated list of tensor decomposition resources for model compression.

📋 Research Papers

Transformer, LLM & more

Title	Venue	Year
MoE-I2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition	EMNLP Findings	2024
Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs	arXiv	2024
Dual-grained Lightweight Strategy	TPAMI	2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models	arXiv	2024
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients	arXiv	2024
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction	ICLR	2024
MoDeGPT: Modular Decomposition for Large Language Model Compression	arXiv	2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression	arXiv	2024
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models	arXiv	2024
AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning	EMNLP	2024
Adaptive Rank Selections for Low-Rank Approximation of Language Models	NAACL	2024
Dynamic Low-rank Estimation for Transformer-based Language Models	NAACL Findings	2024
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation	CVPR	2024
TRAWL: Tensor Reduced and Approximated Weights for Large Language Models	arXiv	2024
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization	ISCA	2024
Unified Low-rank Compression Framework for Click-through Rate Prediction	KDD ADS	2024
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models	NAACL	2024
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization	arXiv	2024
FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer	WACV	2024
FLoRA: Low-Rank Core Space for N-dimension	arXiv	2024
Enhancing GAN Performance Through Neural Architecture Search and Tensor Decomposition	ICASSP	2024
A Computational Study of Matrix Decomposition Methods for Compression of Pre-trained Transformers	PACLIC	2024
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation	ICML	2023
FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer	AAAI	2023
Compressing Transformers: Features Are Low-Rank, but Weights Are Not!	AAAI	2023
Strategies for Applying Low Rank Decomposition to Transformer-Based Models	NeurIPS	2022
Kronecker Decomposition for GPT Compression	ACL	2022
Language model compression with weighted low-rank factorization	ICLR	2022
DRONE: Data-aware Low-rank Compression for Large NLP Models	NeurIPS	2021
A Tensorized Transformer for Language Modeling	NeurIPS	2019

CNN & RNN

Title	Venue	Year
TEC-CNN: Towards Efficient Compressing Convolutional Neural Nets with Low-rank Tensor Decomposition	TOMM	2024
Activation Map Compression through Tensor Decomposition for Deep Learning	NeurIPS	2024
Geometry-aware training of factorized layers in tensor Tucker format	NeurIPS	2024
Robustness of Tensor Decomposition-Based Neural Network Compression	ICIP	2024
How to Train Your Unstable Looped Tensor Network	JSTSP	2024
Learning Low-Rank Tensor Cores with Probabilistic l0-Regularized Rank Selection for Model Compression	IJCAI	2024
Compact Model Training by Low-Rank Projection With Energy Transfer	TNNLS	2024
An Accuracy-Preserving Neural Network Compression via Tucker Decomposition	IEEE Transactions on Sustainable Computing	2024
Convolution Filter Compression via Sparse Linear Combinations of Quantized Basis	TNNLS	2024
Co-Exploring Structured Sparsification and Low-Rank Tensor Decomposition for Compact DNNs	TNNLS	2024
Coarse-To-Fine Tensor Trains for Compact Visual Representations	ICML	2024
Position: Tensor Networks are a Valuable Asset for Green AI	ICML	2024
Compression-aware Training of Neural Networks using Frank-Wolfe	arXiv	2024
A Practical Approach for Employing Tensor Train Decomposition in Edge Devices	IJPP	2024
Structure-Preserving Network Compression Via Low-Rank Induced Training Through Linear Layers Composition	arXiv	2024
Reduced storage direct tensor ring decomposition for convolutional neural networks compression	arXiv	2024
Federated Learning Using Coupled Tensor Train Decomposition	arXiv	2024
Neural Network Compression Based on Tensor Ring Decomposition	TNNLS	2024
Enhanced network compression through tensor decompositions and pruning	TNNLS	2024
Deep Convolutional Neural Network Compression Method: Tensor Ring Decomposition with Variational Bayesian Approach	Neural Processing Letters	2024
Deep Learning Model Compression With Rank Reduction in Tensor Decomposition	TNNLS	2023
MARS: Masked Automatic Ranks Selection in Tensor Decompositions	AISTATS	2023
Mixed-TD: Efficient Neural Network Accelerator with Layer-Specific Tensor Decomposition	FPL	2023
SVD-NAS: Coupling Low-Rank Approximation and Neural Architecture Search	WACV	2023
How Informative is the Approximation Error from Tensor Decomposition for Neural Network Compression?	ICLR	2023
Tensor shape search for efficient compression of tensorized data and neural networks	Applied Soft Computing	2023
Compressing convolutional neural networks with hierarchical Tucker-2 decomposition	Applied Soft Computing	2023
An effective low-rank compression with a joint rank selection followed by a compression-friendly training	Neural Networks	2023
Joint matrix decomposition for deep convolutional neural networks compression	Neurocomputing	2023
Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization	arXiv	2023
Knowledge Transfer via Decomposing Essential Information in Convolutional Neural Networks	TNNLS	2022
Compression of Deep Neural Networks based on quantized tensor decomposition to implement on reconfigurable hardware platforms	Neural Networks	2022
Teacher–student knowledge distillation based on decomposed deep feature representation for intelligent mobile applications	Expert Systems with Applications	2022
HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks	CVPR	2022
Towards Practical Control of Singular Values of Convolutional Layers	NeurIPS	2022
Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations	NeurIPS	2022
BATUDE: Budget-Aware Neural Network Compression Based on Tucker Decomposition	AAAI	2022
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition	AAAI	2022
Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination	SIMODS	2022
A novel compact design of convolutional layers with spatial transformation towards lower-rank representation for image classification	Knowledge-Based Systems	2022
Deep neural network compression by Tucker decomposition with nonlinear response	Knowledge-Based Systems	2022
Nested compression of convolutional neural networks with Tucker-2 decomposition	IJCNN	2022
PSM-nets: Compressing Neural Networks with Product of Sparse Matrices	IJCNN	2022
A Design Space Exploration Methodology for Enabling Tensor Train Decomposition in Edge Devices	SAMOS	2022
Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition	NeurIPS	2021
Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks	NeurIPS	2021
Tensor Regression Networks	JMLR	2021
Parameter Efficient Dynamic Convolution via Tensor Decomposition	BMVC	2021
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework	CVPR	2021
Pufferfish: Communication-efficient Models At No Extra Cost	MLSys	2021
Learning-based Tensor Decomposition with Adaptive Rank Penalty for CNNs Compression	MIPR	2021
Deep Convolutional Neural Network Compression via Coupled Tensor Decomposition	JSTSP	2021
Tensor Reordering for CNN Compression	ICASSP	2021
Block-term tensor neural networks	Neural Networks	2020
Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer	CVPR	2020
Few Sample Knowledge Distillation for Efficient Network Compression	CVPR	2020
Factorized Higher-Order CNNs with an Application to Spatio-Temporal Emotion Estimation	CVPR	2020
Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification	CVPRW	2020
T-Basis: a Compact Representation for Neural Networks	ICML	2020
PENNI: Pruned Kernel Sharing for Efficient CNN Inference	ICML	2020
A Novel Rank Selection Scheme in Tensor Ring Decomposition Based on Reinforcement Learning for Deep Neural Networks	ICASSP	2020
Holistic CNN Compression via Low-Rank Decomposition with Knowledge Transfer	TPAMI	2019
LTNN: A Layerwise Tensorized Compression of Multilayer Neural Network	TNNLS	2019
Efficient Neural Network Compression	CVPR	2019
ADA-Tucker: Compressing deep neural networks via adaptive dimension adjustment tucker decomposition	Neural Networks	2019
Learning Filter Basis for Convolutional Neural Network Compression	ICCV	2019
Automated Multi-Stage Compression of Neural Networks	ICCVW	2019
Compressing Deep Models using Multi Tensor Train Decomposition	ICCAIS	2019
Compressing Fully Connected Layers using Kronecker Tensor Decomposition	ICCSNT	2019
Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling	OpenReview	2019
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition	CVPR	2018
Wide Compression: Tensor Ring Nets	CVPR	2018
Self-supervised Knowledge Distillation Using Singular Value Decomposition	ECCV	2018
Extreme Network Compression via Filter Group Approximation	ECCV	2018
Network Decoupling: From Regular to Depthwise Separable Convolutions	BMVC	2018
On Compressing Deep Models by Low Rank and Sparse Decomposition	CVPR	2017
Coordinating Filters for Faster Deep Neural Networks	ICCV	2017
Factorized Convolutional Neural Networks	ICCVW	2017
Tensor Regression Networks with various Low-Rank Tensor Approximations	arXiv	2017
Accelerating Very Deep Convolutional Networks for Classification and Detection	TPAMI	2016
Convolutional Neural Networks With Low-rank Regularization	ICLR	2016
Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications	ICLR	2016
Towards Convolutional Neural Networks Compression via Global Error Reconstruction	IJCAI	2016
Accelerating Convolutional Neural Networks for Mobile Applications	MM	2016
Ultimate tensorization: compressing convolutional and FC layers alike	NIPSW	2016
Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition	ICLR	2015
Speeding up Convolutional Neural Networks with Low Rank Expansions	arXiv	2014

📚 Surveys

Title	Venue	Year
Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training	Journal of Systems Engineering and Electronics	2024
Tensor Decomposition for Model Reduction in Neural Networks: A Review	IEEE Circuits and Systems Magazine	2023
Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training	arXiv	2023
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives	arXiv	2023
High-performance tensor decompositions for compressing and accelerating deep neural networks	Tensors for Data Processing	2022
Tensor Methods in Computer Vision and Deep Learning	Proceedings of the IEEE	2021
Tensor Decomposition for Signal Processing and Machine Learning	IEEE Transactions on Signal Processing	2017
A literature survey of low-rank tensor approximation techniques	GAMM-Mitteilungen	2013
The Higher-Order Singular Value Decomposition: Theory and an Application	IEEE Signal Processing Magazine	2010
Tensor Decompositions and Applications	SIAM Review	2009