Multitask learning reference repositories

Recent papers and projects in multitask learning, fine-tuning, and their applications

Surveys

  • Zhang, Y., & Yang, Q. (2021). A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering.
  • Jiang et al. (2022). Transferability in deep learning: A survey.

Multitask Learning Basics

  • Caruana, R. (1997). Multitask learning. Machine Learning. paper
  • Caruana, R. (1996). Algorithms and applications for multitask learning. In ICML. paper
  • Duong et al. (2015). Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. In ACL.
  • Yang, Y., & Hospedales, T. (2016). Deep multi-task representation learning: A tensor factorisation approach. ICLR. paper
  • GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. ICLR 2019. paper
  • SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. NeurIPS 2019. paper
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018. paper
  • Multi-task Sequence to Sequence Learning. ICLR 2016. paper
  • The natural language decathlon: Multitask learning as question answering. arXiv 2018. paper
  • Understanding and Improving Information Transfer in Multi-Task Learning. ICLR 2020. paper
  • Multi-Task Deep Neural Networks for Natural Language Understanding. ACL 2019. paper

Task Relatedness

Theoretical notions of task relatedness.

  • Ben-David, S., & Schuller, R. (2003). Exploiting task relatedness for multiple task learning. In Learning Theory and Kernel Machines. paper
  • Ben-David et al. (2010). A theory of learning from different domains. Machine Learning. paper
  • Hanneke, S., & Kpotufe, S. (2019). On the value of target data in transfer learning. NeurIPS. paper
  • Du et al. (2020). Few-shot learning via learning the representation, provably. ICLR. paper

Measurements of task relatedness in deep neural networks.

Gradients

  • Yu et al. (2020). Gradient surgery for multi-task learning. NeurIPS. paper (see the PCGrad sketch after this list)
  • Dery et al. (2021). Auxiliary task update decomposition: The good, the bad and the neutral. ICLR. paper
  • Chen et al. (2021). Weighted training for cross-task learning. ICLR. paper
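
For intuition, here is a minimal NumPy sketch of the gradient-projection idea in Yu et al. (2020), PCGrad: whenever two task gradients conflict, each is projected onto the normal plane of the other. It assumes per-task gradients are already flattened into vectors and omits the random task ordering used in the paper.

```python
import numpy as np

def pcgrad(grads):
    """Project each task gradient onto the normal plane of every other
    task gradient it conflicts with (negative dot product), as in
    PCGrad (Yu et al., 2020). `grads`: list of flat gradient vectors."""
    projected = [g.astype(float).copy() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i != j and g_i @ g_j < 0:  # conflicting gradients
                g_i -= (g_i @ g_j) / (g_j @ g_j) * g_j
    return projected

# Toy usage: two conflicting 2-D gradients.
g1, g2 = np.array([1.0, 0.0]), np.array([-1.0, 1.0])
update = sum(pcgrad([g1, g2]))  # combined update after projection
```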

Predicted probabilities between tasks

  • Nguyen et al. (2020). LEEP: A new measure to evaluate transferability of learned representations. ICML. paper (a sketch of the score follows this list)

  • Identifying beneficial task relations for multi-task learning in deep neural networks. EACL 2017. paper
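
As a concrete reference, below is a small NumPy sketch of the LEEP score from Nguyen et al. (2020), assuming you already have the pretrained source model's predicted probabilities on the target dataset; the array names are illustrative.

```python
import numpy as np

def leep(source_probs, target_labels, num_target_classes):
    """LEEP (Nguyen et al., 2020): average log-likelihood of the
    'expected empirical predictor' mapping source-label probabilities
    to target labels. `source_probs`: (n, |Z|) predictions of the
    pretrained source model; `target_labels`: (n,) integer labels."""
    n = source_probs.shape[0]
    # Empirical joint distribution P(y, z) over target and source labels.
    joint = np.zeros((num_target_classes, source_probs.shape[1]))
    for probs, y in zip(source_probs, target_labels):
        joint[y] += probs / n
    # Conditional P(y | z), guarding against never-predicted source labels.
    cond = joint / np.maximum(joint.sum(axis=0, keepdims=True), 1e-12)
    # Score each example by sum_z P(y_i | z) * theta_z(x_i).
    scores = (source_probs * cond[target_labels]).sum(axis=1)
    return np.log(scores).mean()
```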

Task affinity

  • Standley et al. (2020). Which tasks should be learned together in multi-task learning? ICML. paper
  • Fifty et al. (2021). Efficiently identifying task groupings for multi-task learning. NeurIPS. paper
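
A rough PyTorch sketch of the inter-task affinity measure used by Fifty et al. (2021): take a lookahead gradient step on the shared parameters with task i's loss only, and record the relative change it causes in task j's loss. The function names and the plain-SGD lookahead are simplifying assumptions.

```python
import torch

def inter_task_affinity(shared_params, loss_i, loss_j_fn, lr=1e-2):
    """Z_{i->j} = 1 - L_j(theta') / L_j(theta), where theta' is theta
    after one SGD lookahead step on task i's loss. `loss_j_fn`
    re-runs the forward pass that computes task j's loss."""
    loss_j_before = loss_j_fn().item()
    grads = torch.autograd.grad(loss_i, shared_params)
    with torch.no_grad():
        for p, g in zip(shared_params, grads):
            p -= lr * g                      # lookahead step on task i
        loss_j_after = loss_j_fn().item()
        for p, g in zip(shared_params, grads):
            p += lr * g                      # restore shared parameters
    return 1.0 - loss_j_after / loss_j_before
```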

Multitask Learning Architectures

Mixture-of-Experts

  • Ma et al. (2018). Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In KDD. paper
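
Since MMoE is an architectural pattern, a compact PyTorch sketch may be clearer than prose: all tasks share a pool of experts, but each task has its own softmax gate and tower. The linear experts and one-layer towers are placeholder choices.

```python
import torch
import torch.nn as nn

class MMoE(nn.Module):
    """Multi-gate mixture-of-experts: each task mixes shared experts
    with its own gate and feeds the mixture to its own tower."""
    def __init__(self, in_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(in_dim, expert_dim) for _ in range(num_experts))
        self.gates = nn.ModuleList(
            nn.Linear(in_dim, num_experts) for _ in range(num_tasks))
        self.towers = nn.ModuleList(
            nn.Linear(expert_dim, 1) for _ in range(num_tasks))

    def forward(self, x):
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, D)
        outs = []
        for gate, tower in zip(self.gates, self.towers):
            w = torch.softmax(gate(x), dim=-1).unsqueeze(-1)           # (B, E, 1)
            outs.append(tower((w * expert_out).sum(dim=1)))            # (B, 1)
        return outs
```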

Branching

  • Guo et al. (2020). Learning to branch for multi-task learning. In ICML. paper
  • Huang et al. (2018). GNAS: A greedy neural architecture search method for multi-attribute learning. In ACM MM.
  • Ruder et al. (2019). Latent multi-task architecture learning. In AAAI.

Soft-parameter sharing

  • Liu et al. (2019). End-to-end multi-task learning with attention. In CVPR. paper

  • Cross-stitch Networks for Multi-task Learning. CVPR 2016. paper (see the sketch after this list)

  • Gated multi-task network for text classification. NAACL 2018. paper

  • A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. EMNLP 2017. paper

  • Latent Multi-task Architecture Learning. AAAI 2019. paper

  • Learning Multiple Tasks with Multilinear Relationship Networks. NIPS 2017. paper
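
To make the cross-stitch entry above concrete, here is a minimal two-task cross-stitch unit in PyTorch; the near-identity initialization and the two-task case are illustrative choices. A unit like this is inserted between corresponding layers of the two task networks, so training learns how much each task borrows from the other.

```python
import torch
import torch.nn as nn

class CrossStitch(nn.Module):
    """Learns a 2x2 mixing matrix that linearly combines the two task
    networks' activations at a given layer."""
    def __init__(self):
        super().__init__()
        # Near-identity init: each task starts mostly with its own activations.
        self.alpha = nn.Parameter(torch.tensor([[0.9, 0.1],
                                                [0.1, 0.9]]))

    def forward(self, x_a, x_b):
        out_a = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        out_b = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return out_a, out_b
```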

Optimization Methods for Multi-Task Learning

  • Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. CVPR 2018. paper
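
For reference, a compact PyTorch sketch of this uncertainty-based weighting in its commonly implemented form, where each task learns a log-variance s_t and the total loss is sum_t exp(-s_t) * L_t + s_t; the exact constants and the regression/classification variants in the paper are elided.

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learns one log-variance s_t per task: noisier tasks are
    down-weighted by exp(-s_t), while the +s_t term keeps the
    weights from collapsing to zero."""
    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        return sum(torch.exp(-s) * loss + s
                   for s, loss in zip(self.log_vars, task_losses))
```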

Benchmarks

  • GLUE: Natural Language Understanding
  • decaNLP: 10 NLP Tasks

Software and Open-Source Libraries

  • LibMTL: an open-source library built on PyTorch for multitask learning.

Meta Learning

Survey

Meta-Learning in Neural Networks: A Survey. paper

Black-Box Approaches

Recurrent Neural Network

(MANN) Meta-learning with memory-augmented neural networks. ICML 2016. paper

Attention-Based Network

Matching Networks for One-Shot Learning. NIPS 2016. paper

(SNAIL) A Simple Neural Attentive Meta-Learner. ICLR 2018. paper

Optimization-Based Methods

(MAML) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. paper
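
Because many entries below modify this algorithm, a minimal second-order MAML sketch may help; the explicit parameter list, the tiny regression network, and the hyperparameters are illustrative assumptions, not taken from any listed paper.

```python
import torch

def forward(params, x):
    # A tiny two-layer regression net with explicit parameters.
    w1, b1, w2, b2 = params
    return torch.tanh(x @ w1 + b1) @ w2 + b2

def maml_outer_step(params, tasks, inner_lr=0.01, outer_lr=0.001):
    """One MAML meta-update: adapt on each task's support set in the
    inner loop, evaluate the adapted parameters on the query set, and
    differentiate through the adaptation step (second-order)."""
    meta_loss = 0.0
    for xs, ys, xq, yq in tasks:
        inner_loss = ((forward(params, xs) - ys) ** 2).mean()
        grads = torch.autograd.grad(inner_loss, params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(params, grads)]
        meta_loss = meta_loss + ((forward(adapted, xq) - yq) ** 2).mean()
    meta_grads = torch.autograd.grad(meta_loss, params)
    with torch.no_grad():
        for p, g in zip(params, meta_grads):
            p -= outer_lr * g  # outer (meta) gradient step

# Toy usage on one sine-regression task.
params = [torch.randn(1, 32) * 0.1, torch.zeros(32),
          torch.randn(32, 1) * 0.1, torch.zeros(1)]
params = [p.requires_grad_() for p in params]
xs = torch.linspace(-1, 1, 10).unsqueeze(1)
ys = torch.sin(3 * xs)
maml_outer_step(params, [(xs[:5], ys[:5], xs[5:], ys[5:])])
```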

(Reptile; First-order method) On First-Order Meta-Learning Algorithms. arXiv 2018. paper

Other Forms of Prior on MAML

(Implicit MAML) Meta-Learning with Implicit Gradients. NeurIPS 2019. paper

(Implicit Differentiation; SVM) Meta-Learning with Differentiable Convex Optimization. CVPR 2019. paper

(Bayesian linear regression) Meta-Learning Priors for Efficient Online Bayesian Regression. Workshop on the Algorithmic Foundations of Robotics 2018. paper

(Ridge regression; Logistic regression) Meta-learning with Differentiable Closed-Form Solvers. ICLR 2019. paper

Understanding MAML

(MAML expressive power and universality) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper

(Map MAML to Bayes Framework) Recasting Gradient-Based Meta-Learning as Hierarchical Bayes. ICLR 2018. paper

Tricks to Optimize MAML

Choose architecture that is effective for inner gradient-step

Auto-Meta: Automated Gradient Based Meta Learner Search. NIPS 2018 Workshop on Meta-Learning. paper

Automatically learn inner vector learning rate, tune outer learning rate

Alpha MAML: Adaptive Model-Agnostic Meta-Learning. ICML 2019 Workshop on Automated Machine Learning. paper

Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv 2017. paper

Optimize only a subset of the parameters in the inner loop

(DEML) Deep Meta-Learning: Learning to Learn in the Concept Space. arXiv 2018. paper

(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper

Decouple inner learning rate, BN statistics per-step

(MAML++) How to train your MAML. ICLR 2019. paper

Introduce context variables for increased expressive power

(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper

(Bias transformation) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper

Non-Parametric Methods via Metric Learning

Siamese Neural Networks for One-shot Image Recognition. ICML 2015. paper

Matching Networks for One Shot Learning. NIPS 2016. paper

Prototypical Networks for Few-shot Learning. NIPS 2017. paper
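
The classifier in prototypical networks is only a few lines; below is a sketch with random stand-in embeddings (in practice an encoder network produces them).

```python
import torch

def prototypical_logits(support_emb, support_labels, query_emb, num_classes):
    """Class prototypes are the mean support embeddings per class; query
    logits are negative squared Euclidean distances to the prototypes."""
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0)
        for c in range(num_classes)])                 # (C, D)
    return -torch.cdist(query_emb, prototypes) ** 2   # (Q, C) logits

# Toy usage with random stand-in embeddings (5-way, 2-shot).
support = torch.randn(10, 16)
labels = torch.arange(10) % 5
queries = torch.randn(3, 16)
predictions = prototypical_logits(support, labels, queries, 5).argmax(dim=1)
```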

Learn non-linear relation module on embeddings

Learning to Compare: Relation Network for Few-Shot Learning. CVPR 2018. paper

Learn infinite mixture of prototypes

Infinite Mixture Prototypes for Few-Shot Learning. ICML 2019. paper

Perform message passing on embeddings

Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper

Bayesian Meta-Learning & Generative Models

Amortized Inference

Amortized Bayesian Meta-Learning. ICLR 2019. paper

Ensemble Method

Bayesian Model-Agnostic Meta-Learning. NIPS 2018. paper

Sampling & Hybrid Inference

Probabilistic Model-Agnostic Meta-Learning. NIPS 2018. paper

Meta-Learning Probabilistic Inference for Prediction. ICLR 2019. paper

Hybrid meta-learning approaches

Meta-Learning with Latent Embedding Optimization. ICLR 2019. paper

Fast Context Adaptation via Meta-Learning. ICML 2019. paper

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020. paper

Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper

(CAML) Learning to Learn with Conditional Class Dependencies. ICLR 2019. paper

Meta Reinforcement Learning

Policy Gradient RL

MAML and black-box meta-learning approaches can be applied directly to policy-gradient RL methods.

Value-Based RL

Existing meta-learning approaches are harder to apply to value-based RL, because value-based methods rely on dynamic programming rather than direct gradient-based policy optimization.

Meta-Q-Learning. ICLR 2020. paper

(Goal-conditioned RL with hindsight relabeling; multi-task RL) Hindsight Experience Replay. NIPS 2017. paper
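
A schematic Python sketch of hindsight relabeling with the "future" strategy; the transition fields and `reward_fn` (e.g. 0 when the achieved goal matches the desired goal, else -1) are illustrative assumptions.

```python
import random

def her_relabel(episode, reward_fn, k=4):
    """Besides each original transition, store copies whose desired goal
    is a goal actually achieved later in the same episode, with the
    reward recomputed for the substituted goal."""
    out = []
    for t, tr in enumerate(episode):
        out.append({**tr, "reward": reward_fn(tr["achieved_goal"],
                                              tr["desired_goal"])})
        future = episode[t:]
        for fut in random.sample(future, min(k, len(future))):
            goal = fut["achieved_goal"]
            out.append({**tr, "desired_goal": goal,
                        "reward": reward_fn(tr["achieved_goal"], goal)})
    return out
```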

(learning from unstructured play data) Learning Latent Plans from Play. CoRL 2019. paper

(learn a better goal representation)

Universal Planning Networks. ICML 2018. paper

Unsupervised Visuomotor Control through Distributional Planning Networks. RSS 2019. paper

Applications

Meta-Learning for Low-Resource Neural Machine Translation. EMNLP 2018. paper

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions. ICLR 2018. paper

One-Shot Imitation Learning. NIPS 2017. paper

Massively Multitask Networks for Drug Discovery. ICML 2015. paper
