Summary of Analytical Uncertainty-Based Loss Weighting in Multi-Task Learning, by Lukas Kirchdorfer et al.
Analytical Uncertainty-Based Loss Weighting in Multi-Task Learning
by Lukas Kirchdorfer, Cathrin Elich, Simon Kutsche, Heiner Stuckenschmidt, Lukas Schott, Jan M. Köhler
First submitted to arXiv on: 15 Aug 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper proposes a novel task-weighting method for multi-task learning (MTL) neural networks that balances individual task losses during training. The approach builds on uncertainty weighting and computes analytically optimal uncertainty-based weights, normalized with a softmax function with a tunable temperature (a code sketch follows this table). The method yields results comparable to the combinatorially prohibitive brute-force search over scalarization weights while being much less computationally expensive. The authors conduct extensive benchmarks across datasets and architectures and find that their method consistently outperforms six other common weighting methods. They also report findings relevant to practical MTL applications, such as the influence of network size and weight-decay tuning on performance. |
Low | GrooveSquid.com (original content) | This paper is about a new way to help neural networks learn many tasks at once. Right now, these networks have trouble balancing how much attention to give each task during training. The researchers came up with a clever solution that works well without being too complicated or expensive. They tested their method on lots of different datasets and network architectures and found it did just as well as other methods that are much more time-consuming to use. This is important because it could help make these networks even better at doing many things at once. |
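The key idea from the medium summary can be sketched concretely. Below is a minimal PyTorch-style sketch, assuming the per-task weights come from a temperature-scaled softmax over the reciprocals of the current task losses (the analytical optimum of uncertainty weighting makes each weight roughly proportional to 1 / loss); the function name, the clamping constant, and the exact scoring formula are illustrative assumptions rather than the paper's precise method.

```python
import torch

def softmax_loss_weights(task_losses, temperature=1.0):
    """Temperature-scaled softmax weights over reciprocal task losses.

    Illustrative sketch only: each task's score is the reciprocal of its
    current loss, and a softmax with tunable temperature turns the scores
    into normalized weights. The paper's exact formulation may differ.
    """
    # Detach so the weights act as constants and no gradient flows
    # through the weighting itself.
    losses = torch.stack([l.detach() for l in task_losses])
    # Hypothetical scoring: reciprocal losses scaled by the temperature;
    # the small clamp avoids division by zero.
    scores = 1.0 / (temperature * losses.clamp_min(1e-8))
    return torch.softmax(scores, dim=0)

# Hypothetical usage inside a training step with three task losses:
# weights = softmax_loss_weights([loss_a, loss_b, loss_c], temperature=2.0)
# total_loss = sum(w * l for w, l in zip(weights, [loss_a, loss_b, loss_c]))
# total_loss.backward()
```

In this sketch, a low temperature concentrates weight on the task with the smallest current loss, while a high temperature flattens the weights toward uniform, which is what makes the temperature a tunable knob.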
Keywords
» Artificial intelligence » Attention » Multi-task » Softmax » Temperature