Summary of Swish-T: Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance, by Youngmin Seo et al.
Swish-T: Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance
by Youngmin Seo, Jinha Kim, Unsang Park
First submitted to arXiv on: 1 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary | 
|---|---|---|
| High | Paper authors | High Difficulty Summary The original abstract is available on the paper's arXiv page. | 
| Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper proposes the Swish-T family, an enhancement of the non-monotonic activation function Swish. The modification adds a Tanh bias to the original Swish function, yielding variants that each suit different tasks depending on the application context. The Tanh bias allows broader acceptance of negative values during the initial stages of training and gives a smoother non-monotonic curve than the original Swish. The family comprises Swish-T, Swish-T_B, and Swish-T_C, the last of which is the function the authors ultimately propose. An ablation study shows that Swish-T_C still achieves high performance when used as a non-parametric function. The authors demonstrate the family's advantage empirically across various models and benchmark datasets, including MNIST, Fashion MNIST, SVHN, CIFAR-10, and CIFAR-100. | 
| Low | GrooveSquid.com (original content) | Low Difficulty Summary The paper proposes a new way to improve a type of math function called an activation function. This function helps artificial intelligence learn from data. The authors add a special trick to the original function to make it better for different tasks. They test this new function on many datasets and show that it works well. You can find the code they used online. | 
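
To make the "Tanh bias" idea in the medium-difficulty summary concrete, here is a minimal sketch in plain Python. It assumes the base variant has the form Swish plus a scaled tanh term, i.e. `x * sigmoid(beta * x) + alpha * tanh(x)`. The function names, the parameter names `beta` and `alpha`, and the default `alpha=0.1` are illustrative assumptions and are not taken from the summary above; the Swish-T_B and Swish-T_C variants are not shown.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Original Swish activation: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def swish_t(x: float, beta: float = 1.0, alpha: float = 0.1) -> float:
    """Sketch of Swish with an added tanh bias (assumed form, see lead-in).

    alpha scales the tanh term; alpha = 0 recovers plain Swish.
    The default alpha = 0.1 is a hypothetical choice for illustration.
    """
    return swish(x, beta) + alpha * math.tanh(x)


if __name__ == "__main__":
    # Compare the two functions at a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  swish={swish(x):+.4f}  swish_t={swish_t(x):+.4f}")
```

Under these assumptions, the tanh term shifts outputs downward for negative inputs and upward for positive inputs by at most `alpha`, which is one way to read the summary's remark about broader acceptance of negative values.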
Keywords
* Artificial intelligence
* Tanh




