
Summary of Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service, by Mirza Alim Mutasodirin et al.


Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service

by Mirza Alim Mutasodirin, Radityo Eko Prasojo, Achmad F. Abka, Hanif Rasyidi

First submitted to arXiv on: 19 Mar 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This study investigates hyperparameter optimization (HPO) for long-text classification with Transformers, focusing on resource-constrained environments. Many researchers rely on free services such as Google Colab, where the quadratic complexity of self-attention over long inputs quickly exhausts the available time and memory. The paper proposes an efficient, dynamic HPO procedure that can be run gradually on limited resources, without requiring a long-running optimization library. It also compares various hacks for shortening and enriching sequences, such as removing stopwords, punctuation, low-frequency words, and recurring words. The best hack found is removing stopwords while keeping punctuation and low-frequency words. Some setups with smaller token lengths even outperform larger ones, because the shortened text represents similar information at a lower computational cost. The study aims to help developers optimize model performance efficiently on limited resources.
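
To make the winning hack concrete, here is a minimal sketch of stopword removal that keeps punctuation and low-frequency words. The whitespace tokenization and the NLTK English stopword list are assumptions for illustration, not the authors' exact pipeline.

```python
# Minimal sketch of the "remove stopwords, keep punctuation and
# low-frequency words" hack. Whitespace tokenization and the NLTK
# stopword list are assumptions, not the paper's exact pipeline.
import string

import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOPWORDS = set(stopwords.words("english"))

def shorten(text: str) -> str:
    """Drop stopwords so more informative tokens fit within the model's
    maximum sequence length; punctuation and rare words are kept."""
    kept = []
    for token in text.split():
        # Compare on the lowercased word with surrounding punctuation
        # stripped, so "The" is recognized as the stopword "the".
        core = token.strip(string.punctuation).lower()
        if core and core in STOPWORDS:
            continue
        kept.append(token)
    return " ".join(kept)

print(shorten("The model, however, fails on very long documents."))
# -> "model, however, fails long documents."
```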
Low Difficulty Summary (written by GrooveSquid.com, original content)
A team of researchers looked at how to improve long-text classification using special computer models called Transformers. They found that most earlier studies use small amounts of data and don't explain how their experiments were done. So the team used a big dataset of news articles and tried different ways of shortening and enriching the text. They also came up with a new way to find the best settings for the model without needing powerful computers. This helped them figure out which settings worked best, even with less computing power. Their findings can help developers make their models work better on limited resources.
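
The gradual HPO procedure described above can be pictured as a checkpointed search loop that saves each finished trial, so an expiring GPU session never loses progress. The sketch below is a hypothetical illustration of that idea; the grid values, the results file name, and the train_and_evaluate() stub are placeholders, not the paper's implementation.

```python
# Hypothetical sketch of a gradual HPO loop that can be stopped and
# resumed on a time-limited service such as Google Colab. Grid values,
# results file, and train_and_evaluate() are placeholders.
import itertools
import json
import os

RESULTS_FILE = "hpo_results.json"  # keep on persistent storage across sessions

def train_and_evaluate(config: dict) -> float:
    """Placeholder: train the classifier with `config` and return a
    validation score. Replace with the real training routine."""
    return 0.0  # dummy value so the sketch runs end-to-end

grid = {
    "learning_rate": [2e-5, 3e-5, 5e-5],
    "max_length": [128, 256, 512],
}

# Load whatever was finished in earlier sessions.
results = {}
if os.path.exists(RESULTS_FILE):
    with open(RESULTS_FILE) as f:
        results = json.load(f)

# Evaluate one configuration at a time, saving after each, so the
# search picks up where it left off when the GPU session expires.
for values in itertools.product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    key = json.dumps(config, sort_keys=True)
    if key in results:
        continue  # already done in a previous session
    results[key] = train_and_evaluate(config)
    with open(RESULTS_FILE, "w") as f:
        json.dump(results, f, indent=2)

best = max(results, key=results.get)
print("best config:", best, "score:", results[best])
```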

Keywords

  • Artificial intelligence
  • Hyperparameter
  • Optimization
  • Text classification
  • Token