Summary of LLM-Select: Feature Selection with Large Language Models, by Daniel P. Jeong et al.


LLM-Select: Feature Selection with Large Language Models

by Daniel P. Jeong, Zachary C. Lipton, Pradeep Ravikumar

First submitted to arXiv on: 2 Jul 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract on arXiv.
Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper demonstrates an unexpected capability of large language models (LLMs): given only input feature names and a task description, they can select the most predictive features with performance rivaling standard data science tools. The models exhibit this capacity across various query mechanisms, including zero-shot prompting for numerical importance scores. The latest models, such as GPT-4, consistently identify the most predictive features regardless of the query mechanism and prompting strategy. Extensive experiments on real-world data show that LLM-based feature selection achieves strong performance competitive with data-driven methods like LASSO, despite never having seen the downstream training data. This suggests that LLMs may be useful not only for selecting features for training but also for deciding which features to collect in the first place.
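The "zero-shot prompting for numerical importance scores" approach described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `query_llm` is a hypothetical stand-in for any chat-completion API (e.g. a GPT-4 client) and is stubbed here so the example runs; the prompt wording and the 0-to-1 scale are assumptions for illustration.

```python
# Sketch of LLM-based feature selection via zero-shot prompting:
# ask the model for a numerical importance score per feature, using
# only the feature name and a task description (no training data),
# then keep the top-scoring features.

def query_llm(prompt: str) -> str:
    # Hypothetical stand-in for an LLM API call; a real version would
    # send `prompt` to a model such as GPT-4 and return its reply.
    return "0.5"  # stub so the example is runnable

def llm_feature_importance(feature: str, task: str) -> float:
    """Zero-shot prompt for a numerical importance score in [0, 1]."""
    prompt = (
        f"Task: {task}\n"
        f"Feature: {feature}\n"
        "On a scale from 0 to 1, how important is this feature for "
        "the task? Answer with a single number."
    )
    return float(query_llm(prompt).strip())

def select_top_k(features: list[str], task: str, k: int) -> list[str]:
    """Score every candidate feature and keep the k highest-scoring ones."""
    scores = {f: llm_feature_importance(f, task) for f in features}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Example: pick 2 of 3 candidate features for a health-prediction task.
candidates = ["age", "blood_pressure", "favorite_color"]
selected = select_top_k(candidates, "predict risk of heart disease", k=2)
```

Because the model sees only feature names and the task description, the same scoring step could in principle be run before any data is collected, which is the basis for the paper's point about deciding which features to gather in the first place.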
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper shows that big language models can do something really cool: they can figure out which pieces of information are most important without seeing any actual data. They just need to know what kind of question is being asked and what the answer might look like. The best models, like GPT-4, can even tell us how important each piece of information is. This could be really helpful for people who collect data in fields like healthcare, where it’s expensive and time-consuming.

Keywords

* Artificial intelligence  * Feature selection  * GPT  * Prompting  * Zero-shot