Summary of Winning Amazon KDD Cup’24, by Chris Deotte et al.
Winning Amazon KDD Cup’24
by Chris Deotte, Ivan Sorokin, Ahmet Erdem, Benedikt Schifferer, Gilberto Titericz Jr, Simon Jegou
First submitted to arXiv on: 5 Aug 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The winning solution for the Amazon KDD Cup 2024 Multi Task Online Shopping Challenge for Large Language Models (LLMs) uses a single model per track: Qwen2-72B-Instruct fine-tuned on the authors' own training dataset. The competition featured 57 diverse tasks across five task types and four tracks, including multi-lingual ones. Because only 96 example questions were available, the authors built their training dataset through data augmentation and synthetic data generation with LLMs. They also applied wise-ft to account for distribution shift, ensembled multiple LoRA adapters in one model, and used Logits Processors to constrain model output to the relevant tokens. At inference time, AWQ 4-bit quantization and vLLM let them predict the test dataset within the time limits, which ranged from 20 to 140 minutes depending on the track. The solution achieved first place in each individual track and overall. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary The Amazon KDD Cup 2024 challenge is all about building a helpful assistant that can answer questions about online shopping. A team of researchers created a special kind of computer program, or model, that could do this job. They took an existing language model and made it better by training it on their own data. To make sure the model worked well in different situations, they used some clever techniques like combining multiple smaller models together. When it was time to test the model, they had to make sure it could answer questions quickly enough. And guess what? Their model did amazingly well, beating all the other teams and winning the competition! |
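
Two of the techniques named in the medium summary, wise-ft and Logits Processors, can be illustrated with a toy sketch. This is not the authors' code: it assumes wise-ft means linearly interpolating between base and fine-tuned weights with a mixing factor `alpha`, and that a logits processor masks every token outside an allowed set. The function names, the parameter name `layer.weight`, and all numbers are hypothetical, and plain Python lists stand in for real model tensors.

```python
import math


def wise_ft_interpolate(base_weights, tuned_weights, alpha):
    """Blend base and fine-tuned weights: (1 - alpha) * base + alpha * tuned.

    Assumption: both arguments map parameter names to flat lists of floats.
    """
    if base_weights.keys() != tuned_weights.keys():
        raise ValueError("both models must have the same parameter names")
    return {
        name: [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_weights[name], tuned_weights[name])
        ]
        for name in base_weights
    }


def constrain_logits(logits, allowed_token_ids):
    """Set the score of every token outside the allowed set to -inf."""
    allowed = set(allowed_token_ids)
    return [
        score if token_id in allowed else -math.inf
        for token_id, score in enumerate(logits)
    ]


# Toy weights: alpha=0.5 lands halfway between the two models.
base = {"layer.weight": [0.0, 1.0]}
tuned = {"layer.weight": [1.0, 3.0]}
merged = wise_ft_interpolate(base, tuned, alpha=0.5)

# Toy vocabulary of four tokens; only tokens 1 and 3 are valid answers.
logits = [2.0, 0.5, 3.0, 1.0]
masked = constrain_logits(logits, allowed_token_ids=[1, 3])
best_token = max(range(len(masked)), key=masked.__getitem__)
```

In the toy run, token 2 has the highest raw score, but masking forces the choice onto an allowed token. That is the general idea behind restricting a multiple-choice answer to valid option tokens.
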
Keywords
» Artificial intelligence » Data augmentation » Inference » Language model » Logits » LoRA » Multi task » Quantization » Synthetic data