Summary of LegalLens Shared Task 2024: Legal Violation Identification in Unstructured Text, by Ben Hagag et al.
LegalLens Shared Task 2024: Legal Violation Identification in Unstructured Text
by Ben Hagag, Liav Harpaz, Gil Semo, Dor Bernsohn, Rohit Saha, Pashootan Vaezipoor, Kyryl Truskovskyi, Gerasimos Spanakis
First submitted to arXiv on: 15 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | The LegalLens Shared Task targets the detection of legal violations in text through two sub-tasks: identifying legal violation entities (LegalLens-NER) and associating violations with relevant legal contexts and affected individuals (LegalLens-NLI). The task attracted 38 teams, who worked with an enhanced dataset covering the labor, privacy, and consumer protection domains. Results show that the top-performing teams relied on fine-tuning general-purpose pre-trained language models, outperforming legal-domain-specific models and few-shot methods. Notably, the top team achieved a 7.11% improvement over the NER baseline, while the best NLI result was a more modest 5.7% gain. Despite these advances, the complexity of legal texts leaves room for further innovation.
Low | GrooveSquid.com (original content) | This paper explores how to detect legal violations in text. It looks at two tasks: finding what’s wrong in a piece of text, and linking those issues to the relevant laws and the people affected. Many teams took part, using a large dataset with examples from the labor, privacy, and consumer protection areas. The results show that the best teams fine-tuned general-purpose language models, doing better than legal-specific models and other methods. One team does especially well, improving by 7.11% in one task, while the best improvement in the other task is 5.7%. Even with this progress, understanding legal texts remains a challenge.
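To make the two sub-task formats concrete, here is a minimal, purely illustrative Python sketch. The keyword matcher and word-overlap score are hypothetical stand-ins for the fine-tuned pre-trained language models the winning teams actually used; the label names, example sentences, and threshold are assumptions, not taken from the paper.

```python
# Illustrative sketch of the two LegalLens sub-task formats.
# Hypothetical labels and examples; real systems fine-tune
# pre-trained language models rather than using rules like these.

# LegalLens-NER: mark spans in a sentence that describe a legal violation.
# A trivial keyword matcher stands in for a fine-tuned token classifier.
VIOLATION_KEYWORDS = {"unpaid overtime", "data breach", "false advertising"}

def toy_ner(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, label) spans for known violation phrases."""
    spans = []
    lowered = text.lower()
    for phrase in VIOLATION_KEYWORDS:
        idx = lowered.find(phrase)
        if idx != -1:
            spans.append((idx, idx + len(phrase), "VIOLATION"))
    return spans

# LegalLens-NLI: decide whether a violation description (premise) matches a
# candidate legal context (hypothesis). A word-overlap score stands in for a
# fine-tuned sentence-pair classifier.
def toy_nli(premise: str, hypothesis: str, threshold: float = 0.5) -> str:
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    overlap = len(p & h) / len(h) if h else 0.0
    return "entailment" if overlap >= threshold else "not_entailment"

spans = toy_ner("Employees reported unpaid overtime at the warehouse.")
verdict = toy_nli(
    "Workers were not paid for overtime hours.",
    "Workers were not paid overtime.",
)
print(spans, verdict)
```

The point of the sketch is only the input/output shape: NER yields character spans with an entity label, while NLI yields a pairwise decision linking a violation to a context and the individuals it affects.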
Keywords
» Artificial intelligence » Few-shot » Fine-tuning » NER