FAIR: Filtering of Automatically Induced Rules
by Divya Jyoti Bajpai, Ayush Maheshwari, Manjesh Kumar Hanawal, Ganesh Ramakrishnan
First submitted to arXiv on: 23 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract. Read it on arXiv. |
| Medium | GrooveSquid.com (original content) | This paper addresses the challenge of training machine learning models across diverse domains by exploring Weak Supervision, which speeds up the creation of labeled data using domain-specific rules. Writing high-quality rules by hand, however, is time-consuming and labor-intensive. Automatic Rule Induction (ARI) approaches alleviate this by generating candidate rules from the features of a small labeled set and then filtering the resulting rule set. The key challenge in ARI is selecting a high-quality subset from the large set of automatically generated rules. To tackle this, the authors propose an algorithm that maximizes submodular objective functions to filter rules based on their collective precision, coverage, and conflicts (a greedy selection of this kind is sketched after the table). Experiments with three ARI approaches on five text classification datasets show that the algorithm outperforms semi-supervised label aggregation approaches. |
| Low | GrooveSquid.com (original content) | This paper tackles a big problem in machine learning: getting enough labeled data to train our algorithms correctly. One way to get more labeled data is to use rules that tell us which examples belong to which categories. But it's hard for humans to write those rules, so scientists developed ways to make the computer generate them instead. The tricky part is then figuring out which of these automatically generated rules are actually useful and accurate. The authors created an algorithm that finds the good rules by looking at how well they work together. They tested it on five different datasets and showed that it beats other methods. |
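To make the medium-difficulty description concrete, here is a minimal sketch of greedy rule filtering under a submodular-style objective. Everything in it is illustrative: the function name `greedy_rule_filter`, the precision-weighted coverage term, and the conflict penalty weight `lam` are assumptions standing in for the paper's actual objectives, which this summary does not reproduce.

```python
import numpy as np

def greedy_rule_filter(fires, preds, labels, k, lam=0.5):
    """Illustrative sketch (not the paper's exact objective): greedily
    pick up to k rules, trading precision-weighted coverage of a small
    labeled set against conflicts with already-selected rules.

    fires:  (R, N) bool -- rule r fires on labeled example i
    preds:  (R, N) int  -- label rule r assigns where it fires
    labels: (N,)   int  -- gold labels of the labeled set
    """
    R, N = fires.shape
    # Empirical precision of each rule on the labeled set.
    hits = (preds == labels) & fires
    precision = hits.sum(axis=1) / np.maximum(fires.sum(axis=1), 1)

    selected = []
    covered = np.zeros(N, dtype=bool)
    for _ in range(k):
        best_rule, best_gain = None, 0.0
        for r in range(R):
            if r in selected:
                continue
            # Marginal coverage gain: only newly covered examples count,
            # so returns diminish as coverage grows (submodularity).
            gain = precision[r] * (fires[r] & ~covered).sum()
            # Conflict penalty: examples where r and a selected rule
            # both fire but assign different labels.
            for s in selected:
                both = fires[r] & fires[s]
                gain -= lam * (both & (preds[r] != preds[s])).sum()
            if gain > best_gain:
                best_rule, best_gain = r, gain
        if best_rule is None:  # no remaining rule adds positive value
            break
        selected.append(best_rule)
        covered |= fires[best_rule]
    return selected
```

A greedy pass like this is the standard way to approximately maximize a monotone submodular objective under a cardinality constraint; note that adding the conflict penalty can break monotonicity, which is one reason the paper's exact formulation matters.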
Keywords
- Artificial intelligence
- Machine learning
- Precision
- Semi-supervised
- Text classification