Summary of Contrastive Learning to Improve Retrieval for Real-world Fact Checking, by Aniruddh Sriram et al.
Contrastive Learning to Improve Retrieval for Real-world Fact Checking
by Aniruddh Sriram, Fangyuan Xu, Eunsol Choi, Greg Durrett
First submitted to arXiv on: 7 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper presents Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims. By leveraging the AVeriTeC dataset, which annotates subquestions for claims with human-written answers from evidence documents, the authors fine-tune Contriever with a contrastive objective based on multiple training signals, including distillation from GPT-4, evaluating subquestion answers, and gold labels in the dataset. The model is evaluated on both retrieval and end-to-end veracity judgments about claims, achieving a 6% improvement in veracity classification accuracy on the AVeriTeC dataset. Additionally, the authors demonstrate that their gains can be transferred to other datasets, including FEVER, ClaimDecomp, HotpotQA, and a synthetic dataset. |
| Low | GrooveSquid.com (original content) | The paper is about finding better information online to help fact-checking machines get more accurate answers. The problem is that traditional methods might not give you all the relevant information you need. For example, if you're trying to figure out what's in a vaccine, you might need to look at documents that aren't directly about the vaccine, but are related to how it was developed. The authors created a new way to retrieve this kind of information, called Contrastive Fact-Checking Reranker (CFR), which uses a big dataset and different training methods to improve its accuracy. They tested their model on several datasets and found that it worked better than other models in some cases. |
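To give a feel for what "fine-tuning a retriever with a contrastive objective" means in practice, here is a minimal sketch of an InfoNCE-style contrastive loss, the standard objective family used to train retrievers like Contriever. This is an illustrative toy, not the paper's actual implementation: the function name, the temperature value, and the toy embeddings are all assumptions for demonstration. The idea is that the embedding of a claim (or subquestion) is pulled toward relevant evidence and pushed away from irrelevant evidence.

```python
import numpy as np

def info_nce_loss(query, positive, negatives, temperature=0.07):
    """Illustrative InfoNCE contrastive loss for one query.

    Pulls the positive (relevant evidence) embedding toward the
    query embedding and pushes the negatives away. Not the paper's
    implementation; a generic sketch of the objective family.
    """
    def cos(a, b):
        # Cosine similarity between two vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Similarity of the query to the positive (index 0) and each negative.
    sims = np.array([cos(query, positive)] + [cos(query, n) for n in negatives])
    logits = sims / temperature
    logits -= logits.max()  # numerical stability before softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    # Cross-entropy with the positive as the correct "class".
    return -np.log(probs[0])

# Toy example: the positive points the same way as the query,
# the negatives point elsewhere, so the loss should be small.
q = np.array([1.0, 0.0])
pos = np.array([0.9, 0.1])
negs = [np.array([0.0, 1.0]), np.array([-1.0, 0.2])]
loss = info_nce_loss(q, pos, negs)
```

In the paper's setting, the training signal deciding which evidence counts as "positive" comes from several sources (GPT-4 distillation, subquestion answerability, and gold labels); the loss shape above stays the same regardless of where that supervision comes from.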
Keywords
» Artificial intelligence » Classification » Distillation » GPT