Summary of Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model’s Prediction Rationality, by Qitong Wang et al.
Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model’s Prediction Rationality
by Qitong Wang, Tang Li, Kien X. Nguyen, Xi Peng
First submitted to arXiv on: 17 Dec 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | This paper investigates the impact of fine-tuning Vision-Language Models (VLMs) on prediction rationality in safety-critical domains. The authors propose two new metrics, Prediction Trustworthiness and Inference Reliability, to measure whether predictions are grounded in valid evidence (an illustrative sketch follows this table). Experimenting across a variety of settings, they find that widely adopted fine-tuning methods often produce correct predictions based on invalid evidence, potentially undermining trustworthiness. However, when VLMs are trained to identify valid evidence, they become more likely to make correct predictions. These results hold under distribution shift and across different experimental settings, offering fresh insights into the fine-tuning of VLMs.
Low | GrooveSquid.com (original content) | This study looks at how computer models called Vision-Language Models (VLMs) behave after we fine-tune them. The authors want to know whether this common step makes the models more or less trustworthy. They created new ways to measure trustworthiness and tested them on different datasets. Surprisingly, they found that the usual way of fine-tuning can make models give correct answers for the wrong reasons, relying on bad evidence. That could be a problem in safety-critical situations! However, if we train the models to find good evidence, they become more reliable. The study shows that how we fine-tune these models matters.
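The summaries above only name the two metrics without defining them. Purely as an illustration, and not the authors’ actual formulation, the sketch below shows one plausible way such rationality metrics could be operationalized: a prediction counts as “grounded” when the model’s saliency mask sufficiently overlaps a human-annotated evidence mask, and the two scores are then conditional fractions over correct and grounded predictions. All function names, thresholds, and mask formats here are assumptions.

```python
# Illustrative sketch only: the paper proposes "Prediction Trustworthiness" and
# "Inference Reliability", but this summary does not give their exact definitions.
# The code below is a hypothetical stand-in, assuming each sample has a binary
# correctness flag, a model saliency mask, and a human-annotated evidence mask.
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union of two binary masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union > 0 else 0.0

def rationality_metrics(correct, saliency_masks, evidence_masks, iou_thresh=0.5):
    """Hypothetical metrics:
    - trustworthiness: among correct predictions, the fraction grounded in valid evidence
    - reliability: among predictions grounded in valid evidence, the fraction that are correct
    """
    grounded = np.array([
        iou(s, e) >= iou_thresh for s, e in zip(saliency_masks, evidence_masks)
    ])
    correct = np.asarray(correct, dtype=bool)
    both = (correct & grounded).sum()
    trustworthiness = both / max(correct.sum(), 1)
    reliability = both / max(grounded.sum(), 1)
    return trustworthiness, reliability

# Toy usage with random masks, purely to show the interface.
rng = np.random.default_rng(0)
correct = rng.random(100) > 0.3
saliency = [rng.random((8, 8)) > 0.5 for _ in range(100)]
evidence = [rng.random((8, 8)) > 0.5 for _ in range(100)]
print(rationality_metrics(correct, saliency, evidence))
```

The paper may well use different evidence representations or thresholds; this toy version only conveys the intuition that one score conditions on correctness (were the right answers reached for valid reasons?) while the other conditions on valid evidence (did valid reasoning lead to the right answer?).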
Keywords
» Artificial intelligence » Fine tuning » Inference