Summary of Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark, by Joseph Heyward et al.
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark
by Joseph Heyward, João Carreira, Dima Damen, Andrew Zisserman, Viorica Pătrăucean
First submitted to arXiv on: 29 Nov 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Computation and Language (cs.CL); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The Second Perception Test challenge is a benchmarking exercise that measures the progress of state-of-the-art video models. Organized as a half-day workshop at ECCV 2024, it features seven tracks covering low-level and high-level tasks across video, audio, and text modalities. The additional seventh track introduces an hour-long video understanding task with a new video QA benchmark called 1h-walk VQA. Tasks across the tracks include object tracking, temporal action localisation, and multiple-choice video question-answering. This report summarizes the challenge tasks and results, with a focus on the hour-long video QA benchmark. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary The Second Perception Test challenge is like a big test to see how well computers can understand videos. It’s an event held at a big computer science conference where people try to make their computers do cool things with videos. The computers face different challenges, like following objects in a video or understanding what’s happening in it. This year, there was even a new challenge where computers had to answer questions about videos that are an hour long! The report covers all the challenges and how the entries did. |
Keywords
» Artificial intelligence » Object tracking » Question answering