Summary of Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark, by Joseph Heyward et al.


Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark

by Joseph Heyward, João Carreira, Dima Damen, Andrew Zisserman, Viorica Pătrăucean

First submitted to arXiv on: 29 Nov 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Computation and Language (cs.CL); Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract on the paper’s arXiv listing.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The Second Perception Test challenge is a benchmarking exercise that measures the progress of state-of-the-art video models. Organized as a half-day workshop at ECCV 2024, it features seven tracks covering low-level and high-level tasks across video, audio, and text modalities. The newly added track introduces an hour-long video understanding task with a novel video QA benchmark called 1h-walk VQA. Tasks include object tracking, temporal action localisation, and multiple-choice video question-answering. The report summarizes the challenge tasks and results, with a focus on the hour-long video QA benchmark.

Low Difficulty Summary (written by GrooveSquid.com, original content)
The Second Perception Test challenge is like a big test to see how well computers can understand videos. It was held at a big computer science conference, where teams tried to get their computer programs to do useful things with videos. The challenges included following objects in a video and understanding what’s happening in a movie. This year there was also a new challenge where computers had to answer questions about hour-long videos. The report covers all the challenges and how the teams did.

Keywords

  • Artificial intelligence
  • Object tracking
  • Question answering