Summary of LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding, by Haoning Wu et al.
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding by Haoning Wu, Dongxu Li, Bei Chen, Junnan…
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data by Xinyi Wang, Antonis Antoniades,…
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference by Qichen Fu, Minsik Cho, Thomas…
INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering capability of LLMs for Indic…
Evaluation of RAG Metrics for Question Answering in the Telecom Domain by Sujoy Roychowdhury, Sumit Soman,…
Reasoning with Large Language Models, a Survey by Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens,…
Unraveling the Truth: Do VLMs really Understand Charts? A Deep Dive into Consistency and Robustness by…
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together by Dilara Soylu, Christopher Potts,…
IoT-LM: Large Multisensory Language Models for the Internet of Things by Shentong Mo, Russ Salakhutdinov, Louis-Philippe…
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling by Lecheng Kong, Jiarui Feng, Hao…