Paper List

Summary of On the Modeling Capabilities Of Large Language Models For Sequential Decision Making, by Martin Klissarov et al.
Summary of Acpbench: Reasoning About Action, Change, and Planning, by Harsha Kokel et al.
Summary of T2v-turbo-v2: Enhancing Video Generation Model Post-training Through Data, Reward, and Conditional Guidance Design, by Jiachen Li et al.
Summary of Patch Is Enough: Naturalistic Adversarial Patch Against Vision-language Pre-training Models, by Dehong Kong et al.
Summary of Training Interactive Agent in Large Fps Game Map with Rule-enhanced Reinforcement Learning, by Chen Zhang et al.
Summary of Real-time Ship Recognition and Georeferencing For the Improvement Of Maritime Situational Awareness, by Borja Carrillo Perez
Summary of Activation Scaling For Steering and Interpreting Language Models, by Niklas Stoehr et al.
Summary of 6dgs: Enhanced Direction-aware Gaussian Splatting For Volumetric Rendering, by Zhongpai Gao et al.
Summary of Can Llms Plan Paths with Extra Hints From Solvers?, by Erik Wu and Sayan Mitra
Summary of Named Clinical Entity Recognition Benchmark, by Wadood M Abdul et al.
Summary of On the Structure Of Game Provenance and Its Applications, by Shawn Bowers et al.
Summary of Synthetic Generation Of Dermatoscopic Images with Gan and Closed-form Factorization, by Rohan Reddy Mekala et al.
Summary of Scalable and Accurate Graph Reasoning with Llm-based Multi-agents, by Yuwei Hu et al.
Summary of Ctc-gmm: Ctc Guided Modality Matching For Fast and Accurate Streaming Speech Translation, by Rui Zhao et al.
Summary of Vlm2vec: Training Vision-language Models For Massive Multimodal Embedding Tasks, by Ziyan Jiang et al.
Summary of Beyond Correlation: Interpretable Evaluation Of Machine Translation Metrics, by Stefano Perrella et al.
Summary of Casimedicos-arg: a Medical Question Answering Dataset Annotated with Explanatory Argumentative Structures, by Ekaterina Sviridova et al.
Summary of Preserving Multi-modal Capabilities Of Pre-trained Vlms For Improving Vision-linguistic Compositionality, by Youngtaek Oh et al.
Summary of Navigating the Digital World As Humans Do: Universal Visual Grounding For Gui Agents, by Boyu Gou et al.
Summary of Texthawk2: a Large Vision-language Model Excels in Bilingual Ocr and Grounding with 16x Fewer Tokens, by Ya-qi Yu et al.
Summary of Hirt: Enhancing Robotic Control with Hierarchical Robot Transformers, by Jianke Zhang et al.
Summary of Scale-invariant Object Detection by Adaptive Convolution with Unified Global-local Context, by Amrita Singh et al.
Summary of Hate Speech Detection Using Cross-platform Social Media Data in English and German Language, by Gautam Kishore Shahi and Tim A. Majchrzak
Summary of Mvp-bench: Can Large Vision-language Models Conduct Multi-level Visual Perception Like Humans?, by Guanzhen Li et al.
Summary of Capeen: Image Captioning with Early Exits and Knowledge Distillation, by Divya Jyoti Bajpai and Manjesh Kumar Hanawal
Summary of Dadee: Unsupervised Domain Adaptation in Early Exit Plms, by Divya Jyoti Bajpai and Manjesh Kumar Hanawal
Summary of Empowering Backbone Models For Visual Text Generation with Input Granularity Control and Glyph-aware Training, by Wenbo Li et al.
Summary of Learning to Solve Abstract Reasoning Problems with Neurosymbolic Program Synthesis and Task Generation, by Jakub Bednarek et al.
Summary of Mindscope: Exploring Cognitive Biases in Large Language Models Through Multi-agent Systems, by Zhentao Xie et al.
Summary of A Pluggable Common Sense-enhanced Framework For Knowledge Graph Completion, by Guanglin Niu et al.
Summary of Generalizability Analysis Of Deep Learning Predictions Of Human Brain Responses to Augmented and Semantically Novel Visual Stimuli, by Valentyn Piskovskyi et al.
Summary of Knowledge-guided Dynamic Modality Attention Fusion Framework For Multimodal Sentiment Analysis, by Xinyu Feng et al.
Summary of Semi-markovian Planning to Coordinate Aerial and Maritime Medical Evacuation Platforms, by Mahdi Al-Husseini et al.
Summary of Lrhp: Learning Representations For Human Preferences Via Preference Pairs, by Chenglong Wang et al.
Summary of Famma: a Benchmark For Financial Domain Multilingual Multimodal Question Answering, by Siqiao Xue et al.
Summary of Kgarevion: An Ai Agent For Knowledge-intensive Biomedical Qa, by Xiaorui Su et al.
Summary of Driving with Regulation: Interpretable Decision-making For Autonomous Vehicles with Retrieval-augmented Reasoning Via Llm, by Tianhui Cai et al.
Summary of Passage Retrieval Of Polish Texts Using Okapi Bm25 and An Ensemble Of Cross Encoders, by Jakub Pokrywka
Summary of Representing the Under-represented: Cultural and Core Capability Benchmarks For Developing Thai Large Language Models, by Dahyun Kim et al.
Summary of Resource-efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders, by Kosta Dakic et al.
Summary of Transforming Color: a Novel Image Colorization Method, by Hamza Shafiq and Bumshik Lee
Summary of Postedit: Posterior Sampling For Efficient Zero-shot Image Editing, by Feng Tian et al.
Summary of Leveraging Grammar Induction For Language Understanding and Generation, by Jushi Kai et al.
Summary of Econ: on the Detection and Resolution Of Evidence Conflicts, by Cheng Jiayang et al.
Summary of Pad: Personalized Alignment Of Llms at Decoding-time, by Ruizhe Chen et al.
Summary of Multi-round Region-based Optimization For Scene Sketching, by Yiqi Liang et al.
Summary of Globesumm: a Challenging Benchmark Towards Unifying Multi-lingual, Cross-lingual and Multi-document News Summarization, by Yangfan Ye et al.
Summary of Epsilon-vae: Denoising As Visual Decoding, by Long Zhao et al.
Summary of From Reading to Compressing: Exploring the Multi-document Reader For Prompt Compression, by Eunseong Choi et al.
Summary of Reasoning with Natural Language Explanations, by Marco Valentino et al.
Summary of Dammi: Daily Activities in a Psychologically Annotated Multi-modal Iot Dataset, by Mohsen Falah Rad et al.
Summary of Iv-mixed Sampler: Leveraging Image Diffusion Models For Enhanced Video Synthesis, by Shitong Shao et al.
Summary of Neuro-symbolic Entity Alignment Via Variational Inference, by Shengyuan Chen et al.
Summary of Accelerating Diffusion Models with One-to-many Knowledge Distillation, by Linfeng Zhang et al.
Summary of Longgenbench: Long-context Generation Benchmark, by Xiang Liu et al.
Summary of Rainbowpo: a Unified Framework For Combining Improvements in Preference Optimization, by Hanyang Zhao et al.
Summary of Correlation-aware Select and Merge Attention For Efficient Fine-tuning and Context Length Extension, by Ning Wang et al.
Summary of Improving Portfolio Optimization Results with Bandit Networks, by Gustavo De Freitas Fonseca et al.
Summary of Towards Propositional Klm-style Defeasible Standpoint Logics, by Nicholas Leisegang et al.
Summary of Implicit to Explicit Entropy Regularization: Benchmarking Vit Fine-tuning Under Noisy Labels, by Maria Marrium et al.
Summary of Constructing Cloze Questions Generatively, by Yicheng Sun and Jie Wang
Summary of Channel-aware Throughput Maximization For Cooperative Data Fusion in Cav, by Haonan An et al.
Summary of Mechanistic Behavior Editing Of Language Models, by Joykirat Singh et al.
Summary of Human Bias in the Face Of Ai: the Role Of Human Judgement in Ai Generated Text Evaluation, by Tiffany Zhu et al.
Summary of Unsupervised Human Preference Learning, by Sumuk Shashidhar et al.
Summary of Erasmo: Leveraging Large Language Models For Enhanced Clustering Segmentation, by Fillipe Dos Santos Silva et al.
Summary of Grammar Induction From Visual, Speech and Text, by Yu Zhao et al.
Summary of A Two-stage Proactive Dialogue Generator For Efficient Clinical Information Collection Using Large Language Model, by Xueshen Li et al.
Summary of Scisafeeval: a Comprehensive Benchmark For Safety Alignment Of Large Language Models in Scientific Tasks, by Tianhao Li et al.
Summary of Precision Knowledge Editing: Enhancing Safety in Large Language Models, by Xuying Li et al.
Summary of Determine-then-ensemble: Necessity Of Top-k Union For Large Language Model Ensembling, by Yuxuan Yao et al.
Summary of Calliffusionv2: Personalized Natural Calligraphy Generation with Flexible Multi-modal Control, by Qisheng Liao et al.
Summary of Graphrouter: a Graph-based Router For Llm Selections, by Tao Feng et al.
Summary of Swe-bench Multimodal: Do Ai Systems Generalize to Visual Software Domains?, by John Yang et al.
Summary of Kidlm: Advancing Language Models For Children — Early Insights and Future Directions, by Mir Tafseer Nayeem et al.
Summary of Chain-of-jailbreak Attack For Image Generation Models Via Editing Step by Step, by Wenxuan Wang et al.
Summary of Still Not Quite There! Evaluating Large Language Models For Comorbid Mental Health Diagnosis, by Amey Hengle et al.
Summary of Take It Easy: Label-adaptive Self-rationalization For Fact Verification and Explanation Generation, by Jing Yang and Anderson Rocha
Summary of Grounding Language in Multi-perspective Referential Communication, by Zineng Tang et al.
Summary of Syllablelm: Learning Coarse Semantic Units For Speech Language Models, by Alan Baade et al.
Summary of Gamified Crowd-sourcing Of High-quality Data For Visual Fine-tuning, by Shashank Yadav et al.
Summary of Large Language Models Can Achieve Social Balance, by Pedro Cisneros-Velarde
Summary of Lorta: Low Rank Tensor Adaptation Of Large Language Models, by Ignacio Hounie et al.
Summary of Nlip_lab-iith Low-resource Mt System For Wmt24 Indic Mt Shared Task, by Pramit Sahoo et al.
Summary of Enriching Ontologies with Disjointness Axioms Using Large Language Models, by Elias Crum et al.
Summary of Generating Bilingual Example Sentences with Large Language Models As Lexicography Assistants, by Raphael Merx et al.
Summary of Grounded-videollm: Sharpening Fine-grained Temporal Grounding in Video Large Language Models, by Haibo Wang et al.
Summary of Towards a Benchmark For Large Language Models For Business Process Management Tasks, by Kiran Busch and Henrik Leopold
Summary of Comparing Zero-shot Self-explanations with Human Rationales in Text Classification, by Stephanie Brandl and Oliver Eberle
Summary of Comparative Analysis and Ensemble Enhancement Of Leading Cnn Architectures For Breast Cancer Classification, by Gary Murphy et al.
Summary of An X-ray Is Worth 15 Features: Sparse Autoencoders For Interpretable Radiology Report Generation, by Ahmed Abdulaal et al.
Summary of Lantern: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding, by Doohyuk Jang et al.
Summary of One2set + Large Language Model: Best Partners For Keyphrase Generation, by Liangying Shao et al.
Summary of Exploring the Benefit Of Activation Sparsity in Pre-training, by Zhengyan Zhang et al.
Summary of Mare: Multi-aspect Rationale Extractor on Unsupervised Rationale Extraction, by Han Jiang et al.
Summary of Constructive Apraxia: An Unexpected Limit Of Instructible Vision-language Models and Analog For Human Cognitive Disorders, by David Noever and Samantha E. Miller Noever
Summary of Not All Diffusion Model Activations Have Been Evaluated As Discriminative Features, by Benyuan Meng et al.
Summary of Variational Bayes Gaussian Splatting, by Toon Van De Maele et al.
Summary of Similarity-enhanced Homophily For Multi-view Heterophilous Graph Clustering, by Jianpeng Chen et al.
Summary of Aligning Llms with Individual Preferences Via Interaction, by Shujin Wu et al.