Paper List
We recommend using the search box, as this list is very long.
-
Summary of Texthawk2: a Large Vision-language Model Excels in Bilingual Ocr and Grounding with 16x Fewer Tokens, by Ya-qi Yu et al.
-
Summary of Navigating the Digital World As Humans Do: Universal Visual Grounding For Gui Agents, by Boyu Gou et al.
-
Summary of Hirt: Enhancing Robotic Control with Hierarchical Robot Transformers, by Jianke Zhang et al.
-
Summary of Casimedicos-arg: a Medical Question Answering Dataset Annotated with Explanatory Argumentative Structures, by Ekaterina Sviridova et al.
-
Summary of Hate Speech Detection Using Cross-platform Social Media Data in English and German Language, by Gautam Kishore Shahi and Tim A. Majchrzak
-
Summary of Output Scouting: Auditing Large Language Models For Catastrophic Responses, by Andrew Bell and Joao Fonseca
-
Summary of Scale-invariant Object Detection by Adaptive Convolution with Unified Global-local Context, by Amrita Singh et al.
-
Summary of Proceedings Of the First International Workshop on Next-generation Language Models For Knowledge Representation and Reasoning (nelamkrr 2024), by Ken Satoh et al.
-
Summary of Falcon Mamba: the First Competitive Attention-free 7b Language Model, by Jingwei Zuo et al.
-
Summary of Egooops: a Dataset For Mistake Action Detection From Egocentric Videos Referring to Procedural Texts, by Yuto Haneji et al.
-
Summary of Post-hoc Study Of Climate Microtargeting on Social Media Ads with Llms: Thematic Insights and Fairness Evaluation, by Tunazzina Islam et al.
-
Summary of Synthesizing Interpretable Control Policies Through Large Language Model Guided Search, by Carlo Bosio and Mark W. Mueller
-
Summary of Herd Mentality in Augmentation — Not a Good Idea! a Robust Multi-stage Approach Towards Deepfake Detection, by Monu et al.
-
Summary of Intuitions Of Compromise: Utilitarianism Vs. Contractualism, by Jared Moore et al.
-
Summary of Generalizability Analysis Of Deep Learning Predictions Of Human Brain Responses to Augmented and Semantically Novel Visual Stimuli, by Valentyn Piskovskyi et al.
-
Summary of Lrhp: Learning Representations For Human Preferences Via Preference Pairs, by Chenglong Wang et al.
-
Summary of Semi-markovian Planning to Coordinate Aerial and Maritime Medical Evacuation Platforms, by Mahdi Al-husseini et al.
-
Summary of Famma: a Benchmark For Financial Domain Multilingual Multimodal Question Answering, by Siqiao Xue et al.
-
Summary of Passage Retrieval Of Polish Texts Using Okapi Bm25 and An Ensemble Of Cross Encoders, by Jakub Pokrywka
-
Summary of Kgarevion: An Ai Agent For Knowledge-intensive Biomedical Qa, by Xiaorui Su et al.
-
Summary of Representing the Under-represented: Cultural and Core Capability Benchmarks For Developing Thai Large Language Models, by Dahyun Kim et al.
-
Summary of Driving with Regulation: Interpretable Decision-making For Autonomous Vehicles with Retrieval-augmented Reasoning Via Llm, by Tianhui Cai et al.
-
Summary of Transforming Color: a Novel Image Colorization Method, by Hamza Shafiq and Bumshik Lee
-
Summary of Resource-efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders, by Kosta Dakic et al.
-
Summary of Postedit: Posterior Sampling For Efficient Zero-shot Image Editing, by Feng Tian et al.
-
Summary of Leveraging Grammar Induction For Language Understanding and Generation, by Jushi Kai et al.
-
Summary of Patch Is Enough: Naturalistic Adversarial Patch Against Vision-language Pre-training Models, by Dehong Kong et al.
-
Summary of Training Interactive Agent in Large Fps Game Map with Rule-enhanced Reinforcement Learning, by Chen Zhang et al.
-
Summary of Real-time Ship Recognition and Georeferencing For the Improvement Of Maritime Situational Awareness, by Borja Carrillo Perez
-
Summary of Activation Scaling For Steering and Interpreting Language Models, by Niklas Stoehr et al.
-
Summary of 6dgs: Enhanced Direction-aware Gaussian Splatting For Volumetric Rendering, by Zhongpai Gao et al.
-
Summary of Can Llms Plan Paths with Extra Hints From Solvers?, by Erik Wu and Sayan Mitra
-
Summary of Named Clinical Entity Recognition Benchmark, by Wadood M Abdul et al.
-
Summary of On the Structure Of Game Provenance and Its Applications, by Shawn Bowers et al.
-
Summary of Neuro-symbolic Entity Alignment Via Variational Inference, by Shengyuan Chen et al.
-
Summary of Iv-mixed Sampler: Leveraging Image Diffusion Models For Enhanced Video Synthesis, by Shitong Shao et al.
-
Summary of Accelerating Diffusion Models with One-to-many Knowledge Distillation, by Linfeng Zhang et al.
-
Summary of Longgenbench: Long-context Generation Benchmark, by Xiang Liu et al.
-
Summary of Rainbowpo: a Unified Framework For Combining Improvements in Preference Optimization, by Hanyang Zhao et al.
-
Summary of Towards Propositional Klm-style Defeasible Standpoint Logics, by Nicholas Leisegang et al.
-
Summary of Improving Portfolio Optimization Results with Bandit Networks, by Gustavo De Freitas Fonseca et al.
-
Summary of Implicit to Explicit Entropy Regularization: Benchmarking Vit Fine-tuning Under Noisy Labels, by Maria Marrium et al.
-
Summary of Mechanistic Behavior Editing Of Language Models, by Joykirat Singh et al.
-
Summary of Channel-aware Throughput Maximization For Cooperative Data Fusion in Cav, by Haonan An et al.
-
Summary of Mvp-bench: Can Large Vision–language Models Conduct Multi-level Visual Perception Like Humans?, by Guanzhen Li et al.
-
Summary of Dadee: Unsupervised Domain Adaptation in Early Exit Plms, by Divya Jyoti Bajpai and Manjesh Kumar Hanawal
-
Summary of Capeen: Image Captioning with Early Exits and Knowledge Distillation, by Divya Jyoti Bajpai and Manjesh Kumar Hanawal
-
Summary of Empowering Backbone Models For Visual Text Generation with Input Granularity Control and Glyph-aware Training, by Wenbo Li et al.
-
Summary of Mindscope: Exploring Cognitive Biases in Large Language Models Through Multi-agent Systems, by Zhentao Xie et al.
-
Summary of Learning to Solve Abstract Reasoning Problems with Neurosymbolic Program Synthesis and Task Generation, by Jakub Bednarek et al.
-
Summary of Knowledge-guided Dynamic Modality Attention Fusion Framework For Multimodal Sentiment Analysis, by Xinyu Feng et al.
-
Summary of Determine-then-ensemble: Necessity Of Top-k Union For Large Language Model Ensembling, by Yuxuan Yao et al.
-
Summary of Graphrouter: a Graph-based Router For Llm Selections, by Tao Feng et al.
-
Summary of Swe-bench Multimodal: Do Ai Systems Generalize to Visual Software Domains?, by John Yang et al.
-
Summary of Chain-of-jailbreak Attack For Image Generation Models Via Editing Step by Step, by Wenxuan Wang et al.
-
Summary of Kidlm: Advancing Language Models For Children — Early Insights and Future Directions, by Mir Tafseer Nayeem et al.
-
Summary of Still Not Quite There! Evaluating Large Language Models For Comorbid Mental Health Diagnosis, by Amey Hengle et al.
-
Summary of Grounding Language in Multi-perspective Referential Communication, by Zineng Tang et al.
-
Summary of Take It Easy: Label-adaptive Self-rationalization For Fact Verification and Explanation Generation, by Jing Yang and Anderson Rocha
-
Summary of Syllablelm: Learning Coarse Semantic Units For Speech Language Models, by Alan Baade et al.
-
Summary of Gamified Crowd-sourcing Of High-quality Data For Visual Fine-tuning, by Shashank Yadav et al.
-
Summary of Lorta: Low Rank Tensor Adaptation Of Large Language Models, by Ignacio Hounie et al.
-
Summary of Large Language Models Can Achieve Social Balance, by Pedro Cisneros-velarde
-
Summary of Econ: on the Detection and Resolution Of Evidence Conflicts, by Cheng Jiayang et al.
-
Summary of Pad: Personalized Alignment Of Llms at Decoding-time, by Ruizhe Chen et al.
-
Summary of Multi-round Region-based Optimization For Scene Sketching, by Yiqi Liang et al.
-
Summary of Globesumm: a Challenging Benchmark Towards Unifying Multi-lingual, Cross-lingual and Multi-document News Summarization, by Yangfan Ye et al.
-
Summary of Epsilon-vae: Denoising As Visual Decoding, by Long Zhao et al.
-
Summary of From Reading to Compressing: Exploring the Multi-document Reader For Prompt Compression, by Eunseong Choi et al.
-
Summary of Reasoning with Natural Language Explanations, by Marco Valentino et al.
-
Summary of Dammi: Daily Activities in a Psychologically Annotated Multi-modal Iot Dataset, by Mohsen Falah Rad et al.
-
Summary of One2set + Large Language Model: Best Partners For Keyphrase Generation, by Liangying Shao et al.
-
Summary of Lantern: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding, by Doohyuk Jang et al.
-
Summary of Exploring the Benefit Of Activation Sparsity in Pre-training, by Zhengyan Zhang et al.
-
Summary of Constructive Apraxia: An Unexpected Limit Of Instructible Vision-language Models and Analog For Human Cognitive Disorders, by David Noever and Samantha E. Miller Noever
-
Summary of Mare: Multi-aspect Rationale Extractor on Unsupervised Rationale Extraction, by Han Jiang et al.
-
Summary of Variational Bayes Gaussian Splatting, by Toon Van De Maele et al.
-
Summary of Not All Diffusion Model Activations Have Been Evaluated As Discriminative Features, by Benyuan Meng et al.
-
Summary of Similarity-enhanced Homophily For Multi-view Heterophilous Graph Clustering, by Jianpeng Chen et al.
-
Summary of Aligning Llms with Individual Preferences Via Interaction, by Shujin Wu et al.
-
Summary of Learning From Committee: Reasoning Distillation From a Mixture Of Teachers with Peer-review, by Zhuochun Li et al.
-
Summary of Estimating Body and Hand Motion in An Ego-sensed World, by Brent Yi et al.
-
Summary of Thematic Analysis with Open-source Generative Ai and Machine Learning: a New Method For Inductive Qualitative Codebook Development, by Andrew Katz, Gabriella Coloyan Fleming, and Joyce Main
-
Summary of Unsupervised Human Preference Learning, by Sumuk Shashidhar et al.
-
Summary of Human Bias in the Face Of Ai: the Role Of Human Judgement in Ai Generated Text Evaluation, by Tiffany Zhu et al.
-
Summary of Erasmo: Leveraging Large Language Models For Enhanced Clustering Segmentation, by Fillipe Dos Santos Silva et al.
-
Summary of Grammar Induction From Visual, Speech and Text, by Yu Zhao et al.
-
Summary of Scisafeeval: a Comprehensive Benchmark For Safety Alignment Of Large Language Models in Scientific Tasks, by Tianhao Li et al.
-
Summary of Precision Knowledge Editing: Enhancing Safety in Large Language Models, by Xuying Li et al.
-
Summary of A Two-stage Proactive Dialogue Generator For Efficient Clinical Information Collection Using Large Language Model, by Xueshen Li et al.
-
Summary of Calliffusionv2: Personalized Natural Calligraphy Generation with Flexible Multi-modal Control, by Qisheng Liao et al.
-
Summary of Visual Editing with Llm-based Tool Chaining: An Efficient Distillation Approach For Real-time Applications, by Oren Sultan et al.
-
Summary of Aibat: Artificial Intelligence/instructions For Build, Assembly, and Test, by Benjamin Nuernberger et al.
-
Summary of Guided Stream Of Search: Learning to Better Search with Language Models Via Optimal Path Guidance, by Seungyong Moon et al.
-
Summary of Is Your Paper Being Reviewed by An Llm? Investigating Ai Text Detectability in Peer Review, by Sungduk Yu et al.
-
Summary of Dynamic Sparse Training Versus Dense Training: the Unexpected Winner in Image Corruption Robustness, by Boqian Wu et al.
-
Summary of Image First or Text First? Optimising the Sequencing Of Modalities in Large Language Model Prompting and Reasoning Tasks, by Grant Wardle and Teo Susnjak