Paper List
-
Summary of Vibecheck: Discover and Quantify Qualitative Differences in Large Language Models, by Lisa Dunlap et al.
-
Summary of Prompt Engineering a Schizophrenia Chatbot: Utilizing a Multi-agent Approach For Enhanced Compliance with Prompt Instructions, by Per Niklas Waaler et al.
-
Summary of Jailjudge: a Comprehensive Jailbreak Judge Benchmark with Multi-agent Enhanced Explanation Evaluation Framework, by Fan Liu et al.
-
Summary of Tpo: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees, by Weibin Liao et al.
-
Summary of Optimized Biomedical Question-answering Services with Llm and Multi-bert Integration, by Cheng Qian et al.
-
Summary of Enterprise Benchmarks For Large Language Model Evaluation, by Bing Zhang et al.
-
Summary of Large Language Models For Medical Osce Assessment: a Novel Approach to Transcript Analysis, by Ameer Hamza Shakur et al.
-
Summary of Llmd: a Large Language Model For Interpreting Longitudinal Medical Records, by Robert Porter et al.
-
Summary of Enhancing Long Context Performance in Llms Through Inner Loop Query Mechanism, by Yimin Tang et al.
-
Summary of Scaled and Inter-token Relation Enhanced Transformer For Sample-restricted Residential Nilm, by Minhajur Rahman et al.
-
Summary of Investigating Implicit Bias in Large Language Models: a Large-scale Study Of Over 50 Llms, by Divyanshu Kumar et al.
-
Summary of Empowering Dysarthric Speech: Leveraging Advanced Llms For Accurate Speech Correction and Multimodal Emotion Analysis, by Kaushal Attaluri et al.
-
Summary of Navigating the Cultural Kaleidoscope: a Hitchhiker’s Guide to Sensitivity in Large Language Models, by Somnath Banerjee et al.
-
Summary of Mind: Math Informed Synthetic Dialogues For Pretraining Llms, by Syeda Nahida Akter et al.
-
Summary of Conformity in Large Language Models, by Xiaochen Zhu and Caiqi Zhang and Tom Stafford and Nigel Collier and Andreas Vlachos
-
Summary of Open Ko-llm Leaderboard2: Bridging Foundational and Practical Evaluation For Korean Llms, by Hyeonwoo Kim et al.
-
Summary of Stabilize the Latent Space For Image Autoregressive Modeling: a Unified Perspective, by Yongxin Zhu et al.
-
Summary of Dh-vton: Deep Text-driven Virtual Try-on Via Hybrid Attention Learning, by Jiabao Wei and Zhiyuan Ma
-
Summary of Benchmarking Defeasible Reasoning with Large Language Models — Initial Experiments and Future Directions, by Ilias Tachmazidis et al.
-
Summary of Queenscamp: An Rgb-d Dataset For Robust Visual Slam, by Hudson M. S. Bruno et al.
-
Summary of Counterfactual Effect Decomposition in Multi-agent Sequential Decision Making, by Stelios Triantafyllou et al.
-
Summary of Llm-based Translation Inference with Iterative Bilingual Understanding, by Andong Chen et al.
-
Summary of Development Of Image Collection Method Using Yolo and Siamese Network, by Chan Young Shin et al.
-
Summary of A Claim Decomposition Benchmark For Long-form Answer Verification, by Zhihao Zhang and Yixing Fan and Ruqing Zhang and Jiafeng Guo
-
Summary of Strux: An Llm For Decision-making with Structured Explanations, by Yiming Lu et al.
-
Summary of Rethinking Visual Counterfactual Explanations Through Region Constraint, by Bartlomiej Sobieski et al.
-
Summary of Evaluating Morphological Compositional Generalization in Large Language Models, by Mete Ismayilzada et al.
-
Summary of Cross-modal Safety Mechanism Transfer in Large Vision-language Models, by Shicheng Xu et al.
-
Summary of Worldcuisines: a Massive-scale Benchmark For Multilingual and Multicultural Visual Question Answering on Global Cuisines, by Genta Indra Winata et al.
-
Summary of Unitary Multi-margin Bert For Robust Natural Language Processing, by Hao-yuan Chang and Kang L. Wang
-
Summary of Identifying Task Groupings For Multi-task Learning Using Pointwise V-usable Information, by Yingya Li et al.
-
Summary of Interpretable Rule-based System For Radar-based Gesture Sensing: Enhancing Transparency and Personalization in Ai, by Sarah Seifi et al.
-
Summary of Interactive Explainable Anomaly Detection For Industrial Settings, by Daniel Gramelt et al.
-
Summary of A Comprehensive Survey Of Retrieval-augmented Generation (rag): Evolution, Current Landscape and Future Directions, by Shailja Gupta et al.
-
Summary of Order-aware Interactive Segmentation, by Bin Wang et al.
-
Summary of Omnixr: Evaluating Omni-modality Language Models on Reasoning Across Modalities, by Lichang Chen and Hexiang Hu and Mingda Zhang and Yiwen Chen and Zifeng Wang and Yandong Li and Pranav Shyam and Tianyi Zhou and Heng Huang and Ming-hsuan Yang and Boqing Gong
-
Summary of On a Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation, by Xiaonan Jing et al.
-
Summary of Controlled Automatic Task-specific Synthetic Data Generation For Hallucination Detection, by Yong Xie et al.
-
Summary of Kallini et al. (2024) Do Not Compare Impossible Languages with Constituency-based Ones, by Tim Hunter
-
Summary of A Prompt-based Knowledge Graph Foundation Model For Universal In-context Reasoning, by Yuanning Cui and Zequn Sun and Wei Hu
-
Summary of Pyramid-driven Alignment: Pyramid Principle Guided Integration Of Large Language Models and Knowledge Graphs, by Lei Sun et al.
-
Summary of Open Domain Question Answering with Conflicting Contexts, by Siyi Liu et al.
-
Summary of Facechain-fact: Face Adapter with Decoupled Training For Identity-preserved Personalization, by Cheng Yu et al.
-
Summary of Reversal Of Thought: Enhancing Large Language Models with Preference-guided Reverse Reasoning Warm-up, by Jiahao Yuan et al.
-
Summary of Understanding the Role Of Llms in Multimodal Evaluation Benchmarks, by Botian Jiang et al.
-
Summary of Tas: Distilling Arbitrary Teacher and Student Via a Hybrid Assistant, by Guopeng Li et al.
-
Summary of Characterizing Model Collapse in Large Language Models Using Semantic Networks and Next-token Probability, by Daniele Gambetta et al.
-
Summary of Efficient Diffusion As Low Light Enhancer, by Guanzhou Lan et al.
-
Summary of Proactive Agent: Shifting Llm Agents From Reactive Responses to Active Assistance, by Yaxi Lu et al.
-
Summary of Preflexor: Preference-based Recursive Language Modeling For Exploratory Optimization Of Reasoning and Agentic Thinking, by Markus J. Buehler
-
Summary of Shapefilegpt: a Multi-agent Large Language Model Framework For Automated Shapefile Processing, by Qingming Lin et al.
-
Summary of Humaneval-v: Benchmarking High-level Visual Reasoning with Complex Diagrams in Coding Tasks, by Fengji Zhang et al.
-
Summary of Revealing the Barriers Of Language Agents in Planning, by Jian Xie et al.
-
Summary of A Fast Convoluted Story: Scaling Probabilistic Inference For Integer Arithmetic, by Lennert De Smet and Pedro Zuidberg Dos Martires
-
Summary of Videgothink: Assessing Egocentric Video Understanding Capabilities For Embodied Ai, by Sijie Cheng et al.
-
Summary of Ed-vit: Splitting Vision Transformer For Distributed Inference on Edge Devices, by Xiang Liu et al.
-
Summary of Retrieval Augmented Spelling Correction For E-commerce Applications, by Xuan Guo et al.
-
Summary of Visualrwkv-hd and Uhd: Advancing High-resolution Processing For Visual Language Models, by Zihang Li and Haowen Hou
-
Summary of Leaving the Barn Door Open For Clever Hans: Simple Features Predict Llm Benchmark Answers, by Lorenzo Pacchiardi et al.
-
Summary of Rclicks: Realistic Click Simulation For Benchmarking Interactive Segmentation, by Anton Antonov et al.
-
Summary of Magnifier Prompt: Tackling Multimodal Hallucination Via Extremely Simple Instructions, by Yuhan Fu et al.
-
Summary of Patch-based Diffusion Models Beat Whole-image Models For Mismatched Distribution Inverse Problems, by Jason Hu et al.
-
Summary of Evidence Of Cognitive Deficits and Developmental Advances in Generative Ai: a Clock Drawing Test Analysis, by Isaac R. Galatzer-levy et al.
-
Summary of Slidechat: a Large Vision-language Assistant For Whole-slide Pathology Image Understanding, by Ying Chen et al.
-
Summary of Ctrlsynth: Controllable Image Text Synthesis For Data-efficient Multimodal Learning, by Qingqing Cao et al.
-
Summary of Concept-reversed Winograd Schema Challenge: Evaluating and Improving Robust Reasoning in Large Language Models Via Abstraction, by Kaiqiao Han et al.
-
Summary of Large-scale Cloze Evaluation Reveals That Token Prediction Tasks Are Neither Lexically Nor Semantically Aligned, by Cassandra L. Jacobs et al.
-
Summary of Weatherdg: Llm-assisted Diffusion Model For Procedural Weather Generation in Domain-generalized Semantic Segmentation, by Chenghao Qian et al.
-
Summary of Planning Anything with Rigor: General-purpose Zero-shot Planning with Llm-based Formalized Programming, by Yilun Hao et al.
-
Summary of Iter-ahmcl: Alleviate Hallucination For Large Language Model Via Iterative Model-level Contrastive Learning, by Huiwen Wu et al.
-
Summary of Layer-of-thoughts Prompting (lot): Leveraging Llm-based Retrieval with Constraint Hierarchies, by Wachara Fungwacharakorn et al.
-
Summary of Dual-model Distillation For Efficient Action Classification with Hybrid Edge-cloud Solution, by Timothy Wei et al.
-
Summary of Exploiting Llms’ Reasoning Capability to Infer Implicit Concepts in Legal Information Retrieval, by Hai-long Nguyen et al.
-
Summary of Sparse Prototype Network For Explainable Pedestrian Behavior Prediction, by Yan Feng et al.
-
Summary of Athena: Retrieval-augmented Legal Judgment Prediction with Large Language Models, by Xiao Peng et al.
-
Summary of On the Capacity Of Citation Generation by Large Language Models, by Haosheng Qian et al.
-
Summary of In-context Learning For Long-context Sentiment Analysis on Infrastructure Project Opinions, by Alireza Shamshiri et al.
-
Summary of Hr-agent: a Task-oriented Dialogue (tod) Llm Agent Tailored For Hr Applications, by Weijie Xu et al.
-
Summary of Process Reward Model with Q-value Rankings, by Wendi Li et al.
-
Summary of Enhancing Assamese Nlp Capabilities: Introducing a Centralized Dataset Repository, by S. Tamang et al.
-
Summary of Speculative Knowledge Distillation: Bridging the Teacher-student Gap Through Interleaved Sampling, by Wenda Xu et al.
-
Summary of Preserve or Modify? Context-aware Evaluation For Balancing Preservation and Modification in Text-guided Image Editing, by Yoonjeon Kim et al.
-
Summary of Rate: Causal Explainability Of Reward Models with Imperfect Counterfactuals, by David Reber et al.
-
Summary of Implementing Derivations Of Definite Logic Programs with Self-attention Networks, by Phan Thi Thanh Thuy et al.
-
Summary of A Case For Ai Consciousness: Language Agents and Global Workspace Theory, by Simon Goldstein and Cameron Domenico Kirk-giannini
-
Summary of Pmmt: Preference Alignment in Multilingual Machine Translation Via Llm Distillation, by Shuqiao Sun et al.
-
Summary of Dynamicer: Resolving Emerging Mentions to Dynamic Entities For Rag, by Jinyoung Kim et al.
-
Summary of Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal Llms, by Sihang Zhao et al.
-
Summary of Revisiting Benchmark and Assessment: An Agent-based Exploratory Dynamic Evaluation Framework For Llms, by Wanying Wang et al.
-
Summary of Agentigraph: An Interactive Knowledge Graph Platform For Llm-based Chatbots Utilizing Private Data, by Xinjie Zhao et al.
-
Summary of Y-mol: a Multiscale Biomedical Knowledge-guided Large Language Model For Drug Development, by Tengfei Ma et al.
-
Summary of Multi-round Jailbreak Attack on Large Language Models, by Yihua Zhou et al.
-
Summary of Lcd-net: a Lightweight Remote Sensing Change Detection Network Combining Feature Fusion and Gating Mechanism, by Wenyu Liu et al.
-
Summary of Auditwen: An Open-source Large Language Model For Audit, by Jiajia Huang et al.
-
Summary of Toolbridge: An Open-source Dataset to Equip Llms with External Tool Capabilities, by Zhenchao Jin et al.
-
Summary of Optimizing Transformer Based on High-performance Optimizer For Predicting Employment Sentiment in American Social Media Content, by Feiyang Wang et al.
-
Summary of Improving Data Efficiency Via Curating Llm-driven Rating Systems, by Jinlong Pang et al.
-
Summary of Cultural Heritage 3d Reconstruction with Diffusion Networks, by Pablo Jaramillo and Ivan Sipiran
-
Summary of Agent-as-a-judge: Evaluate Agents with Agents, by Mingchen Zhuge et al.