Paper List
-
Summary of Semopo: Learning High-quality Model and Policy From Low-quality Offline Visual Datasets, by Shenghua Wan et al.
-
Summary of My Body My Choice: Human-centric Full-body Anonymization, by Umur Aybars Ciftci et al.
-
Summary of Introducing Hot3d: An Egocentric Dataset For 3d Hand and Object Tracking, by Prithviraj Banerjee et al.
-
Summary of Vlind-bench: Measuring Language Priors in Large Vision-language Models, by Kang-il Lee et al.
-
Summary of Alphazeroes: Direct Score Maximization Outperforms Planning Loss Minimization, by Carlos Martin et al.
-
Summary of Batch-instructed Gradient For Prompt Evolution: Systematic Prompt Optimization For Enhanced Text-to-image Synthesis, by Xinrui Yang et al.
-
Summary of 3d Building Generation in Minecraft Via Large Language Models, by Shiying Hu et al.
-
Summary of Srfund: a Multi-granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding, by Jiefeng Ma et al.
-
Summary of Injecting Combinatorial Optimization Into Mcts: Application to the Board Game Boop, by Florian Richoux
-
Summary of Computer Vision-based Model For Detecting Turning Lane Features on Florida’s Public Roadways, by Richard Boadu Antwi et al.
-
Summary of An Approach to Build Zero-shot Slot-filling System For Industry-grade Conversational Assistants, by G P Shrivatsa Bhargav et al.
-
Summary of A Survey on Compositional Learning Of Ai Models: Theoretical and Experimental Practices, by Sania Sinha et al.
-
Summary of Zoom and Shift Are All You Need, by Jiahao Qin
-
Summary of Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-attention Cues in Multitask Learning, by Arnav Goel et al.
-
Summary of Egoexo-fitness: Towards Egocentric and Exocentric Full-body Action Understanding, by Yuan-ming Li et al.
-
Summary of Introducing Brain-like Concepts to Embodied Hand-crafted Dialog Management System, by Frank Joublin et al.
-
Summary of Multi-agent Software Development Through Cross-team Collaboration, by Zhuoyun Du et al.
-
Summary of Language Models Are Crossword Solvers, by Soumadeep Saha and Sutanoya Chakraborty and Saptarshi Saha and Utpal Garain
-
Summary of Towards Reliable Detection Of Llm-generated Texts: a Comprehensive Evaluation Framework with Cudrt, by Zhen Tao et al.
-
Summary of Pc-lora: Low-rank Adaptation For Progressive Model Compression with Knowledge Distillation, by Injoon Hwang et al.
-
Summary of Suitability Of Kans For Computer Vision: a Preliminary Investigation, by Basim Azam and Naveed Akhtar
-
Summary of Fine-grained Domain Generalization with Feature Structuralization, by Wenlong Yu et al.
-
Summary of Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn’t, by Chihiro Taguchi and David Chiang
-
Summary of A Sociotechnical Lens For Evaluating Computer Vision Models: a Case Study on Detecting and Reasoning About Gender and Emotion, by Sha Luo et al.
-
Summary of Research Trends For the Interplay Between Large Language Models and Knowledge Graphs, by Hanieh Khorashadizadeh et al.
-
Summary of Using Deep Convolutional Neural Networks to Detect Rendered Glitches in Video Games, by Carlos Garcia Ling et al.
-
Summary of From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition, by Shiwei Wu et al.
-
Summary of 2.5d Multi-view Averaging Diffusion Model For 3d Medical Image Translation: Application to Low-count Pet Reconstruction with Ct-less Attenuation Correction, by Tianqi Chen et al.
-
Summary of Mmworld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos, by Xuehai He et al.
-
Summary of Omnicorpus: a Unified Multimodal Corpus Of 10 Billion-level Images Interleaved with Text, by Qingyun Li et al.
-
Summary of Tailoring Generative Ai Chatbots For Multiethnic Communities in Disaster Preparedness Communication: Extending the Casa Paradigm, by Xinyan Zhao et al.
-
Summary of Awgunet: Attention-aided Wavelet Guided U-net For Nuclei Segmentation in Histopathology Images, by Ayush Roy et al.
-
Summary of Olmes: a Standard For Language Model Evaluations, by Yuling Gu et al.
-
Summary of Taste: Teaching Large Language Models to Translate Through Self-reflection, by Yutong Wang et al.
-
Summary of Next-generation Database Interfaces: a Survey Of Llm-based Text-to-sql, by Zijin Hong et al.
-
Summary of Magpie: Alignment Data Synthesis From Scratch by Prompting Aligned Llms with Nothing, by Zhangchen Xu et al.
-
Summary of Surprise! Using Physiological Stress For Allostatic Regulation Under the Active Inference Framework [pre-print], by Imran Khan and Robert Lowe
-
Summary of Asi As the New God: Technocratic Theocracy, by Tevfik Uyar
-
Summary of Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks, by Justin Zhao et al.
-
Summary of Tc-bench: Benchmarking Temporal Compositionality in Text-to-video and Image-to-video Generation, by Weixi Feng et al.
-
Summary of A Generative Marker Enhanced End-to-end Framework For Argument Mining, by Nilmadhab Das et al.
-
Summary of Reversing the Forget-retain Objectives: An Efficient Llm Unlearning Framework From Logit Difference, by Jiabao Ji et al.
-
Summary of Multi-agent Reinforcement Learning with Deep Networks For Diverse Q-vectors, by Zhenglong Luo et al.
-
Summary of Dynamic Stochastic Decoding Strategy For Open-domain Dialogue Generation, by Yiwei Li et al.
-
Summary of Let’s Go Real Talk: Spoken Dialogue Model For Face-to-face Conversation, by Se Jin Park et al.
-
Summary of Unveiling the Power Of Wavelets: a Wavelet-based Kolmogorov-arnold Network For Hyperspectral Image Classification, by Seyd Teymoor Seydi and Zavareh Bozorgasl and Hao Chen
-
Summary of Exploring Self-supervised Multi-view Contrastive Learning For Speech Emotion Recognition with Limited Annotations, by Bulat Khaertdinov et al.
-
Summary of Toward a Method to Generate Capability Ontologies From Natural Language Descriptions, by Luis Miguel Vieira Da Silva et al.
-
Summary of Designing a Dashboard For Transparency and Control Of Conversational Ai, by Yida Chen et al.
-
Summary of Efficient Adaptation in Mixed-motive Environments Via Hierarchical Opponent Modeling and Planning, by Yizhe Huang et al.
-
Summary of Openobj: Open-vocabulary Object-level Neural Radiance Fields with Fine-grained Understanding, by Yinan Deng et al.
-
Summary of Shacl2fol: An Fol Toolkit For Shacl Decision Problems, by Paolo Pareti
-
Summary of Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-language Models, by Shimin Chen et al.
-
Summary of Lvbench: An Extreme Long Video Understanding Benchmark, by Weihan Wang et al.
-
Summary of Austrotox: a Dataset For Target-based Austrian German Offensive Language Detection, by Pia Pachinger et al.
-
Summary of Multimodal Table Understanding, by Mingyu Zheng et al.
-
Summary of Supportiveness-based Knowledge Rewriting For Retrieval-augmented Language Modeling, by Zile Qiao et al.
-
Summary of Legend: Leveraging Representation Engineering to Annotate Safety Margin For Preference Datasets, by Duanyu Feng et al.
-
Summary of Continuous Fake Media Detection: Adapting Deepfake Detectors to New Generative Techniques, by Francesco Tassone et al.
-
Summary of Making Ai Intelligible: Philosophical Foundations, by Herman Cappelen and Josh Dever
-
Summary of Mobileagentbench: An Efficient and User-friendly Benchmark For Mobile Llm Agents, by Luyuan Wang et al.
-
Summary of Accessing Gpt-4 Level Mathematical Olympiad Solutions Via Monte Carlo Tree Self-refine with Llama-3 8b, by Di Zhang et al.
-
Summary of Textual Similarity As a Key Metric in Machine Translation Quality Estimation, by Kun Sun et al.
-
Summary of Open-llm-leaderboard: From Multi-choice to Open-style Questions For Llms Evaluation, Benchmark, and Arena, by Aidar Myrzakhan and Sondos Mahmoud Bsharat and Zhiqiang Shen
-
Summary of Neural Gaffer: Relighting Any Object Via Diffusion, by Haian Jin et al.
-
Summary of Cads: a Systematic Literature Review on the Challenges Of Abstractive Dialogue Summarization, by Frederic Kirstein et al.
-
Summary of Structured Active Inference (extended Abstract), by Toby St Clere Smithe
-
Summary of Commonsense-t2i Challenge: Can Text-to-image Generation Models Understand Commonsense?, by Xingyu Fu et al.
-
Summary of Situated Ground Truths: Enhancing Bias-aware Ai by Situating Data Labels with Situannotate, by Delfina Sol Martinez Pandiani and Valentina Presutti
-
Summary of Brainchat: Decoding Semantic Information From Fmri Using Vision-language Pretrained Models, by Wanaiu Huang
-
Summary of Mllmguard: a Multi-dimensional Safety Evaluation Suite For Multimodal Large Language Models, by Tianle Gu et al.
-
Summary of Modeling Sustainable Resource Management Using Active Inference, by Mahault Albarracin et al.
-
Summary of Test-time Fairness and Robustness in Large Language Models, by Leonardo Cotta and Chris J. Maddison
-
Summary of Cupid: Contextual Understanding Of Prompt-conditioned Image Distributions, by Yayan Zhao et al.
-
Summary of The Muse 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition, by Shahin Amiriparian et al.
-
Summary of Judging the Judges: a Systematic Study Of Position Bias in Llm-as-a-judge, by Lin Shi et al.
-
Summary of Making Task-oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests, by Amogh Mannekote et al.
-
Summary of Collective Constitutional Ai: Aligning a Language Model with Public Input, by Saffron Huang et al.
-
Summary of Are Large Language Models Good Statisticians?, by Yizhang Zhu et al.
-
Summary of Sense Less, Generate More: Pre-training Lidar Perception with Masked Autoencoders For Ultra-efficient 3d Sensing, by Sina Tayebati et al.
-
Summary of Sciriff: a Resource to Enhance Language Model Instruction-following Over Scientific Literature, by David Wadden et al.
-
Summary of Beyond Bare Queries: Open-vocabulary Object Grounding with 3d Scene Graph, by Sergey Linok et al.
-
Summary of T2s-gpt: Dynamic Vector Quantization For Autoregressive Sign Language Production From Text, by Aoxiong Yin et al.
-
Summary of Mining Frequent Structures in Conceptual Models, by Mattia Fumagalli et al.
-
Summary of Argus: Benchmarking and Enhancing Vision-language Models For 3d Radiology Report Generation, by Che Liu et al.
-
Summary of Scaling Large Language Model-based Multi-agent Collaboration, by Chen Qian et al.
-
Summary of Trustworthy and Practical Ai For Healthcare: a Guided Deferral System with Large Language Models, by Joshua Strong et al.
-
Summary of Needle in a Multimodal Haystack, by Weiyun Wang et al.
-
Summary of Merging Improves Self-critique Against Jailbreak Attacks, by Victor Gallego
-
Summary of Dual-reflect: Enhancing Large Language Models For Reflective Translation Through Dual Learning Feedback Mechanisms, by Andong Chen et al.
-
Summary of Improving Commonsense Bias Classification by Mitigating the Influence Of Demographic Terms, by Jinkyu Lee et al.
-
Summary of Scholarly Question Answering Using Large Language Models in the Nfdi4datascience Gateway, by Hamed Babaei Giglou et al.
-
Summary of Is One Gpu Enough? Pushing Image Generation at Higher-resolutions with Foundation Models, by Athanasios Tragakis et al.
-
Summary of Dca-bench: a Benchmark For Dataset Curation Agents, by Benhao Huang et al.
-
Summary of Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication, by Olaf Lipinski et al.
-
Summary of Unsupervised Object Detection with Theoretical Guarantees, by Marian Longa et al.
-
Summary of Can We Achieve High-quality Direct Speech-to-speech Translation Without Parallel Speech Data?, by Qingkai Fang et al.
-
Summary of Ctc-based Non-autoregressive Textless Speech-to-speech Translation, by Qingkai Fang et al.