Paper List

This list is very long, so we recommend using the search box.

Summary of Superhuman Performance Of a Large Language Model on the Reasoning Tasks Of a Physician, by Peter G. Brodeur et al.
Summary of Heterogeneous Graph Transformer For Multiple Tiny Object Tracking in Rgb-t Videos, by Qingyu Xu et al.
Summary of Tokens, the Oft-overlooked Appetizer: Large Language Models, the Distributional Hypothesis, and Meaning, by Julia Witte Zimmerman et al.
Summary of Llm-as-an-interviewer: Beyond Static Testing Through Dynamic Llm Evaluation, by Eunsu Kim et al.
Summary of Active Inference For Self-organizing Multi-llm Systems: a Bayesian Thermodynamic Approach to Adaptation, by Rithvik Prakki
Summary of Identifying and Manipulating Personality Traits in Llms Through Activation Engineering, by Rumi A. Allbert and James K. Wiles and Vlad Grankovsky
Summary of Gptdrawer: Enhancing Visual Synthesis Through Chatgpt, by Kun Li et al.
Summary of Nat-nl2gql: a Novel Multi-agent Framework For Translating Natural Language to Graph Query Language, by Yuanyuan Liang et al.
Summary of Imitate Before Detect: Aligning Machine Stylistic Preference For Machine-revised Text Detection, by Jiaqi Chen et al.
Summary of Coef-vq: Cost-efficient Video Quality Understanding Through a Cascaded Multimodal Llm Framework, by Xin Dong et al.
Summary of Multi-level Matching Network For Multimodal Entity Linking, by Zhiwei Hu et al.
Summary of Steganography in Game Actions, by Ching-chun Chang and Isao Echizen
Summary of Sweettok: Semantic-aware Spatial-temporal Tokenizer For Compact Video Discretization, by Zhentao Tan et al.
Summary of Unlocking Visual Secrets: Inverting Features with Diffusion Priors For Image Reconstruction, by Sai Qian Zhang et al.
Summary of Disentanglement and Compositionality Of Letter Identity and Letter Position in Variational Auto-encoder Vision Models, by Bruno Bianchi et al.
Summary of Geo-llava: a Large Multi-modal Model For Solving Geometry Math Problems with Meta In-context Learning, by Shihao Xu et al.
Summary of Fovealnet: Advancing Ai-driven Gaze Tracking Solutions For Optimized Foveated Rendering System Performance in Virtual Reality, by Wenxuan Liu et al.
Summary of Enriching Multimodal Sentiment Analysis Through Textual Emotional Descriptions Of Visual-audio Content, by Sheng Wu et al.
Summary of Automatic Detection, Positioning and Counting Of Grape Bunches Using Robots, by Xumin Gao
Summary of Vca: Video Curious Agent For Long Video Understanding, by Zeyuan Yang et al.
Summary of Crossvit-augmented Geospatial-intelligence Visualization System For Tracking Economic Development Dynamics, by Yanbing Bai et al.
Summary of Dynamic Entity-masked Graph Diffusion Model For Histopathological Image Representation Learning, by Zhenfeng Zhuang et al.
Summary of Svgbuilder: Component-based Colored Svg Generation with Text-guided Autoregressive Transformers, by Zehao Chen et al.
Summary of Wordvis: a Color Worth a Thousand Words, by Umar Khan et al.
Summary of Swifttry: Fast and Consistent Video Virtual Try-on with Diffusion Models, by Hung Nguyen et al.
Summary of Gaf: Gaussian Avatar Reconstruction From Monocular Videos Via Multi-view Diffusion, by Jiapeng Tang et al.
Summary of How Good Is My Story? Towards Quantitative Metrics For Evaluating Llm-generated Xai Narratives, by Timour Ichmoukhamedov et al.
Summary of Targeted Angular Reversal Of Weights (tars) For Knowledge Removal in Large Language Models, by Harry J. Davies et al.
Summary of Envisioning National Resources For Artificial Intelligence Research: Nsf Workshop Report, by Shantenu Jha and Yolanda Gil
Summary of Deepseek-vl2: Mixture-of-experts Vision-language Models For Advanced Multimodal Understanding, by Zhiyu Wu et al.
Summary of Brushedit: All-in-one Image Inpainting and Editing, by Yaowei Li et al.
Summary of Iris: Breaking Gui Complexity with Adaptive Focus and Self-refining, by Zhiqi Ge et al.
Summary of A Dual Contrastive Framework, by Yuan Sun et al.
Summary of Neural-symbolic Reasoning Over Knowledge Graphs: a Survey From a Query Perspective, by Lihui Liu et al.
Summary of Apollo: An Exploration Of Video Understanding in Large Multimodal Models, by Orr Zohar et al.
Summary of Evaluating Robustness Of Llms on Crisis-related Microblogs Across Events, Information Types, and Linguistic Features, by Muhammad Imran et al.
Summary of Tango: Training-free Embodied Ai Agents For Open-world Tasks, by Filippo Ziliotto et al.
Summary of Generative Adversarial Reviews: When Llms Become the Critic, by Nicolas Bougie and Narimasa Watanabe
Summary of Supermerge: An Approach For Gradient-based Model Merging, by Haoyu Yang et al.
Summary of Leveraging Audio and Text Modalities in Mental Health: a Study Of Llms Performance, by Abdelrahman A. Ali et al.
Summary of Autoprep: Natural Language Question-aware Data Preparation with a Multi-agent Framework, by Meihao Fan and Ju Fan and Nan Tang and Lei Cao and Guoliang Li and Xiaoyong Du
Summary of Constrained Decoding with Speculative Lookaheads, by Nishanth Nakshatri et al.
Summary of Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with Guidelinellm, by Shaoqing Zhang et al.
Summary of On Round-off Errors and Gaussian Blur in Superresolution and in Image Registration, by Serap A. Savari
Summary of Memory Layers at Scale, by Vincent-pierre Berges et al.
Summary of Learning Visually Grounded Domain Ontologies Via Embodied Conversation and Explanation, by Jonghyuk Park et al.
Summary of Semi-iin: Semi-supervised Intra-inter Modal Interaction Learning Network For Multimodal Sentiment Analysis, by Jinhao Lin et al.
Summary of Cp-detr: Concept Prompt Guide Detr Toward Stronger Universal Object Detection, by Qibo Chen et al.
Summary of Autopatent: a Multi-agent Framework For Automatic Patent Generation, by Qiyao Wang et al.
Summary of Meralion-audiollm: Bridging Audio and Language with Large Language Models, by Yingxu He et al.
Summary of B-vllm: a Vision Large Language Model with Balanced Spatio-temporal Tokens, by Zhuqiang Lu et al.
Summary of Enhancing Nursing and Elderly Care with Large Language Models: An Ai-driven Framework, by Qiao Sun et al.
Summary of Sumi-ifl: An Information-theoretic Framework For Image Forgery Localization with Sufficiency and Minimality Constraints, by Ziqi Sheng et al.
Summary of Small Language Model As Data Prospector For Large Language Model, by Shiwen Ni et al.
Summary of Visual Object Tracking Across Diverse Data Modalities: a Review, by Mengmeng Wang et al.
Summary of Tsgaussian: Semantic and Depth-guided Target-specific Gaussian Splatting From Sparse Views, by Liang Zhao et al.
Summary of Large Action Models: From Inception to Implementation, by Lu Wang et al.
Summary of Data Pruning Can Do More: a Comprehensive Data Pruning Approach For Object Re-identification, by Zi Yang et al.
Summary of Gaokao-eval: Does High Scores Truly Reflect Strong Capabilities in Llms?, by Zhikai Lei et al.
Summary of Retqa: a Large-scale Open-domain Tabular Question Answering Dataset For Real Estate Sector, by Zhensheng Wang et al.
Summary of Route: Robust Multitask Tuning and Collaboration For Text-to-sql, by Yang Qin et al.
Summary of Label-template Based Few-shot Text Classification with Contrastive Learning, by Guanghua Hou et al.
Summary of Vlr-bench: Multilingual Benchmark Dataset For Vision-language Retrieval Augmented Generation, by Hyeonseok Lim et al.
Summary of Beware Of Metacognitive Laziness: Effects Of Generative Artificial Intelligence on Learning Motivation, Processes, and Performance, by Yizhou Fan et al.
Summary of Benchmarking Llms For Mimicking Child-caregiver Language in Interaction, by Jing Liu et al.
Summary of Causal Graphical Models For Vision-language Compositional Understanding, by Fiorenzo Parascandolo et al.
Summary of Word Sense Linking: Disambiguating Outside the Sandbox, by Andrei Stefan Bejgu et al.
Summary of All You Need in Knowledge Distillation Is a Tailored Coordinate System, by Junjie Zhou et al.
Summary of Ai Predicts Agi: Leveraging Agi Forecasting and Peer Review to Explore Llms’ Complex Reasoning Capabilities, by Fabrizio Davide et al.
Summary of Ufo: Enhancing Diffusion-based Video Generation with a Uniform Frame Organizer, by Delong Liu et al.
Summary of Uncommon Belief in Rationality, by Qi Shi and Pavel Naumov
Summary of Imitate, Explore, and Self-improve: a Reproduction Report on Slow-thinking Reasoning Systems, by Yingqian Min et al.
Summary of Vision Transformers For Efficient Indoor Pathloss Radio Map Prediction, by Edvard Ghukasyan et al.
Summary of New Keypoint-based Approach For Recognising British Sign Language (bsl) From Sequences, by Oishi Deb and Kr Prajwal and Andrew Zisserman
Summary of The Parameters Of Educability, by Leslie G. Valiant
Summary of Efficient and Comprehensive Feature Extraction in Large Vision-language Model For Clinical Pathology Analysis, by Shengxuming Zhang et al.
Summary of Internlm-xcomposer2.5-omnilive: a Comprehensive Multimodal System For Long-term Streaming Video and Audio Interactions, by Pan Zhang et al.
Summary of Timerefine: Temporal Grounding with Time Refining Video Llm, by Xizi Wang et al.
Summary of Olympus: a Universal Task Router For Computer Vision Tasks, by Yuanze Lin et al.
Summary of Evaluation Agent: Efficient and Promptable Evaluation Framework For Visual Generative Models, by Fan Zhang and Shulin Tian and Ziqi Huang and Yu Qiao and Ziwei Liu
Summary of Bridging Ai and Science: Implications From a Large-scale Literature Analysis Of Ai4science, by Yutong Xie et al.
Summary of From Noise to Nuance: Advances in Deep Generative Image Models, by Benji Peng et al.
Summary of Systematic Analysis Of Llm Contributions to Planning: Solver, Verifier, Heuristic, by Haoming Li et al.
Summary of Is Contrastive Distillation Enough For Learning Comprehensive 3d Representations?, by Yifan Zhang et al.
Summary of What Makes Cryptic Crosswords Challenging For Llms?, by Abdelrahman Sadallah et al.
Summary of Shiksha: a Technical Domain Focused Translation Dataset and Model For Indian Languages, by Advait Joglekar and Srinivasan Umesh
Summary of Multi-task Learning with Llms For Implicit Sentiment Analysis: Data-level and Task-level Automatic Weight Learning, by Wenna Lai et al.
Summary of Motif Guided Graph Transformer with Combinatorial Skeleton Prototype Learning For Skeleton-based Person Re-identification, by Haocong Rao et al.
Summary of A Context-enhanced Framework For Sequential Graph Reasoning, by Shuo Shi et al.
Summary of Forest-of-thought: Scaling Test-time Compute For Enhancing Llm Reasoning, by Zhenni Bi et al.
Summary of Temporal Numeric Planning with Patterns, by Matteo Cardellini and Enrico Giunchiglia
Summary of Polyipa — Multilingual Phoneme-to-grapheme Conversion Model, by Davor Lauc
Summary of Goal-driven Query Answering Over First- and Second-order Dependencies with Equality, by Efthymia Tsamoura and Boris Motik
Summary of Foundation Models and Adaptive Feature Selection: a Synergistic Approach to Video Question Answering, by Sai Bhargav Rongali et al.
Summary of When Text Embedding Meets Large Language Model: a Comprehensive Survey, by Zhijie Nie et al.
Summary of Lmagent: a Large-scale Multimodal Agents Society For Multi-user Simulation, by Yijun Liu et al.
Summary of Vlms Meet Uda: Boosting Transferability Of Open Vocabulary Segmentation with Unsupervised Domain Adaptation, by Roberto Alcover-couso et al.
Summary of First Train to Generate, Then Generate to Train: Unitedsynt5 For Few-shot Nli, by Sourav Banerjee et al.
Summary of Speeding Up Approximate Map by Applying Domain Knowledge About Relevant Variables, by Johan Kwisthout and Andrew Schroeder
Summary of Towards Understanding the Robustness Of Llm-based Evaluations Under Perturbations, by Manav Chaudhary et al.