Paper List

We recommend you use the search box as this list is very long.

Summary of Digital Twin in Industries: a Comprehensive Survey, by Md Bokhtiar Al Zami et al.
Summary of An Ai-driven Data Mesh Architecture Enhancing Decision-making in Infrastructure Construction and Public Procurement, by Saurabh Mishra et al.
Summary of Twisted Convolutional Networks (tcns): Enhancing Feature Interactions For Non-spatial Data Classification, by Junbo Jacob Lian
Summary of Scratcheval: Are Gpt-4o Smarter Than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges, by Rao Fu et al.
Summary of Ustcctsu at Semeval-2024 Task 1: Reducing Anisotropy For Cross-lingual Semantic Textual Relatedness Task, by Jianjian Li et al.
Summary of Way to Specialist: Closing Loop Between Specialized Llm and Evolving Domain Knowledge Graph, by Yutong Zhang et al.
Summary of Mars-po: Multi-agent Reasoning System Preference Optimization, by Xiaoxuan Lou et al.
Summary of Msg Score: a Comprehensive Evaluation For Multi-scene Video Generation, by Daewon Yoon et al.
Summary of Objectrelator: Enabling Cross-view Object Relation Understanding in Ego-centric and Exo-centric Videos, by Yuqian Fu et al.
Summary of Hot3d: Hand and Object Tracking in 3d From Egocentric Multi-view Videos, by Prithviraj Banerjee et al.
Summary of Sowing Information: Cultivating Contextual Coherence with Mllms in Image Generation, by Yuhan Pei and Ruoyu Wang and Yongqi Yang and Ye Zhu and Olga Russakovsky and Yu Wu
Summary of Talking to Dino: Bridging Self-supervised Vision Backbones with Language For Open-vocabulary Segmentation, by Luca Barsellotti et al.
Summary of Integrating Transit Signal Priority Into Multi-agent Reinforcement Learning Based Traffic Signal Control, by Dickness Kakitahi Kwesiga et al.
Summary of Omulet: Orchestrating Multiple Tools For Practicable Conversational Recommendation, by Se-eun Yoon et al.
Summary of Beyond Surface Structure: a Causal Assessment Of Llms’ Comprehension Ability, by Yujin Han et al.
Summary of Tqa-bench: Evaluating Llms For Multi-table Question Answering with Scalable Context and Symbolic Extension, by Zipeng Qiu et al.
Summary of A Local Information Aggregation Based Multi-agent Reinforcement Learning For Robot Swarm Dynamic Task Allocation, by Yang Lv et al.
Summary of Knowledge Management For Automobile Failure Analysis Using Graph Rag, by Yuta Ojima et al.
Summary of Training Agents with Weakly Supervised Feedback From Large Language Models, by Dihong Gong et al.
Summary of Great: Geometry-intention Collaborative Inference For Open-vocabulary 3d Object Affordance Grounding, by Yawen Shao et al.
Summary of Chinesewebtext 2.0: Large-scale High-quality Chinese Web Text with Multi-dimensional and Fine-grained Information, by Wanyue Zhang et al.
Summary of Pddlfuse: a Tool For Generating Diverse Planning Domains, by Vedant Khandelwal et al.
Summary of Handling Irresolvable Conflicts in the Semantic Web: An Rdf-based Conflict-tolerant Version Of the Deontic Traditional Scheme, by Livio Robaldo and Gianluca Pozzato
Summary of Mvketr: Chest Ct Report Generation with Multi-view Perception and Knowledge Enhancement, by Xiwei Deng et al.
Summary of Helvipad: a Real-world Dataset For Omnidirectional Stereo Depth Estimation, by Mehdi Zayene et al.
Summary of Continual Learning in Machine Speech Chain Using Gradient Episodic Memory, by Geoffrey Tyndall et al.
Summary of Tryoffdiff: Virtual-try-off Via High-fidelity Garment Reconstruction Using Diffusion Models, by Riza Velioglu et al.
Summary of Gpt As Ghostwriter at the White House, by Jacques Savoy
Summary of Draft Model Knows When to Stop: a Self-verification Length Policy For Speculative Decoding, by Ziyin Zhang and Jiahao Xu and Tian Liang and Xingyu Chen and Zhiwei He and Rui Wang and Zhaopeng Tu
Summary of Is My Meeting Summary Good? Estimating Quality with a Multi-llm Evaluator, by Frederic Kirstein et al.
Summary of Weakly Supervised Framework Considering Multi-temporal Information For Large-scale Cropland Mapping with Satellite Imagery, by Yuze Wang et al.
Summary of Dspy-based Neural-symbolic Pipeline to Enhance Spatial Reasoning in Llms, by Rong Wang et al.
Summary of Cross-modal Information Flow in Multimodal Large Language Models, by Zhi Zhang et al.
Summary of Dhcp: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-language Models, by Yudong Zhang et al.
Summary of Scaleviz: Scaling Visualization Recommendation Models on Large Data, by Ghazi Shazan Ahmad et al.
Summary of Gaussianspeech: Audio-driven Gaussian Avatars, by Shivangi Aneja et al.
Summary of Generative Visual Communication in the Era Of Vision-language Models, by Yael Vinker
Summary of On the Effectiveness Of Incremental Training Of Large Language Models, by Miles Q. Li et al.
Summary of The Performance Of the Lstm-based Code Generated by Large Language Models (llms) in Forecasting Time Series Data, by Saroj Gopali et al.
Summary of Covis: a Collaborative Framework For Fine-grained Graphic Visual Understanding, by Xiaoyu Deng et al.
Summary of Devising a Set Of Compact and Explainable Spoken Language Feature For Screening Alzheimer’s Disease, by Junan Li et al.
Summary of Newsedits 2.0: Learning the Intentions Behind Updating News, by Alexander Spangher et al.
Summary of Ezsql: An Sql Intermediate Representation For Improving Sql-to-text Generation, by Meher Bhardwaj et al.
Summary of Arabic-nougat: Fine-tuning Vision Transformers For Arabic Ocr and Markdown Extraction, by Mohamed Rashad
Summary of Hoppr Medical-grade Platform For Medical Imaging Ai, by Kalina P. Slavkova et al.
Summary of Can Llms Plan Paths in the Real World?, by Wanyi Chen et al.
Summary of Evaluating Generative Ai-enhanced Content: a Conceptual Framework Using Qualitative, Quantitative, and Mixed-methods Approaches, by Saman Sarraf
Summary of A Novel Pareto-optimal Ranking Method For Comparing Multi-objective Optimization Algorithms, by Amin Ibrahim et al.
Summary of An End-to-end Two-stream Network Based on Rgb Flow and Representation Flow For Human Action Recognition, by Song-jiang Lai et al.
Summary of Vlm-hoi: Vision Language Models For Interpretable Human-object Interaction Analysis, by Donggoo Kang et al.
Summary of Personacraft: Personalized and Controllable Full-body Multi-human Scene Generation Using Occlusion-aware 3d-conditioned Diffusion, by Gwanghyun Kim et al.
Summary of Simulating Tabular Datasets Through Llms to Rapidly Explore Hypotheses About Real-world Entities, by Miguel Zabaleta et al.
Summary of Monopoly: Learning to Price Public Facilities For Revaluing Private Properties with Large-scale Urban Data, by Miao Fan et al.
Summary of Dumapper: Towards Automatic Verification Of Large-scale Pois with Street Views at Baidu Maps, by Miao Fan et al.
Summary of A Survey on Cutting-edge Relation Extraction Techniques Based on Language Models, by Jose A. Diaz-garcia and Julio Amador Diaz Lopez
Summary of Abductive Symbolic Solver on Abstraction and Reasoning Corpus, by Mintaek Lim et al.
Summary of From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects, by Zizhao Li et al.
Summary of Pdzseg: Adapting the Foundation Model For Dissection Zone Segmentation with Visual Prompts in Robot-assisted Endoscopic Submucosal Dissection, by Mengya Xu et al.
Summary of Timemarker: a Versatile Video-llm For Long and Short Video Understanding with Superior Temporal Localization Ability, by Shimin Chen et al.
Summary of Paths: a Hierarchical Transformer For Efficient Whole Slide Image Analysis, by Zak Buzzard et al.
Summary of Dependency-aware Cav Task Scheduling Via Diffusion-based Reinforcement Learning, by Xiang Cheng et al.
Summary of Thai Financial Domain Adaptation Of Thalle — Technical Report, by Kbtg Labs et al.
Summary of Large Language Model-brained Gui Agents: a Survey, by Chaoyun Zhang et al.
Summary of Buffer Anytime: Zero-shot Video Depth and Normal From Image Priors, by Zhengfei Kuang et al.
Summary of Semantic Data Augmentation For Long-tailed Facial Expression Recognition, by Zijian Li et al.
Summary of Heie: Mllm-based Hierarchical Explainable Aigc Image Implausibility Evaluator, by Fan Yang et al.
Summary of Refine: a Reward-based Framework For Interpretable and Nuanced Evaluation Of Radiology Report Generation, by Yunyi Liu et al.
Summary of Different Bias Under Different Criteria: Assessing Bias in Llms with a Fact-based Approach, by Changgeon Ko et al.
Summary of Towards Intention Recognition For Robotic Assistants Through Online Pomdp Planning, by Juan Carlos Saborio and Joachim Hertzberg
Summary of Fairness and Performance in Harmony: Data Debiasing Is All You Need, by Junhua Liu and Wendy Wan Yee Hui and Roy Ka-wei Lee and Kwan Hui Lim
Summary of Bpp-search: Enhancing Tree Of Thought Reasoning For Mathematical Modeling Problem Solving, by Teng Wang et al.
Summary of Can Llms Be Good Graph Judger For Knowledge Graph Construction?, by Haoyu Huang et al.
Summary of Advancing Uncertain Combinatorics Through Graphization, Hyperization, and Uncertainization: Fuzzy, Neutrosophic, Soft, Rough, and Beyond, by Takaaki Fujita
Summary of Spatially Visual Perception For End-to-end Robotic Learning, by Travis Davies et al.
Summary of Wf-vae: Enhancing Video Vae by Wavelet-driven Energy Flow For Latent Video Diffusion Model, by Zongjian Li and Bin Lin and Yang Ye and Liuhan Chen and Xinhua Cheng and Shenghai Yuan and Li Yuan
Summary of Showui: One Vision-language-action Model For Gui Visual Agent, by Kevin Qinghong Lin et al.
Summary of What’s in the Image? a Deep-dive Into the Vision Of Vision Language Models, by Omri Kaduri et al.
Summary of A Bilayer Segmentation-recombination Network For Accurate Segmentation Of Overlapping C. Elegans, by Mengqian Ding et al.
Summary of Stableanimator: High-quality Identity-preserving Human Image Animation, by Shuyuan Tu et al.
Summary of Mvboost: Boost 3d Reconstruction with Multi-view Refinement, by Xiangyu Liu et al.
Summary of Self-supervised Monocular Depth and Pose Estimation For Endoscopy with Generative Latent Priors, by Ziang Xu et al.
Summary of Uvcg: Leveraging Temporal Consistency For Universal Video Protection, by Kaizhou Li et al.
Summary of Svgdreamer++: Advancing Editability and Diversity in Text-guided Svg Generation, by Ximing Xing et al.
Summary of Unipose: a Unified Multimodal Framework For Human Pose Comprehension, Generation and Editing, by Yiheng Li et al.
Summary of Background-aware Defect Generation For Robust Industrial Anomaly Detection, by Youngjae Cho et al.
Summary of Gemex: a Large-scale, Groundable, and Explainable Medical Vqa Benchmark For Chest X-ray Diagnosis, by Bo Liu et al.
Summary of Magic-slam: Multi-agent Gaussian Globally Consistent Slam, by Vladimir Yugay et al.
Summary of What Can Llm Tell Us About Cities?, by Zhuoheng Li et al.
Summary of Enhancing Answer Reliability Through Inter-model Consensus Of Large Language Models, by Alireza Amiri-margavi et al.
Summary of Human Motion Instruction Tuning, by Lei Li and Sen Jia and Wang Jianhao and Zhongyu Jiang and Feng Zhou and Ju Dai and Tianfang Zhang and Wu Zongkai and Jenq-neng Hwang
Summary of Fine-tuning Llms with Noisy Data For Political Argument Generation and Post Guidance, by Svetlana Churina et al.
Summary of Augmenting Multimodal Llms with Self-reflective Tokens For Knowledge-based Visual Question Answering, by Federico Cocchi et al.
Summary of Boundless Socratic Learning with Language Games, by Tom Schaul
Summary of Harnessing Llms For Educational Content-driven Italian Crossword Generation, by Kamyar Zeinalipour et al.
Summary of Teaching Smaller Language Models to Generalise to Unseen Compositional Questions (full Thesis), by Tim Hartill
Summary of G3d-lf: Generalizable 3d-language Feature Fields For Embodied Tasks, by Zihan Wang et al.
Summary of Path-rag: Knowledge-guided Key Region Retrieval For Open-ended Pathology Visual Question Answering, by Awais Naeem et al.
Summary of Advancing Content Moderation: Evaluating Large Language Models For Detecting Sensitive Content Across Text, Images, and Videos, by Nouar Aldahoul et al.
Summary of Doge: Towards Versatile Visual Document Grounding and Referring, by Yinan Zhou et al.
Summary of Learning Monotonic Attention in Transducer For Streaming Generation, by Zhengrui Ma et al.