Paper List
-
Summary of Sketchinr: a First Look Into Sketches As Implicit Neural Representations, by Hmrishav Bandyopadhyay et al.
-
Summary of B-avibench: Towards Evaluating the Robustness Of Large Vision-language Model on Black-box Adversarial Visual-instructions, by Hao Zhang et al.
-
Summary of Heuristic Reasoning in Ai: Instrumental Use and Mimetic Absorption, by Anirban Mukherjee et al.
-
Summary of D3t: Distinctive Dual-domain Teacher Zigzagging Across Rgb-thermal Gap For Domain-adaptive Object Detection, by Dinh Phat Do et al.
-
Summary of A Multi-population Integrated Approach For Capacitated Location Routing, by Pengfei He et al.
-
Summary of Xcoop: Explainable Prompt Learning For Computer-aided Diagnosis Via Concept-guided Context Optimization, by Yequan Bie et al.
-
Summary of Mitigating Attribute Amplification in Counterfactual Image Generation, by Tian Xia et al.
-
Summary of 3d-scenedreamer: Text-driven 3d-consistent Scene Generation, by Frank Zhang et al.
-
Summary of Token Alignment Via Character Matching For Subword Completion, by Ben Athiwaratkun et al.
-
Summary of The Garden Of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models, by Carlo Nicolini et al.
-
Summary of Veagle: Advancements in Multimodal Representation Learning, by Rajat Chawla et al.
-
Summary of Leveraging Chat-based Large Vision Language Models For Multimodal Out-of-context Detection, by Fatma Shalabi et al.
-
Summary of Procedural Terrain Generation with Style Transfer, by Fabio Merizzi
-
Summary of Image-text Out-of-context Detection Using Synthetic Multimodal Misinformation, by Fatma Shalabi et al.
-
Summary of Noisediffusion: Correcting Noise For Image Interpolation with Diffusion Models Beyond Spherical Linear Interpolation, by Pengfei Zheng et al.
-
Summary of Tina: Think, Interaction, and Action Framework For Zero-shot Vision Language Navigation, by Dingbang Li et al.
-
Summary of Fuzzy Fault Trees Formalized, by Thi Kim Nhung Dang et al.
-
Summary of Cross-modal Learning Of Housing Quality in Amsterdam, by Alex Levering et al.
-
Summary of Slcf-net: Sequential Lidar-camera Fusion For Semantic Scene Completion Using a 3d Recurrent U-net, by Helin Cao et al.
-
Summary of Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images, by Giuseppe Cartella et al.
-
Summary of Pathm3: a Multimodal Multi-task Multiple Instance Learning Framework For Whole Slide Image Classification and Captioning, by Qifeng Zhou et al.
-
Summary of Using Deep Learning For Morphological Classification in Pigs with a Focus on Sanitary Monitoring, by Eduardo Bedin et al.
-
Summary of A Continued Pretrained Llm Approach For Automatic Medical Note Generation, by Dong Yuan et al.
-
Summary of Semiparametric Token-sequence Co-supervision, by Hyunji Lee et al.
-
Summary of Distribution and Depth-aware Transformers For 3d Human Mesh Recovery, by Jerrin Bright et al.
-
Summary of Unicode: Learning a Unified Codebook For Multimodal Large Language Models, by Sipeng Zheng et al.
-
Summary of Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models Via Generic Fact Guidance, by Kai Xiong et al.
-
Summary of Mcfend: a Multi-source Benchmark Dataset For Chinese Fake News Detection, by Yupeng Li et al.
-
Summary of Large Language Models Are Contrastive Reasoners, by Liang Yao
-
Summary of Coronetgan: Controlled Pruning Of Gans Via Hypernetworks, by Aman Kumar et al.
-
Summary of Efficient Prompt Tuning Of Large Vision-language Model For Fine-grained Ship Classification, by Long Lan et al.
-
Summary of Liqd: a Dynamic Liquid Level Detection Model Under Tricky Small Containers, by Yukun Ma et al.
-
Summary of Mastering Text, Code and Math Simultaneously Via Fusing Highly Specialized Language Models, by Ning Ding et al.
-
Summary of Hierarchical Auto-organizing System For Open-ended Multi-agent Navigation, by Zhonghan Zhao et al.
-
Summary of Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale, by Xiang Hu et al.
-
Summary of Streamingdialogue: Prolonged Dialogue Learning Via Long Context Compression with Minimal Losses, by Jia-nan Li et al.
-
Summary of Autoregressive Score Generation For Multi-trait Essay Scoring, by Heejin Do et al.
-
Summary of Specification Overfitting in Artificial Intelligence, by Benjamin Roth et al.
-
Summary of Language-driven Visual Consensus For Zero-shot Semantic Segmentation, by Zicheng Zhang et al.
-
Summary of Masked Generative Story Transformer with Character Guidance and Caption Augmentation, by Christos Papadimitriou et al.
-
Summary of Pig Aggression Classification Using Cnn, Transformers and Recurrent Networks, by Junior Silva Souza et al.
-
Summary of Sm4depth: Seamless Monocular Metric Depth Estimation Across Multiple Cameras and Scenes by One Model, by Yihao Liu, Feng Xue, Anlong Ming, Mingshuai Zhao, Huadong Ma, and Nicu Sebe
-
Summary of Generalizing Fairness to Generative Language Models Via Reformulation Of Non-discrimination Criteria, by Sara Sterlie et al.
-
Summary of Call Me When Necessary: Llms Can Efficiently and Faithfully Reason Over Structured Environments, by Sitao Cheng et al.
-
Summary of Medinsight: a Multi-source Context Augmentation Framework For Generating Patient-centric Medical Responses Using Large Language Models, by Subash Neupane et al.
-
Summary of Fastmac: Stochastic Spectral Sampling Of Correspondence Graph, by Yifei Zhang et al.
-
Summary of Finemath: a Fine-grained Mathematical Evaluation Benchmark For Chinese Large Language Models, by Yan Liu et al.
-
Summary of Beyond Memorization: the Challenge Of Random Memory Access in Language Models, by Tongyao Zhu et al.
-
Summary of Branch-train-mix: Mixing Expert Llms Into a Mixture-of-experts Llm, by Sainbayar Sukhbaatar et al.
-
Summary of Mope-clip: Structured Pruning For Efficient Vision-language Models with Module-wise Pruning Error Metric, by Haokun Lin et al.
-
Summary of Efficient Vision-and-language Pre-training with Text-relevant Image Patch Selection, by Wei Ye et al.
-
Summary of Seg-metrics: a Python Package to Compute Segmentation Metrics, by Jingnan Jia et al.
-
Summary of Mod-cl: Multi-label Object Detection with Constrained Loss, by Sota Moriyama et al.
-
Summary of Cross-modality Debiasing: Using Language to Mitigate Sub-population Shifts in Imaging, by Yijiang Pang et al.
-
Summary of Neural Slot Interpreters: Grounding Object Semantics in Emergent Slot Representations, by Bhishma Dedhia et al.
-
Summary of Aesopagent: Agent-driven Evolutionary System on Story-to-video Production, by Jiuniu Wang et al.
-
Summary of Worldgpt: a Sora-inspired Video Ai Agent As Rich World Models From Text and Image Inputs, by Deshun Yang et al.
-
Summary of Leveraging Llms For On-the-fly Instruction Guided Image Editing, by Rodrigo Santos et al.
-
Summary of Optimal Design and Implementation Of An Open-source Emulation Platform For User-centric Shared E-mobility Services, by Maqsood Hussain Shah et al.
-
Summary of Red Teaming Models For Hyperspectral Image Analysis Using Explainable Ai, by Vladimir Zaigrajew et al.
-
Summary of Lg-traj: Llm Guided Pedestrian Trajectory Prediction, by Pranav Singh Chib et al.
-
Summary of Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities Of Large Language Models in Hate Speech Detection, by Tharindu Kumarage et al.
-
Summary of Contextual Clarity: Generating Sentences with Transformer Models Using Context-reverso Data, by Ruslan Musaev
-
Summary of A Multimodal Intermediate Fusion Network with Manifold Learning For Stress Detection, by Morteza Bodaghi et al.
-
Summary of Lafs: Landmark-based Facial Self-supervised Learning For Face Recognition, by Zhonglin Sun et al.
-
Summary of Rethinking Loss Functions For Fact Verification, by Yuta Mukobara et al.
-
Summary of Navcot: Boosting Llm-based Vision-and-language Navigation Via Learning Disentangled Reasoning, by Bingqian Lin et al.
-
Summary of Gabor-guided Transformer For Single Image Deraining, by Sijin He et al.
-
Summary of Auxiliary Cyclegan-guidance For Task-aware Domain Translation From Duplex to Monoplex Ihc Images, by Nicolas Brieu et al.
-
Summary of From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios, by Guoshan Liu et al.
-
Summary of Complex Reasoning Over Logical Queries on Commonsense Knowledge Graphs, by Tianqing Fang et al.
-
Summary of Matrix-transformation Based Low-rank Adaptation (mtlora): a Brain-inspired Method For Parameter-efficient Fine-tuning, by Yao Liang et al.
-
Summary of Block-wise Lora: Revisiting Fine-grained Lora For Effective Personalization and Stylization in Text-to-image Generation, by Likun Li et al.
-
Summary of Relevance Score: a Landmark-like Heuristic For Planning, by Oliver Kim and Mohan Sridharan
-
Summary of An Improved Strategy For Blood Glucose Control Using Multi-step Deep Reinforcement Learning, by Weiwei Gu and Senquan Wang
-
Summary of Perennial Semantic Data Terms Of Use For Decentralized Web, by Rui Zhao et al.
-
Summary of Multiple Latent Space Mapping For Compressed Dark Image Enhancement, by Yi Zeng et al.
-
Summary of Hunting Attributes: Context Prototype-aware Learning For Weakly Supervised Semantic Segmentation, by Feilong Tang et al.
-
Summary of Annotations on a Budget: Leveraging Geo-data Similarity to Balance Model Performance and Annotation Cost, by Oana Ignat et al.
-
Summary of Large, Small or Both: a Novel Data Augmentation Framework Based on Language Models For Debiasing Opinion Summarization, by Yanyue Zhang et al.
-
Summary of Improving Reinforcement Learning From Human Feedback Using Contrastive Rewards, by Wei Shen et al.
-
Summary of Ssm Meets Video Diffusion Models: Efficient Long-term Video Generation with Structured State Spaces, by Yuta Oshima et al.
-
Summary of Multi-modal Auto-regressive Modeling Via Visual Words, by Tianshuo Peng et al.
-
Summary of Beyond Pixels: Enhancing Lime with Hierarchical Features and Segmentation Foundation Models, by Patrick Knab et al.
-
Summary of Transforming Competition Into Collaboration: the Revolutionary Role Of Multi-agent Systems and Language Models in Modern Organizations, by Carlos Jose Xavier Cruz
-
Summary of Uncertainty Quantification with Deep Ensembles For 6d Object Pose Estimation, by Kira Wursthorn et al.
-
Summary of Act-mnmt Auto-constriction Turning For Multilingual Neural Machine Translation, by Shaojie Dai et al.
-
Summary of An Image Is Worth 1/2 Tokens After Layer 2: Plug-and-play Inference Acceleration For Large Vision-language Models, by Liang Chen et al.
-
Summary of Genetic Learning For Designing Sim-to-real Data Augmentations, by Bram Vanherle et al.
-
Summary of Medical Image Synthesis Via Fine-grained Image-text Alignment and Anatomy-pathology Prompting, by Wenting Chen et al.
-
Summary of Ra-isf: Learning to Answer and Understand From Retrieval Augmentation Via Iterative Self-feedback, by Yanming Liu et al.
-
Summary of Exploring Large Language Models and Hierarchical Frameworks For Classification Of Large Unstructured Legal Documents, by Nishchal Prasad et al.
-
Summary of Exact Algorithms and Heuristics For Capacitated Covering Salesman Problems, by Lucas Porto Maziero et al.
-
Summary of On Globular T-spherical Fuzzy (g-tsf) Sets with Application to G-tsf Multi-criteria Group Decision-making, by Miin-shen Yang et al.
-
Summary of Mend: Meta Demonstration Distillation For Efficient and Effective In-context Learning, by Yichuan Li et al.
-
Summary of Lstm-based Text Generation: a Study on Historical Datasets, by Mustafa Abbas Hussein Hussein et al.