Summary of MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection, by Xi Jiang et al.
MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection by Xi Jiang,…
Extended Japanese Commonsense Morality Dataset with Masked Token and Label Enhancement by Takumi Ohashi, Tsubasa Nakagawa,…
Transformer-based Language Models for Reasoning in the Description Logic ALCQ by Angelos Poulis, Eleni Tsalapati, Manolis…
SimpleStrat: Diversifying Language Model Generation with Stratification by Justin Wong, Yury Orlovskiy, Michael Luo, Sanjit A.…
Humanity in AI: Detecting the Personality of Large Language Models by Baohua Zhan, Yongyi Huang, Wenyao…
Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models by Wenting…
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models by Wenbo Hu, Jia-Chen Gu, Zi-Yi Dou, Mohsen Fayyaz,…
COMMA: A Communicative Multimodal Multi-Agent Benchmark by Timothy Ossowski, Jixuan Chen, Danyal Maqbool, Zefan Cai, Tyler…
GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps by Muhammad Umair…
Large Language Models as Code Executors: An Exploratory Study by Chenyang Lyu, Lecheng Yan, Rui Xing,…