Summary of GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving, by Jiaxin Zhang et al.
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving by Jiaxin Zhang, Zhongzhi Li,…
Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based evaluation using GPT-4 by…
Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem by Davor Hafnar, Jure Demšar. First submitted…
LLMAuditor: A Framework for Auditing Large Language Models Using Human-in-the-Loop by Maryam Amirizaniani, Jihan Yao, Adrian…
AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach by Maryam Amirizaniani, Elias Martin,…
Magic-Me: Identity-Specific Video Customized Diffusion by Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li,…
Entropy-regularized Point-based Value Iteration by Harrison Delecki, Marcell Vazquez-Chanlatte, Esen Yel, Kyle Wray, Tomer Arnon, Stefan…
HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation by Yihao Fang, Stephen…
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset by…
Mathematical Explanations by Joseph Y. Halpern. First submitted to arXiv on: 31 Dec 2023. Categories: Main: Artificial Intelligence (cs.AI); Secondary:…