Summary of Benchmark Data Repositories For Better Benchmarking, by Rachel Longjohn et al.
Benchmark Data Repositories for Better Benchmarking, by Rachel Longjohn, Markelle Kelly, Sameer Singh, Padhraic Smyth. First submitted…
Matchmaker: Self-Improving Large Language Model Programs for Schema Matching, by Nabeel Seedat, Mihaela van der Schaar. First…
On Sampling Strategies for Spectral Model Sharding, by Denis Korzhenkov, Christos Louizos. First submitted to arXiv on:…
Directly Optimizing Explanations for Desired Properties, by Hiwot Belay Tadesse, Alihan Hüyük, Weiwei Pan, Finale Doshi-Velez. First…
‘No’ Matters: Out-of-Distribution Detection in Multimodality Long Dialogue, by Rena Gao, Xuetong Wu, Siwen Luo, Caren…
Failure Modes of LLMs for Causal Reasoning on Narratives, by Khurram Yamin, Shantanu Gupta, Gaurav R.…
DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis, by Hamidreza Eivazi, André Hebenbrock, Raphael…
RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner, by Fu-Chieh Chang, Yu-Ting Lee, Hui-Ying…
Neural Network Verification with PyRAT, by Augustin Lemesle, Julien Lehmann, Tristan Le Gall. First submitted to arXiv…
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training, by Atli Kosson, Bettina…