Summary of Enhancing RL Safety with Counterfactual LLM Reasoning, by Dennis Gross and Helge Spieker
Enhancing RL Safety with Counterfactual LLM Reasoning by Dennis Gross, Helge Spieker. First submitted to arXiv on:…
Symbolic Regression with a Learned Concept Library by Arya Grayeli, Atharva Sehgal, Omar Costilla-Reyes, Miles Cranmer,…
Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling by Jialu Tang, Tong Xia, Yuan…
Alignment with Preference Optimization Is All You Need for LLM Safety by Reda Alami, Ali Khalifa…
Can We Count on LLMs? The Fixed-Effect Fallacy and Claims of GPT-4 Capabilities by Thomas Ball,…
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio by…
MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding by Surbhi Madan,…
MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning by Jianyi Zhang, Hao Frank Yang, Ang Li,…
STLLM-DF: A Spatial-Temporal Large Language Model with Diffusion for Enhanced Multi-Mode Traffic System Forecasting by Zhiqi…
How Does Code Pretraining Affect Language Model Task Performance? by Jackson Petty, Sjoerd van Steenkiste, Tal…