Summary of Sparse Attention Decomposition Applied to Circuit Tracing, by Gabriel Franco et al.
Sparse Attention Decomposition Applied to Circuit Tracing, by Gabriel Franco, Mark Crovella. First submitted to arXiv on:…
Evaluating the fairness of task-adaptive pretraining on unlabeled test data before few-shot text classification, by Kush…
On The Planning Abilities of OpenAI’s o1 Models: Feasibility, Optimality, and Generalizability, by Kevin Wang, Junbo…
Can Models Learn Skill Composition from Examples?, by Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal,…
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations, by Ziyu Wang, Hao Li, Di…
Enhancing TinyBERT for Financial Sentiment Analysis Using GPT-Augmented FinBERT Distillation, by Graison Jos Thomas. First submitted to…
Cottention: Linear Transformers With Cosine Attention, by Gabriel Mongaras, Trevor Dohm, Eric C. Larson. First submitted to…
Experimental Evaluation of Machine Learning Models for Goal-oriented Customer Service Chatbot with Pipeline Architecture, by Nurul…
MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark, by Elliot L. Epstein, Kaisheng Yao, Jing Li,…
Severity Prediction in Mental Health: LLM-based Creation, Analysis, Evaluation of a Novel Multilingual Dataset, by Konstantinos…