Summary of Counterfactual Token Generation in Large Language Models, by Ivi Chatzi et al.
Counterfactual Token Generation in Large Language Models by Ivi Chatzi, Nina Corvelo Benz, Eleni Straitouri, Stratis…
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering by…
Eagle: Efficient Training-Free Router for Multi-LLM Inference by Zesen Zhao, Shuowei Jin, Z. Morley Mao. First submitted…
Archon: An Architecture Search Framework for Inference-Time Techniques by Jon Saad-Falcon, Adrian Gamarra Lafuente, Shlok Natarajan,…
Block-Attention for Efficient RAG by East Sun, Yan Wang, Lan Tian. First submitted to arXiv on: 14…
Unlocking Memorization in Large Language Models with Dynamic Soft Prompting by Zhepeng Wang, Runxue Bao, Yawen…
You can remove GPT2’s LayerNorm by fine-tuning by Stefan Heimersheim. First submitted to arXiv on: 6 Sep…
Exploring Scaling Laws for Local SGD in Large Language Model Training by Qiaozhi He, Xiaomin Zhuang,…
Democratizing MLLMs in Healthcare: TinyLLaVA-Med for Efficient Healthcare Diagnostics in Resource-Constrained Settings by Aya El Mir,…
LOLA – An Open-Source Massively Multilingual Large Language Model by Nikit Srivastava, Denis Kuchelev, Tatiana Moteu…