Summary of "Is It a Free Lunch for Removing Outliers during Pretraining?", by Baohao Liao et al.
Is It a Free Lunch for Removing Outliers during Pretraining?
by Baohao Liao, Christof Monz
First submitted…