Summary of Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough, by Konstantin Dobler and Gerard de Melo