
Summary of YuLan: An Open-source Large Language Model, by Yutao Zhu et al.


YuLan: An Open-source Large Language Model

by Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, Ji-Rong Wen

First submitted to arxiv on: 28 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (by the paper authors)
The paper's original abstract serves as the high difficulty summary and is available on arXiv.
Medium Difficulty Summary (by GrooveSquid.com, original content)
The paper presents YuLan, a series of open-source large language models (LLMs) with 12 billion parameters. The base model is pre-trained on approximately 1.7T tokens drawn from a diverse corpus of English, Chinese, and multilingual texts. A three-stage pre-training method is designed to enhance YuLan's overall capabilities, and subsequent phases incorporate instruction-tuning and human alignment using a substantial volume of high-quality synthesized data. Across these stages, the authors also devise a curriculum-learning framework that teaches complex and long-tail knowledge in an easy-to-hard manner. Training was completed in January 2024, and the resulting model achieves performance on par with state-of-the-art LLMs across various English and Chinese benchmarks.
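The core idea behind the easy-to-hard curriculum mentioned above is to order training data by some difficulty measure and feed the easier portion first. The sketch below is a minimal illustration of that general technique, not the paper's actual implementation: the `difficulty_score` heuristic (text length) and the three-way stage split are assumptions made here for clarity.

```python
# Minimal sketch of easy-to-hard curriculum ordering. This is NOT
# YuLan's actual pipeline; the difficulty heuristic and stage count
# are illustrative assumptions.

def difficulty_score(sample: str) -> float:
    """Toy proxy for difficulty: longer texts are treated as harder."""
    return len(sample.split())


def curriculum_stages(samples, n_stages=3):
    """Sort samples from easy to hard and split them into training stages."""
    ordered = sorted(samples, key=difficulty_score)
    stage_size = -(-len(ordered) // n_stages)  # ceiling division
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


corpus = [
    "short text",
    "a somewhat longer training example with more tokens",
    "medium length sample here",
]
stages = curriculum_stages(corpus, n_stages=3)
# Each stage would then be used for one phase of training,
# starting with the easiest bucket.
```

In a real pipeline the difficulty signal is typically far richer (e.g., model loss or data-quality scores rather than raw length), but the ordering-and-staging structure stays the same.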
Low Difficulty Summary (by GrooveSquid.com, original content)
YuLan is a new kind of language model that can understand and process natural language very well. Many people use models like this for things such as chatbots and language translation, but until now, few have shared how these models are actually built. This paper explains how YuLan was created from scratch. It's a big model with 12 billion parameters that can process huge amounts of text data. The team used a special training method that teaches it easy things before hard ones, and it performed just as well as other top language models on various tests.

Keywords

» Artificial intelligence  » Alignment  » Curriculum learning  » Instruction tuning  » Language model  » Translation