Summary of Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models, by Tianwen Wei et al.
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models, by Tianwen Wei, Bo Zhu,…
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM, by Quandong Wang, Yuxuan Yuan,…
Graph Neural Network Enhanced Retrieval for Question Answering of LLMs, by Zijian Li, Qingyan Guo, Jiawei…
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models, by Ziqiang Liu, Feiteng Fang,…
Procrastination Is All You Need: Exponent Indexed Accumulators for Floating Point, Posits and Logarithmic Numbers, by…
BD-SAT: High-resolution Land Use Land Cover Dataset & Benchmark Results for Developing Division: Dhaka, BD, by…
TTM-RE: Memory-Augmented Document-Level Relation Extraction, by Chufan Gao, Xuan Wang, Jimeng Sun. First submitted to arXiv on:…
Hello Again! LLM-powered Personalized Agent for Long-term Dialogue, by Hao Li, Chenghao Yang, An Zhang, Yang…
Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024, by Jinwoo Ahn, Junhyeok Park,…
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model, by Yebin…