Summary of OLMoE: Open Mixture-of-Experts Language Models, by Niklas Muennighoff et al.
OLMoE: Open Mixture-of-Experts Language Models, by Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, …