Summary of KV Prediction for Improved Time to First Token, by Maxwell Horton et al.
KV Prediction for Improved Time to First Token by Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi…
TeaserGen: Generating Teasers for Long Documentaries by Weihan Xu, Paul Pu Liang, Haven Kim, Julian McAuley,…
PixelBytes: Catching Unified Representation for Multimodal Generation by Fabien Furfaro. First submitted to arXiv on: 16 Sep…
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation by Liang Chen,…
Planning in Strawberry Fields: Evaluating and Improving the Planning and Scheduling Capabilities of LRM o1 by…
Auto-Demo Prompting: Leveraging Generated Outputs as Demonstrations for Enhanced Batch Prompting by Longyu Feng, Mengze Hong,…
MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone’s Potential with Masked Autoregressive Pretraining by Yunze Liu, Li Yi. First…
CONTESTS: a Framework for Consistency Testing of Span Probabilities in Language Models by Eitan Wagner, Yuli…
LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata by Jaime A. Berkovich, Markus J. Buehler. First…
RenderWorld: World Model with Self-Supervised 3D Label by Ziyang Yan, Wenzhen Dong, Yihua Shao, Yuhang Lu,…