Summary of Token-Budget-Aware LLM Reasoning, by Tingxu Han et al.
Token-Budget-Aware LLM Reasoning by Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen First…
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey by Liang Chen, Zekun Wang, Shuhuai Ren,…
RDPM: Solve Diffusion Probabilistic Models via Recurrent Token Prediction by Xiaoping Wu, Jie Hu, Xiaoming Wei First…
Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction by Ziyang Wu, Tianjiao Ding, Yifu Lu,…
Fast Gradient Computation for RoPE Attention in Almost Linear Time by Yifang Chen, Jiayan Huo, Xiaoyu…
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching by Enshu Liu, Xuefei…
Enhancing Item Tokenization for Generative Recommendation through Self-Improvement by Runjin Chen, Mingxuan Ju, Ngoc Bui, Dimosthenis…
Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers by Haoran You, Connelly Barnes,…
When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization by Vivek Ramanujan, Kushal Tirumala,…
HashEvict: A Pre-Attention KV Cache Eviction Strategy using Locality-Sensitive Hashing by Minghui Liu, Tahseen Rabbani, Tony…