Summary of Denoising with a Joint-embedding Predictive Architecture, by Dengsheng Chen et al.
Denoising with a Joint-Embedding Predictive Architecture by Dengsheng Chen, Jie Hu, Xiaoming Wei, Enhua Wu. First submitted…
Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation by Jie Xiao, Qianyi Huang,…
Demystifying the Token Dynamics of Deep Selective State Space Models by Thieu N Vo, Tung D.…
Compute Or Load KV Cache? Why Not Both? by Shuowei Jin, Xueshen Liu, Qingzhao Zhang, Z.…
UNComp: Uncertainty-Aware Long-Context Compressor for Efficient Large Language Model Inference by Jing Xiong, Jianghan Shen, Fanghua…
LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences by Zhenxiao Fu, Fan Chen, Shan Zhou,…
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation by Rohin Manvi,…
Three-in-One: Fast and Accurate Transducer for Hybrid-Autoregressive ASR by Hainan Xu, Travis M. Bartley, Vladimir Bataev,…
Integrative Decoding: Improve Factuality via Implicit Self-consistency by Yi Cheng, Xiao Liang, Yeyun Gong, Wen Xiao,…
ENTP: Encoder-only Next Token Prediction by Ethan Ewer, Daewon Chae, Thomas Zeng, Jinkyu Kim, Kangwook Lee. First…