Summary of FlashMask: Efficient and Rich Mask Extension of FlashAttention, by Guoxia Wang et al.
FlashMask: Efficient and Rich Mask Extension of FlashAttention by Guoxia Wang, Jinle Zeng, Xiyuan Xiao, Siming…
A comprehensive study of on-device NLP applications – VQA, automated Form filling, Smart Replies for…
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE by Xun Zhu, Ying…
Evaluating the Performance and Robustness of LLMs in Materials Science Q&A and Property Predictions by Hongchen…
First Place Solution to the Multiple-choice Video QA Track of The Second Perception Test Challenge by…
TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning by Shivam Shandilya, Menglin Xia, Supriyo Ghosh,…
Iteration of Thought: Leveraging Inner Dialogue for Autonomous Large Language Model Reasoning by Santosh Kumar Radha,…
Familiarity-Aware Evidence Compression for Retrieval-Augmented Generation by Dongwon Jung, Qin Liu, Tenghao Huang, Ben Zhou, Muhao…
Sparks of Artificial General Intelligence (AGI) in Semiconductor Material Science: Early Explorations into the Next Frontier…
Finetuning Language Models to Emit Linguistic Expressions of Uncertainty by Arslan Chaudhry, Sridhar Thiagarajan, Dilan Gorur. First…