Summary of EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs, by Song Guo et al.
EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs by Song Guo, Fan Wu, Lei Zhang, Xiawu…
QuRating: Selecting High-Quality Data for Training Language Models by Alexander Wettig, Aatmik Gupta, Saumya Malik, Danqi…
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks by Jiwon Song, Kyungseok Oh,…
Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs by Xuandong Zhao, Lei Li, Yu-Xiang Wang. First…
Nevermind: Instruction Override and Moderation in Large Language Models by Edward Kim. First submitted to arXiv on:…
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging by Matteo Pagliardini, Amirkeivan Mohtashami, Francois…
Comparing Template-based and Template-free Language Model Probing by Sagi Shaier, Kevin Bennett, Lawrence E Hunter, Katharina…
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization by Coleman Hooper, Sehoon…
Convergence analysis of t-SNE as a gradient flow for point cloud on a manifold by Seonghyeon…
Dynamic Layer Tying for Parameter-Efficient Transformers by Tamir David Hay, Lior Wolf. First submitted to arXiv on:…