Explorer
- 0101 Xie - mHC- Manifold-Constrained Hyper-Connections.pdf
- 0102 Qiu - Why Low-Precision Transformer Training Fails- An Analysis on Flash Attention.pdf
- 0102 Zhang - On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models.pdf
- 0103 Zhang - Recursive Language Models.pdf
- 0108 Kim - Towards a Science of Scaling Agent Systems.pdf
- 0109 Zhou - How to Set the Batch Size for Large-Scale Pre-training?.pdf
- 0109 Zhou - How to Set the Learning Rate for Large-Scale Pre-training?.pdf
- 0125 Cai - Training-Free Group Relative Policy Optimization.pdf
- 0128 Tandon - End-to-End Test-Time Training for Long Context.pdf
- 0130 Hübotter - Reinforcement Learning via Self-Distillation.pdf
- 0130 Shenfeld - Self-Distillation Enables Continual Learning.pdf
- 0131 Gopalakrishnan - Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings.pdf
- 0131 Karami - Trellis- Learning to Compress Key-Value Memory in Attention Models.pdf
- 0131 Liu - Rethinking KL Regularization in RLHF- From Value Estimation to Gradient Optimization.pdf
- 0131 Marek - Small Batch Size Training for Language Models- When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful.pdf
- 0131 Scheibner - Large language models and the entropy of English.pdf
- 0131 Tan - Self-Improving Pretraining- using post-trained models to pretrain better models.pdf
- 0131 Zhang - Deep Delta Learning.pdf
- 0203 Kalra - A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMs.pdf
- 0204 Song - Expanding the Capabilities of Reinforcement Learning via Text Feedback.pdf
- 0205 Morris - Learning to Reason in 13 Parameters.pdf
- 0217 Janson - Stabilizing Native Low-Rank LLM Pretraining.pdf
- 0218 Krasheninnikov - Fresh in memory- Training-order recency is linearly encoded in language model activations.pdf
- 0218 Team - GLM-5- from Vibe Coding to Agentic Engineering.pdf
- 0218 Treutlein - Connecting the Dots- LLMs can Infer and Verbalize Latent Structure from Disparate Training Data.pdf
- 0223 Penaloza - Privileged Information Distillation for Language Models.pdf
- metadata.jsonl