MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling Paper • 2602.11761 • Published 6 days ago • 6
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published 10 days ago • 65
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated Jul 10, 2025 • 217