Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context +6 Jul 23, 2024 • 239
Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR Paper • 2509.18174 • Published Sep 17 • 128
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 8 days ago • 224
Step-Audio-R1 Collection Step-Audio-R1 is the first audio language model to successfully unlock test-time compute scaling. • 3 items • Updated 17 days ago • 15
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 7 days ago • 38
Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated 9 days ago • 27
Article Building for an Open Future - our new partnership with Google Cloud 26 days ago • 46
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional at agentic intelligence • 5 items • Updated 24 days ago • 156