DeepFox-AI
  SeraphyneLab/DeepFox-base-prototype Text Generation • 0.1B • Updated Dec 1, 2025 • 10 • 3
Serayuki-AI
  SeraphyneLab/Serayuki-1B Text Generation • 2B • Updated Sep 11, 2025 • 6 • 4
  Vynie/Serayuki-1B-v1.1-pre2-step-3k-1B Text Generation • 2B • Updated Aug 31, 2025 • 1 • 2
Big Data
  Vynie/MyStarData-Medium-5M-LLaMA-3.1-Ins-Tokenize Viewer • Updated Sep 12, 2025 • 3.45M • 2
Paper
  Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts Paper • 2408.15664 • Published Aug 28, 2024 • 15