Datasets
cais/mmlu • Updated Mar 8, 2024 • 231k • 275k • 623
nvidia/OpenMathInstruct-1 • Updated Feb 16, 2024 • 6.08M • 2.33k • 244
microsoft/orca-math-word-problems-200k • Updated Mar 4, 2024 • 200k • 7.11k • 468
meta-math/MetaMathQA • Updated Dec 21, 2023 • 395k • 11.3k • 433
Papers
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection • Paper 2403.03507 • Published Mar 6, 2024 • 189
Math
microsoft/orca-math-word-problems-200k • Updated Mar 4, 2024 • 200k • 7.11k • 468
cais/mmlu • Updated Mar 8, 2024 • 231k • 275k • 623
nvidia/OpenMathInstruct-1 • Updated Feb 16, 2024 • 6.08M • 2.33k • 244
meta-math/MetaMathQA • Updated Dec 21, 2023 • 395k • 11.3k • 433