MegaMath is the largest open math pre-training dataset, curated from diverse, math-focused sources and containing over 300B tokens.