AI & ML interests
Open Reasoning Datasets
A community effort to curate the best open post-training datasets.
We are currently working on OpenThoughts-Agent, a collaboration building the best open agent training datasets.
Our first project was curating open reasoning data recipes. OpenThoughts3, our best reasoning dataset recipe, is detailed in our release blog and the full paper.
About us
We are a team of researchers and engineers from Bespoke Labs, Stanford, University of California, Berkeley, University of Washington, UT Austin, Juelich Supercomputing Center (JSC), LAION, UCLA, UNC Chapel Hill, and Toyota Research Institute, united around building the best datasets (and thus the best models). See our previous work at datacomp.ai and mlfoundations.
Open Thoughts is supported by Bespoke Labs, Lambda Labs, NSF IFML, Juelich Supercomputing Center, UT Austin Machine Learning Lab, and Toyota Research Institute.
open-thoughts/OpenThinker3-7B
Text Generation • 8B • 7.2k • 130
open-thoughts/OpenThoughts3-1.2M
Viewer • 1.2M • 16k • 188
OpenThoughts: Data Recipes for Reasoning Models
Paper • 2506.04178 • 48
open-thoughts/OpenThinker3-1.5B
Text Generation • 2B • 732 • 12