geodesic-research/sfm_unfiltered_cpt_misalignment_upsampled_think-DPO • Text Generation • 7B • Updated 2 days ago • 77
geodesic-research/sfm_unfiltered_e2e_misalignment_upsampled_think • Text Generation • 7B • Updated 3 days ago • 123