Isaac.N
Testing347
Recent Activity
upvoted a collection about 8 hours ago
Medical Datasets
reacted to MaziyarPanahi's post with 🔥 about 8 hours ago
🚨 Day 8/8: OpenMed Medical Reasoning Dataset Release - THE GRAND FINALE
Today I complete my 8-day release series with Medical-Reasoning-SFT-Mega.
It is the largest open medical reasoning dataset, combining outputs from 7 state-of-the-art AI models with fair-distribution deduplication.
THE 7 SOURCE MODELS (Original Sample Counts):
1. Trinity-Mini: 810,284 samples
2. Qwen3-Next-80B: 604,249 samples
3. GPT-OSS-120B: 506,150 samples
4. Nemotron-Nano-30B: 444,544 samples
5. GLM-4.5-Air: 225,179 samples
6. MiniMax-M2.1: 204,773 samples
7. Baichuan-M3-235B: 124,520 samples
TOTAL BEFORE DEDUPLICATION: 2,919,699 samples
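As a quick sanity check on the figures above, the per-model counts can be summed and turned into shares. This is only an illustrative sketch: the numbers are copied from the list, and the variable names (`counts`, `total`, `shares`) are hypothetical, not part of any OpenMed tooling.

```python
# Per-model sample counts, copied from the list above.
counts = {
    "Trinity-Mini": 810_284,
    "Qwen3-Next-80B": 604_249,
    "GPT-OSS-120B": 506_150,
    "Nemotron-Nano-30B": 444_544,
    "GLM-4.5-Air": 225_179,
    "MiniMax-M2.1": 204_773,
    "Baichuan-M3-235B": 124_520,
}

# Total before deduplication; matches the stated 2,919,699.
total = sum(counts.values())

# Fraction of the pre-deduplication pool contributed by each model.
shares = {name: n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name:>20}: {counts[name]:>9,} samples ({share:.1%})")
print(f"{'TOTAL':>20}: {total:>9,} samples")
```

The largest source contributes several times more samples than the smallest, which is presumably why some form of balancing is applied during deduplication.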
TOKEN COUNTS:
- Content tokens: 2.22 Billion
- Reasoning tokens: 1.56 Billion
- Total tokens: 3.78 Billion
- Samples with chain-of-thought: 100%
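The token figures can be cross-checked the same way. The sketch below assumes the rounded values quoted above (in billions) and uses hypothetical variable names; it only confirms that the two parts add up to the stated total and shows what fraction is reasoning.

```python
# Token counts in billions, as quoted above (rounded to two decimals).
content_b = 2.22
reasoning_b = 1.56

# Round to two decimals to avoid floating-point noise in the sum.
total_b = round(content_b + reasoning_b, 2)

# Fraction of all tokens that are reasoning (chain-of-thought) tokens.
reasoning_share = reasoning_b / total_b

print(f"Total tokens: {total_b:.2f} B")
print(f"Reasoning share: {reasoning_share:.1%}")
```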
Quick Start:
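```
# Requires the Hugging Face `datasets` library: pip install datasets
from datasets import load_dataset
# Download and load the combined dataset from the Hugging Face Hub
ds = load_dataset("OpenMed/Medical-Reasoning-SFT-Mega")
```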
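For a dataset of this size, it may be more practical to peek at a few rows without downloading everything first. The following is a minimal sketch, not part of the original post: `preview_rows` and `load_stream` are hypothetical helpers, and the split name `"train"` is an assumption that should be checked against the dataset card. The `streaming=True` option of `load_dataset` iterates over rows lazily.

```python
from itertools import islice


def preview_rows(rows, n=3):
    """Return the first n records from any iterable of rows."""
    return list(islice(rows, n))


def load_stream(repo_id="OpenMed/Medical-Reasoning-SFT-Mega", split="train"):
    """Open the dataset in streaming mode (needs network access).

    The split name "train" is an assumption; check the dataset card.
    """
    # Imported lazily so preview_rows works without the library installed.
    from datasets import load_dataset

    # streaming=True yields rows on demand instead of downloading all files.
    return load_dataset(repo_id, split=split, streaming=True)


# Example usage (requires network access and the `datasets` library):
#
#     for row in preview_rows(load_stream(), n=3):
#         print(sorted(row.keys()))
```

Printing only the column names is a cheap way to discover the schema before committing to a full download.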
All datasets are Apache 2.0 licensed and free for research and commercial use.
Thank you for following OpenMed's release series. I can't wait to see what you build. 🔥
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Mega
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B-V2
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Trinity-Mini
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-GLM_4.5_Air
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-MiniMax-M2.1
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Qwen3-Next-80B
https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B
https://huggingface.co/datasets/OpenMed/Medical-Reasonin
https://huggingface.co/collections/OpenMed/medical-datasets
upvoted an article 22 days ago
OpenMed: Six Months of Open-Source Medical AI and the Road Ahead