Update README.md
Browse files
README.md
CHANGED
|
@@ -33,10 +33,11 @@ To build SmolLM-Instruct, we finetune the base models on publicly available data
|
|
| 33 |
|
| 34 |
## Changelog
|
| 35 |
|
|
|
|
| 36 |
|Release|Description|
|
| 37 |
|-|-|
|
| 38 |
-
|v0.1| Initial release of SmolLM-Instruct. We finetune on the permissive subset of the WebInstructSub dataset, combined with StarCoder2-Self-OSS-Instruct. Then, we perform DPO (Direct Preference Optimization) for one epoch on HelpSteer for the 135M and 1.7B models, and argilla/dpo-mix-7k for the 360M model.|
|
| 39 |
-
|v0.2| We changed the finetuning mix to datasets more suitable for smol models. We train on a new dataset of 2k simple everyday conversations we generated by llama3.1-70B [everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k/), [Magpie-Pro-300K-
|
| 40 |
|
| 41 |
## Usage
|
| 42 |
|
|
|
|
| 33 |
|
| 34 |
## Changelog
|
| 35 |
|
| 36 |
+
|
| 37 |
|Release|Description|
|
| 38 |
|-|-|
|
| 39 |
+
|v0.1| Initial release of SmolLM-Instruct. We finetune on the permissive subset of the [WebInstructSub](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub) dataset, combined with [StarCoder2-Self-OSS-Instruct](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k). Then, we perform DPO (Direct Preference Optimization) for one epoch on [HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer) for the 135M and 1.7B models, and [argilla/dpo-mix-7k](https://huggingface.co/datasets/argilla/dpo-mix-7k) for the 360M model.|
|
| 40 |
+
|v0.2| We changed the finetuning mix to datasets more suitable for smol models. We train on a new dataset of 2k simple everyday conversations we generated by llama3.1-70B [everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k/), [Magpie-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-300K-Filtered), [StarCoder2-Self-OSS-Instruct](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k), and a small subset of [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5)|
|
| 41 |
|
| 42 |
## Usage
|
| 43 |
|