This is [StableLM 2 Chat 1.6B](https://huggingface.co/stabilityai/stablelm-2-1_6b-chat), quantized with the help of an importance matrix (imatrix) so that quality holds up better under quantization, with quantization levels available for lower-memory devices. [Kalomaze's "groups_merged.txt"](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) was used as the calibration data for the importance matrix, with the context set to 4,096 (the model's context length according to [their paper](https://drive.google.com/file/d/1JYJHszhS8EFChTbNAf8xmqhKjogWRrQF/view)).
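For reference, an imatrix-assisted quantization along these lines can be produced with llama.cpp's own tools. This is a sketch, not the exact commands used for this repo: the file names are placeholders, IQ2_M is just one example level, and the binaries are named `imatrix`/`quantize` in older llama.cpp builds.

```shell
# Build the importance matrix from the calibration text
# (Kalomaze's groups_merged.txt), with the 4,096-token context.
./llama-imatrix -m stablelm-2-1_6b-chat-f16.gguf \
    -f groups_merged.txt -c 4096 -o imatrix.dat

# Quantize using the importance matrix; IQ2_M shown as an example.
./llama-quantize --imatrix imatrix.dat \
    stablelm-2-1_6b-chat-f16.gguf \
    stablelm-2-1_6b-chat-IQ2_M.gguf IQ2_M
```

The imatrix records which weights matter most on the calibration text, which is what lets the very low-bit IQ1/IQ2 levels in the table below remain usable.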
Here's a chart that provides an approximation of the HellaSwag score (out of 1,000 tasks). Because the tasks are randomly sampled, the scores may be slightly imprecise:

|Quantization|HellaSwag|
|------------|---------|
|IQ1_S       |35.4%    |
|IQ1_M       |38.7%    |
|IQ2_XXS     |51.2%    |
|IQ2_XS      |51.8%    |
|IQ2_S       |56.8%    |
|IQ2_M       |59.3%    |
|Q2_K_S      |55.2%    |
|Q2_K        |59.0%    |
|IQ3_XXS     |60.8%    |
|Q4_0        |64.0%    |
|Q4_K_M      |66.0%    |
|Q5_K_M      |65.8%    |

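To get a feel for how imprecise a score over 1,000 randomly sampled tasks can be, here is a quick binomial standard-error estimate (my own illustration, not part of the original evaluation):

```python
import math

def hellaswag_stderr(accuracy: float, n_tasks: int = 1000) -> float:
    """Binomial standard error of an accuracy measured over n_tasks items."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_tasks)

# e.g. for the Q4_K_M score of 66.0% over 1,000 tasks:
se = hellaswag_stderr(0.66)
print(f"+/-{se:.1%}")  # roughly +/-1.5% at one standard error
```

At that noise level, small gaps such as Q4_K_M (66.0%) vs. Q5_K_M (65.8%) are within measurement error rather than a real quality difference.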
Original model card below.
***