Why?
Why should I use yours vs unsloth's?
Thanks a lot for the question!
As we wrote in the Acknowledgements, our work is deeply inspired by many open-source authors, and we see them as our heroes and pioneers rather than competitors. In particular, we are very grateful to the unsloth team: their BF16 GGUF releases and imatrix files made it much easier for others (including us) to imagine and share additional quantization variants.
Regarding our quantizations: we tend to keep higher precision for the attention part (since the experts are the really “chubby” part, lol), and we also try to keep the quantization pattern as consistent as possible across layers. We believe this may be helpful for some edge-deployment scenarios and for people experimenting with custom acceleration; a small sketch of the idea follows.
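To make that concrete, here is a minimal Python sketch of the idea. It assumes llama.cpp-style GGUF tensor names (`blk.N.attn_q.weight`, `blk.N.ffn_down_exps.weight`, ...), and the specific quant types (`Q8_0`, `Q4_K`, `Q6_K`) are illustrative assumptions, not our exact recipe:

```python
# Illustrative sketch (NOT our exact recipe): choosing a per-tensor quant type
# for a MoE model, keeping attention higher precision than the expert weights
# and applying the same rule on every layer.

ATTN_KEYS = ("attn_q", "attn_k", "attn_v", "attn_output")

def pick_quant_type(tensor_name: str) -> str:
    """Return a quant type for one tensor; the rule ignores the layer index,
    so the resulting pattern is identical across all layers."""
    if any(key in tensor_name for key in ATTN_KEYS):
        return "Q8_0"   # attention is small next to the experts: keep it high precision
    if "ffn" in tensor_name and "exps" in tensor_name:
        return "Q4_K"   # expert weights dominate model size: quantize them harder
    return "Q6_K"       # assumed default for remaining weight tensors

# The same rule fires for layer 0 and layer 47 alike:
for name in ("blk.0.attn_q.weight", "blk.0.ffn_down_exps.weight",
             "blk.47.attn_q.weight", "blk.47.ffn_down_exps.weight"):
    print(name, "->", pick_quant_type(name))
```

Because the choice depends only on the tensor's role and never on its layer index, the pattern stays uniform from the first block to the last, which is part of what makes it easier to reason about for custom kernels or edge deployments.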
In the end, which quant to use is entirely your choice. We are happy to answer any questions you have and equally happy if our work is just one more option on your shortlist. Thank you again for your interest and for taking the time to compare different projects!