Command: ./llama-quantize --imatrix ./imatrix.gguf --output-tensor-type q5_k /Users/volkovolko/Documents/Scripts/Bash/llama.cpp/build/bin/gemma-3-270m-it/gemma-3-270M-it-F16.gguf q4_k_m 8

So this is a bit more optimised than a regular Q4_K_M quantization: the output tensor, which matters much more for quality, is kept at q5_k, and the quantization uses an importance matrix (imatrix).
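To make the roles of the arguments easier to see, here is a dry-run sketch that assembles the same kind of invocation from named parts and only prints it. The variable names and the short `MODEL_F16` path are placeholders of mine, not from the original command. Reading the trailing `8` as the thread count is my interpretation of the `llama-quantize` usage, and since no output path is given, the tool should fall back to its default output file name.

```shell
# Dry run: build the llama-quantize command line from named parts and print it.
# All variable names are illustrative; MODEL_F16 is a placeholder path.

IMATRIX="./imatrix.gguf"                # importance matrix used to guide quantization
OUTPUT_TENSOR_TYPE="q5_k"               # keep the output tensor at higher precision
MODEL_F16="./gemma-3-270M-it-F16.gguf"  # F16 source model (placeholder path)
QUANT_TYPE="q4_k_m"                     # target type for the remaining tensors
THREADS=8                               # assumed meaning of the trailing "8"

QUANTIZE_CMD="./llama-quantize --imatrix $IMATRIX --output-tensor-type $OUTPUT_TENSOR_TYPE $MODEL_F16 $QUANT_TYPE $THREADS"

# Print instead of executing, so the sketch runs without llama.cpp installed.
echo "$QUANTIZE_CMD"
```

Dropping the `--output-tensor-type` part from the string would give a plain Q4_K_M run with imatrix, which is the baseline this model is being compared against.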

Info: total time to quantize: 19636.21 ms (about 19.6 s)

(base) volkovolko@Volkos-MacBook-Air bin % ./llama-cli --version
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: loaded in 0.005 sec
ggml_metal_device_init: GPU name: Apple M1
ggml_metal_device_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_device_init: GPU family: MTLGPUFamilyMetal4 (5002)
ggml_metal_device_init: simdgroup reduction = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory = true
ggml_metal_device_init: has bfloat = true
ggml_metal_device_init: use residency sets = true
ggml_metal_device_init: use shared buffers = true
ggml_metal_device_init: recommendedMaxWorkingSetSize = 5726.63 MB
register_backend: registered backend Metal (1 devices)
register_device: registered device Metal (Apple M1)
register_backend: registered backend BLAS (1 devices)
register_device: registered device BLAS (Accelerate)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Apple M1)
version: 6582 (aa3ee0eb)
built with Apple clang version 17.0.0 (clang-1700.3.19.1) for arm64-apple-darwin25.0.0
