How to use this quantized model file in music generation?
#1 opened by LikeGiver
As far as I know, the input of the InspireMusic model is an embedding vector rather than text or token ids, which effectively ties the model to the transformers dependency. How should I actually deploy it in production using inference libraries like llama.cpp? Do you have any suggestions?
To be honest, I have no clue, but llama.cpp is primarily a library, so if none of the examples that ship with llama.cpp fit your use case, you'd have to write your own program on top of the llama.cpp libraries.
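For what it's worth, the distinction the question raises can be sketched in a few lines. In a standard text LLM, token ids are looked up in an embedding table before entering the transformer; an embedding-input model like the one described skips the lookup and consumes the vectors directly, which is why a token-id-oriented pipeline doesn't fit out of the box. (llama.cpp's `llama_batch` does, as far as I know, have an `embd` field for passing raw embeddings instead of tokens, so a custom program along those lines may be possible, but I haven't tried it.) This is a toy illustration only; the "model" below is a stand-in, not InspireMusic's real API:

```python
import random

DIM = 4    # toy embedding dimension
VOCAB = 8  # toy vocabulary size

random.seed(0)
# toy embedding table: one DIM-sized vector per token id
embed_table = [[random.random() for _ in range(DIM)] for _ in range(VOCAB)]

def decode_from_embeddings(embeddings):
    # embedding-input path: vectors go straight into the network;
    # summing each vector stands in for the real transformer forward pass
    return [sum(vec) for vec in embeddings]

def decode_from_token_ids(token_ids):
    # standard LLM path: ids are first looked up in the embedding table,
    # then handed to the same network
    return decode_from_embeddings([embed_table[t] for t in token_ids])

# the two paths agree when the embeddings come from the same table
ids = [1, 3, 5]
out_a = decode_from_token_ids(ids)
out_b = decode_from_embeddings([embed_table[t] for t in ids])
assert out_a == out_b
```

The practical upshot: a deployment path for this model has to accept precomputed embedding vectors at the model boundary, not just a tokenizer-plus-ids interface.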