How to specify the output language?
#26, opened by dragonhunterau
It's great that parakeet v3 now supports multiple languages, but it randomly generates unexpected characters from other languages when the audio is pure English. Is there any parameter or token hint we can use to force it to generate tokens in a particular language?
Yeah, there seems to be a good bit of "language cross-contamination". On paper it may seem like a good idea to ignore language completely, but in practice it does not work very well. In use cases like dictation in Danish, for example, I sometimes get some Swedish words, then a few English words, and then it goes back to Danish.
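Not a way to force the language, but a rough post-hoc check can at least flag the "unexpected characters" case from the original question. The sketch below is an assumption-laden illustration, not part of the model or its toolkit: `script_of` and `unexpected_chars` are hypothetical helper names, and the script label is just the first word of the Unicode character name. It only catches characters from a different script (for example Cyrillic in an English transcript); it cannot catch same-script mixing such as Swedish or English words inside Danish output.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script label taken from the Unicode name, e.g. 'LATIN' or 'CYRILLIC'."""
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Character has no Unicode name.
        return "UNKNOWN"


def unexpected_chars(transcript: str, allowed_scripts=frozenset({"LATIN"})):
    """Return (index, char, script) for every letter outside the allowed scripts."""
    hits = []
    for i, ch in enumerate(transcript):
        if ch.isalpha():
            script = script_of(ch)
            if script not in allowed_scripts:
                hits.append((i, ch, script))
    return hits


if __name__ == "__main__":
    # Hypothetical transcript with Cyrillic letters mixed into English text.
    for index, char, script in unexpected_chars("hello мир"):
        print(index, char, script)
```

Danish letters such as æ, ø and å count as Latin here, so a Danish transcript passes unless another script shows up. A transcript that produces hits could be logged or re-run, depending on the use case.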