---
license: openrail
datasets:
- ARTPARK-IISc/Vaani
- ai4bharat/Kathbath
- ai4bharat/Shrutilipi
language:
- hi
- en
metrics:
- accuracy
base_model:
- openai/whisper-medium
pipeline_tag: automatic-speech-recognition
tags:
- Hinglish
- Codeswitching
- whisper
- Speech-to-text
- Indic
- STT
---
# Shunya Labs Hinglish ASR Model

The only Hinglish code-switching speech-to-text (STT) model that generates transcripts with mixed Hindi and English tokens.

## Model Details

### Model Description

This is the first speech recognition model designed natively for Hinglish, the natural mix of Hindi and English commonly spoken across India. Unlike conventional approaches that force the transcript into a single language, this model generates mixed-language tokens directly, preserving how people actually speak.

- Base model: OpenAI Whisper Medium
- Post-trained by: Shunya Labs
- Language: Hinglish (Hindi-English code-switching)

## Why This Model?

Standard ASR models treat Hindi and English as separate languages and force the transcript into one or the other. This causes errors when speakers switch between languages mid-sentence, which is how millions of people actually talk.

This model was trained specifically on code-switched speech, so it:

- Transcribes Hindi and English tokens as they naturally occur
- Handles mid-sentence language switches accurately
- Runs inference faster by skipping a separate language-detection step
- Delivers higher accuracy on real-world Hinglish speech than single-language transcription
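
To make "Hindi and English tokens as they naturally occur" concrete, the sketch below tags each whitespace-separated token of a transcript by script. It assumes Hindi words are written in Devanagari and English words in Latin script, which this card does not state explicitly. The helper names `token_script` and `tag_tokens` and the sample sentence are hypothetical, and the sample was written by hand rather than produced by the model.

```python
def token_script(token: str) -> str:
    """Classify a token as 'hi', 'en', or 'other' by the script of its characters."""
    # Devanagari occupies the Unicode block U+0900 to U+097F.
    has_devanagari = any("\u0900" <= ch <= "\u097F" for ch in token)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_devanagari and not has_latin:
        return "hi"
    if has_latin and not has_devanagari:
        return "en"
    # Digits, punctuation, or tokens that mix both scripts.
    return "other"


def tag_tokens(transcript: str) -> list[tuple[str, str]]:
    """Split a transcript on whitespace and pair each token with its script tag."""
    return [(token, token_script(token)) for token in transcript.split()]


# Hypothetical transcript, written by hand for illustration (not model output).
sample = "मुझे कल meeting attend करनी है"
for token, script in tag_tokens(sample):
    print(f"{token}\t{script}")
```

A per-token tag like this can help spot-check whether a transcript keeps both languages instead of collapsing into one.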

### Demo

- Try the model at: https://huggingface.co/spaces/shunyalabs/Zero_STT_Hinglish_Shunya_Labs


### Use Cases

- Transcription of Hinglish conversations, podcasts, and videos
- Voice assistants serving Indian users
- Meeting transcription for Indian workplaces
- Content creation and subtitling


## How to Get Started with the Model

Use the code below to get started with the model.

```python
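from transformers import pipeline

# Load the model into a speech-recognition pipeline (weights download on first use).
transcriber = pipeline("automatic-speech-recognition", model="shunya-labs/hinglish-whisper-medium")

# Pass a path to an audio file; decoding formats such as MP3 requires ffmpeg.
result = transcriber("audio.mp3")
print(result["text"])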
```
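
For recordings longer than Whisper's 30-second window, and for the subtitling use case listed above, one possible approach is the `transformers` pipeline's chunked inference with timestamps. The sketch below is not an official recipe for this model. The helpers `format_timestamp`, `chunks_to_srt`, and `transcribe_to_srt` are hypothetical names, and the model ID is whatever you pass in. It assumes the pipeline returns `result["chunks"]` as a list of `{"timestamp": (start, end), "text": ...}` entries when called with `return_timestamps=True`.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks) -> str:
    """Convert pipeline chunks into SRT subtitle text."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # The final chunk may lack an end time; fall back to its start.
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, model_id: str) -> str:
    """Transcribe a long recording in 30-second windows and return SRT text."""
    # Imported lazily so the formatting helpers work without transformers installed.
    from transformers import pipeline

    transcriber = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # process long audio in 30-second windows
    )
    result = transcriber(audio_path, return_timestamps=True)
    return chunks_to_srt(result["chunks"])
```

Calling `transcribe_to_srt("audio.mp3", "shunya-labs/hinglish-whisper-medium")` would then return subtitle text that can be written to a `.srt` file. Timestamp accuracy on code-switched speech has not been verified here.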

## Training Details

### Training Data

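[openai/whisper-medium](https://huggingface.co/openai/whisper-medium) was post-trained on the Google Vaani dataset (`ARTPARK-IISc/Vaani`) and on proprietary datasets. This card's metadata also lists `ai4bharat/Kathbath` and `ai4bharat/Shrutilipi`.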