TiChi-OpMiner: LLaMA LoRA for Tibetan–Chinese Code-Mixed Opinion Mining
TiChi-OpMiner is a LoRA-fine-tuned LLaMA model for joint sentiment and stance prediction on
Tibetan–Chinese code-mixed texts.
Given a short code-mixed sentence, the model outputs:
- Sentiment: positive, neutral, or negative
- Stance: support, neutral, or oppose
The model was fine-tuned using LLaMA Factory with parameter-efficient training (LoRA).
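Because the model answers in free text, downstream code typically has to extract the two labels from its reply. The snippet below is a minimal parsing sketch, assuming the JSON-style answer format shown in the training prompt of Section 4; the helper name is illustrative.

```python
import json

# Minimal parsing sketch, assuming the JSON-style answer format shown in the
# training prompt of Section 4, e.g. {"sentiment": "positive", "stance": "support"}.
VALID_SENTIMENTS = {"positive", "neutral", "negative"}
VALID_STANCES = {"support", "neutral", "oppose"}

def parse_prediction(raw_reply: str) -> dict:
    """Extract and sanity-check the sentiment/stance labels from a raw model reply."""
    start, end = raw_reply.find("{"), raw_reply.rfind("}") + 1
    pred = json.loads(raw_reply[start:end])
    if pred.get("sentiment") not in VALID_SENTIMENTS or pred.get("stance") not in VALID_STANCES:
        raise ValueError(f"Unexpected labels in reply: {raw_reply!r}")
    return pred

print(parse_prediction('Answer: {"sentiment": "positive", "stance": "support"}'))
```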
1. Model Details
- Model name: your-username/tichi-opminer
- Base model: meta-llama/Llama-2-7b-chat-hf (decoder-only LLM)
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Frameworks: PyTorch, Transformers, LLaMA Factory
- Task type: joint opinion mining (sentiment + stance)
- Input languages: Tibetan (bo, Tibetan script) and Chinese (zh, simplified Chinese), with occasional English tokens / hashtags
- Output: natural language or JSON-style labels describing sentiment and stance
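Since the release is a LoRA adapter on top of the base LLaMA model, inference means loading the base weights and attaching the adapter. The sketch below uses Transformers and PEFT; the adapter repository id is the placeholder from this card, and the generation settings are illustrative, not the exact configuration used by the authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-2-7b-chat-hf"
ADAPTER = "your-username/tichi-opminer"  # placeholder adapter repo from this card

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER)  # attach the LoRA adapter
model.eval()

# Prompt mirrors the conceptual training prompt shown in Section 4.
prompt = (
    "You are an assistant that predicts sentiment and stance for "
    "Tibetan–Chinese code-mixed text.\n"
    "Text: <code-mixed sentence here>\n"
    'Please answer in JSON format:\n{"sentiment": ..., "stance": ...}'
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens (the model's answer).
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```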
2. Intended Use
Intended use
- Research on code-mixed NLP for low-resource languages
- Experiments on Tibetan–Chinese opinion mining (social media posts, comments, short messages)
- As a starting point for further fine-tuning on related tasks (e.g., Tibetan-only or Chinese-only sentiment / stance classification)
Not intended / out of scope
- High-stakes decision making (e.g., legal, medical, financial, political decisions)
- Use on very long or domain-mismatched documents (e.g., technical reports, legal contracts)
- Any deployment scenario where incorrect predictions could cause serious harm without additional human review
3. Training Data
The model is trained on a 100K-instance Tibetan–Chinese code-mixed corpus:
- Each instance is a short sentence that mixes Tibetan and Chinese.
- Every sentence is annotated with:
  - Sentiment: positive, neutral, negative
  - Stance: support, neutral, oppose
- The corpus was created with a template-based generation pipeline plus manual checking to ensure:
  - Natural code-mixing patterns
  - Fluent Chinese glosses
  - Consistent joint labels
- The label space is designed so that positive ↔ support, negative ↔ oppose, and neutral ↔ neutral.
If you also publish the dataset on Hugging Face, add a link here, e.g.
Dataset: your-username/tichi-opminer-dataset
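As a concrete illustration of the label alignment described above, here is a small validation sketch. The helper is hypothetical and not part of the released corpus tooling; it only encodes the sentiment ↔ stance mapping stated in this section.

```python
# Hypothetical helper illustrating the sentiment <-> stance alignment described
# above; it is not part of the released corpus pipeline.
SENTIMENT_TO_STANCE = {"positive": "support", "neutral": "neutral", "negative": "oppose"}

def is_consistent(instance: dict) -> bool:
    """Check that an instance's joint labels follow the intended alignment."""
    return SENTIMENT_TO_STANCE.get(instance["sentiment"]) == instance["stance"]

example = {
    "text": "<Tibetan–Chinese code-mixed sentence>",
    "sentiment": "positive",
    "stance": "support",
}
assert is_consistent(example)
```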
4. Training Procedure
- Fine-tuning framework: LLaMA Factory
- Method: LoRA adapters on top of the base LLaMA model
- Objective: instruction-style text generation; the model receives a prompt containing the code-mixed text and returns both sentiment and stance.
Example (conceptual) training prompt:
You are an assistant that predicts sentiment and stance for Tibetan–Chinese code-mixed text.
Text: ང་ཚོ今天ཡིན་ནསདགའ་བསུ་བྱེད་ཡོད།
Please answer in JSON format:
{"sentiment": ..., "stance": ...}