ZombitThai-GPT
A modern Thai language model with 128K-token context support, built on an advanced transformer architecture.
Model Details
- Model Type: Transformer with RoPE and Grouped Query Attention
- Context Length: 131,072 tokens (128K)
- Architecture: Modern GPT with dynamic RoPE scaling
- Language: Thai
- License: Apache 2.0
Key Features
- 🚀 Long Context: Supports up to 128K tokens with dynamic RoPE scaling (see the configuration sketch after this list)
- 🧠 Advanced Architecture: Grouped Query Attention, RMSNorm, SwiGLU activation
- 🇹🇭 Thai Optimized: Specialized tokenizer with Thai syllable awareness
- ⚡ Memory Efficient: Optimized for long context processing
- 🔧 HF Compatible: Full Hugging Face transformers integration
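The snippet below sketches how dynamic RoPE scaling is commonly enabled through a Hugging Face config. It assumes a Llama-style `rope_scaling` config field; the exact config keys this model uses are an assumption, not something the model card confirms.

from transformers import AutoConfig, AutoModelForCausalLM

# Assumed Llama-style config field; this model's actual keys may differ.
config = AutoConfig.from_pretrained("JonusNattapong/ZombitThai-GPT-128K")
config.rope_scaling = {"type": "dynamic", "factor": 4.0}  # stretch RoPE beyond the base context

model = AutoModelForCausalLM.from_pretrained(
    "JonusNattapong/ZombitThai-GPT-128K",
    config=config,
)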
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("JonusNattapong/ZombitThai-GPT-128K")
model = AutoModelForCausalLM.from_pretrained("JonusNattapong/ZombitThai-GPT-128K")

# Encode a Thai prompt ("Hello! The weather is very nice today.")
text = "สวัสดีครับ วันนี้อากาศดีมาก"
inputs = tokenizer(text, return_tensors="pt")

# Generate a continuation; max_new_tokens bounds the generated length
outputs = model.generate(**inputs, max_new_tokens=100)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
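To see the Thai syllable-aware tokenization from the feature list in action, you can inspect the tokens directly (a minimal sketch; the actual token boundaries depend on the tokenizer's vocabulary):

# Inspect how the tokenizer segments Thai text ("Hello")
tokens = tokenizer.tokenize("สวัสดีครับ")
print(tokens)          # token strings; boundaries reflect the Thai-aware vocabulary
print(len(tokenizer))  # vocabulary size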
Training Details
- Architecture: Custom transformer with modern optimizations
- Position Encoding: RoPE with dynamic scaling for long contexts
- Attention: Grouped Query Attention (fewer key/value heads than query heads) for a smaller KV cache
- Normalization: RMSNorm in place of LayerNorm
- Activation: SwiGLU feed-forward blocks (minimal sketches of these components follow this list)
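For reference, here is a minimal PyTorch sketch of RMSNorm, a SwiGLU feed-forward block, and the key/value expansion step behind Grouped Query Attention. Module names and dimensions are illustrative assumptions, not this model's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    # Rescales by the root mean square of the features; no mean subtraction, no bias.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    # SiLU-gated feed-forward block, as used in Llama-style transformers.
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.gate_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.up_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.down_proj = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

def repeat_kv(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    # Grouped Query Attention shares each key/value head across n_rep query
    # heads; before the attention product, KV tensors are expanded to match
    # the query-head count. x: (batch, num_kv_heads, seq_len, head_dim).
    if n_rep == 1:
        return x
    return x.repeat_interleave(n_rep, dim=1)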
Performance
- Supports sequences up to 131,072 tokens
- Efficient memory usage with KV caching
- Optimized for Thai language processing
- Compatible with standard HF generation pipelines (see the example below)
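As a quick check of pipeline compatibility, the standard text-generation pipeline can load the model directly (generation settings here are illustrative):

from transformers import pipeline

# Standard transformers pipeline; downloads the model from the Hub
generator = pipeline("text-generation", model="JonusNattapong/ZombitThai-GPT-128K")
print(generator("สวัสดีครับ", max_new_tokens=50)[0]["generated_text"])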
Limitations
- Model weights are randomly initialized (the released checkpoint is untrained), so outputs are not meaningful without further training
- Long contexts require significant compute and memory
- The tokenizer and architecture target Thai; other languages are out of scope
Citation
@misc{zombit-thai-gpt,
  title={ZombitThai-GPT: Modern Thai Language Model with 128K Context},
  author={AI Assistant},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/JonusNattapong/ZombitThai-GPT-128K}
}