ZombitThai-GPT

A modern Thai language model with 128K token context support, built with advanced transformer architectures.

Model Details

  • Model Type: Transformer with RoPE and Grouped Query Attention
  • Context Length: 131,072 tokens (128K)
  • Architecture: Modern GPT with dynamic RoPE scaling
  • Language: Thai
  • License: Apache 2.0

Key Features

  • 🚀 Long Context: Supports up to 128K tokens with dynamic RoPE scaling
  • 🧠 Advanced Architecture: Grouped Query Attention, RMSNorm, SwiGLU activation
  • 🇹🇭 Thai Optimized: Specialized tokenizer with Thai syllable awareness
  • 💾 Memory Efficient: Optimized for long context processing
  • 🔧 HF Compatible: Full Hugging Face transformers integration
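The "dynamic RoPE scaling" mentioned above can be illustrated with a small sketch. This follows the common dynamic NTK-style approach (rescale the RoPE base when the sequence exceeds the trained length); the model's exact recipe is not documented here, so treat the function names and formula as illustrative assumptions:

```python
import numpy as np

def dynamic_rope_freqs(head_dim, seq_len, max_trained_len=131072, base=10000.0):
    """RoPE inverse frequencies with dynamic NTK-style base rescaling.
    Illustrative sketch, not the checkpoint's exact implementation."""
    if seq_len > max_trained_len:
        # Grow the base so rotation angles stay in the trained range.
        scale = seq_len / max_trained_len
        base = base * (scale ** (head_dim / (head_dim - 2)))
    return 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))

def apply_rope(x, pos, inv_freq):
    """Rotate one (head_dim,) query/key vector by position `pos`."""
    angles = pos * inv_freq                  # (head_dim/2,) rotation angles
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                # interleaved even/odd pairs
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out
```

Because each pair is a pure rotation, the vector's norm is preserved; only relative positions affect attention scores.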

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("JonusNattapong/ZombitThai-GPT-128K")
model = AutoModelForCausalLM.from_pretrained("JonusNattapong/ZombitThai-GPT-128K")

text = "สวัสดีครับ วันนี้อากาศดีมาก"  # "Hello, the weather is very nice today"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)  # cap newly generated tokens
result = tokenizer.decode(outputs[0], skip_special_tokens=True)

Training Details

  • Architecture: Custom transformer with modern optimizations
  • Position Encoding: RoPE with dynamic scaling for long contexts
  • Attention: Grouped Query Attention for efficiency
  • Normalization: RMSNorm instead of LayerNorm
  • Activation: SwiGLU for better performance
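The RMSNorm and SwiGLU components listed above are standard and can be sketched in a few lines. This is a minimal NumPy illustration of the two formulas; the weight names are hypothetical and do not correspond to the checkpoint's parameter names:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: rescale by the root mean square only -- no mean
    subtraction and no bias term, making it cheaper than LayerNorm."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

def swiglu(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: SiLU(x @ W_gate) * (x @ W_up), then a
    down-projection. Weight names are illustrative only."""
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))   # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down
```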

Performance

  • Supports sequences up to 131,072 tokens
  • Efficient memory usage with KV caching
  • Optimized for Thai language processing
  • Compatible with standard HF generation pipelines
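The KV caching mentioned above is what makes long-context generation tractable: each decoding step appends one new key/value pair instead of recomputing attention inputs for the whole prefix. A toy single-head sketch of the mechanism (not the model's internals):

```python
import numpy as np

class KVCache:
    """Toy single-head KV cache. Each call to attend() appends this
    step's key/value and attends over the full cached prefix."""
    def __init__(self, head_dim):
        self.keys = np.empty((0, head_dim))
        self.values = np.empty((0, head_dim))

    def attend(self, q, k, v):
        # Append the new key/value, then do scaled dot-product
        # attention of the current query against the whole cache.
        self.keys = np.vstack([self.keys, k[None, :]])
        self.values = np.vstack([self.values, v[None, :]])
        scores = self.keys @ q / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ self.values
```

In Hugging Face `generate()`, this corresponds to the default `use_cache=True` behavior; the cache trades memory (it grows linearly with sequence length) for avoiding quadratic recomputation per step.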

Limitations

  • Model weights are randomly initialized (not trained)
  • Requires significant computational resources for long contexts
  • Tuned specifically for Thai; quality on other languages is not guaranteed

Citation

@misc{zombit-thai-gpt,
  title={ZombitThai-GPT: Modern Thai Language Model with 128K Context},
  author={AI Assistant},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/JonusNattapong/ZombitThai-GPT-128K}
}