---
base_model: openai/gpt-oss-20b
base_model_relation: merge
library_name: transformers
pipeline_tag: text-generation
tags:
- sft
- transformers
- trl
license: apache-2.0
language:
- en
- ko
---

# Vayne-V1

**Vayne-V1** is a **compact, efficient, and high-performance enterprise LLM** optimized for **AI agent frameworks**, **MCP-based tool orchestration**, **Retrieval-Augmented Generation (RAG) pipelines**, and **secure on-premise deployment**.

- ✅ Lightweight architecture for fast inference and low resource usage
- ⚙️ Seamless integration with modern AI agent frameworks
- 🔗 Built-in compatibility for MCP-based multi-tool orchestration
- 🔍 Optimized for enterprise-grade RAG systems
- 🛡️ Secure deployment in private or regulated environments

---

## Key Design Principles

| Feature | Description |
|---------|-------------|
| 🔐 Private AI Ready | Deploy fully **on-premise** or in **air-gapped** secure environments |
| ⚡ Lightweight Inference | **Single-GPU optimized** architecture for fast and cost-efficient deployment |
| 🧠 Enterprise Reasoning | Structured output and instruction-following for **business automation** |
| 🔧 Agent & MCP Native | Built for **AI agent frameworks** and **MCP-based tool orchestration** |
| 🔍 RAG Enhanced | Optimized for **retrieval workflows** with vector DBs (FAISS, Milvus, pgvector, etc.) |

---

## Model Architecture & Training

| Specification | Details |
|---------------|---------|
| 🧬 Base Model | GPT-OSS-20B |
| 🔢 Parameters | ~20B |
| 🎯 Precision | FP16 / BF16 |
| 🧱 Architecture | Decoder-only Transformer |
| 📏 Context Length | 4K tokens |
| ⚡ Inference | Single / Multi-GPU compatible |

### Training Data

Fine-tuned via supervised fine-tuning (SFT) on:

- Enterprise QA datasets
- Task reasoning + tool usage instructions
- RAG-style retrieval prompts
- Business reports & structured communication
- Korean–English bilingual QA and translation
- Synthetic instructions with safety curation

---

## Secure On-Premise Deployment

Vayne-V1 is built for **enterprise AI inside your firewall**.

✅ No external API dependency

✅ Compatible with **offline environments**

✅ Suitable for regulated and air-gapped deployments

---

## MCP (Model Context Protocol) Integration

Vayne-V1 supports **MCP-based agent tooling**, making it easy to build tool-using AI.

Works seamlessly with:

* Claude MCP-compatible agent systems
* Local agent runtimes
* JSON structured execution
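
The JSON structured execution pattern can be sketched as follows. The tool name, registry, and envelope shape here are illustrative assumptions, not a fixed Vayne-V1 or MCP schema; a real deployment would route calls through an MCP server.

```python
import json

# Hypothetical tool registry; names and signatures are illustrative only.
TOOLS = {
    "search_docs": lambda query: f"results for: {query}",
}

def dispatch_tool_call(raw: str) -> str:
    """Parse a JSON tool-call envelope emitted by the model and run the tool."""
    call = json.loads(raw)
    tool = TOOLS[call["tool"]]
    return tool(**call["arguments"])

# Example model output requesting a tool call:
raw_call = '{"tool": "search_docs", "arguments": {"query": "vacation policy"}}'
print(dispatch_tool_call(raw_call))  # results for: vacation policy
```

The agent loop would feed the returned string back to the model as a tool result before the next generation step.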

---

## RAG Compatibility

Designed for **hybrid reasoning + retrieval**.

✅ Works with FAISS, Chroma, Elasticsearch

✅ Handles long-context document QA

✅ Ideal for enterprise knowledge bases
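
A minimal sketch of the retrieve-then-prompt flow, using a toy bag-of-words similarity in place of a real embedding model and vector DB (both are stand-ins, not part of Vayne-V1):

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would use a sentence encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank corpus documents by similarity to the query and return the top k."""
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

corpus = [
    "Employees receive 15 days of paid vacation per year.",
    "The cafeteria opens at 8am on weekdays.",
]
context = retrieve("how many vacation days do I get", corpus)[0]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: How many vacation days?"
```

In production, `retrieve` would query FAISS, Chroma, or Elasticsearch, and `prompt` would be passed to the model as in the Quick Start below.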

---

## Quick Start

```bash
pip install transformers peft accelerate bitsandbytes
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "PoSTMEDIA/Vayne-V1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = "Explain the benefits of private AI for enterprise security."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# max_new_tokens bounds only the generated continuation;
# max_length would also count the prompt tokens.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Use Cases

✅ Internal enterprise AI assistant

✅ Private AI document analysis

✅ Business writing (reports, proposals, strategy)

✅ AI automation agents

✅ Secure RAG search systems

---

## Safety & Limitations

* Not intended for medical, legal, or financial decision-making
* May hallucinate; verify factual claims independently
* Use human validation for critical outputs
* Recommended: enable output guardrails for production
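
One lightweight form an output guardrail can take is post-generation redaction. The patterns below are illustrative placeholders, not a vetted production policy:

```python
import re

# Illustrative redaction rules; real deployments need a vetted policy.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-like identifiers
    re.compile(r"(?i)\bpassword\s*[:=]"),   # credential leakage markers
]

def guard_output(text: str) -> str:
    """Redact blocked patterns before model output leaves the system."""
    for pat in BLOCKED_PATTERNS:
        text = pat.sub("[REDACTED]", text)
    return text

print(guard_output("The password: hunter2 and SSN 123-45-6789."))
```

Dedicated guardrail frameworks add classification and policy layers on top of pattern matching like this.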

---

## Citation

```bibtex
@misc{vayne2025,
  title={Vayne-V1: Private On-Premise LLM Optimized for Agents and RAG},
  author={PoSTMEDIA AI Lab},
  year={2025},
  publisher={Hugging Face}
}
```

---

## Contact

**PoSTMEDIA AI Lab**

📧 [dev.postmedia@gmail.com](mailto:dev.postmedia@gmail.com)

🌐 [https://postmedia.ai](https://postmedia.ai)

🌐 [https://postmedia.co.kr](https://postmedia.co.kr)

---