## ⚠️ Notice on Current Model Scope

Please note that Yuuki, in its current state, represents approximately 3.7% of the total training planned for version v0.1.

At this stage, Yuuki should be considered an early and incomplete snapshot of the model. The full v0.1 release, which will include the remaining training stages, additional refinements, and stabilization, will be released at a later time.

As such, performance, behavior, or capability assessments based on the current version of Yuuki do not reflect the final characteristics of the v0.1 model.

Further updates will be provided as development progresses.

# 🌸 Yuuki v0.1 - The $0 Code LLM

⚠️ WORK IN PROGRESS - Currently training on mobile CPU (Day 3/42)

## 🎯 The Mission

Prove that you DON'T need expensive GPUs to train LLMs.

Yuuki is a code generation model trained entirely on a $150 Android phone with:

❌ No cloud compute

❌ No GPU

❌ No data center

✅ Just determination and time

## The Setup

- Hardware: Snapdragon 685 (8-core ARM CPU)
- RAM: 6GB
- Storage: 128GB
- NPU: Hexagon 686 (1 TOPS)
- GPU: Adreno 610 (243 GFLOPS) - NOT USED for training
- Cost: $0 in compute

## 📊 Current Status

| Metric | Value |
|--------|-------|
| Progress | 1,417 / 37,500 steps (3.78%) |
| Epoch | 0.08 / 2.0 |
| Current Loss | ~1.70 - 2.23 |
| Best Loss | 1.7053 ⭐ |
| Training Time | ~3 days |
| ETA | ~39 days remaining |
| Speed | ~100 sec/step |

### Loss Progression

- Step 0: Loss 3.35 (baseline)
- Step 500: Loss 2.50 (↓ -25%)
- Step 1000: Loss 2.00 (↓ -40%)
- Step 1265: Loss 1.83 (↓ -45%)
- Step 1292: Loss 1.71 (↓ -49%) ⭐ RECORD
- Step 1417: Loss 2.23 (current, oscillating 1.7-2.3)

## 🎓 What Yuuki Knows (So Far)

Due to the alphabetically ordered dataset:

| Language | Exposure | Quality | Status |
|----------|----------|---------|--------|
| Agda | High | 85/100 | ✅ Excellent |
| C | Starting | 30/100 | ⏳ Learning |
| Assembly | Low | 5/100 | 🌱 Minimal |
| Python | None | 0/100 | ❌ Not reached yet |

### Example Output (Step 1,300)

Agda prompt: `module Main where`

Model output:

```agda
module Main where (x, f) in a

open import Cubical.Sigma
open import Cubical.Sigma.Core
open import Cubical.Foundations.H
```

✅ Real Agda libraries! The model learned actual Cubical type theory modules.

πŸ› οΈ Training Configuration
Model: DistilGPT-2 (82M parameters)
Dataset: The Stack (75,000 examples)
Batch size: 1
Gradient accumulation: 4
Effective batch: 4
Learning rate: 5e-5
Max length: 256 tokens
Optimizer: AdamW
Epochs: 2
Total tokens: ~30M (2 epochs)
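
For readers who want to see how these settings fit together, here is a minimal sketch using the Hugging Face Trainer API. This is an assumption about the setup, not the project's published training script; the output directory, checkpoint interval, and the tiny in-memory stand-in for The Stack are illustrative only.

```python
# Minimal sketch of the configuration above (assumed, not the actual training script).
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token                   # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained("distilgpt2")  # ~82M parameters

# Toy stand-in for the ~75,000 examples from The Stack.
raw = Dataset.from_dict({"content": ["module Main where\n", "int main(void) { return 0; }\n"]})
train_dataset = raw.map(
    lambda batch: tokenizer(batch["content"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["content"],
)

args = TrainingArguments(
    output_dir="yuuki-v0.1",            # illustrative path
    per_device_train_batch_size=1,      # batch size 1 to fit in 6 GB RAM
    gradient_accumulation_steps=4,      # effective batch size 4
    learning_rate=5e-5,
    num_train_epochs=2,
    use_cpu=True,                       # no GPU/NPU acceleration
    save_steps=2500,                    # illustrative checkpoint interval
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The Trainer defaults to AdamW, matching the optimizer listed above.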

### Why so slow?

100 seconds/step × 37,500 steps = 3,750,000 seconds = 1,042 hours = 43.4 days, i.e. ~6 weeks of continuous training.

No GPU acceleration. Pure CPU grinding. 💪
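
The same back-of-the-envelope estimate in a couple of lines of Python, using the figures quoted above:

```python
# Rough ETA check: 100 s/step over 37,500 total steps.
sec_per_step, total_steps = 100, 37_500
total_sec = sec_per_step * total_steps   # 3,750,000 s
print(total_sec / 3600, "hours")         # ~1041.7 hours
print(total_sec / 86_400, "days")        # ~43.4 days
```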

## 📈 Roadmap

### v0.1 (Current - Proof of Concept)

- [x] Setup training pipeline
- [x] Start training (Step 0)
- [x] Reach Step 1,000
- [x] Break loss 2.0 barrier
- [x] Break loss 1.8 barrier ⭐
- [ ] Checkpoint 2,500 (7%)
- [ ] Checkpoint 5,000 (13%)
- [ ] Checkpoint 10,000 (27%)
- [ ] Checkpoint 18,750 (50% - Epoch 1 complete)
- [ ] Checkpoint 37,500 (100% - DONE)
- [ ] Quantize to INT8
- [ ] Convert to ONNX
- [ ] Publish final model

ETA: Mid-March 2026

### v0.2 (The Full Dataset)

- Dataset: 786,387 examples (full Stack)
- Duration: 418 days (~14 months)
- Epochs: 2.0
- Total tokens: ~314M
- Dataset fix: SHUFFLED (not alphabetical)
- Languages: All 80+ languages balanced
- Start: March 2026
- End: May 2027

### v0.3+ (PC Era)

- Hardware upgrade: RTX 4060/4070
- Larger models: 350M-1B parameters
- Faster training: ~30x speedup
- Advanced techniques: LoRA, QLoRA, etc.

## 💡 Philosophy

"The barrier to AI isn't money. It's mindset."

This project demonstrates:

✅ You CAN train LLMs without GPUs

✅ Patience > Hardware

✅ $0 budget is enough to start

✅ Limited resources inspire creativity

✅ Anyone can contribute to AI

## 🚀 Usage (After Training Completes)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model
model = AutoModelForCausalLM.from_pretrained("OpceanAI/Yuuki")
tokenizer = AutoTokenizer.from_pretrained("OpceanAI/Yuuki")

# Generate code
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
code = tokenizer.decode(outputs[0])
print(code)
```

### Quantized (4x faster, 4x smaller)

Coming after training completes:

```python
model = AutoModelForCausalLM.from_pretrained(
    "OpceanAI/Yuuki",
    subfolder="yuuki-v0.1-int8"
)
```
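
The INT8 and ONNX artifacts on the roadmap have not been produced yet, and the project's actual conversion pipeline is not published. The sketch below shows one plausible way to do both roadmap steps with optimum and ONNX Runtime; the file names and output paths are assumptions.

```python
# Sketch only (assumed tooling): export the FP32 checkpoint to ONNX,
# then apply dynamic INT8 weight quantization with ONNX Runtime.
from optimum.onnxruntime import ORTModelForCausalLM
from onnxruntime.quantization import QuantType, quantize_dynamic

# 1. Convert to ONNX (roadmap item "Convert to ONNX")
ort_model = ORTModelForCausalLM.from_pretrained("OpceanAI/Yuuki", export=True)
ort_model.save_pretrained("yuuki-v0.1-onnx")  # writes model.onnx (name may vary by optimum version)

# 2. Quantize weights to INT8 (~4x smaller than FP32)
quantize_dynamic(
    "yuuki-v0.1-onnx/model.onnx",
    "yuuki-v0.1-int8.onnx",
    weight_type=QuantType.QInt8,
)
```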

## ⚠️ Known Limitations

- Dataset order: Alphabetical (not shuffled) - learns early languages best
- Token count: Only ~30M tokens (vs GPT-2's 40B)
- Training speed: Very slow (~100 sec/step)
- Model size: Small (82M params)
- Language coverage: Incomplete due to alphabetical ordering

These will be addressed in v0.2 with a shuffled dataset.

## 🔬 Technical Details

CPU Training (100 sec/step):

- Forward pass: 40 sec
- Backward pass: 40 sec
- Optimizer step: 20 sec
- Total: ~100 sec

vs GPU Training (0.5 sec/step):

- 200x faster per step
- But costs $0.50-$2.00/hour
- Renting a GPU for the same 42-day window would be $500-$2,000

Mobile: FREE but SLOW

GPU: FAST but EXPENSIVE

For proof of concept: Mobile wins. 🏆
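
The dollar figure above is simply the hourly rental rate multiplied over the full 42-day window (the $0.50-$2.00/hour range is the assumed cloud GPU pricing quoted above):

```python
# GPU rental cost over the same 42-day window, at the rates quoted above.
hours = 42 * 24                        # 1,008 hours
low, high = 0.50, 2.00                 # assumed $/hour range
print(f"${hours * low:,.0f} - ${hours * high:,.0f}")   # $504 - $2,016
```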

## 📊 Benchmarks (Post-Training)

Coming soon after training completes (~March 2026). Expected performance:

- Agda: 85-95/100 (primary language)
- C: 85-92/100 (secondary language)
- Assembly: 75-85/100 (tertiary)
- Python: 10-20/100 (barely seen due to alphabetical ordering)

πŸ™ Acknowledgments

HuggingFace: Infrastructure and transformers library

BigCode: The Stack dataset

The ML community: For saying "you need GPUs" - best motivation 😏

## 📜 License

Apache 2.0 - See LICENSE file. You can use Yuuki commercially, modify it, distribute it. Just give credit. ✅

## 🔗 Links

- GitHub: https://github.com/aguitauwu
- Discord: https://discord.gg/j8zV2u8k
- Progress updates: Check this model card

## 📅 Updates

- 2026-01-29: Training started
- 2026-01-29: Step 1,000 reached - Loss 2.00
- 2026-01-29: Step 1,292 - NEW RECORD Loss 1.7053
- 2026-01-29: Repository created on HuggingFace

Last updated: 2026-01-29

Follow the journey of training an LLM with $0 budget. One step at a time. 🌸
