---
title: 3B Thinking (vLLM + Controller)
emoji: 🏆
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: true
license: apache-2.0
---

This Space wraps `meta-llama/Llama-3.2-3B-Instruct` with a simple
**controller**: brainstorm (high T) → critic (low T) → finalize (low T).
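
The three-stage flow above could be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the `generate` callable, the prompt wording, and the temperature values `0.9` and `0.2` are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical backend hook: generate(prompt, temperature) -> completion text.
# In a real app this would call the model server; here it is injected.
Generate = Callable[[str, float], str]


def run_controller(
    question: str,
    generate: Generate,
    high_t: float = 0.9,  # assumed "high T" for brainstorming
    low_t: float = 0.2,   # assumed "low T" for critic and finalize
) -> str:
    """Run brainstorm -> critic -> finalize, feeding each stage into the next."""
    ideas = generate(
        f"Brainstorm several approaches to:\n{question}", high_t
    )
    critique = generate(
        f"Question:\n{question}\n\nIdeas:\n{ideas}\n\nCritique these ideas.",
        low_t,
    )
    return generate(
        f"Question:\n{question}\n\nIdeas:\n{ideas}\n\nCritique:\n{critique}\n\n"
        "Write the final answer.",
        low_t,
    )
```

Passing the backend in as a function keeps the control flow testable without a GPU; any sampler that accepts a prompt and a temperature can be plugged in.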

**Setup**
- Attach a GPU (T4 small is fine).
- Add a Space **Secret** `HF_TOKEN` so the app can pull gated weights.
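
Space secrets are exposed to the running app as environment variables, so the token can be read at startup. The helper below is a sketch under that assumption; the function name `get_hf_token` and the error message are illustrative and not taken from `app.py`.

```python
import os
from typing import Mapping


def get_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the HF_TOKEN secret, failing early with a clear message if absent."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a Space Secret to pull gated weights."
        )
    return token
```

Failing at startup makes a missing secret show up as one readable error instead of a download failure later on.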

**Notes**
- Uses the tokenizer's chat template for correct formatting.
- Private reasoning stays inside `<THINK>…</THINK>`; only `<FINAL>…</FINAL>` is shown to the user.
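
One way to keep the `<THINK>` block private is to post-process the raw completion before display. The sketch below uses regular expressions; the function name `visible_answer` and the fallback behavior when no `<FINAL>` tag is present are assumptions, not necessarily what the Space does.

```python
import re

# Non-greedy matches so multiple tagged blocks are handled independently.
FINAL_RE = re.compile(r"<FINAL>(.*?)</FINAL>", re.DOTALL)
THINK_RE = re.compile(r"<THINK>.*?</THINK>", re.DOTALL)


def visible_answer(raw: str) -> str:
    """Return only the text meant for the user."""
    match = FINAL_RE.search(raw)
    if match:
        return match.group(1).strip()
    # Assumed fallback: no FINAL tag, so drop THINK blocks and show the rest.
    return THINK_RE.sub("", raw).strip()
```

For example, `visible_answer("<THINK>2+2</THINK><FINAL> 4 </FINAL>")` returns `"4"`.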