MiniCrit-7B: Adversarial AI Critique Model


Model Description

MiniCrit-7B is a specialized adversarial AI model trained to identify flawed reasoning in autonomous AI systems before that reasoning leads to catastrophic failures. Developed by Antagon Inc., MiniCrit acts as an AI "devil's advocate" that critiques trading rationales, detecting issues such as:

  • Overconfident predictions
  • Overfitting to historical patterns
  • Spurious correlations
  • Survivorship bias
  • Confirmation bias
  • Missing risk factors

Model Details

  • Developer: Antagon Inc. (CAGE: 17E75, UEI: KBSGT7CZ4AH3)
  • Base Model: Qwen/Qwen2-7B-Instruct
  • Method: LoRA (Low-Rank Adaptation)
  • Trainable Parameters: 40.4M (0.53% of 7.6B total)
  • Training Data: 11.7M critique examples
  • Training Hardware: NVIDIA H100 PCIe (80GB) via Lambda Labs GPU Grant
  • License: Apache 2.0

Training Details

Dataset

  • Size: 11,674,598 training examples
  • Format: Rationale → Critique pairs (a hypothetical record is sketched below)
  • Domain: Financial trading signals (stocks, options, crypto)
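
For illustration, a single training example might look like the record below. This is a hypothetical sketch: the actual field names and file format of the dataset have not been published.

{"rationale": "TSLA long: RSI oversold bounce with volume confirmation.",
 "critique": "An oversold RSI can stay oversold in a strong downtrend, and volume confirmation alone does not establish a reversal. The rationale also omits position sizing and downside risk."}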

Training Configuration

learning_rate: 2e-4
lr_scheduler: cosine
warmup_steps: 500
batch_size: 32 (effective)
max_sequence_length: 512
epochs: 1
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
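
For reference, the LoRA settings above map directly onto a peft LoraConfig. A minimal sketch follows; the task_type and any surrounding trainer setup are assumptions, since the actual training script has not been published.

from peft import LoraConfig

# LoRA hyperparameters as listed in the configuration above.
# task_type="CAUSAL_LM" is an assumption appropriate for a causal LM
# such as Qwen2-7B-Instruct.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)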

Training Progress

  • Steps Completed: 35,650 / 364,831 (9.8%)
  • Initial Loss: 1.8573
  • Final Loss: 0.7869
  • Loss Reduction: 57.6%
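
The loss reduction follows directly from the logged values: (1.8573 − 0.7869) / 1.8573 ≈ 0.576, i.e. roughly 57.6%.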

Usage

Installation

pip install transformers peft torch

Loading the Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Antagon/MiniCrit-7B")
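
Optionally, the adapter weights can be folded into the base model, which removes the PEFT wrapper overhead at inference time. This uses standard peft functionality and is not specific to this model:

# Optional: merge LoRA weights into the base model for faster inference
model = model.merge_and_unload()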

Inference

def critique_rationale(rationale: str) -> str:
    # Format the input with the Rationale/Critique prompt template
    # used during training.
    prompt = f"### Rationale:\n{rationale}\n\n### Critique:\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

    # Decode, then strip the prompt so only the generated critique remains.
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Critique:\n")[-1]

# Example
rationale = "AAPL long: MACD bullish crossover with supporting momentum."
critique = critique_rationale(rationale)
print(critique)

Example Output

Input: "META long: Bollinger Band expansion with supporting momentum."

Output: "While Bollinger Band expansion can signal volatility, META's recent 
expansion isn't necessarily predictive; it could be a reaction to news, not 
a precursor to sustained movement. Furthermore, relying solely on momentum 
without considering overbought/oversold levels may lead to premature entry, 
especially if the expansion is already near its peak."

Performance

Production Metrics (MiniCrit-1.5B)

Note: the figures below come from the earlier, smaller MiniCrit-1.5B model deployed in production, not from MiniCrit-7B itself.

  • False Signal Reduction: 35%
  • Sharpe Ratio Improvement: +0.28
  • Live Trades Processed: 38,000+

Training Metrics

  • Initial Loss: 1.8573
  • Final Loss: 0.7869
  • Loss Reduction: 57.6%
  • Gradient Norm (avg): 0.45

Intended Use

Primary Use Cases

  • Validating AI trading signals before execution
  • Identifying reasoning flaws in autonomous systems
  • Risk assessment for algorithmic trading
  • Quality assurance for AI-generated analysis

Out-of-Scope Uses

This model is NOT intended for:

  • Generating trading signals
  • Financial advice
  • Autonomous trading decisions

Limitations

  • Trained primarily on trading/finance domain
  • May not generalize well to other critique domains without fine-tuning
  • Checkpoint represents partial training (9.8% of planned steps)
  • Should be used as a supplement to human judgment, not a replacement

Citation

@misc{minicrit7b2026,
  title={MiniCrit-7B: Adversarial AI Critique for Trading Signal Validation},
  author={Ousley, William Alexander and Ousley, Jacqueline Villamor},
  year={2026},
  publisher={Antagon Inc.},
  url={https://huggingface.co/Antagon/MiniCrit-7B}
}

Contact

  • Company: Antagon Inc.
  • Website: antagon.ai
  • CAGE Code: 17E75
  • UEI: KBSGT7CZ4AH3

Acknowledgments

We gratefully acknowledge Lambda Labs for providing GPU compute through their Research Grant program. MiniCrit-7B was trained on Lambda's H100 infrastructure, and their support has been instrumental in advancing our AI safety research.

License

This model is released under the Apache 2.0 License.
