Trinity-Mini-DrugProt-Think
RLVR (GRPO) + LoRA post-training on Arcee Trinity Mini for DrugProt relation classification.

📝 Report | AWS deployment guide | GitHub

A LoRA adapter fine-tuned on Arcee Trinity Mini using GRPO (Group Relative Policy Optimization) for drug-protein relation extraction on the DrugProt (BioCreative VII) benchmark. The model classifies 13 types of drug-protein interactions from PubMed abstracts, producing structured pharmacological reasoning traces before giving its answer.

Model Details

Property	Value
Base Model	arcee-ai/Trinity-Mini
Architecture	Sparse MoE (26B total / 3B active)
Fine-tuning Method	LoRA (Low-Rank Adaptation)
Training Method	GRPO (Reinforcement Learning)
Training Data	maziyar/OpenMed_DrugProt
Task	Drug-protein relation extraction (13-way classification)
Trainable Parameters	LoRA rank=16, all projection layers
License	Apache 2.0

Training Configuration

Parameter	Value
LoRA Alpha (α)	64
LoRA Rank	16
Target Modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj + experts
Learning Rate	3e-6
Batch Size	128
Rollouts per Example	8
Max Generation Tokens	2048
Temperature	0.7
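
GRPO scores each rollout relative to the other rollouts sampled for the same example, which is why the configuration samples several (here 8) per example. The following is a minimal sketch of that group-relative normalization, assuming the common mean-and-standard-deviation form. The reward values and the function name are illustrative, and the trainer used for this adapter may differ in detail.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one example's rollout rewards against that group's own mean and std."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    # eps keeps the division defined when every rollout gets the same reward
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for the 8 rollouts of a single example (1.0 = correct answer).
rewards = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Rollouts that beat their group's average get a positive advantage and the rest get a negative one. A group in which every rollout earns the same reward contributes no learning signal under this form.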

Quick Start

Installation

pip install transformers peft torch accelerate

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model_id = "arcee-ai/Trinity-Mini"
adapter_id = "lokahq/Trinity-Mini-DrugProt-Think"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_id)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert biomedical relation extraction assistant. Your task is to identify the type of interaction between a drug/chemical and a gene/protein in biomedical text.\n\n"
            "For each question:\n"
            "1. First, wrap your detailed biomedical reasoning inside <think></think> tags\n"
            "2. Analyze the context around both entities to understand their relationship\n"
            "3. Consider the pharmacological and molecular mechanisms involved\n"
            "4. Then provide your final answer inside \\boxed{} using exactly one letter (A-M)\n\n"
            "The 13 DrugProt relation types are:\n"
            "A. INDIRECT-DOWNREGULATOR - Chemical indirectly decreases protein activity/expression\n"
            "B. INDIRECT-UPREGULATOR - Chemical indirectly increases protein activity/expression\n"
            "C. DIRECT-REGULATOR - Chemical directly regulates protein (mechanism unspecified)\n"
            "D. ACTIVATOR - Chemical activates the protein\n"
            "E. INHIBITOR - Chemical inhibits the protein\n"
            "F. AGONIST - Chemical acts as an agonist of the receptor/protein\n"
            "G. AGONIST-ACTIVATOR - Chemical is both agonist and activator\n"
            "H. AGONIST-INHIBITOR - Chemical is agonist but inhibits downstream effects\n"
            "I. ANTAGONIST - Chemical acts as an antagonist of the receptor/protein\n"
            "J. PRODUCT-OF - Chemical is a product of the enzyme\n"
            "K. SUBSTRATE - Chemical is a substrate of the enzyme\n"
            "L. SUBSTRATE_PRODUCT-OF - Chemical is both substrate and product\n"
            "M. PART-OF - Chemical is part of the protein complex\n\n"
            "Example format:\n"
            "<think>\n"
            "The text describes [chemical] and [protein]. Based on the context...\n"
            "- The phrase \"[relevant text]\" indicates that...\n"
            "- This suggests a [type] relationship because...\n"
            "</think>\n"
            "\\boxed{A}"
        )
    },
    {
        "role": "user",
        "content": (
            "Abstract: [PASTE PUBMED ABSTRACT HERE]\n\n"
            "Chemical entity: [DRUG NAME]\n"
            "Protein entity: [PROTEIN NAME]\n\n"
            "What is the relationship between the chemical and protein entities? "
            "Choose from: A) INDIRECT-DOWNREGULATOR B) INDIRECT-UPREGULATOR "
            "C) DIRECT-REGULATOR D) ACTIVATOR E) INHIBITOR F) AGONIST "
            "G) AGONIST-ACTIVATOR H) AGONIST-INHIBITOR I) ANTAGONIST "
            "J) PRODUCT-OF K) SUBSTRATE L) SUBSTRATE_PRODUCT-OF M) PART-OF\n\n"
            "Think step by step, then provide your answer in \\boxed{} format."
        )
    }
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature and top_p to take effect
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7, top_p=0.75)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
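
The decoded output contains a reasoning trace followed by a boxed letter, so downstream code usually needs the letter and its label rather than the raw text. The following is a minimal sketch of that parsing step, using the letter-to-label mapping from the system prompt above. The names LABELS and extract_answer are illustrative and not part of any library. The function assumes it receives only the newly generated text; the decoded output above also includes the prompt, whose example \boxed{A} would otherwise be picked up.

```python
import re

# Letter-to-label mapping copied from the system prompt.
LABELS = {
    "A": "INDIRECT-DOWNREGULATOR", "B": "INDIRECT-UPREGULATOR",
    "C": "DIRECT-REGULATOR", "D": "ACTIVATOR", "E": "INHIBITOR",
    "F": "AGONIST", "G": "AGONIST-ACTIVATOR", "H": "AGONIST-INHIBITOR",
    "I": "ANTAGONIST", "J": "PRODUCT-OF", "K": "SUBSTRATE",
    "L": "SUBSTRATE_PRODUCT-OF", "M": "PART-OF",
}

BOXED = re.compile(r"\\boxed\{\s*([A-M])\s*\}")


def extract_answer(completion):
    """Return (letter, label) for the final boxed answer, or None if there is none."""
    # Ignore anything boxed inside the reasoning trace.
    after_think = completion.rsplit("</think>", 1)[-1]
    matches = BOXED.findall(after_think)
    if not matches:
        return None
    letter = matches[-1]
    return letter, LABELS[letter]


# To pass only the completion, slice off the prompt tokens before decoding, e.g.:
# new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
# completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Returning None for a missing or malformed answer lets the caller decide whether to retry generation or count the example as unanswered.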

Training Progress

Training ran for ~100 steps on Prime Intellect infrastructure. Best accuracy reward reached ~0.83 during training.
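
An accuracy reward of this kind can be read as the fraction of rollouts whose final answer matches the gold label. This section does not describe the actual reward used in the training environment, so the following is only a sketch of one plausible binary form. The function names and the sample completions are hypothetical.

```python
import re


def accuracy_reward(completion, gold_letter):
    """Return 1.0 if the last boxed letter equals the gold letter, else 0.0."""
    found = re.findall(r"\\boxed\{\s*([A-M])\s*\}", completion)
    return 1.0 if found and found[-1] == gold_letter else 0.0


def mean_accuracy_reward(completions, gold_letters):
    """Average the binary reward over a batch of rollouts."""
    rewards = [accuracy_reward(c, g) for c, g in zip(completions, gold_letters)]
    return sum(rewards) / len(rewards)


# Hypothetical rollouts: two correct, one wrong letter, one with no boxed answer.
completions = ["\\boxed{E}", "\\boxed{A}", "no boxed answer", "\\boxed{K}"]
gold_letters = ["E", "E", "E", "K"]
batch_reward = mean_accuracy_reward(completions, gold_letters)
```

Under this reading, a value of ~0.83 would mean that roughly five in six sampled rollouts ended in the correct letter.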

Limitations

This is a LoRA adapter and requires the base model (arcee-ai/Trinity-Mini) to run
Evaluated only on examples held out from the training split; not yet benchmarked on the official DrugProt test set
Optimized specifically for 13-way DrugProt classification; may not generalize to other biomedical RE tasks

Citation

@misc{jakimovski2026drugprotrl,
  title        = {Post-Training an Open MoE Model to Extract Drug-Protein Relations: Trinity-Mini-DrugProt-Think},
  author       = {Jakimovski, Bojan and Kalinovski, Petar},
  year         = {2026},
  month        = feb,
  howpublished = {Blog post},
  url          = {https://github.com/LokaHQ/Trinity-Mini-DrugProt-Think}
}

Acknowledgements

Arcee AI for the Trinity Mini base model
Prime Intellect for training infrastructure
maziyar for the OpenMed DrugProt RL environment
Hugging Face for the PEFT library

Authors

Bojan Jakimovski · Petar Kalinovski · Loka
