Trinity-Mini-DrugProt-Think
RLVR (GRPO) + LoRA post-training on Arcee Trinity Mini for DrugProt relation classification.
📝 Report | AWS deployment guide | GitHub
A LoRA adapter fine-tuned on Arcee Trinity Mini using GRPO (Group Relative Policy Optimization) for drug-protein relation extraction on the DrugProt (BioCreative VII) benchmark. The model classifies 13 types of drug-protein interactions from PubMed abstracts, producing structured pharmacological reasoning traces before giving its answer.
Model Details
| Property | Value |
|---|---|
| Base Model | arcee-ai/Trinity-Mini |
| Architecture | Sparse MoE (26B total / 3B active) |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| Training Method | GRPO (Reinforcement Learning) |
| Training Data | maziyar/OpenMed_DrugProt |
| Task | Drug-protein relation extraction (13-way classification) |
| Trainable Parameters | LoRA rank=16, all projection layers |
| License | Apache 2.0 |
Training Configuration
| Parameter | Value |
|---|---|
| LoRA Alpha (α) | 64 |
| LoRA Rank | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj + experts |
| Learning Rate | 3e-6 |
| Batch Size | 128 |
| Rollouts per Example | 8 |
| Max Generation Tokens | 2048 |
| Temperature | 0.7 |
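For orientation, under the standard LoRA formulation the adapted weight is W + (α / r)·B·A, so the α = 64 and rank = 16 listed above correspond to a scaling factor of 4. The minimal sketch below computes that factor and the number of trainable parameters a rank-16 adapter adds to one linear layer. The layer dimensions are hypothetical placeholders chosen for illustration, not Trinity Mini's actual sizes, and the helper names are not part of any library.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


ALPHA = 64   # from the Training Configuration table
RANK = 16    # from the Training Configuration table

# Hypothetical layer size, for illustration only.
D_IN, D_OUT = 2048, 2048

scaling = lora_scaling(ALPHA, RANK)
added = lora_param_count(D_IN, D_OUT, RANK)
full = D_IN * D_OUT

print(f"scaling factor: {scaling}")
print(f"adapter params for one {D_IN}x{D_OUT} layer: {added}")
print(f"fraction of the full layer: {added / full:.4%}")
```

The point of the arithmetic is that the adapter trains a small fraction of each targeted layer while the base weights stay frozen.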
Quick Start
Installation
pip install transformers peft torch accelerate
Usage
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "arcee-ai/Trinity-Mini"
adapter_id = "lokahq/Trinity-Mini-DrugProt-Think"

# Load the tokenizer and the base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert biomedical relation extraction assistant. Your task is to identify the type of interaction between a drug/chemical and a gene/protein in biomedical text.\n\n"
            "For each question:\n"
            "1. First, wrap your detailed biomedical reasoning inside <think></think> tags\n"
            "2. Analyze the context around both entities to understand their relationship\n"
            "3. Consider the pharmacological and molecular mechanisms involved\n"
            "4. Then provide your final answer inside \\boxed{} using exactly one letter (A-M)\n\n"
            "The 13 DrugProt relation types are:\n"
            "A. INDIRECT-DOWNREGULATOR - Chemical indirectly decreases protein activity/expression\n"
            "B. INDIRECT-UPREGULATOR - Chemical indirectly increases protein activity/expression\n"
            "C. DIRECT-REGULATOR - Chemical directly regulates protein (mechanism unspecified)\n"
            "D. ACTIVATOR - Chemical activates the protein\n"
            "E. INHIBITOR - Chemical inhibits the protein\n"
            "F. AGONIST - Chemical acts as an agonist of the receptor/protein\n"
            "G. AGONIST-ACTIVATOR - Chemical is both agonist and activator\n"
            "H. AGONIST-INHIBITOR - Chemical is agonist but inhibits downstream effects\n"
            "I. ANTAGONIST - Chemical acts as an antagonist of the receptor/protein\n"
            "J. PRODUCT-OF - Chemical is a product of the enzyme\n"
            "K. SUBSTRATE - Chemical is a substrate of the enzyme\n"
            "L. SUBSTRATE_PRODUCT-OF - Chemical is both substrate and product\n"
            "M. PART-OF - Chemical is part of the protein complex\n\n"
            "Example format:\n"
            "<think>\n"
            "The text describes [chemical] and [protein]. Based on the context...\n"
            "- The phrase \"[relevant text]\" indicates that...\n"
            "- This suggests a [type] relationship because...\n"
            "</think>\n"
            "\\boxed{A}"
        ),
    },
    {
        "role": "user",
        "content": (
            "Abstract: [PASTE PUBMED ABSTRACT HERE]\n\n"
            "Chemical entity: [DRUG NAME]\n"
            "Protein entity: [PROTEIN NAME]\n\n"
            "What is the relationship between the chemical and protein entities? "
            # The letter-to-label mapping must match the system prompt above.
            "Choose from: A) INDIRECT-DOWNREGULATOR B) INDIRECT-UPREGULATOR C) DIRECT-REGULATOR "
            "D) ACTIVATOR E) INHIBITOR F) AGONIST G) AGONIST-ACTIVATOR "
            "H) AGONIST-INHIBITOR I) ANTAGONIST J) PRODUCT-OF K) SUBSTRATE "
            "L) SUBSTRATE_PRODUCT-OF M) PART-OF\n\n"
            "Think step by step, then provide your answer in \\boxed{} format."
        ),
    },
]

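text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Pass do_sample=True explicitly so temperature and top_p are applied
# regardless of the model's default generation config.
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7, top_p=0.75)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))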
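The completion is expected to contain a reasoning trace in `<think></think>` tags followed by a single letter in `\boxed{}`. As a minimal sketch of post-processing, the helper below separates the reasoning from the answer and maps the letter to its DrugProt label, using the A-M mapping from the system prompt. The function name `parse_completion` is hypothetical, and the sketch assumes you pass only the generated text: the system prompt itself contains `\boxed{A}`, so parsing the full decoded sequence could pick up the example instead of the model's answer.

```python
import re

# Letter-to-label mapping copied from the system prompt.
LABELS = {
    "A": "INDIRECT-DOWNREGULATOR",
    "B": "INDIRECT-UPREGULATOR",
    "C": "DIRECT-REGULATOR",
    "D": "ACTIVATOR",
    "E": "INHIBITOR",
    "F": "AGONIST",
    "G": "AGONIST-ACTIVATOR",
    "H": "AGONIST-INHIBITOR",
    "I": "ANTAGONIST",
    "J": "PRODUCT-OF",
    "K": "SUBSTRATE",
    "L": "SUBSTRATE_PRODUCT-OF",
    "M": "PART-OF",
}

BOXED_RE = re.compile(r"\\boxed\{\s*([A-M])\s*\}")
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def parse_completion(text):
    """Return (reasoning, letter, label); each field is None if missing."""
    think = THINK_RE.search(text)
    reasoning = think.group(1).strip() if think else None
    # Take the last boxed letter, in case the reasoning quotes the format.
    boxed = BOXED_RE.findall(text)
    letter = boxed[-1] if boxed else None
    label = LABELS.get(letter)
    return reasoning, letter, label


# Hand-written example completion, not real model output.
sample = "<think>\nThe abstract states that the compound blocks the kinase.\n</think>\n\\boxed{E}"
reasoning, letter, label = parse_completion(sample)
print(letter, label)
```

Returning `None` for a missing answer, rather than raising, makes it easy to count malformed completions separately from wrong ones.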
Training Progress
Training ran for ~100 steps on Prime Intellect infrastructure. Best accuracy reward reached ~0.83 during training.
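To make the "accuracy reward" concrete, here is a minimal sketch of how a verifiable reward and GRPO's group-relative advantages fit together for one prompt with 8 rollouts. It assumes a binary reward (1.0 if the boxed letter matches the gold letter, else 0.0) and the textbook normalization of each reward by the group mean and standard deviation. The actual reward definition and normalization used in training are not stated here, so treat both functions as illustrative assumptions.

```python
import re
from statistics import mean, pstdev

BOXED_RE = re.compile(r"\\boxed\{\s*([A-M])\s*\}")


def accuracy_reward(completion: str, gold_letter: str) -> float:
    """Assumed binary reward: 1.0 if the last boxed letter matches gold."""
    matches = BOXED_RE.findall(completion)
    return 1.0 if matches and matches[-1] == gold_letter else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each rollout's reward against its own group.

    A_i = (r_i - mean(r)) / (std(r) + eps)
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Eight hand-written rollouts for one prompt whose gold answer is E.
rollouts = ["<think>...</think>\n\\boxed{E}"] * 6 + ["\\boxed{I}", "no boxed answer"]

rewards = [accuracy_reward(c, "E") for c in rollouts]
advantages = group_relative_advantages(rewards)

print("mean accuracy reward:", mean(rewards))
for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f} advantage={a:+.3f}")
```

When every rollout in a group gets the same reward, all advantages are zero and that prompt contributes no learning signal, which is why groups with mixed outcomes matter most.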
Limitations
- This is a LoRA adapter and requires the base model (arcee-ai/Trinity-Mini) to run
- Evaluated only on data held out from the training split; not yet benchmarked on the official DrugProt test set
- Optimized specifically for 13-way DrugProt classification; may not generalize to other biomedical RE tasks
Citation
@misc{jakimovski2026drugprotrl,
  title        = {Post-Training an Open MoE Model to Extract Drug-Protein Relations: Trinity-Mini-DrugProt-Think},
  author       = {Jakimovski, Bojan and Kalinovski, Petar},
  year         = {2026},
  month        = feb,
  howpublished = {Blog post},
  url          = {https://github.com/LokaHQ/Trinity-Mini-DrugProt-Think}
}
Acknowledgements
- Arcee AI for the Trinity Mini base model
- Prime Intellect for training infrastructure
- maziyar for the OpenMed DrugProt RL environment
- Hugging Face for the PEFT library