Model Card for Jay24-AI/bloom-7b1-lora-tagger
This model is a LoRA fine-tuned version of BigScience's BLOOM-7B1, trained on a dataset of English quotes and their tags. The goal was to adapt BLOOM with the PEFT (Parameter-Efficient Fine-Tuning) approach using LoRA, making the model lightweight to train and efficient to deploy.
Model Details
Model Description
- Developed by: Jay24-AI
- Funded by: N/A
- Shared by: Jay24-AI
- Model type: Causal Language Model with LoRA adapters
- Language(s): English
- License: Apache-2.0 (inherited from bigscience/bloom-7b1; LoRA adapters are MIT-compatible)
- Finetuned from model: bigscience/bloom-7b1
Model Sources
- Repository: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger
- Base model: https://huggingface.co/bigscience/bloom-7b1
Uses
Direct Use
The model can be used for text generation and tagging based on quote-like prompts. For example, you can input a quote, and the model will generate descriptive tags.
Downstream Use
- Can be further fine-tuned on custom tagging or classification datasets.
- Could be integrated into applications that require lightweight quote classification, text annotation, or prompt-based generation.
Out-of-Scope Use
- Not suitable for factual question answering.
- Not designed for sensitive or high-stakes decision-making (e.g., medical, legal, or financial advice).
Bias, Risks, and Limitations
- Inherits limitations and biases from BLOOM-7B1 (trained on large-scale internet data).
- The fine-tuning dataset (Abirate/english_quotes) is relatively small, so the model may overfit and generalize poorly outside similar data.
- Risk of generating irrelevant or biased tags if prompted outside the intended scope.
- Limited training (50 steps) may result in suboptimal performance.
Recommendations
Users should:
- Validate outputs before production use.
- Avoid relying on the model for critical applications.
How to Get Started with the Model
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "Jay24-AI/bloom-7b1-lora-tagger"
config = PeftConfig.from_pretrained(peft_model_id)
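# Load the base model in 8-bit precision and dispatch it automatically across available devices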
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
# Load the Lora model
model = PeftModel.from_pretrained(model, peft_model_id)
# Tokenize a sample quote using the training prompt format: "<quote> ->: "
batch = tokenizer("“The only way to do great work is to love what you do.” ->: ", return_tensors='pt')

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
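Because the prompt ends with the same " ->: " separator used during training, the model is expected to continue the sequence with a short list of tags. Depending on your transformers/accelerate versions, you may also need to move the tokenized inputs onto the model's device before calling generate.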
Training Details
Training Data
- Dataset used: Abirate/english_quotes
- Subset: Entire training split (exact size not specified in script).
- Structure: Each entry includes a quote and its corresponding tags.
- Preprocessing (see the sketch after this list):
  - Combined the quote and tags into a single text string: <quote> ->: <tags>
  - Tokenized using the AutoTokenizer from bigscience/bloom-7b1.
  - Applied batching via Hugging Face datasets.map with batched=True.
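A minimal sketch of this data preparation, assuming the dataset's quote and tags column names and a hypothetical merge_columns helper (the original training script is not included in this card):

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

# Hypothetical helper: join each quote with its tags into "<quote> ->: <tags>"
def merge_columns(example):
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

data = load_dataset("Abirate/english_quotes")
data["train"] = data["train"].map(merge_columns)
# Tokenize the merged strings in batches
data = data.map(lambda samples: tokenizer(samples["prediction"]), batched=True)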
Training Procedure
Preprocessing
- Converted text examples into the "quote ->: tags" format.
- Tokenized using BLOOM's tokenizer with default settings.
- Applied DataCollatorForLanguageModeling with mlm=False (causal LM objective); see the sketch below.
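As a sketch, the collator described above can be set up as follows, assuming the tokenizer from the previous sketch:

from transformers import DataCollatorForLanguageModeling

# mlm=False selects the causal language modeling objective:
# labels are the input ids (shifted inside the model), not masked tokens.
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)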
Training Hyperparameters
- Base model: bigscience/bloom-7b1
- Adapter method: LoRA via PEFT
- LoRA configuration:
  - r: 8
  - lora_alpha: 16
  - lora_dropout: 0.05
  - bias: "none"
  - task_type: "CAUSAL_LM"
- TrainingArguments:
  - per_device_train_batch_size: 2
  - gradient_accumulation_steps: 2
  - warmup_steps: 100
  - max_steps: 50
  - learning_rate: 2e-4
  - fp16: True
  - logging_steps: 1
  - output_dir: outputs/
- Precision regime: Mixed precision (fp16) with 8-bit quantization via bitsandbytes.
- Caching: model.config.use_cache = False during training to suppress warnings.
- Additional settings (a consolidated sketch of all of the above follows this list):
  - Original model weights frozen; small parameters (e.g., layer normalization) cast to FP32 for stability.
  - Gradient checkpointing enabled to reduce memory usage.
  - lm_head modified to output FP32 for stability.
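The following sketch shows how these hyperparameters could be wired together with PEFT and the Hugging Face Trainer. It is a reconstruction from the values listed above, not the original script; the LoRA target modules are not specified in this card (PEFT's defaults for BLOOM are assumed), and the dataset and collator come from the earlier sketches.

import torch
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base model in 8-bit (older transformers versions accept load_in_8bit directly)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", load_in_8bit=True, device_map="auto"
)

# Assumption: the FP32 casting of layer norms / lm_head and gradient checkpointing described
# above are applied here; prepare_model_for_kbit_training performs the equivalent steps.
model = prepare_model_for_kbit_training(model)
model.config.use_cache = False  # suppress incompatibility warnings during training

# LoRA configuration from the table above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable

training_args = TrainingArguments(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    warmup_steps=100,
    max_steps=50,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=1,
    output_dir="outputs",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=data["train"],  # tokenized dataset from the Training Data sketch
    data_collator=data_collator,  # causal-LM collator from the Preprocessing sketch
)
trainer.train()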
Hyperparameter Summary
| Hyperparameter | Value |
|---|---|
| Base model | bigscience/bloom-7b1 |
| Adapter method | LoRA (via PEFT) |
| LoRA r | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Bias | none |
| Task type | Causal LM |
| Batch size (per device) | 2 |
| Gradient accumulation steps | 2 |
| Effective batch size | 4 |
| Warmup steps | 100 |
| Max steps | 50 |
| Learning rate | 2e-4 |
| Precision | fp16 (mixed precision) |
| Quantization | 8-bit (bitsandbytes) |
| Logging steps | 1 |
| Output directory | outputs/ |
| Gradient checkpointing | Enabled |
| Use cache | False (during training) |
Speeds, Sizes, Times
- Trainable parameters: LoRA adapters only (~0.1% of BLOOM-7B1's ~7.1 billion parameters; exact count printed via print_trainable_parameters).
- Approx. size: Much smaller than the full 7B checkpoint, since only the adapter weights are stored.
- Max steps: 50 optimizer updates (~100 forward/backward micro-batches with gradient accumulation of 2).
- Training runtime: Not logged in the script; depends on the GPU.
- Effective batch size: 4 (per-device batch size of 2 × 2 accumulation steps).
Compute Infrastructure
- Hardware: Single CUDA GPU (set with os.environ["CUDA_VISIBLE_DEVICES"] = "0"; the specific GPU model is not stated in the script, e.g., A100, T4, V100).
- Software:
- PyTorch
- Hugging Face Transformers (main branch from GitHub)
- Hugging Face PEFT (main branch from GitHub)
- Hugging Face Datasets
- Accelerate
- Bitsandbytes (for 8-bit quantization)
- Gradient checkpointing: Enabled to save memory.
- Mixed precision: Enabled with fp16.
- Quantization: 8-bit via bitsandbytes (load_in_8bit=True), with torch.float16 compute dtype; see the sketch below.
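The card describes 8-bit loading via bitsandbytes; as a hedged sketch, the same load can be expressed with the BitsAndBytesConfig API in recent transformers versions:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit quantization via bitsandbytes; fp16 compute matches the mixed-precision setup
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
    device_map="auto",
)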
Evaluation
Testing Data
- Same dataset (Abirate/english_quotes).
- No held-out test set reported in the training script.
Metrics
- No formal metrics logged; evaluation was qualitative (checking generated tags).
Results
- Based on qualitative inspection, the model learns to generate plausible tags for English quotes after this short training run, as demonstrated by the inference example above.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator (https://mlco2.github.io/impact#compute).
- Hardware Type: Single NVIDIA T4 GPU
- Cloud Provider: Google Colab
Technical Specifications
Model Architecture and Objective
- Base model: BLOOM-7B1, causal language modeling objective.
- Fine-tuned with LoRA adapters using PEFT.
Compute Infrastructure
- Hardware: Single GPU (CUDA device 0).
- Software:
- PyTorch
- Hugging Face Transformers
- Hugging Face PEFT
- Hugging Face Datasets
- Accelerate
- Bitsandbytes
Citation
If you use this model, please cite:
@misc{jay24ai2025bloomlora,
title={LoRA Fine-Tuned BLOOM-7B1 for Quote Tagging},
author={Jay24-AI},
year={2025},
howpublished={\url{https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger}}
}
Model Card Contact
For questions or issues, contact the maintainer via Hugging Face discussions: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger/discussions