Model Card for Jay24-AI/bloom-7b1-lora-tagger

This model is a LoRA fine-tuned version of BigScience’s BLOOM-7B1 model, trained on a dataset of English quotes. The goal was to adapt BLOOM using the PEFT (Parameter-Efficient Fine-Tuning) approach with LoRA, making it lightweight to train and efficient for deployment.

Model Details

Model Description

  • Developed by: Jay24-AI
  • Funded by: N/A
  • Shared by: Jay24-AI
  • Model type: Causal Language Model with LoRA adapters
  • Language(s): English
  • License: bigscience-bloom-rail-1.0 (inherited from bigscience/bloom-7b1)
  • Finetuned from model: bigscience/bloom-7b1

Model Sources

  • Repository: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger
  • Base model: https://huggingface.co/bigscience/bloom-7b1

Uses

Direct Use

The model can be used for text generation and tagging based on quote-like prompts. For example, you can input a quote, and the model will generate descriptive tags.

Downstream Use

  • Can be further fine-tuned on custom tagging or classification datasets (a minimal loading sketch follows this list).
  • Could be integrated into applications that require lightweight quote classification, text annotation, or prompt-based generation.
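For the fine-tuning route, the published adapter can be loaded with its weights left trainable. The sketch below is illustrative rather than part of the original training script; it reuses the 8-bit loading shown elsewhere in this card.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 8-bit base model (requires bitsandbytes) and its tokenizer
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", load_in_8bit=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

# is_trainable=True keeps the LoRA parameters unfrozen so training can continue;
# the base model weights remain frozen.
model = PeftModel.from_pretrained(
    base, "Jay24-AI/bloom-7b1-lora-tagger", is_trainable=True
)

From here, training proceeds as described under Training Details, only with a different dataset.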

Out-of-Scope Use

  • Not suitable for factual question answering.
  • Not designed for sensitive or high-stakes decision-making (e.g., medical, legal, or financial advice).

Bias, Risks, and Limitations

  • Inherits limitations and biases from BLOOM-7B1 (trained on large-scale internet data).
  • The fine-tuned dataset (Abirate/english_quotes) is relatively small, so the model may overfit and generalize poorly outside similar data.
  • Risk of generating irrelevant or biased tags if prompted outside the intended scope.
  • Limited training (50 steps) may result in suboptimal performance.

Recommendations

Users should:

  • Validate outputs before production use.
  • Avoid relying on the model for critical applications.

How to Get Started with the Model

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "Jay24-AI/bloom-7b1-lora-tagger"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model in 8-bit (requires bitsandbytes) and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights on top of the frozen base model
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()

# Prompt format: "<quote> ->: " -- the model completes the tags
batch = tokenizer("“The only way to do great work is to love what you do.” ->: ", return_tensors="pt")

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))

Training Details

Training Data

  • Dataset used: Abirate/english_quotes
  • Subset: Entire training split (exact size not specified in script).
  • Structure: Each entry includes a quote and its corresponding tags.
  • Preprocessing:
    • Combined the quote and tags into a single text string: <quote> ->: <tags>
    • Tokenized using the AutoTokenizer from bigscience/bloom-7b1.
    • Applied batching via Hugging Face datasets.map with batched=True (a preprocessing sketch follows this list).
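A minimal sketch of this preprocessing, assuming the quote and tags columns of Abirate/english_quotes; the helper name merge_columns and the prediction column are illustrative, and the exact tag formatting used in the original script may differ.

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
data = load_dataset("Abirate/english_quotes")

def merge_columns(example):
    # Build the training string "<quote> ->: <tags>"
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

data["train"] = data["train"].map(merge_columns)
# Tokenize in batches with the BLOOM tokenizer
data = data.map(lambda samples: tokenizer(samples["prediction"]), batched=True)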

Training Procedure

Preprocessing

  • Converted text examples into the "quote ->: tags" format.
  • Tokenized using Bloom’s tokenizer with default settings.
  • Applied DataCollatorForLanguageModeling with mlm=False (causal LM objective); see the snippet after this list.
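For completeness, the collator is a one-liner (the tokenizer loading is repeated here only so the snippet stands alone):

from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
# mlm=False selects the causal-LM objective: labels are the input tokens themselves
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)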

Training Hyperparameters

  • Base model: bigscience/bloom-7b1
  • Adapter method: LoRA via PEFT
  • LoRA configuration:
    • r: 8
    • lora_alpha: 16
    • lora_dropout: 0.05
    • bias: "none"
    • task_type: "CAUSAL_LM"
  • TrainingArguments:
    • per_device_train_batch_size: 2
    • gradient_accumulation_steps: 2
    • warmup_steps: 100
    • max_steps: 50
    • learning_rate: 2e-4
    • fp16: True
    • logging_steps: 1
    • output_dir: outputs/
  • Precision regime: Mixed precision (fp16) with 8-bit quantization via bitsandbytes.
  • Caching: model.config.use_cache = False during training, since caching is incompatible with gradient checkpointing (this also silences the related warning).
  • Additional Settings:
    • Original model weights frozen; small parameters (e.g., layer normalization) cast to FP32 for stability.
    • Gradient checkpointing enabled to reduce memory usage.
    • lm_head wrapped to cast its output to FP32 for numerical stability (a consolidated sketch of this setup follows the list).
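The settings above can be consolidated into the following sketch. It is a reconstruction based on this card rather than the original script, and it assumes the tokenized data object from the preprocessing sketch in the Training Data section.

import torch
import torch.nn as nn
import transformers
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

# 8-bit base model via bitsandbytes
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", load_in_8bit=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

# Freeze the base weights; cast small (1-D) parameter tensors such as layer norms to FP32
for param in model.parameters():
    param.requires_grad = False
    if param.ndim == 1:
        param.data = param.data.to(torch.float32)

model.gradient_checkpointing_enable()  # reduce activation memory

# Wrap lm_head so its output is cast to FP32 for numerical stability
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

model.lm_head = CastOutputToFloat(model.lm_head)

# LoRA adapters with the configuration listed above
config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM"
)
model = get_peft_model(model, config)

trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],  # tokenized dataset from the preprocessing sketch
    args=transformers.TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=2,
        warmup_steps=100,
        max_steps=50,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # incompatible with gradient checkpointing during training
trainer.train()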

Hyperparameter Summary

| Hyperparameter | Value |
| --- | --- |
| Base model | bigscience/bloom-7b1 |
| Adapter method | LoRA (via PEFT) |
| LoRA r | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Bias | none |
| Task type | Causal LM |
| Batch size (per device) | 2 |
| Gradient accumulation steps | 2 |
| Effective batch size | 4 |
| Warmup steps | 100 |
| Max steps | 50 |
| Learning rate | 2e-4 |
| Precision | fp16 (mixed precision) |
| Quantization | 8-bit (bitsandbytes) |
| Logging steps | 1 |
| Output directory | outputs/ |
| Gradient checkpointing | Enabled |
| Use cache | False (during training) |

Speeds, Sizes, Times

  • Trainable parameters: LoRA adapters only (roughly 0.1% of BLOOM-7B1’s ~7.1 billion parameters; the exact count is printed via print_trainable_parameters, see the snippet after this list).
  • Approx. size: Much smaller than the full 7B checkpoint, since only the adapter weights are stored.
  • Max steps: 50 optimizer updates (≈100 micro-batches consumed with gradient accumulation of 2).
  • Training runtime: Not logged in the script; depends on the GPU.
  • Effective batch size: 4 (per-device batch size of 2 × 2 accumulation steps).
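With the PEFT-wrapped model from the training sketch above, the exact count can be reproduced as follows.

# Reports trainable vs. total parameters, e.g.
# "trainable params: ... || all params: ... || trainable%: ..."
model.print_trainable_parameters()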

Compute Infrastructure

  • Hardware: Single CUDA GPU (selected with os.environ["CUDA_VISIBLE_DEVICES"] = "0"); the specific GPU model is not recorded in the script, though the Environmental Impact section reports a Colab T4.
  • Software:
    • PyTorch
    • Hugging Face Transformers (main branch from GitHub)
    • Hugging Face PEFT (main branch from GitHub)
    • Hugging Face Datasets
    • Accelerate
    • Bitsandbytes (for 8-bit quantization)
  • Gradient checkpointing: Enabled to save memory.
  • Mixed precision: Enabled with fp16.
  • Quantization: 8-bit via bitsandbytes (load_in_8bit=True), with torch.float16 used for the non-quantized computation (see the loading sketch after this list).
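On recent transformers releases, the 8-bit loading used in this card is typically expressed through BitsAndBytesConfig rather than the bare load_in_8bit argument; a small, equivalent sketch:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit loading via an explicit quantization config (requires bitsandbytes)
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", quantization_config=bnb_config, device_map="auto"
)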

Evaluation

Testing Data

  • Same dataset (Abirate/english_quotes).
  • No held-out test set reported in training script.

Metrics

  • No formal metrics logged; evaluation was qualitative (checking generated tags).

Results

  • After fine-tuning, the model generates plausible tags for English quotes, as illustrated by the inference example above; no quantitative evaluation was performed.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator (https://mlco2.github.io/impact#compute).

  • Hardware Type: Single NVIDIA T4 GPU
  • Cloud Provider: Google Colab

Technical Specifications

Model Architecture and Objective

  • Base model: BLOOM-7B1, causal language modeling objective.
  • Fine-tuned with LoRA adapters using PEFT.

Compute Infrastructure

  • Hardware: Single GPU (CUDA device 0).
  • Software:
    • PyTorch
    • Hugging Face Transformers
    • Hugging Face PEFT
    • Hugging Face Datasets
    • Accelerate
    • Bitsandbytes

Citation

If you use this model, please cite:

@misc{jay24ai2025bloomlora,
  title={LoRA Fine-Tuned BLOOM-7B1 for Quote Tagging},
  author={Jay24-AI},
  year={2025},
  howpublished={\url{https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger}}
}

Model Card Contact

For questions or issues, contact the maintainer via Hugging Face discussions: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger/discussions
