Model Card for Jay24-AI/bloom-7b1-lora-tagger

This model is a LoRA fine-tuned version of BigScience’s BLOOM-7B1 model, trained on a dataset of English quotes. The goal was to adapt BLOOM using the PEFT (Parameter-Efficient Fine-Tuning) approach with LoRA, making it lightweight to train and efficient for deployment.

Model Details

Model Description

  • Developed by: Jay24-AI
  • Funded by: N/A
  • Shared by: Jay24-AI
  • Model type: Causal Language Model with LoRA adapters
  • Language(s): English
  • License: bigscience-bloom-rail-1.0 (inherited from bigscience/bloom-7b1)
  • Finetuned from model: bigscience/bloom-7b1

Model Sources

  • Repository: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger
  • Base model: https://huggingface.co/bigscience/bloom-7b1

Uses

Direct Use

The model can be used for text generation and tagging based on quote-like prompts. For example, you can input a quote, and the model will generate descriptive tags.

Downstream Use

  • Can be further fine-tuned on custom tagging or classification datasets (a minimal loading sketch follows this list).
  • Could be integrated into applications that require lightweight quote classification, text annotation, or prompt-based generation.
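For the fine-tuning route, the published adapter can be loaded with its weights left trainable. The sketch below is illustrative rather than part of the original training script; it reuses the 8-bit loading shown elsewhere in this card.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 8-bit base model (requires bitsandbytes) and its tokenizer
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", load_in_8bit=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

# is_trainable=True keeps the LoRA parameters unfrozen so training can continue;
# the base model weights remain frozen.
model = PeftModel.from_pretrained(
    base, "Jay24-AI/bloom-7b1-lora-tagger", is_trainable=True
)

From here, training proceeds as described under Training Details, only with a different dataset.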

Out-of-Scope Use

  • Not suitable for factual question answering.
  • Not designed for sensitive or high-stakes decision-making (e.g., medical, legal, or financial advice).

Bias, Risks, and Limitations

  • Inherits limitations and biases from BLOOM-7B1 (trained on large-scale internet data).
  • The fine-tuned dataset (Abirate/english_quotes) is relatively small, so the model may overfit and generalize poorly outside similar data.
  • Risk of generating irrelevant or biased tags if prompted outside the intended scope.
  • Limited training (50 steps) may result in suboptimal performance.

Recommendations

Users should:

  • Validate outputs before production use.
  • Avoid relying on the model for critical applications.

How to Get Started with the Model

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "Jay24-AI/bloom-7b1-lora-tagger"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model in 8-bit (requires bitsandbytes) and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights on top of the frozen base model
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()

# Prompt format: "<quote> ->: " -- the model completes the tags
batch = tokenizer("“The only way to do great work is to love what you do.” ->: ", return_tensors="pt")

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))

Training Details

Training Data

  • Dataset used: Abirate/english_quotes
  • Subset: Entire training split (exact size not specified in script).
  • Structure: Each entry includes a quote and its corresponding tags.
  • Preprocessing:
    • Combined the quote and tags into a single text string: <quote> ->: <tags>
    • Tokenized using the AutoTokenizer from bigscience/bloom-7b1.
    • Applied batching via Hugging Face datasets.map with batched=True (a preprocessing sketch follows this list).
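A minimal sketch of this preprocessing, assuming the quote and tags columns of Abirate/english_quotes; the helper name merge_columns and the prediction column are illustrative, and the exact tag formatting used in the original script may differ.

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
data = load_dataset("Abirate/english_quotes")

def merge_columns(example):
    # Build the training string "<quote> ->: <tags>"
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

data["train"] = data["train"].map(merge_columns)
# Tokenize in batches with the BLOOM tokenizer
data = data.map(lambda samples: tokenizer(samples["prediction"]), batched=True)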

Training Procedure

Preprocessing

  • Converted text examples into the "quote ->: tags" format.
  • Tokenized using Bloom’s tokenizer with default settings.
  • Applied DataCollatorForLanguageModeling with mlm=False (causal LM objective); see the snippet after this list.
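For completeness, the collator is a one-liner (the tokenizer loading is repeated here only so the snippet stands alone):

from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
# mlm=False selects the causal-LM objective: labels are the input tokens themselves
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)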

Training Hyperparameters

  • Base model: bigscience/bloom-7b1
  • Adapter method: LoRA via PEFT
  • LoRA configuration:
    • r: 8
    • lora_alpha: 16
    • lora_dropout: 0.05
    • bias: "none"
    • task_type: "CAUSAL_LM"
  • TrainingArguments:
    • per_device_train_batch_size: 2
    • gradient_accumulation_steps: 2
    • warmup_steps: 100
    • max_steps: 50
    • learning_rate: 2e-4
    • fp16: True
    • logging_steps: 1
    • output_dir: outputs/
  • Precision regime: Mixed precision (fp16) with 8-bit quantization via bitsandbytes.
  • Caching: model.config.use_cache = False during training, since caching is incompatible with gradient checkpointing (this also silences the related warning).
  • Additional Settings:
    • Original model weights frozen; small parameters (e.g., layer normalization) cast to FP32 for stability.
    • Gradient checkpointing enabled to reduce memory usage.
    • lm_head wrapped to cast its output to FP32 for numerical stability (a consolidated sketch of this setup follows the list).
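The settings above can be consolidated into the following sketch. It is a reconstruction based on this card rather than the original script, and it assumes the tokenized data object from the preprocessing sketch in the Training Data section.

import torch
import torch.nn as nn
import transformers
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

# 8-bit base model via bitsandbytes
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", load_in_8bit=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

# Freeze the base weights; cast small (1-D) parameter tensors such as layer norms to FP32
for param in model.parameters():
    param.requires_grad = False
    if param.ndim == 1:
        param.data = param.data.to(torch.float32)

model.gradient_checkpointing_enable()  # reduce activation memory

# Wrap lm_head so its output is cast to FP32 for numerical stability
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

model.lm_head = CastOutputToFloat(model.lm_head)

# LoRA adapters with the configuration listed above
config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM"
)
model = get_peft_model(model, config)

trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],  # tokenized dataset from the preprocessing sketch
    args=transformers.TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=2,
        warmup_steps=100,
        max_steps=50,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # incompatible with gradient checkpointing during training
trainer.train()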

Hyperparameter Summary

| Hyperparameter | Value |
| --- | --- |
| Base model | bigscience/bloom-7b1 |
| Adapter method | LoRA (via PEFT) |
| LoRA r | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Bias | none |
| Task type | Causal LM |
| Batch size (per device) | 2 |
| Gradient accumulation steps | 2 |
| Effective batch size | 4 |
| Warmup steps | 100 |
| Max steps | 50 |
| Learning rate | 2e-4 |
| Precision | fp16 (mixed precision) |
| Quantization | 8-bit (bitsandbytes) |
| Logging steps | 1 |
| Output directory | outputs/ |
| Gradient checkpointing | Enabled |
| Use cache | False (during training) |

Speeds, Sizes, Times

  • Trainable parameters: LoRA adapters only (roughly 0.1% of BLOOM-7B1’s ~7.1 billion parameters; the exact count is printed via print_trainable_parameters, see the snippet after this list).
  • Approx. size: Much smaller than the full 7B checkpoint, since only the adapter weights are stored.
  • Max steps: 50 optimizer updates (≈100 micro-batches consumed with gradient accumulation of 2).
  • Training runtime: Not logged in the script; depends on the GPU.
  • Effective batch size: 4 (per-device batch size of 2 × 2 accumulation steps).
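With the PEFT-wrapped model from the training sketch above, the exact count can be reproduced as follows.

# Reports trainable vs. total parameters, e.g.
# "trainable params: ... || all params: ... || trainable%: ..."
model.print_trainable_parameters()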

Compute Infrastructure

  • Hardware: Single CUDA GPU (selected with os.environ["CUDA_VISIBLE_DEVICES"] = "0"); the specific GPU model is not recorded in the script, though the Environmental Impact section reports a Colab T4.
  • Software:
    • PyTorch
    • Hugging Face Transformers (main branch from GitHub)
    • Hugging Face PEFT (main branch from GitHub)
    • Hugging Face Datasets
    • Accelerate
    • Bitsandbytes (for 8-bit quantization)
  • Gradient checkpointing: Enabled to save memory.
  • Mixed precision: Enabled with fp16.
  • Quantization: 8-bit via bitsandbytes (load_in_8bit=True), with torch.float16 used for the non-quantized computation (see the loading sketch after this list).
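On recent transformers releases, the 8-bit loading used in this card is typically expressed through BitsAndBytesConfig rather than the bare load_in_8bit argument; a small, equivalent sketch:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit loading via an explicit quantization config (requires bitsandbytes)
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", quantization_config=bnb_config, device_map="auto"
)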

Evaluation

Testing Data

  • Same dataset (Abirate/english_quotes).
  • No held-out test set reported in training script.

Metrics

  • No formal metrics logged; evaluation was qualitative (checking generated tags).

Results

  • After fine-tuning, the model generates plausible tags for English quotes, as illustrated by the inference example above; no quantitative evaluation was performed.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator (https://mlco2.github.io/impact#compute).

  • Hardware Type: Single NVIDIA T4 GPU
  • Cloud Provider: Google Colab

Technical Specifications

Model Architecture and Objective

  • Base model: BLOOM-7B1, causal language modeling objective.
  • Fine-tuned with LoRA adapters using PEFT.

Compute Infrastructure

  • Hardware: Single GPU (CUDA device 0).
  • Software:
    • PyTorch
    • Hugging Face Transformers
    • Hugging Face PEFT
    • Hugging Face Datasets
    • Accelerate
    • Bitsandbytes

Citation

If you use this model, please cite:

@misc{jay24ai2025bloomlora,
  title={LoRA Fine-Tuned BLOOM-7B1 for Quote Tagging},
  author={Jay24-AI},
  year={2025},
  howpublished={\url{https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger}}
}

Model Card Contact

For questions or issues, contact the maintainer via Hugging Face discussions: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger/discussions
