a28f2cdd2ca81f74fa4189e51bc2a573

This model is a fine-tuned version of facebook/mbart-large-50-one-to-many-mmt on the Helsinki-NLP/opus_books [en-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6652
  • Data Size: 1.0
  • Epoch Runtime: 13.0733
  • BLEU: 35.6936
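
For context on how this checkpoint is meant to be called, here is a minimal inference sketch using the standard mBART-50 classes from transformers. The repo id is the one on this card; the language codes follow the mBART-50 convention (en_XX source, pt_XX target).

```python
# Minimal inference sketch for this checkpoint (mBART-50 one-to-many, en -> pt).
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "contemmcm/a28f2cdd2ca81f74fa4189e51bc2a573"
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "en_XX"  # mBART-50 one-to-many expects English source
inputs = tokenizer("The book is on the table.", return_tensors="pt")
generated = model.generate(
    **inputs,
    # Force Portuguese as the first generated token, per the mBART-50 API.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("pt_XX"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```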

Model description

This checkpoint is facebook/mbart-large-50-one-to-many-mmt fine-tuned for English-to-Portuguese translation on the opus_books corpus. It keeps the standard mBART-50 one-to-many setup: English source text, with the target language selected at generation time via a forced BOS language code.

Intended uses & limitations

The model is intended for English→Portuguese machine translation. Because the fine-tuning data comes from opus_books (literary text), quality is likely strongest on book-like prose and may degrade on other domains such as technical or conversational text; it also inherits any limitations and biases of the mBART-50 base model.

Training and evaluation data

Training and evaluation use the en-pt pair of the Helsinki-NLP/opus_books dataset. The exact train/validation split behind the results below is not documented; a loading sketch under assumed split settings follows.
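
This is a hedged loading sketch with the datasets library: opus_books ships a single train split, so the 90/10 carve-out and seed here are illustrative assumptions, not the card's documented procedure.

```python
# Sketch: load the en-pt pair of opus_books and carve out an eval split.
# The 90/10 split and seed are assumptions; the card does not document them.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "en-pt")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)

print(splits["train"][0]["translation"])  # e.g. {'en': '...', 'pt': '...'}
```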

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
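
The listed values map onto Seq2SeqTrainingArguments roughly as below. The output path, generation flag, and launch command are assumptions; the per-device batch size of 8 on 4 GPUs gives the total batch size of 32.

```python
# Sketch of the hyperparameters above as Seq2SeqTrainingArguments.
# output_dir and predict_with_generate are assumptions, not from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./outputs",             # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=8,      # x4 GPUs = total batch size 32
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,         # assumed, needed to compute BLEU
)
# A multi-GPU run would be launched with e.g.:
#   torchrun --nproc_per_node=4 train.py
```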

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU    |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 5.2877          | 0         | 1.3414        | 4.4799  |
| No log        | 1     | 35   | 3.4020          | 0.0078    | 1.8052        | 9.3876  |
| No log        | 2     | 70   | 2.4452          | 0.0156    | 3.4253        | 20.3433 |
| No log        | 3     | 105  | 2.2423          | 0.0312    | 4.6224        | 21.0147 |
| No log        | 4     | 140  | 2.0691          | 0.0625    | 5.7155        | 19.3118 |
| No log        | 5     | 175  | 1.9055          | 0.125     | 7.6372        | 16.6085 |
| No log        | 6     | 210  | 1.6750          | 0.25      | 10.4390       | 21.5632 |
| No log        | 7     | 245  | 1.3032          | 0.5       | 10.7081       | 28.6433 |
| 0.3148        | 8.0   | 280  | 1.2410          | 1.0       | 14.1789       | 37.2413 |
| 0.8051        | 9.0   | 315  | 1.3000          | 1.0       | 13.5443       | 40.7802 |
| 0.3813        | 10.0  | 350  | 1.4309          | 1.0       | 13.7919       | 41.2118 |
| 0.3813        | 11.0  | 385  | 1.5523          | 1.0       | 14.6303       | 36.4598 |
| 0.1544        | 12.0  | 420  | 1.6652          | 1.0       | 13.0733       | 35.6936 |

The Data Size column appears to track the fraction of the training set in use, ramped up to the full set by epoch 8. Although 50 epochs were configured, the log ends after epoch 12 (presumably early stopping); the headline figures above are from this final epoch, while the best BLEU (41.2118) was reached at epoch 10.
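
The card does not state which BLEU implementation produced these numbers; a common choice is sacrebleu via the evaluate library, sketched here for scoring your own predictions.

```python
# Sketch: score predictions with sacrebleu via the evaluate library.
# The strings below are placeholders, not outputs of this model.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["O livro está sobre a mesa."]
references = [["O livro está em cima da mesa."]]  # one list of refs per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```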

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1