1f49ccad99afd0252277a0f8d4ff2823

This model is a fine-tuned version of google-bert/bert-base-multilingual-uncased on the fancyzhx/dbpedia_14 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0688
  • Data Size: 1.0
  • Epoch Runtime: 866.1259
  • Accuracy: 0.9881
  • F1 Macro: 0.9881
  • Rouge1: 0.9882
  • Rouge2: 0.0
  • Rougel: 0.9881
  • Rougelsum: 0.9881
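
For quick sanity checks, the checkpoint can be queried with the Transformers pipeline API. This is a minimal sketch, assuming the model is published on the Hub under contemmcm/1f49ccad99afd0252277a0f8d4ff2823; the example text and the printed label format are illustrative only.

```python
# Minimal inference sketch (assumes the checkpoint is hosted on the Hub as
# "contemmcm/1f49ccad99afd0252277a0f8d4ff2823").
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="contemmcm/1f49ccad99afd0252277a0f8d4ff2823",
)

# dbpedia_14 is 14-way topic classification over Wikipedia abstracts.
text = "The Amazon River is the largest river by discharge volume in the world."
print(classifier(text))
# e.g. [{'label': 'LABEL_9', 'score': 0.99}] -- labels appear as LABEL_0..LABEL_13
# unless id2label was set in the model config.
```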

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
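
The base dataset named in the summary above, fancyzhx/dbpedia_14, can be loaded with the datasets library. A minimal sketch follows, assuming the dataset's standard train/test splits; the card does not document any custom preprocessing.

```python
# Sketch: load the dataset used for fine-tuning (standard splits assumed).
from datasets import load_dataset

ds = load_dataset("fancyzhx/dbpedia_14")
print(ds)              # DatasetDict with 'train' and 'test' splits
print(ds["train"][0])  # each example has 'label', 'title', and 'content' fields
```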

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
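
These settings map onto the standard Trainer API roughly as follows. This is a reconstruction sketch, not the published training script: output_dir is a placeholder, and the multi-GPU launch (4 devices, giving the total batch size of 32) is handled by the launcher (e.g. torchrun or accelerate), not by these arguments.

```python
# Reconstruction sketch of the hyperparameters listed above (assumes the
# standard Hugging Face Trainer; "out" is a placeholder output directory).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,   # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```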

Training results

| Training Loss | Epoch | Step   | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | Rougel | Rougelsum |
|--------------:|------:|-------:|----------------:|----------:|--------------:|---------:|---------:|-------:|-------:|-------:|----------:|
| No log        | 0     | 0      | 2.6316          | 0         | 30.1918       | 0.0908   | 0.0462   | 0.0907 | 0.0    | 0.0908 | 0.0908    |
| 0.1599        | 1     | 17500  | 0.1108          | 0.0078    | 37.0392       | 0.9780   | 0.9780   | 0.9780 | 0.0    | 0.9780 | 0.9780    |
| 0.0706        | 2     | 35000  | 0.0854          | 0.0156    | 43.4134       | 0.9829   | 0.9829   | 0.9829 | 0.0    | 0.9829 | 0.9829    |
| 0.0472        | 3     | 52500  | 0.0851          | 0.0312    | 57.1504       | 0.9827   | 0.9827   | 0.9827 | 0.0    | 0.9827 | 0.9827    |
| 0.0856        | 4     | 70000  | 0.0601          | 0.0625    | 83.5203       | 0.9880   | 0.9880   | 0.9880 | 0.0    | 0.9880 | 0.9880    |
| 0.0743        | 5     | 87500  | 0.0774          | 0.125     | 137.6034      | 0.9831   | 0.9831   | 0.9831 | 0.0    | 0.9831 | 0.9831    |
| 0.0652        | 6     | 105000 | 0.0687          | 0.25      | 241.3291      | 0.9866   | 0.9866   | 0.9866 | 0.0    | 0.9866 | 0.9866    |
| 0.0004        | 7     | 122500 | 0.0678          | 0.5       | 456.9015      | 0.9873   | 0.9873   | 0.9873 | 0.0    | 0.9873 | 0.9873    |
| 0.0576        | 8     | 140000 | 0.0688          | 1.0       | 866.1259      | 0.9881   | 0.9881   | 0.9882 | 0.0    | 0.9881 | 0.9881    |
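
The Data Size column doubles each epoch (0.0078 ≈ 1/128, then 1/64, and so on, reaching 1.0 at epoch 8), and the log ends at the first full-data epoch even though num_epochs was set to 50. The card does not state the mechanism, but the pattern suggests a data-doubling schedule like the sketch below, in which data_fraction is a hypothetical helper.

```python
# Hedged sketch of the apparent data-doubling schedule in the table above;
# the actual sampling code for this run is not published.
def data_fraction(epoch: int, start: float = 1 / 128) -> float:
    """Fraction of the training set used at a given (1-indexed) epoch."""
    return min(1.0, start * 2 ** (epoch - 1))

for epoch in range(1, 9):
    print(epoch, round(data_fraction(epoch), 4))
# 1 0.0078, 2 0.0156, 3 0.0312, 4 0.0625, 5 0.125, 6 0.25, 7 0.5, 8 1.0
```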

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.3.0
  • Tokenizers 0.22.1