Update README.md
README.md
CHANGED
@@ -30,6 +30,18 @@ pipeline_tag: text-generation
# SpydazWeb AGI
(trained with heads): updated codebase to the Mistral-Nemo codebase (perhaps I will make the Nemo today!)

+## Training Note :
+This is the base FP16 model! It was very hard to get out: I had to use transformers only and NOT Unsloth.
+To train with transformers only, the model needs to be on an A100, as it takes a very large amount of memory (apparently because it is a special model).
+I did manage to load and train the model with Unsloth, but the LoRA did not merge into the FP16 model.
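
The merge step mentioned above can, in principle, be done with transformers and PEFT alone. The following is a minimal sketch, assuming the LoRA adapter was saved in PEFT format; it is not necessarily the procedure used for this model. The function name `merge_lora_to_fp16` and its arguments (`base_id`, `adapter_dir`, `out_dir`, `trust_remote_code`) are hypothetical and do not come from the original note.

```python
def merge_lora_to_fp16(base_id, adapter_dir, out_dir, trust_remote_code=False):
    """Load a base model in FP16 with transformers, merge a PEFT LoRA adapter, save it.

    All argument names are illustrative assumptions, not taken from the README.
    """
    # Heavy dependencies are imported lazily, so they are only needed
    # when the merge actually runs.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the FP16 base weights with plain transformers (no Unsloth).
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.float16,
        # Only enable this if the repository ships its own modelling code.
        trust_remote_code=trust_remote_code,
    )

    # Attach the trained LoRA adapter to the base model.
    model = PeftModel.from_pretrained(base, adapter_dir)

    # Fold the LoRA deltas into the base weights and drop the adapter wrappers.
    merged = model.merge_and_unload()

    # Save the merged FP16 model and the tokenizer side by side.
    merged.save_pretrained(out_dir, safe_serialization=True)
    tokenizer = AutoTokenizer.from_pretrained(
        base_id, trust_remote_code=trust_remote_code
    )
    tokenizer.save_pretrained(out_dir)
    return merged
```

A call would look like `merge_lora_to_fp16("path/to/base", "path/to/adapter", "path/to/merged")`, where all three paths are placeholders. Loading in FP16 rather than 4-bit is what makes the memory requirement high, which fits the A100 remark above.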
+### REASON :
+Unsloth issues: they load their own model if you are loading a 16-bit model,
+and train the LoRA expecting you to merge it. But if you use a 4-bit model, then Unsloth loads your exact model!?
+You think... WHAT? They download your own tensors but use another model?
+Yes: they have a Mistral modelling file of their own, which is much simpler and more lightweight than the transformers one, so your customizations do not get loaded.
+So this file will have to be adjusted for me to fully train each head intensely and test the outcomes correctly... but it is working fine!
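
One way to get a hint about which modelling code was actually loaded is to ask Python where the model's class and its `forward` method were defined. The sketch below is a generic diagnostic and makes no claim about how Unsloth works internally; the helper name `describe_model_code` is hypothetical. It is a hint rather than proof, since a wrapper can copy the `__module__` of the function it replaces.

```python
def describe_model_code(model):
    """Report which Python module defines the model's class and its forward method."""
    cls = type(model)
    forward = getattr(cls, "forward", None)
    return {
        # Fully qualified name of the class that was instantiated.
        "class": f"{cls.__module__}.{cls.__qualname__}",
        # A replaced forward usually reports the module it was defined in.
        "forward_defined_in": forward.__module__ if forward is not None else None,
    }
```

Calling `describe_model_code(model)` on the same checkpoint loaded through each library, and comparing the two results, shows whether the custom modelling file or a different implementation is in use.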
+
+
## SpydazWeb AI model :

This model is based on the world's archive of knowledge, maintaining historical documents and providing services for the survivors of mankind,