Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs
Dicta-LM 3.0 is a powerful open-weight collection of LLMs trained on extensive corpora of Hebrew and English text. The models are available for download and unrestricted use, and they set a new state of the art (SOTA) for Hebrew in their weight class, both as base models and as chat models.
This is the 24-billion-parameter base model with unquantized (BF16) weights, originally initialized from Mistral-Small-3.1-24B-Base-2503.
For full details of this model, please read our release blog post or the technical report.
Note: This is not a chat model; rather, it is a base model that can be further fine-tuned. Chat model variants are available at the link below.
You can view and access the full collection of DictaLM 3.0 variants (base and instruct, quantized and unquantized) here.
Usage
Transformers
from transformers import pipeline
import torch
# This loads the model onto the GPU in bfloat16 precision
model = pipeline('text-generation', 'dicta-il/DictaLM-3.0-24B-Base', torch_dtype=torch.bfloat16, device_map='auto')
# Few-shot prompt: Hebrew verb pairs in past tense ("עבר") and future tense ("עתיד"),
# e.g. "past: I walked / future: I will walk". The model completes the final pair.
prompt = """
עבר: הלכתי
עתיד: אלך
עבר: שמרתי
עתיד: אשמור
עבר: שמעתי
עתיד: אשמע
עבר: הבנתי
עתיד:
"""
print(model(prompt.strip(), do_sample=False, max_new_tokens=8, stop_sequence='\n'))
# [{'generated_text': 'עבר: הלכתי\nעתיד: אלך\n\nעבר: שמרתי\nעתיד: אשמור\n\nעבר: שמעתי\nעתיד: אשמע\n\nעבר: הבנתי\nעתיד: אבין\n\nעבר: קרא'}]
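For finer-grained control than the pipeline offers, the model can also be loaded with the lower-level Transformers API. A minimal sketch (the checkpoint name is the one above; everything else is standard Transformers usage rather than anything model-specific):
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained('dicta-il/DictaLM-3.0-24B-Base')
model = AutoModelForCausalLM.from_pretrained(
    'dicta-il/DictaLM-3.0-24B-Base',
    torch_dtype=torch.bfloat16,
    device_map='auto',
)

# Same few-shot pattern as above (past-tense verb -> future-tense verb)
prompt = 'עבר: הלכתי\nעתיד: אלך\n\nעבר: הבנתי\nעתיד:'

inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))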
vLLM
vllm serve dicta-il/DictaLM-3.0-24B-Base
If you run out of memory, you can try limiting the context window by setting
--max-model-len 8192
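Once the server is running, it exposes an OpenAI-compatible API (by default on port 8000). Since this is a base model and not a chat model, query the plain completions endpoint rather than the chat endpoint. A minimal sketch using the requests library; the endpoint and fields are standard vLLM/OpenAI completions parameters:
import requests

# Same few-shot prompt pattern as in the Transformers example above
prompt = 'עבר: הלכתי\nעתיד: אלך\n\nעבר: הבנתי\nעתיד:'

response = requests.post(
    'http://localhost:8000/v1/completions',
    json={
        'model': 'dicta-il/DictaLM-3.0-24B-Base',
        'prompt': prompt,
        'max_tokens': 8,
        'temperature': 0,  # greedy decoding, matching do_sample=False above
        'stop': '\n',      # stop at the first newline
    },
)
print(response.json()['choices'][0]['text'])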
Notice
DictaLM-3.0-24B-Base is a pretrained base model and therefore does not include any moderation mechanisms.
Citation
If you use this model, please cite:
@article{Shmidman2025DictaLM3,
  title={{Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs}},
  author={Shaltiel Shmidman and Avi Shmidman and Amir DN Cohen and Moshe Koppel},
  year={2025},
  publisher={{DICTA / Jerusalem, Israel}},
  note={https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf}
}
