OCR
PaddlePaddle/PP-OCRv5_server_rec • Image-to-Text • Updated Jul 22, 2025 • 68.3k • 19
ibm-granite/granite-docling-258M • Image-Text-to-Text • Updated Sep 23, 2025 • 187k • 1.13k
text<->image
De-Diffusion Makes Text a Strong Cross-Modal Interface • Paper • 2311.00618 • Published Nov 1, 2023 • 23
Machine Translation
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models • Paper • 2309.11674 • Published Sep 20, 2023 • 33
tencent/Hunyuan-MT-7B • Translation • 8B • Updated Dec 30, 2025 • 11.1k • 550
ASR
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition • Paper • 2309.15223 • Published Sep 26, 2023 • 23
utter-project/mHuBERT-147 • Feature Extraction • 94.4M • Updated Dec 19, 2024 • 26.7k • 98
Multimodal
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents • Paper • 2311.05437 • Published Nov 9, 2023 • 51