Fuyuhana-30B-VL


Introduction

Fuyuhana-30B-VL is a fine-tuned version of [Qwen3-VL-30B-A3B-Instruct]. Building on the original model's strong multimodal capabilities, we further trained and optimized it with a focus on Human Preference Alignment.

This model aims to provide an interactive experience better suited to the Chinese-community context while maintaining high inference efficiency, making it well suited to private deployment by individual developers and small-to-medium enterprises.

Key Features

  1. Deep Chinese Community Alignment: Compared to the base model, Fuyuhana-30B-VL has been strengthened on Chinese internet subculture, slang, and community-specific contexts. It more accurately understands and generates content in the style of platforms such as Baidu Tieba and Xiaohongshu (Little Red Book), picking up on memes and emotional tone for more natural, down-to-earth conversations.

  2. Efficient Edge/Private Deployment: Thanks to its MoE (Mixture of Experts) architecture, only a small fraction of the model's 30B total parameters (roughly 3B, per the A3B designation in the base model's name) is active per token during inference. Combined with the capable Qwen vision encoder, this makes it a good fit for self-hosted multimodal chatbots, delivering a smooth experience even on consumer-grade GPUs.

  3. Performance Improvement: On the Arena-Hard-Auto-V2 benchmark, the model not only retains the base model's strong capabilities but also surpasses its overall score.
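As a rough illustration of the deployment claim above, here is a back-of-envelope sketch of the weight memory footprint, assuming unquantized BF16 weights, the 31B parameter count listed in the Safetensors metadata, and the 4-way tensor-parallel launch from the Quick Start section (KV cache and activations are extra and not estimated here):

```python
# Back-of-envelope weight-memory estimate (assumptions: dense BF16 weights,
# no quantization; parameter count taken from the model card's metadata).
total_params = 31e9        # 31B params per the Safetensors metadata
bytes_per_param = 2        # BF16 = 2 bytes per parameter

weight_gb = total_params * bytes_per_param / 1e9   # total weight footprint in GB
per_gpu_gb = weight_gb / 4                         # --tensor-parallel-size 4

print(round(weight_gb), per_gpu_gb)  # → 62 15.5
```

So with 4-way tensor parallelism each GPU holds roughly 15.5 GB of weights, which is why a multi-GPU consumer setup is feasible; note that inference compute per token is governed by the much smaller active-parameter count, not the full 31B.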

Benchmark Results (Arena-Hard-Auto-V2)

On the latest Arena-Hard-Auto-V2 leaderboard, Fuyuhana-30B-VL achieved a score of 47.1%.

Rank  Model                       Score (%)  CI (%)
10    Qwen3-235B-A22B             48.6       (-2.1 / +2.1)
11    Fuyuhana-30B-VL             47.1       (-1.8 / +1.9)
12    Qwen3-VL-30B-A3B-Instruct   46.8       (-1.8 / +1.5)
13    gpt-4.5-preview             41.5       (-2.3 / +2.4)

Quick Start (vLLM Example)

CUDA_VISIBLE_DEVICES=0,1,2,3 VLLM_USE_V1=1 vllm serve \
    flymyd/Fuyuhana-30B-VL \
    --served-model-name Fuyuhana-30B-VL \
    --tensor-parallel-size 4 \
    --gpu-memory-utilization 0.9 \
    --trust-remote-code \
    --disable-log-requests \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --host 0.0.0.0 \
    --port 9997
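Once the server is up, it exposes the standard OpenAI-compatible Chat Completions API that vLLM serves. The sketch below builds a multimodal request payload for it; the endpoint path and message shape follow the OpenAI API convention, while the image URL is a placeholder you would replace with your own:

```python
import json

# Endpoint for the vLLM server launched above (host 0.0.0.0, port 9997).
BASE_URL = "http://localhost:9997/v1/chat/completions"

payload = {
    "model": "Fuyuhana-30B-VL",  # must match --served-model-name
    "messages": [
        {
            "role": "user",
            "content": [
                # Placeholder image URL; swap in a real one.
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/cat.jpg"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
    "temperature": 0.7,
}

# To actually send the request (requires the server to be running):
#   import urllib.request
#   req = urllib.request.Request(
#       BASE_URL, data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())

print(json.dumps(payload, indent=2))
```

Because the interface is OpenAI-compatible, any OpenAI SDK client pointed at `http://localhost:9997/v1` should work the same way.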

Contact

If you have any questions or suggestions, please feel free to reach out by email:

📧 flymyd@foxmail.com

Model size: 31B params · Tensor type: BF16 (Safetensors)