Add files using upload-large-folder tool
Browse files
README.md
CHANGED
|
@@ -1,11 +1,18 @@
|
|
| 1 |
---
|
| 2 |
tags:
|
| 3 |
- unsloth
|
|
|
|
|
|
|
|
|
|
|
|
|
| 4 |
base_model:
|
| 5 |
- Qwen/Qwen3-8B
|
| 6 |
-
license: apache-2.0
|
| 7 |
---
|
|
|
|
| 8 |
# Qwen3-8B
|
|
|
|
|
|
|
|
|
|
| 9 |
|
| 10 |
## Qwen3 Highlights
|
| 11 |
|
|
@@ -87,21 +94,23 @@ print("thinking content:", thinking_content)
|
|
| 87 |
print("content:", content)
|
| 88 |
```
|
| 89 |
|
| 90 |
-
For deployment, you can use `
|
| 91 |
-
-
|
| 92 |
```shell
|
| 93 |
-
|
| 94 |
```
|
| 95 |
-
-
|
| 96 |
```shell
|
| 97 |
-
|
| 98 |
```
|
| 99 |
|
|
|
|
|
|
|
| 100 |
## Switching Between Thinking and Non-Thinking Mode
|
| 101 |
|
| 102 |
> [!TIP]
|
| 103 |
-
> The `enable_thinking` switch is also available in APIs created by
|
| 104 |
-
> Please refer to our documentation for [
|
| 105 |
|
| 106 |
### `enable_thinking=True`
|
| 107 |
|
|
@@ -199,7 +208,7 @@ if __name__ == "__main__":
|
|
| 199 |
print(f"Bot: {response_3}")
|
| 200 |
```
|
| 201 |
|
| 202 |
-
>
|
| 203 |
> For API compatibility, when `enable_thinking=True`, regardless of whether the user uses `/think` or `/no_think`, the model will always output a block wrapped in `<think>...</think>`. However, the content inside this block may be empty if thinking is disabled.
|
| 204 |
> When `enable_thinking=False`, the soft switches are not valid. Regardless of any `/think` or `/no_think` tags input by the user, the model will not generate think content and will not include a `<think>...</think>` block.
|
| 205 |
|
|
|
|
| 1 |
---
|
| 2 |
tags:
|
| 3 |
- unsloth
|
| 4 |
+
library_name: transformers
|
| 5 |
+
license: apache-2.0
|
| 6 |
+
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
|
| 7 |
+
pipeline_tag: text-generation
|
| 8 |
base_model:
|
| 9 |
- Qwen/Qwen3-8B
|
|
|
|
| 10 |
---
|
| 11 |
+
|
| 12 |
# Qwen3-8B
|
| 13 |
+
<a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
|
| 14 |
+
<img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
|
| 15 |
+
</a>
|
| 16 |
|
| 17 |
## Qwen3 Highlights
|
| 18 |
|
|
|
|
| 94 |
print("content:", content)
|
| 95 |
```
|
| 96 |
|
| 97 |
+
For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.4` or to create an OpenAI-compatible API endpoint:
|
| 98 |
+
- SGLang:
|
| 99 |
```shell
|
| 100 |
+
python -m sglang.launch_server --model-path Qwen/Qwen3-8B --reasoning-parser qwen3
|
| 101 |
```
|
| 102 |
+
- vLLM:
|
| 103 |
```shell
|
| 104 |
+
vllm serve Qwen/Qwen3-8B --enable-reasoning --reasoning-parser deepseek_r1
|
| 105 |
```
|
| 106 |
|
| 107 |
+
For local use, applications such as llama.cpp, Ollama, LMStudio, and MLX-LM have also supported Qwen3.
|
| 108 |
+
|
| 109 |
## Switching Between Thinking and Non-Thinking Mode
|
| 110 |
|
| 111 |
> [!TIP]
|
| 112 |
+
> The `enable_thinking` switch is also available in APIs created by SGLang and vLLM.
|
| 113 |
+
> Please refer to our documentation for [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) and [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) users.
|
| 114 |
|
| 115 |
### `enable_thinking=True`
|
| 116 |
|
|
|
|
| 208 |
print(f"Bot: {response_3}")
|
| 209 |
```
|
| 210 |
|
| 211 |
+
> [!NOTE]
|
| 212 |
> For API compatibility, when `enable_thinking=True`, regardless of whether the user uses `/think` or `/no_think`, the model will always output a block wrapped in `<think>...</think>`. However, the content inside this block may be empty if thinking is disabled.
|
| 213 |
> When `enable_thinking=False`, the soft switches are not valid. Regardless of any `/think` or `/no_think` tags input by the user, the model will not generate think content and will not include a `<think>...</think>` block.
|
| 214 |
|