Update README.md
README.md CHANGED
---
license: creativeml-openrail-m
---
# Image Generation NPU Models – SD 1.5

This repository contains the ONNX models and runtime libraries required to run the image generation pipeline on AMD NPUs.

The folder structure is organized to mirror the main components of the diffusion pipeline (UNet, VAE decoder, text encoder, tokenizer and scheduler), plus the platform-specific runtime libraries.

---

## Repository structure

```
.
├─ libs/
├─ scheduler/
├─ text_encoder/
├─ tokenizer/
├─ unet/
└─ vae_decoder/
```

### `libs/`
This folder contains the dynamic libraries (`.dll`) required at runtime by the NPU backend.
They must be placed in a location where the application can load them (e.g., in the same folder as the executable or in the system `PATH`).
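
For a Python-based integration, a minimal sketch of making these libraries discoverable might look like the one below; the relative `libs` path and the use of `os.add_dll_directory` (Windows-only, Python 3.8+) are assumptions, not part of this package.

```python
# Minimal sketch, assuming a Windows host, Python 3.8+ and the repository root as the working directory.
import os

libs_dir = os.path.abspath("libs")

# Make the runtime DLLs visible to the current process...
os.add_dll_directory(libs_dir)

# ...and to loaders or child processes that only consult PATH.
os.environ["PATH"] = libs_dir + os.pathsep + os.environ.get("PATH", "")
```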

### `unet/`
This folder contains the UNet model used in the diffusion process.
The UNet is exported and structured specifically to leverage the AMD NPU accelerator for the denoising steps.
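
As an illustration of where this model sits in the pipeline, a single denoising call through ONNX Runtime might look like the sketch below. The file name `unet/model.onnx`, the execution providers, the input names (`sample`, `timestep`, `encoder_hidden_states`) and the dtypes are assumptions based on common SD 1.5 ONNX exports; verify them against the actual model, e.g. via `session.get_inputs()`.

```python
# Hypothetical single UNet step; names, shapes and dtypes are assumptions, not this repository's documented interface.
import numpy as np
import onnxruntime as ort

unet = ort.InferenceSession(
    "unet/model.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],  # assumed provider order
)

latents = np.random.randn(1, 4, 64, 64).astype(np.float32)  # 512x512 images use 64x64 latents
timestep = np.array([981], dtype=np.int64)                   # current diffusion timestep; dtype may differ per export
text_emb = np.zeros((1, 77, 768), dtype=np.float32)          # placeholder prompt embeddings

noise_pred = unet.run(
    None,
    {"sample": latents, "timestep": timestep, "encoder_hidden_states": text_emb},
)[0]                                                          # predicted noise, shape (1, 4, 64, 64)
```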

### `vae_decoder/`
This folder contains the VAE decoder model used to map latent representations back to the image space.
The VAE decoder is also structured to make use of the NPU accelerator for efficient image reconstruction.
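
The sketch below shows the decode step under the assumption that the export follows the usual SD 1.5 conventions (an input named `latent_sample`, latents scaled by `1 / 0.18215` before decoding, output roughly in `[-1, 1]`); none of these details are guaranteed by this repository.

```python
# Hypothetical decode of final latents into an image; file name, input name and scaling are assumptions.
import numpy as np
import onnxruntime as ort

vae = ort.InferenceSession("vae_decoder/model.onnx", providers=["CPUExecutionProvider"])

latents = np.random.randn(1, 4, 64, 64).astype(np.float32)     # would normally come from the UNet loop
image = vae.run(None, {"latent_sample": latents / 0.18215})[0]  # (1, 3, 512, 512), roughly in [-1, 1]

# Convert to HWC uint8 for saving or display.
image = ((image[0].transpose(1, 2, 0) + 1.0) / 2.0).clip(0.0, 1.0)
image = (image * 255.0).round().astype(np.uint8)
```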

### `text_encoder/`
This folder contains the text encoder model used to convert the input prompt into conditioning embeddings for the diffusion model.
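
For illustration, turning a prompt into embeddings might look like the sketch below, assuming the files in `tokenizer/` form a standard CLIP tokenizer loadable with `transformers` and that the encoder takes an `input_ids` tensor; the file name and dtype are assumptions.

```python
# Hypothetical prompt -> embeddings; check input names and dtypes against the actual export.
import onnxruntime as ort
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("tokenizer")  # reads the vocabulary files from tokenizer/
text_encoder = ort.InferenceSession("text_encoder/model.onnx", providers=["CPUExecutionProvider"])

ids = tokenizer(
    "a photo of an astronaut riding a horse",
    padding="max_length", max_length=77, truncation=True, return_tensors="np",
).input_ids.astype("int32")                              # some exports expect int64 instead

embeddings = text_encoder.run(None, {"input_ids": ids})[0]  # e.g. (1, 77, 768) for SD 1.5
```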

### `tokenizer/`
This folder contains the tokenizer configuration and vocabulary files required to preprocess the text prompt before it is fed to the text encoder.

### `scheduler/`
This folder contains the scheduler configuration (timesteps, betas, alphas, etc.) used during the diffusion sampling process.
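
If this configuration follows the `diffusers` format (a `scheduler_config.json` inside the folder), it can typically be loaded as sketched below; the concrete scheduler class is an assumption (PNDM is the Stable Diffusion 1.5 default) and the actual class is recorded in the config file.

```python
# Hypothetical scheduler setup; the scheduler class and step count are assumptions.
from diffusers import PNDMScheduler

scheduler = PNDMScheduler.from_pretrained("scheduler")  # reads scheduler/scheduler_config.json
scheduler.set_timesteps(20)                             # choose the number of denoising steps
print(scheduler.timesteps[:5])                          # the timesteps fed to the UNet, in descending order
```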

---

## Release 1113

This release corresponds to the **Procyon Image Generation – 1113 build**.

**Included in this version:**
- Updated UNet and VAE decoder models optimized for AMD NPU execution.
- Synchronized text encoder, tokenizer, and scheduler components aligned with the 1113 pipeline.
- Updated runtime DLLs in the `libs/` folder.
- Improved model folder structure for compatibility with Procyon and NPU execution environments.

**Notes:**
- All ONNX models in this release are validated with the 1113 test package.
- Ensure that the DLLs from `libs/` are correctly placed in the application’s search path.
- This release is intended for NPU execution; GPU versions are hosted separately.

---

## Notes

- UNet and VAE decoder models are optimized and structured to run on AMD NPUs.
- The other components (text encoder, tokenizer and scheduler) are shared between GPU and NPU pipelines, but are provided here for completeness.
- Please refer to the associated application or benchmark documentation for detailed integration and usage instructions (e.g., how to set model paths, environment variables and library search paths).

---