---
license: creativeml-openrail-m
---
# Image Generation NPU Models – SD 1.5

This repository contains the ONNX models and runtime libraries required to run the Stable Diffusion 1.5 (SD 1.5) image generation pipeline on AMD NPUs.

The folder structure mirrors the main components of the diffusion pipeline (UNet, VAE decoder, text encoder, tokenizer, and scheduler), plus the platform-specific runtime libraries.

---

## Repository structure

```
.
├─ libs/
├─ scheduler/
├─ text_encoder/
├─ tokenizer/
├─ unet/
└─ vae_decoder/
```
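
As a quick sanity check after downloading, the layout above can be verified with a few lines of Python. This is a minimal sketch: the folder names come from the tree above, while the helper name `missing_components` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Component folders taken from the repository tree above.
EXPECTED_DIRS = ("libs", "scheduler", "text_encoder", "tokenizer", "unet", "vae_decoder")


def missing_components(repo_root):
    """Return the expected component folders that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "." assumes the script is run from the repository root.
    missing = missing_components(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All component folders found.")
```

The check only confirms that the folders exist; it does not validate the files inside them.
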

### `libs/`
This folder contains the dynamic libraries (`.dll`) required at runtime by the NPU backend.
They must be placed where the application can load them (e.g., in the same folder as the executable or in a directory listed in the system `PATH`).
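
If the host application is a Python process, one possible way to make the DLLs loadable is sketched below. The function name `register_libs_dir` is hypothetical, and the approach is an assumption about the host application rather than a documented requirement of this release. It prepends the folder to `PATH` and, on Windows with Python 3.8 or later, also calls `os.add_dll_directory`, because `PATH` is not consulted there when resolving dependencies of extension modules.

```python
import os
from pathlib import Path


def register_libs_dir(libs_dir):
    """Make the DLLs in libs_dir discoverable by the current process."""
    libs_path = Path(libs_dir).resolve()
    if not libs_path.is_dir():
        raise FileNotFoundError(f"libs folder not found: {libs_path}")

    # Prepend to PATH so native code and child processes also search this folder.
    os.environ["PATH"] = str(libs_path) + os.pathsep + os.environ.get("PATH", "")

    # os.add_dll_directory exists only on Windows (Python 3.8+); skip it elsewhere.
    if hasattr(os, "add_dll_directory"):
        os.add_dll_directory(str(libs_path))
    return libs_path
```

Call `register_libs_dir("libs")` before importing or creating anything that loads the NPU backend. Copying the DLLs next to the executable, as described above, avoids the need for this step.
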

### `unet/`
This folder contains the UNet model used in the diffusion process.
The UNet is exported and structured so that the denoising steps run on the AMD NPU.

### `vae_decoder/`
This folder contains the VAE decoder model, which maps latent representations back to image space.
The VAE decoder is also structured to run on the NPU for efficient image reconstruction.

### `text_encoder/`
This folder contains the text encoder model, which converts the input prompt into conditioning embeddings for the diffusion model.

### `tokenizer/`
This folder contains the tokenizer configuration and vocabulary files required to preprocess the text prompt before it is fed to the text encoder.

### `scheduler/`
This folder contains the scheduler configuration (timesteps, betas, alphas, etc.) used during the diffusion sampling process.
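
To illustrate how betas and alphas relate, the sketch below builds a "scaled linear" beta schedule and the cumulative product of alphas. The default values (`1000` training timesteps, `beta_start=0.00085`, `beta_end=0.012`) are the ones commonly used with SD 1.5 and are an assumption here; the authoritative values are whatever the configuration in `scheduler/` specifies. The function names are hypothetical.

```python
import math


def scaled_linear_betas(num_train_timesteps=1000, beta_start=0.00085, beta_end=0.012):
    """Betas spaced linearly in sqrt-space, then squared ("scaled linear")."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


def cumulative_alphas(betas):
    """With alpha_t = 1 - beta_t, return the running product of the alphas."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


# Assumed defaults; replace them with the values from the scheduler config.
betas = scaled_linear_betas()
alphas_cumprod = cumulative_alphas(betas)
print(len(betas), betas[0], betas[-1], alphas_cumprod[-1])
```

The cumulative product shrinks toward zero as the timestep grows, which corresponds to the latent being almost pure noise at the start of sampling.
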

---

## Release 1113

This release corresponds to the **1113 build**.

**Included in this version:**
- Updated UNet and VAE decoder models optimized for AMD NPU execution.
- Text encoder, tokenizer, and scheduler components synchronized with the 1113 pipeline.
- Updated runtime DLLs in the `libs/` folder.
- Improved model folder structure for compatibility with Procyon and NPU execution environments.

**Notes:**
- All ONNX models in this release are validated with the 1113 test package.
- Ensure that the DLLs from `libs/` are on the application’s library search path.
- This release is intended for NPU execution; GPU versions are hosted separately.

---

## Notes

- The UNet and VAE decoder models are optimized and structured to run on AMD NPUs.
- The other components (text encoder, tokenizer, and scheduler) are shared between the GPU and NPU pipelines and are provided here for completeness.
- Refer to the associated application or benchmark documentation for detailed integration and usage instructions (e.g., how to set model paths, environment variables, and library search paths).

---