update readme
README.md CHANGED
@@ -17,7 +17,7 @@ tags:
We are thrilled to announce the alpha release of Flux.1 Lite, an 8B parameter transformer model distilled from the FLUX.1-dev model.
-Our goal? To distill FLUX.1-dev into a lighter model, reducing the parameters to
+Our goal? To distill FLUX.1-dev further, reducing its memory footprint to just 24 GB so it can run smoothly on most consumer-grade GPU cards, making high-quality AI models accessible to everyone.
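
For context on how a release like this is typically exercised, here is a minimal, hypothetical sketch using Hugging Face `diffusers`' `FluxPipeline`. The repo id, prompt, and sampler settings are illustrative assumptions, not values confirmed by this README.

```python
# Hypothetical usage sketch for a distilled Flux-style checkpoint.
# The repo id and sampling settings are assumptions, not values taken
# from this README; adjust them to match the published checkpoint.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "Freepik/flux.1-lite-8B-alpha",  # assumed checkpoint id
    torch_dtype=torch.bfloat16,      # same precision as FLUX.1-dev
).to("cuda")

image = pipe(
    prompt="A close-up photo of a chameleon on a leaf",
    guidance_scale=3.5,        # illustrative value
    num_inference_steps=28,    # illustrative value
    height=1024,
    width=1024,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("flux_lite_sample.png")
```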
@@ -25,7 +25,7 @@ Our goal? To distill FLUX.1-dev into a lighter model, reducing the parameters to
As other members of the community, such as [Ostris](https://ostris.com/2024/09/07/skipping-flux-1-dev-blocks/), have noted, the blocks of the FLUX.1-dev transformer appear to contribute unevenly to the final image generation. To explore this, we analyzed the Mean Squared Error (MSE) between the input and output of each block, revealing significant variability.
-Our findings? Not all blocks
+Our findings? Not all blocks contribute equally. The results are striking: skipping just one of the early MMDiT blocks can significantly degrade model performance, whereas skipping any of the remaining blocks has no significant impact on final image quality.
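
The per-block analysis above can be reproduced with a simple probe: register a forward hook on every transformer block and record the MSE between the hidden states entering and leaving it. A rough sketch, assuming blocks take the hidden-state tensor as their first positional argument and that tuple outputs carry a tensor of the same shape (neither is confirmed by this README):

```python
# Sketch of a per-block MSE probe. Attribute names and the block calling
# convention below are assumptions about the transformer implementation.
import torch
import torch.nn.functional as F

def attach_mse_probes(blocks):
    """Register hooks recording MSE(block input, block output) per block."""
    records = {i: [] for i in range(len(blocks))}

    def make_hook(idx):
        def hook(module, inputs, output):
            if not inputs:  # block may have been called with kwargs only
                return
            x_in = inputs[0]
            # Some blocks return tuples; pick a tensor matching the input.
            candidates = output if isinstance(output, tuple) else (output,)
            for x_out in candidates:
                if torch.is_tensor(x_out) and x_out.shape == x_in.shape:
                    records[idx].append(F.mse_loss(x_out, x_in).item())
                    break
        return hook

    handles = [b.register_forward_hook(make_hook(i)) for i, b in enumerate(blocks)]
    return records, handles

# Usage sketch (the attribute path is an assumption):
# records, handles = attach_mse_probes(pipe.transformer.transformer_blocks)
# _ = pipe(prompt="probe", num_inference_steps=4)
# for i, vals in sorted(records.items()):
#     print(f"block {i:02d}: mean MSE {sum(vals) / max(len(vals), 1):.5f}")
# for h in handles:
#     h.remove()
```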
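The skipping experiment itself can be sketched the same way: temporarily replace one block with a pass-through module and regenerate. The stand-in below assumes the block's first argument is the hidden-state tensor it should return unchanged; actual MMDiT double-stream blocks return tuples, so they would need a wrapper matching their exact signature.

```python
# Sketch of a block-skipping ablation; assumes blocks live in an
# nn.ModuleList and accept the hidden states as their first argument.
import torch.nn as nn

class SkipBlock(nn.Module):
    """Pass-through stand-in that returns its first input unchanged."""
    def forward(self, hidden_states, *args, **kwargs):
        return hidden_states

def skip_block(blocks: nn.ModuleList, idx: int) -> nn.Module:
    """Swap block `idx` for a pass-through and return the original."""
    original = blocks[idx]
    blocks[idx] = SkipBlock()
    return original

# Usage sketch: skip one block, generate, then restore it.
# original = skip_block(pipe.transformer.single_transformer_blocks, 10)
# image = pipe(prompt="ablation test").images[0]
# pipe.transformer.single_transformer_blocks[10] = original
```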