Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model
This repository contains the official implementation of Kaleido, proposed in our paper "Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model" (arXiv:2510.18573).
Use the following commands to download the model weights. For convenience, the Wan VAE and T5 modules are bundled into this checkpoint.
# Download the repository (skip automatic LFS file downloads)
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/Crilias/Kaleido-14B-S2V
# Enter the repository folder
cd Kaleido-14B-S2V
# Merge the checkpoint files
python merge_kaleido.py
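Because the clone above sets GIT_LFS_SKIP_SMUDGE=1, large files tracked by Git LFS are checked out as small text pointer stubs until their real contents are fetched. The sketch below is not part of the Kaleido repository; it is a hypothetical helper (the names `is_lfs_pointer` and `find_lfs_pointers` are our own) that lists files which still look like LFS pointers. It assumes only the standard Git LFS pointer format, whose first line begins with `version https://git-lfs.github.com/spec/v1`.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line (Git LFS pointer spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root="."):
    """List files under root (skipping .git) that are still LFS pointer stubs."""
    root = Path(root)
    pointers = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and ".git" not in rel.parts and is_lfs_pointer(path):
            pointers.append(rel.as_posix())
    return sorted(pointers)


if __name__ == "__main__":
    # Hypothetical usage: run from inside the cloned Kaleido-14B-S2V folder.
    for name in find_lfs_pointers("."):
        print("still an LFS pointer:", name)
```

If the script prints any file names, those files have not been downloaded yet, so their contents are placeholders rather than weights.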
Arrange the model files into the following structure:
.
├── Kaleido-14B-S2V
│   ├── model
│   │   └── ....
│   ├── Wan2.1_VAE.pth
│   │
│   └── umt5-xxl
│       └── ....
├── configs
├── sat
└── sgm
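The layout above can be checked before running anything else. The following is a minimal sketch, not code from the Kaleido repository: the helper name `missing_entries` and the `EXPECTED_ENTRIES` list are our own, and the list simply mirrors the entries named in the tree. The elided `....` contents of `model` and `umt5-xxl` are not checked.

```python
from pathlib import Path

# Paths taken from the directory tree in this README, relative to the project root.
EXPECTED_ENTRIES = [
    "Kaleido-14B-S2V/model",
    "Kaleido-14B-S2V/Wan2.1_VAE.pth",
    "Kaleido-14B-S2V/umt5-xxl",
    "configs",
    "sat",
    "sgm",
]


def missing_entries(root="."):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing entries:")
        for entry in missing:
            print("  " + entry)
    else:
        print("All expected files and folders are present.")
```

Run it from the directory that contains `Kaleido-14B-S2V`, `configs`, `sat`, and `sgm`; an empty result means the top-level structure matches.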
If you find our work helpful, please cite our paper:
@misc{zhang2025kaleidoopensourcedmultisubjectreference,
title={Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model},
author={Zhenxing Zhang and Jiayan Teng and Zhuoyi Yang and Tiankun Cao and Cheng Wang and Xiaotao Gu and Jie Tang and Dan Guo and Meng Wang},
year={2025},
eprint={2510.18573},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2510.18573},
}
Base model: Wan-AI/Wan2.1-T2V-14B