# Deployment Guide for Wan2.2 on Hugging Face Spaces

This guide explains how to deploy the Wan2.2 video generation model to Hugging Face Spaces with Zero GPU support.

## Prerequisites

1. A Hugging Face account (create one at https://huggingface.co/join)
2. Git installed on your local machine
3. Git LFS (Large File Storage) installed

## Deployment Steps

### Option 1: Deploy via Hugging Face Web Interface

1. **Create a New Space**
   - Go to https://huggingface.co/new-space
   - Choose a name for your Space (e.g., "wan2-video-gen")
   - Select "Gradio" as the SDK
   - Choose "Public" or "Private" visibility
   - Click "Create Space"

2. **Upload Files**
   - Use the web interface to upload these files:
     - `app.py`
     - `requirements.txt`
     - `README.md`
     - `.gitignore`

3. **Enable Zero GPU**
   - In your Space settings, enable "Zero GPU"
   - This provides automatic GPU allocation during inference

4. **Wait for Build**
   - Hugging Face will automatically build your Space
   - The first build may take 10-15 minutes
   - Check the build logs for any errors

### Option 2: Deploy via Git (Recommended)

1. **Clone Your Space**

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   ```

2. **Copy Files**

   ```bash
   # Copy all files from the huggingface-wan2.2 directory
   cp /path/to/huggingface-wan2.2/* .
   ```

3. **Commit and Push**

   ```bash
   git add .
   git commit -m "Initial deployment of Wan2.2 video generation"
   git push
   ```

4. **Enable Zero GPU**
   - Go to your Space settings on Hugging Face
   - Navigate to "Settings" → "Zero GPU"
   - Enable Zero GPU support

### Option 3: Deploy from This Repository

If you've already cloned this repository:

```bash
cd /home/user/Kakka/huggingface-wan2.2

# Initialize git if not already done
git init

# Add the Hugging Face Space as a remote
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME

# Commit files
git add .
git commit -m "Initial deployment of Wan2.2 video generation"

# Push to Hugging Face
git push hf main
```

## Configuration

### Zero GPU Settings

The app is configured to use Zero GPU with the following settings:

- **Duration**: 180 seconds (3 minutes) per generation
- **Allocation**: Automatic (triggered by each generation request)
- **Optimized defaults**: Reduced frames (73) and steps (35) to fit within the time limit

This is configured in `app.py` with the decorator:

```python
@spaces.GPU(duration=180)  # 3 minutes max for Pro accounts
```

**Important**: Even with a Pro subscription, the maximum GPU duration is 180 seconds (3 minutes). The default settings have been tuned so generation completes within that window:

- Default frames: 73 (3 seconds of video at 24fps)
- Default inference steps: 35 (balanced speed/quality)
- Maximum frames slider: 145 (6 seconds)
- Maximum inference steps: 60
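For orientation, here is a minimal sketch of how a generation function decorated this way might be structured. The model ID, function name, defaults, and loading details below are illustrative assumptions based on the settings above, not a verbatim copy of `app.py`:

```python
import spaces  # must come first; see the import-order note in Troubleshooting

import torch
from diffusers import WanPipeline, AutoencoderKLWan
from diffusers.utils import export_to_video

# Hypothetical model ID for illustration; adjust to the variant you deploy
MODEL_ID = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"

# Load once at startup; with Zero GPU, .to("cuda") is safe here because the
# spaces package defers actual GPU allocation until a decorated call runs
vae = AutoencoderKLWan.from_pretrained(MODEL_ID, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(MODEL_ID, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

@spaces.GPU(duration=180)  # request up to 180s of GPU time per generation
def generate(prompt: str, num_frames: int = 73, num_inference_steps: int = 35) -> str:
    # Defaults mirror the optimized settings above (73 frames, 35 steps)
    frames = pipe(
        prompt=prompt,
        num_frames=num_frames,
        num_inference_steps=num_inference_steps,
    ).frames[0]
    # Write the frames to an MP4 and return its path
    return export_to_video(frames, "output.mp4", fps=24)
```

Keeping the heavy model load outside the decorated function means it runs once at startup rather than on every request, so the 180-second budget is spent on inference alone.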
### Memory Requirements

The Wan2.2-TI2V-5B model requires:

- **Minimum**: 24GB VRAM
- **Recommended**: 40GB+ VRAM

Zero GPU on Hugging Face Spaces provides sufficient VRAM for this model (an H200 GPU with 70GB).

## Testing Your Deployment

1. **Wait for the Build to Complete**
   - Check the build logs in your Space
   - Wait for "Running" status

2. **Test Basic Generation**
   - Try the default example: "Two anthropomorphic cats in comfy boxing gear fight on stage"
   - Generation should take about 2-3 minutes with the default settings

3. **Test Image-to-Video**
   - Upload a test image
   - Add a descriptive prompt
   - Verify video generation works

## Troubleshooting

### Critical: Import Order Issue

**Issue**: `RuntimeError: CUDA has been initialized before importing the 'spaces' package`

**Solution**: This is critical! The `spaces` package MUST be imported BEFORE any CUDA-related packages (torch, diffusers, etc.).

**Correct import order in `app.py`:**

```python
# IMPORTANT: spaces must be imported first
import spaces

# Standard library imports
import os

# Third-party imports (non-CUDA)
import numpy as np
from PIL import Image
import gradio as gr

# CUDA-related imports (must come after spaces)
import torch
from diffusers import WanPipeline, AutoencoderKLWan
```

**Why this matters**: Hugging Face Zero GPU needs to manage CUDA initialization itself. If torch or another CUDA library initializes CUDA before `spaces` is imported, Zero GPU cannot properly manage GPU allocation.

### Build Fails

**Issue**: Requirements installation fails
- **Solution**: Check `requirements.txt` for compatibility issues
  - Ensure the PyTorch version is compatible with CUDA on Zero GPU
  - Make sure you're using the latest Gradio version (5.49.0+) for security fixes

**Issue**: Out of memory during build
- **Solution**: Zero GPU should have enough memory; check the model-loading code

**Issue**: "Can't initialize NVML" warnings
- **Solution**: These are normal in the Zero GPU environment at build time
  - They should not affect runtime once a GPU is allocated

### Runtime Errors

**Issue**: "CUDA out of memory"
- **Solution**: Reduce `num_frames` or the image resolution
  - Check that Zero GPU is properly enabled in your Space settings

**Issue**: "Model not found"
- **Solution**: Verify the Space has an internet connection for the model download
  - Check Hugging Face Hub status

**Issue**: Generation timeout
- **Solution**: Reduce inference steps or video length
  - Increase the GPU duration in `@spaces.GPU(duration=XX)` (within your tier's limit)

**Issue**: Gradio security vulnerability warning
- **Solution**: Update to Gradio 5.49.0 or later in `requirements.txt`
  - Check that the README.md YAML front matter has the correct `sdk_version: 5.49.0`

**Issue**: "ZeroGPU illegal duration! The requested GPU duration (Xs) is larger than the maximum allowed"
- **Solution**: Reduce the duration parameter in `@spaces.GPU(duration=XX)`
  - For Pro accounts, use 180 seconds or less: `@spaces.GPU(duration=180)`
  - The free tier is typically limited to 60 seconds
  - Optimize your default settings to complete within the time limit:
    - Reduce `num_frames` (e.g., 73 for 3 seconds instead of 121 for 5 seconds)
    - Reduce `num_inference_steps` (e.g., 35 instead of 50)

### Slow Generation

**Issue**: Generation takes too long
- **Solution**: This is expected; video generation is compute-intensive
  - Typical time: 2-3 minutes for a 3-second video with the optimized settings (73 frames, 35 steps)
  - Consider reducing `num_inference_steps` to 25-30 for faster (but lower-quality) results
  - Note: Generation must complete within 180 seconds (3 minutes) on Pro, or 60 seconds on the free tier

## Optimization Tips

1. **Current Optimized Settings**
   - Already optimized: `num_frames=73` (3 seconds) and `num_inference_steps=35`
   - These settings are designed to complete within the 180-second Zero GPU limit
   - For even faster testing, reduce steps to 25-30

2. **Add Caching (Optional)**
   - Enable example caching with `cache_examples=True` to pre-generate examples
   - Note: This increases build time and storage requirements
   - Current setting: `cache_examples=False` for faster builds

3. **Queue Management**
   - Current setting: `demo.queue(max_size=20)`
   - Adjust based on expected traffic
   - A larger queue serves more concurrent users but uses more resources (see the sketch after this list)
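As a rough illustration of where the queue setting sits, here is a sketch of the UI wiring, reusing the hypothetical `generate` function from the Configuration section above. The component names and layout are assumptions for illustration, and the slider ranges simply follow the limits listed earlier; the actual `app.py` may differ:

```python
import gradio as gr

with gr.Blocks() as demo:
    prompt = gr.Textbox(label="Prompt")
    num_frames = gr.Slider(1, 145, value=73, step=1, label="Frames (24fps)")
    steps = gr.Slider(1, 60, value=35, step=1, label="Inference steps")
    video = gr.Video(label="Generated video")
    gr.Button("Generate").click(
        generate, inputs=[prompt, num_frames, steps], outputs=video
    )

# Cap pending requests at 20; a larger max_size serves more waiting users
# at the cost of longer waits and higher resource usage
demo.queue(max_size=20)
demo.launch()
```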
## Customization

### Change Default Model

To use a different Wan2.2 variant, modify `app.py`:

```python
# For a larger model with better quality
MODEL_ID = "Wan-AI/Wan2.2-T2V-A14B-Diffusers"

# For image-to-video focused generation
MODEL_ID = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"
```

### Adjust UI

Modify the Gradio interface in `app.py`:

- Change default values in sliders
- Add more examples
- Customize theme and styling

### Add Features

Consider adding:

- Video upscaling
- Multiple video outputs
- Batch generation
- Download history
- Custom aspect ratios

## Monitoring

### Check Space Status

- Visit your Space URL
- Check "Settings" → "Logs" for runtime logs
- Monitor usage in "Settings" → "Analytics"

### Usage Limits

Zero GPU on Hugging Face has:

- Time limits per session
- Concurrent user limits
- Monthly compute quotas (check your tier)

## Support

If you encounter issues:

1. **Check Logs**: Space logs often contain error details
2. **Hugging Face Forums**: https://discuss.huggingface.co/
3. **Model Issues**: Report on the Wan-AI GitHub repository or model card
4. **Space Settings**: Verify Zero GPU is enabled and quota is available

## License

This deployment uses:

- Wan2.2 model (Apache 2.0)
- Gradio (Apache 2.0)
- Diffusers (Apache 2.0)

Ensure compliance with all licenses when deploying.

---

**Happy Deploying!** 🚀