---
title: Wan2.2 Video Generation
emoji: 🎥
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.49.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - video-generation
  - text-to-video
  - image-to-video
  - diffusers
  - wan
  - ai-video
  - zero-gpu
python_version: '3.10'
---

Wan2.2 Video Generation πŸŽ₯

Generate high-quality videos from text prompts or images using the powerful Wan2.2-TI2V-5B model!

This Space provides an easy-to-use interface for creating videos with state-of-the-art AI technology.

Features ✨

  • Text-to-Video: Generate videos from descriptive text prompts
  • Image-to-Video: Animate your images by adding an input image
  • High Quality: 720P resolution at 24fps
  • Customizable: Adjust resolution, number of frames, guidance scale, and more
  • Reproducible: Use seeds to recreate your favorite generations

Model Information πŸ€–

Wan2.2-TI2V-5B is a unified text-to-video and image-to-video generation model with:

  • 5 billion parameters optimized for consumer-grade GPUs
  • 720P resolution support (1280x704 default)
  • 24 fps smooth video output
  • Duration: 3 seconds by default (chosen to fit Zero GPU time limits)

The Wan2.2 family uses a Mixture-of-Experts (MoE) architecture in its larger variants and delivers outstanding video generation quality, surpassing many commercial models.
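
As a point of reference, loading this checkpoint with Diffusers typically looks like the sketch below. This is an illustrative, untested snippet: the `WanPipeline` class name follows the Diffusers Wan integration, and on Zero GPU the device placement would happen inside the GPU-decorated handler rather than at import time.

```python
# Illustrative sketch: loading Wan2.2-TI2V-5B via Hugging Face Diffusers.
# Verify the pipeline class against your installed diffusers version.
MODEL_ID = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"

def load_pipeline():
    import torch
    from diffusers import WanPipeline  # Diffusers' Wan video pipeline

    # bfloat16 matches the precision listed under Technical Details
    pipe = WanPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    pipe.to("cuda")  # on Zero GPU, do this inside the @spaces.GPU function
    return pipe
```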

How to Use πŸš€

Text-to-Video Generation

  1. Enter your prompt describing the video you want to create
  2. Adjust settings in "Advanced Settings" if desired
  3. Click "Generate Video"
  4. Wait for generation (typically 2-3 minutes on Zero GPU with default settings)
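
The four steps above map onto a single pipeline call. A minimal sketch, assuming a `pipe` loaded from the Diffusers checkpoint named under Technical Details (argument names follow the standard Diffusers video-pipeline interface):

```python
# Illustrative text-to-video call using this README's default settings.
def generate_t2v(pipe, prompt, seed=42, out_path="output.mp4"):
    import torch
    from diffusers.utils import export_to_video

    generator = torch.Generator(device="cuda").manual_seed(seed)
    result = pipe(
        prompt=prompt,
        width=1280,              # default resolution (see Advanced Settings)
        height=704,
        num_frames=73,           # ~3 seconds at 24 fps
        num_inference_steps=35,  # speed-optimized default
        guidance_scale=5.0,
        generator=generator,     # fixed seed -> reproducible output
    )
    export_to_video(result.frames[0], out_path, fps=24)
    return out_path
```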

Image-to-Video Generation

  1. Upload an input image
  2. Enter a prompt describing how the image should animate
  3. Click "Generate Video"
  4. The output will maintain the aspect ratio of your input image
  5. Generation takes 2-3 minutes with optimized settings
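
The aspect-ratio behavior in step 4 can be sketched as follows. `fit_resolution` and its round-to-16 rule are assumptions (video diffusion models generally want dimensions divisible by a fixed multiple), and the image-to-video call mirrors the text-to-video one with an added `image` argument, as in the Diffusers Wan image-to-video interface:

```python
# Illustrative image-to-video sketch; fit_resolution is a hypothetical
# helper that keeps the input image's aspect ratio at ~1280x704 area.
def fit_resolution(img_w, img_h, max_area=1280 * 704, multiple=16):
    """Pick an output size with (roughly) the input aspect ratio."""
    aspect = img_w / img_h
    h = round((max_area / aspect) ** 0.5 / multiple) * multiple
    w = round(h * aspect / multiple) * multiple
    return w, h

def generate_i2v(pipe, image, prompt, seed=42, out_path="i2v_output.mp4"):
    import torch
    from diffusers.utils import export_to_video

    w, h = fit_resolution(image.width, image.height)
    generator = torch.Generator(device="cuda").manual_seed(seed)
    result = pipe(
        prompt=prompt,
        image=image,             # the input image to animate
        width=w, height=h,       # preserves the input aspect ratio
        num_frames=73,
        num_inference_steps=35,
        guidance_scale=5.0,
        generator=generator,
    )
    export_to_video(result.frames[0], out_path, fps=24)
    return out_path
```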

Advanced Settings βš™οΈ

  • Width/Height: Video resolution (default: 1280x704)
  • Number of Frames: Longer videos need more frames (default: 73 frames β‰ˆ 3 seconds, max: 145)
  • Inference Steps: More steps = better quality but slower (default: 35, optimized for speed)
  • Guidance Scale: How closely to follow the prompt (default: 5.0)
  • Seed: Set a specific seed for reproducible results

Note: Settings are optimized to complete within Zero GPU's 3-minute time limit for Pro users.
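
To translate between frame counts and clip length, a small helper makes the arithmetic explicit. The "frames = 4k + 1" constraint is inferred from the defaults quoted above (73 ≈ 3 s, 145 ≈ 6 s at 24 fps) and should be treated as an assumption:

```python
# Frame-count arithmetic implied by this README's defaults.
# Assumption: valid frame counts have the form 4k + 1 (73, 145, ...).
FPS = 24

def frames_for_seconds(seconds, fps=FPS):
    """Nearest valid frame count (4k + 1) for a target duration."""
    n = round(seconds * fps)
    return 4 * round((n - 1) / 4) + 1

def seconds_for_frames(num_frames, fps=FPS):
    """Clip length in seconds for a given frame count."""
    return num_frames / fps
```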

Tips for Best Results πŸ’‘

  1. Detailed Prompts: Be specific about what you want to see

    • Good: "Two anthropomorphic cats in comfy boxing gear fight on stage with dramatic lighting"
    • Basic: "cats fighting"
  2. Image-to-Video: Use clear, high-quality input images that match your prompt

  3. Quality vs Speed (optimized for Zero GPU limits):

    • Fast: 25-30 steps (~2 minutes)
    • Balanced: 35 steps (default, ~2-3 minutes)
    • Higher Quality: 40-50 steps (~3+ minutes, may timeout)
  4. Experiment: Try different guidance scales:

    • Lower (3-4): More creative, less literal
    • Default (5): Good balance
    • Higher (7-10): Strictly follows prompt

Example Prompts πŸ“

  • "Two anthropomorphic cats in comfy boxing gear fight on stage"
  • "A serene underwater scene with colorful coral reefs and tropical fish swimming gracefully"
  • "A bustling futuristic city at night with neon lights and flying cars"
  • "A peaceful mountain landscape with snow-capped peaks and a flowing river"
  • "An astronaut riding a horse through a nebula in deep space"
  • "A dragon flying over a medieval castle at sunset"

Technical Details πŸ”§

  • Model: Wan-AI/Wan2.2-TI2V-5B-Diffusers
  • Framework: Hugging Face Diffusers
  • Backend: PyTorch with bfloat16 precision
  • GPU: Hugging Face Zero GPU (H200 with 70GB VRAM, automatically allocated)
  • GPU Duration: 180 seconds (3 minutes) for Pro users
  • Generation Time: ~2-3 minutes with optimized settings (73 frames, 35 steps)
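
For context, a Zero GPU Space requests its hardware per call with the `spaces.GPU` decorator. The sketch below is illustrative (the real handler lives in app.py); `duration=180` matches the 3-minute allocation listed above:

```python
# Illustrative ZeroGPU usage: the GPU is allocated only while the
# decorated function runs. `spaces` is Hugging Face's spaces package.
def build_handler(pipe):
    import spaces

    @spaces.GPU(duration=180)  # request up to 180 s of H200 time per call
    def generate(prompt: str):
        # CUDA work (the pipeline call) belongs inside this function
        return pipe(prompt=prompt)

    return generate
```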

Limitations ⚠️

  • Generation requires compute time (2-3 minutes with default settings)
  • Zero GPU allocation is time-limited (3 minutes for Pro, 60 seconds for Free)
  • Videos longer than 6 seconds (145 frames) may timeout
  • Higher quality settings (50+ steps) may timeout on Zero GPU
  • Complex scenes with many objects may be challenging
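
A rough way to sanity-check settings against the time budget, using only the throughput implied by the numbers in this README (~35 steps × 73 frames in roughly 150 s). This is a back-of-envelope estimate, not a measured model:

```python
# Timeout check derived from this README's quoted timings
# (35 steps, 73 frames -> ~2.5 minutes). The rate is an assumption.
SEC_PER_STEP_FRAME = 150 / (35 * 73)  # ~0.059 s per (step x frame)

def estimated_seconds(num_steps, num_frames):
    """Rough wall-clock estimate for a generation."""
    return num_steps * num_frames * SEC_PER_STEP_FRAME

def fits_budget(num_steps, num_frames, budget_s=180):
    """True if the estimate fits the Zero GPU allocation (180 s for Pro)."""
    return estimated_seconds(num_steps, num_frames) <= budget_s
```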

Credits πŸ™

License πŸ“„

This Space uses the Wan2.2 model, which is released under the Apache 2.0 license.

Note: This is a community-created Space for easy access to Wan2.2 video generation. Generation times may vary based on current GPU availability.