Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

This repository contains the Gen-3Diffusion model, which achieves realistic image-to-3D generation by leveraging a pre-trained 2D diffusion model and a 3D diffusion model, as presented in the paper: Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

Project Page: https://yuxuan-xue.com/gen-3diffusion
Code: https://github.com/YuxuanSnow/Gen3Diffusion

Key Insight :raised_hands:

2D foundation models are powerful, but their output lacks 3D consistency!
3D generative models can reconstruct 3D representations, but they generalize poorly!
How can 2D foundation models be combined with 3D generative models?
- Both are diffusion-based generative models => they can be synchronized at each diffusion step
- The 2D foundation model helps 3D generation => it provides a strong prior on 3D shape
- The 3D representation guides 2D diffusion sampling => the rendered output of the 3D reconstruction is used for reverse sampling, which guarantees 3D consistency
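
The three points above can be illustrated with a toy sampling loop. This is only a sketch under loud assumptions, not the repository's implementation: each "view" is a single number, `denoise_2d`, `reconstruct_3d`, and `render` are hypothetical stand-ins for the real networks and rasterizer, and the update is a deterministic DDIM-style step chosen for brevity. What it shows is the synchronization: at every step the 2D estimate is fused into one shared 3D state, and the rendering of that state (identical across views here) drives the reverse step.

```python
import math
import random


def denoise_2d(noisy_views, rng):
    """Hypothetical 2D denoiser: each view is estimated independently,
    so the clean estimates disagree across views."""
    return [0.5 * v + rng.gauss(0.0, 0.1) for v in noisy_views]


def reconstruct_3d(view_estimates):
    """Hypothetical 3D model: fuse all view estimates into one shared state."""
    return sum(view_estimates) / len(view_estimates)


def render(state_3d, num_views):
    """Hypothetical renderer: every view is derived from the same 3D state."""
    return [state_3d] * num_views


def synchronized_sampling(num_views=4, num_steps=10, seed=0):
    rng = random.Random(seed)
    # Toy schedule: alpha_bar[0] = 1 (clean), alpha_bar[num_steps] close to 0.
    alpha_bar = [1.0 - t / (num_steps + 1) for t in range(num_steps + 1)]
    x = [rng.gauss(0.0, 1.0) for _ in range(num_views)]  # start from noise
    state = None
    for t in range(num_steps, 0, -1):
        a_t, a_prev = alpha_bar[t], alpha_bar[t - 1]
        x0_2d = denoise_2d(x, rng)         # 2D prior, not 3D-consistent
        state = reconstruct_3d(x0_2d)      # 3D reconstruction from the 2D estimate
        x0_3d = render(state, num_views)   # 3D-consistent clean estimate
        # DDIM-style step (eta = 0) that uses the rendered views as x0.
        eps = [(xi - math.sqrt(a_t) * x0) / math.sqrt(1.0 - a_t)
               for xi, x0 in zip(x, x0_3d)]
        x = [math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * e
             for x0, e in zip(x0_3d, eps)]
    return x, state
```

Because the last step uses the rendered views directly, the returned views agree with each other, which is the toy analogue of the 3D consistency described above.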

Install

This uses the same Conda environment as Human-3Diffusion. Skip this step if you have already installed it.

# Conda environment
conda create -n gen3diffusion python=3.10
conda activate gen3diffusion
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.22.post4 --index-url https://download.pytorch.org/whl/cu121

# Gaussian Opacity Fields
git clone https://github.com/YuxuanSnow/gaussian-opacity-fields.git
cd gaussian-opacity-fields && pip install submodules/diff-gaussian-rasterization
pip install submodules/simple-knn/ && cd ..
export CPATH=/usr/local/cuda-12.1/targets/x86_64-linux/include:$CPATH

# Dependencies
pip install -r requirements.txt

# TSDF Fusion (Mesh extraction) Dependencies
pip install --user numpy opencv-python scikit-image numba
pip install --user pycuda
pip install scipy==1.11
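
After installation, a quick check can confirm that the packages are importable. The snippet below is a minimal sketch, not part of the repository. The import names `diff_gaussian_rasterization` and `simple_knn` are assumptions based on the usual naming of those submodules; `cv2` and `skimage` are the import names of `opencv-python` and `scikit-image`.

```python
import importlib.util

# Import names assumed for the packages installed above.
REQUIRED = [
    "torch", "torchvision", "xformers",
    "diff_gaussian_rasterization", "simple_knn",  # assumed submodule names
    "numpy", "cv2", "skimage", "numba", "pycuda", "scipy",
]


def check_environment(modules=REQUIRED):
    """Return {module_name: True/False} without importing the modules."""
    status = {}
    for name in modules:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            status[name] = False
    return status
```

Calling `check_environment()` inside the activated environment and printing the entries that map to `False` shows which installation step may need to be repeated.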

Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

mkdir checkpoints_obj && cd checkpoints_obj
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/model.safetensors
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/model_1.safetensors
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/pifuhd.pt
cd ..

The avatar reconstruction module is the same as in Human-3Diffusion. Skip this step if you have already installed Human-3Diffusion.

mkdir checkpoints_avatar && cd checkpoints_avatar
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/model.safetensors
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/model_1.safetensors
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/pifuhd.pt
cd ..
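
Where `wget` is not available, the same files can be fetched from Python. This is a sketch that only mirrors the URLs listed above with the standard library; the helper names `checkpoint_urls` and `download_checkpoints` are hypothetical and not part of the repository.

```python
import os
import urllib.request

# File names taken from the wget commands above.
FILES = ("model.safetensors", "model_1.safetensors", "pifuhd.pt")


def checkpoint_urls(repo_id, files=FILES):
    """Build the download URLs for a Hugging Face model repository."""
    return [f"https://huggingface.co/{repo_id}/resolve/main/{name}"
            for name in files]


def download_checkpoints(repo_id, target_dir, files=FILES):
    """Download each file into target_dir, skipping files that already exist."""
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for name, url in zip(files, checkpoint_urls(repo_id, files)):
        path = os.path.join(target_dir, name)
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths
```

For example, `download_checkpoints("yuxuanx/gen3diffusion", "checkpoints_obj")` and `download_checkpoints("yuxuanx/human3diffusion", "checkpoints_avatar")` would reproduce the two directory layouts used in this README.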

Inference

# given one image of an object, generate a 3D-GS object
# the subject should be centered in a square image, please crop accordingly
# recentering plays a huge role in object reconstruction; adjust the recentering if the reconstruction does not work well
python infer.py --test_imgs test_imgs_obj --output output_obj --checkpoints checkpoints_obj
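
Since the subject has to be centered in a square image, it can help to compute the crop box before running inference. The helper below is a hedged sketch: `square_crop_box` and its `margin` parameter are hypothetical, and the repository's own recentering may work differently. It assumes the subject's bounding box `(x0, y0, x1, y1)` in pixels is already known, for example from a foreground mask.

```python
def square_crop_box(bbox, margin=0.1):
    """Return (left, top, right, bottom) of a square crop centered on bbox.

    bbox is (x0, y0, x1, y1) in pixels. margin is the extra border on each
    side, as a fraction of the longer bbox side. The box may extend past the
    image borders, in which case the image has to be padded.
    """
    x0, y0, x1, y1 = bbox
    if x1 <= x0 or y1 <= y0:
        raise ValueError("bbox must have positive width and height")
    side = round(max(x1 - x0, y1 - y0) * (1.0 + 2.0 * margin))
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    left = round(cx - side / 2.0)
    top = round(cy - side / 2.0)
    return left, top, left + side, top + side
```

A larger `margin` makes the subject smaller in the frame, which is one knob to try when a reconstruction looks off.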

# given the generated 3D-GS, perform TSDF mesh extraction
python infer_mesh.py --test_imgs test_imgs_obj --output output_obj --checkpoints checkpoints_obj --mesh_quality high
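
For readers unfamiliar with TSDF fusion, the core idea can be shown in one dimension. This is a generic textbook-style sketch and makes no claim about how `infer_mesh.py` is implemented: voxels lie along a single ray, each depth observation contributes a truncated signed distance, and the values are combined by a weighted running average. All names and parameters here are illustrative.

```python
def fuse_tsdf(depths, voxel_z, trunc):
    """Fuse depth observations along one ray into (tsdf, weight) lists."""
    tsdf = [0.0] * len(voxel_z)
    weight = [0.0] * len(voxel_z)
    for d in depths:
        for i, z in enumerate(voxel_z):
            sdf = d - z                    # positive in front of the surface
            if sdf < -trunc:
                continue                   # far behind the surface: unobserved
            value = min(1.0, sdf / trunc)  # truncate and normalize
            tsdf[i] = (weight[i] * tsdf[i] + value) / (weight[i] + 1.0)
            weight[i] += 1.0
    return tsdf, weight


def first_zero_crossing(tsdf, weight, voxel_z):
    """Linearly interpolate the first sign change between observed voxels."""
    for i in range(len(voxel_z) - 1):
        d0, d1 = tsdf[i], tsdf[i + 1]
        if weight[i] > 0 and weight[i + 1] > 0 and d0 > 0 >= d1:
            return voxel_z[i] + d0 / (d0 - d1) * (voxel_z[i + 1] - voxel_z[i])
    return None
```

With two observations at depths 1.0 and 1.2, the fused surface lands between them; in 3D, a mesh is typically extracted from the zero level set of the fused volume, for example with marching cubes.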

# given one image of a human, generate a 3D-GS avatar
# the subject should be centered in a square image, please crop accordingly
python infer.py --test_imgs test_imgs_avatar --output output_avatar --checkpoints checkpoints_avatar

# given the generated 3D-GS, perform TSDF mesh extraction
python infer_mesh.py --test_imgs test_imgs_avatar --output output_avatar --checkpoints checkpoints_avatar --mesh_quality high
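
The two steps can also be chained from a small wrapper. This is a convenience sketch, not a script shipped with the repository: `build_commands` and `run_pipeline` are hypothetical names, and only the flags and directory names shown in the commands above are used. `high` is the only `--mesh_quality` value that appears in this README, so other values are not assumed here.

```python
import subprocess
import sys


def build_commands(kind, mesh_quality="high"):
    """Return the infer.py and infer_mesh.py commands for 'obj' or 'avatar'."""
    if kind not in ("obj", "avatar"):
        raise ValueError("kind must be 'obj' or 'avatar'")
    shared = [
        "--test_imgs", f"test_imgs_{kind}",
        "--output", f"output_{kind}",
        "--checkpoints", f"checkpoints_{kind}",
    ]
    return [
        [sys.executable, "infer.py", *shared],
        [sys.executable, "infer_mesh.py", *shared,
         "--mesh_quality", mesh_quality],
    ]


def run_pipeline(kind):
    """Run 3D-GS generation, then mesh extraction; stop on the first failure."""
    for cmd in build_commands(kind):
        subprocess.run(cmd, check=True)
```

Run from the repository root, `run_pipeline("obj")` would execute the object commands in order, and `run_pipeline("avatar")` the avatar commands.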

Citation :writing_hand:

@article{xue2024gen3diffusion,
  title     = {{Gen-3Diffusion: Realistic Image-to-3D Generation via 2D \& 3D Diffusion Synergy}},
  author    = {Xue, Yuxuan and Xie, Xianghui and Marin, Riccardo and Pons-Moll, Gerard},
  journal   = {arXiv preprint arXiv:2412.06698},
  year      = {2024},
}

Model size: 0.9B params (Safetensors, tensor type F32)

Paper: Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy (arXiv 2412.06698, published Dec 9, 2024)