Visual Document Retrieval
Demo for multimodal embedding models
Detect objects in images or videos
Create a new picture of yourself in any style with a single prompt
Swap faces between two images
Modify images using text guidance
Generate captions for music audio
Chat with an AI assistant using text and images
Create a custom story with characters and plot
BLIP2 (cutting-edge image captioning) in 🤗 Transformers