---
title: MCP Video Analysis with Llama 3
emoji: 🎥
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
license: mit
short_description: AI-powered video analysis with Llama 3 and Modal
---

# 🎥 MCP Video Analysis with Llama 3

This application provides comprehensive video analysis using the Model Context Protocol (MCP) to integrate multiple AI technologies:

## 🔧 Technology Stack

- **Modal Backend**: Scalable cloud compute for video processing
- **Whisper**: Speech-to-text transcription
- **Computer Vision Models**: Object detection, action recognition, and captioning
- **Meta Llama 3**: Advanced AI for intelligent content analysis, hosted on Modal
- **MCP**: The Model Context Protocol ties these services together (see the sketch below)

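To make the flow concrete, here is a minimal sketch of the two-stage pipeline, assuming a simple JSON interface; the payload and response shapes (`video_url`, `prompt`, etc.) are illustrative and not the exact schema of the deployed Modal services:

```python
import os

import requests

# Endpoint URLs come from the environment variables documented below.
VIDEO_ENDPOINT = os.environ.get("MODAL_VIDEO_ANALYSIS_ENDPOINT_URL", "")
LLAMA3_ENDPOINT = os.environ["MODAL_LLAMA3_ENDPOINT_URL"]

def analyze_video(video_url: str, question: str = "") -> dict:
    """Two-stage pipeline: raw video analysis on Modal, then Llama 3 reasoning."""
    # Stage 1: Whisper transcription + vision models on the Modal video service.
    raw = requests.post(VIDEO_ENDPOINT, json={"video_url": video_url}, timeout=600)
    raw.raise_for_status()
    analysis = raw.json()

    # Stage 2: hand the raw analysis to Llama 3 for a summary or an answer
    # to the user's question.
    prompt = (
        f"Video analysis data:\n{analysis}\n\n"
        f"Question: {question or 'Summarize this video.'}"
    )
    reply = requests.post(LLAMA3_ENDPOINT, json={"prompt": prompt}, timeout=300)
    reply.raise_for_status()
    return {"llama3_analysis": reply.json(), "raw_data": analysis}
```
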
## 🎯 Features

- **Transcription**: Extract spoken content from videos
- **Visual Analysis**: Identify objects, actions, and scenes
- **Content Understanding**: AI-powered insights and summaries
- **Custom Queries**: Ask specific questions about video content

## Usage

1. Enter a video URL (YouTube or direct link)
2. Optionally ask a specific question
3. Click "Analyze Video" to get comprehensive insights
4. Review both Llama 3's intelligent analysis and the raw data

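Besides the web UI, a Gradio Space like this one can also be called programmatically with `gradio_client`. The Space id, `api_name`, and argument order below are placeholders; check the Space's "Use via API" panel for the real values:

```python
from gradio_client import Client

# Placeholder Space id and api_name -- consult the Space's "Use via API" panel.
client = Client("your-username/mcp-video-analysis")
result = client.predict(
    "https://www.youtube.com/watch?v=example",  # video URL (YouTube or direct link)
    "What are the main topics discussed?",      # optional question
    api_name="/predict",
)
print(result)
```
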
## Environment Variables

- `MODAL_LLAMA3_ENDPOINT_URL` (required): The URL of the deployed Llama 3 Modal service.
- `MODAL_VIDEO_ANALYSIS_ENDPOINT_URL` (optional): The URL of the video processing Modal service; a default value is built in.

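A minimal sketch of how `app.py` might read these variables; the fallback URL shown is a placeholder, not the app's actual built-in default:

```python
import os

# Required: the deployed Llama 3 Modal service.
LLAMA3_ENDPOINT = os.environ["MODAL_LLAMA3_ENDPOINT_URL"]

# Optional: falls back to a built-in default when unset. The URL below is a
# placeholder, not the real default.
VIDEO_ANALYSIS_ENDPOINT = os.environ.get(
    "MODAL_VIDEO_ANALYSIS_ENDPOINT_URL",
    "https://your-workspace--video-analysis.modal.run",
)
```

Both variables can be set in the Space's settings under "Variables and secrets".
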
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |