# Real Data Implementation Guide

## Overview

The crypto monitoring API has been upgraded from mock data to **real provider-backed data**. This document explains the changes and how to use the new functionality.

## What Changed

### Files Modified

1. **`api_server_extended.py`** - Main API server
   - Added imports for `ProviderFetchHelper`, `CryptoDatabase`, and `os`
   - Added `fetch_helper` and `db` global instances
   - Added `USE_MOCK_DATA` environment flag
   - Replaced 5 mock endpoints with real implementations:
     - `GET /api/market` - Now fetches from CoinGecko
     - `GET /api/sentiment` - Now fetches from Alternative.me
     - `GET /api/trending` - Now fetches from CoinGecko
     - `GET /api/defi` - Returns 503 (requires DeFi provider)
     - `POST /api/hf/run-sentiment` - Returns 501 (requires ML models)
   - Added new endpoint: `GET /api/market/history` - Historical data from SQLite

2. **`provider_fetch_helper.py`** - New file
   - Implements `ProviderFetchHelper` class
   - Provides `fetch_from_pool()` method for pool-based fetching
   - Provides `fetch_from_provider()` method for direct provider access
   - Integrates with existing ProviderManager, circuit breakers, and logging
   - Handles automatic failover and retry logic

3. **`test_real_data.py`** - New file
   - Test script to verify real data endpoints
   - Tests all modified endpoints
   - Provides clear pass/fail results

## Architecture

### Data Flow

```
Client Request
    ↓
FastAPI Endpoint (api_server_extended.py)
    ↓
ProviderFetchHelper.fetch_from_provider()
    ↓
ProviderManager → Get Provider Config
    ↓
aiohttp → HTTP Request to External API
    ↓
Response Processing & Normalization
    ↓
Database Storage (SQLite)
    ↓
JSON Response to Client
```

### Provider Integration

The implementation uses the **existing provider management system**:

- **Provider Configs**: Loaded from JSON files (`providers_config_extended.json`, etc.)
- **Circuit Breakers**: Automatic failure detection and recovery
- **Metrics**: Success rate, response time, request counts
- **Logging**: All requests logged with provider_id and details
- **Health Checks**: Existing health check system continues to work

## API Endpoints

### 1. GET /api/market

**Real Data Mode** (default):

```bash
curl http://localhost:8000/api/market
```

Response:

```json
{
  "mode": "real",
  "cryptocurrencies": [
    {
      "rank": 1,
      "name": "Bitcoin",
      "symbol": "BTC",
      "price": 43250.50,
      "change_24h": 2.35,
      "market_cap": 845000000000,
      "volume_24h": 28500000000
    }
  ],
  "source": "CoinGecko",
  "timestamp": "2025-01-15T10:30:00Z",
  "response_time_ms": 245
}
```

**Mock Mode**:

```bash
USE_MOCK_DATA=true python main.py
curl http://localhost:8000/api/market
```

### 2. GET /api/market/history

**New endpoint** for historical price data from database:

```bash
curl "http://localhost:8000/api/market/history?symbol=BTC&limit=10"
```

Response:

```json
{
  "symbol": "BTC",
  "count": 10,
  "history": [
    {
      "symbol": "BTC",
      "name": "Bitcoin",
      "price_usd": 43250.50,
      "volume_24h": 28500000000,
      "market_cap": 845000000000,
      "percent_change_24h": 2.35,
      "rank": 1,
      "timestamp": "2025-01-15 10:30:00"
    }
  ]
}
```

### 3. GET /api/sentiment

**Real Data Mode**:

```bash
curl http://localhost:8000/api/sentiment
```

Response:

```json
{
  "mode": "real",
  "fear_greed_index": {
    "value": 62,
    "classification": "Greed",
    "timestamp": "1705315800",
    "time_until_update": "43200"
  },
  "source": "alternative.me"
}
```

### 4. GET /api/trending

**Real Data Mode**:

```bash
curl http://localhost:8000/api/trending
```

Response:

```json
{
  "mode": "real",
  "trending": [
    {
      "name": "Solana",
      "symbol": "SOL",
      "thumb": "https://...",
      "market_cap_rank": 5,
      "score": 0
    }
  ],
  "source": "CoinGecko",
  "timestamp": "2025-01-15T10:30:00Z"
}
```

### 5. GET /api/defi

**Status**: Not implemented (requires DeFi provider)

```bash
curl http://localhost:8000/api/defi
```

Response:

```json
{
  "detail": "DeFi TVL data provider not configured. Add DefiLlama or similar provider to enable this endpoint."
}
```

**Status Code**: 503 Service Unavailable

### 6. POST /api/hf/run-sentiment

**Status**: Not implemented (requires ML models)

```bash
curl -X POST http://localhost:8000/api/hf/run-sentiment \
  -H "Content-Type: application/json" \
  -d '{"texts": ["Bitcoin is bullish"]}'
```

Response:

```json
{
  "detail": "Real ML-based sentiment analysis is not yet implemented. This endpoint is reserved for future integration with HuggingFace transformer models. Set USE_MOCK_DATA=true for demo mode with keyword-based sentiment."
}
```

**Status Code**: 501 Not Implemented

## Environment Variables

### USE_MOCK_DATA

Controls whether endpoints return real or mock data.

**Default**: `false` (real data)

**Usage**:

```bash
# Real data (default)
python main.py

# Mock data (for demos)
USE_MOCK_DATA=true python main.py

# Docker
docker run -e USE_MOCK_DATA=false -p 8000:8000 crypto-monitor
```

**Behavior**:

- `false` or unset: All endpoints fetch real data from providers
- `true`: Endpoints return mock data (for testing/demos)

## Provider Configuration

### Required Providers

The following providers must be configured in `providers_config_extended.json`:

1. **coingecko** - For market data and trending
   - Endpoints: `simple_price`, `trending`
   - No API key required (free tier)
   - Rate limit: 50 req/min

2. **alternative.me** - For sentiment (Fear & Greed Index)
   - Direct HTTP call (not in provider config)
   - No API key required
   - Public API

### Optional Providers

3. **DefiLlama** - For DeFi TVL data
   - Not currently configured
   - Would enable `/api/defi` endpoint

### Adding New Providers

To add a new provider:

1. Edit `providers_config_extended.json`:

   ```json
   {
     "providers": {
       "your_provider": {
         "name": "Your Provider",
         "category": "market_data",
         "base_url": "https://api.example.com",
         "endpoints": {
           "prices": "/v1/prices"
         },
         "rate_limit": {
           "requests_per_minute": 60
         },
         "requires_auth": false,
         "priority": 8,
         "weight": 80
       }
     }
   }
   ```

2. Use in endpoint:

   ```python
   result = await fetch_helper.fetch_from_provider(
       "your_provider",
       "prices",
       params={"symbols": "BTC,ETH"}
   )
   ```

## Database Integration

### Schema

The SQLite database (`data/crypto_aggregator.db`) stores:

**prices table**:

- symbol, name, price_usd, volume_24h, market_cap
- percent_change_1h, percent_change_24h, percent_change_7d
- rank, timestamp

### Automatic Storage

When `/api/market` is called:

1. Real data is fetched from CoinGecko
2. Each asset is automatically saved to the database
3. Historical data accumulates over time
4. Query with `/api/market/history`

### Manual Queries

```python
from database import CryptoDatabase

db = CryptoDatabase()

# Get recent prices
with db.get_connection() as conn:
    cursor = conn.cursor()
    cursor.execute("""
        SELECT * FROM prices
        WHERE symbol = 'BTC'
        ORDER BY timestamp DESC
        LIMIT 100
    """)
    rows = cursor.fetchall()
```

## Testing

### Automated Tests

```bash
# Start server
python main.py

# In another terminal, run tests
python test_real_data.py
```

### Manual Testing

```bash
# Test market data
curl http://localhost:8000/api/market

# Test with parameters
curl "http://localhost:8000/api/market/history?symbol=ETH&limit=5"

# Test sentiment
curl http://localhost:8000/api/sentiment

# Test trending
curl http://localhost:8000/api/trending

# Check health
curl http://localhost:8000/health

# View API docs
open http://localhost:8000/docs
```

## Error Handling

### Provider Unavailable

If a provider is down:

```json
{
  "detail": "All providers in pool 'market_primary' failed. Last error: Connection timeout"
}
```

**Status Code**: 503

### Provider Not Configured

If required provider missing:

```json
{
  "detail": "Market data provider (CoinGecko) not configured"
}
```

**Status Code**: 503

### Database Error

If database operation fails:

```json
{
  "detail": "Database error: unable to open database file"
}
```

**Status Code**: 500

## Monitoring

### Logs

All requests are logged to `logs/` directory:

```
INFO - Successfully fetched from CoinGecko
  provider_id: coingecko
  endpoint: simple_price
  response_time_ms: 245
  pool: market_primary
```

### Metrics

Provider metrics are updated automatically:

- `total_requests`
- `successful_requests`
- `failed_requests`
- `avg_response_time`
- `success_rate`
- `consecutive_failures`

View metrics:

```bash
curl http://localhost:8000/api/providers/coingecko
```

### Health Checks

Existing health check system continues to work:

```bash
curl http://localhost:8000/api/providers/coingecko/health-check
```

## Deployment

### Docker

```bash
# Build
docker build -t crypto-monitor .

# Run with real data (default)
docker run -p 8000:8000 crypto-monitor

# Run with mock data
docker run -e USE_MOCK_DATA=true -p 8000:8000 crypto-monitor
```

### Hugging Face Spaces

The service is ready for HF Spaces deployment:

1. Push to HF Space repository
2. Set Space SDK to "Docker"
3. Optionally set `USE_MOCK_DATA` in Space secrets
4. Service will start automatically

## Future Enhancements

### Planned

1. **Pool-based fetching**: Use provider pools instead of direct provider access
2. **ML sentiment analysis**: Load HuggingFace models for real sentiment
3. **DeFi integration**: Add DefiLlama provider
4. **Caching layer**: Redis for frequently accessed data
5. **Rate limiting**: Per-client rate limits
6. **Authentication**: API key management

### Contributing

To add real data for a new endpoint:

1. Identify the provider and endpoint
2. Add provider to config if needed
3. Use `fetch_helper.fetch_from_provider()` in endpoint
4. Normalize response to consistent schema
5. Add database storage if applicable
6. Update tests and documentation

## Troubleshooting

### "Provider not configured"

**Solution**: Check `providers_config_extended.json` has the required provider

### "All providers failed"

**Solution**:

- Check internet connectivity
- Verify provider URLs are correct
- Check rate limits haven't been exceeded
- View logs for detailed error messages

### "Database error"

**Solution**:

- Ensure `data/` directory exists and is writable
- Check disk space
- Verify SQLite is installed

### Mock data still showing

**Solution**:

- Ensure `USE_MOCK_DATA` is not set or is set to `false`
- Restart the server
- Check environment variables: `env | grep USE_MOCK_DATA`

## Summary

✅ **Real data** is now the default for all crypto endpoints
✅ **Database integration** stores historical prices
✅ **Provider management** uses existing sophisticated system
✅ **Graceful degradation** with clear error messages
✅ **Mock mode** available for demos via environment flag
✅ **Production-ready** for deployment

The API is now a fully functional crypto data service, not just a monitoring platform!
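
## Example: Normalizing a Provider Response

The Contributing checklist asks for each provider response to be normalized to a consistent schema. The sketch below shows one way that step might look for market data, mapping a raw payload onto the field names used in the `/api/market` response. It is not the server's actual code: `normalize_market`, the `COIN_META` lookup, the input shape (keys such as `usd_market_cap`), the rank-by-market-cap rule, and the Ethereum sample values are all assumptions made for illustration, and `response_time_ms` is left out.

```python
from datetime import datetime, timezone

# Hypothetical lookup table; the real server may resolve names and symbols differently.
COIN_META = {"bitcoin": ("Bitcoin", "BTC"), "ethereum": ("Ethereum", "ETH")}


def normalize_market(raw: dict, source: str = "CoinGecko") -> dict:
    """Map a provider payload (assumed shape) onto the /api/market response fields."""
    coins = []
    for coin_id, fields in raw.items():
        # Fall back to a derived name/symbol when the coin is not in the lookup.
        name, symbol = COIN_META.get(coin_id, (coin_id.title(), coin_id.upper()))
        coins.append({
            "name": name,
            "symbol": symbol,
            "price": fields.get("usd"),
            "change_24h": fields.get("usd_24h_change"),
            "market_cap": fields.get("usd_market_cap"),
            "volume_24h": fields.get("usd_24h_vol"),
        })

    # Assumption: rank by market cap, largest first; missing values sort last.
    coins.sort(key=lambda c: c["market_cap"] or 0, reverse=True)
    for rank, coin in enumerate(coins, start=1):
        coin["rank"] = rank

    return {
        "mode": "real",
        "cryptocurrencies": coins,
        "source": source,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Illustrative input only; the Ethereum numbers are made up for the example.
sample = {
    "ethereum": {"usd": 2300.0, "usd_market_cap": 276000000000,
                 "usd_24h_vol": 12000000000, "usd_24h_change": 1.1},
    "bitcoin": {"usd": 43250.5, "usd_market_cap": 845000000000,
                "usd_24h_vol": 28500000000, "usd_24h_change": 2.35},
}
result = normalize_market(sample)
```

Keeping normalization in a plain function like this, separate from the HTTP fetch, would let it be tested without network access and reused if a second market data provider is added later.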