---
title: MLRC-BENCH
emoji: 📊
colorFrom: green
colorTo: blue
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false
license: cc-by-4.0
---

## Installation & Setup

1. Clone the repository

```bash
git clone https://huggingface.co/spaces/launch/MLRC_Bench
cd MLRC_Bench
```

2. Set up a virtual environment and install the required dependencies

```bash
python -m venv env
source env/bin/activate
pip install -r requirements.txt
```

3. Run the application

```bash
streamlit run app.py
```

### Updating Metrics

To update the table, edit the corresponding metric file in the `src/data/metrics/` directory.

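For scripted updates, the edit can be done with a few lines of Python. This is a minimal sketch, assuming the metric files use the nested task, model, value JSON layout shown under Adding New Metrics. The helper name `update_metric_value` and the file name in the usage note are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def update_metric_value(metric_path, task, model, value):
    """Set data[task][model] = value in a metric JSON file and save it.

    Hypothetical helper: assumes the file holds a task -> model -> value mapping.
    """
    path = Path(metric_path)
    data = json.loads(path.read_text(encoding="utf-8"))
    # Create the task entry if it does not exist yet, then set the score.
    data.setdefault(task, {})[model] = value
    path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data
```

A call such as `update_metric_value("src/data/metrics/new_metric.json", "task-name", "model-name-1", 12.5)` would rewrite that one entry and leave the rest of the file unchanged.
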
### Updating Text

To update the Benchmark details tab, edit `src/components/tasks.py`.

To update the metric definitions, edit the same file, `src/components/tasks.py`.

### Adding New Metrics

To add a new metric:

1. Create a new JSON data file in the `src/data/metrics/` directory (e.g., `src/data/metrics/new_metric.json`)
2. Update `metrics_config` in `src/utils/config.py`:

```python
metrics_config = {
    "Margin to Human": { ... },
    "New Metric Name": {
        "file": "src/data/metrics/new_metric.json",
        "description": "Description of the new metric",
        "min_value": 0,
        "max_value": 100,
        "color_map": "viridis"
    }
}
```

3. Ensure your metric JSON file follows the same format as existing metrics:

```json
{
  "task-name": {
    "model-name-1": value,
    "model-name-2": value
  },
  "another-task": {
    "model-name-1": value,
    "model-name-2": value
  }
}
```

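Before wiring a new file into `metrics_config`, it can help to check that it matches the format above. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes every `value` is a plain number, which the source does not state explicitly.

```python
import json
from numbers import Real


def validate_metric_data(data):
    """Return a list of problems found in a task -> model -> value mapping."""
    if not isinstance(data, dict):
        return ["top level must be an object mapping task names to models"]
    problems = []
    for task, models in data.items():
        if not isinstance(models, dict):
            problems.append(f"{task}: expected an object of model scores")
            continue
        for model, value in models.items():
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                problems.append(f"{task}/{model}: value must be a number")
    return problems


def validate_metric_file(path):
    """Load a metric JSON file and validate its structure."""
    with open(path, encoding="utf-8") as f:
        return validate_metric_data(json.load(f))
```

An empty list from `validate_metric_file("src/data/metrics/new_metric.json")` means the file has the expected shape under these assumptions.
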
### Adding New Agent Types

To add new agent types:

1. Update `model_categories` in `src/utils/config.py`:

```python
model_categories = {
    "Existing Model": "Category",
    "New Model Name": "New Category"
}
```

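A quick consistency check can catch models that have scores but no category. This sketch is an illustration only: `find_uncategorized_models` is a hypothetical helper, and it assumes the keys of `model_categories` are the same model names used in the metric JSON files, which this README does not confirm.

```python
def find_uncategorized_models(metric_data, model_categories):
    """Return sorted model names present in metric data but missing a category.

    Assumes metric_data maps task -> {model: value} and that
    model_categories is keyed by the same model names.
    """
    seen = set()
    for models in metric_data.values():
        seen.update(models)  # adds the model-name keys of each task
    return sorted(seen - set(model_categories))


# Illustrative data, not taken from the repository.
example_metrics = {"task-name": {"Existing Model": 1.0, "New Model Name": 2.0}}
example_categories = {"Existing Model": "Category"}
print(find_uncategorized_models(example_metrics, example_categories))
# prints ['New Model Name']
```
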
## License

[MIT License](LICENSE)