jeuko committed · Commit 8018595 · verified · 1 Parent(s): 78c7282

Sync from GitHub (main)

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .codespell-ignore.txt +4 -0
  2. .devcontainer/devcontainer.json +23 -0
  3. .devcontainer/setup.sh +65 -0
  4. .dockerignore +57 -0
  5. .env.example +6 -0
  6. .github/ISSUE_TEMPLATE/build.yaml +19 -0
  7. .github/ISSUE_TEMPLATE/chore.yaml +12 -0
  8. .github/ISSUE_TEMPLATE/ci.yaml +11 -0
  9. .github/ISSUE_TEMPLATE/docs.yaml +12 -0
  10. .github/ISSUE_TEMPLATE/feat.yaml +19 -0
  11. .github/ISSUE_TEMPLATE/fix.yaml +24 -0
  12. .github/ISSUE_TEMPLATE/perf.yaml +12 -0
  13. .github/ISSUE_TEMPLATE/refactor.yaml +19 -0
  14. .github/ISSUE_TEMPLATE/style.yaml +12 -0
  15. .github/ISSUE_TEMPLATE/test.yaml +12 -0
  16. .github/actions/tools/huggingface/action.yaml +68 -0
  17. .github/actions/tools/huggingface/secrets.py +91 -0
  18. .github/actions/tools/pr-title-generator/action.yaml +52 -0
  19. .github/actions/tools/pre-commit/action.yaml +74 -0
  20. .github/actions/tools/pytest/action.yaml +104 -0
  21. .github/actions/tools/pytest/markdown.py +89 -0
  22. .github/pull_request_template.md +5 -0
  23. .github/workflows/chore.yaml +28 -0
  24. .github/workflows/main.yaml +67 -0
  25. .gitignore +100 -0
  26. .pre-commit-config.yaml +111 -0
  27. .streamlit/config.toml +6 -0
  28. AGENTS.md +55 -0
  29. Dockerfile +38 -0
  30. GEMINI.md +55 -0
  31. README.md +169 -8
  32. RISK_MODELS.md +587 -0
  33. apps/__init__.py +1 -0
  34. apps/api/__init__.py +1 -0
  35. apps/api/main.py +121 -0
  36. apps/cli/__init__.py +1 -0
  37. apps/cli/main.py +539 -0
  38. apps/streamlit_ui/__init__.py +1 -0
  39. apps/streamlit_ui/main.py +71 -0
  40. apps/streamlit_ui/page_versions/profile/v1.py +20 -0
  41. apps/streamlit_ui/page_versions/profile/v2.py +246 -0
  42. apps/streamlit_ui/pages/1_Profile.py +266 -0
  43. apps/streamlit_ui/pages/2_Configuration.py +131 -0
  44. apps/streamlit_ui/pages/3_Assessment.py +249 -0
  45. apps/streamlit_ui/pages/4_Risk_Scores.py +62 -0
  46. apps/streamlit_ui/pages/__init__.py +0 -0
  47. apps/streamlit_ui/ui_utils.py +41 -0
  48. configs/config.yaml +14 -0
  49. configs/knowledge_base/dx_protocols/mammography_screening.yaml +46 -0
  50. configs/model/chatgpt_o1.yaml +2 -0
.codespell-ignore.txt ADDED
@@ -0,0 +1,4 @@
+ Demog
+ ONS
+ Claus
+ claus
.devcontainer/devcontainer.json ADDED
@@ -0,0 +1,23 @@
+ // """Dev container Local development"""
+ {
+ "name": "sentinel",
+ // "dockerFile": "Dockerfile",
+ "image": "python:3.12-slim",
+ // "initializeCommand": ". ./.env",
+ "postCreateCommand": "bash ./.devcontainer/setup.sh",
+ "build": {
+ "args": {},
+ "options": [
+ "--platform=linux/amd64"
+ ]
+ },
+ "runArgs": [
+ "--platform=linux/amd64",
+ "--add-host=host.docker.internal:host-gateway"
+ ],
+ "remoteUser": "root",
+ "containerUser": "root",
+ "mounts": [
+ "source=/var/run/docker.sock,target=/var/run/docker.sock,type=bind"
+ ]
+ }
.devcontainer/setup.sh ADDED
@@ -0,0 +1,65 @@
+ #!/usr/bin/env bash
+ set -ex
+
+ # Update package lists
+ apt-get update
+
+ # ----- Linux Packages ----- #
+
+ apt-get install -y curl wget
+
+ # ----- Locales ----- #
+ # Install locales and configure
+ apt-get install -y locales
+ echo "en_US.UTF-8 UTF-8" > /etc/locale.gen
+ locale-gen en_US.UTF-8
+ update-locale LANG=en_US.UTF-8
+
+ # ----------------- Python -----------------
+
+ # Update package lists
+ apt-get update
+
+ # Install necessary packages
+ apt-get install -y ssh locales git
+
+ # Configure locale
+ echo "en_US.UTF-8 UTF-8" > /etc/locale.gen
+ locale-gen
+
+ # Git configuration
+ git config --global --add safe.directory /workspaces/sentinel
+
+ # Install Python package in editable mode
+ pip install --editable .
+
+ # Stash any changes before rebuilding the container
+ git stash push -m "Stashed changes before (re)building the container"
+ git stash apply 0
+
+
+ # ----------------- Docker -----------------
+
+ apt-get update && apt-get install -y docker.io && apt-get clean -y
+
+ # ----------------- Google Cloud SDK -----------------
+
+ # Install prerequisites for Google Cloud SDK
+ apt-get install -y apt-transport-https ca-certificates gnupg curl
+
+ # Import the Google Cloud public key
+ curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg
+
+ # Add the Google Cloud SDK repository
+ echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | tee /etc/apt/sources.list.d/google-cloud-sdk.list
+
+ # Update package lists again with new repository
+ apt-get update
+
+ # Install Google Cloud CLI
+ apt-get install -y google-cloud-cli
+
+ # Authenticate Docker with Google Cloud
+ gcloud auth configure-docker -q gcr.io
+
+ # gcloud auth login --project <sentinel> --no-launch-browser
.dockerignore ADDED
@@ -0,0 +1,57 @@
+ # Python cache
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+
+ # Virtual environments
+ .venv/
+ venv/
+ ENV/
+ env/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # Git
+ .git/
+ .gitignore
+ .github/
+
+ # Testing
+ .pytest_cache/
+ .coverage
+ htmlcov/
+ *.cover
+
+ # Documentation
+ *.md
+ !README.md
+ docs/
+
+ # Environment files
+ .env
+ .env.*
+
+ # Build artifacts
+ build/
+ dist/
+ *.egg-info/
+
+ # Jupyter
+ .ipynb_checkpoints/
+ *.ipynb
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Temporary files
+ tmp/
+ temp/
+ *.log
.env.example ADDED
@@ -0,0 +1,6 @@
+ # Rename this file to .env and fill in your API keys
+ GOOGLE_API_KEY="your_google_api_key_here"
+ OPENAI_API_KEY="your_openai_api_key_here"
+
+ # Local Ollama server
+ OLLAMA_BASE_URL=http://localhost:11434
.github/ISSUE_TEMPLATE/build.yaml ADDED
@@ -0,0 +1,19 @@
+ name: "Build request"
+ description: Changes to the build system or dependencies, such as build scripts or configuration updates.
+ title: "build(): "
+ labels: [build]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: unclear_section
+ attributes:
+ label: What dependency/dockerfile/image should be changed?
+ description: Inform what should be changed.
+ placeholder: Inform where and what should be changed.
+
+ - type: textarea
+ id: solution_description
+ attributes:
+ label: Describe the change you'd like
+ description: Inform the change requested.
+ placeholder: Bump package X from version 1.0.0 to version 1.0.1
.github/ISSUE_TEMPLATE/chore.yaml ADDED
@@ -0,0 +1,12 @@
+ name: "Chore request"
+ description: Routine tasks or administrative updates not directly related to code functionality.
+ title: "chore(): "
+ labels: [chore]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: solution_description
+ attributes:
+ label: Describe the solution you'd like
+ description: The solution that should be implemented.
+ placeholder: E.g. Add secrets to GitHub Secret.
.github/ISSUE_TEMPLATE/ci.yaml ADDED
@@ -0,0 +1,11 @@
+ name: "CI request"
+ description: Modifications related to continuous integration and deployment processes.
+ title: "ci: "
+ labels: [ci]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: unclear_section
+ attributes:
+ label: What CI modification is needed?
+ description: Provide a clear and concise description of what should be done and where.
.github/ISSUE_TEMPLATE/docs.yaml ADDED
@@ -0,0 +1,12 @@
+ name: "Documentation request"
+ description: Request documentation for functions, scripts, modules, etc.
+ title: "docs(): "
+ labels: [docs]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: unclear_section
+ attributes:
+ label: What is not clear for you?
+ description: Provide a clear and concise description of what is unclear. For example, mention specific parts of the code or documentation that are difficult to understand.
+ placeholder: Describe the problem you encountered.
.github/ISSUE_TEMPLATE/feat.yaml ADDED
@@ -0,0 +1,19 @@
+ name: "Feature request"
+ description: Request a new feature or enhancement to existing functionality.
+ title: "feat(): "
+ labels: [feat]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: tasks
+ attributes:
+ label: What are the tasks?
+ description: Provide a clear and concise description of the tasks to be completed. Include details about priority, urgency, and due dates.
+ placeholder: List and describe the tasks.
+
+ - type: textarea
+ id: deliverables
+ attributes:
+ label: What are the expected deliverables?
+ description: Provide a detailed description of the expected deliverables, including minimal deliverables, nice-to-have features, and follow-up actions.
+ placeholder: Describe the deliverables and outcomes.
.github/ISSUE_TEMPLATE/fix.yaml ADDED
@@ -0,0 +1,24 @@
+ name: "Fix request"
+ description: Bug fixes or patches to resolve issues in the codebase.
+ title: "fix(): "
+ labels: [bug, fix]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: description
+ attributes:
+ label: Describe the bug
+ description: A clear and concise description of what the bug is. Include the current behavior versus the expected behavior. Add any other context about the problem here as well.
+ placeholder: What brings you to realize this bug?
+
+ - type: textarea
+ id: to_reproduce
+ attributes:
+ label: To reproduce
+ description: Code snippet to reproduce the bug if possible.
+
+ - type: textarea
+ id: solution
+ attributes:
+ label: Proposed solution
+ description: What solution do you propose to fix the bug? What are the alternatives?
.github/ISSUE_TEMPLATE/perf.yaml ADDED
@@ -0,0 +1,12 @@
+ name: "Performance refactor request"
+ description: Performance improvements, such as optimizations to make the code faster or more efficient.
+ title: "perf(): "
+ labels: [perf]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: challenges
+ attributes:
+ label: What should have performance improvement?
+ description: Performance improvements, such as optimizations to make the code faster or more efficient.
+ placeholder: A clear and concise description of what should have a performance improvement.
.github/ISSUE_TEMPLATE/refactor.yaml ADDED
@@ -0,0 +1,19 @@
+ name: "Code refactor request"
+ description: Request for code refactoring to improve code quality and structure.
+ title: "refactor(): "
+ labels: [refactor]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: challenges
+ attributes:
+ label: What should be changed?
+ description: A clear and concise description of what and where should be changed.
+ placeholder: What issues have you identified in the current code?
+
+ - type: textarea
+ id: suggestions
+ attributes:
+ label: What are the suggestions?
+ description: A description of the proposed code or file structure. Include advantages and disadvantages. Note refactoring should not break existing functionality.
+ placeholder: What changes do you propose?
.github/ISSUE_TEMPLATE/style.yaml ADDED
@@ -0,0 +1,12 @@
+ name: "Style refactor request"
+ description: Changes to code formatting and style, without affecting functionality.
+ title: "style(): "
+ labels: [style]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: challenges
+ attributes:
+ label: What styles need to be changed?
+ description: A clear and concise description of what needs to be changed.
+ placeholder: What issues have you identified in the current code?
.github/ISSUE_TEMPLATE/test.yaml ADDED
@@ -0,0 +1,12 @@
+ name: "Tests"
+ description: Updates related to testing, including adding or modifying test cases.
+ title: "test(): "
+ labels: [test]
+ projects: [instadeepai/141]
+ body:
+ - type: textarea
+ id: challenges
+ attributes:
+ label: What needs to be tested?
+ description: A clear and concise description of what needs to be tested and where.
+ placeholder: What needs to be tested in the current code?
.github/actions/tools/huggingface/action.yaml ADDED
@@ -0,0 +1,68 @@
+ name: 'HuggingFace Space'
+ description: 'Push to a HuggingFace Space repository'
+ inputs:
+ token:
+ description: 'Hugging Face API token'
+ required: true
+ space:
+ description: 'Hugging Face Space name'
+ required: true
+ branch:
+ description: 'Branch to push to'
+ required: true
+ runtime-secrets:
+ description: 'Runtime secrets to sync to HuggingFace Space'
+ required: false
+ runs:
+ using: 'composite'
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v5
+ with:
+ fetch-depth: 0
+ lfs: true
+
+ - name: Check large files
+ uses: ActionsDesk/lfs-warning@v2.0
+ with:
+ filesizelimit: 10485760 # this is 10MB so we can sync to HF Spaces
+
+ - name: Install HuggingFace CLI
+ shell: bash
+ run: pip install -U "huggingface_hub[cli]"
+
+ - name: Push to HuggingFace Space
+ shell: bash
+ env:
+ HF_TOKEN: ${{ inputs.token }}
+ run: |
+ export PATH="$HOME/.local/bin:$PATH"
+ hf auth login --token $HF_TOKEN
+ hf upload ${{ inputs.space }} . . --repo-type=space --revision=${{ inputs.branch }} --commit-message="Sync from GitHub (${{ inputs.branch }})"
+
+ - name: Configure Space Secrets
+ if: ${{ inputs.runtime-secrets != '' }}
+ shell: bash
+ env:
+ HF_SPACE: ${{ inputs.space }}
+ HF_TOKEN: ${{ inputs.token }}
+ RUNTIME_SECRETS: ${{ inputs.runtime-secrets }}
+ run: |
+ python3 ${GITHUB_ACTION_PATH}/secrets.py
+
+ - name: Create deployment summary
+ shell: bash
+ run: |
+ if [ "${{ inputs.branch }}" = "main" ]; then
+ SPACE_URL="https://huggingface.co/spaces/${{ inputs.space }}"
+ BRANCH_TEXT="main"
+ else
+ SPACE_URL="https://huggingface.co/spaces/${{ inputs.space }}/tree/${{ inputs.branch }}"
+ BRANCH_TEXT="${{ inputs.branch }}"
+ fi
+
+ echo "## 🚀 HuggingFace Space Deployment" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "✅ Successfully deployed to **${BRANCH_TEXT}** branch" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "🔗 **App URL:** ${SPACE_URL}" >> $GITHUB_STEP_SUMMARY
.github/actions/tools/huggingface/secrets.py ADDED
@@ -0,0 +1,91 @@
+ #!/usr/bin/env python3
+ """
+ Sync secrets from GitHub Actions to HuggingFace Space.
+
+ Reads configuration from environment variables:
+ HF_SPACE: HuggingFace Space repository ID (e.g., "InstaDeepAI/sentinel")
+ HF_TOKEN: HuggingFace API token
+ RUNTIME_SECRETS: Multi-line string with secrets in format "KEY: value"
+ """
+
+ import logging
+ import os
+
+ from huggingface_hub import HfApi
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
+
+
+ def extract(payload):
+ """Parse secrets from YAML-like format.
+
+ Args:
+ payload: Multi-line string with secrets in format "KEY: value"
+
+ Returns:
+ Dictionary mapping secret keys to values
+ """
+ secrets = {}
+ if not payload:
+ return secrets
+
+ for line in payload.strip().split("\n"):
+ line = line.strip()
+ if ":" in line and line:
+ key, value = line.split(":", 1)
+ key = key.strip()
+ value = value.strip()
+
+ if key and value: # Only add if both key and value are non-empty
+ secrets[key] = value
+
+ return secrets
+
+
+ def upload(repository, token, payload):
+ """Sync secrets to HuggingFace Space.
+
+ Args:
+ repository: HuggingFace Space repository ID
+ token: HuggingFace API token
+ payload: Multi-line string with secrets in format "KEY: value"
+
+ Raises:
+ RuntimeError: If any secret fails to sync
+ """
+ client = HfApi(token=token)
+ secrets = extract(payload)
+
+ if not secrets:
+ logging.info("No runtime secrets to configure")
+ return
+
+ count = 0
+ for key, value in secrets.items():
+ try:
+ client.add_space_secret(repo_id=repository, key=key, value=value)
+ logging.info("Added %s secret to HuggingFace Space", key)
+ count += 1
+ except Exception as e:
+ logging.error("Failed to add %s: %s", key, e)
+ raise RuntimeError(f"Failed to sync secret {key}") from e
+
+ logging.info("Successfully configured %d secret(s)", count)
+
+
+ if __name__ == "__main__":
+ # Read configuration from environment variables
+ repository = os.getenv("HF_SPACE")
+ token = os.getenv("HF_TOKEN")
+ payload = os.getenv("RUNTIME_SECRETS", "")
+
+ # Validate required environment variables
+ if not repository:
+ raise ValueError("HF_SPACE environment variable is required")
+
+ if not token:
+ raise ValueError("HF_TOKEN environment variable is required")
+
+ # Run the sync - any exceptions will naturally exit with code 1
+ upload(repository, token, payload)
.github/actions/tools/pr-title-generator/action.yaml ADDED
@@ -0,0 +1,52 @@
+ name: 'PR Title Generator'
+ description: 'Updates PR title and body based on issue number from branch name'
+
+ inputs:
+ github-token:
+ description: 'GitHub token for API access'
+ required: true
+
+ runs:
+ using: 'composite'
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v4
+
+ - name: Git config
+ shell: bash
+ run: |
+ git config --global --add safe.directory '*'
+
+ - name: Install GitHub CLI
+ shell: bash
+ run: |
+ (type -p wget >/dev/null || (sudo apt update && sudo apt-get install wget -y)) \
+ && sudo mkdir -p -m 755 /etc/apt/keyrings \
+ && out=$(mktemp) && wget -nv -O$out https://cli.github.com/packages/githubcli-archive-keyring.gpg \
+ && cat $out | sudo tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null \
+ && sudo chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
+ && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
+ && sudo apt update \
+ && sudo apt install gh -y
+
+ - name: Update PR Title and Body
+ shell: bash
+ env:
+ GITHUB_TOKEN: ${{ inputs.github-token }}
+ run: |
+ branch_name="${{ github.event.pull_request.head.ref }}"
+ issue_number=$(echo "$branch_name" | grep -o '^[0-9]\+')
+
+ if [ -z "$issue_number" ]; then
+ echo "Error: Branch name does not start with an issue number"
+ exit 1
+ fi
+
+ # Update PR title
+ issue_title=$(gh api "/repos/instadeepai/sentinel/issues/$issue_number" --jq '.title')
+ gh pr edit ${{ github.event.pull_request.number }} --title "$issue_title"
+
+ # Update PR body
+ current_body=$(gh pr view ${{ github.event.pull_request.number }} --json body --jq '.body')
+ updated_body=$(echo "$current_body" | sed "s/(issue)/#$issue_number/g")
+ gh pr edit ${{ github.event.pull_request.number }} --body "$updated_body"
.github/actions/tools/pre-commit/action.yaml ADDED
@@ -0,0 +1,74 @@
+ name: 'Pre-commit'
+ description: 'Pre-commit'
+
+ runs:
+ using: 'composite'
+ steps:
+ - name: Set up Python
+ uses: actions/setup-python@v5
+ with:
+ python-version: '3.12'
+
+ - name: Install uv
+ uses: astral-sh/setup-uv@v6
+ with:
+ enable-cache: true
+
+ - name: Install dependencies
+ shell: bash
+ run: |
+ uv sync --frozen --all-extras
+
+ - name: Install pre-commit hooks
+ shell: bash
+ run: |
+ source .venv/bin/activate
+ uv run pre-commit install-hooks
+
+ - name: Run Pre-commit
+ id: precommit
+ shell: bash
+ run: |
+ echo "## Pre-commit Results" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+
+ if uv run pre-commit run --all-files 2>&1 | tee output.txt; then
+ echo "✅ **All pre-commit hooks passed!**" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "| Hook | Status |" >> $GITHUB_STEP_SUMMARY
+ echo "|------|--------|" >> $GITHUB_STEP_SUMMARY
+ grep -E "\.\.\.*Passed|\.\.\.*Skipped" output.txt | while read line; do
+ hook=$(echo "$line" | sed 's/\.\.\..*Passed.*//' | sed 's/\.\.\..*Skipped.*//' | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$//')
+ if echo "$line" | grep -q "Passed"; then
+ echo "| $hook | ✅ Passed |" >> $GITHUB_STEP_SUMMARY
+ else
+ echo "| $hook | ⏭️ Skipped |" >> $GITHUB_STEP_SUMMARY
+ fi
+ done
+ else
+ echo "❌ **Some pre-commit hooks failed**" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "| Hook | Status |" >> $GITHUB_STEP_SUMMARY
+ echo "|------|--------|" >> $GITHUB_STEP_SUMMARY
+ grep -E "\.\.\.*Passed|\.\.\.*Failed|\.\.\.*Skipped" output.txt | while read line; do
+ hook=$(echo "$line" | sed 's/\.\.\..*Passed.*//' | sed 's/\.\.\..*Failed.*//' | sed 's/\.\.\..*Skipped.*//' | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$//')
+ if echo "$line" | grep -q "Passed"; then
+ echo "| $hook | ✅ Passed |" >> $GITHUB_STEP_SUMMARY
+ elif echo "$line" | grep -q "Failed"; then
+ echo "| $hook | ❌ Failed |" >> $GITHUB_STEP_SUMMARY
+ else
+ echo "| $hook | ⏭️ Skipped |" >> $GITHUB_STEP_SUMMARY
+ fi
+ done
+
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "<details>" >> $GITHUB_STEP_SUMMARY
+ echo "<summary>📋 Click to see detailed error output</summary>" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo '```' >> $GITHUB_STEP_SUMMARY
+ cat output.txt >> $GITHUB_STEP_SUMMARY
+ echo '```' >> $GITHUB_STEP_SUMMARY
+ echo "</details>" >> $GITHUB_STEP_SUMMARY
+
+ exit 1
+ fi
.github/actions/tools/pytest/action.yaml ADDED
@@ -0,0 +1,104 @@
+ name: 'Pytest'
+ description: 'Pytest'
+
+ runs:
+ using: 'composite'
+ steps:
+ - name: Checkout code
+ uses: actions/checkout@v5
+
+ - name: Set up Python 3.12
+ uses: actions/setup-python@v5
+ with:
+ python-version: '3.12'
+
+ - name: Install uv
+ uses: astral-sh/setup-uv@v6
+ with:
+ enable-cache: true
+
+ - name: Install dependencies
+ shell: bash
+ run: |
+ uv sync --frozen
+
+ - name: Run all tests with coverage
+ id: pytest
+ shell: bash
+ run: |
+ echo "## Pytest Results" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+
+ if uv run pytest tests/ -v --tb=short --junitxml=pytest-results.xml --cov=src --cov-report=term-missing --cov-report=xml --cov-report=html 2>&1 | tee pytest-output.txt; then
+ echo "✅ **All tests passed!**" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+
+ # Extract test summary from pytest output
+ if grep -q "passed" pytest-output.txt; then
+ passed_count=$(grep -o '[0-9]\+ passed' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| Status | Count |" >> $GITHUB_STEP_SUMMARY
+ echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
+ echo "| ✅ Passed | $passed_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ if grep -q "skipped" pytest-output.txt; then
+ skipped_count=$(grep -o '[0-9]\+ skipped' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| ⏭️ Skipped | $skipped_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ if grep -q "warnings" pytest-output.txt; then
+ warnings_count=$(grep -o '[0-9]\+ warnings' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| ⚠️ Warnings | $warnings_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "### 📊 Test Summary" >> $GITHUB_STEP_SUMMARY
+ echo '```' >> $GITHUB_STEP_SUMMARY
+ tail -10 pytest-output.txt >> $GITHUB_STEP_SUMMARY
+ echo '```' >> $GITHUB_STEP_SUMMARY
+
+ else
+ echo "❌ **Some tests failed**" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+
+ # Extract test summary from pytest output
+ if grep -q "passed" pytest-output.txt; then
+ passed_count=$(grep -o '[0-9]\+ passed' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| Status | Count |" >> $GITHUB_STEP_SUMMARY
+ echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
+ echo "| ✅ Passed | $passed_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ if grep -q "failed" pytest-output.txt; then
+ failed_count=$(grep -o '[0-9]\+ failed' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| ❌ Failed | $failed_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ if grep -q "skipped" pytest-output.txt; then
+ skipped_count=$(grep -o '[0-9]\+ skipped' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| ⏭️ Skipped | $skipped_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ if grep -q "warnings" pytest-output.txt; then
+ warnings_count=$(grep -o '[0-9]\+ warnings' pytest-output.txt | grep -o '[0-9]\+' | head -1)
+ echo "| ⚠️ Warnings | $warnings_count |" >> $GITHUB_STEP_SUMMARY
+ fi
+
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "<details>" >> $GITHUB_STEP_SUMMARY
+ echo "<summary>📋 Click to see detailed test output</summary>" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo '```' >> $GITHUB_STEP_SUMMARY
+ cat pytest-output.txt >> $GITHUB_STEP_SUMMARY
+ echo '```' >> $GITHUB_STEP_SUMMARY
+ echo "</details>" >> $GITHUB_STEP_SUMMARY
+
+ exit 1
+ fi
+
+ # Coverage report (shown for both success and failure)
+ echo "" >> $GITHUB_STEP_SUMMARY
+ echo "### 📈 Coverage Report" >> $GITHUB_STEP_SUMMARY
+ echo "" >> $GITHUB_STEP_SUMMARY
+ # Convert coverage output to markdown table
+ python3 ${GITHUB_ACTION_PATH}/markdown.py pytest-output.txt >> $GITHUB_STEP_SUMMARY
.github/actions/tools/pytest/markdown.py ADDED
@@ -0,0 +1,89 @@
+ """
+ Convert pytest coverage output to markdown table format.
+ """
+
+ import re
+ import sys
+
+
+ def coverage_to_markdown(output_file: str) -> None:
+ """Convert pytest coverage output to markdown table.
+
+ Args:
+ output_file: Path to the pytest coverage text report.
+
+ Returns:
+ None: The function prints the markdown table to stdout.
+ """
+ try:
+ with open(output_file) as f:
+ content = f.read()
+ except FileNotFoundError:
+ print("| Error | Coverage output file not found | - | - | - |")
+ return
+
+ # Find the coverage section
+ lines = content.split("\n")
+ in_coverage_section = False
+ coverage_lines = []
+ total_line = ""
+
+ for line in lines:
+ if "Name" in line and "Stmts" in line and "Miss" in line and "Cover" in line:
+ in_coverage_section = True
+ continue
+ elif in_coverage_section:
+ if line.strip() == "" or line.startswith("="):
+ continue
+ elif line.startswith("TOTAL"):
+ total_line = line.strip()
+ break
+ elif line.strip():
+ coverage_lines.append(line.strip())
+
+ # Print markdown table header
+ print("| File | Statements | Missing | Coverage | Missing Lines |")
+ print("|------|------------|---------|----------|---------------|")
+
+ # Parse each coverage line
+ for line in coverage_lines:
+ # Match pattern: filename.py 123 45 67% 12, 34-56, 78
+ match = re.match(r"^([^\s]+\.py)\s+(\d+)\s+(\d+)\s+(\d+)%\s*(.*)$", line)
+ if match:
+ filename = match.group(1)
+ statements = int(match.group(2))
+ missing = int(match.group(3))
+ coverage_pct = int(match.group(4))
+ missing_details = match.group(5).strip()
+
+ # Clean up filename (remove src/ prefix if present)
+ clean_filename = filename.replace("src/", "")
+
+ # Format missing lines
+ if missing_details and missing_details != "-":
+ # Limit the missing details to avoid overly long tables
+ if len(missing_details) > 40:
+ missing_details = missing_details[:37] + "..."
+ missing_cell = f"`{missing_details}`"
+ else:
+ missing_cell = "None"
+
+ print(
+ f"| {clean_filename} | {statements} | {missing} | {coverage_pct}% | {missing_cell} |"
+ )
+
+ # Add total row
+ if total_line:
+ match = re.match(r"^TOTAL\s+(\d+)\s+(\d+)\s+(\d+)%", total_line)
+ if match:
+ statements = int(match.group(1))
+ missing = int(match.group(2))
+ coverage_pct = int(match.group(3))
+ print(
+ f"| **TOTAL** | **{statements}** | **{missing}** | **{coverage_pct}%** | - |"
+ )
+
+
+ if __name__ == "__main__":
+ output_file = sys.argv[1] if len(sys.argv) > 1 else "pytest-output.txt"
+ coverage_to_markdown(output_file)
.github/pull_request_template.md ADDED
@@ -0,0 +1,5 @@
+ ### Description
+
+
+
+ Fixes (issue)
.github/workflows/chore.yaml ADDED
@@ -0,0 +1,28 @@
+ name: Chore
+
+ on:
+ pull_request:
+ types: [opened, edited]
+
+ jobs:
+ update-pr-title:
+ if: github.event_name == 'pull_request' && (github.event.action == 'opened' || github.event.action == 'edited')
+ runs-on: instadeep-ci
+ container:
+ image: ghcr.io/catthehacker/ubuntu:runner-latest
+ credentials:
+ username: ${{ github.actor }}
+ password: ${{ secrets.github_token }}
+ permissions: write-all
+
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v5
+ with:
+ ref: ${{ github.head_ref }}
+ fetch-depth: 0
+
+ - name: Update PR Title and Body
+ uses: ./.github/actions/tools/pr-title-generator
+ with:
+ github-token: ${{ github.token }}
.github/workflows/main.yaml ADDED
@@ -0,0 +1,67 @@
+ name: Main Workflow
+
+ on:
+ push:
+
+ concurrency:
+ group: ${{ github.ref_name }}
+ cancel-in-progress: true
+
+ jobs:
+ pre-commit:
+ runs-on:
+ group: kao-products-runners
+ labels: instadeep-ci-4
+ container:
+ image: ghcr.io/catthehacker/ubuntu:runner-latest
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v5
+ with:
+ ref: ${{ github.head_ref }}
+ fetch-depth: 0
+
+ - name: Pre-commit
+ uses: ./.github/actions/tools/pre-commit
+
+ pytest:
+ runs-on:
+ group: kao-products-runners
+ labels: instadeep-ci
+ container:
+ image: ghcr.io/catthehacker/ubuntu:runner-latest
+ env:
+ CI: 1
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v5
+ with:
+ ref: ${{ github.head_ref }}
+ fetch-depth: 0
+
+ - name: Pytest
+ uses: ./.github/actions/tools/pytest
+
+ hugging-face:
+ if: github.ref == 'refs/heads/main'
+ runs-on:
+ group: kao-products-runners
+ labels: instadeep-ci
+ container:
+ image: ghcr.io/catthehacker/ubuntu:runner-latest
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v5
+ with:
+ ref: ${{ github.head_ref || github.ref_name }}
+ fetch-depth: 0
+ lfs: true
+
+ - name: Hugging Face
+ uses: ./.github/actions/tools/huggingface
+ with:
+ token: ${{ secrets.HF_TOKEN }}
+ space: "InstaDeepAI/sentinel"
+ branch: main
+ runtime-secrets: |
+ GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
.gitignore ADDED
@@ -0,0 +1,100 @@
+ # Python general
+ __pycache__/
+ *.py[cod]
+ *.so
+ *.egg
+ *.egg-info/
+ *.pyd
+ .DS_Store
+
+ # Virtual environments
+ .venv/
+
+ # Byte-compiled / optimized / DLL files
+ *.pyc
+
+ # Distribution / packaging
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ .eggs/
+ .eggs-info/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ # Usually these files are written by a python script from a template
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ .hypothesis/
+ .pytest_cache/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # IPython
+ profile_default/
+ ipython_config.py
+
+ # pyenv
+ .python-version
+
+ # celery beat schedule file
+ celerybeat-schedule
+
+ # dotenv
+ .env
+ .env.*
+ !.env.example
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ compiled/
+
+ # Pyre type checker
+ .pyre/
+
+ # pyright type checker
+ pyrightconfig.json
+
+ # pytype
+ .pytype/
+
+ # Cython debug symbols
+ cython_debug/
+
+ # Reports
+ outputs/
+ *.xlsx
+ *.pdf
+
+ # Cursor
+ .cursor/
.pre-commit-config.yaml ADDED
@@ -0,0 +1,111 @@
+ default_language_version:
+ python: python3.12
+
+ default_stages: [pre-commit]
+
+ repos:
+ - repo: https://github.com/hakancelikdev/unimport
+ rev: 1.3.0
+ hooks:
+ - id: unimport
+ args:
+ - --remove
+ - repo: https://github.com/astral-sh/ruff-pre-commit
+ rev: v0.13.1
+ hooks:
+ - id: ruff-format
+ - id: ruff-check
+ args: [--fix, --exit-non-zero-on-fix]
+
+ - repo: https://github.com/kynan/nbstripout
+ rev: 0.8.1
+ hooks:
+ - id: nbstripout
+
+ - repo: https://github.com/codespell-project/codespell
+ rev: v2.4.1
+ hooks:
+ - id: codespell
+ name: codespell
+ description: Checks for common misspellings in text files.
+ entry: codespell --skip="*.js,*.html,*.css, *.svg" --ignore-words=.codespell-ignore.txt
+ language: python
+ types: [text]
+
+ - repo: https://github.com/pre-commit/pre-commit-hooks
+ rev: v6.0.0
+ hooks:
+ - id: debug-statements
+ - id: check-ast # Simply check whether the files parse as valid python
+ - id: check-case-conflict # Check for files that would conflict in case-insensitive filesystems
+ - id: check-builtin-literals # Require literal syntax when initializing empty or zero Python builtin types
+ - id: check-docstring-first # Check a common error of defining a docstring after code
+ - id: check-merge-conflict # Check for files that contain merge conflict strings
+ - id: check-yaml # Check yaml files
+ args: ["--unsafe"] # Allows special tags in mkdocs.yaml
+ - id: end-of-file-fixer # Ensure that a file is either empty, or ends with one newline
+ exclude: end-to-end-pipeline/web/.*
+ - id: mixed-line-ending # Replace or checks mixed line ending
+ - id: trailing-whitespace # This hook trims trailing whitespace
+ - id: file-contents-sorter # Sort the lines in specified files
+ files: .*requirements*\.txt$
+
+ - repo: https://github.com/google/yamlfmt
+ rev: v0.17.2
+ hooks:
+ - id: yamlfmt
+ args: ["-formatter", "retain_line_breaks_single=true,pad_line_comments=2"]
+
+ - repo: https://github.com/asottile/pyupgrade
+ rev: v3.20.0
+ hooks:
+ - id: pyupgrade
+ args: [--py312-plus]
+
+ # The following hook sorts and formats toml files
+ - repo: https://github.com/pappasam/toml-sort
+ rev: v0.24.3
+ hooks:
+ - id: toml-sort
+ description: "Sort and format toml files."
+ args:
+ - --all
+ - --in-place
+
+ # The following hook checks for secrets in the code
+ - repo: https://github.com/zricethezav/gitleaks
+ rev: v8.28.0
+ hooks:
+ - id: gitleaks
+
+ # The following hook checks for secrets in the code
+ - repo: https://github.com/trufflesecurity/trufflehog
+ rev: v3.90.8
+ hooks:
+ - id: trufflehog
+
+ - repo: local
+ hooks:
+ - id: pylint
+ name: pylint
+ entry: pylint
+ language: python
+ additional_dependencies: ["pylint"]
+ types: [python]
+ args: ["--disable=all", "--enable=missing-docstring,unused-argument"]
+ exclude: 'test_\.py$'
+
+ # The following hook check docstrings quality
+ - repo: https://github.com/terrencepreilly/darglint
+ rev: v1.8.1
+ hooks:
+ - id: darglint
+ args: ["--docstring-style=google"]
+ exclude: 'src/sentinel/risk_models/qcancer\.py$'
+
+ # The following hook checks for docstring in functions
+ - repo: https://github.com/pycqa/pydocstyle
+ rev: 6.3.0
+ hooks:
+ - id: pydocstyle
+ args: ["--select=D103", "--match-dir=(genomics_research|projects)"]
.streamlit/config.toml ADDED
@@ -0,0 +1,6 @@
+ [theme]
+ backgroundColor = "#FFFFFF"
+ font = "Roboto"
+ primaryColor = "#007AFF"
+ secondaryBackgroundColor = "#F8FBFF"
+ textColor = "#0059B3"
AGENTS.md ADDED
@@ -0,0 +1,55 @@
+ # Repo Guidelines
+
+ This repository contains the LLM-based Cancer Risk Assessment Assistant.
+
+ ## Core Technologies
+ - **FastAPI** for the web framework
+ - **LangChain** for LLM orchestration
+ - **uv** for environment and dependency management
+ - **hydra** for configuration management
+
+ ## Coding Philosophy
+ - Prioritize clarity and reusability.
+ - Favor simple replication over heavy abstraction.
+ - Keep comments short and only where the code isn't self-explanatory.
+ - Avoid verbose docstrings for simple functions.
+
+ ## Testing
+ - Write meaningful tests that verify core functionality and prevent regressions.
+ - Run tests with `uv run pytest`.
+
+ ## Development Setup
+ - Create the virtual environment (at '.venv') with `uv sync`.
+
+ ## Running commands
+ - As the repository uses uv, all commands should be run through uv, e.g., "uv run python ..." NOT "python ...".
+
+ These guidelines apply to the entire repository. A multi-page Streamlit
+ interface for expert feedback can be launched with `uv run streamlit run
+ apps/streamlit_ui/main.py`.
+ The first page, **User Profile**, allows experts to load or create a profile
+ stored in `st.session_state.user_profile`.
+ The second page, **Configuration**, lets experts choose the model and knowledge base modules while previewing the generated prompt.
+ The third page, **Assessment**, runs the AI analysis, displays a results dashboard, and provides export and chat options.
+
+ ## Important Note for Developers
+
+ When making changes to the project, ensure that the following files are updated to reflect the changes:
+
+ - `README.md`
+ - `AGENTS.md`
+ - `GEMINI.md`
+
+ ## Risk Model Coverage
+
+ Implemented risk calculators include:
+ - **Gail** - Breast cancer risk
+ - **Claus** - Breast cancer risk based on family history
+ - **PLCOm2012** - Lung cancer risk
+ - **CRC-PRO** - Colorectal cancer risk
+ - **PCPT** - Prostate cancer risk
+ - **Extended PBCG** - Prostate cancer risk (extended model)
+ - **BOADICEA** - Breast and ovarian cancer risk (via CanRisk API)
+ - **QCancer** - Multi-site cancer differential
+
+ Additional models should follow the interfaces under `src/sentinel/risk_models`.
Dockerfile ADDED
@@ -0,0 +1,38 @@
+ FROM python:3.12-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install uv
+ COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
+
+ # Copy dependency files first for better caching
+ COPY pyproject.toml uv.lock* ./
+
+ # Copy the entire project
+ COPY . .
+
+ # Set UV cache directory to a writable location
+ ENV UV_CACHE_DIR=/tmp/uv-cache
+ ENV HOME=/tmp
+
+ # Install dependencies with uv
+ RUN uv sync --frozen --no-dev
+
+ # Create cache directory and set permissions
+ RUN mkdir -p /tmp/uv-cache && chmod -R 777 /tmp/uv-cache
+
+ # Make /app directory writable for non-root users (required for HuggingFace Spaces)
+ RUN chmod -R 777 /app
+
+ # Expose Streamlit port
+ EXPOSE 8501
+
+ # Set environment variables for Streamlit
+ ENV STREAMLIT_SERVER_PORT=8501
+ ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
+ ENV STREAMLIT_SERVER_HEADLESS=true
+ ENV STREAMLIT_BROWSER_GATHER_USAGE_STATS=false
+
+ # Run Streamlit app
+ CMD ["uv", "run", "streamlit", "run", "apps/streamlit_ui/main.py"]
GEMINI.md ADDED
@@ -0,0 +1,55 @@
+ # Repo Guidelines
+
+ This repository contains the LLM-based Cancer Risk Assessment Assistant.
+
+ ## Core Technologies
+ - **FastAPI** for the web framework
+ - **LangChain** for LLM orchestration
+ - **uv** for environment and dependency management
+ - **hydra** for configuration management
+
+ ## Coding Philosophy
+ - Prioritize clarity and reusability.
+ - Favor simple replication over heavy abstraction.
+ - Keep comments short and only where the code isn't self-explanatory.
+ - Avoid verbose docstrings for simple functions.
+
+ ## Testing
+ - Write meaningful tests that verify core functionality and prevent regressions.
+ - Run tests with `uv run pytest`.
+
+ ## Development Setup
+ - Create the virtual environment (at '.venv') with `uv sync`.
+
+ ## Running commands
+ - As the repository uses uv, all commands should be run through uv, e.g., "uv run python ..." NOT "python ...".
+
+ These guidelines apply to the entire repository. A multi-page Streamlit
+ interface for expert feedback can be launched with `uv run streamlit run
+ apps/streamlit_ui/main.py`.
+ The first page, **User Profile**, allows experts to load or create a profile
+ stored in `st.session_state.user_profile`.
+ The second page, **Configuration**, lets experts choose the model and knowledge base modules while previewing the generated prompt.
+ The third page, **Assessment**, runs the AI analysis, displays a results dashboard, and provides export and chat options.
+
+ ## Important Note for Developers
+
+ When making changes to the project, ensure that the following files are updated to reflect the changes:
+
+ - `README.md`
+ - `AGENTS.md`
+ - `GEMINI.md`
+
+ ## Risk Model Availability
+
+ Risk calculators exposed to Gemini-based agents include:
+ - **Gail** - Breast cancer risk
+ - **Claus** - Breast cancer risk based on family history
+ - **PLCOm2012** - Lung cancer risk
+ - **CRC-PRO** - Colorectal cancer risk
+ - **PCPT** - Prostate cancer risk
+ - **Extended PBCG** - Prostate cancer risk (extended model)
+ - **BOADICEA** - Breast and ovarian cancer risk (via CanRisk API)
+ - **QCancer** - Multi-site cancer differential
+
+ Register additional models in `src/sentinel/risk_models/__init__.py` so they are available system-wide.
README.md CHANGED
@@ -1,12 +1,173 @@
  ---
- title: Sentinel
- emoji: 📚
- colorFrom: red
- colorTo: indigo
- sdk: gradio
- sdk_version: 5.49.1
- app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Sentinel - Cancer Risk Assessment Assistant
+ emoji: 🏥
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ app_port: 8501
  pinned: false
  ---

+ # LLM-based Cancer Risk Assessment Assistant
+
+ This project is an API service that provides preliminary cancer risk assessments based on user-provided data. It is built using FastAPI and LangChain, with a flexible architecture that supports both local and API-based LLMs.
+
+ ## Development Setup
+
+ 1. Create the virtual environment:
+
+ ```bash
+ uv sync
+ ```
+
+ ## External API Configuration
+
+ For risk models that require external APIs, such as CanRisk (BOADICEA model), fill in the following section of the `.env` file:
+
+ ```bash
+ # .env
+ CANRISK_USERNAME=your_canrisk_username
+ CANRISK_PASSWORD=your_canrisk_password
+ ```
+
+ Then source it: `source .env`
+
+ For CanRisk API access, register at https://www.canrisk.org/.
+
+ ## Using a Local LLM (Ollama)
+
+ 1. Install [Ollama](https://ollama.com) for your platform.
+ 2. Pull the default model from the command line:
+
+ ```bash
+ ollama pull gemma3:4b
+ ```
+ 3. Ensure the Ollama desktop app or server is running. You can check your installed models with `ollama list`.
+
+ ## Using API-based LLMs (Google)
+
+ 1. Create a `.env` file in the project root with your `GOOGLE_API_KEY`:
+
+ ```bash
+ echo "GOOGLE_API_KEY=your_key_here" > .env
+ ```
+
+ Make sure the Generative AI API is enabled for your Google Cloud project.
+
+ 2. Run the command line demo with the Google provider (default):
+
+ ```bash
+ uv run python apps/cli/main.py
+ ```
+
+ Switch to the local model with:
+
+ ```bash
+ uv run python apps/cli/main.py model=gemma3_4b
+ ```
+
+ 3. The `model` override also works with the Streamlit and FastAPI interfaces.
+
+
+ ## Interactive Demo
+
+ Run a simple command line demo with:
+
+ ```bash
+ uv run python apps/cli/main.py
+ ```
+
+ Enable developer mode and load user data from a file with:
+
+ ```bash
+ uv run python apps/cli/main.py dev_mode=true user_file=examples/user_example.yaml
+ ```
+
+ The script collects user data, prints the structured JSON assessment, and then allows follow-up questions in a chat-like loop. Type `quit` to exit.
+
+ A multi-page Streamlit app provides an expert feedback interface located at
+ `apps/streamlit_ui/main.py`.
+ The first page, **User Profile**, lets you upload or manually create a profile
+ before running assessments.
+ The **Configuration** page allows you to choose the model and knowledge base modules and shows a live preview of the full LLM prompt.
+ The **Assessment** page runs the model, shows a dashboard of results, and lets you export or chat with the assistant.
+
+ ### Exporting Reports
+
+ After the initial assessment is displayed in the terminal, you will be prompted to export the full report to a formatted file. You can choose to generate a PDF, an Excel file, or both. The generated files (e.g., `Cancer_Risk_Report_20250626_213000.pdf`) will be saved in the root directory of the project.
+
+ **Note:** This feature requires the `openpyxl` and `reportlab` libraries.
+
+ You can also provide a JSON or YAML file with all user information to skip the
+ interactive prompts:
+
+ ```bash
+ uv run python apps/cli/main.py user_file=examples/user_example.yaml
+ ```
+
+ To launch the Streamlit interface, run the following command from the root of the
+ project:
+
+ ```bash
+ uv run streamlit run apps/streamlit_ui/main.py
+ ```
+
+ *Note:* To serve the app locally you can use `ngrok`:
+ ```bash
+ ngrok http 8501
+ ```
+
+ ## Important Note for Developers
+
+ When making changes to the project, check if the following files should also be updated to reflect the changes:
+
+ - `README.md`
+ - `AGENTS.md`
+ - `GEMINI.md`
+
+ ## Available Risk Models
+
+ The assistant currently includes the following built-in risk calculators:
+
+ - Gail Model (Breast Cancer)
+ - PLCOm2012 (Lung Cancer)
+ - CRC-PRO (Colorectal Cancer)
+ - PCPT (Prostate Cancer)
+ - QCancer (Multi-site cancer differential)
+
+ ## Generating Documentation
+
+ The project includes a comprehensive PDF documentation generator that creates detailed documentation of all implemented risk models and their input requirements.
+
+ ### Generate Risk Model Documentation
+
+ To generate the PDF documentation:
+
+ ```bash
+ uv run python scripts/generate_documentation.py
+ ```
+
+ This will create a comprehensive PDF document (`docs/risk_model_documentation.pdf`) that includes:
+
+ 1. **Overview Section**:
+ - Cancer type coverage chart
+ - Statistics on implemented risk scores and cancer types covered
+
+ 2. **Detailed Model Information**:
+ - Description, interpretation, and references for each risk model
+ - Complete input requirements with field details, required status, units, and possible values/choices
+
+ 3. **Input-to-Cancer Mapping**:
+ - Reverse mapping showing which cancer types use each input field
+ - Possible values for each field
+ - Comprehensive coverage analysis
+
+ The documentation is automatically regenerated based on the current codebase, ensuring it stays up-to-date as new risk models and input fields are added.
+
+ ### Documentation Features
+
+ - **Comprehensive Coverage**: Documents all risk models and their input requirements
+ - **Visual Charts**: Includes cancer type coverage visualization
+ - **Detailed Tables**: Shows field specifications, constraints, and valid values
+ - **Professional Layout**: Clean, readable PDF format suitable for sharing
+ - **Auto-Generated**: Stays synchronized with code changes automatically
@@ -0,0 +1,587 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Risk Models Specification
2
+
3
+ This document outlines the requirements and specifications for implementing risk models in the Sentinel cancer risk assessment system.
4
+
5
+ ## Overview
6
+
7
+ Risk models in Sentinel are designed to calculate cancer risk scores using structured user input data. All risk models must follow a consistent architecture, use the new `UserInput` structure, implement proper validation, and maintain comprehensive test coverage.
8
+
9
+ ## Core Architecture
10
+
11
+ ### Base Class
12
+
13
+ All risk models must inherit from `RiskModel` in `src/sentinel/risk_models/base.py`:
14
+
15
+ ```python
16
+ from sentinel.risk_models.base import RiskModel
17
+
18
+ class YourRiskModel(RiskModel):
19
+ def __init__(self):
20
+ super().__init__("your_model_name")
21
+ ```
22
+
23
+ ### Required Methods
24
+
25
+ Every risk model must implement these abstract methods:
26
+
27
+ ```python
28
+ def compute_score(self, user: UserInput) -> str:
29
+ """Compute the risk score for a given user profile.
30
+
31
+ Args:
32
+ user: The user profile containing demographics, medical history, etc.
33
+
34
+ Returns:
35
+ str: Risk percentage as a string or an N/A message if inapplicable.
36
+
37
+ Raises:
38
+ ValueError: If required inputs are missing or invalid.
39
+ """
40
+
41
+ def cancer_type(self) -> str:
42
+ """Return the cancer type this model assesses."""
43
+ return "breast" # or "lung", "prostate", etc.
44
+
45
+ def description(self) -> str:
46
+ """Return a detailed description of the model."""
47
+
48
+ def interpretation(self) -> str:
49
+ """Return guidance on how to interpret the results."""
50
+
51
+ def references(self) -> list[str]:
52
+ """Return list of reference citations."""
53
+ ```
54
+
55
+ ## UserInput Structure
56
+
57
+ ### Required Imports
58
+
59
+ ```python
60
+ from typing import Annotated
61
+ from pydantic import Field
62
+ from sentinel.risk_models.base import RiskModel
63
+ from sentinel.user_input import (
64
+ # Import specific enums and models you need
65
+ CancerType,
66
+ ChronicCondition,
67
+ Demographics,
68
+ Ethnicity,
69
+ FamilyMemberCancer,
70
+ FamilyRelation,
71
+ FamilySide,
72
+ RelationshipDegree,
73
+ Sex,
74
+ SymptomEntry,
75
+ UserInput,
76
+ # ... other specific imports
77
+ )
78
+ ```
79
+
80
+ ### UserInput Hierarchy
81
+
82
+ The `UserInput` class follows a hierarchical structure:
83
+
84
+ ```
85
+ UserInput
86
+ ├── demographics: Demographics
87
+ │ ├── age_years: int
88
+ │ ├── sex: Sex (enum)
89
+ │ ├── ethnicity: Ethnicity | None
90
+ │ └── anthropometrics: Anthropometrics
91
+ │ ├── height_cm: float | None
92
+ │ └── weight_kg: float | None
93
+ ├── lifestyle: Lifestyle
94
+ │ ├── smoking: SmokingHistory
95
+ │ └── alcohol: AlcoholConsumption
96
+ ├── personal_medical_history: PersonalMedicalHistory
97
+ │ ├── chronic_conditions: list[ChronicCondition]
98
+ │ ├── previous_cancers: list[CancerType]
99
+ │ ├── genetic_mutations: list[GeneticMutation]
100
+ │ ├── tyrer_cuzick_polygenic_risk_score: float | None
101
+ │ └── # ... other fields
102
+ ├── female_specific: FemaleSpecific | None
103
+ │ ├── menstrual: MenstrualHistory
104
+ │ ├── parity: ParityHistory
105
+ │ └── breast_health: BreastHealthHistory
106
+ ├── symptoms: list[SymptomEntry]
107
+ └── family_history: list[FamilyMemberCancer]
108
+ ```
109
+
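+ For orientation, a minimal profile can be assembled from these sections. This is a sketch based on the test fixtures shown later in this document; which sub-models are strictly required depends on the Pydantic defaults in `src/sentinel/user_input.py`:
+
+ ```python
+ from sentinel.user_input import (
+     Anthropometrics,
+     Demographics,
+     Lifestyle,
+     PersonalMedicalHistory,
+     Sex,
+     SmokingHistory,
+     SmokingStatus,
+     UserInput,
+ )
+
+ # Minimal sketch of a profile; optional sections are omitted or left empty.
+ # Field names mirror the test example below; required defaults may differ.
+ user = UserInput(
+     demographics=Demographics(
+         age_years=52,
+         sex=Sex.FEMALE,
+         anthropometrics=Anthropometrics(height_cm=165.0, weight_kg=65.0),
+     ),
+     lifestyle=Lifestyle(smoking=SmokingHistory(status=SmokingStatus.NEVER)),
+     personal_medical_history=PersonalMedicalHistory(),
+     family_history=[],
+     symptoms=[],
+ )
+ ```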
110
+ ## REQUIRED_INPUTS Specification
111
+
112
+ ### Structure
113
+
114
+ Every risk model must define a `REQUIRED_INPUTS` class attribute using Pydantic's `Annotated` types with `Field` constraints:
115
+
116
+ ```python
117
+ REQUIRED_INPUTS: dict[str, tuple[type, bool]] = {
118
+ "demographics.age_years": (Annotated[int, Field(ge=18, le=100)], True),
119
+ "demographics.sex": (Sex, True),
120
+ "demographics.ethnicity": (Ethnicity | None, False),
121
+ "demographics.anthropometrics.height_cm": (Annotated[float, Field(gt=0)], False),
122
+ "demographics.anthropometrics.weight_kg": (Annotated[float, Field(gt=0)], False),
123
+ "female_specific.menstrual.age_at_menarche": (Annotated[int, Field(ge=8, le=25)], False),
124
+ "personal_medical_history.tyrer_cuzick_polygenic_risk_score": (Annotated[float, Field(gt=0)], False),
125
+ "family_history": (list, False), # list[FamilyMemberCancer]
126
+ "symptoms": (list, False), # list[SymptomEntry]
127
+ }
128
+ ```
129
+
130
+ ### Field Constraints
131
+
132
+ Use appropriate `Field` constraints for validation:
133
+
134
+ - `ge=X`: Greater than or equal to X
135
+ - `le=X`: Less than or equal to X
136
+ - `gt=X`: Greater than X
137
+ - `lt=X`: Less than X
138
+
139
+ ### Required vs Optional
140
+
141
+ - `True`: Field is required for the model
142
+ - `False`: Field is optional but validated if present
143
+
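+ The base class's `validate_inputs` implementation is not reproduced here, but conceptually each `REQUIRED_INPUTS` entry can be checked with Pydantic v2's `TypeAdapter`. The following is an illustrative sketch only (the dotted-path resolution against `UserInput` is elided):
+
+ ```python
+ from typing import Annotated
+
+ from pydantic import Field, TypeAdapter, ValidationError
+
+ # One (type, required) pair from REQUIRED_INPUTS, checked in isolation.
+ entry = (Annotated[int, Field(ge=18, le=100)], True)
+ declared_type, required = entry
+ value = 40  # value resolved from "demographics.age_years" on the UserInput
+
+ if value is None:
+     if required:
+         print("Missing required field: demographics.age_years")
+ else:
+     try:
+         TypeAdapter(declared_type).validate_python(value)
+     except ValidationError as exc:
+         print(f"Constraint violation: {exc}")
+ ```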
144
+ ## Input Validation
145
+
146
+ ### Validation in compute_score
147
+
148
+ Every `compute_score` method must start with input validation:
149
+
150
+ ```python
151
+ def compute_score(self, user: UserInput) -> str:
152
+ """Compute the risk score for a given user profile."""
153
+ # Validate inputs first
154
+ is_valid, errors = self.validate_inputs(user)
155
+ if not is_valid:
156
+ raise ValueError(f"Invalid inputs for {self.name}: {'; '.join(errors)}")
157
+
158
+ # Continue with model-specific logic...
159
+ ```
160
+
161
+ ### Model-Specific Validation
162
+
163
+ Add additional validation as needed:
164
+
165
+ ```python
166
+ # Check sex applicability
167
+ if user.demographics.sex != Sex.FEMALE:
168
+ return "N/A: Model is only applicable to female patients."
169
+
170
+ # Check age range
171
+ if not (35 <= user.demographics.age_years <= 85):
172
+ return "N/A: Age is outside the validated range."
173
+
174
+ # Check required data availability
175
+ if user.female_specific is None:
176
+ return "N/A: Missing female-specific information required for model."
177
+ ```
178
+
179
+ ## Extending UserInput
180
+
181
+ ### When to Extend
182
+
183
+ If a risk model requires fields or enums that don't exist in `UserInput`, **do not** work around the gap with placeholder values or ad-hoc hacks. Instead, propose extending `UserInput`:
184
+
185
+ 1. **Missing Enums**: Add new values to existing enums (e.g., `ChronicCondition`, `SymptomType`)
186
+ 2. **Missing Fields**: Add new fields to appropriate sections (e.g., `PersonalMedicalHistory`, `BreastHealthHistory`)
187
+ 3. **Missing Models**: Create new Pydantic models if needed
188
+
189
+ ### Extension Process
190
+
191
+ 1. **Identify Missing Elements**: Document what's needed for the model
192
+ 2. **Propose Extension**: Suggest specific additions to `UserInput`
193
+ 3. **Implement Extension**: Add the new fields/enums to `src/sentinel/user_input.py`
194
+ 4. **Update Tests**: Add tests for new fields in `tests/test_user_input.py`
195
+ 5. **Update Model**: Use the new fields in your risk model
196
+ 6. **Run Tests**: Ensure all tests pass
197
+
198
+ ### Example Extensions
199
+
200
+ ```python
201
+ # Adding new ChronicCondition enum values
202
+ class ChronicCondition(str, Enum):
203
+ # ... existing values
204
+ ENDOMETRIAL_POLYPS = "endometrial_polyps"
205
+ ANAEMIA = "anaemia"
206
+
207
+ # Adding new fields to PersonalMedicalHistory
208
+ class PersonalMedicalHistory(StrictBaseModel):
209
+ # ... existing fields
210
+ tyrer_cuzick_polygenic_risk_score: float | None = Field(
211
+ None,
212
+ gt=0,
213
+ description="Tyrer-Cuzick polygenic risk score as relative risk multiplier",
214
+ )
215
+
216
+ # Adding new fields to BreastHealthHistory
217
+ class BreastHealthHistory(StrictBaseModel):
218
+ # ... existing fields
219
+ lobular_carcinoma_in_situ: bool | None = Field(
220
+ None,
221
+ description="History of lobular carcinoma in situ (LCIS) diagnosis",
222
+ )
223
+ ```
224
+
225
+ ## Data Access Patterns
226
+
227
+ ### Demographics
228
+
229
+ ```python
230
+ age = user.demographics.age_years
231
+ sex = user.demographics.sex
232
+ ethnicity = user.demographics.ethnicity
233
+ height_cm = user.demographics.anthropometrics.height_cm
234
+ weight_kg = user.demographics.anthropometrics.weight_kg
235
+ ```
236
+
237
+ ### Female-Specific Data
238
+
239
+ ```python
240
+ if user.female_specific is not None:
241
+ fs = user.female_specific
242
+ menarche_age = fs.menstrual.age_at_menarche
243
+ menopause_age = fs.menstrual.age_at_menopause
244
+ num_births = fs.parity.num_live_births
245
+ first_birth_age = fs.parity.age_at_first_live_birth
246
+ num_biopsies = fs.breast_health.num_biopsies
247
+ atypical_hyperplasia = fs.breast_health.atypical_hyperplasia
248
+ lcis = fs.breast_health.lobular_carcinoma_in_situ
249
+ ```
250
+
251
+ ### Medical History
252
+
253
+ ```python
254
+ chronic_conditions = user.personal_medical_history.chronic_conditions
255
+ previous_cancers = user.personal_medical_history.previous_cancers
256
+ genetic_mutations = user.personal_medical_history.genetic_mutations
257
+ polygenic_score = user.personal_medical_history.tyrer_cuzick_polygenic_risk_score
258
+ ```
259
+
260
+ ### Family History
261
+
262
+ ```python
263
+ for member in user.family_history:
264
+ if member.cancer_type == CancerType.BREAST:
265
+ relation = member.relation
266
+ age_at_diagnosis = member.age_at_diagnosis
267
+ degree = member.degree
268
+ side = member.side
269
+ ```
270
+
271
+ ### Symptoms
272
+
273
+ ```python
274
+ for symptom in user.symptoms:
275
+ symptom_type = symptom.symptom_type
276
+ severity = symptom.severity
277
+ duration_days = symptom.duration_days
278
+ ```
279
+
280
+ ## Enum Usage
281
+
282
+ ### Always Use Enums
283
+
284
+ Never use string literals. Always use the appropriate enums:
285
+
286
+ ```python
287
+ # ✅ Correct
288
+ if user.demographics.sex == Sex.FEMALE:
289
+ if member.cancer_type == CancerType.BREAST:
290
+ if member.relation == FamilyRelation.MOTHER:
291
+ if member.degree == RelationshipDegree.FIRST:
292
+ if member.side == FamilySide.MATERNAL:
293
+
294
+ # ❌ Incorrect
295
+ if user.demographics.sex == "female":
296
+ if member.cancer_type == "breast":
297
+ if member.relation == "mother":
298
+ ```
299
+
300
+ ### Enum Mapping
301
+
302
+ When you need to map enums to model-specific codes:
303
+
304
+ ```python
305
+ def _race_code_from_ethnicity(ethnicity: Ethnicity | None) -> int:
306
+ """Map ethnicity enum to model-specific race code."""
307
+ if not ethnicity:
308
+ return 1 # Default
309
+
310
+ if ethnicity == Ethnicity.BLACK:
311
+ return 2
312
+ if ethnicity in {Ethnicity.ASIAN, Ethnicity.PACIFIC_ISLANDER}:
313
+ return 3
314
+ if ethnicity == Ethnicity.HISPANIC:
315
+ return 6
316
+ return 1 # Default to White
317
+ ```
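+ A quick sanity check of the mapping above (values taken directly from the function):
+
+ ```python
+ assert _race_code_from_ethnicity(Ethnicity.HISPANIC) == 6
+ assert _race_code_from_ethnicity(None) == 1  # default when ethnicity is unknown
+ assert _race_code_from_ethnicity(Ethnicity.PACIFIC_ISLANDER) == 3
+ ```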
318
+
319
+ ## Testing Requirements
320
+
321
+ ### Test File Structure
322
+
323
+ Create comprehensive test files following this pattern:
324
+
325
+ ```python
326
+ import pytest
327
+ from sentinel.user_input import (
328
+ # Import all needed models and enums
329
+ Anthropometrics,
330
+ BreastHealthHistory,
331
+ CancerType,
332
+ Demographics,
333
+ Ethnicity,
334
+ FamilyMemberCancer,
335
+ FamilyRelation,
336
+ FamilySide,
337
+ FemaleSpecific,
338
+ Lifestyle,
339
+ MenstrualHistory,
340
+ ParityHistory,
341
+ PersonalMedicalHistory,
342
+ RelationshipDegree,
343
+ Sex,
344
+ SmokingHistory,
345
+ SmokingStatus,
346
+ UserInput,
347
+ )
348
+ from sentinel.risk_models import YourRiskModel
349
+
350
+ # Ground truth test cases
351
+ GROUND_TRUTH_CASES = [
352
+ {
353
+ "name": "test_case_name",
354
+ "input": UserInput(
355
+ demographics=Demographics(
356
+ age_years=40,
357
+ sex=Sex.FEMALE,
358
+ ethnicity=Ethnicity.WHITE,
359
+ anthropometrics=Anthropometrics(height_cm=165.0, weight_kg=65.0),
360
+ ),
361
+ lifestyle=Lifestyle(
362
+ smoking=SmokingHistory(status=SmokingStatus.NEVER),
363
+ ),
364
+ personal_medical_history=PersonalMedicalHistory(),
365
+ female_specific=FemaleSpecific(
366
+ menstrual=MenstrualHistory(age_at_menarche=13),
367
+ parity=ParityHistory(num_live_births=1, age_at_first_live_birth=25),
368
+ breast_health=BreastHealthHistory(),
369
+ ),
370
+ family_history=[
371
+ FamilyMemberCancer(
372
+ relation=FamilyRelation.MOTHER,
373
+ cancer_type=CancerType.BREAST,
374
+ age_at_diagnosis=55,
375
+ degree=RelationshipDegree.FIRST,
376
+ side=FamilySide.MATERNAL,
377
+ )
378
+ ],
379
+ ),
380
+ "expected": 1.5, # Expected risk percentage
381
+ },
382
+ # ... more test cases
383
+ ]
384
+
385
+ class TestYourRiskModel:
386
+ """Test suite for YourRiskModel."""
387
+
388
+ def setup_method(self):
389
+ """Initialize model instance for testing."""
390
+ self.model = YourRiskModel()
391
+
392
+ @pytest.mark.parametrize("case", GROUND_TRUTH_CASES, ids=lambda x: x["name"])
393
+ def test_ground_truth_validation(self, case):
394
+ """Test against ground truth results."""
395
+ user_input = case["input"]
396
+ expected_risk = case["expected"]
397
+
398
+ actual_risk_str = self.model.compute_score(user_input)
399
+
400
+ if "N/A" in actual_risk_str:
401
+ pytest.fail(f"Model returned N/A: {actual_risk_str}")
402
+
403
+ actual_risk = float(actual_risk_str)
404
+ assert actual_risk == pytest.approx(expected_risk, abs=0.01)
405
+
406
+ def test_validation_errors(self):
407
+ """Test that model raises ValueError for invalid inputs."""
408
+ # Test invalid age
409
+ user_input = UserInput(
410
+ demographics=Demographics(
411
+ age_years=30, # Below minimum
412
+ sex=Sex.FEMALE,
413
+ anthropometrics=Anthropometrics(height_cm=165.0, weight_kg=65.0),
414
+ ),
415
+ # ... rest of input
416
+ )
417
+
418
+ with pytest.raises(ValueError, match=r"Invalid inputs for.*:"):
419
+ self.model.compute_score(user_input)
420
+
421
+ def test_inapplicable_cases(self):
422
+ """Test cases where model returns N/A."""
423
+ # Test male patient
424
+ user_input = UserInput(
425
+ demographics=Demographics(
426
+ age_years=50,
427
+ sex=Sex.MALE, # Wrong sex
428
+ anthropometrics=Anthropometrics(height_cm=175.0, weight_kg=70.0),
429
+ ),
430
+ # ... rest of input
431
+ )
432
+
433
+ score = self.model.compute_score(user_input)
434
+ assert "N/A" in score
435
+ ```
436
+
437
+ ### Test Coverage Requirements
438
+
439
+ - **Ground Truth Validation**: Test against known reference values
440
+ - **Input Validation**: Test that invalid inputs raise `ValueError`
441
+ - **Edge Cases**: Test boundary conditions and edge cases
442
+ - **Inapplicable Cases**: Test cases where model should return "N/A"
443
+ - **Enum Usage**: Test that all enums are used correctly
444
+ - **Family History**: Test various family relationship combinations
445
+ - **Error Handling**: Test error conditions and exception handling
446
+
447
+ ## Code Quality Requirements
448
+
449
+ ### Pre-commit Hooks
450
+
451
+ All code must pass these pre-commit hooks:
452
+
453
+ - **unimport**: Remove unused imports
454
+ - **ruff format**: Code formatting
455
+ - **ruff check**: Linting and style checks
456
+ - **pylint**: Code quality analysis
457
+ - **darglint**: Docstring validation
458
+ - **pydocstyle**: Docstring style checks
459
+ - **codespell**: Spell checking
460
+
461
+ ### Code Style
462
+
463
+ - Use type hints throughout
464
+ - Write clear, concise docstrings
465
+ - Follow PEP 8 style guidelines
466
+ - Use meaningful variable names
467
+ - Add comments for complex logic
468
+ - Handle edge cases gracefully
469
+
470
+ ### Error Handling
471
+
472
+ ```python
473
+ def compute_score(self, user: UserInput) -> str:
474
+ """Compute the risk score for a given user profile."""
475
+ try:
476
+ # Validate inputs
477
+ is_valid, errors = self.validate_inputs(user)
478
+ if not is_valid:
479
+ raise ValueError(f"Invalid inputs for {self.name}: {'; '.join(errors)}")
480
+
481
+ # Model-specific validation
482
+ if user.demographics.sex != Sex.FEMALE:
483
+ return "N/A: Model is only applicable to female patients."
484
+
485
+ # Calculate risk
486
+ risk = self._calculate_risk(user)
487
+ return f"{risk:.2f}"
488
+
489
+ except Exception as e:
490
+ return f"N/A: Error calculating risk - {e!s}"
491
+ ```
492
+
493
+ ## Migration Checklist
494
+
495
+ When adapting an existing risk model to the new structure:
496
+
497
+ - [ ] Update imports to use new `user_input` module
498
+ - [ ] Add `REQUIRED_INPUTS` with Pydantic validation
499
+ - [ ] Refactor `compute_score` to use new `UserInput` structure
500
+ - [ ] Replace string literals with enums
501
+ - [ ] Update parameter extraction logic
502
+ - [ ] Add input validation at start of `compute_score`
503
+ - [ ] Update all test cases to use new `UserInput` structure
504
+ - [ ] Run full test suite to ensure 100% pass rate
505
+ - [ ] Run pre-commit hooks to ensure code quality
506
+ - [ ] Document any `UserInput` extensions needed
507
+ - [ ] Update model documentation and references
508
+
509
+ ## Examples
510
+
511
+ ### Complete Risk Model Template
512
+
513
+ ```python
514
+ """Your cancer risk model implementation."""
515
+
516
+ from typing import Annotated
517
+ from pydantic import Field
518
+ from sentinel.risk_models.base import RiskModel
519
+ from sentinel.user_input import (
520
+ CancerType,
521
+ Demographics,
522
+ Ethnicity,
523
+ FamilyMemberCancer,
524
+ FamilyRelation,
525
+ RelationshipDegree,
526
+ Sex,
527
+ UserInput,
528
+ )
529
+
530
+ class YourRiskModel(RiskModel):
531
+ """Compute cancer risk using the Your model."""
532
+
533
+ def __init__(self):
534
+ super().__init__("your_model")
535
+
536
+ REQUIRED_INPUTS: dict[str, tuple[type, bool]] = {
537
+ "demographics.age_years": (Annotated[int, Field(ge=18, le=100)], True),
538
+ "demographics.sex": (Sex, True),
539
+ "demographics.ethnicity": (Ethnicity | None, False),
540
+ "family_history": (list, False), # list[FamilyMemberCancer]
541
+ }
542
+
543
+ def compute_score(self, user: UserInput) -> str:
544
+ """Compute the risk score for a given user profile."""
545
+ # Validate inputs first
546
+ is_valid, errors = self.validate_inputs(user)
547
+ if not is_valid:
548
+ raise ValueError(f"Invalid inputs for Your: {'; '.join(errors)}")
549
+
550
+ # Model-specific validation
551
+ if user.demographics.sex != Sex.FEMALE:
552
+ return "N/A: Model is only applicable to female patients."
553
+
554
+ # Extract parameters
555
+ age = user.demographics.age_years
556
+ ethnicity = user.demographics.ethnicity
557
+
558
+ # Count family history
559
+ family_count = sum(
560
+ 1 for member in user.family_history
561
+ if member.cancer_type == CancerType.BREAST
562
+ and member.degree == RelationshipDegree.FIRST
563
+ )
564
+
565
+ # Calculate risk (example)
566
+ risk = self._calculate_risk(age, family_count, ethnicity)
567
+ return f"{risk:.2f}"
568
+
569
+ def _calculate_risk(self, age: int, family_count: int, ethnicity: Ethnicity | None) -> float:
570
+ """Calculate the actual risk value."""
571
+ # Implementation here
572
+ return 1.5 # Example
573
+
574
+ def cancer_type(self) -> str:
575
+ return "breast"
576
+
577
+ def description(self) -> str:
578
+ return "Your model description here."
579
+
580
+ def interpretation(self) -> str:
581
+ return "Interpretation guidance here."
582
+
583
+ def references(self) -> list[str]:
584
+ return ["Your reference here."]
585
+ ```
586
+
587
+ This specification ensures consistency, maintainability, and quality across all risk models in the Sentinel system.
apps/__init__.py ADDED
@@ -0,0 +1 @@
1
+ # Apps package for the Sentinel project
apps/api/__init__.py ADDED
@@ -0,0 +1 @@
1
+ # API package
apps/api/main.py ADDED
@@ -0,0 +1,121 @@
1
+ """FastAPI application exposing cancer risk assessment endpoints."""
2
+
3
+ from pathlib import Path
4
+
5
+ from fastapi import FastAPI, HTTPException
6
+
7
+ from sentinel.config import AppConfig, ModelConfig, ResourcePaths
8
+ from sentinel.factory import SentinelFactory
9
+ from sentinel.models import InitialAssessment, UserInput
10
+
11
+ app = FastAPI(
12
+ title="Cancer Risk Assessment Assistant",
13
+ description="API for assessing cancer risks using LLMs.",
14
+ )
15
+
16
+ # Define base paths relative to the project root
17
+ BASE_DIR = Path(__file__).resolve().parents[2] # Go up to project root
18
+ CONFIGS_DIR = BASE_DIR / "configs"
19
+ PROMPTS_DIR = BASE_DIR / "prompts"
20
+
21
+
22
+ def create_knowledge_base_paths() -> ResourcePaths:
23
+ """Build resource path configuration resolved from the repository root.
24
+
25
+ Returns:
26
+ ResourcePaths: Paths pointing to persona, prompt, and configuration
27
+ assets required by the API routes.
28
+ """
29
+
30
+ return ResourcePaths(
31
+ persona=PROMPTS_DIR / "persona" / "default.md",
32
+ instruction_assessment=PROMPTS_DIR / "instruction" / "assessment.md",
33
+ instruction_conversation=PROMPTS_DIR / "instruction" / "conversation.md",
34
+ output_format_assessment=CONFIGS_DIR / "output_format" / "assessment.yaml",
35
+ output_format_conversation=CONFIGS_DIR / "output_format" / "conversation.yaml",
36
+ cancer_modules_dir=CONFIGS_DIR / "knowledge_base" / "cancer_modules",
37
+ dx_protocols_dir=CONFIGS_DIR / "knowledge_base" / "dx_protocols",
38
+ )
39
+
40
+
41
+ @app.get("/")
42
+ async def read_root() -> dict:
43
+ """Return a simple greeting message.
44
+
45
+ Returns:
46
+ dict: A dictionary containing a greeting message.
47
+ """
48
+ return {"message": "Hello, world!"}
49
+
50
+
51
+ @app.post("/assess/{provider}", response_model=InitialAssessment)
52
+ async def assess(
53
+ provider: str,
54
+ user_input: UserInput,
55
+ model: str | None = None,
56
+ cancer_modules: list[str] | None = None,
57
+ dx_protocols: list[str] | None = None,
58
+ ) -> InitialAssessment:
59
+ """Assess cancer risk for a user.
60
+
61
+ Args:
62
+ provider (str): LLM provider identifier (for example ``"openai"`` or
63
+ ``"anthropic"``).
64
+ user_input (UserInput): Structured demographics and clinical
65
+ information supplied by the client.
66
+ model (str | None): Optional model name overriding the provider
67
+ default.
68
+ cancer_modules (list[str] | None): Optional list of cancer module slugs
69
+ to include in the knowledge base.
70
+ dx_protocols (list[str] | None): Optional list of diagnostic protocol
71
+ slugs to include.
72
+
73
+ Returns:
74
+ InitialAssessment: Parsed model output describing the initial
75
+ assessment.
76
+
77
+ Raises:
78
+ HTTPException: 400 for invalid input, 500 for unexpected errors.
79
+ """
80
+ try:
81
+ # Create knowledge base paths
82
+ knowledge_base_paths = create_knowledge_base_paths()
83
+
84
+ # Set default model name if not provided
85
+ if model is None:
86
+ model_defaults = {
87
+ "openai": "gpt-4o-mini",
88
+ "anthropic": "claude-3-5-sonnet-20241022",
89
+ "google": "gemini-1.5-pro",
90
+ }
91
+ model = model_defaults.get(provider, "gpt-4o-mini")
92
+
93
+ # Set default modules if not provided
94
+ if cancer_modules is None:
95
+ cancer_modules_dir = knowledge_base_paths.cancer_modules_dir
96
+ cancer_modules = [p.stem for p in cancer_modules_dir.glob("*.yaml")]
97
+
98
+ if dx_protocols is None:
99
+ dx_protocols_dir = knowledge_base_paths.dx_protocols_dir
100
+ dx_protocols = [p.stem for p in dx_protocols_dir.glob("*.yaml")]
101
+
102
+ # Create AppConfig
103
+ app_config = AppConfig(
104
+ model=ModelConfig(provider=provider, model_name=model),
105
+ knowledge_base_paths=knowledge_base_paths,
106
+ selected_cancer_modules=cancer_modules,
107
+ selected_dx_protocols=dx_protocols,
108
+ )
109
+
110
+ # Create factory and conversation manager
111
+ factory = SentinelFactory(app_config)
112
+ conversation_manager = factory.create_conversation_manager()
113
+
114
+ # Run assessment
115
+ response = conversation_manager.initial_assessment(user_input)
116
+ return response
117
+
118
+ except ValueError as e:
119
+ raise HTTPException(status_code=400, detail=str(e)) from e
120
+ except Exception as e:
121
+ raise HTTPException(status_code=500, detail=f"Internal Server Error: {e!s}") from e
apps/cli/__init__.py ADDED
@@ -0,0 +1 @@
1
+ # CLI package
apps/cli/main.py ADDED
@@ -0,0 +1,539 @@
1
+ """Command-line interface for running assessments and exporting reports."""
2
+
3
+ import json
4
+ from datetime import datetime
5
+ from pathlib import Path
6
+
7
+ import hydra
8
+ from hydra.utils import to_absolute_path
9
+ from omegaconf import DictConfig
10
+
11
+ from sentinel.config import AppConfig, ModelConfig, ResourcePaths
12
+ from sentinel.factory import SentinelFactory
13
+ from sentinel.models import (
14
+ ConversationResponse,
15
+ Demographics,
16
+ FamilyMemberCancer,
17
+ FemaleSpecific,
18
+ InitialAssessment,
19
+ Lifestyle,
20
+ PersonalMedicalHistory,
21
+ UserInput,
22
+ )
23
+ from sentinel.reporting import generate_excel_report, generate_pdf_report
24
+ from sentinel.risk_models import RISK_MODELS
25
+ from sentinel.utils import load_user_file
26
+
27
+
28
+ # Color codes for terminal output
29
+ class Colors:
30
+ """ANSI color codes for terminal output formatting."""
31
+
32
+ HEADER = "\033[95m"
33
+ OKBLUE = "\033[94m"
34
+ OKCYAN = "\033[96m"
35
+ OKGREEN = "\033[92m"
36
+ WARNING = "\033[93m"
37
+ FAIL = "\033[91m"
38
+ ENDC = "\033[0m"
39
+ BOLD = "\033[1m"
40
+ UNDERLINE = "\033[4m"
41
+
42
+
43
+ def _get_input(prompt: str, optional: bool = False) -> str:
44
+ """Get a line of input from the user.
45
+
46
+ Args:
47
+ prompt: Message to display to the user.
48
+ optional: If True, allow empty input to be returned as an empty string.
49
+
50
+ Returns:
51
+ The raw string entered by the user (may be empty if optional).
52
+ """
53
+ suffix = " (optional, press Enter to skip)" if optional else ""
54
+ return input(f"{Colors.OKCYAN}{prompt}{suffix}:{Colors.ENDC} ")
55
+
56
+
57
+ def _get_int_input(prompt: str, optional: bool = False) -> int | None:
58
+ """Get an integer from the user.
59
+
60
+ Args:
61
+ prompt: Message to display to the user.
62
+ optional: If True, allow empty input and return None.
63
+
64
+ Returns:
65
+ The parsed integer value, or None if optional and left empty.
66
+ """
67
+ while True:
68
+ val = _get_input(prompt, optional)
69
+ if not val and optional:
70
+ return None
71
+ try:
72
+ return int(val)
73
+ except (ValueError, TypeError):
74
+ print(f"{Colors.WARNING}Please enter a valid number.{Colors.ENDC}")
75
+
76
+
77
+ def collect_user_input() -> UserInput:
78
+ """Collect user profile data interactively.
79
+
80
+ Returns:
81
+ UserInput: Structured demographics, lifestyle, and clinical data
82
+ assembled from CLI prompts.
83
+ """
84
+ print(
85
+ f"\n{Colors.HEADER}{Colors.BOLD}=== User Information Collection ==={Colors.ENDC}"
86
+ )
87
+ print("Please provide the following details for your assessment.")
88
+
89
+ # --- DEMOGRAPHICS ---
90
+ print(f"\n{Colors.OKBLUE}{Colors.BOLD}--- Demographics ---{Colors.ENDC}")
91
+ age = _get_int_input("Age")
92
+ sex = _get_input("Biological Sex (e.g., Male, Female)")
93
+ ethnicity = _get_input("Ethnicity", optional=True)
94
+ demographics = Demographics(age=age, sex=sex, ethnicity=ethnicity)
95
+
96
+ # --- LIFESTYLE ---
97
+ print(f"\n{Colors.OKBLUE}{Colors.BOLD}--- Lifestyle ---{Colors.ENDC}")
98
+ smoking_status = _get_input("Smoking Status (e.g., never, former, current)")
99
+ smoking_pack_years = (
100
+ _get_int_input("Smoking Pack-Years", optional=True)
101
+ if smoking_status in ["former", "current"]
102
+ else None
103
+ )
104
+ alcohol_consumption = _get_input(
105
+ "Alcohol Consumption (e.g., none, light, moderate, heavy)"
106
+ )
107
+ dietary_habits = _get_input("Dietary Habits", optional=True)
108
+ physical_activity_level = _get_input("Physical Activity Level", optional=True)
109
+ lifestyle = Lifestyle(
110
+ smoking_status=smoking_status,
111
+ smoking_pack_years=smoking_pack_years,
112
+ alcohol_consumption=alcohol_consumption,
113
+ dietary_habits=dietary_habits,
114
+ physical_activity_level=physical_activity_level,
115
+ )
116
+
117
+ # --- PERSONAL MEDICAL HISTORY ---
118
+ print(
119
+ f"\n{Colors.OKBLUE}{Colors.BOLD}--- Personal Medical History ---{Colors.ENDC}"
120
+ )
121
+ mutations = _get_input("Known genetic mutations (comma-separated)", optional=True)
122
+ cancers = _get_input("Previous cancers (comma-separated)", optional=True)
123
+ illnesses = _get_input(
124
+ "Chronic illnesses (e.g., IBD, comma-separated)", optional=True
125
+ )
126
+ personal_medical_history = PersonalMedicalHistory(
127
+ known_genetic_mutations=[m.strip() for m in mutations.split(",")]
128
+ if mutations
129
+ else [],
130
+ previous_cancers=[c.strip() for c in cancers.split(",")] if cancers else [],
131
+ chronic_illnesses=[i.strip() for i in illnesses.split(",")]
132
+ if illnesses
133
+ else [],
134
+ )
135
+
136
+ # --- CLINICAL OBSERVATIONS ---
137
+ print(
138
+ f"\n{Colors.OKBLUE}{Colors.BOLD}--- Clinical Observations / Test Results (Optional) ---{Colors.ENDC}"
139
+ )
140
+ clinical_observations = []
141
+ while True:
142
+ add_test = _get_input(
143
+ "Add a clinical observation or test result? (y/N)"
144
+ ).lower()
145
+ if add_test not in ["y", "yes"]:
146
+ break
147
+ test_name = _get_input("Test/Observation Name")
148
+ value = _get_input("Value")
149
+ unit = _get_input("Unit (e.g., ng/mL, or N/A)")
150
+ reference_range = _get_input("Reference Range", optional=True)
151
+ date = _get_input("Date of Test (YYYY-MM-DD)", optional=True)
152
+ clinical_observations.append(
153
+ {
154
+ "test_name": test_name,
155
+ "value": value,
156
+ "unit": unit,
157
+ "reference_range": reference_range or None,
158
+ "date": date or None,
159
+ }
160
+ )
161
+
162
+ # --- FAMILY HISTORY ---
163
+ print(
164
+ f"\n{Colors.OKBLUE}{Colors.BOLD}--- Family History of Cancer ---{Colors.ENDC}"
165
+ )
166
+ family_history = []
167
+ while True:
168
+ add_relative = _get_input("Add a family member with cancer? (y/N)").lower()
169
+ if add_relative not in ["y", "yes"]:
170
+ break
171
+ relative = _get_input("Relative (e.g., mother, sister)")
172
+ cancer_type = _get_input("Cancer Type")
173
+ age_at_diagnosis = _get_int_input("Age at Diagnosis", optional=True)
174
+ family_history.append(
175
+ FamilyMemberCancer(
176
+ relative=relative,
177
+ cancer_type=cancer_type,
178
+ age_at_diagnosis=age_at_diagnosis,
179
+ )
180
+ )
181
+
182
+ # --- FEMALE-SPECIFIC ---
183
+ female_specific = None
184
+ if sex.lower() == "female":
185
+ print(
186
+ f"\n{Colors.OKBLUE}{Colors.BOLD}--- Female-Specific Information ---{Colors.ENDC}"
187
+ )
188
+ age_at_first_period = _get_int_input("Age at first period", optional=True)
189
+ age_at_menopause = _get_int_input("Age at menopause", optional=True)
190
+ num_live_births = _get_int_input("Number of live births", optional=True)
191
+ age_at_first_live_birth = _get_int_input(
192
+ "Age at first live birth", optional=True
193
+ )
194
+ hormone_therapy_use = _get_input("Hormone therapy use", optional=True)
195
+ female_specific = FemaleSpecific(
196
+ age_at_first_period=age_at_first_period,
197
+ age_at_menopause=age_at_menopause,
198
+ num_live_births=num_live_births,
199
+ age_at_first_live_birth=age_at_first_live_birth,
200
+ hormone_therapy_use=hormone_therapy_use,
201
+ )
202
+
203
+ # --- CURRENT CONCERNS ---
204
+ print(f"\n{Colors.OKBLUE}{Colors.BOLD}--- Current Concerns ---{Colors.ENDC}")
205
+ current_concerns_or_symptoms = _get_input(
206
+ "Current symptoms or health concerns", optional=True
207
+ )
208
+
209
+ return UserInput(
210
+ demographics=demographics,
211
+ lifestyle=lifestyle,
212
+ family_history=family_history,
213
+ personal_medical_history=personal_medical_history,
214
+ female_specific=female_specific,
215
+ current_concerns_or_symptoms=current_concerns_or_symptoms,
216
+ clinical_observations=clinical_observations,
217
+ )
218
+
219
+
220
+ def format_risk_assessment(response: InitialAssessment, dev_mode: bool = False) -> None:
221
+ """Pretty-print an initial risk assessment payload.
222
+
223
+ Args:
224
+ response (InitialAssessment): Parsed result returned by the assessment
225
+ chain.
226
+ dev_mode (bool): Flag enabling verbose debugging output.
227
+ """
228
+ # In dev mode, show everything
229
+ if dev_mode:
230
+ print(
231
+ f"\n{Colors.WARNING}{Colors.BOLD}--- DEV MODE: RAW MODEL OUTPUT ---{Colors.ENDC}"
232
+ )
233
+ # Use model_dump instead of model_dump_json for direct printing
234
+ print(json.dumps(response.model_dump(), indent=2))
235
+ print(
236
+ f"\n{Colors.WARNING}{Colors.BOLD}--- DEV MODE: PARSED & VALIDATED PYDANTIC OBJECT ---{Colors.ENDC}"
237
+ )
238
+ if response.thinking:
239
+ print(
240
+ f"{Colors.OKCYAN}{Colors.BOLD}🤔 Chain of Thought (`<think>` block):{Colors.ENDC}"
241
+ )
242
+ print(response.thinking)
243
+ print(f"{Colors.WARNING}{Colors.BOLD}{'-' * 30}{Colors.ENDC}")
244
+ if response.reasoning:
245
+ print(
246
+ f"{Colors.OKCYAN}{Colors.BOLD}🧠 Reasoning (`<reasoning>` block):{Colors.ENDC}"
247
+ )
248
+ print(response.reasoning)
249
+ print(f"{Colors.WARNING}{Colors.BOLD}{'-' * 30}{Colors.ENDC}")
250
+ print(f"{Colors.OKCYAN}{Colors.BOLD}Full Pydantic Object:{Colors.ENDC}")
251
+
252
+ # return
253
+ print(
254
+ f"\n{Colors.WARNING}{Colors.BOLD}--- DEV MODE: FORMATTED MODEL OUTPUT ---{Colors.ENDC}"
255
+ )
256
+
257
+ # User-friendly formatting
258
+ print(f"\n{Colors.HEADER}{Colors.BOLD}{'=' * 60}")
259
+ print("🏥 CANCER RISK ASSESSMENT REPORT")
260
+ print(f"{'=' * 60}{Colors.ENDC}")
261
+
262
+ # Display the primary user-facing response first
263
+ if response.response:
264
+ print(f"\n{Colors.OKCYAN}{Colors.BOLD}🤖 BiOS:{Colors.ENDC}")
265
+ print(response.response)
266
+
267
+ # Then display the structured summary and details
268
+ print(f"\n{Colors.OKBLUE}{Colors.BOLD}📋 OVERALL SUMMARY{Colors.ENDC}")
269
+ if response.overall_risk_score is not None:
270
+ print(
271
+ f"{Colors.OKCYAN}Overall Risk Score: {Colors.BOLD}{response.overall_risk_score}/100{Colors.ENDC}"
272
+ )
273
+ if response.overall_summary:
274
+ print(f"{Colors.OKCYAN}{response.overall_summary}{Colors.ENDC}")
275
+
276
+ # Risk assessments
277
+ risk_assessments = response.risk_assessments
278
+ if risk_assessments:
279
+ print(
280
+ f"\n{Colors.OKBLUE}{Colors.BOLD}🎯 DETAILED RISK ASSESSMENTS{Colors.ENDC}"
281
+ )
282
+ print(f"{Colors.OKBLUE}{'─' * 40}{Colors.ENDC}")
283
+
284
+ for i, assessment in enumerate(risk_assessments, 1):
285
+ cancer_type = assessment.cancer_type
286
+ risk_level = assessment.risk_level
287
+ explanation = assessment.explanation
288
+
289
+ # Color code risk levels
290
+ if risk_level is None:
291
+ risk_color = Colors.ENDC
292
+ elif risk_level <= 2:
293
+ risk_color = Colors.OKGREEN
294
+ elif risk_level == 3:
295
+ risk_color = Colors.WARNING
296
+ else: # 4-5
297
+ risk_color = Colors.FAIL
298
+
299
+ print(f"\n{Colors.BOLD}{i}. {cancer_type.upper()}{Colors.ENDC}")
300
+ print(
301
+ f" 🎚️ Risk Level: {risk_color}{Colors.BOLD}{risk_level or 'N/A'}{Colors.ENDC}"
302
+ )
303
+ print(f" 💭 Explanation: {explanation}")
304
+
305
+ # Optional fields
306
+ if assessment.recommended_steps:
307
+ print(" 📝 Recommended Steps:")
308
+ if isinstance(assessment.recommended_steps, list):
309
+ for step in assessment.recommended_steps:
310
+ print(f" • {step}")
311
+ else:
312
+ print(f" • {assessment.recommended_steps}")
313
+
314
+ if assessment.lifestyle_advice:
315
+ print(f" 🌟 Lifestyle Advice: {assessment.lifestyle_advice}")
316
+
317
+ if i < len(risk_assessments):
318
+ print(f" {Colors.OKBLUE}{'─' * 40}{Colors.ENDC}")
319
+
320
+ # Diagnostic recommendations
321
+ dx_recommendations = response.dx_recommendations
322
+ if dx_recommendations:
323
+ print(
324
+ f"\n{Colors.OKBLUE}{Colors.BOLD}🔬 DIAGNOSTIC RECOMMENDATIONS{Colors.ENDC}"
325
+ )
326
+ print(f"{Colors.OKBLUE}{'─' * 40}{Colors.ENDC}")
327
+
328
+ for i, dx_rec in enumerate(dx_recommendations, 1):
329
+ test_name = dx_rec.test_name
330
+ frequency = dx_rec.frequency
331
+ rationale = dx_rec.rationale
332
+ recommendation_level = dx_rec.recommendation_level
333
+
334
+ level_text = ""
335
+ if recommendation_level is not None:
336
+ level_map = {
337
+ 1: "Unsuitable",
338
+ 2: "Unnecessary",
339
+ 3: "Optional",
340
+ 4: "Recommended",
341
+ 5: "Critical - Do not skip",
342
+ }
343
+ level_text = f" ({level_map.get(recommendation_level, 'Unknown')})"
344
+
345
+ print(f"\n{Colors.BOLD}{i}. {test_name.upper()}{Colors.ENDC}")
346
+ if recommendation_level is not None:
347
+ print(
348
+ f" ⭐ Recommendation Level: {Colors.BOLD}{recommendation_level}/5{level_text}{Colors.ENDC}"
349
+ )
350
+ print(f" 📅 Frequency: {Colors.OKGREEN}{frequency}{Colors.ENDC}")
351
+ print(f" 💭 Rationale: {rationale}")
352
+
353
+ if dx_rec.applicable_guideline:
354
+ print(f" 📜 Applicable Guideline: {dx_rec.applicable_guideline}")
355
+
356
+ if i < len(dx_recommendations):
357
+ print(f" {Colors.OKBLUE}{'─' * 40}{Colors.ENDC}")
358
+
359
+ print(
360
+ f"\n{Colors.WARNING}⚠️ IMPORTANT: This assessment does not replace professional medical advice.{Colors.ENDC}"
361
+ )
362
+ print(f"{Colors.HEADER}{'=' * 60}{Colors.ENDC}")
363
+
364
+
365
+ def format_followup_response(
366
+ response: ConversationResponse, dev_mode: bool = False
367
+ ) -> None:
368
+ """Display follow-up conversation output.
369
+
370
+ Args:
371
+ response (ConversationResponse): Conversation exchange returned by the
372
+ LLM chain.
373
+ dev_mode (bool): Flag enabling verbose debugging output.
374
+ """
375
+ if dev_mode:
376
+ print(
377
+ f"\n{Colors.WARNING}{Colors.BOLD}--- DEV MODE: RAW MODEL OUTPUT ---{Colors.ENDC}"
378
+ )
379
+ # Use model_dump instead of model_dump_json for direct printing
380
+ print(json.dumps(response.model_dump(), indent=2))
381
+ print(
382
+ f"\n{Colors.WARNING}{Colors.BOLD}--- DEV MODE: PARSED RESPONSE ---{Colors.ENDC}"
383
+ )
384
+ if response.thinking:
385
+ print(f"\n{Colors.OKCYAN}{Colors.BOLD}🤔 Chain of Thought:{Colors.ENDC}")
386
+ print(f"{Colors.OKCYAN}{response.thinking}{Colors.ENDC}")
387
+
388
+ print(f"\n{Colors.OKCYAN}{Colors.BOLD}🤖 BiOS:{Colors.ENDC}")
389
+ print(f"{response.response}")
390
+
391
+
392
+ @hydra.main(config_path="../../configs", config_name="config", version_base=None)
393
+ def main(cfg: DictConfig) -> None:
394
+ """Entry point for the CLI tool invoked via Hydra.
395
+
396
+ Args:
397
+ cfg (DictConfig): Hydra configuration containing model, knowledge base,
398
+ and runtime settings.
399
+ """
400
+ print(
401
+ f"{Colors.HEADER}{Colors.BOLD}Welcome to the Cancer Risk Assessment Tool{Colors.ENDC}"
402
+ )
403
+ print(
404
+ f"{Colors.OKBLUE}This tool provides preliminary cancer risk assessments based on your input.{Colors.ENDC}\n"
405
+ )
406
+
407
+ dev_mode = cfg.dev_mode
408
+
409
+ if dev_mode:
410
+ print(
411
+ f"{Colors.WARNING}🔧 Running in developer mode - raw JSON output enabled{Colors.ENDC}"
412
+ )
413
+ else:
414
+ print(
415
+ f"{Colors.OKGREEN}👤 Running in user mode - formatted output enabled{Colors.ENDC}"
416
+ )
417
+
418
+ model = cfg.model.model_name
419
+ provider = cfg.model.provider
420
+ print(f"{Colors.OKBLUE}🤖 Using model: {model} from {provider}{Colors.ENDC}")
421
+
422
+ # Create ResourcePaths with resolved absolute paths
423
+ knowledge_base_paths = ResourcePaths(
424
+ persona=Path(to_absolute_path("prompts/persona/default.md")),
425
+ instruction_assessment=Path(
426
+ to_absolute_path("prompts/instruction/assessment.md")
427
+ ),
428
+ instruction_conversation=Path(
429
+ to_absolute_path("prompts/instruction/conversation.md")
430
+ ),
431
+ output_format_assessment=Path(
432
+ to_absolute_path("configs/output_format/assessment.yaml")
433
+ ),
434
+ output_format_conversation=Path(
435
+ to_absolute_path("configs/output_format/conversation.yaml")
436
+ ),
437
+ cancer_modules_dir=Path(
438
+ to_absolute_path("configs/knowledge_base/cancer_modules")
439
+ ),
440
+ dx_protocols_dir=Path(to_absolute_path("configs/knowledge_base/dx_protocols")),
441
+ )
442
+
443
+ # Create AppConfig from Hydra config
444
+ app_config = AppConfig(
445
+ model=ModelConfig(provider=cfg.model.provider, model_name=cfg.model.model_name),
446
+ knowledge_base_paths=knowledge_base_paths,
447
+ selected_cancer_modules=list(cfg.knowledge_base.cancer_modules),
448
+ selected_dx_protocols=list(cfg.knowledge_base.dx_protocols),
449
+ )
450
+
451
+ # Create factory and conversation manager
452
+ factory = SentinelFactory(app_config)
453
+ conversation = factory.create_conversation_manager()
454
+
455
+ if cfg.user_file:
456
+ print(f"{Colors.OKBLUE}📂 Loading user data from: {cfg.user_file}{Colors.ENDC}")
457
+ user = load_user_file(cfg.user_file)
458
+ else:
459
+ user = collect_user_input()
460
+
461
+ print(f"\n{Colors.OKCYAN}🔄 Running risk scoring tools...{Colors.ENDC}")
462
+ risks_scores = []
463
+ for risk_model in RISK_MODELS:
464
+ risk_score = risk_model().run(user)
465
+ risks_scores.append(risk_score)
466
+
467
+ user.risks_scores = risks_scores
468
+ for risk_score in risks_scores:
469
+ print(f"{Colors.OKCYAN}🔄 {risk_score.name}: {risk_score.score}{Colors.ENDC}")
470
+
471
+ print(f"\n{Colors.OKGREEN}🔄 Analyzing your information...{Colors.ENDC}")
472
+ response = None
473
+ try:
474
+ response = conversation.initial_assessment(user)
475
+ format_risk_assessment(response, dev_mode)
476
+ except Exception as e:
477
+ print(f"{Colors.FAIL}❌ Error generating assessment: {e}{Colors.ENDC}")
478
+ return
479
+
480
+ if response:
481
+ export_choice = input(
482
+ f"\n{Colors.OKCYAN}Export full report to a file? (pdf/excel/both/N):{Colors.ENDC} "
483
+ ).lower()
484
+ if export_choice in ["pdf", "excel", "both"]:
485
+ output_dir = Path("outputs")
486
+ output_dir.mkdir(exist_ok=True)
487
+ timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
488
+ base_filename = f"Cancer_Risk_Report_{timestamp}"
489
+
490
+ if export_choice in ["pdf", "both"]:
491
+ pdf_filename = output_dir / f"{base_filename}.pdf"
492
+ try:
493
+ print(f"{Colors.OKCYAN}Generating PDF report...{Colors.ENDC}")
494
+ generate_pdf_report(response, user, str(pdf_filename))
495
+ print(
496
+ f"{Colors.OKGREEN}✅ Successfully generated {pdf_filename}{Colors.ENDC}"
497
+ )
498
+ except Exception as e:
499
+ print(
500
+ f"{Colors.FAIL}❌ Error generating PDF report: {e}{Colors.ENDC}"
501
+ )
502
+
503
+ if export_choice in ["excel", "both"]:
504
+ excel_filename = output_dir / f"{base_filename}.xlsx"
505
+ try:
506
+ print(f"{Colors.OKCYAN}Generating Excel report...{Colors.ENDC}")
507
+ generate_excel_report(response, user, str(excel_filename))
508
+ print(
509
+ f"{Colors.OKGREEN}✅ Successfully generated {excel_filename}{Colors.ENDC}"
510
+ )
511
+ except Exception as e:
512
+ print(
513
+ f"{Colors.FAIL}❌ Error generating Excel report: {e}{Colors.ENDC}"
514
+ )
515
+
516
+ # Follow-up conversation loop
517
+ print(
518
+ f"\n{Colors.OKBLUE}{Colors.BOLD}💬 You can now ask follow-up questions. Type 'quit' to exit.{Colors.ENDC}"
519
+ )
520
+ while True:
521
+ q = input(f"\n{Colors.BOLD}You: {Colors.ENDC}")
522
+ if q.lower() in {"quit", "exit", "q"}:
523
+ print(
524
+ f"{Colors.OKGREEN}👋 Thank you for using the Cancer Risk Assessment Tool!{Colors.ENDC}"
525
+ )
526
+ break
527
+
528
+ if not q.strip():
529
+ continue
530
+
531
+ try:
532
+ text = conversation.follow_up(q)
533
+ format_followup_response(text, dev_mode)
534
+ except Exception as e:
535
+ print(f"{Colors.FAIL}❌ Error: {e}{Colors.ENDC}")
536
+
537
+
538
+ if __name__ == "__main__":
539
+ main()
apps/streamlit_ui/__init__.py ADDED
@@ -0,0 +1 @@
1
+ # Streamlit UI package
apps/streamlit_ui/main.py ADDED
@@ -0,0 +1,71 @@
1
+ """Streamlit entry point for the Sentinel expert feedback UI."""
2
+
3
+ import streamlit as st
4
+
5
+ # --- Page Configuration ---
6
+ st.set_page_config(
7
+ page_title="Sentinel | AI Cancer Risk Assessment", page_icon="⚕️", layout="wide"
8
+ )
9
+
10
+ # --- Header Section ---
11
+ st.title("Sentinel: AI-Powered Cancer Risk Assessment")
12
+ st.markdown("""
13
+ Welcome to **Sentinel**, an advanced demonstration of an AI-powered assistant for evidence-based cancer risk assessment.
14
+ This tool analyzes user-provided health data to generate a preliminary risk profile and personalized diagnostic recommendations based on a configurable knowledge base.
15
+ """)
16
+
17
+ st.divider()
18
+
19
+ # --- Key Features Section ---
20
+ st.header("How It Works", anchor=False)
21
+ col1, col2, col3 = st.columns(3, gap="large")
22
+
23
+ with col1:
24
+ st.subheader("👤 1. Build Your Profile")
25
+ st.write(
26
+ "Navigate to the **Profile** page to input your health information. "
27
+ "You can either upload a pre-filled YAML file or create a new profile from scratch using our guided form."
28
+ )
29
+
30
+ with col2:
31
+ st.subheader("⚙️ 2. Configure the AI")
32
+ st.write(
33
+ "On the **Configuration** page, you can select the AI model and the specific cancer modules and diagnostic protocols "
34
+ "from our knowledge base that will be used for your assessment."
35
+ )
36
+
37
+ with col3:
38
+ st.subheader("🔬 3. Run the Assessment")
39
+ st.write(
40
+ "Finally, visit the **Assessment** page to run the analysis. You'll receive a full dashboard of your results, "
41
+ "and you can interact with the AI assistant via a chat interface."
42
+ )
43
+
44
+ # --- Call to Action / How to Get Started ---
45
+ st.header("Get Started", anchor=False)
46
+ st.page_link(
47
+ "pages/1_Profile.py", label="**Go to the Profile Page to begin →**", icon="👤"
48
+ )
49
+
50
+ st.divider()
51
+
52
+ st.warning(
53
+ "**Disclaimer:** This is a demo application - please report any bugs or issues to Tom!"
54
+ )
55
+
56
+
57
+ # --- Footer / About Section ---
58
+ with st.sidebar:
59
+ st.info("Created by **Tom Barrett**")
60
+ with st.expander("About Sentinel"):
61
+ st.markdown("""
62
+ This application uses a Large Language Model (LLM) to synthesize user data with an evidence-based knowledge base,
63
+ providing a nuanced, preliminary cancer risk assessment.
64
+
65
+ **Powered by:**
66
+ - Streamlit
67
+ - FastAPI
68
+ - LangChain
69
+ - ChatGPT, Google Gemini, Llama, etc.
70
+ - ☕ Coffee
71
+ """)
apps/streamlit_ui/page_versions/profile/v1.py ADDED
@@ -0,0 +1,20 @@
1
+ """Legacy v1 profile page components for Streamlit UI."""
2
+
3
+ import streamlit as st
4
+
5
+
6
+ def render():
7
+ """Renders the V1 view of the Profile page (JSON Viewer)."""
8
+
9
+ st.markdown("### V1: Simple JSON Viewer")
10
+ st.info(
11
+ "This view displays the raw JSON of the loaded user profile. It is not editable."
12
+ )
13
+
14
+ profile = st.session_state.get("user_profile")
15
+
16
+ if profile is not None:
17
+ # Display the profile using st.json for clarity and robustness
18
+ st.json(profile.model_dump_json())
19
+ else:
20
+ st.warning("No user profile loaded. Please create or upload one.")
apps/streamlit_ui/page_versions/profile/v2.py ADDED
@@ -0,0 +1,246 @@
1
+ """V2 profile page with editable form for Streamlit UI."""
2
+
3
+ import pandas as pd
4
+ import streamlit as st
5
+
6
+ from sentinel.models import (
7
+ ClinicalObservation,
8
+ Demographics,
9
+ FamilyMemberCancer,
10
+ FemaleSpecific,
11
+ Lifestyle,
12
+ PersonalMedicalHistory,
13
+ UserInput,
14
+ )
15
+ from sentinel.risk_models import RISK_MODELS
16
+
17
+
18
+ def render():
19
+ """Renders the V2 view of the Profile page (Editable Form)."""
20
+
21
+ st.markdown("### V2: Editable Profile Form")
22
+ st.info(
23
+ "This view populates an editable form with the loaded profile data, allowing you to make and save changes."
24
+ )
25
+
26
+ profile = st.session_state.get("user_profile")
27
+
28
+ if profile is None:
29
+ st.warning("No user profile loaded. Please create or upload one.")
30
+ return
31
+
32
+ with st.container(border=True):
33
+ # This selectbox stays outside the form but inside the container.
34
+ sex_options = ["Female", "Male", "Other"]
35
+ try:
36
+ current_sex_index = sex_options.index(profile.demographics.sex)
37
+ except ValueError:
38
+ current_sex_index = 0
39
+
40
+ sex = st.selectbox(
41
+ "Biological Sex",
42
+ options=sex_options,
43
+ index=current_sex_index,
44
+ key="edit_profile_sex",
45
+ help="Changing this will dynamically show or hide sex-specific fields in the form below.",
46
+ )
47
+
48
+ # The form starts here and should contain all the fields and the submit button.
49
+ with st.form(key="edit_profile_form"):
50
+ st.subheader("Demographics")
51
+ age = st.number_input(
52
+ "Age", min_value=0, step=1, value=profile.demographics.age
53
+ )
54
+ ethnicity = st.text_input(
55
+ "Ethnicity", value=profile.demographics.ethnicity or ""
56
+ )
57
+
58
+ st.subheader("Lifestyle")
59
+ smoking_options = ["never", "former", "current"]
60
+ try:
61
+ smoking_index = smoking_options.index(profile.lifestyle.smoking_status)
62
+ except ValueError:
63
+ st.warning(
64
+ f"Invalid 'smoking_status' ('{profile.lifestyle.smoking_status}') in file. Defaulting to '{smoking_options[0]}'."
65
+ )
66
+ smoking_index = 0
67
+ smoking_status = st.selectbox(
68
+ "Smoking Status", smoking_options, index=smoking_index
69
+ )
70
+ smoking_pack_years = st.number_input(
71
+ "Pack-Years",
72
+ min_value=0,
73
+ step=1,
74
+ value=profile.lifestyle.smoking_pack_years or 0,
75
+ )
76
+ alcohol_options = ["none", "light", "moderate", "heavy"]
77
+ try:
78
+ alcohol_index = alcohol_options.index(
79
+ profile.lifestyle.alcohol_consumption
80
+ )
81
+ except ValueError:
82
+ st.warning(
83
+ f"Invalid 'alcohol_consumption' ('{profile.lifestyle.alcohol_consumption}') in file. Defaulting to '{alcohol_options[0]}'."
84
+ )
85
+ alcohol_index = 0
86
+ alcohol_consumption = st.selectbox(
87
+ "Alcohol Consumption", alcohol_options, index=alcohol_index
88
+ )
89
+ dietary_habits = st.text_area(
90
+ "Dietary Habits", value=profile.lifestyle.dietary_habits or ""
91
+ )
92
+ physical_activity_level = st.text_area(
93
+ "Physical Activity",
94
+ value=profile.lifestyle.physical_activity_level or "",
95
+ )
96
+
97
+ st.subheader("Personal Medical History")
98
+ known_genetic_mutations = st.text_input(
99
+ "Known Genetic Mutations (comma-separated)",
100
+ value=", ".join(
101
+ profile.personal_medical_history.known_genetic_mutations
102
+ ),
103
+ )
104
+ previous_cancers = st.text_input(
105
+ "Previous Cancers (comma-separated)",
106
+ value=", ".join(profile.personal_medical_history.previous_cancers),
107
+ )
108
+ chronic_illnesses = st.text_input(
109
+ "Chronic Illnesses (comma-separated)",
110
+ value=", ".join(profile.personal_medical_history.chronic_illnesses),
111
+ )
112
+
113
+ st.subheader("Family History")
114
+ fam_cols = ["relative", "cancer_type", "age_at_diagnosis"]
115
+ fam_history_data = [m.model_dump() for m in profile.family_history]
116
+ fam_history_df = (
117
+ pd.DataFrame(fam_history_data, columns=fam_cols)
118
+ if fam_history_data
119
+ else pd.DataFrame(columns=fam_cols)
120
+ )
121
+ edited_fam_history = st.data_editor(
122
+ fam_history_df,
123
+ num_rows="dynamic",
124
+ key="edit_family_history_editor",
125
+ use_container_width=True,
126
+ )
127
+
128
+ st.subheader("Clinical Observations")
129
+ obs_cols = ["test_name", "value", "unit", "reference_range", "date"]
130
+ obs_data = [o.model_dump() for o in profile.clinical_observations]
131
+ obs_df = (
132
+ pd.DataFrame(obs_data, columns=obs_cols)
133
+ if obs_data
134
+ else pd.DataFrame(columns=obs_cols)
135
+ )
136
+ edited_obs = st.data_editor(
137
+ obs_df,
138
+ num_rows="dynamic",
139
+ key="edit_clinical_obs_editor",
140
+ use_container_width=True,
141
+ )
142
+
143
+ female_specific_data = {}
144
+ if sex == "Female":
145
+ st.subheader("Female-Specific")
146
+ fs_profile = profile.female_specific or FemaleSpecific()
147
+ female_specific_data["age_at_first_period"] = st.number_input(
148
+ "Age at First Period",
149
+ min_value=0,
150
+ step=1,
151
+ value=fs_profile.age_at_first_period or 0,
152
+ )
153
+ female_specific_data["age_at_menopause"] = st.number_input(
154
+ "Age at Menopause",
155
+ min_value=0,
156
+ step=1,
157
+ value=fs_profile.age_at_menopause or 0,
158
+ )
159
+ female_specific_data["num_live_births"] = st.number_input(
160
+ "Number of Live Births",
161
+ min_value=0,
162
+ step=1,
163
+ value=fs_profile.num_live_births or 0,
164
+ )
165
+ female_specific_data["age_at_first_live_birth"] = st.number_input(
166
+ "Age at First Live Birth",
167
+ min_value=0,
168
+ step=1,
169
+ value=fs_profile.age_at_first_live_birth or 0,
170
+ )
171
+ female_specific_data["hormone_therapy_use"] = st.text_input(
172
+ "Hormone Therapy Use", value=fs_profile.hormone_therapy_use or ""
173
+ )
174
+
175
+ current_concerns = st.text_area(
176
+ "Current Concerns or Symptoms",
177
+ value=profile.current_concerns_or_symptoms or "",
178
+ )
179
+
180
+ # The submit button MUST be inside the 'with st.form' block.
181
+ submitted = st.form_submit_button("Save Changes")
182
+ if submitted:
183
+ try:
184
+ demographics = Demographics(
185
+ age=int(age), sex=sex, ethnicity=ethnicity or None
186
+ )
187
+ lifestyle = Lifestyle(
188
+ smoking_status=smoking_status,
189
+ smoking_pack_years=int(smoking_pack_years) or None,
190
+ alcohol_consumption=alcohol_consumption,
191
+ dietary_habits=dietary_habits or None,
192
+ physical_activity_level=physical_activity_level or None,
193
+ )
194
+ pmh = PersonalMedicalHistory(
195
+ known_genetic_mutations=[
196
+ m.strip()
197
+ for m in known_genetic_mutations.split(",")
198
+ if m.strip()
199
+ ],
200
+ previous_cancers=[
201
+ c.strip() for c in previous_cancers.split(",") if c.strip()
202
+ ],
203
+ chronic_illnesses=[
204
+ i.strip() for i in chronic_illnesses.split(",") if i.strip()
205
+ ],
206
+ )
207
+ family_history = [
208
+ FamilyMemberCancer(**row.to_dict())
209
+ for _, row in edited_fam_history.dropna(how="all").iterrows()
210
+ ]
211
+ observations = [
212
+ ClinicalObservation(**row.to_dict())
213
+ for _, row in edited_obs.dropna(how="all").iterrows()
214
+ ]
215
+
216
+ female_specific = None
217
+ if sex == "Female":
218
+ if any(female_specific_data.values()):
219
+ female_specific = FemaleSpecific(**female_specific_data)
220
+
221
+ updated_profile = UserInput(
222
+ demographics=demographics,
223
+ lifestyle=lifestyle,
224
+ family_history=family_history,
225
+ personal_medical_history=pmh,
226
+ female_specific=female_specific,
227
+ current_concerns_or_symptoms=current_concerns or None,
228
+ clinical_observations=observations,
229
+ )
230
+
231
+ with st.spinner("Calculating risk scores..."):
232
+ risks_scores = []
233
+ for model in RISK_MODELS:
234
+ risk_score = model().run(updated_profile)
235
+ risks_scores.append(risk_score)
236
+
237
+ # Attach the scores to the object before saving
238
+ updated_profile.risks_scores = risks_scores
239
+
240
+ # Now save the fully updated object to the session state
241
+ st.session_state.user_profile = updated_profile
242
+ st.success("Profile updated and risk scores calculated!")
243
+ st.rerun()
244
+
245
+ except Exception as e:
246
+ st.error(f"Error updating profile: {e}")
apps/streamlit_ui/pages/1_Profile.py ADDED
@@ -0,0 +1,266 @@
1
+ """User profile management page."""
2
+
3
+ import sys
4
+ from pathlib import Path
5
+
6
+ # Add the project root to the Python path
7
+ # This is necessary for Streamlit to find modules in the 'apps' directory
8
+ project_root = Path(__file__).resolve().parents[3]
9
+ if str(project_root) not in sys.path:
10
+ sys.path.append(str(project_root))
11
+
12
+ import pandas as pd
13
+ import streamlit as st
14
+
15
+ from apps.streamlit_ui.page_versions.profile import v1, v2
16
+ from sentinel.models import (
17
+ ClinicalObservation,
18
+ Demographics,
19
+ FamilyMemberCancer,
20
+ FemaleSpecific,
21
+ Lifestyle,
22
+ PersonalMedicalHistory,
23
+ UserInput,
24
+ )
25
+ from sentinel.utils import load_user_file
26
+
27
+
28
+ # --- Helper Functions ---
29
+ def clear_profile_state():
30
+ """Callback function to reset profile-related session state."""
31
+ st.session_state.user_profile = None
32
+ if "profile_upload" in st.session_state:
33
+ del st.session_state["profile_upload"]
34
+
35
+
36
+ # --- Main Page Layout ---
37
+ st.title("👤 User Profile")
38
+
39
+ # --- Sidebar for Version Selection and Upload ---
40
+ with st.sidebar:
41
+ st.header("Controls")
42
+
43
+ # Version selection
44
+ version_options = ["V2 (Editable Form)", "V1 (JSON Viewer)"]
45
+ version = st.radio(
46
+ "Select Demo Version",
47
+ version_options,
48
+ help="Choose the version of the profile page to display.",
49
+ )
50
+
51
+ st.divider()
52
+
53
+ # Example Profile Selector
54
+ examples_dir = project_root / "examples"
55
+
56
+ # Collect all example profiles
57
+ profile_files = []
58
+ if examples_dir.exists():
59
+ # Get profiles from dev/
60
+ dev_dir = examples_dir / "dev"
61
+ if dev_dir.exists():
62
+ profile_files.extend(sorted(dev_dir.glob("*.yaml")))
63
+ profile_files.extend(sorted(dev_dir.glob("*.json")))
64
+
65
+ # Get profiles from synthetic/
66
+ synthetic_dir = examples_dir / "synthetic"
67
+ if synthetic_dir.exists():
68
+ for subdir in sorted(synthetic_dir.iterdir()):
69
+ if subdir.is_dir():
70
+ profile_files.extend(sorted(subdir.glob("*.yaml")))
71
+ profile_files.extend(sorted(subdir.glob("*.json")))
72
+
73
+ # Create display names (relative to examples/)
74
+ profile_options = {}
75
+ if profile_files:
76
+ for p in profile_files:
77
+ rel_path = p.relative_to(examples_dir)
78
+ profile_options[str(rel_path)] = p
79
+
80
+ # Dropdown selector
81
+ if profile_options:
82
+ selected = st.selectbox(
83
+ "Load Example Profile",
84
+ options=["-- Select a profile --", *profile_options.keys()],
85
+ key="profile_selector",
86
+ )
87
+
88
+ if selected != "-- Select a profile --":
89
+ try:
90
+ profile_path = profile_options[selected]
91
+ st.session_state.user_profile = load_user_file(str(profile_path))
92
+ st.success(f"✅ Loaded: {selected}")
93
+ except Exception as e:
94
+ st.error(f"Failed to load profile: {e}")
95
+
96
+ # Clear Profile Button
97
+ if st.session_state.get("user_profile"):
98
+ st.button(
99
+ "Clear Loaded Profile",
100
+ on_click=clear_profile_state,
101
+ use_container_width=True,
102
+ )
103
+
104
+
105
+ # --- Page Content Dispatcher ---
106
+ # Render the selected page version
107
+ if version == "V1 (JSON Viewer)":
108
+ v1.render()
109
+ else: # Default to V2
110
+ v2.render()
111
+
112
+ # The manual creation form can be a persistent feature at the bottom of the page
113
+ with st.expander("Create New Profile Manually"):
114
+ # --- STEP 1: Move the sex selector OUTSIDE the form. ---
115
+ # This allows it to trigger a rerun and update the UI dynamically.
116
+ # Give it a unique key to avoid conflicts with other widgets.
117
+ sex = st.selectbox(
118
+ "Biological Sex", ["Male", "Female", "Other"], key="manual_profile_sex"
119
+ )
120
+
121
+ with st.form("manual_profile_form"):
122
+ st.subheader("Demographics")
123
+ age = st.number_input("Age", min_value=0, step=1)
124
+ # The 'sex' variable is now taken from the selector above the form.
125
+ ethnicity = st.text_input("Ethnicity")
126
+
127
+ st.subheader("Lifestyle")
128
+ smoking_status = st.selectbox("Smoking Status", ["never", "former", "current"])
129
+ smoking_pack_years = st.number_input("Pack-Years", min_value=0, step=1)
130
+ alcohol_consumption = st.selectbox(
131
+ "Alcohol Consumption", ["none", "light", "moderate", "heavy"]
132
+ )
133
+ dietary_habits = st.text_area("Dietary Habits")
134
+ physical_activity_level = st.text_area("Physical Activity")
135
+
136
+ st.subheader("Personal Medical History")
137
+ known_genetic_mutations = st.text_input(
138
+ "Known Genetic Mutations (comma-separated)"
139
+ )
140
+ previous_cancers = st.text_input("Previous Cancers (comma-separated)")
141
+ chronic_illnesses = st.text_input("Chronic Illnesses (comma-separated)")
142
+
143
+ st.subheader("Family History")
144
+ fam_cols = ["relative", "cancer_type", "age_at_diagnosis"]
145
+ fam_df = st.data_editor(
146
+ pd.DataFrame(columns=fam_cols),
147
+ num_rows="dynamic",
148
+ key="family_history_editor",
149
+ )
150
+
151
+ st.subheader("Clinical Observations")
152
+ obs_cols = ["test_name", "value", "unit", "reference_range", "date"]
153
+ obs_df = st.data_editor(
154
+ pd.DataFrame(columns=obs_cols),
155
+ num_rows="dynamic",
156
+ key="clinical_obs_editor",
157
+ )
158
+
159
+ female_specific_data = {}
160
+ # --- STEP 2: The conditional check now works correctly. ---
161
+ # The 'if' statement is evaluated on each rerun when the 'sex' selector changes.
162
+ if sex == "Female":
163
+ st.subheader("Female-Specific")
164
+ female_specific_data["age_at_first_period"] = st.number_input(
165
+ "Age at First Period", min_value=0, step=1
166
+ )
167
+ female_specific_data["age_at_menopause"] = st.number_input(
168
+ "Age at Menopause", min_value=0, step=1
169
+ )
170
+ female_specific_data["num_live_births"] = st.number_input(
171
+ "Number of Live Births", min_value=0, step=1
172
+ )
173
+ female_specific_data["age_at_first_live_birth"] = st.number_input(
174
+ "Age at First Live Birth", min_value=0, step=1
175
+ )
176
+ female_specific_data["hormone_therapy_use"] = st.text_input(
177
+ "Hormone Therapy Use"
178
+ )
179
+
180
+ current_concerns = st.text_area("Current Concerns or Symptoms")
181
+
182
+ submitted = st.form_submit_button("Save New Profile")
183
+ if submitted:
184
+ # --- STEP 3: Use the 'sex' variable from the external selector during submission. ---
185
+ demographics = Demographics(
186
+ age=int(age), sex=sex, ethnicity=ethnicity or None
187
+ )
188
+ lifestyle = Lifestyle(
189
+ smoking_status=smoking_status,
190
+ smoking_pack_years=int(smoking_pack_years) or None,
191
+ alcohol_consumption=alcohol_consumption,
192
+ dietary_habits=dietary_habits or None,
193
+ physical_activity_level=physical_activity_level or None,
194
+ )
195
+ pmh = PersonalMedicalHistory(
196
+ known_genetic_mutations=[
197
+ m.strip() for m in known_genetic_mutations.split(",") if m.strip()
198
+ ],
199
+ previous_cancers=[
200
+ c.strip() for c in previous_cancers.split(",") if c.strip()
201
+ ],
202
+ chronic_illnesses=[
203
+ i.strip() for i in chronic_illnesses.split(",") if i.strip()
204
+ ],
205
+ )
206
+ family_history = []
207
+ for _, row in fam_df.dropna(how="all").iterrows():
208
+ if row.get("relative") and row.get("cancer_type"):
209
+ family_history.append(
210
+ FamilyMemberCancer(
211
+ relative=str(row["relative"]),
212
+ cancer_type=str(row["cancer_type"]),
213
+ age_at_diagnosis=int(row["age_at_diagnosis"])
214
+ if row["age_at_diagnosis"] not in ["", None]
215
+ else None,
216
+ )
217
+ )
218
+
219
+ observations = []
220
+ for _, row in obs_df.dropna(how="all").iterrows():
221
+ if row.get("test_name") and row.get("value") and row.get("unit"):
222
+ observations.append(
223
+ ClinicalObservation(
224
+ test_name=str(row["test_name"]),
225
+ value=str(row["value"]),
226
+ unit=str(row["unit"]),
227
+ reference_range=(
228
+ str(row["reference_range"])
229
+ if row["reference_range"] not in ["", None]
230
+ else None
231
+ ),
232
+ date=str(row["date"])
233
+ if row["date"] not in ["", None]
234
+ else None,
235
+ )
236
+ )
237
+
238
+ female_specific = None
239
+ if sex == "Female":
240
+ female_specific = FemaleSpecific(**female_specific_data)
241
+
242
+ new_profile = UserInput(
243
+ demographics=demographics,
244
+ lifestyle=lifestyle,
245
+ family_history=family_history,
246
+ personal_medical_history=pmh,
247
+ female_specific=female_specific,
248
+ current_concerns_or_symptoms=current_concerns or None,
249
+ clinical_observations=observations,
250
+ )
251
+ st.success("Profile saved")
252
+
253
+ # --- STEP 4: Compute the risk scores ---
254
+ with st.spinner("Calculating risk scores..."):
255
+ from sentinel.risk_models import RISK_MODELS
256
+
257
+ risks_scores = []
258
+ for model in RISK_MODELS:
259
+ risk_score = model().run(new_profile)
260
+ risks_scores.append(risk_score)
261
+
262
+ new_profile.risks_scores = risks_scores
263
+
264
+ st.session_state.user_profile = new_profile
265
+ st.success("Risk scores calculated!")
266
+ st.rerun()
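The save handler above recomputes every registered risk model against the freshly built profile. Below is a minimal sketch of the same flow outside Streamlit, assuming the `sentinel` API this page already imports (`load_user_file`, `RISK_MODELS`, `model().run(...)`); the example file path is hypothetical.

```python
from sentinel.risk_models import RISK_MODELS
from sentinel.utils import load_user_file

# Hypothetical example path; any profile under examples/ should load the same way.
profile = load_user_file("examples/dev/sample_profile.yaml")

# Same loop the Profile page runs on save: instantiate each registered model
# and apply it to the full profile.
profile.risks_scores = [model().run(profile) for model in RISK_MODELS]

for score in profile.risks_scores:
    print(score.name, score.score)
```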
apps/streamlit_ui/pages/2_Configuration.py ADDED
@@ -0,0 +1,131 @@
1
+ """Streamlit page: Configuration."""
2
+
3
+ from pathlib import Path
4
+
5
+ import streamlit as st
6
+ import yaml
7
+ from ui_utils import initialize_session_state
8
+
9
+ from sentinel.config import AppConfig, ModelConfig, ResourcePaths
10
+ from sentinel.factory import SentinelFactory
11
+
12
+ initialize_session_state()
13
+
14
+ st.title("⚙️ Model Configuration")
15
+
16
+ # Define base paths relative to project root
17
+ root = Path(__file__).resolve().parents[3]
18
+ model_dir = root / "configs" / "model"
19
+ model_options = sorted([p.stem for p in model_dir.glob("*.yaml")])
20
+ default_model = (
21
+ "gemini_2.5_pro" if ("gemini_2.5_pro" in model_options) else model_options[0]
22
+ )
23
+
24
+ # Model selection
25
+ current_model = st.session_state.config.get("model") or default_model
26
+ selected_model = st.selectbox(
27
+ "Model Config",
28
+ model_options,
29
+ index=model_options.index(current_model) if current_model in model_options else 0,
30
+ )
31
+ st.session_state.config["model"] = selected_model
32
+
33
+ # Cancer modules selection
34
+ cancer_dir = root / "configs" / "knowledge_base" / "cancer_modules"
35
+ cancer_options = sorted([p.stem for p in cancer_dir.glob("*.yaml")])
36
+ selected_cancers = st.multiselect(
37
+ "Cancer Modules",
38
+ cancer_options,
39
+ default=st.session_state.config.get("cancer_modules", cancer_options),
40
+ )
41
+ st.session_state.config["cancer_modules"] = selected_cancers
42
+
43
+ # Diagnostic protocols selection
44
+ protocol_dir = root / "configs" / "knowledge_base" / "dx_protocols"
45
+ protocol_options = sorted([p.stem for p in protocol_dir.glob("*.yaml")])
46
+ selected_protocols = st.multiselect(
47
+ "Diagnostic Protocols",
48
+ protocol_options,
49
+ default=st.session_state.config.get("dx_protocols", protocol_options),
50
+ )
51
+ st.session_state.config["dx_protocols"] = selected_protocols
52
+
53
+
54
+ @st.cache_data(show_spinner=False)
55
+ def generate_prompt_preview(
56
+ model_config: str, cancer_modules: list, dx_protocols: list, _user_profile=None
57
+ ) -> str:
58
+ """Generate prompt preview using the factory system.
59
+
60
+ Args:
61
+ model_config (str): Name of the Hydra model configuration to load.
62
+ cancer_modules (list): Cancer module slugs selected by the user.
63
+ dx_protocols (list): Diagnostic protocol slugs to include.
64
+ _user_profile: Optional cached profile used when formatting prompts.
65
+
66
+ Returns:
67
+ str: Markdown-formatted prompt or an error message if generation fails.
68
+ """
69
+ try:
70
+ # Load model config to get provider and model name
71
+ model_config_path = root / "configs" / "model" / f"{model_config}.yaml"
72
+ with open(model_config_path) as f:
73
+ model_data = yaml.safe_load(f)
74
+
75
+ # Create knowledge base paths
76
+ knowledge_base_paths = ResourcePaths(
77
+ persona=root / "prompts" / "persona" / "default.md",
78
+ instruction_assessment=root / "prompts" / "instruction" / "assessment.md",
79
+ instruction_conversation=root
80
+ / "prompts"
81
+ / "instruction"
82
+ / "conversation.md",
83
+ output_format_assessment=root
84
+ / "configs"
85
+ / "output_format"
86
+ / "assessment.yaml",
87
+ output_format_conversation=root
88
+ / "configs"
89
+ / "output_format"
90
+ / "conversation.yaml",
91
+ cancer_modules_dir=root / "configs" / "knowledge_base" / "cancer_modules",
92
+ dx_protocols_dir=root / "configs" / "knowledge_base" / "dx_protocols",
93
+ )
94
+
95
+ # Create app config
96
+ app_config = AppConfig(
97
+ model=ModelConfig(
98
+ provider=model_data["provider"], model_name=model_data["model_name"]
99
+ ),
100
+ knowledge_base_paths=knowledge_base_paths,
101
+ selected_cancer_modules=cancer_modules,
102
+ selected_dx_protocols=dx_protocols,
103
+ )
104
+
105
+ # Create factory and get prompt builder
106
+ factory = SentinelFactory(app_config)
107
+
108
+ # Generate assessment prompt
109
+ prompt = factory.prompt_builder.build_assessment_prompt()
110
+
111
+ # Format prompt with user data if available
112
+ user_json = _user_profile.model_dump_json() if _user_profile is not None else ""
113
+ formatted_prompt = prompt.format(user_data=user_json)
114
+
115
+ return formatted_prompt
116
+
117
+ except Exception as e:
118
+ return f"Error generating prompt preview: {e!s}"
119
+
120
+
121
+ # Generate prompt preview
122
+ if selected_model:
123
+ prompt_text = generate_prompt_preview(
124
+ selected_model,
125
+ selected_cancers,
126
+ selected_protocols,
127
+ st.session_state.user_profile,
128
+ )
129
+
130
+ st.subheader("Prompt Preview")
131
+ st.text_area("System Prompt", value=prompt_text, height=500, disabled=True)
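Note the leading underscore on `_user_profile` in `generate_prompt_preview`: `st.cache_data` skips hashing parameters whose names start with an underscore, so the profile object never needs to be hashable. A small illustrative sketch of that convention (function and argument names are made up):

```python
import streamlit as st


@st.cache_data(show_spinner=False)
def cached_preview(config_name: str, _profile=None) -> str:
    # `config_name` is part of the cache key; `_profile` is not, because
    # st.cache_data ignores parameters whose names begin with "_".
    return f"preview for {config_name}"
```

The trade-off is that a change to the unhashed argument alone does not invalidate the cache, so the preview above refreshes only when the model or knowledge-base selection changes.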
apps/streamlit_ui/pages/3_Assessment.py ADDED
@@ -0,0 +1,249 @@
1
+ """Streamlit page: Assessment."""
2
+
3
+ import os
4
+ import tempfile
5
+ from pathlib import Path
6
+
7
+ import streamlit as st
8
+ import yaml
9
+
10
+ # Configure page layout to be wider
11
+ st.set_page_config(layout="wide")
12
+ from collections import Counter
13
+
14
+ import pandas as pd
15
+ import plotly.graph_objects as go
16
+ from ui_utils import initialize_session_state
17
+
18
+ from sentinel.config import AppConfig, ModelConfig, ResourcePaths
19
+ from sentinel.conversation import ConversationManager
20
+ from sentinel.factory import SentinelFactory
21
+ from sentinel.reporting import generate_excel_report, generate_pdf_report
22
+
23
+ initialize_session_state()
24
+
25
+ if st.session_state.user_profile is None:
26
+ st.warning(
27
+ "Please complete your profile on the Profile page before running an assessment."
28
+ )
29
+ st.stop()
30
+
31
+
32
+ def create_conversation_manager(config: dict) -> ConversationManager:
33
+ """Create a conversation manager from the current configuration.
34
+
35
+ Args:
36
+ config: A dictionary containing the current configuration.
37
+
38
+ Returns:
39
+ ConversationManager: A conversation manager instance.
40
+ """
41
+ # Define base paths relative to project root
42
+ root = Path(__file__).resolve().parents[3]
43
+
44
+ # Load model config to get provider and model name
45
+ model_config_path = root / "configs" / "model" / f"{config['model']}.yaml"
46
+ with open(model_config_path) as f:
47
+ model_data = yaml.safe_load(f)
48
+
49
+ # Create knowledge base paths
50
+ knowledge_base_paths = ResourcePaths(
51
+ persona=root / "prompts" / "persona" / "default.md",
52
+ instruction_assessment=root / "prompts" / "instruction" / "assessment.md",
53
+ instruction_conversation=root / "prompts" / "instruction" / "conversation.md",
54
+ output_format_assessment=root / "configs" / "output_format" / "assessment.yaml",
55
+ output_format_conversation=root
56
+ / "configs"
57
+ / "output_format"
58
+ / "conversation.yaml",
59
+ cancer_modules_dir=root / "configs" / "knowledge_base" / "cancer_modules",
60
+ dx_protocols_dir=root / "configs" / "knowledge_base" / "dx_protocols",
61
+ )
62
+
63
+ # Create app config
64
+ app_config = AppConfig(
65
+ model=ModelConfig(
66
+ provider=model_data["provider"], model_name=model_data["model_name"]
67
+ ),
68
+ knowledge_base_paths=knowledge_base_paths,
69
+ selected_cancer_modules=config.get("cancer_modules", []),
70
+ selected_dx_protocols=config.get("dx_protocols", []),
71
+ )
72
+
73
+ # Create factory and conversation manager
74
+ factory = SentinelFactory(app_config)
75
+ return factory.create_conversation_manager()
76
+
77
+
78
+ manager = create_conversation_manager(st.session_state.config)
79
+ st.session_state.conversation_manager = manager
80
+
81
+ st.title("🔬 Assessment")
82
+
83
+ if st.button("Run Assessment", type="primary"):
84
+ with st.spinner("Running..."):
85
+ result = manager.initial_assessment(st.session_state.user_profile)
86
+ st.session_state.assessment = result
87
+
88
+ assessment = st.session_state.get("assessment")
89
+
90
+ if assessment:
91
+ # --- 1. PRE-SORT DATA ---
92
+ sorted_risk_assessments = sorted(
93
+ assessment.risk_assessments, key=lambda x: x.risk_level or 0, reverse=True
94
+ )
95
+ sorted_dx_recommendations = sorted(
96
+ assessment.dx_recommendations,
97
+ key=lambda x: x.recommendation_level or 0,
98
+ reverse=True,
99
+ )
100
+
101
+ # --- 2. ROW 1: OVERALL RISK SCORE ---
102
+ st.subheader("Overall Risk Score")
103
+ if assessment.overall_risk_score is not None:
104
+ fig = go.Figure(
105
+ go.Indicator(
106
+ mode="gauge+number",
107
+ value=assessment.overall_risk_score,
108
+ title={"text": "Overall Score"},
109
+ gauge={"axis": {"range": [0, 100]}},
110
+ )
111
+ )
112
+ fig.update_layout(height=300, margin=dict(t=50, b=40, l=40, r=40))
113
+ st.plotly_chart(fig, use_container_width=True)
114
+ st.divider()
115
+
116
+ # --- 3. ROW 2: RISK & RECOMMENDATION CHARTS ---
117
+ col1, col2 = st.columns(2)
118
+ with col1:
119
+ st.subheader("Cancer Risk Levels")
120
+ if sorted_risk_assessments:
121
+ cancers = [ra.cancer_type for ra in sorted_risk_assessments]
122
+ levels = [ra.risk_level or 0 for ra in sorted_risk_assessments]
123
+ short_cancers = [c[:28] + "..." if len(c) > 28 else c for c in cancers]
124
+ fig = go.Figure(
125
+ go.Bar(
126
+ x=levels,
127
+ y=short_cancers,
128
+ orientation="h",
129
+ hovertext=cancers,
130
+ hovertemplate="<b>%{hovertext}</b><br>Risk Level: %{x}<extra></extra>",
131
+ )
132
+ )
133
+ fig.update_layout(
134
+ xaxis=dict(range=[0, 5], title="Risk Level"),
135
+ yaxis=dict(autorange="reversed"),
136
+ margin=dict(t=20, b=40, l=40, r=40),
137
+ )
138
+ st.plotly_chart(fig, use_container_width=True)
139
+
140
+ with col2:
141
+ st.subheader("Dx Recommendations")
142
+ if sorted_dx_recommendations:
143
+ tests = [dx.test_name for dx in sorted_dx_recommendations]
144
+ recs = [dx.recommendation_level or 0 for dx in sorted_dx_recommendations]
145
+ short_tests = [t[:28] + "..." if len(t) > 28 else t for t in tests]
146
+ fig = go.Figure(
147
+ go.Bar(
148
+ x=recs,
149
+ y=short_tests,
150
+ orientation="h",
151
+ hovertext=tests,
152
+ hovertemplate="<b>%{hovertext}</b><br>Recommendation: %{x}<extra></extra>",
153
+ )
154
+ )
155
+ fig.update_layout(
156
+ xaxis=dict(range=[0, 5], title="Recommendation"),
157
+ yaxis=dict(autorange="reversed"),
158
+ margin=dict(t=20, b=40, l=40, r=40),
159
+ )
160
+ st.plotly_chart(fig, use_container_width=True)
161
+ st.divider()
162
+
163
+ # --- 4. ROW 3: RISK FACTOR VISUALIZATIONS ---
164
+ if assessment.identified_risk_factors:
165
+ col3, col4 = st.columns(2)
166
+ with col3:
167
+ st.subheader("Risk Factor Summary")
168
+ categories = [
169
+ rf.category.value for rf in assessment.identified_risk_factors
170
+ ]
171
+ category_counts = Counter(categories)
172
+ pie_fig = go.Figure(
173
+ go.Pie(
174
+ labels=list(category_counts.keys()),
175
+ values=list(category_counts.values()),
176
+ hole=0.3,
177
+ )
178
+ )
179
+ pie_fig.update_layout(
180
+ height=400,
181
+ margin=dict(t=20, b=40, l=40, r=40),
182
+ legend=dict(
183
+ orientation="v", yanchor="middle", y=0.5, xanchor="left", x=1.05
184
+ ),
185
+ )
186
+ st.plotly_chart(pie_fig, use_container_width=True)
187
+
188
+ with col4:
189
+ st.subheader("Identified Risk Factors")
190
+ risk_factor_data = [
191
+ {"Category": rf.category.value, "Description": rf.description}
192
+ for rf in assessment.identified_risk_factors
193
+ ]
194
+ rf_df = pd.DataFrame(risk_factor_data)
195
+ st.dataframe(rf_df, use_container_width=True, height=400, hide_index=True)
196
+
197
+ # --- 5. EXPANDERS (using sorted data) ---
198
+ with st.expander("Overall Summary"):
199
+ st.markdown(assessment.overall_summary, unsafe_allow_html=True)
200
+
201
+ with st.expander("Risk Assessments"):
202
+ for ra in sorted_risk_assessments:
203
+ st.markdown(f"**{ra.cancer_type}** - {ra.risk_level or 'N/A'}/5")
204
+ st.write(ra.explanation)
205
+ if ra.recommended_steps:
206
+ st.write("**Recommended Steps:**")
207
+ steps = ra.recommended_steps
208
+ if isinstance(steps, list):
209
+ for step in steps:
210
+ st.write(f"- {step}")
211
+ else:
212
+ st.write(f"- {steps}")
213
+ if ra.lifestyle_advice:
214
+ st.write(f"*{ra.lifestyle_advice}*")
215
+ st.divider()
216
+
217
+ with st.expander("Dx Recommendations"):
218
+ for dx in sorted_dx_recommendations:
219
+ st.markdown(f"**{dx.test_name}** - {dx.recommendation_level or 'N/A'}/5")
220
+ if dx.frequency:
221
+ st.write(f"Frequency: {dx.frequency}")
222
+ st.write(dx.rationale)
223
+ if dx.applicable_guideline:
224
+ st.write(f"Guideline: {dx.applicable_guideline}")
225
+ st.divider()
226
+
227
+ # --- 6. EXISTING DOWNLOAD AND CHAT LOGIC ---
228
+ with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
229
+ generate_pdf_report(assessment, st.session_state.user_profile, f.name)
230
+ f.seek(0)
231
+ pdf_data = f.read()
232
+ st.download_button("Download PDF", pdf_data, file_name="assessment.pdf")
233
+ os.unlink(f.name)
234
+
235
+ with tempfile.NamedTemporaryFile(suffix=".xlsx", delete=False) as f:
236
+ generate_excel_report(assessment, st.session_state.user_profile, f.name)
237
+ f.seek(0)
238
+ xls_data = f.read()
239
+ st.download_button("Download Excel", xls_data, file_name="assessment.xlsx")
240
+
241
+ # for q, a in manager.history:
242
+ # st.chat_message("user").write(q)
243
+ # st.chat_message("assistant").write(a)
244
+
245
+ if question := st.chat_input("Ask a follow-up question"):
246
+ with st.spinner("Thinking..."):
247
+ resp = manager.follow_up(question)
248
+ st.chat_message("user").write(question)
249
+ st.chat_message("assistant").write(resp.response)
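The download section above writes each report to a named temporary file and then hands the raw bytes to `st.download_button`. A condensed sketch of that pattern with a generic writer callable; `write_report` is a hypothetical stand-in for `generate_pdf_report` / `generate_excel_report`:

```python
import os
import tempfile

import streamlit as st


def offer_download(write_report, label: str, file_name: str, suffix: str) -> None:
    """Write a report to a temporary file and expose its bytes as a download."""
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as f:
        write_report(f.name)  # the report generator writes to this path
        f.seek(0)
        data = f.read()
    st.download_button(label, data, file_name=file_name)
    os.unlink(f.name)  # delete=False requires manual cleanup
```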
apps/streamlit_ui/pages/4_Risk_Scores.py ADDED
@@ -0,0 +1,62 @@
1
+ """Streamlit page: Risk Scores."""
2
+
3
+ import streamlit as st
4
+
5
+ st.set_page_config(page_title="Risk Scores", page_icon="🧮")
6
+
7
+ st.title("🧮 Calculated Risk Scores")
8
+
9
+ profile = st.session_state.get("user_profile")
10
+
11
+ if profile is None:
12
+ st.info(
13
+ "⬅️ Please load or create a user profile on the 'Profile' page to view the calculated scores."
14
+ )
15
+ st.stop()
16
+
17
+ if not profile.risks_scores:
18
+ st.warning("Risk scores have not been calculated for the current profile yet.")
19
+ st.info(
20
+ "⬅️ Please go to the 'Profile' page and click the 'Save' button to trigger the calculation."
21
+ )
22
+ st.stop()
23
+
24
+ st.header("Applicable Risk Scores")
25
+ st.caption(
26
+ "The following risk scores were applicable to the provided user profile. Models that were not applicable are not shown."
27
+ )
28
+
29
+ # Filter out scores where the score string contains "N/A".
30
+ applicable_scores = [
31
+ s for s in profile.risks_scores if s is not None and "N/A" not in s.score
32
+ ]
33
+
34
+ if not applicable_scores:
35
+ st.success("✅ No major risk models were applicable or triggered for this profile.")
36
+ st.stop()
37
+
38
+ # Loop through and display only the applicable scores
39
+ for score in applicable_scores:
40
+ model_name = score.name.replace("_", " ").title()
41
+ if score.cancer_type:
42
+ cancer_type = score.cancer_type.replace("_", " ").title()
43
+ title = f"{model_name} ({cancer_type} Risk)"
44
+ else:
45
+ title = model_name
46
+
47
+ with st.expander(title, expanded=True):
48
+ col1, col2 = st.columns(2)
49
+ with col1:
50
+ st.metric(label="Risk Score", value=f"{score.score}")
51
+
52
+ if score.interpretation:
53
+ st.markdown("**Interpretation:**")
54
+ st.info(score.interpretation)
55
+
56
+ if score.description:
57
+ st.markdown(f"**Model Description:** {score.description}")
58
+
59
+ if score.references:
60
+ st.markdown("**References:**")
61
+ for ref in score.references:
62
+ st.write(f"- {ref}")
apps/streamlit_ui/pages/__init__.py ADDED
File without changes
apps/streamlit_ui/ui_utils.py ADDED
@@ -0,0 +1,41 @@
1
+ """Utilities for Streamlit UI components and helpers."""
2
+
3
+ from pathlib import Path
4
+
5
+ import streamlit as st
6
+
7
+
8
+ def initialize_session_state() -> None:
9
+ """Initialize Streamlit session state with default values."""
10
+ if "user_profile" not in st.session_state:
11
+ st.session_state.user_profile = None
12
+ if "config" not in st.session_state:
13
+ # Load all available options as defaults
14
+ root = Path(__file__).resolve().parents[2] # Go up to project root
15
+
16
+ cancer_dir = root / "configs" / "knowledge_base" / "cancer_modules"
17
+ all_cancer_modules = sorted([p.stem for p in cancer_dir.glob("*.yaml")])
18
+
19
+ protocol_dir = root / "configs" / "knowledge_base" / "dx_protocols"
20
+ all_dx_protocols = sorted([p.stem for p in protocol_dir.glob("*.yaml")])
21
+
22
+ model_dir = root / "configs" / "model"
23
+ model_options = sorted([p.stem for p in model_dir.glob("*.yaml")])
24
+ if model_options:
25
+ default_model = (
26
+ "gemini_2.5_pro"
27
+ if ("gemini_2.5_pro" in model_options)
28
+ else model_options[0]
29
+ )
30
+ else:
31
+ default_model = None
32
+
33
+ st.session_state.config = {
34
+ "model": default_model,
35
+ "cancer_modules": all_cancer_modules,
36
+ "dx_protocols": all_dx_protocols,
37
+ }
38
+ if "assessment" not in st.session_state:
39
+ st.session_state.assessment = None
40
+ if "conversation_manager" not in st.session_state:
41
+ st.session_state.conversation_manager = None
configs/config.yaml ADDED
@@ -0,0 +1,14 @@
1
+ defaults:
2
+ - model: gemma3_4b
3
+ - _self_
4
+
5
+ user_file: null
6
+ dev_mode: false
7
+
8
+ knowledge_base:
9
+ # Cancer modules removed - risk models handle this logic directly
10
+ cancer_modules: []
11
+
12
+ dx_protocols:
13
+ # Keep one protocol as reference template for future additions
14
+ - mammography_screening
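This is a Hydra config: the `defaults` list pulls `configs/model/gemma3_4b.yaml` into `cfg.model`. A minimal sketch of composing it, assuming a Hydra entry point; the function name and relative `config_path` are illustrative rather than copied from `apps/cli/main.py`:

```python
import hydra
from omegaconf import DictConfig


@hydra.main(version_base=None, config_path="configs", config_name="config")
def main(cfg: DictConfig) -> None:
    # cfg.model is filled from configs/model/<name>.yaml via the defaults list,
    # so it exposes `provider` and `model_name`.
    print(cfg.model.provider, cfg.model.model_name)
    print(list(cfg.knowledge_base.dx_protocols))  # ["mammography_screening"]


if __name__ == "__main__":
    main()
```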
configs/knowledge_base/dx_protocols/mammography_screening.yaml ADDED
@@ -0,0 +1,46 @@
1
+ key: mammography_screening
2
+ name: "Mammogram for Breast Cancer Screening"
3
+ description: "A mammogram is a low-dose X-ray of the breast used to find early signs of cancer, often before they can be seen or felt as a lump. Finding breast cancer early greatly increases the chances of successful treatment."
4
+ typical_frequency: "Every 1 to 2 years for women of screening age, depending on specific guidelines and risk factors."
5
+ additional_information: |
6
+ #### CORE GUIDANCE FOR AVERAGE-RISK INDIVIDUALS
7
+ This information is for individuals at average risk of breast cancer. You are generally considered average risk if you do not have a personal history of breast cancer, a known high-risk genetic mutation like BRCA1/2, or a history of radiation therapy to the chest at a young age.
8
+
9
+ ##### Guideline Nuances:
10
+ It's important to know that different expert groups have slightly different recommendations. This can be confusing, but it reflects that they weigh the benefits and harms of screening differently.
11
+ - **U.S. Preventive Services Task Force (USPSTF):** Recommends a mammogram every 2 years for women ages 40 to 74.
12
+ - **American Cancer Society (ACS):** Recommends that women ages 40-44 have the option to start yearly mammograms, recommends yearly mammograms for women ages 45-54, and notes that from age 55 women can switch to every 2 years or continue yearly screening.
13
+ - **NHS (UK):** Invites women for a mammogram every 3 years between the ages of 50 and 71.
14
+ This assistant's primary logic is based on the USPSTF guidelines, which recommend starting at age 40. You should discuss with your doctor which schedule is best for you, considering your personal health, values, and local practices.
15
+
16
+ #### RISK STRATIFICATION: IDENTIFYING HIGH-RISK INDIVIDUALS
17
+ Certain factors place you at a significantly higher risk for breast cancer and mean you need a different, more intensive screening plan. If any of the following apply to you, the standard recommendations are NOT sufficient. You should speak with your doctor about a referral to a high-risk breast clinic or genetic counselor.
18
+
19
+ ##### High-Risk Triggers:
20
+ - **Known Genetic Mutation:** You or a first-degree relative (parent, sibling, child) has a known mutation in a gene such as *BRCA1*, *BRCA2*, *TP53*, or *PALB2*.
21
+ - **Strong Family History:** Even without genetic testing, a strong family history may qualify you for high-risk screening. This can be complex, but often includes having multiple first-degree relatives with breast cancer, or relatives diagnosed at a young age (e.g., before 50).
22
+ - **Calculated Lifetime Risk:** Risk assessment tools (like the Tyrer-Cuzick or Gail models) estimate your lifetime risk of breast cancer to be 20% or higher.
23
+ - **History of Chest Radiation:** You received radiation therapy to the chest between the ages of 10 and 30 (e.g., for Hodgkin lymphoma).
24
+ - **Personal History:** You have a personal history of lobular carcinoma in situ (LCIS), atypical ductal hyperplasia (ADH), or atypical lobular hyperplasia (ALH).
25
+
26
+ **If you meet high-risk criteria, guidelines often recommend annual screening with both a breast MRI and a mammogram, typically starting at age 30.**
27
+
28
+ #### KEY CONSIDERATIONS & ALTERNATIVE OPTIONS
29
+
30
+ ##### Breast Density:
31
+ Breast density refers to the amount of fibrous and glandular tissue in a breast compared to fatty tissue. Nearly half of all women have dense breasts.
32
+ - **What it means:** Having dense breasts is common and is a risk factor for breast cancer. It can also make it harder for mammograms to detect cancer, as both dense tissue and tumors can appear white on an X-ray.
33
+ - **Supplemental Screening:** Because of this, there is ongoing research into whether additional tests, like a breast ultrasound or MRI, could help find cancers missed by mammography in women with dense breasts. Currently, the USPSTF states there is not enough evidence to make a recommendation for or against these extra tests for women at average risk. This is an important topic to discuss with your doctor.
34
+
35
+ ##### Alternative Mammography Technology:
36
+ - **3D Mammography (Digital Breast Tomosynthesis or DBT):** This is an advanced type of mammogram that takes pictures of the breast from multiple angles to create a 3D-like image. Studies show it can find slightly more cancers and reduce the number of "false alarms" (when you are called back for more testing for something that isn't cancer), especially in women with dense breasts. Both 2D and 3D mammography are considered effective screening methods.
37
+
38
+ ##### Benefits and Harms of Screening:
39
+ - **Benefit:** The main benefit of screening is finding cancer early, when it is most treatable and curable.
40
+ - **Harms:** Screening is not perfect. It can lead to:
41
+ - **False Positives:** A result that looks like cancer but is not. This leads to anxiety and the need for more tests (like biopsies).
42
+ - **Overdiagnosis:** Finding and treating cancers that are so slow-growing they would never have caused a problem in a person's lifetime.
43
+ - **Radiation Exposure:** Mammograms use a very low dose of radiation. The benefit of finding cancer early is widely believed to outweigh this small risk.
44
+
45
+ ##### Breast Awareness:
46
+ Screening tests are important, but they don't find every cancer. It's crucial to be familiar with how your breasts normally look and feel. If you notice any changes—such as a new lump, skin dimpling, nipple changes, or persistent pain—see a doctor right away, even if your last mammogram was normal.
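A protocol file like this is plain YAML, so it can be inspected directly. Here is a minimal reading sketch; the field names come from the file above, while how sentinel's prompt builder actually consumes them is not shown here.

```python
from pathlib import Path

import yaml

protocol_path = Path("configs/knowledge_base/dx_protocols/mammography_screening.yaml")
protocol = yaml.safe_load(protocol_path.read_text())

print(protocol["key"])                # "mammography_screening"
print(protocol["name"])               # human-readable test name
print(protocol["typical_frequency"])  # screening cadence summary
# `additional_information` is a Markdown block suitable for injecting into prompts.
```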
configs/model/chatgpt_o1.yaml ADDED
@@ -0,0 +1,2 @@
1
+ provider: openai
2
+ model_name: o1