Agentic Safety Foundation-Sec GGUF Models

Overview

This repository contains GGUF-quantized versions of the Agentic Safety Foundation-Sec model, optimized for efficient inference with llama.cpp, Ollama, and LM Studio.

The model is specifically designed for cybersecurity analysis and agentic AI safety, providing expert-level insights for:

  • Security threat analysis
  • Vulnerability assessment
  • Incident response guidance
  • Secure agentic workflow design
  • MITRE ATT&CK framework integration

Available Models

Quantization | Size | Quality Retention | Use Case
Q4_K_M (Recommended) | 4.92 GB | 97-98% | Balanced performance and quality

Current Release: agentic-safety-foundation-sec-q4_k_m.gguf (4.92 GB)

Quantization Details

  • Tool: Unsloth 2025.11.6 | 4-bit (QLoRA)
  • Environment: NVIDIA DGX Spark | NVIDIA-SMI 580.95.05 | Driver Version: 580.95.05 | CUDA Version: 13.0

Recommended Model: Q4_K_M

For most use cases, we recommend the Q4_K_M quantization:

  • ✅ Excellent quality (97-98% of original model)
  • ✅ Compact size (~5 GB)
  • ✅ Fast inference speed
  • ✅ Optimal balance of performance and resource usage
  • ✅ Compatible with consumer hardware

Quick Start

Using llama.cpp

# Download and build llama.cpp (recent versions build with CMake; older releases used `make`)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run inference (CMake builds place binaries under build/bin; adjust the model path as needed)
./build/bin/llama-cli -m ../agentic-safety-foundation-sec-q4_k_m.gguf \
    -p "Analyze the following agentic workflow for security threats..." \
    -n 512 --temp 0.3
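
llama.cpp also ships an OpenAI-compatible HTTP server, which is convenient for longer analysis sessions or for wiring the model into other tools. A minimal sketch, assuming the same model path as above (the port and context size are arbitrary choices):

# Start a local OpenAI-compatible server
./build/bin/llama-server -m ../agentic-safety-foundation-sec-q4_k_m.gguf \
    -c 8192 --port 8080

# Query it from another terminal
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "List common prompt-injection risks in agentic workflows."}], "temperature": 0.3}'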

Using Ollama

# Install Ollama from https://ollama.com

# Create model from this repo
ollama create agentic-safety -f Modelfile

# Run interactive chat
ollama run agentic-safety
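
The ollama create command expects a Modelfile in the current directory. If you need to write one yourself, a minimal sketch might look like the following; the FROM path, parameters, and system prompt here are illustrative assumptions, not the repository's actual Modelfile:

# Modelfile (assumed layout)
FROM ./agentic-safety-foundation-sec-q4_k_m.gguf
PARAMETER temperature 0.3
PARAMETER num_ctx 8192
SYSTEM "You are a cybersecurity analyst specializing in agentic AI safety."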

Using Python (llama-cpp-python)

from llama_cpp import Llama

# Load model
llm = Llama(
    model_path="./agentic-safety-foundation-sec-q4_k_m.gguf",
    n_ctx=8192,
    n_threads=8
)

# Generate response
response = llm(
    "Analyze this workflow for security vulnerabilities...",
    max_tokens=512,
    temperature=0.3
)

print(response['choices'][0]['text'])
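
llama-cpp-python also exposes a chat-style API, which is closer to how the Ollama and LM Studio front-ends drive the model. A short sketch reusing the llm instance above (the system prompt is an illustrative assumption):

# Chat-style interface
chat = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity analyst focused on agentic AI safety."},
        {"role": "user", "content": "What are the main risks of giving an agent shell access?"}
    ],
    max_tokens=512,
    temperature=0.3
)

print(chat['choices'][0]['message']['content'])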

Using LM Studio (GUI)

  1. Download LM Studio
  2. Open LM Studio
  3. Go to "Search" → Download from Hugging Face
  4. Search for guerilla7/agentic-safety-gguf
  5. Download and start chatting!

Model Quantization Details

Format | Size | Quality Retention | Best For
F16 | 16 GB | 100% | Research, benchmarking
Q8_0 | 8 GB | 99% | High-quality production
Q4_K_M | 5 GB | 97-98% | Recommended for most uses
Q4_K_S | 4.5 GB | 95-96% | Resource-constrained environments

Quantization Method

This model uses the GGUF format with llama.cpp K-quants (block-wise quantization over super-blocks; see the re-quantization example after this list):

  • K_M = Medium - balanced quality and size
  • K_S = Small - prioritizes smaller size
  • Preserves most of the original model's capabilities
  • Optimized for CPU and GPU inference
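
If you want to produce a different quantization level yourself, llama.cpp's quantize tool can re-quantize an F16 GGUF. A sketch, assuming an F16 export of this model is available locally:

# Re-quantize an F16 GGUF to Q4_K_S (run from the llama.cpp checkout)
./build/bin/llama-quantize agentic-safety-foundation-sec-f16.gguf \
    agentic-safety-foundation-sec-q4_k_s.gguf Q4_K_S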

System Requirements

Minimum Requirements (Q4_K_M)

  • RAM: 8 GB
  • Disk Space: 5 GB
  • CPU: Modern multi-core processor (4+ cores recommended)
  • OS: Windows, macOS, or Linux

Recommended Requirements

  • RAM: 16 GB or more
  • Disk Space: 10 GB (for model + workspace)
  • CPU: 8+ cores for faster inference
  • GPU: Optional (CUDA/Metal acceleration supported)

Performance Expectations

  • CPU-only: 2-10 tokens/sec (depends on CPU)
  • With GPU acceleration: 20-50+ tokens/sec (see the offload example below)
  • Typical response time: 5-30 seconds for 100-200 tokens
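
GPU acceleration is controlled by how many layers you offload; a minimal sketch for both interfaces (a large -ngl value simply offloads every layer that fits, and n_gpu_layers=-1 does the same in Python):

# llama.cpp CLI: offload all layers (requires a CUDA or Metal build)
./build/bin/llama-cli -m agentic-safety-foundation-sec-q4_k_m.gguf -ngl 99 \
    -p "Analyze the following agentic workflow for security threats..."

# llama-cpp-python: same idea, reusing the import from the Python example above
llm = Llama(model_path="./agentic-safety-foundation-sec-q4_k_m.gguf", n_ctx=8192, n_gpu_layers=-1)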

Use Cases

🔒 Cybersecurity Analysis

Analyze the following CVE for exploitability and impact:
CVE-2024-XXXX...

🤖 Agentic AI Safety ⭐️⭐️⭐️

Review this multi-agent workflow for security vulnerabilities:
Agent 1: Data collector → Agent 2: Analyzer → Agent 3: Executor

πŸ›‘οΈ Incident Response

Provide step-by-step incident response for a detected ransomware attack
targeting our production database servers.

📊 MITRE ATT&CK Integration

Map this observed behavior to MITRE ATT&CK techniques:
[Observed behavior details...]

Generation Parameters

Recommended parameters for different use cases:

Security Analysis (High Precision)

--temp 0.1 --top-p 0.9 --repeat-penalty 1.1

Creative Security Scenarios

--temp 0.5 --top-p 0.95 --repeat-penalty 1.15

General Q&A

--temp 0.3 --top-p 0.9 --repeat-penalty 1.1
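
The same profiles map directly onto llama-cpp-python keyword arguments; for example, the high-precision settings with the llm instance from the Quick Start section:

# High-precision security analysis profile
response = llm(
    "Map this observed behavior to MITRE ATT&CK techniques: ...",
    max_tokens=512,
    temperature=0.1,
    top_p=0.9,
    repeat_penalty=1.1
)

print(response['choices'][0]['text'])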

Limitations

  • Domain Focus: Optimized for cybersecurity and safety analysis; may underperform on general domains
  • Context Window: 8K tokens (8192) - may need chunking for very long documents (see the sketch after this list)
  • Quantization Loss: ~2-3% quality reduction from original FP16 model
  • No Real-time Updates: Knowledge cutoff from training data (check base model for date)
  • Not a Replacement: Should augment, not replace, human security expertise
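
For the context-window limitation, here is a minimal chunking sketch using llama-cpp-python's tokenizer; the file name and token budget are illustrative assumptions:

from llama_cpp import Llama

llm = Llama(model_path="./agentic-safety-foundation-sec-q4_k_m.gguf", n_ctx=8192)

def chunk_text(text, max_tokens=6000):
    # Tokenize once, then split into windows that leave room for the prompt and the reply
    tokens = llm.tokenize(text.encode("utf-8"))
    return [
        llm.detokenize(tokens[i:i + max_tokens]).decode("utf-8", errors="ignore")
        for i in range(0, len(tokens), max_tokens)
    ]

for chunk in chunk_text(open("incident_report.txt").read()):
    result = llm(f"Summarize the security-relevant findings in this excerpt:\n{chunk}", max_tokens=256)
    print(result['choices'][0]['text'])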

License

Apache 2.0 - Inherited from the base Foundation-Sec model.

See LICENSE for full terms.

Base Model

This GGUF model is a quantized version of agentic-safety-foundation-sec-merged, a fine-tuned and merged checkpoint built on the Foundation-Sec base model (Llama architecture, 8B parameters).

Model Conversion Details

  • Converted by: guerilla7 | Ron F. Del Rosario
  • Conversion Date: 2025-12-08
  • Source Model: agentic-safety-foundation-sec-merged
  • Conversion Tool: llama.cpp convert_hf_to_gguf.py v1.0.0 (see the command sketch below)
  • Quantization: Q4_K_M (4-bit K-quant, medium variant)
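
For reference, a conversion along these lines would typically look roughly like the command below (the paths are assumptions; the exact invocation used for this release may have differed), followed by the llama-quantize step shown earlier:

# Convert the merged HF checkpoint to an F16 GGUF
python convert_hf_to_gguf.py ./agentic-safety-foundation-sec-merged \
    --outfile agentic-safety-foundation-sec-f16.gguf --outtype f16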

Citation

If you use this model in your research or applications, please cite:

@misc{agentic-safety-gguf-2025,
  author = {Ron F. Del Rosario},
  title = {Agentic Safety Foundation-Sec GGUF Models},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/guerilla7/agentic-safety-gguf}}
}

Support & Contributing

Issues or Questions?

  1. Check the USAGE.txt file for quick reference
  2. Review llama.cpp documentation
  3. Open an issue on this repo

Want to Contribute?

  • Report bugs or compatibility issues
  • Share your use cases and results
  • Suggest improvements or additional quantizations

Changelog

v1.0.0 (2025-12-08)

  • Initial release
  • Q4_K_M quantization
  • Optimized for cybersecurity and agentic AI safety use cases
  • Tested with llama.cpp, Ollama, and LM Studio

Download and start analyzing security threats today! 🚀🔒
