# Agentic Safety Foundation-Sec GGUF Models

## Overview
This repository contains GGUF-quantized versions of the Agentic Safety Foundation-Sec model, optimized for efficient inference with llama.cpp, Ollama, and LM Studio.
The model is specifically designed for cybersecurity analysis and agentic AI safety, providing expert-level insights for:
- Security threat analysis
- Vulnerability assessment
- Incident response guidance
- Secure agentic workflow design
- MITRE ATT&CK framework integration
## Available Models

| Quantization | Size | Quality Retention | Use Case |
|---|---|---|---|
| Q4_K_M (recommended) | 4.92 GB | 97-98% | Balanced performance and quality |

**Current Release:** `agentic-safety-foundation-sec-q4_k_m.gguf` (4.92 GB)
### Quantization Details

- Tool: Unsloth 2025.11.6, 4-bit (QLoRA)
- Environment: NVIDIA DGX Spark, NVIDIA-SMI 580.95.05, Driver Version 580.95.05, CUDA Version 13.0
### Recommended Model: Q4_K_M

For most use cases, we recommend the Q4_K_M quantization:

- Excellent quality (97-98% of the original model)
- Compact size (~5 GB)
- Fast inference speed
- Optimal balance of performance and resource usage
- Compatible with consumer hardware
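As a sanity check on the memory requirements below, total RAM is roughly the model file loaded into memory plus the KV cache for the context window. A rough estimate, assuming LLaMA-3-style 8B architecture numbers (32 layers, 8 KV heads, head dimension 128, f16 cache); these defaults are assumptions, not values read from this GGUF file:

```python
def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_val=2):
    """KV cache size: keys and values (factor of 2) stored per layer,
    per context position, per KV head, per head dimension."""
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_val

def estimated_ram_gb(model_file_gb, n_ctx=8192):
    """Model weights plus KV cache; ignores smaller runtime buffers."""
    return model_file_gb + kv_cache_bytes(n_ctx) / 1024**3

print(f"{estimated_ram_gb(4.92):.2f} GB")  # ~5.9 GB, comfortably under the 8 GB minimum
```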
## Quick Start

### Using llama.cpp

```bash
# Download and build llama.cpp (recent versions build with CMake; `make` is no longer supported)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run inference
./build/bin/llama-cli -m ../agentic-safety-foundation-sec-q4_k_m.gguf \
  -p "Analyze the following agentic workflow for security threats..." \
  -n 512 --temp 0.3
```
### Using Ollama

Install Ollama from https://ollama.com, then:

```bash
# Create the model from this repo's Modelfile
ollama create agentic-safety -f Modelfile

# Run an interactive chat
ollama run agentic-safety
```
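The `ollama create` step expects a Modelfile alongside the GGUF file. If you are setting one up from a local download, a minimal sketch looks like the following; the path, parameter values, and system prompt here are illustrative assumptions, not this repo's actual Modelfile:

```
FROM ./agentic-safety-foundation-sec-q4_k_m.gguf
PARAMETER temperature 0.3
PARAMETER num_ctx 8192
SYSTEM "You are a cybersecurity analysis assistant specializing in agentic AI safety."
```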
### Using Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./agentic-safety-foundation-sec-q4_k_m.gguf",
    n_ctx=8192,
    n_threads=8,
)

# Generate a response
response = llm(
    "Analyze this workflow for security vulnerabilities...",
    max_tokens=512,
    temperature=0.3,
)
print(response["choices"][0]["text"])
```
### Using LM Studio (GUI)

- Download and open LM Studio
- Go to "Search" → "Download from Hugging Face"
- Search for guerilla7/agentic-safety-gguf
- Download and start chatting!
## Model Quantization Details
| Format | Size | Quality Retention | Best For |
|---|---|---|---|
| F16 | 16 GB | 100% | Research, benchmarking |
| Q8_0 | 8 GB | 99% | High-quality production |
| Q4_K_M | 5 GB | 97-98% | Recommended for most uses |
| Q4_K_S | 4.5 GB | 95-96% | Resource-constrained environments |
## Quantization Method

This model uses the GGUF format with K-quants (K-means quantization):

- `K_M` = Medium: balanced quality and size
- `K_S` = Small: prioritizes smaller size
- Preserves most of the original model's capabilities
- Optimized for CPU and GPU inference
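As a toy illustration of the k-means idea behind the naming, the sketch below clusters a weight vector into 16 levels, which is what a 4-bit code can index. This is only a conceptual model: the actual GGUF K-quant formats work block-wise with per-block scales and minimums and are considerably more involved.

```python
import random

def kmeans_quantize(weights, n_levels=16, iters=25, seed=0):
    """Toy 1-D k-means quantizer: maps each weight to one of n_levels
    centroids (4 bits index 16 levels). Illustrative only."""
    random.seed(seed)
    centroids = random.sample(list(weights), n_levels)
    for _ in range(iters):
        # Assignment step: each weight joins its nearest centroid's bucket
        buckets = [[] for _ in range(n_levels)]
        for w in weights:
            i = min(range(n_levels), key=lambda k: abs(w - centroids[k]))
            buckets[i].append(w)
        # Update step: centroids move to their bucket's mean
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in enumerate(buckets)]
    # Final codes: 4-bit index per weight, plus the shared codebook
    codes = [min(range(n_levels), key=lambda k: abs(w - centroids[k]))
             for w in weights]
    return codes, centroids
```

Storing one 4-bit code per weight plus a small shared codebook is where the ~4x size reduction over f16 comes from.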
## System Requirements

### Minimum Requirements (Q4_K_M)
- RAM: 8 GB
- Disk Space: 5 GB
- CPU: Modern multi-core processor (4+ cores recommended)
- OS: Windows, macOS, or Linux
### Recommended Requirements
- RAM: 16 GB or more
- Disk Space: 10 GB (for model + workspace)
- CPU: 8+ cores for faster inference
- GPU: Optional (CUDA/Metal acceleration supported)
### Performance Expectations
- CPU-only: 2-10 tokens/sec (depends on CPU)
- With GPU acceleration: 20-50+ tokens/sec
- Typical response time: 5-30 seconds for 100-200 tokens
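These ranges follow from memory bandwidth: single-stream decoding must read essentially all model weights once per generated token, so throughput is capped near bandwidth divided by model size. A back-of-envelope sketch (the bandwidth figures are illustrative assumptions, not measurements):

```python
def tokens_per_sec_upper_bound(mem_bandwidth_gb_s, model_size_gb):
    """Single-batch decoding is memory-bound: every generated token streams
    the full weight file from memory once, so throughput is capped at
    bandwidth / model size. Real speeds land below this bound."""
    return mem_bandwidth_gb_s / model_size_gb

print(tokens_per_sec_upper_bound(50, 4.92))   # ~10 tok/s: typical desktop CPU memory bandwidth
print(tokens_per_sec_upper_bound(300, 4.92))  # ~61 tok/s: mid-range GPU memory bandwidth
```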
## Use Cases

### Cybersecurity Analysis

```
Analyze the following CVE for exploitability and impact:
CVE-2024-XXXX...
```

### Agentic AI Safety

```
Review this multi-agent workflow for security vulnerabilities:
Agent 1: Data collector → Agent 2: Analyzer → Agent 3: Executor
```

### Incident Response

```
Provide step-by-step incident response for a detected ransomware attack
targeting our production database servers.
```

### MITRE ATT&CK Integration

```
Map this observed behavior to MITRE ATT&CK techniques:
[Observed behavior details...]
```
## Generation Parameters

Recommended parameters for different use cases:

### Security Analysis (High Precision)

```
--temp 0.1 --top-p 0.9 --repeat-penalty 1.1
```

### Creative Security Scenarios

```
--temp 0.5 --top-p 0.95 --repeat-penalty 1.15
```

### General Q&A

```
--temp 0.3 --top-p 0.9 --repeat-penalty 1.1
```
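If you drive the model from llama-cpp-python rather than the CLI, the same settings map onto the `temperature`, `top_p`, and `repeat_penalty` keyword arguments. A hypothetical helper (the preset names here are our own, not part of any API):

```python
# CLI flags --temp / --top-p / --repeat-penalty correspond to the
# temperature / top_p / repeat_penalty kwargs in llama-cpp-python.
PRESETS = {
    "security_analysis":  {"temperature": 0.1, "top_p": 0.9,  "repeat_penalty": 1.1},
    "creative_scenarios": {"temperature": 0.5, "top_p": 0.95, "repeat_penalty": 1.15},
    "general_qa":         {"temperature": 0.3, "top_p": 0.9,  "repeat_penalty": 1.1},
}

def sampling_kwargs(preset="general_qa"):
    """Return a copy of the chosen preset, ready to unpack into a call."""
    return dict(PRESETS[preset])

# Usage: llm(prompt, max_tokens=512, **sampling_kwargs("security_analysis"))
```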
## Limitations
- Domain Focus: Optimized for cybersecurity and safety analysis; may underperform on general domains
- Context Window: 8K tokens (8192) - may need chunking for very long documents
- Quantization Loss: ~2-3% quality reduction from original FP16 model
- No Real-time Updates: Knowledge cutoff from training data (check base model for date)
- Not a Replacement: Should augment, not replace, human security expertise
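For the chunking mentioned above, a simple sketch that splits a long document into overlapping character windows; the ~4 characters per token ratio is a rough heuristic for English, so count real tokens with the model's tokenizer when you need an exact budget:

```python
def chunk_text(text, max_chars=16000, overlap=1000):
    """Split a long document into overlapping chunks that fit the 8K-token
    window: 16000 chars is roughly 4000 tokens, leaving room for the prompt
    and the response. Overlap preserves context across chunk boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        start += max_chars - overlap
    return chunks
```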
## License
Apache 2.0 - Inherited from the base Foundation-Sec model.
See LICENSE for full terms.
## Base Model
This GGUF model is a quantized version of:
- Base Model: fdtn-ai/Foundation-Sec-8B-Instruct
- Architecture: LLaMA 3
- Parameters: 8 Billion
- Specialization: Cybersecurity and AI safety
## Model Conversion Details

- Converted by: guerilla7 | Ron F. Del Rosario
- Conversion Date: 2025-12-08
- Source Model: agentic-safety-foundation-sec-merged
- Conversion Tool: llama.cpp `convert_hf_to_gguf.py` v1.0.0
- Quantization: Q4_K_M (4-bit with K-means clustering)
## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{agentic-safety-gguf-2025,
  author = {Ron F. Del Rosario},
  title = {Agentic Safety Foundation-Sec GGUF Models},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/guerilla7/agentic-safety-gguf}}
}
```
## Support & Contributing

### Issues or Questions?
- Check the USAGE.txt file for quick reference
- Review llama.cpp documentation
- Open an issue on this repo
### Want to Contribute?
- Report bugs or compatibility issues
- Share your use cases and results
- Suggest improvements or additional quantizations
## Additional Resources

- llama.cpp GitHub
- Ollama Documentation
- LM Studio
- GGUF Specification
- OWASP Agentic Security Initiative (ASI)
- MITRE ATLAS
- MITRE ATT&CK Framework
## Changelog

### v1.0.0 (2025-12-08)
- Initial release
- Q4_K_M quantization
- Optimized for cybersecurity and agentic AI safety use cases
- Tested with llama.cpp, Ollama, and LM Studio
Download and start analyzing security threats today!