Agentic Safety Foundation-Sec GGUF Models

Overview

This repository contains GGUF-quantized versions of the Agentic Safety Foundation-Sec model, optimized for efficient inference with llama.cpp, Ollama, and LM Studio.

The model is specifically designed for cybersecurity analysis and agentic AI safety, providing expert-level insights for:

  • Security threat analysis
  • Vulnerability assessment
  • Incident response guidance
  • Secure agentic workflow design
  • MITRE ATT&CK framework integration

Available Models

Quantization | Size | Quality Retention | Use Case
Q4_K_M (Recommended) | 4.92 GB | 97-98% | Balanced performance and quality

Current Release: agentic-safety-foundation-sec-q4_k_m.gguf (4.92 GB)

Quantization Details

  • Tool: Unsloth 2025.11.6 | 4-bit (QLoRA)
  • Environment: NVIDIA DGX Spark | NVIDIA-SMI 580.95.05 | Driver Version: 580.95.05 | CUDA Version: 13.0

Recommended Model: Q4_K_M

For most use cases, we recommend the Q4_K_M quantization:

  • ✅ Excellent quality (97-98% of original model)
  • ✅ Compact size (~5 GB)
  • ✅ Fast inference speed
  • ✅ Optimal balance of performance and resource usage
  • ✅ Compatible with consumer hardware

Quick Start

Using llama.cpp

# Download and build llama.cpp (recent versions build with CMake; older releases used `make`)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run inference (CMake builds place binaries under build/bin; adjust the model path as needed)
./build/bin/llama-cli -m ../agentic-safety-foundation-sec-q4_k_m.gguf \
    -p "Analyze the following agentic workflow for security threats..." \
    -n 512 --temp 0.3
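
llama.cpp also ships an OpenAI-compatible HTTP server, which is convenient for longer analysis sessions or for wiring the model into other tools. A minimal sketch, assuming the same model path as above (the port and context size are arbitrary choices):

# Start a local OpenAI-compatible server
./build/bin/llama-server -m ../agentic-safety-foundation-sec-q4_k_m.gguf \
    -c 8192 --port 8080

# Query it from another terminal
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "List common prompt-injection risks in agentic workflows."}], "temperature": 0.3}'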

Using Ollama

# Install Ollama from https://ollama.com

# Create model from this repo
ollama create agentic-safety -f Modelfile

# Run interactive chat
ollama run agentic-safety
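
The ollama create command expects a Modelfile in the current directory. If you need to write one yourself, a minimal sketch might look like the following; the FROM path, parameters, and system prompt here are illustrative assumptions, not the repository's actual Modelfile:

# Modelfile (assumed layout)
FROM ./agentic-safety-foundation-sec-q4_k_m.gguf
PARAMETER temperature 0.3
PARAMETER num_ctx 8192
SYSTEM "You are a cybersecurity analyst specializing in agentic AI safety."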

Using Python (llama-cpp-python)

from llama_cpp import Llama

# Load model
llm = Llama(
    model_path="./agentic-safety-foundation-sec-q4_k_m.gguf",
    n_ctx=8192,
    n_threads=8
)

# Generate response
response = llm(
    "Analyze this workflow for security vulnerabilities...",
    max_tokens=512,
    temperature=0.3
)

print(response['choices'][0]['text'])
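
llama-cpp-python also exposes a chat-style API, which is closer to how the Ollama and LM Studio front-ends drive the model. A short sketch reusing the llm instance above (the system prompt is an illustrative assumption):

# Chat-style interface
chat = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity analyst focused on agentic AI safety."},
        {"role": "user", "content": "What are the main risks of giving an agent shell access?"}
    ],
    max_tokens=512,
    temperature=0.3
)

print(chat['choices'][0]['message']['content'])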

Using LM Studio (GUI)

  1. Download LM Studio
  2. Open LM Studio
  3. Go to "Search" → Download from Hugging Face
  4. Search for guerilla7/agentic-safety-gguf
  5. Download and start chatting!

Model Quantization Details

Format | Size | Quality Retention | Best For
F16 | 16 GB | 100% | Research, benchmarking
Q8_0 | 8 GB | 99% | High-quality production
Q4_K_M | 5 GB | 97-98% | Recommended for most uses
Q4_K_S | 4.5 GB | 95-96% | Resource-constrained environments

Quantization Method

This model uses the GGUF format with llama.cpp K-quants (block-wise quantization over super-blocks; see the re-quantization example after this list):

  • K_M = Medium - balanced quality and size
  • K_S = Small - prioritizes smaller size
  • Preserves most of the original model's capabilities
  • Optimized for CPU and GPU inference
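
If you want to produce a different quantization level yourself, llama.cpp's quantize tool can re-quantize an F16 GGUF. A sketch, assuming an F16 export of this model is available locally:

# Re-quantize an F16 GGUF to Q4_K_S (run from the llama.cpp checkout)
./build/bin/llama-quantize agentic-safety-foundation-sec-f16.gguf \
    agentic-safety-foundation-sec-q4_k_s.gguf Q4_K_S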

System Requirements

Minimum Requirements (Q4_K_M)

  • RAM: 8 GB
  • Disk Space: 5 GB
  • CPU: Modern multi-core processor (4+ cores recommended)
  • OS: Windows, macOS, or Linux

Recommended Requirements

  • RAM: 16 GB or more
  • Disk Space: 10 GB (for model + workspace)
  • CPU: 8+ cores for faster inference
  • GPU: Optional (CUDA/Metal acceleration supported)

Performance Expectations

  • CPU-only: 2-10 tokens/sec (depends on CPU)
  • With GPU acceleration: 20-50+ tokens/sec (see the offload example below)
  • Typical response time: 5-30 seconds for 100-200 tokens
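
GPU acceleration is controlled by how many layers you offload; a minimal sketch for both interfaces (a large -ngl value simply offloads every layer that fits, and n_gpu_layers=-1 does the same in Python):

# llama.cpp CLI: offload all layers (requires a CUDA or Metal build)
./build/bin/llama-cli -m agentic-safety-foundation-sec-q4_k_m.gguf -ngl 99 \
    -p "Analyze the following agentic workflow for security threats..."

# llama-cpp-python: same idea, reusing the import from the Python example above
llm = Llama(model_path="./agentic-safety-foundation-sec-q4_k_m.gguf", n_ctx=8192, n_gpu_layers=-1)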

Use Cases

🔒 Cybersecurity Analysis

Analyze the following CVE for exploitability and impact:
CVE-2024-XXXX...

🤖 Agentic AI Safety ⭐️⭐️⭐️

Review this multi-agent workflow for security vulnerabilities:
Agent 1: Data collector → Agent 2: Analyzer → Agent 3: Executor

πŸ›‘οΈ Incident Response

Provide step-by-step incident response for a detected ransomware attack
targeting our production database servers.

📊 MITRE ATT&CK Integration

Map this observed behavior to MITRE ATT&CK techniques:
[Observed behavior details...]

Generation Parameters

Recommended parameters for different use cases:

Security Analysis (High Precision)

--temp 0.1 --top-p 0.9 --repeat-penalty 1.1

Creative Security Scenarios

--temp 0.5 --top-p 0.95 --repeat-penalty 1.15

General Q&A

--temp 0.3 --top-p 0.9 --repeat-penalty 1.1
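
The same profiles map directly onto llama-cpp-python keyword arguments; for example, the high-precision settings with the llm instance from the Quick Start section:

# High-precision security analysis profile
response = llm(
    "Map this observed behavior to MITRE ATT&CK techniques: ...",
    max_tokens=512,
    temperature=0.1,
    top_p=0.9,
    repeat_penalty=1.1
)

print(response['choices'][0]['text'])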

Limitations

  • Domain Focus: Optimized for cybersecurity and safety analysis; may underperform on general domains
  • Context Window: 8K tokens (8192) - may need chunking for very long documents (see the sketch after this list)
  • Quantization Loss: ~2-3% quality reduction from original FP16 model
  • No Real-time Updates: Knowledge cutoff from training data (check base model for date)
  • Not a Replacement: Should augment, not replace, human security expertise
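
For the context-window limitation, here is a minimal chunking sketch using llama-cpp-python's tokenizer; the file name and token budget are illustrative assumptions:

from llama_cpp import Llama

llm = Llama(model_path="./agentic-safety-foundation-sec-q4_k_m.gguf", n_ctx=8192)

def chunk_text(text, max_tokens=6000):
    # Tokenize once, then split into windows that leave room for the prompt and the reply
    tokens = llm.tokenize(text.encode("utf-8"))
    return [
        llm.detokenize(tokens[i:i + max_tokens]).decode("utf-8", errors="ignore")
        for i in range(0, len(tokens), max_tokens)
    ]

for chunk in chunk_text(open("incident_report.txt").read()):
    result = llm(f"Summarize the security-relevant findings in this excerpt:\n{chunk}", max_tokens=256)
    print(result['choices'][0]['text'])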

License

Apache 2.0 - Inherited from the base Foundation-Sec model.

See LICENSE for full terms.

Base Model

This GGUF model is a quantized version of agentic-safety-foundation-sec-merged, a fine-tuned and merged checkpoint built on the Foundation-Sec base model (Llama architecture, 8B parameters).

Model Conversion Details

  • Converted by: guerilla7 | Ron F. Del Rosario
  • Conversion Date: 2025-12-08
  • Source Model: agentic-safety-foundation-sec-merged
  • Conversion Tool: llama.cpp convert_hf_to_gguf.py v1.0.0 (see the command sketch below)
  • Quantization: Q4_K_M (4-bit K-quant, medium variant)
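
For reference, a conversion along these lines would typically look roughly like the command below (the paths are assumptions; the exact invocation used for this release may have differed), followed by the llama-quantize step shown earlier:

# Convert the merged HF checkpoint to an F16 GGUF
python convert_hf_to_gguf.py ./agentic-safety-foundation-sec-merged \
    --outfile agentic-safety-foundation-sec-f16.gguf --outtype f16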

Citation

If you use this model in your research or applications, please cite:

@misc{agentic-safety-gguf-2025,
  author = {Ron F. Del Rosario},
  title = {Agentic Safety Foundation-Sec GGUF Models},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/guerilla7/agentic-safety-gguf}}
}

Support & Contributing

Issues or Questions?

  1. Check the USAGE.txt file for quick reference
  2. Review llama.cpp documentation
  3. Open an issue on this repo

Want to Contribute?

  • Report bugs or compatibility issues
  • Share your use cases and results
  • Suggest improvements or additional quantizations

Changelog

v1.0.0 (2025-12-08)

  • Initial release
  • Q4_K_M quantization
  • Optimized for cybersecurity and agentic AI safety use cases
  • Tested with llama.cpp, Ollama, and LM Studio

Download and start analyzing security threats today! 🚀🔒
