Offensive Security CTEM Penetration Testing

Strobes AI: The Agent Stack Specialized for Offensive Security

Prakash | March 27, 2026 | 8 min read

TL;DR

The Strobes AI agent stack isn't a single AI assistant - it's a coordinated system of purpose-built offensive security specialists, each with defined tools, permissions, and methodology scope, all sharing workspace memory and capable of running in parallel.

Most AI pentesting tools are wrappers. A language model in front of a scanner, a chatbot that generates nmap commands, or a dashboard that aggregates output from existing tools and calls it "AI-powered." None of that changes the underlying constraint: one tool, one target, one session at a time.

Strobes AI takes a different architectural bet. The agent stack it ships with isn't a single AI assistant - it's a coordinated system of purpose-built offensive security specialists, each with defined tools, permissions, and methodology scope, all sharing workspace memory and capable of running in parallel. This post breaks down exactly what that stack looks like, how it works, and why the architecture matters for teams running continuous exposure management programs.

What Is Strobes Exposure Management's Agent Architecture?

Strobes is an exposure management platform for managing vulnerabilities, assets, and security workflows from a unified interface. At its core, the Agents module is where the AI capabilities are fully realized. Each agent is a purpose-built specialist - not a general-purpose assistant repurposed for security tasks. They ship with predefined tools, execution scopes, and methodology coverage that maps to real offensive security disciplines.

The platform ships with a dozen system agents and supports custom agent creation through Agent Templates. System agents are available immediately and cover the full offensive security spectrum, from threat intelligence enrichment and vulnerability triage through to active penetration testing and compliance reporting. If your existing workflow already includes agentic pentesting on Strobes, these system agents are the engine underneath that capability.

The Agent Stack: Every Built-In Specialist

The table below maps each built-in agent to its capability set. These are not marketing descriptions - each agent has specific tool access, execution environments, and methodology coverage that determines what it can and cannot do.

Agent	Capability
Web Pentest Agent	Browser automation via Playwright, code execution, full OWASP WSTG coverage
Network Pentest Agent	Shell execution via workspace SSH, nmap, service enumeration, multi-host parallel testing
API Pentest Agent	REST and GraphQL testing with Python requests, httpx, and curl
Login & Auth Agent	Handles OTP, SSO, social login, email/password, CAPTCHA, MFA - produces reusable login scripts
Breach Simulation Agent	Safe, non-destructive exploitation validation to confirm true positives versus false positives
Attack Path Analyzer	Graph-algorithm analysis of asset relationships to find paths from external entry points to crown jewels
Code Review Agent	Autonomous source code analysis, SAST verification, vulnerability reachability
Code Reachability Analyzer	Builds call graphs to determine if SCA and SAST findings are actually exploitable
Exposure Assessment Agent	Cloud API, DNS probing, WAF and CDN detection for business sensitivity scoring
Threat Intel Agent	CVE, KEV, and exploit intelligence enrichment from threat feeds
AWS Agent	AWS CLI and boto3 for cloud reconnaissance, IAM, S3, EC2, and RDS assessment
Mobilization Agent	AWS tag and git history owner lookup, GitHub issue creation for finding assignment

The coverage here matters. A finding discovered by the Web Pentest Agent isn't siloed. It can be handed immediately to the Breach Simulation Agent for exploitation validation, then to the Threat Intel Agent for CVE enrichment, then to the Mobilization Agent for owner identification and ticket creation - all within the same workspace, without a human manually copying context between tools.
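To make the Attack Path Analyzer row in the table more concrete, here is a minimal sketch of one graph technique such an agent could use: a breadth-first search from an internet-facing asset to a crown-jewel asset. The asset names and edges are invented for illustration, and the post does not say which algorithm Strobes actually runs.

```python
from collections import deque

# Hypothetical asset graph: an edge A -> B means "an attacker on A can reach B".
ASSET_GRAPH = {
    "public-web": ["app-server"],
    "vpn-gateway": ["jump-host"],
    "app-server": ["internal-api", "cache"],
    "jump-host": ["internal-api"],
    "internal-api": ["customer-db"],
    "cache": [],
    "customer-db": [],
}


def shortest_attack_path(graph, entry, target):
    """Return the shortest entry -> target path as a list of assets, or None."""
    queue = deque([[entry]])
    visited = {entry}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_attack_path(ASSET_GRAPH, "public-web", "customer-db")
print(" -> ".join(path))
# prints: public-web -> app-server -> internal-api -> customer-db
```

Breadth-first search returns the path with the fewest hops, which is a reasonable first ranking; a production analyzer would likely also weight edges by exploitability.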

Skills: Teaching Agents New Techniques at Runtime

One of the more architecturally significant decisions in Strobes AI is the Skills system. Skills are modular, versioned instruction sets that extend what agents know how to do - following the open SKILL.md standard. They can be scoped per workspace, created via a natural-language Skill Generator, or uploaded as raw SKILL.md files.

This is not prompt engineering. A Skill defines a methodology: tools to use, phases to execute, what to record, and what constitutes a valid finding. The platform ships with active skills covering:

/attack-surface-recon: External reconnaissance covering subdomains, IPs, ASNs, cloud assets, and email or credential exposure. A 7-phase methodology with a strict "Map Everything, Exploit Nothing" boundary.
/crawl-webapps: Comprehensive web crawling using Playwright, katana, and gospider, supporting authenticated and unauthenticated crawling of SPAs built with React, Vue, and Angular.
/cloud-security-review-prowler: AWS security assessment via Prowler v4.x, covering CIS benchmark, FSBP checks, and IAM, S3, EC2, and RDS misconfiguration review in read-only mode.
/javascript-analysis: Client-side bundle analysis for API endpoint mapping.
/jwt-vulnerabilities: JWT vulnerability detection including algorithm confusion, none-algorithm, and key confusion attacks.
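As an illustration of the kind of check a skill like /jwt-vulnerabilities might encode, the sketch below decodes a JWT header and flags tokens that declare the none algorithm. It uses only the Python standard library; the helper names are hypothetical and this is not the skill's actual implementation.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, the form used in JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def declares_none_alg(token: str) -> bool:
    """True if the token header sets alg to "none" in any letter case."""
    header_segment = token.split(".")[0]
    header = json.loads(b64url_decode(header_segment))
    return str(header.get("alg", "")).lower() == "none"


def make_unsigned_token(header: dict, payload: dict) -> str:
    """Build a test token with an empty signature segment."""
    parts = [b64url_encode(json.dumps(p).encode()) for p in (header, payload)]
    return ".".join(parts) + "."


suspicious = make_unsigned_token({"alg": "None", "typ": "JWT"}, {"sub": "admin"})
normal = make_unsigned_token({"alg": "HS256", "typ": "JWT"}, {"sub": "admin"})
print(declares_none_alg(suspicious), declares_none_alg(normal))
# prints: True False
```

A token that declares none is only a finding if the server accepts it, so a real test would replay the unsigned token against the target and record the response.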

The ability to define and attach custom skills to specific workspaces means the platform can be extended without code changes. A red team engagement against a financial services target with specific OAuth flows can have a custom skill that encodes exactly the testing methodology for that environment.
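One way to picture what a Skill carries is as a small data structure holding the four elements a Skill is said to define: tools, phases, what to record, and what counts as a valid finding. The class, the field names, and the OAuth example values below are all hypothetical; the real format is a SKILL.md file whose schema this post does not reproduce.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    """Hypothetical in-memory view of a skill; not the SKILL.md schema."""
    name: str
    version: str
    tools: tuple
    phases: tuple
    record: tuple        # evidence to capture for each finding
    valid_finding: str   # plain-language acceptance criterion


@dataclass
class Workspace:
    name: str
    skills: dict = field(default_factory=dict)

    def attach(self, skill: Skill) -> None:
        # Skills are scoped per workspace; re-attaching a name replaces the old version.
        self.skills[skill.name] = skill


oauth_skill = Skill(
    name="/oauth-flow-review",  # invented skill name
    version="0.1.0",
    tools=("playwright", "httpx"),
    phases=("map-flows", "test-redirect-uri", "test-state-param"),
    record=("request", "response", "token-scope"),
    valid_finding="An authorization code or token reaches an attacker-controlled URI.",
)

engagement = Workspace("fin-services-redteam")
engagement.attach(oauth_skill)
print(sorted(engagement.skills))
# prints: ['/oauth-flow-review']
```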

Human in the Loop: Governance Without Friction

Autonomous pentesting raises a real governance question: what happens when an agent wants to exploit a finding, send a request to a live system, or create a ticket in your Jira? Strobes AI answers this with a structured Human in the Loop (HITL) system.

HITL can be toggled per conversation using the APPROVALS switch in the chat interface. When enabled, agent actions require explicit approval before execution - creating a full audit trail and gating sensitive operations like finding creation, status changes, or external integrations. The platform tracks four types of input requests:

Choice: The agent presents options and a human selects how to proceed. For example: "I've identified several vulnerabilities in the HTTP services. How would you like me to proceed?"
Browser Handover: The agent pauses and hands the browser to the operator for manual authentication - useful for reCAPTCHA or MFA-protected targets. The agent documents the handover and resumes once authentication is complete.
Custom Form: The agent requests structured input it cannot resolve autonomously: credentials, scope clarifications, or missing context.
Credentials: The agent requests attachment of a stored credential set for a specific action.

This matters for exposure validation workflows where the distinction between testing and exploitation must be enforced by policy, not convention. The HITL system is that enforcement mechanism.
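The gating behavior described in this section can be sketched as a per-conversation approval check that writes every decision to an audit log. This is a simplified model under stated assumptions: the action names, the Conversation class, and the approver callback are hypothetical stand-ins for the APPROVALS switch and a human reviewer.

```python
from dataclasses import dataclass, field

# Hypothetical names for the sensitive operations the post mentions:
# finding creation, status changes, and external integrations.
SENSITIVE_ACTIONS = {"create_finding", "change_status", "call_integration"}


@dataclass
class Conversation:
    approvals_enabled: bool
    audit_log: list = field(default_factory=list)

    def run(self, action: str, approver=None) -> bool:
        """Run an action, asking the approver first if it is gated."""
        gated = self.approvals_enabled and action in SENSITIVE_ACTIONS
        if gated:
            # With no approver available, a gated action is denied by default.
            approved = bool(approver and approver(action))
        else:
            approved = True
        self.audit_log.append((action, "gated" if gated else "auto", approved))
        return approved


chat = Conversation(approvals_enabled=True)
chat.run("crawl_endpoint")                               # not gated
chat.run("create_finding", approver=lambda a: True)      # gated, approved
chat.run("call_integration", approver=lambda a: False)   # gated, rejected
for entry in chat.audit_log:
    print(entry)
```

Denying by default when no approver responds is a design choice in this sketch; it keeps policy enforcement from silently degrading into convention.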

The Architecture Advantages That Actually Matter

The individual agents are useful. The architecture is the real product. Four properties separate this from a collection of AI-assisted tools:

Persistent Workspace Memory

An agent testing an application on day 3 of an engagement has full context from day 1: every crawled endpoint, every tested parameter, every credential tried. This context is stored in shared tables and learnings that any agent in the workspace can query. Traditional pentesting loses this context the moment an engagement ends. Strobes retains it indefinitely, making re-assessment a genuine comparison rather than starting over.
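A rough way to picture shared workspace tables is a small database that one agent writes to and another reads from days later. The sketch below uses an in-memory SQLite database as a stand-in; the table layout, column names, and agent labels are assumptions, not the Strobes schema.

```python
import sqlite3

# In-memory database standing in for a workspace's shared tables.
workspace = sqlite3.connect(":memory:")
workspace.execute(
    "CREATE TABLE endpoints ("
    "url TEXT PRIMARY KEY, found_by TEXT, day INTEGER, tested INTEGER DEFAULT 0)"
)


def record_endpoint(url: str, agent: str, day: int) -> None:
    """Store a crawled endpoint once; later duplicates are ignored."""
    workspace.execute(
        "INSERT OR IGNORE INTO endpoints (url, found_by, day) VALUES (?, ?, ?)",
        (url, agent, day),
    )


def untested_endpoints() -> list:
    """Any agent can ask what earlier work left untested."""
    rows = workspace.execute(
        "SELECT url FROM endpoints WHERE tested = 0 ORDER BY url"
    )
    return [url for (url,) in rows]


# Day 1: a crawler records what it saw and tests one endpoint.
record_endpoint("/login", "web-pentest", 1)
record_endpoint("/api/orders", "web-pentest", 1)
workspace.execute("UPDATE endpoints SET tested = 1 WHERE url = '/login'")

# Day 3: a different agent re-discovers /login and picks up the backlog.
record_endpoint("/login", "api-pentest", 3)
print(untested_endpoints())
# prints: ['/api/orders']
```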

Parallel Execution

A 20-host network segment gets all its services tested simultaneously, not sequentially. A web application gets all 11 WSTG test categories exercised in parallel. The time savings are multiplicative, not incremental. This is the property that makes continuous adversarial exposure validation operationally realistic rather than aspirational.
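The difference between sequential and parallel testing can be sketched with a thread pool that fans one task out per host. The scan function here is a placeholder that sleeps instead of touching the network, and the host addresses and service names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
import time

HOSTS = [f"10.0.0.{i}" for i in range(1, 21)]  # illustrative 20-host segment


def enumerate_services(host: str) -> tuple:
    """Placeholder for a per-host scan; sleeps instead of sending traffic."""
    time.sleep(0.05)
    return host, ["ssh", "http"]


def scan_segment(hosts, max_workers=20) -> dict:
    # One worker per host, so wall time tracks the slowest host, not the sum.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(enumerate_services, hosts))


start = time.perf_counter()
results = scan_segment(HOSTS)
elapsed = time.perf_counter() - start
print(f"{len(results)} hosts scanned in {elapsed:.2f}s")
```

Run sequentially, the same placeholder work would take about one second (20 hosts at 0.05 s each); with one worker per host it should finish in a fraction of that.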

Composable Agent Chains

No human copy-pasting context between tools. A finding moves automatically through the pipeline: discovery, validation, enrichment, assignment. The agents that handle each step are specialists. The workflow that connects them is defined once and runs consistently. This is what separates the Strobes architecture from earlier approaches to building an AI harness for offensive security.
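The discovery, validation, enrichment, and assignment pipeline can be modeled as a list of steps, each taking the previous step's output. The sketch below is a toy version under that assumption: the step functions, field names, and placeholder values are invented and do not reflect how Strobes agents exchange data.

```python
from functools import reduce


def discover(finding: dict) -> dict:
    return {**finding, "stage": "discovered"}


def validate(finding: dict) -> dict:
    # A real validator would attempt safe, non-destructive confirmation.
    return {**finding, "stage": "validated", "true_positive": True}


def enrich(finding: dict) -> dict:
    return {**finding, "stage": "enriched", "cve": finding.get("cve", "unknown")}


def assign(finding: dict) -> dict:
    return {**finding, "stage": "assigned", "owner": "team-placeholder"}


PIPELINE = [discover, validate, enrich, assign]


def run_chain(finding: dict, steps=PIPELINE) -> dict:
    """Pass one finding through every step, each receiving the prior output."""
    return reduce(lambda current, step: step(current), steps, finding)


result = run_chain({"title": "Reflected XSS on /search"})
print(result["stage"], result["owner"])
# prints: assigned team-placeholder
```

Because each step returns a new dictionary rather than mutating its input, the chain is defined once and behaves the same on every run.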

Replayable Workflows

A completed pentest can be re-run with a single click - "Re-run All" or "Restart from Phase X." Regression testing and continuous re-assessment of the same target become trivially executable. For teams managing large-scale continuous exposure programs, this property transforms pentesting from a quarterly point-in-time event into a persistent, always-on capability.
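The two replay modes can be sketched as a single runner that starts either at the first phase or at a named one. The phase names and the run_phase stand-in are invented for illustration; the post does not describe how the platform stores or resumes workflow state.

```python
PHASES = ["recon", "crawl", "auth", "injection", "reporting"]  # invented names


def run_phase(name: str, state: dict) -> None:
    """Stand-in for real work: just records that the phase ran."""
    state["executed"].append(name)


def run_workflow(phases, restart_from=None) -> dict:
    """Run every phase, or only those from restart_from onward."""
    state = {"executed": []}
    start = phases.index(restart_from) if restart_from else 0
    for name in phases[start:]:
        run_phase(name, state)
    return state


full_run = run_workflow(PHASES)                    # like "Re-run All"
partial = run_workflow(PHASES, restart_from="auth")  # like "Restart from Phase X"
print(partial["executed"])
# prints: ['auth', 'injection', 'reporting']
```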

Production Numbers, Not Demo Metrics

The numbers Strobes cites aren't from controlled demonstrations. Against a real web target, the platform executed 32 structured phases, found 42 vulnerabilities with working payloads, and ran 134 tool invocations - all documented, structured, and pushed directly into the CTEM findings pipeline that connects to ticketing systems, SLA engines, and risk dashboards.

Against a 20-host network segment, the Network Pentest Agent ran service enumeration and multi-host testing in parallel. The findings pipeline handled the same workflow: discovery to triage to ticket, without human intervention at each handoff.

Key Takeaways

Strobes AI ships with 12 purpose-built offensive security agents, each with defined tools, permissions, and methodology scope - not a general assistant repurposed for security.
The Skills system extends agent capabilities at runtime via versioned SKILL.md files, enabling custom methodology coverage without code changes.
Human in the Loop controls governance at the action level - not at the workflow level - giving teams a full audit trail and explicit gating on sensitive operations.
Parallel execution, persistent workspace memory, composable agent chains, and replayable workflows are the architectural properties that make continuous exposure management operationally viable.
For pentesters, the agent stack removes the ceiling on how much expertise can be applied simultaneously. For security engineers, it means continuous offensive validation is no longer resource-constrained.
