Finding every subdomain is the easy part. Knowing what to do with them is where most scanners stop and most security teams get stuck.
In a previous post, we broke down how DDactic queries 13 intelligence sources to discover subdomains and validates them with AI. That process, stage 1 of our pipeline, typically produces 40 to 200 verified assets per organization.
But a list of subdomains is not an attack surface assessment. It is a phone book. You still need to know what is running on each host, how it is protected, whether credentials have leaked, what vulnerabilities exist, and which assets an attacker would target first.
This post covers stages 2 through 7: the six stages that transform a list of domains into a prioritized, actionable test plan.
The Full Pipeline at a Glance
Each stage consumes the output of the previous one. Nothing runs in isolation, and nothing is wasted. Here is the complete flow:
The entire pipeline runs in 5 to 15 minutes for a single company, depending on the size of the attack surface. Results stream to the dashboard in real time as each stage completes.
Stage 2: Port Scanning
Stage 1 gives us a list of hostnames. Stage 2 answers a different question: what services are actually running on these hosts?
A subdomain that only serves HTTPS on port 443 presents a very different risk profile than one running SSH on port 22, a database admin panel on port 8080, and an unprotected API on port 3000. You cannot assess the attack surface without knowing what ports are open.
Tier-Aware Scanning
Not every scan needs the same depth. We use a tiered approach based on the customer's plan and the nature of the target:
| Tier | Ports Scanned | What It Catches |
|---|---|---|
| Basic | 5 ports (80, 443, 8080, 8443, 22) | Web services and SSH |
| Standard | 13 ports (+ 21, 25, 53, 110, 3306, 5432, 3389, 6379) | Databases, mail, FTP, RDP, Redis |
| Full | 30+ ports (+ SIP, DNS, custom app ports, high-range services) | VoIP, game servers, IoT, custom services |
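The tiered approach can be sketched as a port list per plan tier plus a plain TCP connect scan. This is a minimal illustration, not DDactic's scanner; the tier names and port lists mirror the table above.

```python
import socket

# Tier definitions from the table above (Full tier omitted for brevity).
TIER_PORTS = {
    "basic": [80, 443, 8080, 8443, 22],
    "standard": [80, 443, 8080, 8443, 22, 21, 25, 53, 110, 3306, 5432, 3389, 6379],
}

def ports_for_tier(tier: str) -> list[int]:
    """Return the port list to scan for a given plan tier."""
    return TIER_PORTS[tier]

def scan_host(host: str, tier: str, timeout: float = 1.0) -> list[int]:
    """TCP connect scan: return the subset of tier ports accepting connections."""
    open_ports = []
    for port in ports_for_tier(tier):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

A real scanner would parallelize the connect attempts and add banner grabbing, but the tier-selection logic is the part that matters here.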
CDN Filtering
Here is a subtlety that most port scanners miss entirely. When a hostname resolves to a CDN IP address (Cloudflare, Akamai, Fastly), scanning that IP's ports tells you about the CDN, not about the target. Port 80 and 443 are open because the CDN is listening, not because the origin server has those ports exposed.
Our scanner detects CDN-proxied hosts and filters them from port scan results. This eliminates false positives and avoids wasting time scanning infrastructure that belongs to a third party. The CDN layer gets its own analysis in stage 3.
Why This Matters for DDoS
Open ports that bypass CDN protection are direct paths to the origin server. An exposed database port or an API running on a non-standard port often has no DDoS mitigation at all. These are the assets that go down first.
What Open Ports Reveal
Port scan results feed directly into the next stages. Finding port 3306 (MySQL) or 5432 (PostgreSQL) open on a public IP means the database is internet-facing, likely without WAF protection. Port 6379 (Redis) with no authentication is a critical finding. Port 22 (SSH) tells us there is direct server access that could be targeted with brute-force or used as a DDoS vector against the authentication layer.
The port scan does not just enumerate services. It builds the topology map that every subsequent stage depends on.
Stage 3: L7 Reconnaissance
Knowing that port 443 is open tells you very little. Stage 3 probes the application layer to answer: what software is running, how is it configured, and what protection sits in front of it?
HTTP Fingerprinting
For every HTTP-serving asset, the pipeline collects:
- Response headers: Server type, framework identifiers, caching behavior, security headers (or lack thereof)
- TLS certificate details: Issuer, validity, SANs (Subject Alternative Names that often reveal additional hostnames)
- Response characteristics: Status codes, redirect chains, response sizes, timing
- Login detection: Whether the page contains authentication forms, OAuth flows, or API key input fields
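Two of the checks above, security-header coverage and login detection, are easy to illustrate as pure functions over an already-fetched response. A minimal sketch (the header list and regex are illustrative, not the full rule set):

```python
import re

# Headers any hardened HTTP response should carry (illustrative subset).
EXPECTED_SECURITY_HEADERS = [
    "strict-transport-security",
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
]

def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return expected security headers absent from a response (case-insensitive)."""
    present = {k.lower() for k in headers}
    return [h for h in EXPECTED_SECURITY_HEADERS if h not in present]

def looks_like_login(body: str) -> bool:
    """Crude login detection: password inputs or common OAuth markers."""
    return bool(re.search(r'type=["\']password["\']|oauth|/authorize\?', body, re.I))
```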
Technology Detection
The scanner matches responses against over 1,000 technology signatures to identify:
- Web frameworks (React, Angular, Next.js, Django, Rails, Spring)
- CMS platforms (WordPress, Drupal, Joomla, Contentful)
- Cloud hosting platforms (AWS, Azure, GCP, Vercel, Netlify)
- Load balancers and reverse proxies (Nginx, HAProxy, Envoy, Traefik)
- Analytics, tracking, and third-party scripts
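Signature matching itself is straightforward: each technology maps to one or more patterns tested against the response. A tiny illustrative slice (production rule sets run to 1,000+ signatures and also inspect cookies, script URLs, and meta tags):

```python
import re

# Minimal illustrative signature set.
SIGNATURES = {
    "WordPress": [r"wp-content/", r"wp-includes/"],
    "Next.js": [r"__NEXT_DATA__"],
    "Nginx": [r"^nginx"],  # matched against the Server header
}

def detect_technologies(server_header: str, body: str) -> set[str]:
    """Match the Server header and response body against signature regexes."""
    found = set()
    for tech, patterns in SIGNATURES.items():
        for pat in patterns:
            if re.search(pat, server_header, re.I) or re.search(pat, body, re.I):
                found.add(tech)
                break  # one hit per technology is enough
    return found
```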
Technology identification is not academic. A WordPress site with known plugin vulnerabilities is a different risk than a static site on Vercel. A Spring Boot API behind Nginx without rate limiting is a different target than one behind a managed API gateway.
WAF and CDN Identification
This is where L7 recon becomes directly relevant to DDoS resilience. For every asset, we determine:
- Which WAF vendor (if any) fronts the asset
- Which CDN provider serves the content
- Whether the WAF is in detection-only mode or actively blocking
- Whether the origin IP is discoverable despite CDN protection (see our CDN bypass post)
The Configuration Gap
Having a WAF is not the same as having a properly configured WAF. We frequently find organizations with enterprise-grade WAF subscriptions where rate limiting is disabled, bot management is in log-only mode, or DDoS protection thresholds are set so high they never trigger. Stage 3 detects these configuration gaps. For a deeper look at this problem, read our WAF configuration analysis.
Multi-Protocol Probing
L7 reconnaissance is not limited to HTTP. The pipeline also probes:
- DNS: Recursive resolution behavior, zone transfer attempts, DNSSEC validation
- SMTP: Mail server configuration, open relay detection, SPF/DKIM/DMARC records
- SIP: VoIP infrastructure exposure (increasingly common in enterprise environments)
- Direct-to-Router (D2R): Probing for network devices accessible from the public internet
Each protocol has its own DDoS attack vectors. A DNS server vulnerable to amplification, a mail server without rate limiting, or an exposed SIP gateway can each be leveraged for service disruption. The pipeline identifies these per-protocol risks rather than treating every asset as "just a web server."
Stage 4: Breach Database Integration
This stage often surprises people. Why does a DDoS resilience platform check breach databases?
Because credential exposure is attack surface. And it is the part of the attack surface that firewalls, CDNs, and WAFs cannot see.
What We Check
The pipeline queries multiple breach intelligence sources, including Have I Been Pwned (HIBP) and several commercial breach monitoring feeds. For each target organization, we:
- Harvest email addresses associated with the organization's domains through passive sources (search engines, public directories, certificate transparency logs)
- Check each address against breach databases to determine whether credentials have been exposed
- Correlate breached accounts with discovered assets to identify which services those credentials could access
Why Breach Data Matters for DDoS
Consider this scenario: an organization has invested heavily in Cloudflare Enterprise for their public website, AWS Shield Advanced for their API, and a managed scrubbing service for their network layer. Their perimeter looks solid.
But 340 employee email addresses appeared in a data breach two years ago. Some of those employees still use the same passwords. Now an attacker can:
- Authenticate to the VPN portal and access internal services that have zero DDoS protection
- Log in to admin panels discovered in stage 1 that sit behind the CDN but have no rate limiting for authenticated users
- Access API endpoints with valid tokens, bypassing bot detection and rate limiting that only applies to unauthenticated traffic
- Reach internal dashboards (Grafana, Jenkins, Kibana) that are internet-facing but rely on authentication as their only defense
Shadow IT Discovery
Breach data also reveals services that the security team may not know exist. When employee credentials appear in breach dumps associated with third-party SaaS tools, development platforms, or personal projects hosted on company domains, it often surfaces shadow IT that was never included in the organization's asset inventory.
The Correlation Step
Raw breach counts are not useful on their own. The value comes from correlating breach data with the assets discovered in stages 1-3. If we found an exposed VPN portal in stage 1 and 200 breached employee credentials in stage 4, those findings together represent a much higher risk than either one alone.
This correlation happens automatically. By the time the pipeline reaches stage 6 (AI analysis), it has both the infrastructure topology and the credential exposure data needed to assess combined risk.
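The correlation step can be sketched as a simple rule: login-bearing assets absorb extra risk in proportion to the organization's breached-credential count. The asset record shape and the scaling factor here are hypothetical, chosen only to make the idea concrete:

```python
def apply_credential_risk(assets: list[dict], breached_accounts: int) -> list[dict]:
    """Boost the risk score of login-bearing assets when org credentials
    are breached: a VPN portal plus 200 leaked passwords is a combined
    finding, not two separate ones. Scaling is illustrative."""
    boost = min(breached_accounts // 50, 4)  # cap the boost at +4
    for asset in assets:
        if asset.get("has_login"):
            asset["risk"] = asset.get("risk", 0) + boost
    return assets
```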
Stage 5: Active Reconnaissance
Stages 2-4 are largely passive: they probe, fingerprint, and query external databases. Stage 5 shifts to active testing, carefully probing each asset for exploitable conditions.
Sensitive Path Discovery
The scanner probes for paths that should not be publicly accessible:
- Configuration files: `.env`, `config.json`, `wp-config.php`, `.git/config`
- Backup files: `.sql`, `.bak`, `.tar.gz` files left in web-accessible directories
- Admin interfaces: `/admin`, `/wp-admin`, `/phpmyadmin`, `/console`
- API documentation: `/swagger`, `/api-docs`, `/graphql` (with introspection enabled)
- Debug endpoints: `/debug`, `/status`, `/health` pages that leak internal information
Like port scanning, path discovery is tier-aware. Basic scans check a focused list of high-signal paths. Full scans probe hundreds of paths informed by the technology stack detected in stage 3: if we identified WordPress, we check WordPress-specific paths; if we found a Spring Boot app, we probe Spring Actuator endpoints.
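Tier-aware, stack-informed path selection can be sketched as a base list extended by technology-specific lists. The path sets below are small illustrative samples, not the full probe lists:

```python
# High-signal paths checked on every scan (illustrative subset).
BASE_PATHS = ["/.env", "/.git/config", "/admin"]

# Extra paths keyed by the technologies stage 3 detected.
TECH_PATHS = {
    "WordPress": ["/wp-admin/", "/wp-login.php", "/xmlrpc.php"],
    "Spring Boot": ["/actuator", "/actuator/env", "/actuator/heapdump"],
}

def paths_to_probe(detected: set[str], tier: str = "basic") -> list[str]:
    """Start from the high-signal base list; full scans add stack-specific paths."""
    paths = list(BASE_PATHS)
    if tier == "full":
        for tech in detected:
            paths += TECH_PATHS.get(tech, [])
    return paths
```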
Vulnerability Template Matching
The pipeline runs over 10,000 vulnerability detection templates against each asset. These templates perform passive detection, identifying known vulnerabilities by their response signatures without sending exploit payloads. This includes:
- CVEs in detected software versions
- Misconfigured security headers
- Information disclosure through error pages
- Default credentials on management interfaces
- Exposed API endpoints with missing authentication
Cloud Storage Discovery
Many organizations have misconfigured cloud storage buckets (S3, Azure Blob, GCP Cloud Storage) that are publicly accessible. The active recon stage checks for storage resources associated with the target's domain names, brand names, and known project identifiers. A publicly readable backup bucket is both a data breach risk and an indicator of broader infrastructure hygiene problems.
Deep Crawl and Measurement
For assets that serve web content, the pipeline performs a constrained crawl to map the application structure, discover additional endpoints, and measure response characteristics under normal load. These baseline measurements become the reference point for stage 7's test plan, where we need to know what "normal" looks like before we can define what "under stress" means.
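The baseline itself is just summary statistics over normal-load response times. A minimal sketch of the kind of reference point stage 7 would compare against:

```python
import statistics

def baseline(samples_ms: list[float]) -> dict[str, float]:
    """Summarize normal-load response times so 'under stress' has a reference.
    Nearest-rank p95 is used for simplicity."""
    ordered = sorted(samples_ms)
    p95_idx = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_idx],
        "stdev_ms": statistics.pstdev(ordered),
    }
```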
Controlled and Scoped
Active reconnaissance never sends exploit payloads, never attempts to modify data, and never exceeds the scope defined by the domain ownership verification in the customer's account. It probes for the existence of vulnerabilities through response analysis, not through exploitation.
Stage 6: AI-Powered Analysis
By stage 6, the pipeline has accumulated a substantial dataset: hundreds of subdomains, port scan results, technology fingerprints, WAF detection data, breach exposure counts, vulnerability findings, and crawl data. A human analyst could spend hours reviewing this. The AI analysis stage processes it in under 30 seconds.
Asset Classification
The first AI task is classification. Every discovered asset gets labeled with its role in the organization's infrastructure:
| Classification | Examples | DDoS Relevance |
|---|---|---|
| Customer-facing portal | my.company.com, app.company.com | High, direct revenue impact |
| API endpoint | api.company.com, gateway.company.com | Critical, often bypasses CDN cache |
| Internal tool | jenkins.company.com, grafana.company.com | Medium, operational disruption |
| Marketing site | www.company.com, blog.company.com | Lower, usually CDN-cached |
| Parked/defensive registration | company-typo.com, companyx.com | None, filtered from results |
This classification step is what separates a useful assessment from a noisy one. Without it, a security team receives a flat list of 150 domains and has to manually determine which ones matter. With it, they immediately see that 8 are customer portals, 12 are API endpoints, 6 are internal tools exposed to the internet, and 40 are parked domains that can be safely ignored.
Filtering Parked Domains and Defensive Registrations
Organizations often register dozens of domain variants (typosquatting protection, brand defense, future projects) that resolve to parking pages or redirect to the main site. Including these in a security assessment adds noise without adding value.
The AI identifies parked domains by combining signals: hosting on known parking services, identical redirect destinations, absence of unique content, WHOIS registration patterns consistent with defensive registration. These domains are flagged and filtered so the assessment focuses on assets that actually carry risk.
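Several of those signals are simple enough to express as rules before any model is involved. A sketch that counts parked-domain signals over a hypothetical asset record (the nameserver list and thresholds are illustrative):

```python
# Nameservers of known parking services (illustrative sample).
PARKING_NAMESERVERS = {"ns1.sedoparking.com", "ns1.bodis.com"}

def parked_score(asset: dict) -> int:
    """Count parked-domain signals; two or more usually means filter the asset."""
    score = 0
    if asset.get("nameserver") in PARKING_NAMESERVERS:
        score += 1
    if asset.get("redirects_to_root"):      # redirects to the org's main site
        score += 1
    if asset.get("content_bytes", 0) < 1024:  # near-empty page, no unique content
        score += 1
    return score
```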
Priority Scoring
Each asset receives a priority score based on multiple factors:
- Business impact: Customer portals and revenue-generating APIs rank higher than internal tools
- Protection level: Assets without WAF or CDN protection rank higher than those behind enterprise-grade defenses
- Exposure indicators: Open non-standard ports, missing security headers, known vulnerabilities
- Credential risk: Assets where breached credentials could provide authenticated access
- Technology risk: Outdated software versions, known-vulnerable frameworks
The output is a ranked list. Not "here are 150 things to worry about," but "here are the 12 assets that should keep you up at night, in order."
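Conceptually, the ranking is a weighted combination of the five factors above. A sketch with hypothetical weights (not DDactic's actual model), where each signal is scored 0-5:

```python
# Illustrative weights over the five factors listed above.
WEIGHTS = {
    "business_impact": 3,
    "unprotected": 4,
    "exposure": 2,
    "credential_risk": 2,
    "tech_risk": 1,
}

def priority_score(signals: dict[str, int]) -> int:
    """Weighted sum of factor signals; missing signals count as zero."""
    return sum(WEIGHTS[k] * signals.get(k, 0) for k in WEIGHTS)

def rank(assets: list[dict]) -> list[dict]:
    """Return assets ordered from most to least urgent."""
    return sorted(assets, key=lambda a: priority_score(a["signals"]), reverse=True)
```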
Protection Gap Identification
The AI cross-references what it knows about the organization's defensive posture with what a complete defense should look like. Common gaps it identifies:
- API endpoints served directly from origin servers while marketing sites are CDN-protected
- Rate limiting configured for the main domain but missing on subdomains
- WAF rules in detection-only mode (logging but not blocking)
- DDoS protection thresholds set above realistic attack volumes
- Authentication endpoints without bot detection or challenge mechanisms
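Gap identification of this kind can be expressed as rules comparing observed posture to a minimal expected baseline. A sketch over a hypothetical asset record shape:

```python
def protection_gaps(asset: dict) -> list[str]:
    """Compare observed posture to a minimal expected baseline.
    The record keys here (cdn, waf_mode, rate_limited, kind) are illustrative."""
    gaps = []
    if not asset.get("cdn"):
        gaps.append("no CDN in front of origin")
    if asset.get("waf_mode") == "detect":
        gaps.append("WAF in detection-only mode")
    if asset.get("kind") == "api" and not asset.get("rate_limited"):
        gaps.append("API without rate limiting")
    return gaps
```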
AI Cost
The entire AI analysis stage, including asset classification, priority scoring, and gap identification, costs under $0.02 per scan. It uses fast inference models optimized for structured data analysis, not large language models generating creative text. Speed matters as much as accuracy: the analysis adds less than 30 seconds to the pipeline runtime.
Stage 7: Test Plan Generation
The final stage transforms all findings into a concrete test plan. This is the output that security teams actually act on.
From Findings to Attack Vectors
The test plan generator maps each finding to specific attack techniques. This is not a generic list of "things that could go wrong." It is a tailored mapping based on what the pipeline actually observed:
```
Finding: API endpoint at api.company.com:443
- No CDN protection detected
- Rate limiting: none observed
- Technology: Node.js/Express
- Authentication: API key (header)

Mapped Test Vectors:
1. HTTP flood (direct to origin, no CDN absorption)
2. Slowloris (Node.js single-threaded event loop)
3. API abuse (expensive query patterns)
4. Authentication endpoint stress
   (brute-force rate with no limiting)

Priority: CRITICAL
Reason: Revenue-generating API with no L7 protection
```
Attack Vector Selection
DDactic maintains a matrix of over 200 distinct attack vectors across multiple protocol layers and architecture types. The test plan generator does not select from this matrix randomly. It uses the data from all previous stages to determine which vectors are relevant:
- Protocol layer: HTTP/1.1, HTTP/2, HTTP/3, DNS, SIP, SMTP, raw TCP/UDP
- Architecture type: Direct-to-origin, CDN-proxied, load-balanced, API gateway, serverless
- Defense posture: Unprotected, WAF-only, CDN+WAF, full scrubbing
- Technology stack: Framework-specific vectors (e.g., Spring4Shell for Spring Boot, ReDoS for regex-heavy parsers)
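Selecting from the vector matrix amounts to filtering on those dimensions. A sketch over a three-entry slice of such a matrix (the real one has 200+ entries and more dimensions):

```python
# Tiny illustrative slice of an attack-vector matrix.
VECTORS = [
    {"name": "http-flood", "layers": {"http"}, "arch": {"direct", "cdn"}},
    {"name": "slowloris", "layers": {"http"}, "arch": {"direct"}},
    {"name": "dns-amplification", "layers": {"dns"}, "arch": {"direct"}},
]

def relevant_vectors(layer: str, arch: str) -> list[str]:
    """Return vectors matching the asset's protocol layer and architecture."""
    return [v["name"] for v in VECTORS
            if layer in v["layers"] and arch in v["arch"]]
```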
Prioritized Test Schedule
The test plan is not just a list of vectors. It is a prioritized schedule that tells the security team (or DDactic's automated testing infrastructure) what to test first and why:
1. Critical assets without protection: API endpoints and customer portals serving traffic directly from origin servers
2. Assets with misconfigured protection: WAF in detection-only mode, rate limits set too high, bot detection disabled
3. Assets with credential exposure: Services where breached credentials could bypass perimeter defenses
4. Assets with known vulnerabilities: Software versions with published DDoS-relevant CVEs
5. Properly protected assets: Testing that defenses actually work as configured under realistic load
Actionable Output
The test plan includes specific remediation recommendations for each finding. These are not generic advice like "enable rate limiting." They are vendor-specific CLI commands and configuration changes based on the exact technology stack detected in stage 3. If you are running Cloudflare, you get Cloudflare commands. If you are behind AWS WAF, you get AWS WAF rules.
Why Sequential Stages Matter
You might wonder: why not run everything in parallel and save time?
Because each stage depends on the output of the previous ones, and that dependency is what makes the results useful.
- Port scanning (stage 2) needs stage 1's output to know which hosts to scan and which are CDN-proxied (and should be filtered)
- L7 recon (stage 3) needs stage 2's output to know which ports to probe for HTTP services, and which services run non-HTTP protocols
- Active recon (stage 5) needs stage 3's output to select the right vulnerability templates based on detected technology stacks
- AI analysis (stage 6) needs everything to classify assets accurately and identify protection gaps
- Test plan generation (stage 7) needs everything to map findings to the correct attack vectors for each specific architecture
A flat, parallel scan that checks subdomains, ports, and vulnerabilities independently produces a spreadsheet. A sequential pipeline that builds context at each stage produces an assessment.
What This Looks Like in Practice
Here is a simplified example of how the pipeline's stages compound to produce findings that no single stage could generate alone:
```
Stage 1: Discovers staging.company.com
Stage 2: Finds ports 443, 3000, 5432 open
Stage 3: Port 443 = React app, Port 3000 = Express API,
         Port 5432 = PostgreSQL. No WAF detected.
Stage 4: 12 developer credentials breached (company.com
         domain in 2024 breach)
Stage 5: /api-docs publicly accessible on port 3000,
         GraphQL introspection enabled, .env file
         exposed at /.env
Stage 6: AI classifies as "staging environment with
         production database connection" (priority: CRITICAL)
Stage 7: Test plan includes: direct DB connection test,
         API abuse via documented endpoints, credential
         stuffing against developer accounts

Combined finding: Staging environment with production
data, no perimeter defense, full API documentation
public, developer credentials compromised.
```
No single stage produces this conclusion.
All seven together do.
The Gap Between Discovery and Assessment
Most attack surface management tools stop at discovery. They give you a list of assets, maybe with some port information and basic fingerprinting. That is valuable, but it is the beginning of the work, not the end.
The gap between "here are your assets" and "here is what an attacker would do with them" is where organizations are most vulnerable. Security teams receive asset inventories and then have to manually determine risk, prioritize remediation, and design test plans. That manual process takes weeks, and by the time it is complete, the attack surface has changed.
DDactic's 7-stage pipeline closes that gap automatically. From a company name to a prioritized, actionable test plan in under 15 minutes.
See Your Full Attack Surface
Run a free scan and see what all 7 stages discover about your organization. No account required. Results in minutes, not weeks.
Start a Free Scan