11 Ways to Find a Hidden Origin (and How to Stop Them)

The Premise

Putting your site behind Cloudflare gives you a CDN-shaped silhouette. Attackers don't see the real server, they see the edge. That's the deal Cloudflare sells, and for HTTP it works.

The problem is that "behind Cloudflare" doesn't mean "invisible." There are at least eleven distinct techniques an attacker can use to recover the real origin IP, and most of them require nothing more sophisticated than a Python script and a free API key. We know because we run all eleven on every scan we do.

This post is a tour of the pipeline. Not the marketing version, the actual one: what we query, in what order, what works, what doesn't, and what defenders can do about each.

The Setup: What "Origin Bypass" Actually Means

Every method below has the same goal: produce a list of candidate IP addresses that might be the real origin behind a CDN-fronted domain. The methods generate candidates. A separate validation step probes those candidates and decides which are actually serving the customer's content.

The validation step matters as much as the discovery. A candidate IP from passive DNS history is just a guess until we connect to it on port 443 and see whether it serves the customer's TLS cert and HTTP body. Skipping validation produces noisy reports full of false positives. Doing it well is the difference between "we found 18 candidates" and "we found one origin and it's serving production."

Discovery, Method by Method

Method 1: Certificate Transparency (crt.sh)

What it does: queries crt.sh for every TLS certificate ever issued to *.example.com. Each cert lists the subdomains it covers in its Subject Alternative Names. Resolve those subdomains, filter out anything that points at a known CDN range, and what's left is candidate origin IPs.

Free | Passive | Hit rate: high

Why it works so well: virtually every publicly trusted cert issued since 2018 has been logged publicly, and the logs are full of forgotten subdomains, internal-named services, and staging environments. internal-app.example.com, staging-old.example.com, vpn-corp.example.com, all there. Most companies have certs they don't even remember requesting, and many of them point straight at the origin.

How to defend: audit your CT logs (use crt.sh directly), put every subdomain behind the CDN, kill the certs you don't need.
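To make the crt.sh step concrete, here is a minimal sketch, not the production pipeline. It assumes crt.sh's JSON output (a list of rows whose name_value field holds newline-separated hostnames), and the CDN_RANGES list is a deliberately tiny illustrative subset. The function names are made up for this example.

```python
import ipaddress
import json
import socket
import urllib.parse
import urllib.request

# Illustrative subset of Cloudflare ranges (assumption: a real scan loads
# the full published list for every CDN it knows about).
CDN_RANGES = [ipaddress.ip_network(n) for n in
              ("104.16.0.0/13", "172.64.0.0/13", "2606:4700::/32")]


def is_cdn(ip):
    """True if the address falls inside one of the known CDN ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CDN_RANGES)


def fetch_crtsh(domain):
    """Download every logged cert row for %.domain (network call)."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    with urllib.request.urlopen(f"https://crt.sh/?{query}", timeout=30) as resp:
        return json.load(resp)


def names_from_crtsh(rows, domain):
    """Collect the hostnames under `domain` that appear in the cert rows."""
    names = set()
    for row in rows:
        for name in row.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard tells us only about its parent
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return names


def candidate_ips(names, resolve=socket.gethostbyname_ex):
    """Resolve each name (IPv4 only here) and keep the non-CDN addresses."""
    candidates = {}
    for name in sorted(names):
        try:
            _, _, ips = resolve(name)
        except OSError:  # NXDOMAIN, timeout, and similar
            continue
        for ip in ips:
            if not is_cdn(ip):
                candidates.setdefault(ip, set()).add(name)
    return candidates
```

Usage would look like candidate_ips(names_from_crtsh(fetch_crtsh("example.com"), "example.com")). The resolver is injectable so the filtering logic can be exercised without touching the network.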

Method 2: MX Record Resolution

What it does: pulls the MX records for the domain. If mail.example.com resolves to a non-CDN IP, that's almost always the origin (or a nearby host on the same network).

Free | Passive | Hit rate: medium

SMTP can't ride a free CDN tunnel, so the MX hostname is a reliable origin pointer when the company self-hosts mail.

How to defend: don't self-host receiving mail on the origin box. Use a managed receiver (SES, Postmark, Mailgun) and pull mail to the origin via IMAP through the tunnel.

Method 3: SPF Record Parsing

What it does: reads the domain's v=spf1 TXT record and extracts every ip4: directive. Those are the IPs allowed to send mail on behalf of the domain. They very often include the origin.

Free | Passive | Hit rate: high when present

SPF directives are publicly readable. The whole point of the record is to advertise "these are my outbound IPs" to mail receivers. Attackers read the same record.

How to defend: route outbound mail through a managed relay so the SPF only lists the relay's IPs, not yours.
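The SPF parsing step is small enough to sketch in full. This is an illustrative version that takes the TXT record as a string (fetching it needs a DNS library, which is left out). The text above mentions ip4: only; picking up ip6: as well is an addition for this example.

```python
import ipaddress


def spf_networks(txt_record):
    """Return the networks named by ip4:/ip6: mechanisms in a v=spf1 record."""
    terms = txt_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return []
    networks = []
    for term in terms[1:]:
        term = term.lstrip("+-~?")  # drop the optional qualifier
        mechanism, _, value = term.partition(":")
        if mechanism.lower() in ("ip4", "ip6") and value:
            try:
                # strict=False tolerates host bits set, e.g. 203.0.113.10/24
                networks.append(ipaddress.ip_network(value, strict=False))
            except ValueError:
                continue  # malformed directive, skip it
    return networks
```

include: and redirect= mechanisms point at other domains' records and are not followed here. Run your own record through this to see exactly what an attacker reads.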

Method 4: Common Subdomain Brute Force

What it does: tries the obvious names. direct, origin, real, backend, cpanel, whm, old, legacy, staging, dev, internal, vpn. If any resolve to non-CDN IPs, those are candidates.

Free | Passive | Hit rate: surprising

It feels lazy. It works embarrassingly often. The direct.example.com subdomain, literally meant for direct origin access, exists in production at companies you'd recognize.
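A sketch of the brute-force loop, using the wordlist from the description above. The CDN range list is an illustrative subset and the function names are assumptions for this example. It resolves IPv4 only.

```python
import ipaddress
import socket

WORDLIST = ["direct", "origin", "real", "backend", "cpanel", "whm",
            "old", "legacy", "staging", "dev", "internal", "vpn"]

# Illustrative subset; a real scan would load full CDN range lists.
CDN_RANGES = [ipaddress.ip_network(n) for n in ("104.16.0.0/13", "172.64.0.0/13")]


def is_cdn(ip):
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CDN_RANGES)


def brute_force(domain, words=WORDLIST, resolve=socket.gethostbyname_ex):
    """Map each guessed hostname that resolves to its non-CDN addresses."""
    hits = {}
    for word in words:
        host = f"{word}.{domain}"
        try:
            _, _, ips = resolve(host)
        except OSError:  # the name does not exist or the lookup failed
            continue
        non_cdn = [ip for ip in ips if not is_cdn(ip)]
        if non_cdn:
            hits[host] = non_cdn
    return hits
```

A wildcard DNS record would make every guess resolve, so a real implementation would also look up a random label first and discard answers that match it.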

Method 5: IPv6 AAAA Enumeration

What it does: queries AAAA records. If the domain has IPv6 and the v6 prefix isn't a known CDN's (Cloudflare's 2606:4700::/32, etc.), the v6 address is very likely the origin.

Free | Passive | Hit rate: low but reliable

Often forgotten when admins set up CDN protection only for IPv4. Cloudflare Tunnel terminates v6 too, but only if you configure it explicitly. Many setups don't.
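The AAAA check reduces to a prefix test. A minimal sketch using only the standard library follows. The CDN_V6 list contains just the Cloudflare prefix mentioned above; other CDNs' prefixes would need to be added.

```python
import ipaddress
import socket

# Only the Cloudflare prefix named in the text; extend for other CDNs.
CDN_V6 = [ipaddress.ip_network("2606:4700::/32")]


def aaaa_records(host):
    """Resolve the host's IPv6 addresses via the system resolver (network call)."""
    infos = socket.getaddrinfo(host, 443, family=socket.AF_INET6,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def non_cdn_v6(addresses, cdn_ranges=CDN_V6):
    """Keep the IPv6 addresses that sit outside every known CDN prefix."""
    return [a for a in addresses
            if not any(ipaddress.ip_address(a) in net for net in cdn_ranges)]
```

non_cdn_v6(aaaa_records("example.com")) returns the addresses worth passing to validation. An empty list means either no AAAA record or everything is behind the CDN.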

Method 6: Historical DNS (AlienVault OTX + HackerTarget)

What it does: queries passive DNS archives that record what every domain resolved to in the past. If example.com resolved to 203.0.113.10 two years ago, before Cloudflare was deployed, that record is still in the archive and the IP is still likely the origin.

Free | Passive | Hit rate: brutal

Until April 2026 we used SecurityTrails for this. Their free tier dried up, so we moved to AlienVault OTX (no auth required) and HackerTarget (50 free queries per day per IP). Same data, no cost. The "I added Cloudflare last year so I'm safe now" assumption breaks here, instantly.

How to defend: when you put a CDN in front of an origin, change the origin's IP. The old one is in every archive forever.
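Here is a hedged sketch of the passive DNS lookup against OTX. The endpoint path and the response fields (passive_dns, address, record_type, first, last) reflect our understanding of the public OTX API and should be treated as assumptions to verify against its documentation. The CDN range list is an illustrative subset.

```python
import ipaddress
import json
import urllib.request

# Assumed endpoint shape for OTX's passive DNS section.
OTX_URL = "https://otx.alienvault.com/api/v1/indicators/domain/{domain}/passive_dns"

# Illustrative subset; a real scan would load full CDN range lists.
CDN_RANGES = [ipaddress.ip_network(n) for n in ("104.16.0.0/13", "172.64.0.0/13")]


def fetch_passive_dns(domain):
    """Download the passive DNS history for a domain (network call)."""
    with urllib.request.urlopen(OTX_URL.format(domain=domain), timeout=30) as resp:
        return json.load(resp)


def historical_ips(payload):
    """Map each non-CDN address to the (first, last) timestamps it was seen."""
    seen = {}
    for record in payload.get("passive_dns", []):
        if record.get("record_type") not in ("A", "AAAA"):
            continue
        try:
            addr = ipaddress.ip_address(record.get("address", ""))
        except ValueError:
            continue  # CNAME targets and other non-IP values
        if any(addr in net for net in CDN_RANGES):
            continue
        first, last = record.get("first", ""), record.get("last", "")
        if str(addr) in seen:
            old_first, old_last = seen[str(addr)]
            # ISO-8601 timestamps sort correctly as plain strings.
            first, last = min(first, old_first), max(last, old_last)
        seen[str(addr)] = (first, last)
    return seen
```

The first and last timestamps matter: an address last seen just before the CDN cutover is a much better candidate than one that disappeared five years earlier.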

Method 7: Favicon Hash Correlation (Shodan)

What it does: fetches the favicon from the CDN-fronted domain, computes the MurmurHash3 of its base64-encoded bytes (the form Shodan indexes), then asks Shodan "what other IPs serve this exact favicon?" If the answer includes non-CDN IPs, those probably belong to the same site.

API key required | Active | Hit rate: surprisingly high

The favicon is the most visually-shared piece of a site, and developers almost never randomize it across environments. The same favicon is on the production CDN-fronted host, the staging host, the disaster-recovery host, the dev VM. Shodan indexes them all.
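The hash itself is worth seeing, because the encoding detail trips people up. Shodan's favicon hash is the signed 32-bit MurmurHash3 of the base64-encoded favicon, with the line breaks that Python's base64.encodebytes inserts. Most tools use the third-party mmh3 package for this. The sketch below reimplements the 32-bit variant in pure Python so it has no dependencies.

```python
import base64


def _rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data, seed=0):
    """MurmurHash3 x86 32-bit, returned as a signed int like mmh3.hash()."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    length = len(data)
    body_end = length - (length % 4)
    for i in range(0, body_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = data[body_end:]
    if tail:  # 1 to 3 leftover bytes
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    # Finalization mix
    h ^= length
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h - 0x100000000 if h & 0x80000000 else h


def shodan_favicon_hash(favicon_bytes):
    """Hash the favicon the way Shodan indexes it."""
    return murmur3_32(base64.encodebytes(favicon_bytes))


def shodan_favicon_query(favicon_bytes):
    """Build the Shodan search string for hosts serving this favicon."""
    return f"http.favicon.hash:{shodan_favicon_hash(favicon_bytes)}"
```

Feed it the raw bytes of /favicon.ico and paste the resulting query into Shodan. Running it against your own favicon is a quick way to see which of your hosts are indexed.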

Method 8: TLS Cert Fingerprint Search (Shodan)

What it does: grabs the TLS certificate served at example.com, computes its SHA-1 fingerprint, then asks Shodan "what other IPs serve this exact cert?" Same cert on a non-CDN IP is high-confidence proof of the origin.

API key required | Active | Hit rate: high when present

Until April 2026 we used Censys for this. Their free tier killed API access, so we now reuse the same Shodan key as Method 7. The query is ssl.cert.fingerprint:<sha1>.
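A minimal sketch of the fingerprint step with the standard library. Certificate verification is deliberately disabled because the goal is to read whatever cert the server presents, not to trust it. The function names are illustrative.

```python
import hashlib
import socket
import ssl


def fetch_cert_der(host, port=443, timeout=5.0):
    """Return the DER-encoded leaf cert served for `host` (network call)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be turned off before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # we only want to read the cert
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def sha1_fingerprint(cert_der):
    """SHA-1 over the DER bytes, as lowercase hex."""
    return hashlib.sha1(cert_der).hexdigest()


def shodan_cert_query(cert_der):
    """Build the Shodan search string for hosts serving this exact cert."""
    return f"ssl.cert.fingerprint:{sha1_fingerprint(cert_der)}"
```

shodan_cert_query(fetch_cert_der("example.com")) produces the search string. If the CDN issues its own edge cert for the domain, the fingerprint will match only other edge nodes, which is why this method is high-confidence only "when present."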

Method 10: TLS SNI Mismatch (no Method 9, FOFA was deleted)

What it does: this one runs during validation, not discovery. For every candidate IP from Methods 1-8 and 11, we connect to candidate-ip:443 twice: once with no SNI, once with the customer's domain as the SNI. We grab the cert each time and check two things. Does the cert's SAN list the customer's domain? And does the cert's fingerprint match the CDN's reference cert?

Free | Active | Highest single-signal confidence

If a candidate IP hands you a cert whose SAN includes the customer's domain, that IP is the origin. There is no other reasonable interpretation. If the cert's fingerprint matches the CDN's, the customer is serving the same cert from the unprotected origin, which confirms the origin just as firmly.

The "no SNI" variant is even worse: an attacker who knows only the IP and types https://203.0.113.10 in a browser sees a cert that confirms whose IP it is. That's a default-vhost leak, and it's depressingly common.

Method 11: GitHub Code Search

What it does: queries the GitHub code search API for files mentioning the customer's domain in .env, .yaml, .json, .tf, .conf, etc. Then regex-extracts IPs from the matched snippets and filters out CDN ranges. What's left is real-IP leaks committed by developers, contractors, or build pipelines.

Token required | Active | Hit rate: nasty when it hits

The leaks here are usually committed by accident: a developer pushes a docker-compose.yml with DB_HOST=203.0.113.10, or a Terraform file with the prod IP literal. GitHub's secret scanner won't flag it (it's an IP, not a credential), so it stays. We've seen full backend IPs in CI configs years after the developer who committed them left the company.

How to defend: pre-commit hooks that block IP literals near production hostnames; rotate origin IPs on a schedule; treat repo audits as part of the security posture.
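The extraction half of this method is a regex plus a filter. The sketch below assumes the response shape of GitHub's REST code search when text-match metadata is requested (items, each with text_matches, each with a fragment). Treat those field names as assumptions to check against GitHub's documentation. The same function works as the core of a pre-commit hook or a self-audit.

```python
import ipaddress
import re

# Dotted quads not embedded in a longer number or dotted string.
IPV4_RE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")

# Explicit non-routable list. ipaddress's is_private flag is avoided because
# it also covers the documentation ranges used as examples in this post.
NON_ROUTABLE = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
    "172.16.0.0/12", "192.168.0.0/16")]

# Illustrative subset; a real scan would load full CDN range lists.
CDN_RANGES = [ipaddress.ip_network(n) for n in ("104.16.0.0/13", "172.64.0.0/13")]


def leaked_ips(text, skip=NON_ROUTABLE + CDN_RANGES):
    """Return public, non-CDN IPv4 literals found in text, in first-seen order."""
    found = []
    for literal in IPV4_RE.findall(text):
        try:
            addr = ipaddress.ip_address(literal)
        except ValueError:
            continue  # not a valid address, e.g. 999.1.1.1
        if any(addr in net for net in skip):
            continue
        if literal not in found:
            found.append(literal)
    return found


def ips_from_search_response(payload):
    """Pull IP literals out of every matched fragment in a code search result."""
    results = {}
    for item in payload.get("items", []):
        for match in item.get("text_matches", []):
            for ip in leaked_ips(match.get("fragment", "")):
                results.setdefault(ip, set()).add(item.get("html_url", ""))
    return results
```

Four-part version strings such as 1.2.3.4 will also match, so every hit still goes through validation before it means anything.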

Method 12: SSRF Callback (Customer-Authorized)

What it does: we issue the customer a unique URL (something like https://api.ddactic.net/api/v1/callback/abc123def456). They paste it into one of their webhook configs: a Stripe webhook, a GitHub Actions notification, an internal monitoring alert, an image proxy. When their infrastructure hits the URL, our endpoint logs the source IP. Anything outside the customer's CDN ranges is a confirmed outbound origin leak.

Customer opt-in | Active | Definitive proof

This is the most expensive method to deploy (the customer has to do something) and the most decisive when they do. The other methods produce probabilistic candidates. This one produces packets, with the source IP printed on the envelope. If your origin ever sends a webhook, talks to a monitoring service, or fetches a remote resource through an image proxy, that traffic exits with your real IP. The SSRF callback proves it.

How to defend: route outbound traffic through the same egress proxy you'd use for outbound mail. Origin should never directly initiate connections to anything on the public internet.
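A callback receiver is only a few lines. This is a toy sketch of the idea, not the real endpoint: the in-memory HITS list, the handler name, and the port are all hypothetical. It reads the source address straight from the socket, so it assumes the listener is directly exposed rather than sitting behind a reverse proxy.

```python
import ipaddress
from http.server import BaseHTTPRequestHandler, HTTPServer

HITS = []  # in-memory log for the sketch; a real service would persist hits


def classify_hit(token, source_ip, cdn_ranges):
    """Record who called back and whether the source sits outside the CDN."""
    addr = ipaddress.ip_address(source_ip)
    behind_cdn = any(addr in net for net in cdn_ranges)
    return {"token": token, "source_ip": source_ip, "origin_leak": not behind_cdn}


class CallbackHandler(BaseHTTPRequestHandler):
    cdn_ranges = []  # set to the customer's CDN ranges before serving

    def do_GET(self):
        # The last path segment is the per-customer token, e.g. abc123def456.
        token = self.path.split("?", 1)[0].rstrip("/").rsplit("/", 1)[-1]
        HITS.append(classify_hit(token, self.client_address[0], self.cdn_ranges))
        self.send_response(204)
        self.end_headers()

    do_POST = do_GET  # webhooks usually POST; the body is ignored here

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def serve(port=8080):
    """Blocking helper: listen on all interfaces and log callbacks."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Defenders can run the same thing against themselves: point one webhook at a listener you control and look at which address shows up.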

Validation: Where Candidates Become Confirmations

Once the discovery methods finish, we have a deduplicated list of candidate IPs per CDN-fronted domain. The validation step is what makes the list trustworthy:

1. Fetch the CDN's HTTP fingerprint: status code, page title, content hash (with CDN injections stripped), content length, server header.
2. Grab the CDN's reference TLS cert: SHA-256, SAN list, CN.
3. For each candidate IP, probe: HTTP GET with Host: example.com, plus a TLS connect with no-SNI and with correct-SNI. Capture the responses.
4. Detect shielding: if the candidate returns a 403 with "Akamai" in the body, that's most likely SiteShield. If the connection is refused or times out, that may be an origin ACL. Both suggest that the candidate is real but the customer has an additional defense.
5. Match: combine HTTP and TLS signals. HTTP body hash match is medium-strong, TLS SAN match is near-certain, both is "very_high" confidence.
6. Merge confirmed origins back into the asset record. The customer's report shows: this CDN domain has these confirmed origin IPs, found via these methods, with this evidence.
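The matching step above can be sketched as a body hash plus a small decision table. Only the "very_high" label comes from the description; the other labels, the whitespace normalization, and the example strip pattern for CDN-injected scripts are assumptions made for illustration.

```python
import hashlib
import re

# Hypothetical example of a CDN injection to strip before hashing.
DEFAULT_STRIP = (r"<script[^>]*cdn-cgi[^>]*>.*?</script>",)


def body_hash(html, strip_patterns=DEFAULT_STRIP):
    """Hash the page body after removing CDN injections and extra whitespace."""
    for pattern in strip_patterns:
        html = re.sub(pattern, "", html, flags=re.S | re.I)
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def confidence(http_body_match, tls_san_match):
    """Combine the two signals into one label ('very_high' is from the text)."""
    if http_body_match and tls_san_match:
        return "very_high"
    if tls_san_match:
        return "high"      # assumed label for "near-certain"
    if http_body_match:
        return "medium"    # assumed label for "medium-strong"
    return "unconfirmed"   # assumed label for no match


def validate(cdn_html, candidate_html, tls_san_match):
    """Compare a candidate's response with the CDN reference and score it."""
    same_body = body_hash(cdn_html) == body_hash(candidate_html)
    return confidence(same_body, tls_san_match)
```

Stripping injections before hashing matters: without it, a page served through the edge and the same page served straight from the origin almost never hash the same.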

Defense: A Compact Checklist

You've seen the offense. Here's the defensive equivalent, ordered by impact:

1. Audit crt.sh for every cert your domain has issued. Kill the ones you don't recognize.
2. Move authoritative DNS, MX, and outbound mail off the origin. Origin should never be in any public DNS record except via a CDN.
3. Rotate the origin IP after deploying a CDN. The pre-CDN IP is in passive DNS forever.
4. Put IPv6 behind the same tunnel as IPv4, or remove the AAAA record entirely.
5. Configure a default vhost on the origin that returns blank or decoy content for IP-only requests.
6. Search GitHub for your own domains plus IP regexes. Find the leaks before someone else does.
7. Route every outbound HTTP, NTP, mail, and webhook from the origin through an egress proxy with a different IP.
8. Lock down the firewall on every port, not just 443. SSH/SMTP/HTTP banners on the raw IP confirm origin identity.
9. Wire one webhook into a callback test URL. Confirm the source IP that appears.

None of this is hard. All of it is skipped, regularly, by competent security teams who assumed "Cloudflare was enough."

The Workarounds for Protocols Tunnels Don't Cover

The hardest part of origin concealment isn't the methods above. It's that Cloudflare Tunnel only covers HTTP for ordinary public visitors. Authoritative DNS, SMTP receiving, raw TCP, UDP game traffic, IP-whitelisted partner connections: none of these ride a free tunnel.

The general pattern that works: cheap public frontend ↔ private tunnel ↔ hidden origin. Run a $5 VPS as the public-facing terminator for whichever protocol your tunnel can't carry. WireGuard from the VPS to the origin. The internet sees the VPS, the origin stays hidden, and if the VPS is identified you replace it.

We covered the protocol-by-protocol playbook in a separate post: Hiding Origins When Your CDN Won't Cover the Protocol. That's the matching defensive doc to this offensive one.

Why We Built This

Most CDN/WAF vendors pitch "protection." We pitch "tested protection." The difference is whether anyone has actually tried to find your origin from outside, with the same tools an attacker would use, before you find out under load.

The 11-method pipeline above is the test. It runs on every DDactic scan. If we find an origin, you get the IP, the method, the evidence, and a fix recommendation in the report. If we don't find one, you get the audit transcript proving we couldn't, which is the clearest evidence of "your origin concealment actually works" you'll ever have.

Don't trust the marketing. Run the test.

DDactic helps organizations validate that their origin concealment actually conceals. Contact us by email for a free origin-concealment audit, or run a self-scan at ddactic.net.