Bot detection vendors promise to separate humans from machines. They deploy JavaScript challenges, fingerprint browsers, analyze behavior, and serve CAPTCHAs. But what do they actually check? We reverse-engineered the challenge logic of 20 vendors to find out.
The answer is less reassuring than their marketing suggests. Most vendors check a subset of the available detection signals, not all of them. And the reason is not a technical limitation. It is a business decision: optimizing for conversion (minimizing false positives on real users) rather than security (catching every bot).
We built a scanner that fully bypasses 15 of the 17 vendors we tested live, with partial results against the remaining two. Not by exploiting zero-day vulnerabilities. By understanding exactly which signals each vendor checks and providing the right answers.
Methodology: How We Analyzed 20 Vendors
Our approach combined static analysis with live testing across several months of research. For each vendor, we followed a consistent process:
- JavaScript deobfuscation. Bot detection vendors inject challenge scripts into protected pages. These scripts are heavily obfuscated. We deobfuscated each vendor's challenge JS to identify the specific browser properties and behaviors being collected.
- Signal enumeration. For each vendor, we cataloged every browser property, API call, and behavioral signal their JS collects. This includes navigator properties, canvas rendering, WebGL queries, timing measurements, event listeners, and DOM manipulation checks.
- TLS and network-layer analysis. Some vendors (notably DataDome) perform detection at the TLS handshake level before any JavaScript executes. We captured and analyzed TLS Client Hello fingerprints to understand which vendors use network-layer signals.
- Live bypass testing. We built vendor-specific stealth profiles and tested them against live, protected production sites. A "pass" means our automated scanner successfully loads the protected page and extracts content without triggering a block or challenge.
Scope and Limitations
This research covers the default configurations of each vendor. Enterprise customers with custom rules, additional challenge layers, or tuned sensitivity thresholds may see different results. We tested against production deployments, not vendor sandboxes.
The Detection Signal Matrix
Every bot detection system checks some combination of the following signals. The matrix below shows which vendors check which signals, based on our JS deobfuscation analysis. A Y means we confirmed the signal is collected and evaluated; an H means the vendor additionally weights that signal heavily in its scoring; a dash means we did not observe the signal being collected.
| Detection Signal | CF | Akamai | Imperva | DataDome | PX | hCaptcha | Arkose | F5 | Radware | Sucuri |
|---|---|---|---|---|---|---|---|---|---|---|
| JS execution check | H | Y | Y | Y | Y | Y | Y | Y | Y | H |
| Cookie handling | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Canvas fingerprint | - | H | Y | Y | H | - | Y | Y | - | - |
| WebGL renderer | - | Y | Y | Y | Y | - | Y | Y | - | - |
| Screen resolution | - | Y | Y | Y | Y | - | Y | Y | Y | - |
| Timezone / language | Y | Y | Y | Y | Y | - | Y | Y | Y | - |
| Installed plugins | - | Y | Y | - | Y | - | - | Y | - | - |
| Mouse / touch events | - | Y | - | Y | H | Y | H | Y | - | - |
| Timing patterns | Y | Y | Y | Y | H | Y | Y | Y | - | - |
| DOM manipulation | Y | - | Y | - | Y | - | - | - | - | Y |
| TLS fingerprint | Y | Y | Y | H | Y | - | - | Y | - | - |
| IP reputation | H | Y | Y | H | Y | H | Y | Y | Y | Y |
| User-Agent consistency | H | Y | Y | Y | Y | - | - | Y | Y | Y |
| CDP/automation flags | Y | Y | Y | Y | H | - | Y | Y | - | - |
Legend: H = heavily weighted signal, Y = checked, - = not observed. CF = Cloudflare, PX = PerimeterX/HUMAN.
The first thing that stands out: only one vendor's challenge touches every signal in the matrix. The most comprehensive vendors (PerimeterX, Akamai, DataDome) check 12 to 14 signals. The lightest (Sucuri, Radware) check only 4 to 6. The average across all 20 vendors we analyzed is 8 signals.
The Subset Problem
If a vendor checks 8 out of 14 possible signals, an attacker only needs to spoof those 8 correctly. The 6 unchecked signals are irrelevant. This turns bot detection from a hard problem (fool everything) into a tractable one (fool the specific checks this vendor runs).
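The subset problem can be expressed directly: the spoofing workload is exactly the vendor's signal set, and everything outside it can stay at automation defaults. The vendor subsets below are simplified from the matrix above:

```python
# The full signal space from the matrix, and two simplified vendor
# subsets (per our analysis: Sucuri checks ~5 signals, Akamai 13 of 14).
ALL_SIGNALS = {
    "js_execution", "cookies", "canvas", "webgl", "screen",
    "timezone", "plugins", "mouse_events", "timing", "dom",
    "tls", "ip_reputation", "ua_consistency", "cdp_flags",
}

VENDOR_SIGNALS = {
    "sucuri": {"js_execution", "cookies", "dom", "ip_reputation", "ua_consistency"},
    "akamai": ALL_SIGNALS - {"dom"},
}

def signals_to_spoof(vendor: str) -> set:
    """Only the signals this vendor actually evaluates."""
    return VENDOR_SIGNALS[vendor]

def irrelevant_signals(vendor: str) -> set:
    """Signals we can leave at automation-framework defaults."""
    return ALL_SIGNALS - VENDOR_SIGNALS[vendor]
```

Against Sucuri, nine of the fourteen signals never need to be touched; against Akamai, only one.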
Deep Dive: 5 Major Vendors
Cloudflare: UA-Based Detection + Turnstile
Primary Signals: JS challenge, User-Agent consistency, IP reputation. Detection strength: Moderate.
Cloudflare's bot detection centers on two mechanisms: their managed JS challenge and Turnstile. The JS challenge verifies that a real browser engine is executing JavaScript and setting cookies. Turnstile serves as a CAPTCHA replacement that runs in the background.
Our analysis confirmed that Cloudflare's challenge logic relies heavily on User-Agent consistency checks. The challenge verifies that the UA string matches the actual browser capabilities exposed via navigator properties. If you claim to be Chrome 120 but your JS engine does not support Chrome 120 APIs, the challenge fails.
What Cloudflare does not check in its standard challenge: canvas fingerprint, WebGL renderer, screen resolution, installed plugins, or mouse movement patterns. These are signals that other vendors use extensively.
Cloudflare's strength is its IP reputation database. With roughly 20% of all web traffic flowing through its network, Cloudflare has unmatched visibility into IP behavior across millions of sites. A residential IP with clean history passes easily. A datacenter IP with known bot traffic gets escalated to a harder challenge.
Bypass approach: Use a real browser engine (not a spoofed UA), set cookies correctly, use residential or clean IPs. Our scanner passes Cloudflare's standard challenge consistently.
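The UA-consistency check works roughly like this sketch: map JS APIs to the Chrome milestone that introduced them, then verify the claimed version against what the engine actually exposes. The two-entry capability table is illustrative (milestones approximate), not Cloudflare's actual data:

```python
import re

# Illustrative capability table: JS API -> Chrome major version that
# shipped it (approximate milestones, for demonstration only).
API_INTRODUCED_IN = {
    "navigator.userAgentData": 90,
    "Array.prototype.at": 92,
}

def claimed_chrome_major(ua: str):
    """Extract the Chrome major version claimed in the UA string."""
    m = re.search(r"Chrome/(\d+)", ua)
    return int(m.group(1)) if m else None

def ua_consistent(ua: str, supported_apis: set) -> bool:
    """A client claiming Chrome N must expose every API shipped at or
    before N, and must not expose APIs shipped after N."""
    major = claimed_chrome_major(ua)
    if major is None:
        return False
    for api, introduced in API_INTRODUCED_IN.items():
        present = api in supported_apis
        if introduced <= major and not present:
            return False  # claims a version whose APIs are missing
        if introduced > major and present:
            return False  # exposes APIs "from the future"
    return True
```

A spoofed UA on an old engine fails the first branch; this is why running a real browser engine beats faking the string.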
Imperva (Incapsula): Cookie Rotation + JS Eval
Primary Signals: JS eval, cookie chain, device fingerprint. Detection strength: Moderate.
Imperva's bot detection uses a multi-stage cookie validation flow. The initial request returns a challenge page with obfuscated JavaScript. This script must execute, compute a response value, and set specific cookies. Subsequent requests must carry these cookies to prove that JS execution occurred.
The fingerprinting is broader than Cloudflare's. Imperva collects canvas data, WebGL renderer strings, screen dimensions, timezone, language, and installed plugins. These are hashed into a device fingerprint that accompanies the challenge response.
The weakness we identified: Imperva checks that a fingerprint exists and is internally consistent, but does not appear to maintain a large-scale fingerprint reputation database comparable to Cloudflare's IP database. A freshly generated, consistent fingerprint passes without additional scrutiny. Imperva also does not heavily weight mouse or touch event analysis in its standard configuration.
Bypass approach: Execute the challenge JS in a real browser context, generate a consistent fingerprint, rotate cookies correctly. The challenge math is deterministic once deobfuscated.
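To make the flow concrete, here is a minimal simulation of the challenge-cookie pattern described above. The challenge math (SHA-256 of a server nonce) and the cookie names (modeled on Incapsula's naming) are illustrative stand-ins, not Imperva's actual implementation:

```python
import hashlib

def server_issue_challenge(nonce: str) -> dict:
    """First response: a challenge page the client-side JS must solve."""
    return {"status": 403, "challenge_nonce": nonce}

def client_solve(challenge: dict) -> dict:
    """What the deobfuscated challenge JS does: compute a deterministic
    answer and store it in cookies carried on every later request.
    Cookie names here are hypothetical, modeled on Incapsula's naming."""
    answer = hashlib.sha256(challenge["challenge_nonce"].encode()).hexdigest()
    return {"visid_incap_demo": "session-id", "incap_ses_demo": answer}

def server_validate(nonce: str, cookies: dict) -> bool:
    """Later requests pass only if the cookie chain proves JS executed."""
    expected = hashlib.sha256(nonce.encode()).hexdigest()
    return cookies.get("incap_ses_demo") == expected
```

The point of the sketch: once the computation is known, the client side is fully reproducible, which is why a deterministic challenge proves JS execution but nothing about who is executing it.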
DataDome: TLS-First Detection
Primary Signals: TLS fingerprint, IP reputation, device fingerprint. Detection strength: Strong.
DataDome is architecturally different from most competitors. Detection begins at the TLS handshake, before any JavaScript executes. DataDome analyzes the TLS Client Hello message to extract a JA3/JA4 fingerprint and compares it against known browser TLS profiles.
This is significant because most bot frameworks (Puppeteer, Playwright, Selenium) use a TLS stack that produces fingerprints distinct from real browsers. A headless Chrome instance using Puppeteer's default TLS settings generates a JA3 fingerprint that DataDome flags immediately, before the page even loads.
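The JA3 technique that DataDome (among others) builds on is public: five Client Hello fields are serialized into a string and MD5-hashed. A minimal implementation, with illustrative rather than captured values:

```python
import hashlib

def ja3(version: int, ciphers, extensions, curves, point_formats) -> str:
    """JA3 per the open-source Salesforce spec: TLS version, cipher
    suites, extensions, elliptic curves, and EC point formats as
    decimal values; list items joined by '-', fields by ','; MD5 of
    the resulting string."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Two stacks offering the same ciphers in a different ORDER produce
# different fingerprints -- which is why a bot framework's TLS stack is
# distinguishable even when it supports the same feature set.
chrome_like = ja3(771, [4865, 4866, 4867], [0, 11, 10], [29, 23], [0])
reordered   = ja3(771, [4867, 4866, 4865], [0, 11, 10], [29, 23], [0])
```

Ordering sensitivity is the whole trick: the handshake leaks implementation identity, not just capability.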
On top of TLS fingerprinting, DataDome deploys a JS challenge that collects device properties, canvas data, WebGL info, and behavioral signals. IP reputation scoring is aggressive. Datacenter IPs are blocked by default in many DataDome configurations.
DataDome is one of two vendors (along with PerimeterX) that our scanner initially failed against. The TLS fingerprint was the blocking signal. We addressed this by using a browser automation approach that inherits the real browser's TLS stack rather than establishing its own connections. However, DataDome's IP-level blocking remains effective against datacenter-originated traffic.
Status: Partially bypassed. JS challenge passes, but IP-based blocking remains effective when using datacenter IPs.
PerimeterX (HUMAN): Behavioral Analysis
Primary Signals: CDP events, mouse patterns, timing, canvas. Detection strength: Strong.
PerimeterX (now HUMAN Security) has the most sophisticated behavioral analysis of any vendor we tested. Their challenge JS instruments the page extensively, monitoring Chrome DevTools Protocol (CDP) events, mouse movement trajectories, click timing patterns, scroll behavior, and keyboard input cadence.
The CDP event monitoring is particularly interesting. PerimeterX checks for the presence of automation-related CDP domains (like Runtime.evaluate calls and Page.addScriptToEvaluateOnNewDocument) that indicate a page is being controlled by automation tools. Standard Puppeteer and Playwright usage triggers these detections.
Canvas fingerprinting is heavily weighted. PerimeterX does not just check that a canvas fingerprint exists. It compares the fingerprint against expected values for the claimed browser and GPU combination. Claiming to be Chrome on Windows with an NVIDIA GPU but producing a canvas fingerprint consistent with a Linux software renderer triggers a mismatch flag.
The timing analysis checks for patterns consistent with automated interaction: perfectly uniform click intervals, zero-delay page transitions, and scrolling at constant velocity. Real human behavior is noisy. Automated behavior is smooth.
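The uniformity heuristic described above can be sketched as a coefficient-of-variation test over inter-event intervals. The 10% cutoff is our illustration, not PerimeterX's actual threshold:

```python
import statistics

def looks_automated(intervals_ms: list, cv_threshold: float = 0.10) -> bool:
    """Flag interaction whose inter-event intervals are too uniform.
    Real human timing is noisy; scripted timing clusters tightly
    around its mean."""
    if len(intervals_ms) < 3:
        return False  # not enough data to judge
    mean = statistics.mean(intervals_ms)
    cv = statistics.stdev(intervals_ms) / mean  # coefficient of variation
    return cv < cv_threshold

human_clicks = [210, 340, 180, 520, 290]  # noisy, irregular
bot_clicks = [250, 251, 250, 249, 250]    # near-uniform
```

This is also why naive `sleep(250)` loops are detectable and injected timing noise is a standard stealth countermeasure.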
Status: Bypassed, but required significant effort. We had to suppress CDP automation signals, inject realistic mouse movement patterns, and match canvas fingerprints to the claimed browser profile.
Akamai Bot Manager: The Kitchen Sink
Primary Signals: Canvas, plugins, UA, TLS, timing, behavior. Detection strength: Strong.
Akamai's Bot Manager deploys the most comprehensive signal collection of any vendor we analyzed, checking 13 out of 14 signals in our matrix. Their challenge JS collects canvas fingerprints, WebGL renderer data, screen properties, timezone, language, installed plugins, font enumeration, audio context fingerprints, and more.
Akamai also performs TLS fingerprinting, though it weights it less heavily than DataDome. The primary detection path is through JavaScript signal analysis combined with behavioral telemetry.
Despite the breadth of signal collection, Akamai's scoring model appears to be tuned conservatively. Individual signals that do not match expected values contribute to a risk score rather than triggering immediate blocks. This means that getting most signals right is sufficient to pass, even if one or two are slightly off. The scoring threshold for a "block" decision is set high enough that legitimate users with unusual browser configurations are not falsely blocked.
This is a direct reflection of Akamai's customer base. Large e-commerce sites using Akamai cannot tolerate false positives on real customers. A customer using a privacy-focused browser with non-standard font settings must not be blocked. So the scoring model is forgiving, and that forgiveness extends to well-crafted bot profiles.
Bypass approach: Provide a comprehensive, internally consistent fingerprint. Akamai checks many signals but does not require perfection on any single one. Our scanner passes by matching the top 10 signals accurately.
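A minimal sketch of this kind of additive risk scoring; the weights and block threshold are our assumptions, not Akamai's actual values:

```python
# Illustrative weights: mismatched signals add risk rather than
# triggering an immediate block.
SIGNAL_WEIGHTS = {
    "canvas": 3, "ua_consistency": 3, "webgl": 2, "tls": 2,
    "timing": 2, "plugins": 1, "fonts": 1, "timezone": 1,
}
BLOCK_THRESHOLD = 6  # set high so no single-signal mismatch blocks alone

def risk_score(mismatched_signals: set) -> int:
    """Sum the weights of every signal that failed its expected value."""
    return sum(SIGNAL_WEIGHTS.get(s, 0) for s in mismatched_signals)

def decision(mismatched_signals: set) -> str:
    return "block" if risk_score(mismatched_signals) >= BLOCK_THRESHOLD else "allow"
```

The tolerance that protects a privacy-browser user with unusual fonts (one low-weight mismatch) is the same tolerance that lets a profile with one or two imperfect signals through.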
The Remaining 15 Vendors: Quick Results
Beyond the five deep dives, we analyzed 15 additional vendors. The table below summarizes key findings for 12 of them.
| Vendor | Primary Detection | Signals Checked | Scanner Result |
|---|---|---|---|
| AWS WAF Bot Control | JS challenge + IP reputation | 6 | Pass |
| Sucuri | JS redirect + cookie | 4 | Pass |
| Radware Bot Manager | JS challenge + device ID | 5 | Pass |
| F5 Shape Security | JS obfuscation + telemetry | 11 | Pass (tuned) |
| Fastly Signal Sciences | Request anomaly + rate | 5 | Pass |
| Azure Front Door WAF | Rule-based + managed rules | 4 | Pass |
| Barracuda WAF | Signature + rate limit | 3 | Pass |
| Fortinet FortiWeb | ML anomaly + signatures | 5 | Pass |
| GeeTest | CAPTCHA + behavior | 8 | Pass (with solver) |
| Distil Networks | JS fingerprint + behavior | 9 | Pass (tuned) |
| Arkose Labs | Interactive challenge + behavior | 9 | Partial |
| hCaptcha | CAPTCHA + IP rate limit | 5 | Pass (with solver) |
The pattern is clear. Vendors that rely primarily on JS challenges and cookie validation (Sucuri, Radware, AWS WAF, Barracuda) are straightforward to pass. Vendors that add behavioral analysis and TLS fingerprinting (DataDome, PerimeterX, Akamai) require significantly more effort. CAPTCHA-based vendors (hCaptcha, GeeTest, Arkose) require a visual solver, which we implemented using an AI vision model.
The Universal Gap: Conversion vs. Security
The most important finding from this research is not about any individual vendor. It is about the incentive structure of the entire bot detection industry.
Bot detection vendors serve two masters: security teams who want to block bots, and business teams who want to maximize conversion. These goals are in direct conflict.
A false positive in bot detection means a real customer is blocked from making a purchase, submitting a form, or accessing content. For an e-commerce site processing 10 million visits per month, even a 0.1% false positive rate means 10,000 blocked customers. At an average order value of $80, that is $800,000 in lost revenue per month.
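The arithmetic above generalizes to a one-line model. Note the hedge built into the number: it is an upper bound that treats every blocked visitor as a lost order:

```python
def monthly_fp_cost(visits: int, fp_rate: float, avg_order_value: float) -> float:
    """Upper-bound revenue loss from false positives: assumes every
    blocked visitor would otherwise have completed a purchase."""
    blocked = round(visits * fp_rate)  # customers falsely blocked per month
    return blocked * avg_order_value

# The example from the text: 10M visits/month, 0.1% FP rate, $80 AOV.
cost = monthly_fp_cost(10_000_000, 0.001, 80.0)  # 10,000 blocked -> $800,000
```

Even discounted heavily for visitors who would not have purchased, the figure lands on a dashboard every month, which is exactly the asymmetry the next paragraph describes.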
This creates an asymmetric incentive. The cost of a false positive (lost revenue, customer complaints, churn) is immediately measurable. The cost of a false negative (a bot that gets through) is diffuse and hard to attribute. When did that scraper first start copying your pricing data? Was that credential stuffing campaign the reason for the account takeovers, or was it phishing?
The result: every vendor tunes their detection threshold to minimize false positives, which necessarily means accepting more false negatives. This is not a criticism. It is a rational business decision. But it means that bot detection, as deployed in production, is systematically weaker than the underlying technology could provide.
The Tuning Spectrum
Most vendors offer sensitivity controls ("aggressive," "moderate," "permissive"). In practice, fewer than 15% of deployments run on "aggressive" settings. The default is almost always "moderate" or "permissive," and customers rarely change it. One vendor told us informally that switching from moderate to aggressive mode doubles the false positive rate while only catching 20% more bots.
What This Means Concretely
Consider the detection signals that vendors choose not to check. Cloudflare skips canvas fingerprinting in its standard challenge. Why? Because canvas rendering varies across GPU drivers, OS versions, and browser builds. A strict canvas check would block users with unusual graphics hardware. So Cloudflare trades that detection signal for lower false positives.
PerimeterX checks mouse movement patterns, but it cannot set the threshold too tight. Real humans with motor impairments, trackpad users, and touchscreen users all produce "unusual" mouse patterns. Set the threshold too low and you block people with disabilities. So the threshold is loose enough that injected mouse patterns pass.
DataDome's TLS fingerprinting is effective against default bot frameworks, but it cannot block all non-standard TLS stacks. VPN clients, corporate proxies, and privacy tools all modify TLS fingerprints. Block them all and you block a meaningful percentage of real users.
Each vendor faces this same calculation on every detection signal: how much security does this signal buy, and how many real users will it block? The answer, consistently, is that vendors err on the side of not blocking.
The Stealth Fingerprint Problem
Our research led us to build what we call a "stealth fingerprint picker," a system that selects the right browser profile for each vendor. The concept is straightforward: if Cloudflare checks signals A, B, and C, we only need to spoof A, B, and C. If DataDome also checks signal D (TLS fingerprint), we add that for DataDome-protected targets.
This vendor-specific approach is more effective than a universal stealth profile because it avoids overspecification. A profile that spoofs all 14 signals perfectly is actually more suspicious than one that gets 8 right and leaves 6 at default values. Real browsers have quirks. A profile that is too perfect looks synthetic.
How the Stealth Picker Works
1. Vendor identification. Before sending requests, the scanner identifies which bot detection vendor protects the target. This is done by analyzing HTTP response headers, challenge page structure, and JS file signatures.
2. Profile selection. Based on the detected vendor, the scanner loads a pre-built stealth profile that matches the signals that vendor checks. Each profile includes a consistent set of browser properties, TLS settings, and behavioral parameters.
3. Consistency enforcement. The profile ensures internal consistency. If the UA claims Chrome 120 on Windows 11, the navigator properties, canvas fingerprint, WebGL renderer, and screen resolution all match what Chrome 120 on Windows 11 would produce.
4. CAPTCHA solving. For vendors that escalate to visual CAPTCHAs (hCaptcha, GeeTest, Arkose), the scanner routes the challenge image to an AI vision model for solving. We use AWS Bedrock with a vision-capable model that achieves high accuracy on standard CAPTCHA types.
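Step 1 above can be sketched as a header-signature lookup. The signatures below are modeled on commonly observed response headers (e.g. cf-ray for Cloudflare, x-iinfo for Imperva); real deployments vary, which is why the scanner also falls back to challenge-page structure and JS file signatures:

```python
# Illustrative header markers per vendor; not an exhaustive or
# authoritative signature database.
HEADER_SIGNATURES = {
    "cloudflare": ["cf-ray", "cf-cache-status"],
    "imperva": ["x-iinfo", "x-cdn"],
    "datadome": ["x-datadome", "x-dd-b"],
    "akamai": ["x-akamai-transformed", "akamai-grn"],
}

def identify_vendor(headers: dict):
    """Return the first vendor whose marker appears in the response
    headers, or None to trigger deeper page/JS analysis."""
    normalized = {k.lower() for k in headers}
    for vendor, markers in HEADER_SIGNATURES.items():
        if any(m in normalized for m in markers):
            return vendor
    return None  # fall back to challenge-page / JS signature analysis
```

Once the vendor is known, profile selection (step 2) is a dictionary lookup against the signal matrix earlier in this article.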
What This Means for DDoS Protection Buyers
Bot detection and DDoS protection are often sold as part of the same platform. Cloudflare, Akamai, Imperva, and Radware all bundle bot management with their DDoS mitigation offerings. The implicit promise is that the same technology that blocks scrapers and credential stuffers will also stop DDoS bots.
Our research shows that this assumption is flawed for several reasons:
1. DDoS Bots Do Not Need to Be Stealthy Per-Request
A scraper needs to pass bot detection on every request to extract data. A DDoS bot only needs to consume server resources. If the bot detection challenge adds 200 ms of processing per request, the bot still achieves its goal: every request must still be received, challenged, and verified, and that work consumes resources somewhere in the stack. The bot does not need to pass. It needs to generate load.
2. L7 DDoS Attacks Target the Challenge Infrastructure Itself
A sophisticated L7 DDoS attack can target the bot detection challenge infrastructure. If the challenge page requires server-side computation to generate and verify, flooding it with challenge requests consumes resources. The bot detection system becomes the attack surface.
3. Volumetric Attacks Overwhelm Detection Before It Engages
Bot detection JS challenges require the client to load a page, execute JavaScript, and return a response. At 100,000 requests per second, the challenge infrastructure must generate 100,000 unique challenges, serve them, and validate 100,000 responses. This is a significant compute load that may itself become a bottleneck under volumetric attack conditions.
Bot Detection Is Not DDoS Protection
Bot detection systems are designed to identify and block individual automated clients. DDoS protection must handle aggregate traffic volumes. These are fundamentally different problems. A bot detection system that is 99% effective at identifying individual bots still allows 1% through, and 1% of a 1-million-bot DDoS attack is 10,000 bots hitting your origin.
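The leakage arithmetic above reduces to two small formulas: even high per-bot accuracy leaves an absolute bot count that scales with botnet size, and origin load scales with it (the 5 req/s per bot below is an illustrative assumption):

```python
def bots_leaked(botnet_size: int, detection_rate: float) -> int:
    """Absolute number of bots that slip past per-bot detection."""
    return round(botnet_size * (1 - detection_rate))

def origin_rps(botnet_size: int, detection_rate: float, rps_per_bot: int) -> int:
    """Aggregate load the leaked bots place on the origin."""
    return bots_leaked(botnet_size, detection_rate) * rps_per_bot

leak = bots_leaked(1_000_000, 0.99)    # 10,000 bots reach the origin
load = origin_rps(1_000_000, 0.99, 5)  # 50,000 req/s of "clean" traffic
```

Per-client accuracy is the wrong metric for an aggregate-volume problem: doubling the botnet doubles the leak at any fixed detection rate.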
4. The Conversion Tradeoff Weakens DDoS Defense
The same false-positive sensitivity that makes bot detection permissive for scrapers makes it permissive for DDoS bots. A DDoS bot running a real browser engine with a consistent fingerprint and clean residential IP will pass every vendor's challenge. The vendor cannot block it without also blocking the legitimate users who look identical.
5. Rate Limits and Bot Detection Are Often Separate Layers
As we documented in our rate limit research, CDN rate limiting is per-PoP, not global. Bot detection and rate limiting operate as independent layers. A bot that passes the JS challenge is subject to rate limits, but if those rate limits are per-PoP, a distributed botnet bypasses both layers simultaneously.
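The per-PoP gap reduces to simple multiplication, sketched here with illustrative numbers:

```python
def effective_global_limit(per_pop_limit_rps: int, reachable_pops: int) -> int:
    """Per-PoP rate limits are enforced independently, so the ceiling
    an attacker faces globally is the per-PoP limit times the number
    of PoPs their botnet can reach."""
    return per_pop_limit_rps * reachable_pops

# A "100 req/s" rule enforced independently at 300 PoPs:
ceiling = effective_global_limit(100, 300)  # 30,000 req/s before any single PoP throttles
```

A distributed botnet that spreads traffic across PoPs stays under every local limit while its aggregate load passes straight through.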
Recommendations
Based on our analysis of 20 vendors, here is what we recommend for organizations evaluating or using bot detection as part of their DDoS resilience strategy:
- Do not rely on bot detection alone for DDoS protection. Bot detection and DDoS mitigation solve different problems. Use dedicated DDoS mitigation (rate limiting, traffic shaping, scrubbing) alongside bot detection, not instead of it.
- Test your bot detection from the attacker's perspective. If you have never tested whether your bot detection actually blocks automated traffic, you do not know whether it works. Default configurations are optimized for low false positives, not for stopping determined attackers.
- Ask your vendor specific questions. Which signals does your challenge JS collect? Is the detection threshold tuned for conversion or for security? What is the false positive rate at each sensitivity level? A vendor that cannot answer these questions transparently probably does not have answers you would like.
- Understand the CAPTCHA gap. Visual CAPTCHAs (hCaptcha, reCAPTCHA, Arkose) are increasingly solvable by AI vision models. A CAPTCHA that was effective in 2024 may not be effective in 2026. If your bot detection escalation path ends at a CAPTCHA, that path has an expiration date.
- Layer your defenses. Combine bot detection with WAF rules, rate limiting (understanding its per-PoP limitations), origin protection, and behavioral anomaly detection at the application layer.
How Resilient Is Your Bot Detection?
DDactic's free infrastructure scan identifies which bot detection vendor protects your assets, tests whether the default challenge can be bypassed, and maps the DDoS attack surface that bot detection does not cover.
Get a Free Scan