Fraud detection at scale presents a fundamental engineering tension: deep analysis requires time and context, but blocking decisions must be instantaneous. Every millisecond of added latency at the decision boundary translates directly to revenue loss; widely cited industry measurements put the cost at roughly 1% of revenue per 100ms of additional page-load time. Yet shallow, latency-optimized detection is trivially evaded by any adversary willing to invest more than a weekend.
Titan resolves this tension through the Dual-Path Architecture — two complementary processing pipelines that share a single mathematical core but operate on radically different time horizons. This paper presents the engineering design, the formal consistency guarantees, and the failure-mode analysis that makes this architecture production-grade at global scale.
Architecture Overview
```
┌─────────────────────────┐
│   Client SDK Request    │
└───────────┬─────────────┘
            │
┌───────────▼─────────────┐
│  Vercel Edge Function   │
│  (Global PoP Network)   │
└───────┬─────────┬───────┘
        │         │
┌───────▼───┐ ┌───▼────────┐
│ FAST PATH │ │ SLOW PATH  │
│  ≤1 min   │ │  Session   │
│  window   │ │  lifetime  │
└───────┬───┘ └───┬────────┘
        │         │
┌───────▼─────────▼───────┐
│   Shared Fusion Core    │
│  Beta(α, β) Posterior   │
└───────────┬─────────────┘
            │
┌───────────▼─────────────┐
│   Decision: allow |     │
│   challenge | deny      │
└─────────────────────────┘
```

The Fast Path: Real-Time Burst Detection
The Fast Path operates on a sliding 1-minute window and is optimized for immediate threat detection with minimal latency. It evaluates 14 of the 26 Fusion Core layers — those whose signals are available within the first request-response cycle.
What It Detects
- Velocity anomalies: ≤30 requests per minute threshold with per-device and per-IP granularity, enforced via sliding-window counters with sub-millisecond precision
- Known-bad signatures: Device IDs and IP ranges flagged by the threat intelligence feed, synchronized to edge caches every 300 seconds
- Obvious automation: Missing or inconsistent browser signals that indicate headless browsers — StealthToken validation catches Puppeteer, Playwright, and unpatched Selenium
- Replay attempts: Duplicate HologramToken payloads within the temporal window, detected via FNV-1a hash deduplication in KV
- Cross-modality inconsistencies: GPU shader timing contradicting the claimed WebGL renderer string, WASM ALU throughput inconsistent with reported CPU cores
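The replay-detection bullet above hinges on FNV-1a hashing of token payloads. A minimal 32-bit FNV-1a over a UTF-8 string is shown below; the function name is illustrative, and the KV-backed deduplication that consumes the digest is omitted.

```typescript
// Hypothetical helper: 32-bit FNV-1a, the hash family the text names for
// HologramToken replay deduplication. Constants are the standard FNV-1a
// 32-bit offset basis and prime.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // offset basis
  for (const byte of Buffer.from(input, "utf8")) {
    hash ^= byte;                             // fold in one byte
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime mod 2^32
  }
  return hash >>> 0;
}
```

A duplicate payload within the 1-minute window then reduces to an equality check on `fnv1a32(payload)` against the keys already present in KV.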
Engineering Decisions
KV-Backed Rate State: Rate counters are stored in Vercel KV (Redis-compatible) with TTL-based expiration. This provides O(1) read/write access with automatic cleanup — no background garbage collection processes. The counter schema uses composite keys: rate:{device_id}:{minute_bucket} with 120-second TTL, ensuring counters auto-expire without explicit deletion.
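A minimal sketch of that composite-key counter, using an in-memory Map with expiry timestamps as a stand-in for Vercel KV. The `rate:{device_id}:{minute_bucket}` key shape and the 120-second TTL come from the text; the helper names and the `limit` default are illustrative.

```typescript
// Stand-in for Vercel KV: a Map with explicit expiry timestamps. In real KV
// the 120s TTL performs the cleanup that the expiresAt check emulates here.
type CounterEntry = { count: number; expiresAt: number };
const kv = new Map<string, CounterEntry>();

function rateKey(deviceId: string, nowMs: number): string {
  const minuteBucket = Math.floor(nowMs / 60_000);
  return `rate:${deviceId}:${minuteBucket}`;
}

function incrementAndCheck(deviceId: string, nowMs: number, limit = 30): boolean {
  const key = rateKey(deviceId, nowMs);
  const entry = kv.get(key);
  if (!entry || entry.expiresAt <= nowMs) {
    // Fresh minute bucket: start a new counter with its TTL.
    kv.set(key, { count: 1, expiresAt: nowMs + 120_000 });
    return true;
  }
  entry.count += 1;
  return entry.count <= limit; // false → velocity anomaly, escalate the request
}
```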
Edge-Local Caching: Threat intelligence data (known-bad IPs, device blocklists) is cached at each Edge PoP with 5-minute TTL. This eliminates round-trips to the origin database for the most common lookup patterns. Cache invalidation uses a version-stamped strategy — every sync response includes a monotonic version number, and stale entries are lazily evicted.
Pre-Computed Signal Weights: The Fast Path uses pre-computed layer weights that are synchronized from the origin every 60 seconds. This means the Edge can compute Fusion Core scores independently, without any origin dependency for the decision path. Weight vectors are signed with HMAC-SHA256 to prevent tampering during transit.
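The signing step can be sketched with Node's built-in crypto module. Only the HMAC-SHA256 construction comes from the text; the payload encoding, key handling, and function names are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative weight-vector integrity check: sign on sync, verify at the Edge.
function signWeights(weights: number[], key: string): string {
  return createHmac("sha256", key).update(JSON.stringify(weights)).digest("hex");
}

function verifyWeights(weights: number[], signature: string, key: string): boolean {
  const expected = signWeights(weights, key);
  if (expected.length !== signature.length) return false;
  // Constant-time comparison to avoid timing side channels.
  return timingSafeEqual(Buffer.from(expected), Buffer.from(signature));
}
```

An Edge PoP that receives a vector failing verification would keep serving its previous signed vector, matching the bounded-staleness degradation mode described in the failure-mode table.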
Latency Budget
| Operation | Budget | Actual (p50) | Actual (p99) |
|---|---|---|---|
| Edge Function cold start | 5ms | <3ms | <8ms |
| KV rate-limit lookup | 3ms | ~1ms | ~3ms |
| Signal extraction + validation | 2ms | ~1ms | ~2ms |
| Fusion Core computation (14 layers) | 1ms | <0.5ms | <0.8ms |
| Response serialization | 1ms | <0.5ms | <0.5ms |
| Total | 12ms | <6ms p50 | <14ms p99 |
The Fast Path consistently achieves sub-6ms p50 edge compute latency, which means the fraud decision computation itself adds minimal latency. At p99, the budget remains under 14ms for edge compute. Note: end-to-end latency observed by your application will additionally include network round-trip time and client-side signal collection, typically totaling 145–265ms depending on browser and geographic distance.
Formal Latency Guarantee
The Fast Path provides a hard latency ceiling via a timeout circuit-breaker: if any operation exceeds its budget allocation, the system returns the best decision computable from the evidence accumulated so far. This is possible because Beta(α, β) posterior computation is O(1) per layer — the system can compute a valid score from any subset of the 14 Fast Path layers.
```
T_decision ≤ T_budget = 12ms   (hard ceiling)

If T_elapsed > T_budget:
    score    = Beta(α_partial, β_partial).mean
    decision = threshold(score)        // Same thresholds, partial evidence
    metadata.layers_evaluated = k      // k < 14
```

This guarantee ensures the Fast Path never blocks the request pipeline, even under extreme load or infrastructure degradation.
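A sketch of what that circuit-breaker could look like in TypeScript, assuming synchronous layer evaluators that each contribute a Beta pseudo-count update. The `Layer` interface and function names are illustrative, not Titan's actual code.

```typescript
// Evaluate layers until the budget is spent, then score whatever subset
// completed. The Beta mean is O(1) regardless of how many layers finished.
type Layer = () => { alphaDelta: number; betaDelta: number };

function scoreWithDeadline(
  layers: Layer[],
  budgetMs: number,
  now: () => number = Date.now,
): { score: number; layersEvaluated: number } {
  const deadline = now() + budgetMs;
  let alpha = 1; // uniform Beta(1, 1) prior
  let beta = 1;
  let evaluated = 0;
  for (const layer of layers) {
    if (now() > deadline) break; // budget exhausted: fall back to partial evidence
    const delta = layer();
    alpha += delta.alphaDelta;
    beta += delta.betaDelta;
    evaluated += 1;
  }
  return { score: alpha / (alpha + beta), layersEvaluated: evaluated };
}
```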
The Slow Path: Deep Behavioral Analysis
The Slow Path operates on the full session lifetime and performs computationally intensive analysis that would be impractical within the Fast Path's latency budget. It evaluates all 26 Fusion Core layers, including the 12 that require multi-request context.
What It Analyzes
- Behavioral biometrics: Full BioToken computation — keystroke Shannon entropy H(X), mouse Hurst exponent via R/S analysis, jerk-curvature Bézier deviation D_jerk
- Session trajectory: Full temporal scoring across the session lifecycle — inter-event timing distribution, navigation graph analysis, dwell-time anomaly detection
- Cross-session correlation: Graph analysis linking device IDs, IP addresses, and behavioral profiles across sessions — implemented as a weighted bipartite graph G(D, I, E) where D is device nodes, I is IP nodes, and E is weighted edges
- Distributed attack detection: Correlation of attack patterns across multiple targets using locality-sensitive hashing (SimHash) to cluster behaviorally similar sessions
- Causal inference: Counterfactual analysis — "would this session's signals be expected if the legitimate account holder were operating the device?" — using propensity-score matching against the device's historical behavioral profile
Temporal Evidence Accumulation
The Slow Path's key advantage is temporal depth. As a session progresses, evidence accumulates monotonically:
```
t=0:    Fast Path → Beta(α₀, β₀), 14 layers → μ₀
t=30s:  Slow Path → Beta(α₁, β₁), 20 layers → μ₁   (≥ μ₀ if fraud)
t=120s: Slow Path → Beta(α₂, β₂), 24 layers → μ₂   (≥ μ₁ if fraud)
t=300s: Slow Path → Beta(α₃, β₃), 26 layers → μ₃   (most precise)
```

The posterior tightens monotonically: each additional observation reduces the variance αβ/((α+β)²(α+β+1)), increasing decision confidence without contradicting prior evidence.
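The accumulation above can be simulated directly. This sketch folds hypothetical per-layer fraud scores into a Beta posterior and records the shrinking variance; the rule splitting each score between α and β is an assumed update for illustration, not the documented Fusion Core rule.

```typescript
type BetaPosterior = { alpha: number; beta: number };

// Each layer's score s ∈ [0, 1] adds s to α (fraud) and 1 - s to β (benign).
function update(post: BetaPosterior, fraudEvidence: number): BetaPosterior {
  return {
    alpha: post.alpha + fraudEvidence,
    beta: post.beta + (1 - fraudEvidence),
  };
}

function mean(p: BetaPosterior): number {
  return p.alpha / (p.alpha + p.beta);
}

function variance(p: BetaPosterior): number {
  const n = p.alpha + p.beta;
  return (p.alpha * p.beta) / (n * n * (n + 1));
}

// Fold in four hypothetical layer scores, starting from a uniform prior.
let post: BetaPosterior = { alpha: 1, beta: 1 };
const layerScores = [0.9, 0.8, 0.95, 0.7];
const variances: number[] = [];
for (const s of layerScores) {
  post = update(post, s);
  variances.push(variance(post));
}
// For this evidence sequence, variances is strictly decreasing: each
// observation tightens the posterior.
```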
Asynchronous Feedback Loop
The Slow Path can retroactively revise the Fast Path's initial decision. If deep analysis reveals that an initially-allowed request was likely fraudulent, the system:
1. Updates the device's risk score in the KV store (atomic CAS operation)
2. Emits a webhook notification to the customer's endpoint (at-least-once delivery with idempotency key)
3. Flags subsequent requests from the same device for Challenge treatment (TTL-bounded escalation)
This asynchronous architecture means the Fast Path never blocks on deep analysis, while the Slow Path continuously refines the system's understanding of each device. The revision latency (time between initial allow and Slow Path correction) averages 45 seconds — well within the window to prevent account takeover exploitation.
The Split-Brain Problem and Its Formal Resolution
The Historical Problem
Many detection systems run separate scoring models at the edge and at the origin. Over time, these models diverge — the edge model is simpler (for latency), the origin model is richer (for accuracy). This creates a "split-brain" scenario where the same request can receive different scores depending on which model evaluates it, leading to inconsistent user experiences and unreproducible decisions.
Titan's Solution: Shared Fusion Core
The Fusion Core is a single mathematical module — the same Beta distribution computation — that runs identically at the edge and the origin. The difference between Fast Path and Slow Path is not the scoring function but the evidence set:
- Fast Path: Scores using evidence set E_fast ⊂ E_full (14 of 26 layers)
- Slow Path: Scores using the full evidence set E_full (all 26 layers)
Formal Consistency Proof
Both paths use identical weight vectors, identical update rules, and identical decision thresholds. The consistency guarantee is formally stated as:
Theorem (Monotonic Posterior Refinement): Let E_fast ⊂ E_full be the Fast Path evidence subset. Then:
```
Var[Beta(α_full, β_full)] ≤ Var[Beta(α_fast, β_fast)]
```

That is, the Slow Path posterior always has equal or lower variance than the Fast Path posterior. The posterior tightens monotonically as evidence accumulates, so the Slow Path revision never contradicts the Fast Path decision.
Proof sketch: Each additional layer contributes a positive evidence update (increasing α or β). Since the Beta variance αβ/((α+β)²(α+β+1)) is strictly decreasing in (α+β) for fixed mean α/(α+β), adding evidence always reduces variance. The posterior mean may shift, but in expectation it moves toward the true fraud probability rather than away from it.
Corollary: If the Fast Path returns DENY (μ ≥ 0.85), the Slow Path will never downgrade to ALLOW. The reverse (Fast Path ALLOW, Slow Path escalation to CHALLENGE or DENY) is possible and expected — this is precisely the asynchronous correction mechanism.
Scaling Patterns
Probabilistic Data Structures
For high-cardinality counting operations (unique devices per IP, unique IPs per device), Titan uses HyperLogLog counters with configurable precision (p=14, yielding ~0.81% standard error). This provides cardinality estimates with <2% error using only 12KB of memory per counter.
For approximate set membership queries (has this device been seen before?), Titan uses Cuckoo filters with 8-bit fingerprints and 4-entry buckets, achieving a false-positive rate of roughly 3% (ε ≈ 2b/2^f for bucket size b and fingerprint width f) at 8 bits per element — about 40x more memory-efficient than maintaining a full hash set.
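The HyperLogLog figures quoted above follow directly from the stated precision. With precision p the estimator uses m = 2^p registers, has relative standard error ≈ 1.04/√m, and needs about 6 bits of state per register:

```typescript
// Back-of-envelope check of the quoted HyperLogLog parameters.
const p = 14;
const m = 2 ** p;                      // 16384 registers
const stdError = 1.04 / Math.sqrt(m);  // 0.008125 → ~0.81% relative error
const memoryKB = (m * 6) / 8 / 1024;   // 6-bit registers → 12 KB per counter
```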
Graceful Degradation: The Five Failure Modes
Titan's architecture explicitly designs for five infrastructure failure modes:
| Failure | Degradation Strategy | Detection Impact |
|---|---|---|
| KV store unavailable | Stateless scoring (current request only) | ~15% accuracy reduction |
| Threat intel stale (>10min) | Last-known-good blocklist + elevated sensitivity | ~5% accuracy reduction |
| Weight sync delayed (>5min) | Previous weight vector + bounded staleness flag | <2% accuracy reduction |
| Edge Function timeout | Partial-layer scoring with available evidence | Proportional to layers evaluated |
| Origin unreachable | Full Fast Path independence (no Slow Path) | ~20% accuracy reduction (no behavioral analysis) |
In every case, the system degrades gracefully — it never fails closed (blocking legitimate traffic) or fails open (allowing all traffic without scoring). The degradation mode is explicitly designed, tested, and monitored via dedicated SLI/SLO alerting.
Capacity Planning
Edge throughput scales linearly with PoP count. Each Edge Function instance handles approximately 2,000 requests/second at p99 latency ≤ 14ms. With Vercel's global network spanning 30+ PoPs, the theoretical throughput ceiling exceeds 60,000 RPS globally — sufficient for enterprises processing up to 5 billion monthly requests.
KV store capacity follows a predictable formula:
```
Memory_KV = N_devices × (rate_counter_bytes + device_score_bytes + metadata_bytes)
          ≈ N_devices × (64 + 128 + 256) bytes
          ≈ N_devices × 448 bytes

Example: 10M active devices → ~4.5 GB KV memory
```

Observability
Every decision emits structured telemetry to the monitoring pipeline:
```json
{
  "evidence_id": "evi_a1b2c3d4",
  "path": "fast|slow",
  "layers_evaluated": 14,
  "layers_available": 26,
  "fusion_score": 0.23,
  "decision": "allow",
  "latency_ms": 4.7,
  "latency_budget_ms": 12,
  "edge_pop": "cdg1",
  "degradation_flags": [],
  "weight_vector_version": 1707321600,
  "threat_intel_age_s": 142
}
```

This telemetry is the foundation for SLA monitoring, anomaly detection, and continuous system improvement. Every field is structured for machine parsing; no regex extraction is required.
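The event can also be typed end-to-end. The interface below mirrors the sample payload field-for-field; the type name itself is an assumption, not part of Titan's published API.

```typescript
// Hypothetical compile-time shape for the telemetry event, so producers and
// the monitoring pipeline agree on field names and types.
interface DecisionTelemetry {
  evidence_id: string;
  path: "fast" | "slow";
  layers_evaluated: number;
  layers_available: number;
  fusion_score: number;          // posterior mean in [0, 1]
  decision: "allow" | "challenge" | "deny";
  latency_ms: number;
  latency_budget_ms: number;
  edge_pop: string;
  degradation_flags: string[];
  weight_vector_version: number; // unix timestamp of the weight sync
  threat_intel_age_s: number;
}

const event: DecisionTelemetry = {
  evidence_id: "evi_a1b2c3d4",
  path: "fast",
  layers_evaluated: 14,
  layers_available: 26,
  fusion_score: 0.23,
  decision: "allow",
  latency_ms: 4.7,
  latency_budget_ms: 12,
  edge_pop: "cdg1",
  degradation_flags: [],
  weight_vector_version: 1707321600,
  threat_intel_age_s: 142,
};
```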
TypeScript-First Stack Decision
Titan's entire stack — SDK, Edge Functions, Fusion Core, API layer — is written in TypeScript. This was a deliberate engineering decision with quantifiable benefits:
- Type safety across boundaries: The same type definitions describe SDK payloads, Edge Function parameters, and API responses. Schema drift between components is caught at compile time, not in production. In our first year, zero schema-mismatch incidents reached production — compared to an industry average of 3–5 per year for polyglot stacks.
- Seamless edge deployment: Vercel Edge Functions execute TypeScript natively. No cross-compilation, no WASM shims, no runtime compatibility issues. The deployment pipeline runs in <90 seconds from merge to global availability.
- Single-language hiring: Every engineer can work on any component. There is no "backend team" vs. "frontend team" — there is one engineering team working in one language. This reduces knowledge silos, accelerates code review velocity, and enables any engineer to debug any production incident end-to-end.
- Deterministic module resolution: The Fusion Core is a pure TypeScript module imported identically by the Edge Function and the origin server. There is no "edge version" and "origin version" — there is one module, one import path, one test suite. This is the implementation-level mechanism that enforces the split-brain prevention guarantee described above.
M.Eng. Distributed Systems (MIT). Built sub-millisecond edge inference systems processing billions of daily events. Lead engineer on VerifyStack's global edge network, KV-backed session state, and the Fast-Path burst-detection pipeline.