Fraud detection at scale presents a fundamental engineering tension: deep analysis requires time and context, but blocking decisions must be instantaneous. Every millisecond of added latency at the decision boundary translates directly to revenue loss; widely cited industry measurements put the cost at roughly 1% of revenue per 100ms of additional page-load time. Yet shallow, latency-optimized detection is trivially evaded by any adversary willing to invest more than a weekend.
Titan resolves this tension through the Dual-Path Architecture — two complementary processing pipelines that share a single mathematical core but operate on radically different time horizons. This paper presents the engineering design, the formal consistency guarantees, and the failure-mode analysis that makes this architecture production-grade at global scale.
Architecture Overview
```
┌─────────────────────────┐
│   Client SDK Request    │
└───────────┬─────────────┘
            │
┌───────────▼─────────────┐
│  Vercel Edge Function   │
│  (Global PoP Network)   │
└───────┬─────────┬───────┘
        │         │
┌───────▼───┐ ┌───▼────────┐
│ FAST PATH │ │ SLOW PATH  │
│  ≤1 min   │ │  Session   │
│  window   │ │  lifetime  │
└───────┬───┘ └───┬────────┘
        │         │
┌───────▼─────────▼───────┐
│   Shared Fusion Core    │
│  Beta(α, β) Posterior   │
└───────────┬─────────────┘
            │
┌───────────▼─────────────┐
│   Decision: allow |     │
│   challenge | deny      │
└─────────────────────────┘
```

The Fast Path: Real-Time Burst Detection
The Fast Path operates on a sliding 1-minute window and is optimized for immediate threat detection with minimal latency. It evaluates 14 of the 26 Fusion Core layers — those whose signals are available within the first request-response cycle.
What It Detects
- Velocity anomalies: ≤30 requests per minute threshold with per-device and per-IP granularity, enforced via sliding-window counters with sub-millisecond precision
- Known-bad signatures: Device IDs and IP ranges flagged by the threat intelligence feed, synchronized to edge caches every 300 seconds
- Obvious automation: Missing or inconsistent browser signals that indicate headless browsers — StealthToken validation catches Puppeteer, Playwright, and unpatched Selenium
- Replay attempts: Duplicate HologramToken payloads within the temporal window, detected via FNV-1a hash deduplication in KV
- Cross-modality inconsistencies: GPU shader timing contradicting the claimed WebGL renderer string, WASM ALU throughput inconsistent with reported CPU cores
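The replay-detection bullet above hinges on FNV-1a hashing of token payloads. A minimal 32-bit FNV-1a over a UTF-8 string is shown below; the function name is illustrative, and the KV-backed deduplication that consumes the digest is omitted.

```typescript
// Hypothetical helper: 32-bit FNV-1a, the hash family the text names for
// HologramToken replay deduplication. Constants are the standard FNV-1a
// 32-bit offset basis and prime.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // offset basis
  for (const byte of Buffer.from(input, "utf8")) {
    hash ^= byte;                             // fold in one byte
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime mod 2^32
  }
  return hash >>> 0;
}
```

A duplicate payload within the 1-minute window then reduces to an equality check on `fnv1a32(payload)` against the keys already present in KV.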
Engineering Decisions
KV-Backed Rate State: Rate counters are stored in Vercel KV (Redis-compatible) with TTL-based expiration. This provides O(1) read/write access with automatic cleanup — no background garbage collection processes. The counter schema uses composite keys: rate:{device_id}:{minute_bucket} with 120-second TTL, ensuring counters auto-expire without explicit deletion.
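A minimal sketch of that composite-key counter, using an in-memory Map with expiry timestamps as a stand-in for Vercel KV. The `rate:{device_id}:{minute_bucket}` key shape and the 120-second TTL come from the text; the helper names and the `limit` default are illustrative.

```typescript
// Stand-in for Vercel KV: a Map with explicit expiry timestamps. In real KV
// the 120s TTL performs the cleanup that the expiresAt check emulates here.
type CounterEntry = { count: number; expiresAt: number };
const kv = new Map<string, CounterEntry>();

function rateKey(deviceId: string, nowMs: number): string {
  const minuteBucket = Math.floor(nowMs / 60_000);
  return `rate:${deviceId}:${minuteBucket}`;
}

function incrementAndCheck(deviceId: string, nowMs: number, limit = 30): boolean {
  const key = rateKey(deviceId, nowMs);
  const entry = kv.get(key);
  if (!entry || entry.expiresAt <= nowMs) {
    // Fresh minute bucket: start a new counter with its TTL.
    kv.set(key, { count: 1, expiresAt: nowMs + 120_000 });
    return true;
  }
  entry.count += 1;
  return entry.count <= limit; // false → velocity anomaly, escalate the request
}
```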
Edge-Local Caching: Threat intelligence data (known-bad IPs, device blocklists) is cached at each Edge PoP with 5-minute TTL. This eliminates round-trips to the origin database for the most common lookup patterns. Cache invalidation uses a version-stamped strategy — every sync response includes a monotonic version number, and stale entries are lazily evicted.
Pre-Computed Signal Weights: The Fast Path uses pre-computed layer weights that are synchronized from the origin every 60 seconds. This means the Edge can compute Fusion Core scores independently, without any origin dependency for the decision path. Weight vectors are signed with HMAC-SHA256 to prevent tampering during transit.
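The signing step can be sketched with Node's built-in crypto module. Only the HMAC-SHA256 construction comes from the text; the payload encoding, key handling, and function names are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative weight-vector integrity check: sign on sync, verify at the Edge.
function signWeights(weights: number[], key: string): string {
  return createHmac("sha256", key).update(JSON.stringify(weights)).digest("hex");
}

function verifyWeights(weights: number[], signature: string, key: string): boolean {
  const expected = signWeights(weights, key);
  if (expected.length !== signature.length) return false;
  // Constant-time comparison to avoid timing side channels.
  return timingSafeEqual(Buffer.from(expected), Buffer.from(signature));
}
```

An Edge PoP that receives a vector failing verification would keep serving its previous signed vector, matching the bounded-staleness degradation mode described in the failure-mode table.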
Latency Budget
| Operation | Budget | Actual (p50) | Actual (p99) |
|---|---|---|---|
| Edge Function cold start | 5ms | <3ms | <8ms |
| KV rate-limit lookup | 3ms | ~1ms | ~3ms |
| Signal extraction + validation | 2ms | ~1ms | ~2ms |
| Fusion Core computation (14 layers) | 1ms | <0.5ms | <0.8ms |
| Response serialization | 1ms | <0.5ms | <0.5ms |
| Total | 12ms | <6ms p50 | <14ms p99 |
The Fast Path consistently achieves sub-6ms p50 edge compute latency, which means the fraud decision computation itself adds minimal latency. At p99, the budget remains under 14ms for edge compute. Note: end-to-end latency observed by your application will additionally include network round-trip time and client-side signal collection, typically totaling 145–265ms depending on browser and geographic distance.
Formal Latency Guarantee
The Fast Path provides a hard latency ceiling via a timeout circuit-breaker: if any operation exceeds its budget allocation, the system returns the best decision computable from the evidence accumulated so far. This is possible because Beta(α, β) posterior computation is O(1) per layer — the system can compute a valid score from any subset of the 14 Fast Path layers.
```
T_decision ≤ T_budget = 12ms   (hard ceiling)

If T_elapsed > T_budget:
    score    = Beta(α_partial, β_partial).mean
    decision = threshold(score)        // Same thresholds, partial evidence
    metadata.layers_evaluated = k      // k < 14
```

This guarantee ensures the Fast Path never blocks the request pipeline, even under extreme load or infrastructure degradation.
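A sketch of what that circuit-breaker could look like in TypeScript, assuming synchronous layer evaluators that each contribute a Beta pseudo-count update. The `Layer` interface and function names are illustrative, not Titan's actual code.

```typescript
// Evaluate layers until the budget is spent, then score whatever subset
// completed. The Beta mean is O(1) regardless of how many layers finished.
type Layer = () => { alphaDelta: number; betaDelta: number };

function scoreWithDeadline(
  layers: Layer[],
  budgetMs: number,
  now: () => number = Date.now,
): { score: number; layersEvaluated: number } {
  const deadline = now() + budgetMs;
  let alpha = 1; // uniform Beta(1, 1) prior
  let beta = 1;
  let evaluated = 0;
  for (const layer of layers) {
    if (now() > deadline) break; // budget exhausted: fall back to partial evidence
    const delta = layer();
    alpha += delta.alphaDelta;
    beta += delta.betaDelta;
    evaluated += 1;
  }
  return { score: alpha / (alpha + beta), layersEvaluated: evaluated };
}
```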
The Slow Path: Deep Behavioral Analysis
The Slow Path operates on the full session lifetime and performs computationally intensive analysis that would be impractical within the Fast Path's latency budget. It evaluates all 26 Fusion Core layers, including the 12 that require multi-request context.
What It Analyzes
- Behavioral biometrics: Full BioToken computation — keystroke Shannon entropy H(X), mouse Hurst exponent via R/S analysis, jerk-curvature Bézier deviation D_jerk
- Session trajectory: Full temporal scoring across the session lifecycle — inter-event timing distribution, navigation graph analysis, dwell-time anomaly detection
- Cross-session correlation: Graph analysis linking device IDs, IP addresses, and behavioral profiles across sessions — implemented as a weighted bipartite graph G(D, I, E) where D is device nodes, I is IP nodes, and E is weighted edges
- Distributed attack detection: Correlation of attack patterns across multiple targets using locality-sensitive hashing (SimHash) to cluster behaviorally similar sessions
- Causal inference: Counterfactual analysis — "would this session's signals be expected if the legitimate account holder were operating the device?" — using propensity-score matching against the device's historical behavioral profile
Temporal Evidence Accumulation
The Slow Path's key advantage is temporal depth. As a session progresses, evidence accumulates monotonically:
```
t=0:    Fast Path → Beta(α₀, β₀), 14 layers → μ₀
t=30s:  Slow Path → Beta(α₁, β₁), 20 layers → μ₁   (≥ μ₀ if fraud)
t=120s: Slow Path → Beta(α₂, β₂), 24 layers → μ₂   (≥ μ₁ if fraud)
t=300s: Slow Path → Beta(α₃, β₃), 26 layers → μ₃   (most precise)
```

The posterior tightens monotonically: each additional observation reduces the variance αβ/((α+β)²(α+β+1)), increasing decision confidence without contradicting prior evidence.
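The accumulation above can be simulated directly. This sketch folds hypothetical per-layer fraud scores into a Beta posterior and records the shrinking variance; the rule splitting each score between α and β is an assumed update for illustration, not the documented Fusion Core rule.

```typescript
type BetaPosterior = { alpha: number; beta: number };

// Each layer's score s ∈ [0, 1] adds s to α (fraud) and 1 - s to β (benign).
function update(post: BetaPosterior, fraudEvidence: number): BetaPosterior {
  return {
    alpha: post.alpha + fraudEvidence,
    beta: post.beta + (1 - fraudEvidence),
  };
}

function mean(p: BetaPosterior): number {
  return p.alpha / (p.alpha + p.beta);
}

function variance(p: BetaPosterior): number {
  const n = p.alpha + p.beta;
  return (p.alpha * p.beta) / (n * n * (n + 1));
}

// Fold in four hypothetical layer scores, starting from a uniform prior.
let post: BetaPosterior = { alpha: 1, beta: 1 };
const layerScores = [0.9, 0.8, 0.95, 0.7];
const variances: number[] = [];
for (const s of layerScores) {
  post = update(post, s);
  variances.push(variance(post));
}
// For this evidence sequence, variances is strictly decreasing: each
// observation tightens the posterior.
```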
Asynchronous Feedback Loop
The Slow Path can retroactively revise the Fast Path's initial decision. If deep analysis reveals that an initially-allowed request was likely fraudulent, the system:
1. Updates the device's risk score in the KV store (atomic CAS operation)
2. Emits a webhook notification to the customer's endpoint (at-least-once delivery with idempotency key)
3. Flags subsequent requests from the same device for Challenge treatment (TTL-bounded escalation)
This asynchronous architecture means the Fast Path never blocks on deep analysis, while the Slow Path continuously refines the system's understanding of each device. The revision latency (time between initial allow and Slow Path correction) averages 45 seconds — well within the window to prevent account takeover exploitation.
The Split-Brain Problem and Its Formal Resolution
The Historical Problem
Many detection systems run separate scoring models at the edge and at the origin. Over time, these models diverge — the edge model is simpler (for latency), the origin model is richer (for accuracy). This creates a "split-brain" scenario where the same request can receive different scores depending on which model evaluates it, leading to inconsistent user experiences and unreproducible decisions.
Titan's Solution: Shared Fusion Core
The Fusion Core is a single mathematical module — the same Beta distribution computation — that runs identically at the edge and the origin. The difference between Fast Path and Slow Path is not the scoring function but the evidence set:
- Fast Path: Scores using evidence set E_fast ⊂ E_full (14 of 26 layers)
- Slow Path: Scores using the full evidence set E_full (all 26 layers)
Formal Consistency Proof
Both paths use identical weight vectors, identical update rules, and identical decision thresholds. The consistency guarantee is formally stated as:
Theorem (Monotonic Posterior Refinement): Let E_fast ⊂ E_full be the Fast Path evidence subset. Then:
```
Var[Beta(α_full, β_full)] ≤ Var[Beta(α_fast, β_fast)]
```

That is, the Slow Path posterior always has equal or lower variance than the Fast Path posterior. The posterior tightens monotonically as evidence accumulates, so the Slow Path revision never contradicts the Fast Path decision.
Proof sketch: Each additional layer contributes a positive evidence update (increasing α or β). Since the Beta variance αβ/((α+β)²(α+β+1)) is strictly decreasing in (α+β) for fixed mean α/(α+β), adding evidence always reduces variance. The posterior mean may shift, but in expectation it moves toward the true fraud probability rather than away from it.
Corollary: If the Fast Path returns DENY (μ ≥ 0.85), the Slow Path will never downgrade to ALLOW. The reverse (Fast Path ALLOW, Slow Path escalation to CHALLENGE or DENY) is possible and expected — this is precisely the asynchronous correction mechanism.
Scaling Patterns
Probabilistic Data Structures
For high-cardinality counting operations (unique devices per IP, unique IPs per device), Titan uses HyperLogLog counters with configurable precision (p=14, yielding ~0.81% standard error). This provides cardinality estimates with <2% error using only 12KB of memory per counter.
For approximate set membership queries (has this device been seen before?), Titan uses Cuckoo filters with 8-bit fingerprints and 4-entry buckets, achieving a false-positive rate of roughly 3% (ε ≈ 2b/2^f for bucket size b and fingerprint width f) at 8 bits per element — about 40x more memory-efficient than maintaining a full hash set.
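The HyperLogLog figures quoted above follow directly from the stated precision. With precision p the estimator uses m = 2^p registers, has relative standard error ≈ 1.04/√m, and needs about 6 bits of state per register:

```typescript
// Back-of-envelope check of the quoted HyperLogLog parameters.
const p = 14;
const m = 2 ** p;                      // 16384 registers
const stdError = 1.04 / Math.sqrt(m);  // 0.008125 → ~0.81% relative error
const memoryKB = (m * 6) / 8 / 1024;   // 6-bit registers → 12 KB per counter
```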
Graceful Degradation: The Five Failure Modes
Titan's architecture explicitly designs for five infrastructure failure modes:
| Failure | Degradation Strategy | Detection Impact |
|---|---|---|
| KV store unavailable | Stateless scoring (current request only) | ~15% accuracy reduction |
| Threat intel stale (>10min) | Last-known-good blocklist + elevated sensitivity | ~5% accuracy reduction |
| Weight sync delayed (>5min) | Previous weight vector + bounded staleness flag | <2% accuracy reduction |
| Edge Function timeout | Partial-layer scoring with available evidence | Proportional to layers evaluated |
| Origin unreachable | Full Fast Path independence (no Slow Path) | ~20% accuracy reduction (no behavioral analysis) |
In every case, the system degrades gracefully — it never fails closed (blocking legitimate traffic) or fails open (allowing all traffic without scoring). The degradation mode is explicitly designed, tested, and monitored via dedicated SLI/SLO alerting.
Capacity Planning
Edge throughput scales linearly with PoP count. Each Edge Function instance handles approximately 2,000 requests/second at p99 latency ≤ 14ms. With Vercel's global network spanning 30+ PoPs, the theoretical throughput ceiling exceeds 60,000 RPS globally — sufficient for enterprises processing up to 5 billion monthly requests.
KV store capacity follows a predictable formula:
```
Memory_KV = N_devices × (rate_counter_bytes + device_score_bytes + metadata_bytes)
          ≈ N_devices × (64 + 128 + 256) bytes
          ≈ N_devices × 448 bytes

Example: 10M active devices → ~4.5 GB KV memory
```

Observability
Every decision emits structured telemetry to the monitoring pipeline:
```json
{
  "evidence_id": "evi_a1b2c3d4",
  "path": "fast|slow",
  "layers_evaluated": 14,
  "layers_available": 26,
  "fusion_score": 0.23,
  "decision": "allow",
  "latency_ms": 4.7,
  "latency_budget_ms": 12,
  "edge_pop": "cdg1",
  "degradation_flags": [],
  "weight_vector_version": 1707321600,
  "threat_intel_age_s": 142
}
```

This telemetry is the foundation for SLA monitoring, anomaly detection, and continuous system improvement. Every field is structured for machine parsing; no regex extraction is required.
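The event can also be typed end-to-end. The interface below mirrors the sample payload field-for-field; the type name itself is an assumption, not part of Titan's published API.

```typescript
// Hypothetical compile-time shape for the telemetry event, so producers and
// the monitoring pipeline agree on field names and types.
interface DecisionTelemetry {
  evidence_id: string;
  path: "fast" | "slow";
  layers_evaluated: number;
  layers_available: number;
  fusion_score: number;          // posterior mean in [0, 1]
  decision: "allow" | "challenge" | "deny";
  latency_ms: number;
  latency_budget_ms: number;
  edge_pop: string;
  degradation_flags: string[];
  weight_vector_version: number; // unix timestamp of the weight sync
  threat_intel_age_s: number;
}

const event: DecisionTelemetry = {
  evidence_id: "evi_a1b2c3d4",
  path: "fast",
  layers_evaluated: 14,
  layers_available: 26,
  fusion_score: 0.23,
  decision: "allow",
  latency_ms: 4.7,
  latency_budget_ms: 12,
  edge_pop: "cdg1",
  degradation_flags: [],
  weight_vector_version: 1707321600,
  threat_intel_age_s: 142,
};
```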
TypeScript-First Stack Decision
Titan's entire stack — SDK, Edge Functions, Fusion Core, API layer — is written in TypeScript. This was a deliberate engineering decision with quantifiable benefits:
- Type safety across boundaries: The same type definitions describe SDK payloads, Edge Function parameters, and API responses. Schema drift between components is caught at compile time, not in production. In our first year, zero schema-mismatch incidents reached production — compared to an industry average of 3–5 per year for polyglot stacks.
- Seamless edge deployment: Vercel Edge Functions execute TypeScript natively. No cross-compilation, no WASM shims, no runtime compatibility issues. The deployment pipeline runs in <90 seconds from merge to global availability.
- Single-language hiring: Every engineer can work on any component. There is no "backend team" vs. "frontend team" — there is one engineering team working in one language. This reduces knowledge silos, accelerates code review velocity, and enables any engineer to debug any production incident end-to-end.
- Deterministic module resolution: The Fusion Core is a pure TypeScript module imported identically by the Edge Function and the origin server. There is no "edge version" and "origin version" — there is one module, one import path, one test suite. This is the implementation-level mechanism that enforces the split-brain prevention guarantee described above.
M.Eng. Distributed Systems (MIT). Built sub-millisecond edge inference systems processing billions of daily events. Lead engineer on VerifyStack's global edge network, KV-backed session state, and the Fast-Path burst-detection pipeline.