Live Monitoring & Anomaly Analytics
Complete observability into the fraud detection pipeline — from raw signal ingestion to final verdict emission. Monitor decisions in real time, detect distributional drift with CUSUM change-point analysis (Page, 1954), and visualise session trajectories as Markov chain state transitions with impossible-path flagging.
Every decision is cryptographically auditable. Every detection layer exports calibration telemetry. Every anomaly carries a complete causal explanation chain.
Observability Architecture
Every stratum of the fraud detection pipeline — from signal ingestion through Bayesian fusion to verdict emission — is instrumented for real-time observation, historical forensic analysis, and multi-channel automated alerting.
Real-Time Decision Stream
Observe every allow, challenge, and deny verdict the instant it materialises. The live feed decomposes each decision into its constituent risk score, posterior confidence interval, contributing detection-layer weights, and the Bayesian evidence ratio that tipped the verdict — streamed from the edge with sub-second propagation.
Dual-path architecture: fast-path decisions (<5 ms, memoised feature vectors) stream alongside slow-path deep analysis (<200 ms, full ensemble inference).
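As a rough illustration of how a Bayesian evidence ratio can tip a verdict, the sketch below fuses per-layer likelihood ratios in log-odds space. It is a minimal sketch, not the product's fusion logic: the function names, the prior, the layer names, the allow/challenge/deny cut-offs, and the conditional-independence (naive Bayes) assumption are all hypothetical.

```python
import math

def fuse_log_odds(prior_fraud: float, likelihood_ratios: dict[str, float]) -> float:
    """Combine per-layer likelihood ratios into a posterior fraud probability.

    Assumption: layers are conditionally independent given the class, so
    their log likelihood ratios simply add to the prior log-odds.
    """
    log_odds = math.log(prior_fraud / (1.0 - prior_fraud))
    for ratio in likelihood_ratios.values():
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))

def verdict(posterior: float, challenge_at: float = 0.5, deny_at: float = 0.9) -> str:
    """Map a posterior to a verdict using hypothetical cut-offs."""
    if posterior >= deny_at:
        return "deny"
    if posterior >= challenge_at:
        return "challenge"
    return "allow"

# Usage: a 1% prior with two moderately suspicious layers (hypothetical names).
posterior = fuse_log_odds(0.01, {"device_fingerprint": 20.0, "velocity": 10.0})
decision = verdict(posterior)
```

Working in log-odds keeps each layer's contribution additive, which makes it straightforward to report which layer moved the verdict most.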
Anomaly Detection — CUSUM Control Charts
Cumulative Sum (CUSUM) change-point detection (Page, 1954) applies statistical process control to decision-rate distributions. When the underlying generating process shifts (sudden deny-rate spikes, novel attack-vector emergence, or coordinated campaign onset), the algorithm raises an alert before the drift becomes visible in aggregate dashboards.
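Built on the sequential probability ratio test, with a configurable allowance parameter k and decision interval h. For a fixed false-alarm rate, the expected detection delay for a mean shift Δμ scales as O(σ²/Δμ²) observations, so small shifts (Δμ ≈ 0.5σ) are typically detected substantially sooner than with Shewhart charts.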
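The roles of the allowance k and the decision interval h can be sketched with a two-sided tabular CUSUM. This is a minimal sketch assuming a known in-control mean and standard deviation; the function name, the default values, and the reset-after-alarm behaviour are illustrative choices rather than the product's configuration.

```python
def cusum_alarms(xs, target_mean, sigma, k=0.5, h=5.0):
    """Two-sided tabular CUSUM on standardised observations.

    k is the allowance and h the decision interval, both in sigma units.
    Returns the indices at which an alarm fired; both statistics are
    reset to zero after each alarm (an illustrative choice).
    """
    upper = lower = 0.0
    alarms = []
    for i, x in enumerate(xs):
        z = (x - target_mean) / sigma
        upper = max(0.0, upper + z - k)   # accumulates evidence of an upward shift
        lower = max(0.0, lower - z - k)   # accumulates evidence of a downward shift
        if upper > h or lower > h:
            alarms.append(i)
            upper = lower = 0.0
    return alarms

# Usage: a standardised deny-rate series that shifts up by one sigma at index 20.
series = [0.0] * 20 + [1.0] * 20
alarm_indices = cusum_alarms(series, target_mean=0.0, sigma=1.0, k=0.5, h=4.0)
```

A larger k makes the chart ignore small fluctuations, while a larger h trades slower detection for fewer false alarms.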
Markov Chain Session Trajectory Analysis
Model multi-request session flows as discrete-time Markov chains. The temporal engine estimates transition probability matrices from legitimate traffic baselines, then flags impossible transitions (e.g., checkout → registration), sudden action-rate spikes exceeding 3σ of the ergodic distribution, and fingerprint drift mid-session indicative of session hijacking.
Anomaly taxonomy: IMPOSSIBLE_TRANSITION, BEHAVIOR_SPIKE, FINGERPRINT_DRIFT, VELOCITY_ANOMALY, PROBE_PATTERN, AUTOMATION_SIGNATURE. Stationary distribution computed via power iteration on the transition kernel.
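The session model above can be sketched in a few functions: estimate transition probabilities from baseline sessions, flag transitions never seen in the baseline, and compute the stationary distribution by power iteration. This is a minimal sketch; the state names and function names are hypothetical, and the power iteration assumes an ergodic chain in which every state has at least one outgoing transition.

```python
from collections import Counter, defaultdict

def estimate_transitions(sessions):
    """Maximum-likelihood transition probabilities from baseline sessions."""
    counts = defaultdict(Counter)
    for path in sessions:
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {src: {dst: n / sum(c.values()) for dst, n in c.items()}
            for src, c in counts.items()}

def impossible_transitions(path, probs, floor=0.0):
    """Return (step, src, dst) for steps whose baseline probability is <= floor."""
    return [(i, src, dst)
            for i, (src, dst) in enumerate(zip(path, path[1:]))
            if probs.get(src, {}).get(dst, 0.0) <= floor]

def stationary_distribution(probs, states, iters=1000, tol=1e-12):
    """Power iteration: apply pi <- pi P until successive iterates agree."""
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iters):
        nxt = {s: 0.0 for s in states}
        for src in states:
            for dst, p in probs.get(src, {}).items():
                nxt[dst] += pi[src] * p
        if max(abs(nxt[s] - pi[s]) for s in states) < tol:
            return nxt
        pi = nxt
    return pi

# Usage with a tiny hypothetical baseline.
baseline = [
    ["browse", "cart", "checkout", "browse"],
    ["browse", "browse", "cart", "browse"],
]
probs = estimate_transitions(baseline)
flags = impossible_transitions(["browse", "checkout", "cart"], probs)
pi = stationary_distribution(probs, ["browse", "cart", "checkout"])
```

In practice a small positive floor (or smoothed counts) is usually preferable to an exact zero, since a sparse baseline would otherwise flag rare but legitimate paths.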
Per-Layer Detection Observability
Granular telemetry across all 151 detection techniques in 12 analyzers. Inspect per-analyzer posterior score distributions, calibration reliability diagrams, Brier scores, hit rates, and mutual information with the final verdict. Identify which analyzers are decision-dominant and where coverage gaps introduce blind spots.
Layer taxonomy: device fingerprint, behavioral biometrics, network topology, identity graph, velocity, temporal sequence, information-theoretic entropy, causal inference (do-calculus), and ensemble fusion.
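Two of the calibration metrics named above, the Brier score and the data behind a reliability diagram, are simple to compute from an analyzer's predicted probabilities and the confirmed outcomes. The sketch below assumes binary outcomes (1 = fraud, 0 = legitimate) and equal-width probability bins; the function names and bin count are illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def reliability_bins(probs, outcomes, n_bins=10):
    """Per non-empty bin: (mean predicted probability, observed rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins if b
    ]

# Usage with hypothetical analyzer outputs.
score = brier_score([0.9, 0.1, 0.8, 0.3], [1, 0, 1, 0])
diagram = reliability_bins([0.12, 0.18, 0.95], [0, 1, 1])
```

A well-calibrated analyzer shows observed rates close to the mean predicted probability in every bin; a lower Brier score is better, with 0 being a perfect forecast.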
Geographic Threat Cartography
Geo-velocity analysis projected onto a real-time global threat map. Haversine great-circle distance computation detects impossible-travel violations, ASN concentration Herfindahl indices reveal proxy-farm clustering, and regional fraud density heatmaps expose geographic attack corridors.
Impossible-travel threshold: sessions requiring >900 km/h velocity between consecutive requests trigger IMPOSSIBLE_TRAVEL flags with geographic provenance chains.
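The impossible-travel check reduces to a great-circle distance divided by elapsed time. The sketch below uses the standard haversine formula with a mean Earth radius of 6371 km and the 900 km/h threshold stated above; the function names, the tuple layout, and the handling of non-positive time gaps are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def is_impossible_travel(prev, curr, max_kmh=900.0):
    """prev and curr are (lat, lon, unix_seconds) for consecutive requests."""
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]) / 3600.0
    if hours <= 0:
        # Assumption: any movement with no elapsed time counts as impossible.
        return distance > 0
    return distance / hours > max_kmh

# Usage: London, then New York one hour later.
london = (51.5074, -0.1278, 0)
new_york = (40.7128, -74.0060, 3600)
flagged = is_impossible_travel(london, new_york)
```

IP geolocation is coarse, so a production check would normally add a tolerance radius before flagging; that refinement is omitted here.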
Configurable Alert Policy Engine
Define multi-dimensional alert surfaces spanning anomaly detection thresholds, velocity spike magnitudes, consortium threat-signal propagation, and infrastructure health degradation. Route alerts via Slack, PagerDuty, email, or webhook — each enriched with full decision context, evidence IDs, and recommended remediation actions.
Trigger taxonomy: rate-based (requests/s), threshold-based (score > θ), pattern-based (regex on reason codes), and composite (Boolean combinations). Configurable cooldown periods prevent alert storms.
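The four trigger types and the cooldown behaviour can be sketched as composable predicates wrapped in a rule object. This is a minimal sketch, not the product's policy schema: the class name, the event field names (score, requests_per_s, reason_codes), and the default cooldown are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AlertRule:
    name: str
    predicate: Callable[[dict], bool]
    cooldown_s: float = 300.0
    _last_fired: float = field(default=float("-inf"), init=False, repr=False)

    def evaluate(self, event: dict, now: float) -> bool:
        """Fire if the predicate matches and the cooldown has elapsed."""
        if not self.predicate(event):
            return False
        if now - self._last_fired < self.cooldown_s:
            return False  # suppressed: still cooling down
        self._last_fired = now
        return True

def rate_above(max_rps):          # rate-based trigger
    return lambda e: e["requests_per_s"] > max_rps

def score_above(theta):           # threshold-based trigger
    return lambda e: e["score"] > theta

def reason_matches(regex):        # pattern-based trigger
    compiled = re.compile(regex)
    return lambda e: any(compiled.search(code) for code in e["reason_codes"])

def all_of(*preds):               # composite: Boolean AND
    return lambda e: all(p(e) for p in preds)

def any_of(*preds):               # composite: Boolean OR
    return lambda e: any(p(e) for p in preds)

# Usage: alert on high-scoring impossible-travel events, at most once a minute.
rule = AlertRule(
    name="impossible-travel-high-risk",
    predicate=all_of(score_above(0.8), reason_matches(r"^IMPOSSIBLE_")),
    cooldown_s=60.0,
)
```

Passing the current time into evaluate, rather than reading a clock inside it, keeps the cooldown logic deterministic and easy to test.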
Adaptive Threshold Optimisation
The real-time adaptation engine continuously recalibrates detection thresholds via Thompson Sampling (explore/exploit on the Beta-Bernoulli posterior) and multiplicative weight updates (Freund & Schapire, 1997) for adversarial robustness. Observe how the system autonomously shifts its decision boundary in response to evolving attack distributions.
Tracks precision, recall, F₁ score, Matthews correlation coefficient, and attack-pattern memory (up to 100 learned signatures with exponential decay weighting).
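The explore/exploit step on the Beta-Bernoulli posterior can be sketched as a bandit over a fixed set of candidate thresholds. This is a minimal sketch under stated assumptions: the class name and candidate thresholds are hypothetical, and "success" stands for whatever feedback signal marks a decision as correct (for example, a later confirmed label); the source does not specify it.

```python
import random

class ThresholdBandit:
    """Beta-Bernoulli Thompson Sampling over candidate decision thresholds."""

    def __init__(self, thresholds, seed=None):
        self.thresholds = list(thresholds)
        self.alpha = [1.0] * len(self.thresholds)  # Beta(1, 1) uniform prior
        self.beta = [1.0] * len(self.thresholds)
        self.rng = random.Random(seed)

    def choose(self):
        """Sample each arm's posterior and return the index of the best draw."""
        draws = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, success):
        """Conjugate update: a success increments alpha, a failure increments beta."""
        if success:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0

    def posterior_mean(self, arm):
        return self.alpha[arm] / (self.alpha[arm] + self.beta[arm])

# Usage: pick a threshold, observe feedback, update the posterior.
bandit = ThresholdBandit([0.5, 0.7, 0.9], seed=0)
arm = bandit.choose()
bandit.update(arm, success=True)
```

Arms with little data have wide posteriors and are still sampled occasionally, which is what lets the threshold track a shifting attack distribution.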
Historical Replay & Forensic Audit
Access the complete decision audit trail with cryptographically linked evidence IDs for post-incident forensics. Replay historical traffic against updated policy configurations to measure counterfactual impact before production deployment. Export compliance-grade reports for SOC 2 Type II, PCI DSS 4.0, and GDPR Article 22 audits.
Evidence chain includes: raw signal provenance, per-layer score computation logs, Bayesian fusion weight matrices, and post-decision pipeline actions with nanosecond timestamps.
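One common way to link evidence IDs cryptographically is a hash chain, in which each ID commits to the previous ID and the current record. The sketch below assumes that construction with SHA-256 over canonical JSON; the source does not state which scheme is used, and the function names, genesis value, and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical anchor for the first entry

def evidence_id(prev_id: str, record: dict) -> str:
    """SHA-256 over the previous ID plus a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_id.encode("ascii") + payload).hexdigest()

def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the last entry in the chain."""
    prev_id = chain[-1]["evidence_id"] if chain else GENESIS
    chain.append({
        "record": record,
        "prev_id": prev_id,
        "evidence_id": evidence_id(prev_id, record),
    })

def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_id = GENESIS
    for entry in chain:
        if entry["prev_id"] != prev_id:
            return False
        if entry["evidence_id"] != evidence_id(prev_id, entry["record"]):
            return False
        prev_id = entry["evidence_id"]
    return True

# Usage with hypothetical decision records.
chain = []
append(chain, {"verdict": "deny", "score": 0.97})
append(chain, {"verdict": "allow", "score": 0.04})
intact = verify(chain)
```

Sorting keys before hashing matters: without a canonical encoding, two semantically identical records could produce different IDs.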
Lead & Booking Management
Track advance booking requests and enterprise leads through a dedicated administrative workflow. Manage the qualification pipeline from initial signup through risk-profile segmentation and activation — with automatic scoring from the detection engine informing lead prioritisation.
Integrated with the decision pipeline: leads are automatically scored using the same 151 detection techniques across 12 analyzers, enabling risk-stratified onboarding workflows.
Command & Control Dashboard
The unified control surface provides real-time visibility into decision streams, per-layer detection health, pipeline latency percentiles, infrastructure subsystem status, and enterprise lead management — all correlated on a single temporal axis.
Infrastructure Health & Resilience Telemetry
Kubernetes-native health probes with component-level granularity and circuit-breaker integration. The health subsystem continuously monitors every infrastructure dependency — coordinating graceful degradation via the bulkhead pattern when individual components exceed pressure thresholds.
Kubernetes Probes
Status: Active. Liveness and readiness endpoints conforming to K8s probe contracts: TCP socket checks and HTTP GET, with configurable periodSeconds, failureThreshold, and initialDelaySeconds
KV Store Health
Status: Active. Edge KV store connectivity, replication lag, and p99 read/write latency monitoring with automatic failover to in-memory LRU cache on degradation
Detection Engine
Status: Active. All 12 analyzers (151 techniques) report operational status, last-execution timestamp, and Brier score calibration; unhealthy analyzers are circuit-broken automatically
API Key Sync
Status: Active. Postgres → KV synchronisation pipeline health with stale-while-revalidate semantics, conflict resolution via last-write-wins CRDT, and sync lag alerting
Database Pressure
Status: Active. Connection pool utilisation (active/idle/waiting), p50/p95/p99 query latency, and deadlock detection with automatic query cancellation on timeout
Graceful Shutdown
Status: Active. Coordinated SIGTERM handling with in-flight request draining, connection pool teardown, and final health checkpoint emission before process exit
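The automatic circuit-breaking of unhealthy components described above follows a well-known pattern: after repeated failures, stop calling the component and serve a fallback until a cool-off period has passed. The sketch below is a minimal, single-threaded illustration; the class name, thresholds, and fallback convention are hypothetical and not taken from the product.

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock            # injectable so tests can control time
        self.failures = 0
        self.opened_at = None         # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_after_s:
            return "half_open"        # cool-off elapsed: allow one trial call
        return "open"

    def call(self, fn, fallback):
        """Run fn unless the circuit is open; use fallback on failure or when open."""
        if self.state == "open":
            return fallback()
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        self.opened_at = None         # a success closes the circuit
        return result

# Usage: wrap a hypothetical analyzer call with a neutral fallback score.
breaker = CircuitBreaker(failure_threshold=3, reset_after_s=30.0)
score = breaker.call(lambda: 0.42, fallback=lambda: 0.0)
```

Giving each dependency its own breaker instance is what keeps a failure in one component from exhausting resources shared with the others, which is the intent of the bulkhead pattern.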
Decision Explainability Engine
Every fraud verdict includes a complete causal explanation chain, from raw signal provenance through per-layer scoring to the final Bayesian posterior. The explainer engine transforms 20+ machine-readable reason codes into human-interpretable narratives with severity classifications, domain categories, and recommended response actions, satisfying GDPR Article 22 and CCPA automated decision-making transparency requirements.
reason_code | title | severity | category | recommendation
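The five columns above map naturally onto a small record type and a lookup table. The sketch below is illustrative only: the reason codes IMPOSSIBLE_TRAVEL and FINGERPRINT_DRIFT appear elsewhere on this page, but the titles, severities, categories, recommendations, and the severity ordering are hypothetical placeholders, not the product's catalogue.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Explanation:
    reason_code: str
    title: str
    severity: str
    category: str
    recommendation: str

# Hypothetical catalogue entries for illustration.
CATALOG = {
    "IMPOSSIBLE_TRAVEL": Explanation(
        "IMPOSSIBLE_TRAVEL", "Implausible travel speed between requests",
        "medium", "geographic", "Challenge with step-up authentication",
    ),
    "FINGERPRINT_DRIFT": Explanation(
        "FINGERPRINT_DRIFT", "Device fingerprint changed mid-session",
        "high", "device", "Terminate the session and require re-login",
    ),
}

SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def explain(reason_codes):
    """Map reason codes to explanations, most severe first.

    Unknown codes are kept (not dropped) with a generic low-severity entry.
    """
    entries = [
        CATALOG.get(code) or Explanation(
            code, "Unrecognised reason code", "low", "unknown", "Review manually"
        )
        for code in reason_codes
    ]
    return sorted(entries, key=lambda e: SEVERITY_ORDER[e.severity], reverse=True)

# Usage
for item in explain(["IMPOSSIBLE_TRAVEL", "FINGERPRINT_DRIFT"]):
    print(f"[{item.severity}] {item.title}: {item.recommendation}")
```

Keeping unknown codes in the output, rather than silently discarding them, preserves the completeness of the explanation chain when new codes are introduced.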