Six SQL Patterns for Scalable Transaction Fraud Detection

Six SQL patterns I use to catch transaction fraud

Program Integrity Analyst Fixel Smith favors standard SQL over complex machine learning models for identifying high-risk anomalies in movement-of-money logs. One critical signal is impossible travel: two transactions on the same card in locations so far apart that covering the distance in the time between them would require moving faster than a commercial jet’s 600 mph cruise speed.

Why This Matters

While current industry trends emphasize graph databases and machine learning, SQL remains in practice the most efficient tool for program integrity teams to identify fraud shapes. Complex models often stretch the iteration loop to weeks, whereas SQL window functions let analysts test and deploy a new fraud hypothesis in hours. Static thresholds frequently fail in production because of seasonality and differences in merchant size, so rolling baselines such as 168-hour trailing averages are needed to reduce false positives and avoid blocking legitimate transactions. Mishandling sentinel values like ‘9999-12-31’, or running window functions over unfiltered datasets, can waste significant warehouse credits and cause missed signals.
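As a rough illustration of the last point, the sketch below trims the data before any window function runs. It reuses the transactions table and column names from the examples in this article. Treating ‘9999-12-31’ as a placeholder in the timestamp column is an assumption made for illustration, since the article does not say which column carries the sentinel.

```sql
-- Sketch only: filter first, then window.
-- Assumption: '9999-12-31' is a placeholder value in the timestamp column.
WITH recent_tx AS (
    SELECT cardholder_id, timestamp
    FROM transactions
    WHERE timestamp >= current_date - INTERVAL '30 days'  -- bound the scan
      AND timestamp < DATE '9999-12-31'                   -- drop sentinel rows
)
SELECT
    cardholder_id,
    timestamp,
    -- The window now sorts and scans only the trimmed rows.
    LAG(timestamp) OVER (
        PARTITION BY cardholder_id
        ORDER BY timestamp
    ) AS prev_ts
FROM recent_tx;
```

Without the sentinel filter, a placeholder row would sort last in its partition and produce a gap of thousands of years, which can distort any time-difference signal built on LAG.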

Key Insights

Impossible travel detection uses the Haversine distance function to catch cloned cards: two uses of the same card are flagged when the implied travel speed between them exceeds 600 mph.
Velocity patterns use sliding windows (1-minute, 5-minute, and 1-hour) to distinguish between rapid card-testing server hits and slower benefit-trafficking rings.
Amount anomalies target round values like $1.00 for card testing and values just under thresholds, such as $99.99 or $499.99, to evade ID checks or ATM caps.
Suspicious merchant detection requires a 168-hour rolling average to account for weekly seasonality, flagging spikes three times higher than the baseline.
Off-hours analysis establishes a 90-day behavioral baseline for cardholders, requiring at least two purchases in a specific hour to qualify it as ‘normal’ behavior.
Window function primitives like LAG and ROW_NUMBER enable chained signals that allow analysts to filter complex fraud rings using simple Boolean expressions.
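The off-hours baseline described above could be sketched roughly as follows. This is one possible reading, not the author's exact query: it assumes the same transactions table used elsewhere in this article, scores the most recent day, and treats an hour of day as ‘normal’ for a cardholder once it has at least two purchases in the preceding 90 days.

```sql
WITH history AS (
    -- 90-day baseline: purchases per cardholder per hour of day
    SELECT
        cardholder_id,
        EXTRACT(HOUR FROM timestamp) AS hour_of_day,
        count(*) AS purchases
    FROM transactions
    WHERE timestamp >= current_date - INTERVAL '91 days'
      AND timestamp <  current_date - INTERVAL '1 day'
    GROUP BY 1, 2
),
normal_hours AS (
    -- An hour qualifies as normal after at least two purchases
    SELECT cardholder_id, hour_of_day
    FROM history
    WHERE purchases >= 2
)
SELECT t.cardholder_id, t.timestamp
FROM transactions AS t
LEFT JOIN normal_hours AS n
       ON n.cardholder_id = t.cardholder_id
      AND n.hour_of_day = EXTRACT(HOUR FROM t.timestamp)
WHERE t.timestamp >= current_date - INTERVAL '1 day'  -- rows being scored
  AND n.cardholder_id IS NULL;                        -- hour never seen as normal
```

As written, a cardholder with little or no history has no normal hours at all, so every one of their transactions is returned. In practice such accounts would probably need to be excluded or handled separately.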

Working Examples

Basic velocity check using hour buckets and count thresholds.

SELECT
    cardholder_id,
    date_trunc('hour', timestamp) AS hour_bucket,
    count(*) AS tx_count,
    min(timestamp) AS first_tx,
    max(timestamp) AS last_tx
FROM transactions
WHERE timestamp >= current_date - INTERVAL '30 days'
GROUP BY 1, 2
HAVING count(*) > 10;

Sliding-window velocity using the QUALIFY clause for high-frequency detection.

SELECT
    cardholder_id,
    timestamp,
    count(*) OVER (
        PARTITION BY cardholder_id
        ORDER BY timestamp
        RANGE BETWEEN INTERVAL '5 minutes' PRECEDING AND CURRENT ROW
    ) AS tx_in_last_5min
FROM transactions
QUALIFY tx_in_last_5min >= 5
ORDER BY cardholder_id, timestamp;

Impossible travel logic using Haversine distance and a 600 mph velocity threshold.

-- haversine() is assumed to be a user-defined function returning miles,
-- so miles / elapsed seconds * 3600 gives miles per hour.
WITH ordered_tx AS (
    SELECT
        cardholder_id,
        timestamp,
        location,
        LAG(timestamp) OVER (PARTITION BY cardholder_id ORDER BY timestamp) AS prev_ts,
        LAG(location) OVER (PARTITION BY cardholder_id ORDER BY timestamp) AS prev_loc
    FROM transactions
)
SELECT
    cardholder_id,
    prev_ts,
    timestamp,
    -- nullif avoids division by zero when two transactions share a timestamp
    haversine(prev_loc, location)
        / nullif(EXTRACT(EPOCH FROM (timestamp - prev_ts)), 0) * 3600 AS mph
FROM ordered_tx
WHERE prev_ts IS NOT NULL
  AND haversine(prev_loc, location)
        / nullif(EXTRACT(EPOCH FROM (timestamp - prev_ts)), 0) * 3600 > 600;
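The query above calls haversine() without defining it. Where no such function is available, the distance can be written inline. The sketch below assumes hypothetical lat, lon, prev_lat and prev_lon columns in decimal degrees and a hypothetical ordered_tx_with_coords relation. None of these names come from the original article.

```sql
-- Great-circle (Haversine) distance in miles between consecutive points.
-- Assumed columns: prev_lat, prev_lon, lat, lon, all in decimal degrees.
-- 3958.8 is the Earth's mean radius in miles.
SELECT
    cardholder_id,
    2 * 3958.8 * asin(least(1.0, sqrt(
        power(sin(radians(lat - prev_lat) / 2), 2)
        + cos(radians(prev_lat)) * cos(radians(lat))
          * power(sin(radians(lon - prev_lon) / 2), 2)
    ))) AS miles
FROM ordered_tx_with_coords;
```

The least(1.0, ...) guard keeps floating-point rounding from pushing the asin argument slightly above 1 for nearly antipodal points.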

Practical Applications

Credit card issuers use velocity thresholds to block stolen cards that are being drained. Pitfall: failing to whitelist high-volume legitimate users, such as vending machine operators, leads to customer friction.
Public-sector benefit programs use off-hours analysis to flag 3am transactions for users with 9-to-5 habits. Pitfall: applying this to new accounts without a 90-day history results in unreliable alerts.
E-commerce platforms identify card-testing rings by monitoring for round dollar amounts like $1.00. Pitfall: static merchant thresholds fail to account for size, since a Costco naturally processes more volume than a bookshop.
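One way around the static-threshold pitfall is to compare each merchant against its own trailing week. The sketch below is an illustration rather than the author's exact query: it assumes a merchant_id column on the transactions table (not shown in the original examples) and a dialect that accepts interval bounds in RANGE frames.

```sql
WITH hourly AS (
    -- Transactions per merchant per hour
    SELECT
        merchant_id,
        date_trunc('hour', timestamp) AS hour_bucket,
        count(*) AS tx_count
    FROM transactions
    WHERE timestamp >= current_date - INTERVAL '30 days'
    GROUP BY 1, 2
),
with_baseline AS (
    SELECT
        merchant_id,
        hour_bucket,
        tx_count,
        -- Average over the previous 168 hours, excluding the current hour
        avg(tx_count) OVER (
            PARTITION BY merchant_id
            ORDER BY hour_bucket
            RANGE BETWEEN INTERVAL '168 hours' PRECEDING
                      AND INTERVAL '1 hour' PRECEDING
        ) AS baseline_168h
    FROM hourly
)
SELECT merchant_id, hour_bucket, tx_count, baseline_168h
FROM with_baseline
WHERE tx_count > 3 * baseline_168h;  -- spike: more than 3x the merchant's own baseline
```

Because each merchant is measured against its own history, a large retailer and a small shop can share the same 3x rule. Note that hours with no transactions produce no row in hourly, so the baseline averages active hours only. Filling the gaps with zero-count rows would give a stricter average.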

References:

https://dev.to/fixelsmith/six-sql-patterns-i-use-to-catch-transaction-fraud-coc
