codewithdaniel1/AML_transaction_monitoring

AML Alert Prioritization and Risk Scoring System

Overview

This project implements a two stage, risk based Anti Money Laundering monitoring framework that mirrors how financial institutions design and operate transaction monitoring programs. The system combines scenario based detection with model driven alert prioritization to improve investigator efficiency while maintaining interpretability and regulatory alignment.

The objective is not to automate suspicious activity reporting, but to prioritize alerts for human review in accordance with common AML regulatory expectations.


System Architecture

The framework is structured into two primary stages.

Stage 1: Scenario Based Transaction Monitoring

Stage 1 focuses on coverage and detection. Transaction level rules are applied to identify behaviors associated with known money laundering typologies.

Key characteristics:

  • Rolling time window feature engineering
  • Scenario rules aligned to CAMS typologies
  • Alerts generated at the account day level
  • Emphasis on interpretability and explainability

Example scenarios include:

  • High transaction velocity
  • Cross border activity bursts
  • Cash intensive behavior
  • Rapid transaction patterns

An alert is generated when one or more scenario rules are triggered for an account on a given day.
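The Stage 1 logic above can be sketched in plain Python. This is a minimal illustration, not the notebook's implementation: the field names (`account`, `day`, `amount`, `is_cash`), the window length, and every threshold are assumptions chosen for the example, and only two of the four scenarios are shown.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical thresholds for illustration; real values come from scenario tuning.
WINDOW_DAYS = 3
VELOCITY_MIN_TXNS = 5
CASH_MIN_TOTAL = 1000.0
CASH_MIN_SHARE = 0.8


def generate_alerts(transactions):
    """Map (account, day) to the list of scenarios triggered on that account day."""
    by_account = defaultdict(list)
    for t in transactions:
        by_account[t["account"]].append(t)

    alerts = {}
    for account, txns in by_account.items():
        for day in sorted({t["day"] for t in txns}):
            # Rolling window: the current day plus the preceding WINDOW_DAYS - 1 days.
            start = day - timedelta(days=WINDOW_DAYS - 1)
            window = [t for t in txns if start <= t["day"] <= day]
            total = sum(t["amount"] for t in window)
            cash = sum(t["amount"] for t in window if t["is_cash"])

            triggered = []
            if len(window) >= VELOCITY_MIN_TXNS:
                triggered.append("high_velocity")
            if cash >= CASH_MIN_TOTAL and cash / total >= CASH_MIN_SHARE:
                triggered.append("cash_intensive")
            if triggered:  # one alert per account day with at least one rule hit
                alerts[(account, day)] = triggered
    return alerts


def make_txn(account, day, amount, is_cash=False):
    """Build a toy transaction dated in January 2024 (illustrative data only)."""
    return {"account": account, "day": date(2024, 1, day),
            "amount": amount, "is_cash": is_cash}


sample = [
    make_txn("A", 1, 100.0), make_txn("A", 1, 100.0),
    make_txn("A", 2, 100.0), make_txn("A", 2, 100.0),
    make_txn("A", 3, 100.0),
    make_txn("B", 1, 1500.0, is_cash=True),
]
alerts = generate_alerts(sample)
```

In this toy data, account A only crosses the velocity threshold on its third day, once the rolling window holds five transactions, while account B triggers the cash rule on a single large cash transaction.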


Stage 2: Alert Risk Scoring and Prioritization

Stage 2 focuses on efficiency and prioritization. Alerts generated in Stage 1 are enriched and ranked to determine which alerts should be reviewed first by investigators.

Stage 2 is divided into three sub steps.

Stage 2A: Alert Aggregation and Labeling

  • Transactions are grouped into alerts by account and day
  • An alert is labeled positive if any underlying transaction is labeled as laundering
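The labeling rule above amounts to an "any" aggregation over each alert's transactions. A minimal sketch, assuming each transaction carries hypothetical `account`, `day`, and `is_laundering` fields and that an alert is keyed by account and day:

```python
from collections import defaultdict


def label_alerts(transactions):
    """Label each (account, day) alert 1 if any of its transactions is laundering."""
    labels = defaultdict(int)
    for t in transactions:
        key = (t["account"], t["day"])
        # max() implements "any": a single positive transaction flips the alert to 1.
        labels[key] = max(labels[key], int(t["is_laundering"]))
    return dict(labels)


# Illustrative data only.
labeled = label_alerts([
    {"account": "A", "day": "2024-01-03", "is_laundering": 0},
    {"account": "A", "day": "2024-01-03", "is_laundering": 1},
    {"account": "B", "day": "2024-01-01", "is_laundering": 0},
])
```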

Stage 2B: Alert Enrichment and Typology Attribution

  • Alerts are enriched with a dominant laundering typology for analysis
  • Typology labels are used only for validation and monitoring
  • Typology information is intentionally excluded from model features to avoid leakage
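One way to read "dominant typology" is the most frequent typology label among an alert's transactions; that reading, the typology names, and the field names below are assumptions made for this sketch. The second helper shows the leakage guard: anything derived from the ground truth is dropped before features reach the model.

```python
from collections import Counter

# Fields that must never reach the model (hypothetical names).
NON_FEATURE_FIELDS = {"alert_id", "label", "dominant_typology"}


def dominant_typology(typologies):
    """Return the most frequent non-empty typology label, or None if there is none."""
    counts = Counter(t for t in typologies if t)
    return counts.most_common(1)[0][0] if counts else None


def model_features(alert):
    """Keep only the fields that are safe to use as model inputs."""
    return {k: v for k, v in alert.items() if k not in NON_FEATURE_FIELDS}


# Illustrative alert; the typology names are placeholders.
alert = {
    "alert_id": "A|2024-01-03",
    "label": 1,
    "dominant_typology": dominant_typology(["structuring", "", "structuring", "fan_out"]),
    "txn_count": 5,
    "cash_share": 0.1,
}
features = model_features(alert)
```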

Stage 2C: Alert Scoring Model

  • Supervised models are trained to estimate alert risk
  • Logistic regression is used as an interpretable baseline
  • Gradient boosted trees are used for improved ranking performance
  • Alerts are ranked by risk score for investigation
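To make the scoring and ranking step concrete, the sketch below hand-rolls a tiny logistic regression with batch gradient descent and ranks two unseen alerts by predicted risk. It is deliberately dependency-free and is not the notebook's implementation; in practice a library model would be used, and the gradient boosted model is not shown. The two features (a scaled transaction count and a cash share) and all data are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Estimated probability that an alert with features x is positive."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_alerts(alerts, w, b):
    """Return (alert_id, score) pairs sorted from highest to lowest risk."""
    scored = [(alert_id, predict(x, w, b)) for alert_id, x in alerts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy training data: [scaled_txn_count, cash_share] with alert labels.
X = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.3, 0.2], [0.7, 0.6]]
y = [0, 0, 1, 1, 0, 1]
w, b = fit_logistic(X, y)

ranked = rank_alerts({"a1": [0.85, 0.9], "a2": [0.15, 0.1]}, w, b)
```

The learned weights double as a simple explanation of the baseline: a larger positive weight means the feature pushes an alert further up the queue.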

End to End Monitoring Flow

Raw Transactions
    |
    v
Transaction Feature Engineering
    - Rolling counts and amounts
    - Velocity and frequency metrics
    - Cross border and cash indicators
    |
    v
Stage 1: Scenario Based Monitoring
    - High transaction velocity
    - Cross border activity bursts
    - Cash intensive behavior
    - Rapid transaction patterns
    |
    v
Alert Generation
    - Alerts created at account day level
    |
    v
Stage 2A and 2B: Alert Aggregation and Enrichment
    - Transaction aggregation
    - Alert labeling using laundering outcomes
    - Typology attribution for validation
    |
    v
Stage 2C: Alert Risk Scoring
    - Interpretable baseline model
    - Gradient boosted ranking model
    |
    v
Ranked Alerts
    - Risk scores
    - Investigator reason codes
    |
    v
Investigator Review and Decisioning

This flow illustrates how scenario based detection and model driven prioritization work together to support risk based AML investigations.


Dataset Access

The dataset is not included in this repository due to GitHub file size limits. It can be downloaded from Kaggle and placed in the data/ directory as saml_d.csv.

Key fields include:

  • Sender and receiver account identifiers
  • Transaction amount, currency, and payment type
  • Sender and receiver bank locations
  • Ground truth laundering indicator
  • Laundering typology labels
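A minimal loading sketch, assuming the file has been placed at data/saml_d.csv as described above. It reads whatever header the CSV provides rather than hard-coding column names, since the exact names are not listed here; the function names are hypothetical.

```python
import csv
from pathlib import Path

DATA_PATH = Path("data") / "saml_d.csv"


def iter_transactions(lines):
    """Yield one dict per CSV row, keyed by the names found in the header row."""
    yield from csv.DictReader(lines)


def load_transactions(path=DATA_PATH):
    """Read the whole file into memory; fine for a sketch, heavy for a large dataset."""
    if not path.exists():
        raise FileNotFoundError(f"Expected the Kaggle download at {path}")
    with path.open(newline="", encoding="utf-8") as handle:
        return list(iter_transactions(handle))


# In-memory demonstration with placeholder column names.
preview = list(iter_transactions(["col_a,col_b", "1,2"]))
```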

Evaluation Approach

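Only about 1 percent of alerts are positive, so a model that scores every alert as low risk would still be roughly 99 percent accurate. Model performance is therefore evaluated using AML relevant metrics rather than accuracy alone.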

Primary evaluation criteria:

  • Precision at the top 1 percent, 5 percent, and 10 percent of alerts
  • Concentration of true positives within investigator capacity
  • Stability across time based train test splits
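The first two criteria can be computed directly from scores and labels. The sketch below is one possible formulation, with hypothetical function names and toy data: precision in the top fraction of alerts, and the share of all true positives captured within that fraction.

```python
import math


def top_labels(scores, labels, fraction):
    """Labels of the top `fraction` of alerts when sorted by descending score."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return [label for _, label in ranked[:k]]


def precision_at_top(scores, labels, fraction):
    """Share of reviewed alerts that are true positives."""
    top = top_labels(scores, labels, fraction)
    return sum(top) / len(top)


def capture_rate(scores, labels, fraction):
    """Share of all true positives that fall inside the reviewed slice."""
    total = sum(labels)
    return sum(top_labels(scores, labels, fraction)) / total if total else 0.0


# Toy data: ten alerts, two of them positive.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

p_top10 = precision_at_top(scores, labels, 0.10)
p_top50 = precision_at_top(scores, labels, 0.50)
c_top50 = capture_rate(scores, labels, 0.50)
```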

This evaluation reflects how alert scoring models are assessed in production AML environments.
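A time based split keeps every training alert strictly earlier than every test alert, so the model is never evaluated on a period it has already seen. A minimal sketch, assuming each alert carries a hypothetical `day` field and using an arbitrary cutoff:

```python
from datetime import date


def time_based_split(alerts, cutoff):
    """Train on alerts before `cutoff`, test on alerts on or after it."""
    train = [a for a in alerts if a["day"] < cutoff]
    test = [a for a in alerts if a["day"] >= cutoff]
    return train, test


# Illustrative alerts only.
all_alerts = [
    {"id": 1, "day": date(2024, 1, 5)},
    {"id": 2, "day": date(2024, 2, 10)},
    {"id": 3, "day": date(2024, 3, 1)},
    {"id": 4, "day": date(2024, 3, 20)},
]
train, test = time_based_split(all_alerts, cutoff=date(2024, 3, 1))
```

Repeating the split with several cutoffs and comparing the ranking metrics across them is one way to check stability over time.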


Explainability and Reason Codes

To support investigator decision making, each alert is accompanied by reason codes derived from the most extreme contributing risk factors. These reason codes highlight behaviors such as elevated transaction volume, cross border activity, or cash intensity.

This approach supports transparency and aligns with expectations for explainable AML systems.
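How "most extreme" is measured is not specified here. One simple option, used in the sketch below purely as an assumption, is to standardize each feature against population statistics and report the features with the largest positive z-scores. Feature names, reason texts, and statistics are all illustrative.

```python
# Hypothetical mapping from feature name to investigator-facing text.
REASON_TEXT = {
    "txn_count_7d": "Elevated transaction volume",
    "cross_border_share": "Cross border activity",
    "cash_share": "Cash intensity",
}


def reason_codes(features, population, top_n=2):
    """Return reason texts for the top_n features furthest above their population mean."""
    z_scores = {}
    for name, value in features.items():
        mean, std = population[name]
        z_scores[name] = (value - mean) / std if std > 0 else 0.0
    ranked = sorted(z_scores, key=z_scores.get, reverse=True)
    # Only features above their mean count as contributing risk factors.
    return [REASON_TEXT[name] for name in ranked[:top_n] if z_scores[name] > 0]


# (mean, standard deviation) per feature; illustrative values.
population = {
    "txn_count_7d": (10.0, 5.0),
    "cross_border_share": (0.1, 0.1),
    "cash_share": (0.2, 0.2),
}
codes = reason_codes(
    {"txn_count_7d": 40.0, "cross_border_share": 0.15, "cash_share": 0.8},
    population,
)
```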


Governance Considerations and Limitations

  • The system is designed for alert prioritization, not automated decisioning
  • Scenario thresholds require periodic tuning as behavior evolves
  • Model performance should be monitored for drift and stability
  • All alerts require human review and investigator judgment

Project Structure

risk_based_aml_monitoring/
├── aml_alert_scoring.ipynb
├── README.md
└── data/
    └── saml_d.csv

Results Summary

The two stage framework demonstrates strong alert prioritization performance in a highly imbalanced AML setting.

Key outcomes:

  • Alert positive rate after scenario tuning: ~1 percent
  • Gradient boosted model outperforms baseline logistic regression in ranking risk
  • Meaningful concentration of true positives within the top reviewed alerts
  • Investigator capacity simulation shows improved yield per day compared to random review

These results illustrate how combining scenario based monitoring with model driven prioritization can significantly improve investigative efficiency.
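The capacity comparison can be framed as follows: each day investigators review a fixed number of alerts, either the highest scored ones or a random sample. The sketch below uses toy data and hypothetical field names, and does not reproduce the project's results. It counts true positives found under score ordering and the expected count under random review, which is the number reviewed times that day's positive rate.

```python
from collections import defaultdict


def daily_yield(alerts, capacity):
    """Return (hits with score ordering, expected hits with random review)."""
    by_day = defaultdict(list)
    for a in alerts:
        by_day[a["day"]].append(a)

    model_hits = 0
    random_hits = 0.0
    for day_alerts in by_day.values():
        reviewed = sorted(day_alerts, key=lambda a: a["score"], reverse=True)[:capacity]
        model_hits += sum(a["label"] for a in reviewed)
        positive_rate = sum(a["label"] for a in day_alerts) / len(day_alerts)
        random_hits += len(reviewed) * positive_rate
    return model_hits, random_hits


# Toy data: two days, four alerts per day, one positive per day.
toy_alerts = [
    {"day": 1, "score": 0.9, "label": 1}, {"day": 1, "score": 0.5, "label": 0},
    {"day": 1, "score": 0.4, "label": 0}, {"day": 1, "score": 0.1, "label": 0},
    {"day": 2, "score": 0.8, "label": 0}, {"day": 2, "score": 0.7, "label": 1},
    {"day": 2, "score": 0.3, "label": 0}, {"day": 2, "score": 0.2, "label": 0},
]
model_hits, random_hits = daily_yield(toy_alerts, capacity=2)
```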


Key Takeaway

This project demonstrates how rule based monitoring and machine learning can be combined to create a practical, regulator friendly AML alert prioritization system that improves efficiency without sacrificing interpretability.
