This project demonstrates financial fraud detection using Google Cloud graph capabilities, specifically Spanner Graph and BigQuery Graph, combined with the visual analytics of Kineviz GraphXR. We transform PaySim transaction data into a property graph for analysis.
We build upon synthetic data generated by PaySim, the Mobile Money Payment Simulator. PaySim, originally created by Dr. Edgar Lopez-Rojas (http://edgarlopez.net), simulates authentic transaction behavior observed on a mobile money platform. The platform enables users to transfer funds between the electronic wallets on their mobile phones.
Specifically, we use PaySim 2, a version developed by David Voutila. The simulation is run after configuring parameters such as the number of steps, clients, merchants, and banks, and the probabilities of various activities.
# Clone this repository
git clone git@github.com:Kineviz/paysim.git
cd paysim
# Install and set up uv
pip install uv
uv venv --python=python3.11
.venv\Scripts\activate # Windows; Linux/Mac: source .venv/bin/activate
# Install dependencies
uv pip install pandas google-cloud-bigquery google-cloud-spanner pandas-gbq db-dtypes python-dotenv
# Prepare data to be loaded to Spanner or BigQuery
uv run src/prepare_data.py
Requirements: Python 3.11, GCP credentials, and the CSVs generated by the PaySim simulator (data/raw/transactions.csv, data/raw/clients.csv, data/raw/merchants.csv)
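Before running the preparation step, it can help to confirm that the required PaySim CSVs are in place. The following is a minimal sketch using only the standard library; the helper name `missing_inputs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Input files named in the requirements above.
REQUIRED_CSVS = (
    "data/raw/transactions.csv",
    "data/raw/clients.csv",
    "data/raw/merchants.csv",
)


def missing_inputs(root="."):
    """Return the required CSV paths that do not exist under `root`."""
    return [p for p in REQUIRED_CSVS if not (Path(root) / p).is_file()]


# Check relative to the current working directory (assumed to be the repo root).
missing = missing_inputs()
if missing:
    print("Missing PaySim inputs:", ", ".join(missing))
else:
    print("All PaySim inputs found.")
```

Run it from the repository root so the relative paths resolve as expected.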
See Data Preparation Pipeline for details. The pipeline generates:
- Entity nodes: Banks, Emails, PhoneNumbers, SSNs
- PII relationships: Client → Email/Phone/SSN
- Transaction relationships: Client ↔ Transaction ↔ Merchant/Bank
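As a rough illustration of how the PII entity nodes and relationships listed above could be derived from client records, here is a minimal standard-library sketch. The input column names (`id`, `email`, `phone`, `ssn`), the choice to reuse the raw value as both `id` and `name`, and the helper `build_pii_graph` are all assumptions for illustration; the actual logic lives in `src/prepare_data.py` and may differ.

```python
# Hypothetical mapping: client column -> (node label, edge label).
PII_COLUMNS = {
    "email": ("Email", "HAS_EMAIL"),
    "phone": ("PhoneNumber", "HAS_PHONE"),
    "ssn": ("SSN", "HAS_SSN"),
}


def build_pii_graph(clients):
    """Split client rows into deduplicated PII nodes and Client -> PII edges."""
    nodes = {label: {} for label, _ in PII_COLUMNS.values()}
    edges = []
    for row in clients:
        for column, (node_label, edge_label) in PII_COLUMNS.items():
            value = row.get(column)
            if not value:
                continue  # skip clients with no value for this identifier
            # One node per distinct value, so shared identifiers collapse together.
            nodes[node_label].setdefault(value, {"id": value, "name": value})
            edges.append({"label": edge_label, "source": row["id"], "target": value})
    return {label: list(by_id.values()) for label, by_id in nodes.items()}, edges


# Made-up sample rows: two clients sharing one phone number.
clients = [
    {"id": "c1", "email": "a@example.com", "phone": "555-0100", "ssn": "000-00-0001"},
    {"id": "c2", "email": "b@example.com", "phone": "555-0100", "ssn": ""},
]
nodes, edges = build_pii_graph(clients)
print(len(nodes["PhoneNumber"]), "phone node(s),", len(edges), "edge(s)")
```

Deduplicating identifier values into single nodes is what lets two clients that share a phone number or SSN become connected through that node in the graph.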
- Spanner with Schema
- Spanner Schemaless (Note: uses lowercase labels)
- BigQuery
Once the data is loaded into Spanner or BigQuery and you have run the DDL to define a graph, you can connect GraphXR Explorer to it by following the instructions on Google Cloud Marketplace, or create an account on https://graphxr.kineviz.com/ and connect to your instance.
graph LR
Client["<b>Client</b><br/>id, name, isfraud"]
Transaction["<b>Transaction</b><br/>id, amount, timestamp<br/>action, globalstep<br/>isfraud, isflaggedfraud<br/>typedest, typeorig"]
Merchant["<b>Merchant</b><br/>id, name, highrisk"]
Bank["<b>Bank</b><br/>id, name"]
Email["<b>Email</b><br/>id, name"]
PhoneNumber["<b>PhoneNumber</b><br/>id, name"]
SSN["<b>SSN</b><br/>id, name"]
Client -->|PERFORMS| Transaction
Transaction -->|TO_CLIENT| Client
Transaction -->|TO_MERCHANT| Merchant
Transaction -->|TO_BANK| Bank
Client -->|HAS_EMAIL| Email
Client -->|HAS_PHONE| PhoneNumber
Client -->|HAS_SSN| SSN
style Client fill:#e1f5ff
style Transaction fill:#fff3e0
style Merchant fill:#f3e5f5
style Bank fill:#e8f5e9
style Email fill:#fce4ec
style PhoneNumber fill:#fce4ec
style SSN fill:#fce4ec
Nodes: Client, Transaction, Merchant, Bank, Email, PhoneNumber, SSN
Edges: PERFORMS, TO_CLIENT, TO_MERCHANT, TO_BANK, HAS_EMAIL, HAS_PHONE, HAS_SSN
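The node and edge labels above could be declared as a property graph with DDL along the following lines. This sketch only builds a `CREATE PROPERTY GRAPH` statement as a string. It assumes one table per node label and one table per edge label, an `id` key on each node table, and hypothetical edge key columns `source_id` and `target_id`. The graph name `PaySimGraph` is also an assumption. Check the names against the schema your loader actually creates before running the statement.

```python
# Node labels from the schema above; assumed to match the table names.
NODE_TABLES = ["Client", "Transaction", "Merchant", "Bank", "Email", "PhoneNumber", "SSN"]

# (edge label, source node, destination node), following the diagram's arrows.
EDGES = [
    ("PERFORMS", "Client", "Transaction"),
    ("TO_CLIENT", "Transaction", "Client"),
    ("TO_MERCHANT", "Transaction", "Merchant"),
    ("TO_BANK", "Transaction", "Bank"),
    ("HAS_EMAIL", "Client", "Email"),
    ("HAS_PHONE", "Client", "PhoneNumber"),
    ("HAS_SSN", "Client", "SSN"),
]


def property_graph_ddl(graph_name, source_key="source_id", dest_key="target_id"):
    """Build a CREATE PROPERTY GRAPH statement; key column names are hypothetical."""
    edge_defs = [
        f"    {label}\n"
        f"      SOURCE KEY ({source_key}) REFERENCES {src} (id)\n"
        f"      DESTINATION KEY ({dest_key}) REFERENCES {dst} (id)\n"
        f"      LABEL {label}"
        for label, src, dst in EDGES
    ]
    return (
        f"CREATE PROPERTY GRAPH {graph_name}\n"
        f"  NODE TABLES ({', '.join(NODE_TABLES)})\n"
        "  EDGE TABLES (\n" + ",\n".join(edge_defs) + "\n  )"
    )


ddl = property_graph_ddl("PaySimGraph")
print(ddl)
```

Generating the statement from the label lists keeps the DDL in sync with the schema description, but the schemaless Spanner variant (which uses lowercase labels) would need a different definition.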