This is the code for Detecting Suspicious Activity in the NFT Ecosystem using Temporal Graph Analysis.
The code can be run under any environment with Python 3.12 and above. (It may run with lower versions, but we have not tested it).
Clone this repo:
git clone https://github.com/slitiWassim/NFT-Suspicious-Activity.git
cd NFT-Suspicious-Activity/
Install the required packages:
pip install -r requirements.txt
A dataset is a directory with the following structure:
$ tree data
NFTs_Dataset
├── mapping
│ ├── nft_id_mapping
│ └── wallet_id_mapping
│
├── collections.csv
└── opensea_nft_transactions.parquet
| Descriptions | Statistics |
|---|---|
| Start date (dd-mm-yyyy, UTC) | 23-06-2017 21:05 |
| End date (dd-mm-yyyy, UTC) | 22-12-2023 19:06 |
| Number of NFT collections | 1,746,379 |
| Number of NFT tokens | 41,292,572 |
| Number of account addresses | 7,062,831 |
| Number of transactions | 76,300,244 |
| Chains | 10 |
In this study, we conducted a temporal cycle-driven analysis to identify groups of interconnected traders, and then examined the rhythm, ordering, and frequency of their transactions within these cycles to uncover patterns that deviated from normal market activity.
Illustrative example of temporal cycle extraction from a temporal graph.
- ($\textbf{a}$) An example of a temporal cycle $(a \rightarrow b \rightarrow c \rightarrow d \rightarrow a)$.
- ($\textbf{b}$) Temporal graph with edges annotated by their corresponding timestamps.
- ($\textbf{c}$) A valid temporal cycle $(a \rightarrow b \rightarrow c \rightarrow d \rightarrow a)$ instance within the temporal graph, with duration $\delta = 9$ and length $L = 4$.
- ($\textbf{d}$) A structurally valid directed cycle that fails to satisfy temporal ordering, and therefore does not qualify as a temporal cycle.
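The definition illustrated above can be made concrete with a small check. This is a minimal sketch, not the repository's implementation: the function name `is_temporal_cycle`, the `(source, target, timestamp)` edge tuples, and the timestamps in the example are all hypothetical, chosen only so that the valid cycle has duration 9 and length 4 as in panel (c).

```python
from typing import List, Tuple

# Hypothetical edge representation: (source, target, timestamp).
Edge = Tuple[str, str, int]


def is_temporal_cycle(edges: List[Edge], max_duration: int, max_length: int) -> bool:
    """Return True if `edges` form a temporal cycle within the given bounds."""
    if not edges or len(edges) > max_length:
        return False
    # Structural check: each edge must start where the previous one ended...
    for (_, prev_dst, _), (next_src, _, _) in zip(edges, edges[1:]):
        if prev_dst != next_src:
            return False
    # ...and the last edge must return to the starting node.
    if edges[-1][1] != edges[0][0]:
        return False
    # Temporal check: timestamps must be strictly increasing along the cycle.
    times = [t for _, _, t in edges]
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        return False
    # Duration is the gap between the first and last timestamps.
    return times[-1] - times[0] <= max_duration


# Illustrative timestamps only (assumed, not taken from the figure).
valid = [("a", "b", 1), ("b", "c", 4), ("c", "d", 7), ("d", "a", 10)]
out_of_order = [("a", "b", 5), ("b", "c", 3), ("c", "d", 7), ("d", "a", 10)]

print(is_temporal_cycle(valid, max_duration=9, max_length=4))
print(is_temporal_cycle(out_of_order, max_duration=9, max_length=4))
```

The second sequence is structurally a cycle but its timestamps are not increasing, which corresponds to the situation in panel (d).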
Following the implementation of Johnson’s algorithm, the search is restricted to strongly connected components (SCCs), because every directed cycle lies entirely within a single SCC. SCCs are identified using Raphtory's built-in method.
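To show why the SCC restriction shrinks the search, here is a minimal pure-Python sketch of SCC computation using Kosaraju's two-pass approach. It is a stand-in for illustration only: the repository relies on Raphtory's built-in method, and the function name `strongly_connected_components` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm on a list of directed (u, v) edges."""
    graph, reverse = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: explore the reversed graph in decreasing finish order.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# a -> b -> c -> a forms a cycle; d only receives an edge from c.
components = strongly_connected_components(
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
)
# Only components with more than one node can contain a multi-node cycle.
candidates = [c for c in components if len(c) > 1]
print(candidates)
```

Wallets such as `d` that sit outside any multi-node component can be skipped before any cycle enumeration starts.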
Running the algorithm by examining cycles from every temporal edge individually would cause an exponential increase in complexity due to the vast number of temporal edges in temporal graphs. To improve efficiency, we instead focus on identifying potential temporal cycles. This is achieved by locating structural cycles where consecutive edges
This filtering step significantly reduces the search space while preserving cycles that are likely to be temporally valid.
Within each strongly connected component (SCC), an adapted Johnson backtracking algorithm is then employed to enumerate structural cycles. Each candidate cycle undergoes early pruning through an interval compatibility check, to ensure temporal feasibility between consecutive edges.
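One way such an interval compatibility check could look is sketched below. The exact rule used in the repository is not stated here, so this is an assumption: each structural edge is summarised by the interval between its earliest and latest timestamps, and two consecutive edges are kept only if some timestamp on the second edge could follow some timestamp on the first within the maximum duration. The names `Interval` and `compatible` are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Earliest and latest timestamps observed on one structural edge."""
    first: int
    last: int


def compatible(prev: Interval, nxt: Interval, max_duration: int) -> bool:
    """Necessary (not sufficient) condition for `nxt` to follow `prev` in time."""
    # Some timestamp on `nxt` must be later than some timestamp on `prev`.
    can_follow = prev.first < nxt.last
    # The gap between the two edges must not already exceed the duration bound.
    close_enough = nxt.first - prev.last <= max_duration
    return can_follow and close_enough


print(compatible(Interval(1, 5), Interval(3, 8), max_duration=9))
print(compatible(Interval(10, 12), Interval(1, 9), max_duration=9))
print(compatible(Interval(1, 2), Interval(50, 60), max_duration=9))
```

Because the test only looks at interval endpoints, it can discard hopeless branches cheaply during backtracking, while the exact timestamp ordering is left to the later validation step.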
Each candidate (structural) cycle is passed through `validate`, which performs fine-grained temporal validation by checking for a strictly increasing sequence of timestamps across edges. Only cycles satisfying full temporal consistency are accepted.
Instead of computing the full Cartesian product of all timestamp combinations (which can explode combinatorially), it performs incremental DFS pruning:
- At each step, it picks the next timestamp that is strictly larger than the last chosen one.
- It stops early when no valid next timestamp exists.
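The two steps above can be sketched as follows. This is an illustrative approximation rather than the repository's `validate`: the function name `find_increasing_timestamps`, its input format (one list of timestamps per edge of the candidate cycle), and the handling of the duration bound are assumptions.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def find_increasing_timestamps(
    edge_times: Sequence[Sequence[int]], max_duration: int
) -> Optional[List[int]]:
    """Pick one timestamp per edge so the sequence is strictly increasing.

    Returns the chosen timestamps, or None if no assignment fits within
    `max_duration`. No Cartesian product is ever materialised.
    """
    sorted_times = [sorted(ts) for ts in edge_times]
    if not sorted_times or any(not ts for ts in sorted_times):
        return None
    for start in sorted_times[0]:
        chosen = [start]
        for ts in sorted_times[1:]:
            # Smallest timestamp strictly larger than the last chosen one.
            i = bisect_right(ts, chosen[-1])
            if i == len(ts):
                # No valid next timestamp; a later start cannot succeed either.
                return None
            chosen.append(ts[i])
        if chosen[-1] - chosen[0] <= max_duration:
            return chosen
    return None


print(find_increasing_timestamps([[1, 20], [4, 2], [7], [10, 3]], max_duration=9))
print(find_increasing_timestamps([[5], [3]], max_duration=9))
print(find_increasing_timestamps([[1, 8], [9], [10]], max_duration=3))
```

Taking the earliest admissible timestamp at each step gives the earliest possible finish for a given start, so trying each start timestamp once is enough to decide whether a valid assignment exists.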
To run the temporal cycle detection algorithm on a dataset, run:
python extract_temporal_cycles.py --dataset </path/to/transactions-file>
For example, to detect temporal cycles with maximum duration δ = 24 hours and maximum cycle length L = 15:
python extract_temporal_cycles.py \
--dataset data/nft_transactions.parquet \
--window "7 day" \
--step "6 day" \
--max-duration "1 day" \
--max-length 15 \
--num-processes 8
The options are:
- --dataset: path to the dataset
- --window: size of the temporal rolling window
- --step: step size between consecutive rolling windows
- --max-duration: maximum cycle duration
- --max-length: maximum cycle length
- --num-processes: number of parallel processes (1 = sequential execution)
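To illustrate how the window and step sizes interact, the sketch below generates rolling windows over a date range. It is not the repository's code and does not use Raphtory's windowing API; the helper `rolling_windows` and the dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def rolling_windows(start, end, window, step):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    current = start
    while current < end:
        # The final window is clipped to the end of the data.
        yield current, min(current + window, end)
        current += step


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 1, 20, tzinfo=timezone.utc)
windows = list(rolling_windows(start, end, timedelta(days=7), timedelta(days=6)))

for window_start, window_end in windows:
    print(window_start.date(), "->", window_end.date())
```

With a 7-day window and a 6-day step, consecutive windows overlap by one day, which matches the 1-day maximum duration in the example command. Under this windowing scheme, any cycle lasting at most one day falls entirely inside at least one window; cycles found in the overlap of two windows may be reported twice and would need deduplication.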
Note that temporal trading cycles must be detected before suspicious trading activities can be identified; alternatively, you can download pre-detected cycles from here
To identify suspicious trading activities within transactional data, run the following:
python suspicious.py \
--dataset </path/to/transactions-file> \
--cycles </path/to/cycles_data>
For example:
python suspicious.py \
--dataset data/nft_transactions.parquet \
--cycles cycles_data
Flagged suspicious trading activities will be saved in the output/results directory.
To investigate the wallets flagged as suspicious and better understand their trading behavior despite the lack of labeled data, we conduct a temporal motif analysis. This method helps us identify recurring transaction patterns and interaction structures, compare them with normal trading behavior in the broader ecosystem, and provide additional evidence supporting the anomalous or potentially illicit nature of these wallets' activities.
An example of extracting a particular temporal motif from a temporal graph.
Our Temporal Motif-based Characterization is based on the work of Naomi et al. and has been adapted as a validation approach to address the lack of labeled data, effectively demonstrating the unusual trading behavior of flagged traders.
python motifs.py \
--dataset </path/to/transactions-file> \
--nodes </path/to/suspicious-wallets>
For example:
python motifs.py \
--dataset data/nft_transactions.parquet \
--nodes output/results
The temporal motif-based characterization results will be saved in the output/motifs directory.
To evaluate the stability and robustness of the proposed framework, we conduct a sensitivity analysis on the key threshold parameters used to filter suspicious wallets. The figure below reports the sensitivity of the proposed approach to variations in the key detection parameters. Each row corresponds to a specific parameter value, and the cell colors indicate the similarity (%) with respect to the baseline results.
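As a rough illustration of how a similarity percentage against a baseline can be computed, the sketch below compares two sets of flagged wallets using Jaccard similarity. The similarity measure actually used for the figure is not specified here, so Jaccard is an assumption, and the function name `similarity_percent` and the wallet identifiers are hypothetical.

```python
def similarity_percent(baseline: set, variant: set) -> float:
    """Jaccard similarity between two sets of flagged wallets, in percent."""
    union = baseline | variant
    if not union:
        # Two empty result sets are treated as identical.
        return 100.0
    return 100.0 * len(baseline & variant) / len(union)


# Hypothetical wallet identifiers flagged under two parameter settings.
baseline_wallets = {"0xa", "0xb", "0xc", "0xd"}
variant_wallets = {"0xa", "0xb", "0xc", "0xe"}

print(similarity_percent(baseline_wallets, variant_wallets))  # 60.0
```

A value near 100% means that changing the parameter barely alters which wallets are flagged, which is the sense of robustness the analysis looks for.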
If you find our work useful, please consider citing:
Paper submitted
For any questions, please file an issue or contact:
Wassim Sliti: wassim.sliti@upm.es
This work was carried out within the STRAST Research Group at the Information Processing and Telecommunications Center (IPTC), Universidad Politécnica de Madrid, as part of the CEDAR project, funded by the Horizon Europe Programme (Grant Agreement No. 101135577).