This is the code repository for the paper "S3AND: Efficient Subgraph Similarity Search Under Aggregated Neighbor Difference Semantics".
Code ✅
Dataset Source ✅
README ✅
- networkx 3.1 or above
- torch_geometric (for loading datasets)
- yake (to extract keywords from the paper titles in dblp_v14)
| Name | \|V(G)\| | \|E(G)\| | \|∑\| |
|---|---|---|---|
| Facebook | 4,039 | 88,234 | 1,284 |
| PubMed | 19,717 | 44,338 | 501 |
| Elliptic | 203,769 | 234,355 | 166 |
| TWeibo | 2,320,895 | 9,840,066 | 1,658 |
| DBLPv14 | 2,956,012 | 29,560,025 | 7,990,611 |
usage: main.py [-h] [-i INPUT] [-DS DATASET] [-q QUERY] [-qs QUERYSIZE] [-qk QUERYKEYWORDS] [-m GROUPNUMBER] [-p PARTITIONNUMBER] [-t ITERATIONNUMBER] [-tSUM THRESHOLDSUM] [-tMAX THRESHOLDMAX]
optional arguments:
-h, --help show this help message and exit
-i INPUT, --input INPUT
path of graph input file
-DS DATASET, --dataset DATASET
the name of dataset
-q QUERY, --query QUERY
the query edge set
-qs QUERYSIZE, --querySize QUERYSIZE
the query vertex set size
-qk QUERYKEYWORDS, --queryKeywords QUERYKEYWORDS
the query vertex keyword set
-m GROUPNUMBER, --groupNumber GROUPNUMBER
the number of group
-p PARTITIONNUMBER, --partitionNumber PARTITIONNUMBER
the number of partitioning
-t ITERATIONNUMBER, --iterationNumber ITERATIONNUMBER
the number of iteration
-tSUM THRESHOLDSUM, --thresholdSUM THRESHOLDSUM
the sum threshold for AND
-tMAX THRESHOLDMAX, --thresholdMAX THRESHOLDMAX
the max threshold for AND
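
For readers who want to see how these options fit together, the following is a minimal sketch of an `argparse` parser that mirrors the usage text above. The flag names and help strings are copied from that text; the argument types and the absence of default values are assumptions, and the actual argparser.py in this repository may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the usage text (types are assumed)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-i", "--input", type=str,
                        help="path of graph input file")
    parser.add_argument("-DS", "--dataset", type=str,
                        help="the name of dataset")
    parser.add_argument("-q", "--query", type=str,
                        help="the query edge set")
    parser.add_argument("-qs", "--querySize", type=int,
                        help="the query vertex set size")
    parser.add_argument("-qk", "--queryKeywords", type=str,
                        help="the query vertex keyword set")
    parser.add_argument("-m", "--groupNumber", type=int,
                        help="the number of group")
    parser.add_argument("-p", "--partitionNumber", type=int,
                        help="the number of partitioning")
    parser.add_argument("-t", "--iterationNumber", type=int,
                        help="the number of iteration")
    # Assumption: both AND thresholds are numeric; float accepts integers too.
    parser.add_argument("-tSUM", "--thresholdSUM", type=float,
                        help="the sum threshold for AND")
    parser.add_argument("-tMAX", "--thresholdMAX", type=float,
                        help="the max threshold for AND")
    return parser


if __name__ == "__main__":
    # Hypothetical values, chosen only to show how the options are parsed.
    args = build_parser().parse_args(
        ["-DS", "facebook", "-qs", "5", "-tSUM", "3", "-tMAX", "2"]
    )
    print(args.dataset, args.querySize, args.thresholdSUM, args.thresholdMAX)
```

Options that are not passed are left as `None` in this sketch, so a caller can tell "not set on the command line" apart from a real value.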
(A) For Facebook, PubMed, TWeibo, and Elliptic
Step-1: (attr_datasets.py) run attr_datasets.py to load the initial files and obtain the initial graph G-xxxx.gml with keywords
Step-2: (offline.py) run offline.py to obtain G+-xxxx.gml with Aux data
Step-3: (argparser.py) set the query in argparser.py or on the command line
Step-4: (main.py) python main.py ................ (the arguments can be omitted if the query is already set in argparser.py)
(B) For synthetic datasets
Step-1: (generate.py) run generate.py to generate the data graph G-distribution.gml with keywords
Step-2: (offline.py) run offline.py to obtain G+-xxxx.gml with Aux data
Step-3: (argparser.py) set the query in argparser.py or on the command line
Step-4: (main.py) python main.py ................ (the arguments can be omitted if the query is already set in argparser.py)
(C) For dblp_v14
Step-1: (dblp_yake.py) run dblp_yake.py to clean the source data, dblp.json
Step-2: (offline.py) run offline.py to obtain the graph with Aux data
Step-3: (argparser.py) set the query in argparser.py or on the command line
Step-4: (main.py) python main.py ................ (the arguments can be omitted if the query is already set in argparser.py)
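
As an illustration of Step-4 when the query is passed on the command line, here is a small sketch that assembles a `main.py` invocation from a dictionary of options. The helper `build_main_command` is hypothetical and not part of this repository, and the option values (including the dataset name) are placeholders rather than recommended settings.

```python
import shlex
import sys


def build_main_command(options: dict) -> list:
    """Turn {"-DS": "facebook", "-qs": 5} into an argv list for main.py."""
    cmd = [sys.executable, "main.py"]
    for flag, value in options.items():
        cmd += [flag, str(value)]
    return cmd


# Hypothetical option values, used only to show the shape of the command.
example = build_main_command(
    {"-DS": "facebook", "-qs": 5, "-tSUM": 3, "-tMAX": 2}
)

# Print the command without the interpreter path, ready to copy into a shell.
print(shlex.join(example[1:]))
```

The resulting list can be handed to `subprocess.run(example, check=True)` from the repository root once the offline step has produced the graph with Aux data.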
@article{wen2025s3and,
title={S3AND: Efficient Subgraph Similarity Search Under Aggregated Neighbor Difference Semantics},
author={Wen, Qi and Ye, Yutong and Lian, Xiang and Chen, Mingsong},
journal={Proceedings of the VLDB Endowment},
volume={18},
number={11},
pages={3708--3720},
year={2025},
publisher={VLDB Endowment}
}