Anonymous Authors
- Enron: https://www.cs.cornell.edu/~arb/data/email-Enron/
- DBLP: https://www.cs.cornell.edu/~arb/data/coauth-DBLP/ (we use 2016 as training, 2015 as query)
- P.School: https://www.cs.cornell.edu/~arb/data/contact-primary-school/
- H.School: https://www.cs.cornell.edu/~arb/data/contact-high-school/
- Foursquare: https://networks.skewed.de/net/foursquare (the `NYC_restaurant_checkin` collection)
- Hosts-Virus: https://zenodo.org/record/807517#.YgSOoerMJdg (the `data/associations.csv` file)
- Directors: https://networks.skewed.de/net/board_directors (the `net2m_2002-05-01` collection)
- Crimes: https://networks.skewed.de/net/crime
We have processed all data from the sources above and placed the results in `data/`. Note that two datasets are stored under different names: P.School is stored as `school` and H.School is stored as `school2`.
- python >= 3.5, Anaconda3
- `numpy>=1.20`
- `sklearn>=0.24`
- `networkx>=2.5.1`
- `graph-tool>=2.44` (for evaluating the baseline Bayesian-MDL)
- `cdlib>=0.2.5` (for evaluating the baselines Demon, CFinder)
- `tqdm`

- Build:
g++ -std=c++11 -pthread cmotif.cpp -o cmotif
python main.py --dataset <dataset> --beta <beta> --features <features>
- `<features>` can be either `count` or `motif`.
- As mentioned in the paper, `beta` depends on the dataset. Combinations:
  - `--dataset dblp --beta 1000000`
  - `--dataset enron --beta 1000`
  - `--dataset school --beta 350000`
  - `--dataset school2 --beta 60000`
  - `--dataset foursquare --beta 20000`
  - `--dataset hosts --beta 6000`
  - `--dataset directors --beta 800`
  - `--dataset crime --beta 1000`
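
To avoid retyping the dataset/beta combinations, the commands can be generated from a small table. The following is a minimal sketch, not part of the repository: `BETAS` and `build_command` are hypothetical names introduced here, and the only flags used are the three shown in the command above.

```python
# Dataset/beta pairs copied from the combinations listed above.
BETAS = {
    "dblp": 1000000,
    "enron": 1000,
    "school": 350000,
    "school2": 60000,
    "foursquare": 20000,
    "hosts": 6000,
    "directors": 800,
    "crime": 1000,
}


def build_command(dataset, features="count"):
    """Return the argument list for one run of main.py (hypothetical helper)."""
    if features not in ("count", "motif"):
        raise ValueError("features must be 'count' or 'motif'")
    return [
        "python", "main.py",
        "--dataset", dataset,
        "--beta", str(BETAS[dataset]),  # raises KeyError for unknown datasets
        "--features", features,
    ]


if __name__ == "__main__":
    # Print one command line per dataset; nothing is executed here.
    for name in BETAS:
        print(" ".join(build_command(name)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to launch the run instead of printing it.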