jiyangbai/Poligras

Poligras

An implementation of "POLIGRAS: Policy-based Graph Summarization".

Datasets

All datasets can be accessed [here]. Because GitHub limits uploaded files to 25 MB, some datasets cannot be uploaded to the repository directly.

The astro-ph and cnr-200 datasets are included (as .zip files) in ./dataset/. Before running the code on a specific dataset, first create a directory inside ./dataset/ with the same name as the dataset, then download and unzip the dataset files from the link above and put them into that directory.

For example, to run on the in-2004 dataset, follow these steps:

  1. Create the directory ./dataset/in-2004/.
mkdir ./dataset/in-2004/
  2. Download and unzip the in-2004 dataset files: the in-2004_graph file, which contains the graph structure, and the in-2004_feat file, which contains the node features.
unzip in-2004_graph.zip
unzip in-2004_feat.zip
  3. Move the in-2004_graph and in-2004_feat files into ./dataset/in-2004/.
mv in-2004_graph in-2004_feat ./dataset/in-2004/
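
The same steps can also be scripted. The following is a minimal Python sketch that uses only the standard library. The helper name `prepare_dataset` is hypothetical and not part of the repository, and the sketch assumes that the downloaded archives sit in a local directory and that each archive contains a single top-level file named after the archive (for example, in-2004_graph.zip containing in-2004_graph).

```python
import shutil
import zipfile
from pathlib import Path


def prepare_dataset(name, download_dir=".", dataset_root="./dataset"):
    """Unzip <name>_graph.zip and <name>_feat.zip and move the extracted files
    into <dataset_root>/<name>/. Hypothetical helper, not part of the repository."""
    download_dir = Path(download_dir)
    target = Path(dataset_root) / name
    target.mkdir(parents=True, exist_ok=True)  # step 1: create ./dataset/<name>/

    moved = []
    for suffix in ("graph", "feat"):
        stem = f"{name}_{suffix}"
        archive = download_dir / f"{stem}.zip"
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(download_dir)  # step 2: unzip the archive
        # Assumption: the archive holds one top-level file named like the archive.
        extracted = download_dir / stem
        destination = target / stem
        shutil.move(str(extracted), str(destination))  # step 3: move into place
        moved.append(destination)
    return moved
```

With the two in-2004 archives in the current directory, `prepare_dataset("in-2004")` would leave in-2004_graph and in-2004_feat in ./dataset/in-2004/.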

The in-2004_graph file stores the graph structure in NetworkX graph format; it can be generated from other graph formats (e.g., an edge list) with the provided networkx_graph_generation.py. The in-2004_feat file stores the node feature matrix, with one row per node and one column per feature dimension; it can be generated with the provided node_feature_generation.py.

Dependencies

Install the following tools and packages:

  • python3 : the code assumes Python 3 (use pip3 to install packages).
  • numpy
  • torch
  • networkx
  • random, copy, argparse, pickle, glob : these modules are part of the Python standard library.

Users can also use the provided installer.txt to install all the above packages.

pip3 install -r installer.txt
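
To confirm that the environment is ready, a short check like the one below can help. This is a minimal sketch, not part of the repository: the function name `missing_modules` is hypothetical, and the split into third-party and standard-library names reflects how these modules are distributed for Python 3, not the contents of installer.txt.

```python
import importlib.util

# Third-party packages from the list above (installed with pip3).
THIRD_PARTY = ["numpy", "torch", "networkx"]
# Standard-library modules from the list above (they ship with Python 3).
STDLIB = ["random", "copy", "argparse", "pickle", "glob"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(THIRD_PARTY + STDLIB)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```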

To run the code

The following command trains and runs the Poligras model on a specific dataset.

python3 src/run.py --dataset dataset_name

Users can also set model options, including:

  • --dataset : the dataset to run on;
  • --counts : the number of graph summarization iterations;
  • --lr : the learning rate.

For example, to run Poligras on the astro-ph dataset for 100 iterations with a learning rate of 0.001, use the following command:

python3 src/run.py --dataset astro-ph --counts 100 --lr 0.001
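
For readers unfamiliar with how such options are typically declared, here is a minimal argparse sketch of the three options listed above. It is an illustration only and not the actual contents of run.py: the default values, the choice to make --dataset required, and the function name `build_parser` are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three documented options (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Run Poligras on a dataset (sketch).")
    # Assumption: the dataset name is required; run.py may define a default instead.
    parser.add_argument("--dataset", type=str, required=True,
                        help="dataset to run")
    # Assumption: the defaults below are placeholders, not the repository's values.
    parser.add_argument("--counts", type=int, default=100,
                        help="number of graph summarization iterations")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="learning rate")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["--dataset", "astro-ph", "--counts", "100", "--lr", "0.001"]
)
print(args.dataset, args.counts, args.lr)
```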

After the run finishes, the graph summary (the supernodes, superedges, and edge correction set) is stored in a file named datasetname_graph_summary. The total summarization reward is printed in the following format:

#super edge: ####
correction set size: ####

-------SuperNode encoding ended, total reward is ####---------.

In addition to the total reward, the output reports the number of superedges and the correction set size of the final graph summary.
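
When comparing several runs, it can be convenient to pull these three values out of the printed output. The sketch below assumes that each #### placeholder in the format above stands for a plain number. The function name `parse_summary_output` is hypothetical, and the numbers in the sample string are made-up placeholders used only to exercise the parser, not real results.

```python
import re

# One pattern per reported value, following the output format shown above.
PATTERNS = {
    "superedges": re.compile(r"#super edge:\s*(\d+)"),
    "correction_set_size": re.compile(r"correction set size:\s*(\d+)"),
    "total_reward": re.compile(r"total reward is\s*(-?\d+(?:\.\d+)?)"),
}


def parse_summary_output(text):
    """Extract the reported values from the printed output (illustrative sketch)."""
    result = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            value = match.group(1)
            # Keep integers as int; fall back to float for decimal values.
            result[key] = float(value) if "." in value else int(value)
    return result


# Made-up placeholder numbers, NOT results from any dataset.
sample = """#super edge: 12
correction set size: 34

-------SuperNode encoding ended, total reward is 56---------.
"""
print(parse_summary_output(sample))
```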

Code structures

run.py sets all hyperparameters and imports the Poligras model implemented in model.py.

model.py contains the Poligras model implementation and is imported by run.py.

networkx_graph_generation.py generates the NetworkX-format graph from an input graph stored in another format (e.g., an edge list).

node_feature_generation.py generates node features for the NetworkX graph produced by networkx_graph_generation.py.
