Description
Why did I choose this paper?
- It is one of the papers most closely related to our work (SEERa), especially to the Graph Embedding Layer.
- We can use this framework to dynamically model our user similarity graphs into a temporal representation.
- It is written by well-known experts in this area.
- Nice narration and write-up.
Main problem:
ROLAND is a graph learning framework for dynamic graphs. Dynamic graphs are graphs that change over time, such as social networks, transportation networks, and communication networks. These graphs are often large, complex, and difficult to model. The goal of ROLAND is to provide a flexible and efficient framework for learning the structure and dynamics of dynamic graphs. Main insight: any successful static GNN can be extended to dynamic graphs by viewing its per-layer node embeddings as hierarchical node states and updating them recurrently over time.
Applications:
Fraud detection, anti-money laundering, recommender systems.
Existing work:
Most common graph representation methods are designed for static graphs. There is a substantial body of work on dynamic graph representation; however, few works leverage the power of successful static graph representation approaches to model dynamic, evolving data.
- GNN as a feature extractor, with a sequence model trained on top of the GNN. Gap: for each new snapshot, all hidden layers are recomputed from scratch and only the top layer is updated (see the sketch after this list).
- A graph convolutional layer inside an RNN in place of the linear layer (and similar methods). Gap: no incorporation of static GNN design choices (skip-connections, batch normalization, edge embeddings), i.e., successful static GNNs are not adapted for dynamic data.
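A minimal sketch of the first family of prior methods, in pure PyTorch with dense adjacency matrices (class and variable names here are hypothetical, not taken from any specific paper): a static GNN encodes every snapshot independently, and a GRU is trained on top of the per-snapshot embeddings, so nothing below the sequence model is carried over between snapshots.

```python
import torch
import torch.nn as nn

class SnapshotGNN(nn.Module):
    """One dense graph-convolution layer: H' = ReLU(A @ H @ W)."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)

    def forward(self, adj, x):
        return torch.relu(adj @ self.lin(x))

class GNNThenRNN(nn.Module):
    """GNN as feature extractor with a GRU on top: every snapshot is
    re-encoded from scratch, which is the gap noted above."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gnn = SnapshotGNN(in_dim, hid_dim)
        self.rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, adjs, xs):
        # adjs / xs: per-snapshot dense adjacency matrices and node features
        feats = torch.stack([self.gnn(a, x) for a, x in zip(adjs, xs)], dim=1)
        out, _ = self.rnn(feats)   # (num_nodes, T, hid_dim)
        return out[:, -1]          # node embeddings after the last snapshot
```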
Main limitations which ROLAND addresses:
- Model design: no incorporation of static GNN design choices (skip-connections, batch normalization, edge embeddings), i.e., successful static GNNs are not adapted for dynamic data.
- Evaluation settings: ignoring the evolving nature of the data and models, i.e., not updating the model as new data arrives.
- Training strategies: keeping the entire dataset, or a huge amount of data, in memory (or on the GPU).
Inputs:
Three different algorithms:
- GNN forward computation: graph snapshots G_1, G_2, ..., G_T plus the hidden node states from the previous snapshot (H_{T-1}).
- Live-update evaluation: graph snapshots G_1, G_2, ..., G_T plus labels y_1, y_2, ..., y_T plus a trained GNN.
- Training algorithm: G_1, ..., G_T plus y_1, ..., y_T plus H_{T-1}, a smoothing factor, and a model for fine-tuning (meta-model).
Outputs:
Three different algorithms (a sketch of the forward computation with hierarchical state updates follows this list):
- GNN forward computation: ŷ_T, the predicted probabilities of future edges, plus the updated hidden states H_T.
- Live-update evaluation: performance (MRR) plus the trained GNN model.
- Training algorithm: the trained GNN model plus the updated meta-model.
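A minimal sketch of the forward computation under stated assumptions (pure PyTorch, dense adjacency, hypothetical class names, not the authors' implementation): each static GNN layer is followed by an embedding-update module, here a GRU cell, that merges the freshly computed embeddings with the corresponding hidden states from the previous snapshot.

```python
import torch
import torch.nn as nn

class RolandLayer(nn.Module):
    """A static GNN layer followed by an embedding-update module (here a
    GRU cell, one of several possible update choices)."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.update = nn.GRUCell(dim, dim)

    def forward(self, adj, h_below, h_prev):
        h_new = torch.relu(adj @ self.lin(h_below))  # static GNN computation
        return self.update(h_new, h_prev)            # merge with state from t-1

class RolandGNN(nn.Module):
    """Takes snapshot G_T (adj, x) plus the per-layer states H_{T-1}; returns
    node embeddings for scoring future edges plus the updated states H_T."""
    def __init__(self, dim, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(RolandLayer(dim) for _ in range(num_layers))

    def forward(self, adj, x, prev_states):
        h, new_states = x, []
        for layer, h_prev in zip(self.layers, prev_states):
            h = layer(adj, h, h_prev)
            new_states.append(h)
        return h, new_states  # score candidate edges from h, e.g. dot products
```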
Method:
ROLAND repurposes static GNNs for dynamic graphs by viewing the node embeddings at different GNN layers as hierarchical node states and updating them recurrently when a new snapshot arrives. Its three major components are: 1) Model design: any static GNN is extended to dynamic graphs by attaching an embedding-update module to each layer. 2) Training: an incremental, meta-learning-based strategy that keeps only the current snapshot and the previous node states in memory, giving high scalability. 3) Evaluation: a live-update setting that captures the evolving nature of dynamic data by predicting each new snapshot and then fine-tuning the model on it (a sketch of the live-update loop follows this paragraph).
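A minimal sketch of the live-update loop under stated assumptions: `model` follows the interface sketched above, `init_states`, `loss_fn`, and `eval_fn` are hypothetical helpers, and the meta-model/smoothing-factor part of the training algorithm is omitted for brevity.

```python
import torch

def live_update(model, snapshots, labels, optimizer, loss_fn, eval_fn):
    """snapshots[t] = (adj, x) for snapshot t; labels[t] = edges observed at t."""
    states = model.init_states()  # hypothetical helper returning H_0
    scores = []
    for t in range(len(snapshots) - 1):
        adj, x = snapshots[t]
        # 1) Evaluate: predict the edges of snapshot t+1 before seeing them.
        with torch.no_grad():
            pred, _ = model(adj, x, states)
            scores.append(eval_fn(pred, labels[t + 1]))  # e.g. MRR
        # 2) Fine-tune on the newly revealed snapshot, then roll the states
        #    forward; only the current snapshot and H_t stay in memory.
        optimizer.zero_grad()
        pred, new_states = model(adj, x, states)
        loss_fn(pred, labels[t + 1]).backward()
        optimizer.step()
        states = [s.detach() for s in new_states]
    return scores
```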
Gaps:
- The evaluation relies on a single metric (MRR).
Experimental Setup:
- Datasets:
- BSI-ZK - 1.7M nodes - 56M edges - 257 snapshots
- AS-733 - 7.7K nodes - 12M edges - 733 snapshots
- Reddit-Title - 54K nodes - 571K edges - 178 snapshots
- Reddit-Body - 36K nodes - 286K edges - 178 snapshots
- BSI-SVT - 90K nodes - 190K edges - 49 snapshots
- UCI-Message - 1.8K nodes - 60K edges - 29 snapshots
- Bitcoin-OTC - 6K nodes - 36K edges - 279 snapshots
- Bitcoin-Alpha - 4K nodes - 24K edges - 279 snapshots
- Metrics:
- MRR
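For reference, a small self-contained sketch of how MRR is typically computed for link prediction, ranking each positive edge against its sampled negative candidates (this is the standard definition, not code from the paper):

```python
def mean_reciprocal_rank(pos_scores, neg_scores):
    """pos_scores: one score per positive edge; neg_scores: for each positive
    edge, the scores of its sampled negative candidates."""
    total = 0.0
    for pos, negs in zip(pos_scores, neg_scores):
        rank = 1 + sum(1 for n in negs if n >= pos)  # rank among candidates
        total += 1.0 / rank
    return total / len(pos_scores)

# One positive ranked 1st, another ranked 3rd -> MRR = (1 + 1/3) / 2 ≈ 0.667
print(mean_reciprocal_rank([0.9, 0.4], [[0.2, 0.1], [0.8, 0.5]]))
```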
Baselines:
- EvolveGCN: an RNN evolves the GCN weights over time.
- T-GCN: a GRU whose linear transformations are replaced with graph convolutional operations (see the sketch after this list).
- GCRN-GRU: generalizes T-GCN, capturing temporal information with a GRU.
- GCRN-LSTM: generalizes T-GCN, capturing temporal information with an LSTM.
- GCRN-Baseline: Chebyshev spectral graph convolution features fed into an LSTM.
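A minimal sketch of the T-GCN idea (hypothetical dense-adjacency PyTorch, not the authors' implementation): inside a GRU cell, each input transform is preceded by a graph-propagation step, turning the usual linear maps into graph convolutions.

```python
import torch
import torch.nn as nn

class GraphConvGRUCell(nn.Module):
    """GRU cell whose gate transforms start with a graph propagation step
    (adj @ ...), i.e. graph convolutions instead of plain linear maps."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin_z = nn.Linear(in_dim + hid_dim, hid_dim)
        self.lin_r = nn.Linear(in_dim + hid_dim, hid_dim)
        self.lin_h = nn.Linear(in_dim + hid_dim, hid_dim)

    def forward(self, adj, x, h):
        xh = adj @ torch.cat([x, h], dim=-1)       # propagate over the graph
        z = torch.sigmoid(self.lin_z(xh))          # update gate
        r = torch.sigmoid(self.lin_r(xh))          # reset gate
        cand = torch.tanh(self.lin_h(adj @ torch.cat([x, r * h], dim=-1)))
        return (1 - z) * h + z * cand              # new hidden state
```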
Results:
- Performance comparison
Three sets of results are reported to showcase ROLAND's performance: 1) baselines in their standard setting, 2) baselines with the ROLAND training strategy, and 3) ROLAND itself. For ROLAND, three embedding-update methods are compared: 1) moving average, 2) MLP, and 3) GRU (sketched below).
The results show that, among the ROLAND variants, the GRU update performs best on most datasets. Moreover, most standard baselines cannot handle the large datasets; the ROLAND training strategy resolves this problem, and on the small datasets it can outperform the standard baselines. Finally, ROLAND beats all standard baselines as well as the baselines equipped with the ROLAND training strategy.
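For concreteness, minimal sketches of the three embedding-update choices (hypothetical names; each combines the new layer output `h_new` with the previous state `h_prev`):

```python
import torch
import torch.nn as nn

def update_moving_average(h_new, h_prev, tau=0.5):
    return tau * h_prev + (1 - tau) * h_new  # no learned parameters

class UpdateMLP(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, h_new, h_prev):
        return self.mlp(torch.cat([h_new, h_prev], dim=-1))

class UpdateGRU(nn.Module):
    """The variant reported to work best on most datasets."""
    def __init__(self, dim):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)

    def forward(self, h_new, h_prev):
        return self.cell(h_new, h_prev)
```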
Code:
https://github.com/snap-stanford/roland
Presentation:
There is no available presentation for this paper.