NetX-lab/dNE

dNE

This repo prototypes dNE, an efficient and lightweight framework for DNN-based traffic engineering (TE), described in the TNSE paper "Learning for Accelerated Traffic Engineering with Differentiable Network Modeling". dNE makes it possible to compare arbitrary DNN models on a given TE task or objective, and it supports gradient-based training in addition to existing DRL approaches. The use cases in this repo (e.g., CNN-TE) can achieve near-optimal results comparable to linear programming with negligible decision time. dNE does this with a set of fully differentiable matrix operations that compute TE metrics for given decisions, so gradients flow directly from the metrics back to the DNN model. It is implemented as a plugin for any DNN training framework, and a template training file is provided in this repo.
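To make the idea of a differentiable TE metric concrete, here is a minimal sketch. It is not the code in util/framework.py. It assumes a toy layout in which `split_logits` holds one row per flow and one column per candidate path, `path_edge` is a 0/1 path-to-edge incidence tensor, and demands and capacities are plain vectors. All names and shapes are illustrative assumptions.

```python
import torch

def link_loads(split_logits, demands, path_edge):
    """Differentiable per-edge load for given (unnormalized) split decisions.

    split_logits: (F, K) raw scores for K candidate paths of each of F flows
    demands:      (F,)   traffic demand per flow
    path_edge:    (F, K, E) 1.0 where path k of flow f uses edge e, else 0.0
    """
    ratios = torch.softmax(split_logits, dim=1)      # each row sums to 1
    path_traffic = ratios * demands.unsqueeze(1)     # (F, K)
    # Sum traffic over all flows and paths that traverse each edge.
    return torch.einsum("fk,fke->e", path_traffic, path_edge)

def max_link_utilization(loads, capacities):
    return torch.max(loads / capacities)

# Toy example: 2 flows, 2 candidate paths each, 3 edges.
path_edge = torch.tensor([
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # flow 0: path 0 uses edge 0, path 1 uses edge 1
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # flow 1: path 0 uses edge 1, path 1 uses edge 2
])
demands = torch.tensor([4.0, 2.0])
capacities = torch.tensor([10.0, 10.0, 10.0])
split_logits = torch.zeros(2, 2, requires_grad=True)  # even split initially

loads = link_loads(split_logits, demands, path_edge)
mlu = max_link_utilization(loads, capacities)
mlu.backward()  # gradient flows from the metric back to the decisions
```

After `backward()`, `split_logits.grad` indicates how to shift traffic away from the most loaded edge. In a real model the logits would come from the DNN's output layer rather than from a free parameter.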

Repo Structure and Contents

The repo is structured as follows, with a brief description of each file:

dNE/
├── dataset/               # [Contains datasets for TE tasks.]
│   ├── capacities/        # [Stores link capacity data.]
│   │   └── G.npy          # [Example capacity in dNE matrix format.]
│   ├── paths/             # [Stores pre-defined candidate paths.]
│   │   ├── Yates_Format/  # [Raw path data computed by Yates.]
│   │   │   └── G.txt      # [Example raw path data.]
│   │   └── G.npy          # [Path data in dNE matrix format.]
│   ├── topologies/        # [Stores network topology data.]
│   │   └── G.graphml      # [Example topology in GraphML format.]
│   └── traffic/           # [Stores traffic matrix data.]
│       └── G.npy          # [Traffic matrix in dNE format.]
├── methods/               # [Various TE algorithms with dNE framework]
│   ├── lp.py              # [Traditional LP method.]
│   ├── ecmp.py            # [Traditional ECMP method.]
│   ├── cnn.py             # [CNN-TE.]
│   ├── drl.py             # [DRL-TE.]
│   ├── fc.py              # [FC-TE.]
│   ├── lstm.py            # [LSTM-TE.]
│   └── template.py        # [A template for integrating new DNN-based TE methods.]
├── util/                  # [Contains utility scripts for the framework.]
│   ├── AssistFunc.py      # [Helper functions for DNN training.]
│   ├── framework.py       # [dNE core framework logic (evaluation and summarization stages).]
│   └── modeling.py        # [Transferring raw data into dNE (matrix) format.]
├── main.py                # [Entry script for running dNE.] 
└── README.md              # [This README file.]

The key features of this repository are summarized as follows:

  • Implements dNE using differentiable matrix operations with PyTorch.
  • Provides a template file showing how to integrate dNE into an arbitrary DNN training framework, plus use cases (e.g., FC-TE, LSTM-TE, CNN-TE, DRL-TE) that show how to develop specific DNN training scripts from the template.
  • Implements traditional traffic engineering methods (e.g., ECMP, LP) within the dNE framework for easy comparison with DNN-based methods.
  • Provides functions that build the matrix representations of networks required by dNE, either from raw data or from the outputs of other traffic engineering tools.

Getting Started

Environment and Prerequisites

This project does not enforce specific version requirements for the listed software and hardware. Below are the configurations we used during evaluation:

Hardware

  • GPU: Nvidia GeForce RTX 3090 (CUDA version 11.6)
  • CPU: 4 Intel Xeon Platinum 8268 24-Core processors with 128GB RAM

Software

  • Gurobi: v10.0.2 (required for LP tasks only)
  • Python: 3.6.13
  • Packages:
    • numpy: 1.19.2
    • networkx: 2.5.1
    • torch: 1.10.2
    • torchaudio: 0.10.2
    • torchvision: 0.11.3
    • gurobipy: 9.1.2

Training DNNs

  • Run main.py with the following mandatory parameters:

    • --method: Select the TE algorithm to use (e.g., ECMP, LP, FC, LSTM, CNN, DRL, etc.).
    • --topo: Choose the topology for traffic engineering (e.g., G in our repository).
    • --train_flag: Set this parameter to True to train the specified DNN.
    • --metric_flag: Set this parameter to False to skip the evaluation of TE metrics in this step.
  • An example of training CNN-TE for topology G is shown below:

    python main.py \
        --method 'CNN' \
        --topo 'G' \
        --train_flag True \
        --metric_flag False
  • The resulting DNN models and metrics will be stored in results/{method}/{topo}/.

Evaluating TE Metrics

  • Run main.py in a similar way to the training step, but set --train_flag to False and --metric_flag to True.

  • An example of evaluating CNN-TE for topology G is shown below:

    python main.py \
        --method 'CNN' \
        --topo 'G' \
        --train_flag False \
        --metric_flag True

Detailed Parameters for the Main Function (Full)

| Category | Argument | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Mandatory | --method | str | N/A | The method to use: ECMP, LP, FC (FC-TE), LSTM (LSTM-TE), CNN (CNN-TE), or DRL (DRL-TE). |
| | --train_flag | bool | N/A | Whether to train the specified DNN or run the LP/ECMP algorithm. |
| | --metric_flag | bool | N/A | Whether to report TE metrics (MLU, throughput, and congestion loss). |
| | --topo | str | N/A | Topology name (e.g., G in this repo). |
| Optional (Network) | --scale | float | 1150383 | Scaling factor of the original traffic. |
| | --k | int | 4 | Number of candidate paths per flow. |
| | --obj | str | MLU | Objective function name (MLU, TP, or CLoss). Applicable to LP only. |
| Optional (Training) | --train_size | int | 4500 | Size of the training set. |
| | --batch_size | int | 64 | Batch size. |
| | --alpha | float | 0 | Weight that balances MLU and CLoss in the loss function (MLU + alpha * CLoss). |
| | --learning_rate | float | 0.01 | Learning rate. |
| | --seed | int | 76 | Random seed. |
| | --epoch | int | 100 | Number of training epochs. |
| | --weight_decay | float | 0.001 | Weight decay for the DNN training optimizer. |
| Optional (RL only) | --buf_capacity | int | 10000 | Replay buffer capacity. |
| | --learning_rate_actor | float | 0.0001 | Learning rate for the actor in DRL. |
| | --learning_rate_critic | float | 0.0001 | Learning rate for the critic in DRL. |
| | --gamma | float | 0.99 | Discount factor in DRL. |
| | --tau | float | 0.01 | Soft update rate for the target network in DRL. |
| | --sigma | float | 0.6 | Exploration noise in DRL. |
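
The examples above pass `True` and `False` as command-line strings. How main.py parses them is not shown in this README. If you add boolean flags of your own, be aware that argparse's `type=bool` treats any non-empty string, including `'False'`, as true. The sketch below uses a hypothetical `str2bool` helper (not taken from the repo) to parse such flags explicitly:

```python
import argparse

def str2bool(value):
    """Parse 'True'/'False'-style strings; hypothetical helper, not from the repo."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)

def build_parser():
    # Only a subset of the documented arguments, for illustration.
    parser = argparse.ArgumentParser(description="Illustrative flag parsing")
    parser.add_argument("--method", type=str, required=True)
    parser.add_argument("--topo", type=str, required=True)
    parser.add_argument("--train_flag", type=str2bool, required=True)
    parser.add_argument("--metric_flag", type=str2bool, required=True)
    parser.add_argument("--k", type=int, default=4)
    return parser

args = build_parser().parse_args(
    ["--method", "CNN", "--topo", "G", "--train_flag", "False", "--metric_flag", "True"]
)
```

With this parser, `args.train_flag` is the Python value `False` rather than a truthy string.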

Defining New TE Metrics

  • To define a new TE metric, register its name and PyTorch formulation in summarization() in util/framework.py.

  • For example, registering the normalized congestion loss metric can be done as follows:

    # Normalized congestion loss: traffic exceeding link capacity,
    # summed over all links and divided by the total demand.
    elif objective == 'CLoss':
        CLoss = torch.sum(torch.clamp(L_tensor - C_tensor, min=0))
        return CLoss / torch.sum(D_tensor)
  • Once registered, the new metric can be used in training. For example, to train DNNs on a weighted sum of MLU and congestion loss, add the following code to the training framework (e.g., in methods/template.py):

    for ...
        ...
        L_tensor = evaluation(cur_MA, P_tensor, B_tensor, data2[j])  # Evaluation stage: load on each edge
        TE_MLU = summarization('MLU', L_tensor, C_tensor, data2[j])  # Summarization stage: calculate MLU
        TE_CLoss = summarization('CLoss', L_tensor, C_tensor, data2[j])  # Summarization stage: calculate congestion loss
        loss[j] = TE_MLU + self.alpha * TE_CLoss
    ...
    
    loss = torch.mean(loss)
    loss.backward()
    optimizer.step()
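
The metric registration and mixed loss described above can be sketched end to end as follows. This is a self-contained toy, not the repo's code: `summarize` is a hypothetical stand-in for a name-keyed metric function, and the per-edge loads come from a small trainable `scale` tensor instead of a DNN and the evaluation stage.

```python
import torch

def summarize(objective, loads, capacities, demands):
    """Hypothetical stand-in for a metric registry keyed by name."""
    if objective == "MLU":
        return torch.max(loads / capacities)
    elif objective == "CLoss":
        overflow = torch.clamp(loads - capacities, min=0)
        return torch.sum(overflow) / torch.sum(demands)
    raise ValueError("unknown objective: %s" % objective)

# Toy batch of two samples. Loads depend on a trainable parameter so that
# the gradient has somewhere to go.
scale = torch.tensor([1.0, 1.5], requires_grad=True)
base_loads = torch.tensor([8.0, 12.0, 4.0])
capacities = torch.tensor([10.0, 10.0, 10.0])
demands = torch.tensor([16.0, 8.0])
alpha = 0.5

losses = []
for j in range(2):
    loads = scale[j] * base_loads
    mlu = summarize("MLU", loads, capacities, demands)
    closs = summarize("CLoss", loads, capacities, demands)
    losses.append(mlu + alpha * closs)

# Stack the per-sample losses instead of writing into a preallocated tensor.
loss = torch.stack(losses).mean()
loss.backward()
```

Collecting per-sample losses in a list and stacking them is one way to avoid in-place writes into a tensor that autograd is tracking. Writing into a preallocated `loss[j]` as in the snippet above also works in PyTorch.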

Defining New DNN-TE Models

  • Users can define their own DNN models in the DNN Python class within the template file (methods/template.py).

  • Once defined, register the new method in the entry script (main.py in the repo structure above). For example, if you propose a new method CNN-TE, you can register it as follows:

    from methods.cnn import RUN_CNN
    ...
        elif method == 'CNN':
            running = RUN_CNN(...)
    ...
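
As an illustration of what such a model class might look like, here is a minimal sketch. The class name `ToySplitNet`, its layer sizes, and the input/output shapes are all assumptions made for this example. The actual DNN class interface is defined in methods/template.py.

```python
import torch
import torch.nn as nn

class ToySplitNet(nn.Module):
    """Hypothetical model: flattened traffic matrix in, per-flow split ratios out."""

    def __init__(self, num_flows, k, hidden=64):
        super().__init__()
        self.num_flows = num_flows
        self.k = k
        self.net = nn.Sequential(
            nn.Linear(num_flows, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_flows * k),
        )

    def forward(self, demands):
        # demands: (batch, num_flows)
        logits = self.net(demands).view(-1, self.num_flows, self.k)
        # Softmax over the k candidate paths so each flow's ratios sum to 1.
        return torch.softmax(logits, dim=-1)

model = ToySplitNet(num_flows=6, k=4)
ratios = model(torch.rand(3, 6))  # shape: (3, 6, 4)
```

Ending with a softmax over the candidate paths keeps every output a valid split decision, so it can be passed to differentiable metric computations without extra normalization.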

Modeling dNE Matrix

  • In util/modeling.py, we provide several utility functions to convert raw data from common data sources to the dNE matrix format. These functions support tools like YATES, which generates realistic synthetic traffic data and candidate paths with good TE performance, and SNDLib, which contains real traffic traces. The provided functions include:

    • Topo_dNE2Yates(): Translates a NetworkX-format topology (dNE) to YATES format.
    • Traffic_dNE2Yates(): Converts dNE-format traffic (NumPy matrix) to YATES format (text file).
    • Path_Yates2dNE(): Converts YATES-format candidate paths (using the oblivious algorithm) to dNE format.
    • Traffic_Yates2dNE(): Converts YATES-format synthetic traffic to dNE format.
    • Traffic_SND2dNE(): Converts SNDLib traffic format to dNE format.
  • Note:

    • Clone the YATES repository (links are provided in the next section) into the same parent directory as this dNE repository to ensure these functions work correctly.
    • Place the calculated candidate paths' results in dataset/paths/Yates_Format/.
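
The exact layout of the dNE matrices (e.g., dataset/paths/G.npy) is not documented in this README. As a rough illustration of turning raw candidate paths into a matrix, the sketch below builds a 0/1 path-to-edge incidence matrix with NumPy. The function name, the input format, and the resulting layout are assumptions for this example and may differ from what Path_Yates2dNE() produces.

```python
import numpy as np

def paths_to_incidence(paths, edges):
    """Build a 0/1 matrix with one row per path and one column per directed edge.

    paths: list of node sequences, e.g. ["a", "b", "c"]
    edges: list of directed (src, dst) pairs that fixes the column order
    """
    edge_index = {edge: col for col, edge in enumerate(edges)}
    matrix = np.zeros((len(paths), len(edges)), dtype=np.float32)
    for row, path in enumerate(paths):
        # Mark every consecutive node pair on the path as a used edge.
        for src, dst in zip(path[:-1], path[1:]):
            matrix[row, edge_index[(src, dst)]] = 1.0
    return matrix

edges = [("a", "b"), ("b", "c"), ("a", "c")]
paths = [["a", "b", "c"], ["a", "c"]]
incidence = paths_to_incidence(paths, edges)
```

A matrix like this could be saved with `np.save` and later multiplied with per-path traffic to obtain per-edge loads.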

Useful Links

  • Internet Topology Zoo: Contains a variety of WANs in GraphML format that can be used for evaluation.
  • SNDLib: Provides raw traffic data for several WANs.
  • YATES: A powerful TE tool that can generate realistic synthetic traffic for a topology and calculate candidate routing paths with the oblivious routing algorithm, which can potentially yield good TE performance.

Citing This Work

If you find our repository helpful in your research, please cite our paper.

@article{ding2025dNE,
  title={{Learning for Accelerated Traffic Engineering with Differentiable Network Modeling}},
  author={Ding, Wenlong and Liu, Libin and Chen, Li and Xu, Hong},
  journal={{Transactions on Network Science and Engineering}},
  year={2025},
  month={June}
}
