This repo prototypes dNE, an efficient and lightweight framework for DNN-based traffic engineering (TE) described in the TNSE paper *Learning for Accelerated Traffic Engineering with Differentiable Network Modeling*. dNE enables exploring the best achievable performance of any DNN model on a specific TE task/objective and supports gradient-based training beyond existing DRL approaches; the use cases in this repo (e.g., CNN-TE) achieve near-optimal results comparable to linear programming with negligible decision time. The framework expresses the computation of TE metrics for a given decision as a set of fully differentiable matrix operations, enabling a direct gradient chain from the metrics back to the DNN model. It can be plugged into any DNN training framework; a template training script is provided in this repo.
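To make the idea concrete, the following is a minimal, self-contained sketch of how a TE metric such as MLU can be computed with differentiable matrix operations so that gradients flow back to a DNN's output. This is only a conceptual illustration, not the repo's actual API: the tensor names, shapes, and softmax-based split (`P`, `C`, `D`, `logits`) are assumptions made for the example.

```python
import torch

# Illustrative sizes: F flows, K candidate paths per flow, E edges.
F, K, E = 6, 4, 10
P = torch.randint(0, 2, (F, K, E)).float()     # assumed path-edge incidence tensor
C = torch.rand(E) * 10 + 5                     # assumed link capacities
D = torch.rand(F) * 4                          # assumed traffic demands

logits = torch.randn(F, K, requires_grad=True) # stands in for a DNN's raw output
split = torch.softmax(logits, dim=1)           # per-flow split ratios over candidate paths

# Load on each edge: sum over flows/paths of (demand * split ratio) routed on that edge.
load = torch.einsum('f,fk,fke->e', D, split, P)
mlu = torch.max(load / C)                      # max link utilization, fully differentiable

mlu.backward()                                 # gradients reach the DNN output directly
print(mlu.item(), logits.grad.shape)
```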
The repo structure, with a brief description of each file, is as follows:

```
dNE/
├── dataset/                 # [Contains datasets for TE tasks.]
│   ├── capacities/          # [Stores link capacity data.]
│   │   └── G.npy            # [Example capacities in dNE matrix format.]
│   ├── paths/               # [Stores pre-defined candidate paths.]
│   │   ├── Yates_Format/    # [Raw path data computed by YATES.]
│   │   │   └── G.txt        # [Example raw path data.]
│   │   └── G.npy            # [Path data in dNE matrix format.]
│   ├── topologies/          # [Stores network topology data.]
│   │   └── G.graphml        # [Example topology in GraphML format.]
│   └── traffic/             # [Stores traffic matrix data.]
│       └── G.npy            # [Traffic matrix in dNE format.]
├── methods/                 # [Various TE algorithms built on the dNE framework.]
│   ├── lp.py                # [Traditional LP method.]
│   ├── ecmp.py              # [Traditional ECMP method.]
│   ├── cnn.py               # [CNN-TE.]
│   ├── drl.py               # [DRL-TE.]
│   ├── fc.py                # [FC-TE.]
│   ├── lstm.py              # [LSTM-TE.]
│   └── template.py          # [A template for integrating new DNN-based TE methods.]
├── util/                    # [Utility scripts for the framework.]
│   ├── AssistFunc.py        # [Helper functions for DNN training.]
│   ├── framework.py         # [dNE core framework logic (evaluation and summarization stages).]
│   └── modeling.py          # [Converts raw data into the dNE (matrix) format.]
├── main.py                  # [Entry script for running dNE.]
└── README.md                # [This README file.]
```
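The example dataset files can be inspected directly. Assuming the `.npy` files hold plain NumPy arrays in the dNE matrix format (an assumption for illustration; see `util/modeling.py` for the authoritative layout), they can be loaded as follows:

```python
import numpy as np
import networkx as nx

# Illustrative inspection of the example topology G's data files.
capacities = np.load('dataset/capacities/G.npy')
traffic = np.load('dataset/traffic/G.npy')
topology = nx.read_graphml('dataset/topologies/G.graphml')

print(capacities.shape, traffic.shape, topology.number_of_nodes())
```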
The key features of this repository are summarized as follows:
- Implements dNE using differentiable matrix operations with PyTorch.
- Provides a template file demonstrating how to integrate dNE into any DNN training framework. Includes use cases (e.g., FC-TE, LSTM-TE, CNN-TE, DRL-TE) that illustrate how to develop specific DNN training scripts from the template.
- Implements traditional traffic engineering methods (e.g., ECMP, LP) within the dNE framework for easy comparison with DNN-based methods.
- Provides various functions to build the network matrix formats required by dNE, either from raw data or from the outputs of other traffic engineering tools.
This project does not enforce specific version requirements for the listed software and hardware. Below are the configurations we used during evaluation:
Hardware
- GPU: Nvidia GeForce RTX 3090 (CUDA version 11.6)
- CPU: 4 Intel Xeon Platinum 8268 24-Core processors with 128GB RAM
Software
- Gurobi: v10.0.2 (required for LP tasks only)
- Python: 3.6.13
- Packages:
  - numpy: 1.19.2
  - networkx: 2.5.1
  - torch: 1.10.2
  - torchaudio: 0.10.2
  - torchvision: 0.11.3
  - gurobipy: 9.1.2
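For convenience, the Python packages can be installed with pip using the version pins listed above (illustrative only; adjust the torch builds to match your CUDA setup):

```bash
pip install numpy==1.19.2 networkx==2.5.1 torch==1.10.2 torchaudio==0.10.2 torchvision==0.11.3 gurobipy==9.1.2
```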
- Run `main.py` and specify the following mandatory parameters to configure the running environment:
  - `--method`: Select the TE algorithm to use (e.g., ECMP, LP, FC, LSTM, CNN, DRL, etc.).
  - `--topo`: Choose the topology for traffic engineering (e.g., `G` in our repository).
  - `--train_flag`: Set this parameter to `True` to train the specified DNN.
  - `--metric_flag`: Set this parameter to `False` to skip the evaluation of TE metrics in this step.
- An example of training `CNN-TE` for topology `G` is shown below:

  ```bash
  python main.py \
      --method 'CNN' \
      --topo 'G' \
      --train_flag True \
      --metric_flag False
  ```
- The resulting DNN models and metrics will be stored in `results/{method}/{topo}/`.
- Run `main.py` in a similar way to the training step, but set `--train_flag` to `False` and `--metric_flag` to `True`.
- An example of evaluating `CNN-TE` for topology `G` is shown below:

  ```bash
  python main.py \
      --method 'CNN' \
      --topo 'G' \
      --train_flag False \
      --metric_flag True
  ```
| Category | Argument | Type | Default | Description |
|---|---|---|---|---|
| Mandatory | `--method` | `str` | N/A | The method to use (ECMP, LP, FC-TE, LSTM-TE, CNN-TE, DRL-TE). |
| | `--train_flag` | `bool` | N/A | Flag indicating whether to train the specified DNN or to run the LP/ECMP algorithms. |
| | `--metric_flag` | `bool` | N/A | Flag determining whether to show TE metrics (MLU, Throughput, and Congestion Loss). |
| | `--topo` | `str` | N/A | Topology name (e.g., `G` in this repo). |
| Optional (Network) | `--scale` | `float` | 1150383 | Scaling factor of the original traffic. |
| | `--k` | `int` | 4 | Number of candidate paths per flow. |
| | `--obj` | `str` | `MLU` | Objective function name (MLU, TP, or CLoss). Applicable to LP only. |
| Optional (Training) | `--train_size` | `int` | 4500 | Size of the training set. |
| | `--batch_size` | `int` | 64 | Batch size. |
| | `--alpha` | `float` | 0 | Alpha value to balance MLU and CLoss in the loss function (MLU + alpha * CLoss). |
| | `--learning_rate` | `float` | 0.01 | Learning rate. |
| | `--seed` | `int` | 76 | Random seed. |
| | `--epoch` | `int` | 100 | Number of training epochs. |
| | `--weight_decay` | `float` | 0.001 | Weight decay for the DNN training optimizer. |
| Optional (RL only) | `--buf_capacity` | `int` | 10000 | Replay buffer capacity. |
| | `--learning_rate_actor` | `float` | 0.0001 | Learning rate for the actor in DRL. |
| | `--learning_rate_critic` | `float` | 0.0001 | Learning rate for the critic in DRL. |
| | `--gamma` | `float` | 0.99 | Gamma value (discount factor) in DRL. |
| | `--tau` | `float` | 0.01 | Tau value (soft update rate for the target network) in DRL. |
| | `--sigma` | `float` | 0.6 | Sigma value (exploration noise) in DRL. |
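For instance, optional arguments can be appended to the mandatory ones; the values below are purely illustrative:

```bash
python main.py \
    --method 'CNN' \
    --topo 'G' \
    --train_flag True \
    --metric_flag False \
    --k 4 \
    --alpha 0.1 \
    --learning_rate 0.005 \
    --epoch 200
```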
- To define a new TE metric, register its name and the corresponding PyTorch formulation in `summarization()` in `util/framework.py`.
- For example, registering the normalized congestion loss metric can be done as follows:

  ```python
  # Normalized Congestion Loss
  elif objective == 'CLoss':
      CLoss = torch.sum(torch.max(L_tensor - C_tensor, torch.tensor(0)))
      return CLoss / torch.sum(D_tensor)
  ```
- Once registered, the new metric can be used in training. For example, if you want to use a mixture of MLU and congestion loss as the loss function for training DNNs, you can add the following code in the training framework (e.g., in `methods/template.py`):

  ```python
  for ...:
      ...
      L_tensor = evaluation(cur_MA, P_tensor, B_tensor, data2[j])      # Evaluation stage: load on each edge
      TE_MLU = summarization('MLU', L_tensor, C_tensor, data2[j])      # Summarization stage: calculate MLU
      TE_CLoss = summarization('CLoss', L_tensor, C_tensor, data2[j])  # Summarization stage: calculate congestion loss
      loss[j] = TE_MLU + self.alpha * TE_CLoss
      ...
      loss = torch.mean(loss)
      loss.backward()
      optimizer.step()
  ```
- Users can define their own DNN models in the `DNN` Python class within the template file (`methods/template.py`); see the sketch below.
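A minimal stand-in for such a model could look like the following. The class name `DNN` follows the template, but the layer sizes, input layout, and output shape here are hypothetical assumptions; the actual interface is defined in `methods/template.py`.

```python
import torch
import torch.nn as nn

# Hypothetical example: maps a flattened traffic matrix to per-flow,
# per-path split ratios. Dimensions are illustrative assumptions.
class DNN(nn.Module):
    def __init__(self, num_nodes, k):
        super().__init__()
        num_flows = num_nodes * (num_nodes - 1)
        self.net = nn.Sequential(
            nn.Linear(num_nodes * num_nodes, 256),
            nn.ReLU(),
            nn.Linear(256, num_flows * k),
        )
        self.num_flows, self.k = num_flows, k

    def forward(self, traffic_matrix):
        x = traffic_matrix.flatten(start_dim=-2)          # (batch, N*N)
        logits = self.net(x).view(-1, self.num_flows, self.k)
        return torch.softmax(logits, dim=-1)              # split ratios per flow
```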
- Once defined, register the new method in `run.py`. For example, if you propose a new method `CNN-TE`, you can register it as follows:

  ```python
  from methods.cnn import RUN_CNN
  ...
  elif method == 'CNN':
      running = RUN_CNN(...)
  ...
  ```
- In `util/modeling.py`, we provide several utility functions to convert raw data from common data sources to the dNE matrix format. These functions support tools like YATES, which generates realistic synthetic traffic data and candidate paths with good TE performance, and SNDLib, which contains real traffic traces. The provided functions include:
  - `Topo_dNE2Yates()`: Translates a NetworkX-format topology (dNE) to YATES format.
  - `Traffic_dNE2Yates()`: Converts dNE-format traffic (NumPy matrix) to YATES format (text file).
  - `Path_Yates2dNE()`: Converts YATES-format candidate paths (computed with the oblivious algorithm) to dNE format.
  - `Traffic_Yates2dNE()`: Converts YATES-format synthetic traffic to dNE format.
  - `Traffic_SND2dNE()`: Converts SNDLib traffic format to dNE format.
- Note:
  - Clone the YATES repository (links are provided in the next section) into the same parent directory as this dNE repository so that these functions work correctly.
  - Place the calculated candidate path results in `dataset/paths/Yates_Format/`.
- Internet Topology Zoo: Contains a variety of WANs in GraphML format that can be used for evaluation.
- SNDLib: Provides raw traffic data for several WANs.
- YATES: A powerful TE tool that can generate realistic synthetic traffic for a topology and calculate candidate routing paths using the oblivious routing algorithm, which can yield good TE performance.
If you find our repository helpful in your research, please cite our paper.
```bibtex
@article{ding2025dNE,
  title={{Learning for Accelerated Traffic Engineering with Differentiable Network Modeling}},
  author={Ding, Wenlong and Liu, Libin and Chen, Li and Xu, Hong},
  journal={IEEE Transactions on Network Science and Engineering},
  year={2025},
  month={June}
}
```