Stochastic Deep Q-Learning Model for Joint Optimization of 5G Cellular Network Delay and Energy Efficiency
Objective: Minimize the number of active radio units (RUs) and distributed units (DUs) while assigning RUs to DUs.
Topology: There is a mesh topology between the RUs and DUs, i.e., any RU can have any DU perform its higher-PHY functions. Multiple RUs can connect to one DU, but each RU can be connected to at most one DU.
We use a stochastic Q-learning model to solve this problem because it scales to large discrete action spaces: instead of evaluating every action, it scores only a random subset of candidates at each step.
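The sampling idea can be sketched as follows: rather than taking an argmax over the full action space, we draw a random subset of actions and take the best among those. The `stoch_argmax_action` helper name and the sample size are illustrative assumptions, not part of the repo's API:

```python
import random

def stoch_argmax_action(q_values, action_ids, sample_size=32, rng=random):
    """Approximate the argmax over a large discrete action space by
    scoring only a random subset of candidate actions."""
    sampled = rng.sample(action_ids, min(sample_size, len(action_ids)))
    # Greedy choice restricted to the sampled candidates.
    return max(sampled, key=lambda a: q_values[a])
```

With thousands of actions, each step then costs only `sample_size` Q-value evaluations instead of a full sweep.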
Q-learning requires three structures: a state space, an action space, and a reward function.
| Variables | Symbol |
|---|---|
| Number of radio units | $n$ |
| Number of user equipments | $k$ |
| Number of distributed units | $m$ |

| Sets | |
|---|---|
| Set of UEs | |
| Set of DUs | |
| Set of RUs | |
We define the delay matrix $D \in \mathbb{R}^{n \times m}$, where entry $d_{ij}$ is the delay from RU $i$ to DU $j$. At time $t$, we represent our state with a vector encoding the current RU-to-DU assignments and the active/sleep status of each RU and DU.
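A state encoding along these lines can be sketched as below. The one-hot-assignment-plus-flags layout and the `build_state` helper are my assumptions; $n=24$ RUs and $m=6$ DUs come from the testbed parameters:

```python
import numpy as np

def build_state(assignment, ru_active, du_active, n=24, m=6):
    """Hypothetical state vector: a one-hot RU-to-DU assignment matrix
    flattened, followed by active/sleep flags for every RU and DU."""
    onehot = np.zeros((n, m))
    for ru, du in enumerate(assignment):
        if ru_active[ru]:          # sleeping RUs contribute no assignment
            onehot[ru, du] = 1.0
    return np.concatenate([onehot.ravel(),
                           np.asarray(ru_active, dtype=float),
                           np.asarray(du_active, dtype=float)])
```

Under this layout the state has $nm + n + m = 174$ entries.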
The action space here is discrete: we only want the agent to be able to reassign RUs to DUs and to either wake up RUs/DUs or put them to sleep.
We define one vector giving each RU's assigned DU and another giving the sleep/wake status of every RU and DU; together, these vectors represent our action space.
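One way to flatten this discrete action space into indices a DQN can score is sketched below. The index layout and the `decode_action` helper are assumptions, sized for the $n=24$, $m=6$ testbed:

```python
def decode_action(a, n=24, m=6):
    """Map a flat action index to a primitive operation:
    indices [0, n*m) reassign RU r to DU d; the next n indices toggle an
    RU's sleep state; the final m indices toggle a DU's sleep state."""
    if a < n * m:
        return ("reassign", a // m, a % m)   # (op, ru, du)
    a -= n * m
    if a < n:
        return ("toggle_ru", a)
    return ("toggle_du", a - n)

NUM_ACTIONS = 24 * 6 + 24 + 6   # 174 discrete actions under this layout
```

The Q-network then outputs one value per flat index, and the agent decodes the chosen index back into an operation.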
Elaborate: will we make a decision once every time interval, or at every time step?
The reward should discourage switching DUs and RUs too frequently.
At time $t$, the reward combines a delay penalty and an energy penalty on the number of active units with connection-capacity rewards, weighted by constants $\alpha$ and $\beta$.
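A reward of this shape might look like the following sketch. The $\alpha$ and $\beta$ values match the pretrained model names; the exact functional form and the `switch_penalty` term are assumptions:

```python
def reward(total_delay, active_units, capacity_ok, switches,
           alpha=1.2, beta=0.4, switch_penalty=0.1):
    """Hypothetical reward: penalize delay (weight alpha) and energy via
    the number of active units (weight beta), reward satisfied connection
    capacity, and discourage frequent RU/DU switching."""
    return (capacity_ok
            - alpha * total_delay
            - beta * active_units
            - switch_penalty * switches)
```

Raising $\alpha$ relative to $\beta$ trades energy savings for lower delay, which is presumably what the two pretrained variants compare.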
We will be using a custom testbed environment built in Python (can be found in `src`) to train and test the model using these parameters:

- $n=24$ RUs evenly spaced at 100m
- $m=6$ DUs
- $k=80$ UEs moving in a random walk
- 1000m by 1000m simulation area
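The UE mobility can be sketched as a random walk clipped to the square simulation area. The step size and the `step_random_walk` helper are assumptions, not the testbed's actual API:

```python
import numpy as np

def step_random_walk(positions, step=5.0, area=1000.0, rng=None):
    """Advance each UE one random-walk step in a uniformly random
    direction, keeping it inside the [0, area] x [0, area] square."""
    rng = rng or np.random.default_rng(0)
    angles = rng.uniform(0.0, 2 * np.pi, size=len(positions))
    moves = step * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return np.clip(positions + moves, 0.0, area)

# k = 80 UEs scattered over the 1000m x 1000m area.
ues = np.random.default_rng(1).uniform(0, 1000, size=(80, 2))
ues = step_random_walk(ues)
```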
We use PyTorch to implement the stochastic DQN.
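A minimal sketch of what the Q-network might look like in PyTorch; the `StochDQN` name, layer widths, and depth are assumptions. The network maps a state vector to one Q-value per discrete action:

```python
import torch
import torch.nn as nn

class StochDQN(nn.Module):
    """Sketch of the Q-network: an MLP from the state vector to a
    Q-value for every discrete action (layer sizes are assumptions)."""
    def __init__(self, state_dim, num_actions, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, x):
        return self.net(x)
```

During action selection, only the sampled subset of the output Q-values needs to be compared.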
Evaluate the model by pulling the repo, then running:

```
cd stochdqn-oran/test
python3 train.py
```

Pretrained models can also be found in `models`.
`stochastic_dqn_alpha_1.2_beta_0.4` and `stochastic_dqn_alpha_0.6_beta_0.4` are both neural networks that take an input of size