Please make your example easier to follow. You could use plots or graphs to visualize it (the states, etc.).
A simple didactic version of Q-learning on this example is definitely doable. Just follow the dynamic programming flow specified by the Bellman recursion.
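To make that concrete, here is a minimal sketch of tabular Q-learning. Since I don't have your exact example in front of me, it assumes a made-up 5-state chain (states 0 to 4, actions left/right, reward 1 on reaching state 4); the names `step`, `q_learning`, `N_STATES`, `GAMMA` and `ALPHA` are purely illustrative. Each update just moves Q(s, a) toward the Bellman target r + gamma * max_a' Q(s', a'):

```python
import random

# Hypothetical toy problem (not the example from the post):
# a chain of states 0..4, where state 4 is terminal and entering it pays 1.
N_STATES = 5
ACTIONS = (0, 1)      # 0 = left, 1 = right
GAMMA = 0.9           # discount factor (assumed)
ALPHA = 0.5           # learning rate (assumed)


def step(state, action):
    """Deterministic transition: move left or right, clipped at state 0."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so purely random exploration is fine.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bellman target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(Q[next_state])
            Q[state][action] += ALPHA * (target - Q[state][action])
            state = next_state
    return Q


Q = q_learning()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
for s in range(N_STATES - 1):
    label = "right" if policy[s] == 1 else "left"
    print(s, [round(q, 3) for q in Q[s]], label)
```

Printing the Q-table per state like this already works as a crude visualization; the same rows could be fed into a heatmap or drawn on a state graph. On this toy chain the greedy action should come out as "right" in every state, since that is the only way to reach the reward.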