To the readers: The title "Alice's Adventures in the Markovian World" is in honor of the novel "Alice's Adventures in Wonderland" and its author, the mathematician Charles Lutwidge Dodgson. You may find this paper charming if you think of the algorithm design, the analysis, and the simulations as a verification of a reasonable way of living. Here is why:
- Alice, the algorithm, is like a real human being rather than an agent built by humans, because the situation Alice faces is like ours: we are never born with a "prediction book" telling us all of our possible futures (this is the assumption of optimal control theory); we are never born with a pre-training opportunity that lets us run millions of experiments before we step into this big game of life (this is the setting of reinforcement learning); and we are not guaranteed to be born with a policy good enough to keep us safe and sound while growing up in the world (this is the assumption of other recent online control work).
We are thrown directly into this battlefield named life and have to make decisions online while learning. It is a ridiculous situation: if we are given only one chance to live, then life, as a one-time experiment, has no meaning; but if we have an afterlife, there is no point in trying to live a good life, since life is endless by nature. Milan Kundera opened his "The Unbearable Lightness of Being" with this dilemma. Philosophers thousands of years ago already recognized this confusing situation. Plato gave his answer in "The Republic": we need to develop our reason, and seek and follow the eternal truth. Augustine of Hippo gave his answer in "The City of God": we should believe in God in order to enter the city of God after the Last Judgement. The authors of this paper try to use mathematical language to reason about how to make good decisions in the game of life: we should regret deeply, and use the experience of the past to update our future decision policy.
- It is an adventure for Alice, since she does not know the "truth" of the physical world (the A matrix). But this truth exists and keeps influencing Alice's state. This is an interesting aspect of a mathematical variable: neither Alice nor we, the analysts, know the exact truth (the A matrix). Only "God" / the environment knows this truth, and this truth rules the world.
Please install the CVX package to run the simulations. It can be downloaded here: http://cvxr.com/cvx/download/
Experiment 1: Please run simulation 1
Experiment 2: Please run simulation2and3
Experiment 3: Please comment out lines 241 to 243 and uncomment line 240 of simulation2and3, and then run the program.