This is a node grid optimizer for the fake e-sport open-ended puzzle game Q-UP. It implements all nodes for the Wizard class and a small number of simple items. While it has some known shortcomings, this project is nonetheless good enough to get the Wizard to Novice in 98 flips (with items managed manually and not particularly well).
The algorithm is a simple greedy hill-climber with compound mutations. Each step applies one or more mutations to the current best configuration to obtain a new configuration, which is evaluated under relevant conditions (win/loss and/or Q/UP, depending on the nodes' trigger conditions). The new configuration is accepted if its expected value is greater than the previous best. The number of mutations applied in each step is chosen from a geometric distribution; applying multiple mutations at once allows the hill-climber to escape local maxima. Each mutation moves a node to a new cell (swapping with the node previously in that cell, if any), removes a node, replaces a previously moved node, or moves one or more upgrade points from one or more upgrade tracks to another upgrade track (including a pseudo-track representing unallocated points). When a node is moved or placed, the target cell is selected using a "heatmap" that gives higher weight to cells closer to other nodes. The algorithm terminates after a specified number of steps without improvement or when it reaches a time limit. In practice, separate instances of the algorithm are run repeatedly on all available threads until a timeout is reached (often 3-10 minutes, scaling with the number of unlocked nodes and upgrade points and thus with character level).
The evaluation function is recursive, but it does not recurse through the functions implementing the items and nodes. Instead, each function maps an input state to a list of (state, probability) pairs, where the state includes a stack of pending node activations. The evaluation function recurses on the next states, multiplying and summing the probabilities. Storing the stack of pending activations explicitly in the state object, rather than implicitly in the program counters of the node function stack frames, allows merging the evaluation of equivalent states by memoizing the evaluation function. (This is not useful for the Wizard, whose nodes don't involve randomness, but would be important for other characters.)
This project has a number of shortcomings (besides its limited scope):
- Evaluation ignores Q-Down on losses, so it overvalues mult.
- Evaluation considers only one flip. This isn't a problem for the Wizard, whose nodes don't care about previous flips, but it does impact some items and would strongly impact other characters.
- Evaluation ignores elements that depend on other players, such as the presence of other classes in the match, or their current Q or mult.
- The heatmap generation code handles all nodes similarly. It ought to only consider nodes that care about their neighborhood when assigning heat. If the partial implementation of Robot nodes is to be completed, the heatmap would need to consider linear neighborhoods.
- This project is written in Python, so it spends a lot of time managing memory and chasing pointers.
This project is based on the semantics of build ID 20749532. Some Wizard nodes have quirks that are probably bugs, and if they've been fixed, the code will need adjustments.