Adventure Games and Eigenvalues

April 23, 2017

Do adventure games have dead ends? The author of The Teeny Tiny Mansion proves the absence of dead ends in that game via formal verification (in particular, with the help of the cbmc model-checking program). Here I’ll demonstrate how to prove the same thing via Markov theory. The Markov approach requires more mathematics, but less computer code, and lets us quantify the risk of encountering a dead end via matrix manipulation.

Note: both methods are impractical with large state spaces (i.e. anything much larger than TTTM). If your goal is simply to identify dead ends, you’re probably better off with some kind of route-finding algorithm (through the state space) borrowed from graph theory. But the Markov approach below works well for small problems, and has some interesting mathematical properties that allow for calculations beyond what pure graph algorithms and formal verification methods provide.

Theory

There are three classes of game states: the “finished” state (game is over), an “invalid” state (from which it is impossible to finish the game), and a “valid” state (from which the game can still be finished). Think of game states as states of a Markov process, and randomly chosen actions as triggering transitions between states with some defined probability. This is the mathematical version of someone who plays TTTM by randomly choosing actions until the game is finished, and who keeps attempting actions (forever) even if an invalid state is reached.

In Markov terminology, a set of mutually accessible invalid states form an ergodic set (or recurrent class) from which there is no possibility of escape. The finish condition is also an ergodic set, but it has only one state. All other states are transient states — if the player plays forever, the probability of being in a transient state at any moment (in the distant future) approaches zero. That is, the player will eventually finish the game or get stuck in a set of invalid states.

It is easy to construct a stochastic matrix for the TTTM game — first index each possible game state \(i = 1,\ldots,N\), and construct an \(N \times N\) matrix that represents the transition probability between each pair of states. In TTTM this can be done by calling query_actions on each input state \(i\), and then assigning probability \(1/K_i\) to each output state produced by apply_action (assuming there are \(K_i\) possible actions for input state \(i\)). Note that multiple actions may lead to the same output state, in which case the transition probability will be a positive multiple of \(1/K_i\). Each row in the stochastic matrix represents an input state, and each column represents an output state. Each matrix value represents the transition probability between the two states, so each row sums to 1. The last row (and column) corresponds to the finished state, so the last row will be full of zeros except for the last entry, which will be 1.

The state space in TTTM is simple to enumerate. There are 2 controllable characters (Alice and Bob), 4 possible locations for each character (Red Room, West Room, East Room, Blue Room), and 4 possible locations for each of the three colored keys (West Room, East Room, Alice’s pocket, Bob’s pocket). Including the special “finished” state, \(N=2\times 4^2\times 4^3+1=2,049\). In this case, mapping between states and indexes might be achieved with a bit-encoding scheme — 1 bit for the character, 2 bits for each of two character locations, and 2 bits for each of three key locations to make an 11-bit number. This number can encode any game state, apart from the special “finished” state (but will need to be shifted by 1 to conform to the usual 1-indexing of linear algebra).

With a transition matrix \(P\) in hand, the recurrent classes can be identified by computing the right eigenvectors for all of the matrix’s unit eigenvalues. Each eigenvector then corresponds to a recurrent class, and will have a value of 1 for each state that appears in the class (dead end or finished state), a value of 0 for each state in other recurrent classes, and a value between 0 and 1 for all of the transient states.^ref If there are no invalid states in the game, the stochastic matrix will have exactly one unit eigenvalue, corresponding to the finished state. (Using the code at the end of this article, I found that the final version of TTTM has 59 recurrent classes.)

Thus all sets of invalid states in the game can be counted and identified using only query_actions and apply_action (and eigendecomposition). There is no need to define “game progress” or implement a naive game AI, as formal verification methods appear to require.

But a game with invalid states may still be perfectly playable — what the game designer wishes to avoid are “dead ends”, or sets of invalid states that are reachable from the beginning of the game. Because the transition matrix \(P\) includes no information at all about initial states, the analysis thus far has no way of distinguishing proper dead ends from invalid (but unreachable) states. As an example from TTTM, an invalid but unreachable state would be one where both characters are in the Red Room, but all the keys are in the East Room — this game state should be impossible from any sane set of initial game conditions.

The transition matrix \(P\) can be tricked into encoding initial game states in the following way: instead of assuming a finished game remains finished forever, assume that the only “action” available from the finished state is to restart the game — so that in the last row of the matrix, all of the initial game states now have equal probability weight (summing to 1), and all other states contain a zero. Call this modified matrix \(P^*\), the transition matrix of a game that automatically restarts on game completion.

With this modified game, there are now two possibilities for the finished state: either it is a transient state (and will ultimately lead to a dead end), or else it is part of one of \(P^*\)’s recurrent classes. In fact, the recurrent classes of \(P^*\) will be identical to the recurrent classes of \(P\) with the following exception: either it will be missing the recurrent class that originally included the finished state, or else that recurrent class will be expanded to include not just the finished state, but also every initial state, and every state reachable from an initial state. That is, in the absence of dead ends, there will be a recurrent class representing all valid gameplay.

Because the number of recurrent classes in a transition matrix is always equal to the number of unit eigenvalues, a simple way to verify the absence dead ends in the game is to answer the following question: Is the number of unit eigenvalues of the matrix \(P\) equal to the number of unit eigenvalues of the matrix \(P^*\)? If the counts are equal, there are no dead ends in the game. If the counts are unequal (i.e. \(P\) has one more unit eigenvalue than \(P^*\)), then a dead end must be present in the game. A check of this condition easily verifies that the “bad design” version of TTTM has dead ends, but the corrected version of TTTM does not.

The matrix formulation allows for other interesting questions to be asked. Given an initial state vector s, the long-run steady state is given by \(s^* = \lim_{t\to \infty} P^t s\). So we can see how “bad” a bug is by asking what percent of the time a button-mashing player will finish the game (equal to the last entry in \(s^*\)) versus hit a dead end (the sum of all other entries). I calculate for instance that “character needs help” bug in TTTM will affect 71.3% of button-mashing games, and the “green key” bug will affect a quarter of such games. (If both bugs are present, then 78.5% of such games will terminate in a dead end.) This information could be useful when determining the priority of bug fixes, or when weighing the benefits of introducing a hot-fix for your next bestselling text adventure game.

The TTTM formal verification code requires a value called MAX_FINISH_DEPTH, which represents the maximum number of steps required to finish the game from any valid state. It is unclear how the TTTM author arrived at the number 18, but the value can easily be calculated using the matrix representation of the game. Recalling that the finished state has the last matrix index \(N\), let \(V\) be the set of valid states, that is, states that will transition to the finished state with positive probability over an infinite time horizon: \(V = \{ v : \lim_{t\to \infty} (P^t)_{v, N} > 0 \}\). Then MAX_FINISH_DEPTH is equal to smallest number of transitions required to move all valid states to the finished state with positive probability: \(\min \{ t : (P^t)_{v, N} > 0,\; \forall v \in V \}\). In this manner, I find that the correct value is 14, rather than 18. Knowledge of this smaller value may be able to improve the running time of the formal verification code.

Code

The mathematical verification code is about one-third the size of the formal verification code presented by the TTTM author (150 lines of Julia versus 500 lines of C for the same tasks). The mathematical code requires no particular knowledge about the game itself apart from mapping game states to index values; everything else is calculated from the stochastic matrix, which is constructed using only repeated calls to (ported versions of) query_actions and apply_action. In contrast, a majority of the TTTM formal verification code requires access to, and knowledge of, internal game state. The mathematical code thus has better scaling properties in the sense that as the game becomes more complicated, the size of the mathematical code will grow much more slowly than the formal verification code. Furthermore, there is no need to limit the “search depth” in the mathematical code; conceptually, it operates on an infinite horizon.

The \(P\) matrix for this problem has over four million entries (\(2,049 \times 2,049\)), but the large majority of these entries will be zero. If you’re coding this up on a home PC from the early 90s, or working with a larger state space, a sparse matrix representation may be desirable.

For verifying the absence of dead ends, the run-time speed of the mathematical script appears to be faster than the “bounded” method described by the TTTM author (13 seconds versus 43 seconds on my machine), but slower than the “unbounded” method (3 seconds). Changing MAX_FINISH_DEPTH to 14 improved the run-time of the bounded method to 23 seconds, but resulted in an assertion failure. This failure indicates that there is room for improvement in the (simple) game AI needed by the formal-verification code.

The majority of the time in the mathematical script is taken up computing eigenvalues, a function call which I have not optimized. The computational complexity of finding eigenvalues is (typically) \(O(N^3)\); so the mathematical method might start to get frustrating with a state space much larger than the one in TTTM. For example, adding an extra key in TTTM — say, a skeleton key that can open any door — expands the state space by a factor of 4, which increases the matrix size by a factor of 16, and increases the time required for the computation of eigenvalues and eigenvectors by a factor of 64. But some clever optimizations might make a larger state space tractable — a sparse matrix representation requires only \(O(N\bar{K})\) space (where \(\bar{K}\) is the average number of actions available from a particular state), and something like Arnoldi iteration might be able to yield just the unit eigenvalues without a full \(O(N^3)\) computation.

It takes somewhat longer to compute how “bad” each bug is, as this computation requires the full set of left and right eigenvectors; on my machine this additional computation takes about 30 seconds. The computational complexity of this operation is also \(O(N^3)\); but there maybe fewer avenues for possible optimization compared to calculating only the eigenvalues.

Conclusion

Does Markov theory have role in the verification of adventure games? For the simple case of TTTM, the method works well and provides some interesting insights into the game behavior. But for larger games, the method as implemented quickly becomes less attractive, as the \(O(N^2)\) space requirement and \(O(N^3)\) time requirement make clear. Identifying more efficient algorithms for counting unit eigenvalues is certainly worth filing under “Future Research”; but as mentioned in the prefatory note, games with large state spaces might need an altogether different approach.

The mathematical method described here has a number of advantages over formal verification, including reduced code size and complexity. The primary deficiency of the mathematical method is that it runs significantly slower than the “unbounded” method described by the TTTM author. One possible avenue for optimization would be using the notion of game progress (defined by the unbounded method) in order to cull actions that go “backward” — this would produce a sparser transition matrix than the original matrix, which sparse-matrix functions will have an easier time processing.

Overall, I don’t anticipate that entire adventure games along the lines of King’s Quest — or even Colossal Cave Adventure — will be encoded in giant stochastic matrices anytime soon, but the methods described here might easily find a role verifying particular puzzles that have well-defined beginning and ending states — or various non-game workflows exhibiting complicated state transitions. So as an additional tool for testing and verifying bits of software, Markov theory might help developers get through a maze of twisty passages or two.

References

Gallager, R. G., Stochastic Processes, Theory for Applications, Cambridge University Press, Cambridge, UK, 2013. See in particular Chapter 4: Finite State Markov Chains and Section 4.4.2.

You’re reading evanmiller.org, a random collection of math, tech, and musings. If you liked this you might also enjoy:

Evaluating Splatoon’s Ranking System
Splatoon’s Ranking System Is Still Broken
How Not To Sort By Average Rating

Get new articles as they’re published, via LinkedIn, Twitter, or RSS.

Want to look for statistical patterns in your MySQL, PostgreSQL, or SQLite database? My desktop statistics software Wizard can help you analyze more data in less time and communicate discoveries visually without spending days struggling with pointless command syntax. Check it out!

Wizard
Statistics the Mac way

Back to Evan Miller’s home page – Subscribe to RSS – LinkedIn – Twitter