Randomness without Replacement
by David Kennerly


ADVERTISEMENT

A blade spider is at your throat. It hits and you miss. It hits again and you miss again. And again and again, until there's nothing left of you to hit. You're dead and there's a two-ton arachnid gloating over your corpse. Impossible? No. Improbable? Yes. But given enough players and given enough time, the improbable becomes almost certain. It wasn't that the blade spider was hard, it was just bad luck. How frustrating. It's enough to make a player want to quit.

"Game mechanics" is one of those terms that every designer talks about. Everybody agrees that game mechanics is an important subject within game design. Yet almost nobody discusses, in detail, how the design of a mechanism may satisfy a player. In this article, I'm going to dissect a mechanism germane to role-playing games, the randomization function for determining whether a player's attack hits or misses its target. Roll up your sleeves for a modest amount of mathematics—just enough to demonstrate what a game mechanism is.

Reducing Frustration

In the opening story, one way to prevent frustrating results is to reduce randomness. Several massively multiplayer role-playing games, such as Lineage 2, reduce variance of hits and misses and reduce the variance of damage. Reason? As Alex Chacha stated, if the probability for a string of failures is above zero, then the probability for that string occurring at least once in the lifespan of a long game is very high ("Randomness," MUD-Dev Mailing List, February 2004.). And once it does, it is a frustrating experience. So, in this article, let us call such a long string of consecutive failures frustration.

To reduce the occurrence of frustration, one possibility is to change the mechanism that models the randomness. For instance, let's change the method from sampling with replacement to sampling without replacement.

But first, to understand this lingo, we need to know a little probability theory. It shouldn't be too hard for us, since probability theory was commissioned, in 1654, as the science of dice and card games. So we're in familiar territory: the mathematics of dice and cards. In probability, when a result of a random function is taken, it is called sampling. There are two basic methods for sampling: with replacement (as in a die roll), or without replacement (as in drawing cards from a deck). The difference seems simple, but the implications, as we shall see, may be dramatic.

Back to the problem at hand: To reduce frustration, change from sampling with replacement to sampling without replacement. Basically, instead of rolling a die, draw a card. Let's use Open Game Content (OGC) as shared vocabulary. This is the system that Wizards of the Coasts based Dungeons & Dragons 3.5 on. So, instead of rolling a twenty-sided die (1d20), draw from a deck of twenty cards, perfectly shuffled.

The key difference in the mechanism may be illustrated by a deck of cards alone. With replacement, after each draw, the card is shuffled back into the deck. Without replacement, the drawn cards are then discarded. Only after the deck has been consumed is the discard pile shuffled to reconstitute a full deck. To keep the analysis simple and further reduce frustration, remove one of the hit cards from the deck, and shuffle the remaining 19 cards. Then place this hit card on top of the shuffled deck.

Suppose, in OGC terms, that the player has a melee attack of +0, then he would hit armor class 11 about 50% of the time. This may be modeled, with replacement, as a probability of success of 50%. To model this probability, without replacement, imagine there are 10 hit cards and 10 miss cards. If details are required to account for modifiers, then there may be 20 cards, valued from 1 to 20. Either way, 10 of these cards will result in a hit, and 10 will result in a miss. Without replacement, the player could sample 10 misses in a row, but the probability of this frustration is much less than by sampling with replacement. But how much less probable is it?

Counting Cards

Calculating the probability of frustration with replacement is simple. Suppose f equals the number of consecutive misses that will result in frustration, which we shall consider at 10 consecutive misses. With replacement, this is equivalent to a binomial distribution with a given probability of missing being the measure of consecutive trials. With a probability of a single failure being 50%, we may understand Equation 1. This logic is simple. The probability of failure ten times a row is the multiplication of the probability of failure ten times. So, in a sequence of ten attacks, the probability of frustration is about one in a thousand.


Equation 1. The probability of frustration, with replacement.

Calculating the probability of frustration without replacement is complicated, because the probability of each attack hitting depends on the success or failure of the previous attacks. A general method to solve problems of discrete probability is to count all the ways in which the event may occur, and divide this number by the count of all ways in which any outcome may occur.

In order to count possible occurrences of frustration, recall that the probability equals the number of ways to have a consecutive string of 10 misses out of 20 attacks. To simplify analysis and further limit frustration, set the first attack to be one of the hits. This corresponds to rolling a natural 20 in OGC. Since frustration (i.e., the string of 10 misses) is immutable, this is equivalent to the number of ways to select a single string of 1 out of 10. Recalling the basics of combinatorics (which is the mathematics of counting), this simplification makes computation trivial, as shown in Equation 2.


Equation 2. How many ways an attacker may be frustrated, without replacement.

Then count the total number of ways to have 9 hits and 10 misses in a sequence of 19 attacks, as shown in Equation 3.


Equation 3. How many ways an attacker may attack, without replacement.

By dividing the count of frustrations by the count of all attacks, we arrive at the probability of frustration, as demonstrated in Equation 4. So, the probability of frustration in the course of twenty attacks is about one in ten thousand.


Equation 4. The probability of frustration, without replacement.

By dividing the solution of Equation 4 by Equation 1, we see that the sample without replacement decreases the probability of frustration by an order of magnitude, shown in Equation 5. To keep the analysis simple, multiply the probability of frustration with replacement by 2 (or divide without replacement by 2), since the probability in a sequence of 20 attacks (without replacement) is being compared to a sequence of 10 attacks (with replacement).


Equation 5. Nonreplacement dramatically decreases frustration.

Clearly, this mechanism reduces frustration. Furthermore, the mechanism without replacement guarantees never to have more than 10 consecutive misses (since the 1+20kth occurrence always hits).

Although the above analysis is only a solution for one of the parameters that a player may have of hitting an opponent (i.e., melee attack of +0 versus armor class 11), the same analysis can be carried out on all possible parameters. Likewise, this analysis can be carried out for any given lower bound to frustration. If worst-case analysis of player satisfaction had predetermined 5 consecutive misses to be the maximum tolerance before frustration occurs, then the deck could be divided into two separate decks, each without replacement, so long as each deck had no more than 5 in it. For the sake of uniformity, two decks might be evenly divided (i.e., 75% success rate = 7 out of 10 and 8 out of 10).

God Does Not Play Dice—He Plays Cards

This deck mechanism is not without its costs. A little detour from the mathematics department and into the computer science department will explain why. Each mechanism is implemented from an algorithm (which is a precise procedure for the rules of the game). Computing the amount of time required for the die mechanism is straightforward, as listed in Table 1.

Step in the algorithm Computations
For 20 attacks: 20
            Roll a die.             d
Total Time Complexity 20 d

Table 1. Algorithm outline of a die mechanism.

The deck mechanism is not much more complicated. A naive shuffling algorithm can shuffle in linear time with one random call per element. So, this implies a running time not worse than 19 fold of the running time of the die mechanism. This shuffling only needs to be called once every 20 attacks. As listed in Table 2, the time complexity (which is an abstraction of the amount of time required to execute the algorithm) for a card mechanism is greater than the die mechanism, but not by much. Although there are two operations to perform instead of one, the drawing of a card (c) from a shuffled deck is a trivial operation. In addition, within an efficient shuffling algorithm, shuffling one card in a deck of 19 (s) is approximately comparable to rolling a die with 19 sides (d).

Step in the algorithm Computations
Shuffle a deck of 19 cards. 19 s
For 19 attacks: 19
            Draw a card.             c
Total Time Complexity 19 c + 19 s = 19 (c + s)

Table 2. Algorithm outline of a card mechanism.

A more elegant algorithm may shuffle one card per attack, which means one randomization per attack, as listed in Table 3. This does not alter the total time complexity, but does distribute the computations more evenly, so that there is not a delay (for shuffling) after each 20th attack.

Step in the algorithm Computations
For 19 attacks: 19
            Draw a card.             c
            Shuffle that card.             s
Total Time Complexity 19 (c + s)

Table 3. Algorithm outline of a gradually shuffled card mechanism.

Of course, compared to the die, the deck requires memory, but this may be as little as an additional 19 bits per player to as much as 50 bytes per player.

Albert Einstein once remarked on another detailed mechanism, a mechanism for the elementary fabric of the universe. When presented with the random mechanism for modeling quantum physics, Einstein said, "God does not play dice." In the case of our RPG, maybe the Creator plays cards, instead.

Does it Matter?

Does it matter whether a die or deck is used? Both probabilities are negligible, making it unlikely for a designer to encounter such frustration. Yet even small probabilities, such as one in a thousand, are worth considering given enough players.

To understand, first we need to learn another distribution common to discrete probability. We want to know how many instances of player frustration will occur. Each instance is rare, so the mean of a Poisson distribution, E(X), is a good approximation, as formulated in Equation 6.


Equation 6. Poisson distribution models the count of frustrations.

In general, the Poisson distribution is used to measure the number of occurrences (λ) per unit time. We have already calculated the probability of a single instance of frustration (p). What remains to be determined are the number of attack strings in which frustration may occur (n). Let's consider two scenarios with sufficient sequences of player attacks, both in massively multiplayer games and in single-player games.

Case Study: Massively Multiplayer RPG

For a simple example in a massively multiplayer role-playing game (MMORPG), there could average 1000 attacks per player per hour. To be crude, this could be coarsened to 100 strings of 10 attacks (Actually, only one of the ten possible combinations of ten consecutive failures conform to this demarcation). If—admittedly oversimplified—there were, for a reasonably small MMORPG, an average of 100 simultaneous players per day with uniform distribution of attacks, and independence of attacks, then there would be 100 strings / hour (24 hours / day) 30 days / month. The arithmetic equals 72,000 strings per month. Since this is very large and the probability is very small, the Poisson distribution approximates, with a rate of λ = n p. The expected number of frustrations is solved in Equation 7.


Equation 7. Expected number of frustrations per month, with replacement.

So there's going to be, on average for a low-traffic MMP, given these simplistic preconditions, 70 strings of frustration (i.e., 10 consecutive misses) per month using a mechanism analogous to a 1d20 die roll. Whereas, without replacement (a mechanism analogous to a 20-card deck), the expected number of frustrations is computed in Equation 8. For approximation, the number of trials is divided by two, because this is a sequence of 20, instead of a sequence of 10.


Equation 8. Expected number of frustrations per month, without replacement.

So, there are about 18 times fewer occurrences of frustration. A distribution of the frustration may be estimated by the Poisson distribution, as shown in Figure 1. Because only whole numbers can occur, the distribution has been stepped at the rounded (midpoint) value of a continuous Poisson distribution.


Figure 1. Distribution of frustration without replacement (blue) or with replacement (magenta).

From comparing the distributions, it is obvious that without replacement, there are almost certainly going to be fewer cases of frustration.

Case Study: Single-Player RPG

There's no theoretical or practical reason why such methods could not be applied to single-player and multiplayer RPG combat. The fact is, a game that sells a million copies, can expect similar figures to an MMP. It is a difference of penetration and time. The same Poisson distribution applies. Suppose the average player of a single-player RPG spends 10 hours with the game (since many more players quit sooner than those that play longer). If the rate of attacks remains constant (1000 attacks per hour), then only 7200 copies of the game need be sold to have the equivalent of a month of the modestly populated MMORPG above.

Every Mechanism Should be Designed as Simple as Possible, but Not Simpler

Einstein also said, "Everything should be made as simple as possible, but not simpler." Don't let mathematical precision dominate your analysis of a mechanism. While understanding the subtleties of a mechanism can make or break the design of a game, the relevance of the number crunching must be maintained.

In this example, we computed the difference, not in terms of all kinds of player frustration that may occur, but only with one single variation to a mechanism within the game. We simplified the situation in order to keep the math brief. We did not consider players with probabilities to hit at above or below 50%, nor did we strictly adhere to the definition of independence for a Poisson distribution in the die mechanism. Astute readers will have also noted that the deck mechanism does not randomize the first card drawn. Doing so would further complicate the calculation of probability. Therefore, our result is only an approximation for a special deck.

A more precise analysis was not necessary to prove the fitness of the deck mechanism. Bear in mind that a designer can correctly solve a problem but fail to solve the right problem. In this article, we honed in on one mechanism, ignoring the rest of the game and its effect on the player's satisfaction. Even if it were mathematically tractable, computing the distribution of loss of players due to this careful definition of one type of frustration is a tougher problem.

About the Author

David Kennerly directed five massively multiplayer games in the US and Korea. He localized Korea's first world, The Kingdom of the Winds, and designed the social system of Dark Ages: Online Roleplaying. Before joining Nexon in 1997, he designed The X-Files Trivia Game for 20th Century Fox, and troubleshot US Army networks in Korea.

David encourages creativity among developers and players. He helped organize MUD-Dev Conferences, and founded an online library of fan fiction. David has authored on game design for Charles River Media, ITT Tech, Westwood College, Gamasutra.com, and IGDA. To discuss this article with the author, please visit his website www.finegamedesign.com

Further Reading

Jay L. Devore, Probability and Statistics for Engineering and the Sciences. "Chapter 3: Discrete Random Variables and Probability Distributions." 6 ed. Brooks/Cole: USA, 2004.

Discuss this article in the forums


Date this article was posted to GameDev.net: 2/16/2005
(Note that this date does not necessarily correspond to the date the article was written)

See Also:
Featured Articles
Game Mechanics

© 1999-2011 Gamedev.net. All rights reserved. Terms of Use Privacy Policy
Comments? Questions? Feedback? Click here!