WARNING: This article contains mathematics. Some of you might find math tedious and/or scary (including editors). Feel free to skip the equations, but you should read the text between the math to find out what questions go with what answers near the end of the article. Also note that the first half of the article isn't particularly earth shattering, but I must lay the groundwork. Please be patient, it picks up near the end (kinda like a blue control deck).
There is a classic word problem in probability and statistics classes. It goes something like this:
"An urn contains ten white balls and six black balls. If you remove one ball at a time from the urn without replacement (that is, you do not replace the ball you have drawn), what is the probability that you draw exactly three white balls and a black ball?"
There are three important facts we must note about the problem. The first is that we are sampling without replacement, which is stated in the problem. The second is that the order the balls are drawn in does not matter. It could be WWWB, WWBW, WBWW, or BWWW. The final fact is that there are only two colors, black and white, in the urn.
The answer to the problem can be found in what is called the hypergeometric distribution. This distribution is also the Magic player's best mathematical friend, because it answers this classic question:
"I have a forty-card deck with eighteen land. What is the probability that I will have drawn three land by turn 3?"
Again, we note three things:
1. We "sample" or draw cards from the library without replacement.
2. The order in which we draw the land does not matter.
3. There are only two "colors" in our "urn": land and non-land.
So the hypergeometric is perfectly suited to answer this question. What is the formula for the hypergeometric? Cue the pesky formulas.
H(x,t) = C(x,l)*C(t-x,d-l)/C(t,d)
...where x is the number of land you want to draw, t is the number of cards drawn, l is the number of land in the deck, and d is the number of cards in the deck. C(x,l) is known as the binomial coefficient, and it determines the number of different ways you can choose x land from the l land in the deck. These are called combinations, which is why we use the C notation. The binomial coefficient is defined as:
C(x,l) = l!/(x!(l-x)!)
H(x,t), the hypergeometric distribution, will always generate a number between zero and one, since it is a probability. The denominator determines the total number of ways that one can draw t cards from a deck of d cards. The numerator determines how many ways we can draw t cards that meet our criteria (that is, x lands). The resulting quotient gives us our probability.
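For readers who would rather script this than build a spreadsheet, here is a minimal Python sketch of the formula. The function name `hypergeom` and its argument order are illustrative assumptions, not anything standard. Note that Python's `math.comb(n, k)` takes the pool size first, the reverse of the C(x,l) notation used here. The example plugs in the urn problem from the start of the article, assuming four balls are drawn in total.

```python
from math import comb

def hypergeom(x, t, l, d):
    """Probability of exactly x land in t cards from a d-card deck holding l land.

    Mirrors H(x,t) = C(x,l)*C(t-x,d-l)/C(t,d); math.comb(n, k) is "n choose k".
    """
    return comb(l, x) * comb(d - l, t - x) / comb(d, t)

# Urn problem: 10 white and 6 black balls, 4 draws (assumed), exactly 3 white.
urn = hypergeom(3, 4, 10, 16)
print(f"{urn:.4f}")
```

Here the white balls play the role of "land" and the whole urn plays the role of the deck.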
For the rest of this article, I will assume that I am playing first. Therefore, when I am talking about turn 3, I am assuming I have drawn nine cards. This is the original hand of seven on turn 1, and the two cards drawn on turns 2 and 3.
Using the formula, we find that the probability of drawing three land by turn 3 (or in nine cards) from a forty-card deck with eighteen land is 22.27%. Of course, the real question we need to ask is:
"I have a 40 card deck with 18 land. What is the probability that I will have drawn at least three land by turn 3?"
Well, from a probability standpoint, this is the probability that I draw three or four or five or six or seven or eight or nine land. Or, in symbols:
P(3 or 4 or 5 or 6 or 7 or 8 or 9)
Thankfully, the rules of probability state that P(A or B) = P(A) + P(B), as long as A and B cannot happen at the same time. Since it is impossible to draw exactly three and exactly four lands in the first nine draws (it can only be one or the other), we can rewrite the above equation as:
P(3) + P(4) + P(5) + P(6) + P(7) + P(8) + P(9) =
H(3,9) + H(4,9) + H(5,9) + H(6,9) + H(7,9) + H(8,9) + H(9,9)
Using the above formula, we find that the probability of drawing at least three land by turn 3 (or, once again, in nine cards) from a forty-card deck with 18 land is 88.17%.
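The two figures so far (exactly three land, and at least three land) can be reproduced with a short Python sketch. The helper names `hypergeom` and `at_least` are hypothetical; the deck numbers are the ones used above (forty cards, eighteen land, nine cards seen by turn 3).

```python
from math import comb

def hypergeom(x, t, l, d):
    """P(exactly x land in t cards) for a d-card deck with l land."""
    return comb(l, x) * comb(d - l, t - x) / comb(d, t)

def at_least(k, t, l, d):
    """P(at least k land in t cards): add up the mutually exclusive exact cases."""
    return sum(hypergeom(x, t, l, d) for x in range(k, min(t, l) + 1))

exactly_three = hypergeom(3, 9, 18, 40)
three_or_more = at_least(3, 9, 18, 40)
print(f"exactly 3: {exactly_three:.2%}, at least 3: {three_or_more:.2%}")
```

Summing H(3,9) through H(9,9) is the same addition rule written as a loop, so changing the deck size or land count only means changing the arguments.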
This should look familiar, as this is exactly what Tom Carpenter did in the first part of his article "Luck And The Land Draw." Tom even presented a spreadsheet that allows us to examine other combinations of land and deck size. This is a basic tool that every Magic player should have and learn how to use. I would suggest - and this is the math professor in me speaking - that you learn how to construct that spreadsheet yourself, so you better understand what the underlying theory is.
It was the second part of his article that troubled me. He asked the following question:
"Given deck A, what is the probability that I draw at least three lands and at least a spell that costs three mana or less by turn 3?"
Now, it should be noted that Tom described a forty-card deck with eighteen lands, seventeen spells with a casting cost of three or less, and five spells with a casting cost of four or more. Let's examine our three points again.
1. We "sample" or draw cards from the library without replacement.
2. The order in which we draw the cards does not matter.
3. There are three types of cards, or "colors," in our deck: land, three-mana spells and under, and four-mana plus spells.
This problem violates the assumptions of the hypergeometric, since the hypergeometric only allows for two colors. Since in this case we have more than two types of cards we are interested in, we need a better, stronger hypergeometric. We need the multivariate hypergeometric. (Hold on, this will get rough.)
The multivariate hypergeometric is not usually covered in your basic statistics or probability class. If you do an internet search, you can find the same basic information found in this article. The term "multivariate" means more than one variable. The regular hypergeometric really looks at only one variable, say the number of land compared with cards that are not lands. The multivariate hypergeometric (MVH from now on) will allow us to calculate probabilities for forests, mountains, and two-mana and less spells all in the same deck, if we were interested in that.
The formula for the MVH is as follows: Say we have a deck made up of y1 number of type 1 cards, y2 number of type 2 cards, ..., and yn number of type n cards. The deck therefore is of size y1+y2+...+yn. We are interested in finding the probability of drawing exactly x1 of type 1 cards, x2 of type 2 cards,..., and xn of type n cards. This implies that we have drawn x1+x2+...+xn cards. The formula for finding this probability is:
MVH(x1,x2,...,xn,y1,y2,...,yn) =
(C(x1,y1)*C(x2,y2)*...*C(xn,yn))/C(x1+x2+...+xn, y1+y2+...+yn)
Now, C(x,y) is exactly the same as above for the hypergeometric. The denominator determines how many combinations of cards are possible drawing a given number of cards from a given deck. The numerator calculates the number of ways you can draw the exact combination of cards you are interested in obtaining. Let's do an example.
The deck used in Tom's article had eighteen lands (y1), seventeen cards with a casting cost of three or less (y2), and five cards with a casting cost of four or more (y3). What is the probability that you draw four land cards, four cards of 3cc or less, and one card of 4cc or more? Plugging the information into the above formula, we get:
MVH(4,4,1,18,17,5) = (C(4,18)*C(4,17)*C(1,5))/C(9,40) = .133171, or 13.3%
You can replicate this calculation in Excel using the "COMBIN" function.
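The same calculation can be sketched in Python instead of Excel. The function name `mvh` and the choice to pass the x's and y's as two tuples are assumptions made for illustration. Like Excel's COMBIN, Python's `math.comb` takes the pool size first, so C(4,18) in the notation above becomes `comb(18, 4)`.

```python
from math import comb, prod

def mvh(drawn, deck):
    """Multivariate hypergeometric probability.

    drawn[i] is the number of type-i cards wanted (the x's);
    deck[i] is the number of type-i cards in the deck (the y's).
    """
    ways = prod(comb(y, x) for x, y in zip(drawn, deck))
    return ways / comb(sum(deck), sum(drawn))

# Four land, four spells of 3cc or less, one spell of 4cc or more,
# from the deck with 18 land, 17 cheap spells and 5 expensive spells.
p = mvh((4, 4, 1), (18, 17, 5))
print(f"{p:.6f}")
```

With only two card types, `mvh((x, t - x), (l, d - l))` reduces to the ordinary hypergeometric.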
Of course, there are some limitations to this calculation. First of all, it doesn't take into account whether you have the correct color of mana to cast any of those four three-mana or less spells. While that problem is complicated, you can use the MVH to answer it. A second problem is that what we are really interested in is the probability of being able to cast a spell by turn 3. We are therefore looking for the probability of drawing at least three land and at least one 3cc or less spell in the first nine cards. In order to calculate this probability, we are going to have to find all the hands that satisfy this condition, and add all the probabilities, like we did above.
Let's start easy. Assume that we draw no 4cc or greater spells in the first nine cards. There are six possible hands that allow you to cast a spell by turn 3 (ignoring color):
Land | 3cc or less | 4cc or more
3 | 6 | 0
4 | 5 | 0
5 | 4 | 0
6 | 3 | 0
7 | 2 | 0
8 | 1 | 0
So to start the calculation, we need to find the probability for each type of hand shown in the table above using the MVH, and sum the results. Now this can be done with a loop to make life easier. Note the lands run from 3 to 8, while the 3cc or less run from 6 to 1. If we have a loop that runs from i=3 to 8 (number of lands), the number of 3cc or less spells is 9-i. That makes things much easier.
Now assume that you draw one 4cc or more spell. Here are the possible hands:
Land | 3cc or less | 4cc or more
3 | 5 | 1
4 | 4 | 1
5 | 3 | 1
6 | 2 | 1
7 | 1 | 1
Again, plugging each combination into the MVH and adding it to the previous calculation moves us closer to the final answer. Using loops would again be helpful. Note that the land runs from three to seven, while the 3cc or less run from five to one. We can let i=3 to 7 and define the number of 3cc or less spells to be 8-i.
Now if you continue this exercise out, you will note that for the above set of loops, i (the number of lands) runs from three to (eight minus the number of 4cc or more spells) and the number of 3cc or less cards can be determined by (nine minus the number of 4cc or more spells)-i. In other words, you can use two loops (let j be the number of 4cc or more spells in your hand and i be the number of lands) to generate all the possible combinations.
According to my calculations, there are twenty-one possible hands that allow you to cast at least one spell by turn 3, the final one being 3 land, 1 3cc or less card, and all 5 4cc or more cards (by the way, the probability of getting this draw is 0.00507%).
You can use Excel to generate these combinations, or you can write computer programs in languages that have loops, like BASIC, C++, PASCAL or FORTRAN. For my calculations I used a program called Mathematica, which is a programming language for math geeks, but I checked my calculations in Excel.
What is the final answer? Based on my calculations, the probability of being able to cast at least one spell by turn 3 is 87.88%. Again, this ignores checking for the proper colored mana for a given spell. It also means that there is a 12.12% chance that you will not be able to cast a spell by turn 3, which matches the number Tom presented when he broke down his calculations. Most of that comes from not having three lands (100% - 88.17% = 11.83%, from the earlier calculation); the rest comes from hands with enough land but no spell of 3cc or less.
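The two nested loops described above can be sketched in Python as follows. This is not the Mathematica or Excel work behind the article, only an illustration under the same assumptions (playing first, nine cards seen, color ignored); the names `DECK`, `CARDS`, `mvh` and `hands` are hypothetical.

```python
from math import comb

DECK = (18, 17, 5)   # land, 3cc or less, 4cc or more
CARDS = 9            # cards seen by turn 3 when playing first

def mvh(drawn, deck):
    """Multivariate hypergeometric probability of drawing exactly `drawn`."""
    ways = 1
    for x, y in zip(drawn, deck):
        ways *= comb(y, x)
    return ways / comb(sum(deck), sum(drawn))

hands = []
for j in range(0, DECK[2] + 1):              # j = number of 4cc-or-more spells
    for i in range(3, (CARDS - 1 - j) + 1):  # i = number of lands, 3 .. 8-j
        cheap = CARDS - j - i                # 3cc-or-less spells, always >= 1
        hands.append((i, cheap, j))

total = sum(mvh(h, DECK) for h in hands)
print(len(hands), "hands; last one:", hands[-1])
print(f"last hand: {mvh(hands[-1], DECK):.5%}")
print(f"castable by turn 3: {total:.2%}")
```

The loop produces the twenty-one hands in the same order as the tables, and the total should land within rounding of the figure quoted above.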
We can also use this tool to quickly explore other deck construction ideas. What if we removed one of the lands and added an extra four-mana-plus spell? The probability of being able to cast at least one spell by turn 3 falls to 84.16%. Note that the loops used to determine the 21 possible hands that result in a spell being played by turn 3 are exactly the same, since we can't draw all six four-mana or more spells and still have three land and one 3cc or less spell. A deck that has eighteen lands, sixteen three-mana and under spells, and six four-mana or more spells has an 87.69% chance of playing a spell by turn 3. This is an interesting result, and shows that swapping out one of the weaker three-mana spells for a better spell costing four mana or more has very little effect on the early game.
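To compare deck configurations like these side by side, the loops can be wrapped in a function that takes the three card counts as arguments. The sketch below is one possible way to do that; the function name `p_spell_by_turn3` and its parameters are hypothetical, and it still ignores mana color.

```python
from math import comb

def p_spell_by_turn3(lands, cheap, pricey, cards=9):
    """P(at least 3 land and at least one 3cc-or-less spell in `cards` cards)."""
    total = comb(lands + cheap + pricey, cards)
    ways = 0
    for j in range(0, min(pricey, cards) + 1):         # 4cc-or-more spells drawn
        for i in range(3, min(lands, cards - j) + 1):  # lands drawn
            k = cards - j - i                          # 3cc-or-less spells drawn
            if 1 <= k <= cheap:
                ways += comb(lands, i) * comb(cheap, k) * comb(pricey, j)
    return ways / total

# (land, 3cc or less, 4cc or more) for the three decks discussed above
for deck in [(18, 17, 5), (17, 17, 6), (18, 16, 6)]:
    print(deck, f"{p_spell_by_turn3(*deck):.2%}")
```

Counting hands first and dividing once at the end keeps the arithmetic in whole numbers until the last step.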
One more example before I go: Let's say I am constructing a two-color, forty-card Limited deck. How many of each land do I need to play so that I get at least three land, including at least one of each color, by turn 3, 85% of the time? We can use the same process I outlined above. By my calculations, there are thirty-five different nine-card hand combinations that satisfy our requirements. Here is the probability breakdown:
- 16 land (8 of each): 71.86%
- 17 land (8 of one, 9 of the other): 76.21%
- 18 land (9 of each): 80.38%
- 19 land (9 of one, 10 of the other): 83.68%
- 19 land (8 of one, 11 of the other): 82.44%
- 20 land (10 of each): 86.76%
- 20 land (9 of one, 11 of the other): 86.22%
What? I have to run twenty land? Nobody runs twenty land, since that much land tends to result in mana flood. But, if you want to keep the probability of being mana/color screwed (on turn three) to 15% or less, you need those twenty lands. This table also shows that if I run 16 land, I will get manascrewed or colorscrewed three out of every ten games. That's once every 3.3 games, which is more than every other match. It seems that seventeen land is the minimum for a two-color Limited deck. I also toyed with the ratios of the colors, just so you can see the effect of having a green-heavy R/G deck, for example.
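The two-color breakdown above can be explored with the same kind of loop, treating the deck as three card types: lands of one color, lands of the other color, and everything else. The sketch below is an illustration under those assumptions; the function name `p_three_land_both_colors` and its parameters are hypothetical.

```python
from math import comb

def p_three_land_both_colors(a, b, deck=40, cards=9):
    """P(at least 3 land in total, with at least one of each color, in `cards` cards).

    a and b are the number of lands of each color; every other card is non-land.
    """
    other = deck - a - b
    ways = 0
    for i in range(1, min(a, cards) + 1):          # lands of the first color
        for j in range(1, min(b, cards - i) + 1):  # lands of the second color
            if i + j < 3:
                continue                           # fewer than three land in total
            ways += comb(a, i) * comb(b, j) * comb(other, cards - i - j)
    return ways / comb(deck, cards)

# The land splits from the list above
for a, b in [(8, 8), (8, 9), (9, 9), (9, 10), (8, 11), (10, 10), (9, 11)]:
    print(f"{a + b} land ({a}/{b}): {p_three_land_both_colors(a, b):.2%}")
```

Changing `deck` or `cards` lets you try other deck sizes or turns without rewriting the loops.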
The applications of the multivariate hypergeometric in Magic are limitless. Hopefully, I have given you the tools to find the answers to some of those harder deck construction questions. I would suggest that you try to re-create my answers above before tackling your own questions. And I will admit, there are other ways to approach these problems, simulations being one example. I must stress that this material is not easy, but tools such as these can help with the deck construction process.
I would like to thank Tom Carpenter for inspiring me to explore this material.