Stats 101: Am I Shuffling Enough - Or Correctly, For That Matter?
My wife, Natalie, had just built a new deck for an upcoming tournament and she wanted to do a little playtesting... So I quickly built a U/G madness deck, using twenty-four lands. I pulled the thirty-six nonland cards first, then found the twenty-four lands that I would use. I decided at that point to conduct an experiment to see how well I shuffle my deck.
I turned the twenty-four lands upside down, so I could easily see which cards were lands. I then proceeded to shuffle my deck like I normally would. It should be noted that as I began to shuffle, all the lands were in one clump of twenty-four at the top of my deck. I created four piles, putting one card in each as I went across (a traditional pile shuffle). I then took the first two piles and riffle shuffled them, followed by two overhand shuffles and another riffle. I repeated this process with the other two piles. Taking both new piles, I riffled them together a few times, followed by a couple of overhand shuffles and a couple more riffles. I then spread the cards out in front of me (making sure to preserve the order).
Natalie's first comment was that there were too many clumps of land. I concurred, seeing a couple of clumps of three lands in a row, a few clumps of two in a row, and a couple of long stretches of non-land cards. I proceeded to shuffle some more and started playing, flipping over the lands when I had the chance.
Then a thought occurred to me. I teach statistics - why can't I figure out how many clumps of land should occur in a sufficiently randomized deck? So I did.
I present to you the results of my analysis, and some information on how you can utilize them to evaluate your shuffling technique and your ability to randomize your deck.
There have been many articles that describe the probability of getting the correct number of land in an opening hand of seven cards. The probability distribution that is the powerhouse behind these calculations is called the hypergeometric distribution. This is the distribution that answers the classic question: "If you have an urn with three black marbles and five white marbles in it, what is the probability that you draw two white marbles if you do not replace the marbles?" For those of you not interested in the formulas behind the calculations, skip to the next paragraph. If you have stuck around this far, the hypergeometric distribution is:

\[ P(x) = \frac{\binom{l}{x}\binom{c-l}{n-x}}{\binom{c}{n}} \]

...where P(x) is the probability of drawing x land from a deck with c cards and l land. The number of cards drawn is n (typically n=7 for an opening hand). Since we are taking cards from the deck without returning them, we use the notation

\[ \binom{l}{x} = \frac{l!}{x!\,(l-x)!} \]

...to represent the number of different combinations of the l land in your deck, taken x at a time. You can use this formula to determine the probability of getting x number of lands in your opening hand of n cards.
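For readers who would rather compute than read formulas, here is a minimal Python sketch of the calculation, assuming the standard form of the hypergeometric distribution (ways to pick the lands, times ways to pick the non-lands, divided by the number of possible hands). The function name `hypergeom_pmf` is my own choice for illustration, not something from a particular library.

```python
from math import comb

def hypergeom_pmf(x, c, l, n):
    """Probability of exactly x lands among n cards drawn
    from a deck of c cards that contains l lands."""
    # Impossible outcomes: negative counts, or more lands/non-lands than exist.
    if x < 0 or x > n or x > l or (n - x) > (c - l):
        return 0.0
    return comb(l, x) * comb(c - l, n - x) / comb(c, n)

# Opening hand of seven from a 60-card deck with 24 lands.
for lands_in_hand in range(8):
    prob = hypergeom_pmf(lands_in_hand, c=60, l=24, n=7)
    print(f"{lands_in_hand} lands: {prob:.4f}")
```

The loop prints the full distribution for an opening hand, so the probabilities should add up to one.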
For this analysis, we are only interested in calculating the probability of drawing two lands in a row when we only draw two cards. We let n=2, l=24, and c=60 and calculate the probability that x equals two - which turns out to be 15.59%. In terms of our problem, if you take any two adjacent cards from the above deck, after it has been sufficiently randomized, there is a 15.59% chance that they will both be lands.
That's great - but the real question is how many times should I see two lands adjacent to each other (adjacent pairs) when I spread my shuffled deck in front of me?
Assuming a sixty-card deck, there are c-1 pairs of adjacent cards, or 59 pairs. By multiplying the probability of seeing adjacent lands (.1559) times the number of possible pairs (59), the expected number of pairs observed is 9.2 pairs of adjacent land in a completely randomized deck of 60 cards (assuming 24 lands). In this analysis, a clump of three lands is considered two pairs of adjacent land, a clump of four lands is considered three pairs of adjacent land, etc.
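Written out, the two-card calculation is short. The second form below is just the same number reached by drawing one card at a time, which some readers may find easier to check by hand:

```latex
\[
P(2) = \frac{\binom{24}{2}\binom{36}{0}}{\binom{60}{2}}
     = \frac{276 \cdot 1}{1770}
     = \frac{24}{60}\cdot\frac{23}{59}
     \approx 0.1559
\]
```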
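The same multiplication works for longer clumps and other deck sizes. The sketch below is one way to write it in Python; the function name `expected_clumps` and its parameters are mine, and it assumes the counting convention just described, where overlapping clumps are each counted.

```python
def expected_clumps(cards, lands, size=2):
    """Expected number of stretches of `size` adjacent cards that are
    all lands in a fully randomized deck (overlapping stretches count)."""
    # Probability that `size` specific adjacent cards are all lands,
    # drawing without replacement.
    prob = 1.0
    for i in range(size):
        prob *= (lands - i) / (cards - i)
    # Number of positions where a stretch of that size can start.
    positions = cards - size + 1
    return positions * prob

print(round(expected_clumps(60, 24, size=2), 1))  # adjacent pairs
print(round(expected_clumps(60, 24, size=3), 1))  # three in a row
print(round(expected_clumps(40, 17, size=2), 1))  # limited deck
```

For adjacent pairs the expression simplifies to lands × (lands - 1) / cards, which is handy for quick mental checks.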
Let's look at an example: I'll put my deck in its original state, with the twenty-four lands on top, and shuffle it as I described above. To me, this emulates the state of the deck at the start of a tournament, for example after I have registered the deck and double-checked that my land count is correct. Here are the 60 cards, in order (L=land, N=non-land), with the number of adjacent pairs counted:
LNNNNLNLNLNNLNNLLNNLNNLNNNLLLLNNNLL
NLLLNLNNNNNNNLNNLNLNNLLNN
This shuffle produced eight pairs of adjacent lands, which is pretty close to the 9.2 I expected to get. I then pulled the top 20 cards - similar to a game in which thirteen turns were taken - separated out the land (since it would be separate in play) and repeated my shuffling technique. This produced nine pairs of adjacent land, which is where I would expect the count to be, given a sufficiently randomized deck.
What can we get from this analysis? The most startling point is that even if we shuffle like crazy for the full three minutes we are allowed, land will clump together. To double check my calculations, I used a computer to simulate 100,000 shuffles from the above deck, and the average number of adjacent pairs of land was 9.18 - right where the probability said it should be. In the spread above I also had three clumps of three lands and one clump of four lands (counting overlapping clumps, so a run of four lands contains two clumps of three). Calculations similar to the ones done above show that you should expect about three clumps of three lands and one clump of four lands.
These results make excuses such as "I shuffled poorly/was unlucky and drew three lands in a row, which cost me the game" even weaker. You should expect to draw three lands in a row at some point - because if you completely randomized your deck, there will usually be at least one clump of three lands, if not more. Of course, these probabilities change depending on the number of lands, and in the table below you will see similar calculations for different land counts in 60 card constructed decks and 40 card limited decks.
If you think you fall victim to clumpy land draws more often than the average person, the information above gives you one way to check your shuffling technique. Take any sixty-card deck with twenty-four lands flipped over. Be honest, and shuffle like you normally do. Then spread the deck out and count the number of adjacent pairs of land that are present, similar to the example above. It's a good idea to start with all the lands in one large clump, because if you can de-clump your land from a pile of twenty-four, your shuffling technique should be satisfactory. Try this method four or five times, and calculate the average number of adjacent pairs of land across those shuffles.
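If you record a spread as a string of L's and N's, the counting can be automated. This is a small sketch using the sixty-card order listed above; `count_clumps` is a name I made up, and it counts overlapping stretches, so a run of three lands counts as two pairs.

```python
def count_clumps(spread, size=2):
    """Count stretches of `size` consecutive lands ('L') in a spread,
    letting overlapping stretches each count once."""
    target = "L" * size
    return sum(
        1
        for start in range(len(spread) - size + 1)
        if spread[start:start + size] == target
    )

# The sixty cards from the example, joined into one string.
spread = (
    "LNNNNLNLNLNNLNNLLNNLNNLNNNLLLLNNNLL"
    "NLLLNLNNNNNNNLNNLNLNNLLNN"
)

print(len(spread), spread.count("L"))  # deck size and land count
print(count_clumps(spread, size=2))    # adjacent pairs of land
```

Passing `size=3` or `size=4` counts the longer clumps the same way.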
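A simulation along these lines is easy to reproduce. The sketch below is not the program I originally used; it is a minimal Python version that assumes a "sufficiently randomized" deck is one where every ordering is equally likely, which is what `random.shuffle` provides. The function names and the fixed seed are illustrative choices.

```python
import random

def adjacent_land_pairs(deck):
    """Number of positions where a land is directly followed by a land."""
    return sum(1 for a, b in zip(deck, deck[1:]) if a == "L" and b == "L")

def simulate_mean_pairs(cards=60, lands=24, trials=100_000, seed=0):
    """Average adjacent-land-pair count over many uniformly random orderings."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    deck = ["L"] * lands + ["N"] * (cards - lands)
    total = 0
    for _ in range(trials):
        rng.shuffle(deck)
        total += adjacent_land_pairs(deck)
    return total / trials

# Fewer trials than the article's 100,000 to keep the run short.
print(simulate_mean_pairs(trials=20_000))
```

The printed average should sit close to the 9.2 predicted by the calculation, with small run-to-run differences if you change the seed.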
If your average is higher than eleven, you need to think about adjusting your shuffling technique.
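Keeping score of your test shuffles takes only a few lines. In this sketch the recorded counts are hypothetical numbers invented for illustration, and the names `average_pairs` and `needs_work` are my own; the threshold of eleven is the one suggested above for a sixty-card deck with twenty-four lands.

```python
from statistics import mean

def average_pairs(pair_counts):
    """Average number of adjacent land pairs over several test shuffles."""
    return mean(pair_counts)

def needs_work(pair_counts, threshold=11):
    """True if the average exceeds the article's rule-of-thumb threshold."""
    return average_pairs(pair_counts) > threshold

# Hypothetical results from five test shuffles (not real data).
my_counts = [8, 10, 9, 12, 7]

print(average_pairs(my_counts))
print(needs_work(my_counts))
```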
Another result of simulating 100,000 shuffles is that I could empirically calculate the range of the number of adjacent pairs of land we should see. Based on my simulations, 95% of the simulated shuffles had between six and twelve pairs of adjacent land (see the table below).
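One way to check a range like this yourself is to simulate shuffles and see what share of them land inside it. The sketch below assumes uniformly random orderings via `random.shuffle`; the function names, the seed, and the reduced trial count are illustrative choices rather than the setup I originally ran.

```python
import random

def simulate_pair_counts(cards=60, lands=24, trials=20_000, seed=1):
    """Adjacent-land-pair counts for many uniformly random orderings."""
    rng = random.Random(seed)
    deck = [True] * lands + [False] * (cards - lands)  # True marks a land
    counts = []
    for _ in range(trials):
        rng.shuffle(deck)
        counts.append(sum(1 for a, b in zip(deck, deck[1:]) if a and b))
    return counts

def share_within(counts, low, high):
    """Fraction of simulated shuffles with a pair count in [low, high]."""
    inside = sum(1 for value in counts if low <= value <= high)
    return inside / len(counts)

counts = simulate_pair_counts()
print(share_within(counts, 6, 12))
```

Because pair counts are whole numbers, no band covers exactly 95% of shuffles, so expect the printed share to be close to that figure rather than equal to it.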
If you consistently get more than twelve pairs of adjacent land, then you really need to learn to shuffle better. But what about the other extreme?
Mana weaving, the process of removing all the land and placing it into the deck every two or three non-land cards, is illegal in sanctioned play (and frowned upon casually). The above results confirm these rulings. Placing land cards so that there are no clumps of land puts the deck in a "better than randomized" state, and gives players a major advantage. Mana weaving a deck, then shuffling, will still result in a deck with, on average, nine pairs of adjacent lands (assuming twenty-four lands) if the deck is sufficiently randomized (which is seven riffle shuffles by the way - but that's a different article).
If the situation occurs where you can look at an opponent's shuffled deck, perhaps because you are a judge or you have cast Extract, there is only a 2.5% chance that a sufficiently randomized deck will have five or fewer adjacent pairs of land with twenty-four land cards out of sixty. Decks that consistently have only a few pairs of adjacent land (for the given number of lands in the deck) are very likely being illegally manipulated, and those players should be punished accordingly.
Magic: The Gathering has a huge random component built into it. You can control the odds better if you take the time to examine mana ratios and their effects on the decks you run. If you use the information above to test your ability to sufficiently randomize your deck, you should get as few land clumps as statistically possible, and the consequences of your land count decisions will be affected only by chance, not by bad shuffling.
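The low-end probability can also be computed exactly instead of simulated. The sketch below relies on a standard counting result that is not part of the analysis above: the number of ways to arrange the cards so that the lands form exactly r separate runs is C(lands-1, r-1) × C(nonlands+1, r), and a deck with r runs of land has lands - r adjacent pairs. The function name `pair_count_pmf` is hypothetical.

```python
from math import comb

def pair_count_pmf(pairs, cards=60, lands=24):
    """Exact probability of seeing `pairs` adjacent land pairs when
    every ordering of the deck is equally likely."""
    nonlands = cards - lands
    runs = lands - pairs  # a run of k lands contributes k - 1 pairs
    if runs < 1 or runs > lands:
        return 0.0
    # Split the lands into `runs` groups, then choose which gaps
    # around the non-lands those groups occupy.
    ways = comb(lands - 1, runs - 1) * comb(nonlands + 1, runs)
    return ways / comb(cards, lands)

# Chance of five or fewer adjacent pairs in a 24-land, 60-card deck.
low_tail = sum(pair_count_pmf(p) for p in range(6))
print(f"{low_tail:.4f}")
```

Expect a result of the same order as the 2.5% quoted above; a small gap between an exact figure and a simulated, rounded one would not be surprising.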
Table for expected number of land clumps
60-card decks: expected number of clumps

| Lands | Clumps of 2 lands | Clumps of 3 lands | Clumps of 4 lands | 95% range (adjacent pairs) |
|-------|-------------------|-------------------|-------------------|----------------------------|
| 18 | 5.1 | 1.4 | 0.4 | (3,8) |
| 19 | 5.7 | 1.6 | 0.5 | (3,8) |
| 20 | 6.3 | 1.9 | 0.6 | (4,9) |
| 21 | 7.0 | 2.3 | 0.7 | (4,10) |
| 22 | 7.7 | 2.6 | 0.9 | (5,11) |
| 23 | 8.4 | 3.0 | 1.0 | (5,11) |
| 24 | 9.2 | 3.4 | 1.2 | (6,12) |
| 25 | 10.0 | 3.9 | 1.5 | (7,13) |
| 26 | 10.8 | 4.4 | 1.8 | (8,14) |

40-card decks: expected number of clumps

| Lands | Clumps of 2 lands | Clumps of 3 lands | Clumps of 4 lands | 95% range (adjacent pairs) |
|-------|-------------------|-------------------|-------------------|----------------------------|
| 16 | 6.0 | 2.2 | 0.7 | (4,8) |
| 17 | 6.8 | 2.6 | 1.0 | (4,9) |
| 18 | 7.7 | 3.1 | 1.2 | (5,10) |
| 19 | 8.6 | 3.7 | 1.6 | (6,11) |

The 95% Range gives the range for the number of adjacent pairs of land that 95% of sufficiently randomized decks will have. For example, a seventeen-land limited deck of forty cards will have between four and nine pairs of adjacent land 95% of the time, if the deck is sufficiently randomized.
