So Many Insane Plays - The Legacy Matchup Grid & The September/October Vintage Metagame Report

As you know by now, Wizards has announced two Legacy Grand Prix tournaments slated for 2010. The first is in February in Madrid. The second is in Columbus, Ohio, in late July. I’m psyched.

The StarCityGames.com $5000 Legacy Open series has pushed the evolution of the Legacy metagame. It began in Boston. The next was in Charlotte. Finally, the most recent was Philadelphia. From tournament to tournament, we’ve seen clear and measurable metagame shifts. I’ve documented these by providing full analysis of the performance of these decks in the metagame. I’ve taken a snapshot of the metagame pie, measuring archetype representation in the field. I’ve measured Top 16 penetration relative to each archetype’s share of the field. That statistic tells us which decks are the best performing decks. I’ve graphed the distribution of each major archetype throughout the field itself, giving us a view of where each archetype tends to cluster in the final standings, another measure of their relative strength in the field. That way we can understand not only which decks tend to make Top 8, but also what sort of records various archetypes tend to accumulate.

I’ve done all that… But this article takes tournament data analysis a step further, perhaps further than anyone has ever done for a major Magic tournament. Jared Sylva was kind enough to provide me match results from the SCG $5K so I could do something unprecedented: I’ve created a matchup grid that shows the matchup results of every archetype in the field. Unfortunately, this was such a labor-intensive project that there are a few entry errors in the grid. I went through and corrected some of them, but not all. I will be doing this again, so hopefully as I get better at this there will be fewer errors in the future.

The results are even more astonishing than I could have imagined. They bring an incredible clarity to the format. This clarity is invaluable, both to Legacy regulars and to players who want to understand the basics of the format as they begin the process of preparing for the 2010 Legacy Grands Prix.

It’s quite a wide grid, and thus the easiest way to display/access it is to offer it as a downloadable xls file. You can find this here.

Here’s how you read the grid. In the far lower left-hand corner, the row label says Belcher Combo and the column label says CounterTop-Goyf. The first number is the number of wins for the deck on the left and the second number is the number of wins for the deck on the top. So this entry says “0-1,” indicating 0 wins for Belcher and 1 win for CounterTop-Goyf. Note that if an entry has a third number, that number indicates the number of unintentional draws. So, for example, look at the third entry in the first column, where it says Zoo. To the right is the Zoo versus CounterTop-Goyf matchup entry, which reads 5-8-3, for 5 wins for Zoo, 8 losses, and 3 unintentional draws. I decided to include unintentional draws because they may give us a sense of how close a match is.
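For readers who want to work with the spreadsheet programmatically, here is a minimal sketch of how a single grid entry could be read. The `Record` type and `parse_record` function are hypothetical names of my own, not part of the xls file, and the sketch assumes every entry is written as wins-losses with an optional third number for unintentional draws.

```python
from typing import NamedTuple


class Record(NamedTuple):
    wins: int    # wins for the deck named in the row (left side of the grid)
    losses: int  # wins for the deck named in the column (top of the grid)
    draws: int   # unintentional draws; 0 when the entry has no third number


def parse_record(cell: str) -> Record:
    """Parse a grid entry such as "0-1" or "5-8-3" into a Record."""
    parts = [int(p) for p in cell.strip().split("-")]
    if len(parts) == 2:
        parts.append(0)  # no third number means no unintentional draws
    if len(parts) != 3:
        raise ValueError(f"unexpected grid entry: {cell!r}")
    return Record(*parts)


# The two entries walked through in the text.
belcher_vs_countertop = parse_record("0-1")
zoo_vs_countertop = parse_record("5-8-3")
print(belcher_vs_countertop)
print(zoo_vs_countertop)
```

Keeping draws as a separate field makes it easy to either count them as a sign of a close match or leave them out of a win percentage.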

Let’s start with the most lopsided matchup results.

Ad Nauseam versus CounterTop-Goyf (2-9 or 1-10)

This matchup was devastating for Ad Nauseam decks, which often cruised through the tournament. This is one of the key reasons that Ad Nauseam decks didn’t make it higher in the tournament, as a group. According to my grid, Ad Nauseam was either 2-9 or 1-10 versus CounterTop-Goyf. One of my entries was an error, but it still gives you an incredible sense of how lopsided this matchup is. CounterTop-Goyf crushes Ad Nauseam. And it makes sense. CounterTop-Goyf not only has annoyances like Force of Will and Daze, but Counterbalance is good game.

Ad Nauseam versus Zoo (5-0 or 4-0)

Zoo tore apart SCG Charlotte, winning the whole thing and putting five copies of itself in the Top 16. The player-base took notice, and the whole metagame, according to my grid, adjusted. The huge uptick in Ad Nauseam is a clear indicator of this. And the record between the two decks, 5-0 or 4-0 (again, an entry error in one of the spots), shows that Ad Nauseam dominated this matchup. Zoo is a great metagame choice, but it simply doesn’t pack enough punch to beat Ad Nauseam, even with its burn. Ad Nauseam has too many unimpeded turns to combo out.

Merfolk versus Zoo (0-6)

This is old news. Merfolk was one of the most popular decks in SCG $5K Boston. The metagame responded with Zoo, which crushes Merfolk. These results confirm that this still holds true.

Zoo versus Goblins (6-1 or 6-0)

Goblins was a surprise winner both in Zendikar and M10, with new toys to play. Goblins has seen a slight resurgence, but that appears to have been short-lived, and this is why. Zoo apparently crushes Goblins, which makes sense. Zoo has better creatures and plenty of spot removal to take out the best Goblin weapons.

CounterTop-Goyf versus Zoo (8-5-3)

This matchup wasn’t lopsided, but it was very important. This matchup is perhaps the most important reason for Zoo’s struggles in Philly. This was a match that Zoo dominated in Charlotte. Yet Zoo could only eke out 5 wins in 16 matches! The three unintentional draws may actually be moral victories for the CounterTop-Goyf decks, which managed to grind the game to a halt, even if they couldn’t seal the deal. Those draws indicate that this match was close. But even worse, CounterTop-Goyf won half of these matches. I definitely attribute that to the adjustments that CounterTop-Goyf pilots made in dealing with this matchup. Many of them added combo finishes like Natural Order or other combos to improve their odds in this matchup. CounterTop-Goyf lists that aren’t capable of competing with Zoo were weeded out in the previous SCG $5K, and it shows.

Canadian Threshold versus Zoo (3-2 or 2-2)

Canadian Threshold is the perennial contender this year. This deck split its matchups with Zoo, and that demonstrates that this is a close matchup. And that makes sense. Zoo has more beef, but Threshold has all the good Blue countermagic.

Canadian Threshold versus CounterTop-Goyf (5-4-1)

Like the Zoo matchup, this is a very close match, but it appears that Threshold has a slight advantage. I think this record is probably consistent with what most experienced Legacy players would expect.

Merfolk versus CounterTop-Goyf (4-3)

I had to double-check the data for this one to make sure I corrected the errors. This was yet another revealing matchup. Merfolk had risen in the ranks of the metagame partly because of its strength against the Counterbalance decks, both CounterTop-Goyf and Dreadtill. As you can see from the grid, Merfolk was 4-0 against Dreadtill, but its matchup against CounterTop-Goyf has shifted, and it’s only a split now.

Dredge versus The Field (2-2, 3-1, 2-1, 1-2, 2-2, 3-2, etc)

What’s so interesting is that there is no clear trend regarding Dredge’s matchups. And that makes perfect sense. Dredge’s matchups aren’t about the opposing archetype. They are about the degree of hate that the opponent is packing for the matchup. Be skeptical of claims that certain archetypes have favorable or unfavorable matchups against Dredge.

Those were the most popular decks in the field, and it’s good to know how those matchups shake out. The other matchups may have sample sizes too small to say anything meaningful, but these matchups are pretty clear.
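One rough way to gauge how much a small sample can tell us is to put an interval around each observed win rate. The sketch below uses the Wilson score interval, a standard statistical tool that I am swapping in for illustration; it is not how the grid itself was built. It uses only records quoted above without entry-error ambiguity, leaves out unintentional draws, and treats each match as an independent trial, which is a simplifying assumption.

```python
import math


def wilson_interval(wins: int, losses: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for the row deck's win rate."""
    n = wins + losses
    if n == 0:
        raise ValueError("no decided matches")
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# (row deck, column deck, row wins, column wins), draws omitted.
records = [
    ("Merfolk", "Zoo", 0, 6),
    ("Zoo", "CounterTop-Goyf", 5, 8),
    ("Canadian Threshold", "CounterTop-Goyf", 5, 4),
    ("Merfolk", "CounterTop-Goyf", 4, 3),
]

for row, col, wins, losses in records:
    low, high = wilson_interval(wins, losses)
    print(f"{row} vs {col}: {wins}-{losses}, "
          f"plausible win rate {low:.0%} to {high:.0%}")
```

Under these assumptions, even a 0-6 record leaves a fairly wide range of plausible win rates, and the close matchups span both sides of 50%, which is why the thinner matchups in the grid deserve caution.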

In response to an article I wrote a few weeks ago, Zac Hill flared up the age-old debate about the relative importance of in-game decision-making versus metagame positioning through deck selection. I was telling my buddy Paul Mastriano about these results, and he said they confirmed his belief that Legacy is hugely matchup dependent, far more than Vintage. In Vintage, the top decks like Tezzeret, TPS, or Stax are so enormously complicated that the gradations between average players and top players are often outcome determinative. Vroman, in his arguments for legalizing collusion, even suggests that there are no favorable matchups in Vintage because of the influence that a 1-2 card maindeck difference, skill of the pilot, and variable sideboard composition might have. From this Legacy data, it certainly appears that a top flight Ad Nauseam pilot and an average pilot have roughly the same chances in a host of matchups, whether it’s Zoo, Goblins, or CounterTop-Goyf. Similarly, the skill of the Merfolk pilots didn’t appear to make a difference in the Zoo matchup. And so on.

What that means is quite evident. Metagame positioning — skillful deck selection — is arguably the most important decision a Legacy player can make.

You might look at the Top 16 of SCG Philly and wonder if this is true. After all, didn’t Cedric Phillips Top 8 with Belcher? And didn’t Chris Woltereck Top 8 with 43 Land? And so on…

Look at Cedric’s matchups with Belcher:

Round 1: Goyf Sligh (a burn deck with Goyfs) — Won 2-1
Round 2: High Tide Combo — Won 2-1
Round 3: Mono White Stax — Lost 1-2
Round 4: Canadian Threshold — Won 2-0
Round 5: Zoo — Won 2-1
Round 6: Life Combo — Won 2-1
Round 7: Ubr Bitterblossom Control — Won 2-1
Round 8: CounterTop-Goyf — ID
Top 8: Canadian Threshold — Lost 0-2

He faced fringe, much slower combo decks like High Tide and Life Combo, which he tore apart, and linear aggro like Goyf Sligh and Zoo, which he tore apart. He lost to a bad matchup, the Stax deck, and beat Canadian Threshold – a deck he later lost to in the Top 8, and a matchup he probably splits. But he didn’t have to play a match against a CounterTop-Goyf pilot or a Merfolk pilot. Nor did he have to coin flip a match against an Ad Nauseam pilot.

I wouldn’t recommend Belcher for anyone in Legacy unless your goal is to dodge decks that play 4 Stifle and 4 Force of Will.

I could show the same sort of matchup advantages for Chris Woltereck, but I won’t. I think you get the picture.

Matchups in Legacy are huge. They aren’t decisive, by any measure, but some matchups appear to be so lopsided that in-game decision-making skill probably can’t make much more than a dent, as this grid demonstrates.

I believe that metagame positioning in Legacy is critical. The key is to select decks that maximize your favorable matchups as far as possible, while minimizing bad matchups. That’s probably true of all formats. But the depth and breadth of the format’s card pool means that sideboarding strategies and tweaks can probably address, or at least tighten, bad matchups. For that reason, smart sideboarding plans, tight play, and a little bit of luck against the bad matchups will make the difference between players who make it into a Top 8 and those who don’t.

Many thanks to Jared Sylva for the matchup data.

The September/October Vintage Metagame Report

Vintage tournament-goers lack large scale events like Pro Tours and Grands Prix to set the metagame and inform players what the top decks are. A Pro Tour Top 8 can set a metagame in motion for the next season. Instead, Vintage is a far more decentralized tournament scene. Tournaments are much smaller and geographically dispersed. Yet there are profound trends. Vintage players who want every advantage should take note of these metagame reports. They provide critical metagame information about the state of the Vintage format and its ever-changing dynamics.

I aggregate every Vintage tournament of 33 or more players reported from anywhere on the planet into one simple, easily accessible bimonthly metagame report. The reason I select tournaments of 33 or more players is that 33-player tournaments feature 6 swiss rounds and a cut to Top 8. By requiring 6 swiss rounds, every deck that makes Top 8 will have played at least 4 opponents. This dramatically reduces the randomness that arises when a player only has to face 3 good matchups and can then double draw into the Top 8.
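The cutoff can be sketched in a few lines. This assumes the common power-of-two convention for the number of swiss rounds (organizers may deviate from it), and the function names are my own, for illustration only.

```python
def swiss_rounds(players: int) -> int:
    """Rounds under the power-of-two convention: ceil(log2(players))."""
    if players < 2:
        raise ValueError("need at least two players")
    # (players - 1).bit_length() equals ceil(log2(players)) for players >= 2.
    return (players - 1).bit_length()


def min_matches_played(players: int, intentional_draws: int = 2) -> int:
    """Fewest matches a Top 8 deck actually plays if it double draws in."""
    return max(0, swiss_rounds(players) - intentional_draws)


for n in (32, 33, 64):
    print(n, "players:", swiss_rounds(n), "rounds,",
          "at least", min_matches_played(n), "played matches")
```

Under that convention, 32 players is the last field size with 5 rounds, so a deck could double draw in after only 3 played matches; at 33 players the minimum rises to 4.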

In addition, these reports provide important information to the DCI and the Vintage community about trends within Vintage.

The September/October Metagame Breakdown by Archetype

There were 8 tournaments of 33 or more players reported in September and October, for a total of 64 possible Top 8 slots. Here’s what made Top 8:

14 Tezzeret Control (2,2,2,2,2,2,4,4,4,4,5,6,6,8)
7 Stax (1,2,3,3,5,7,7)
7 U/x Fish (1,3,4,6,6,7,8) (4 UGW, 2 UW, 1 UBW)
6 Dredge (1,1,4,5,8,8)
5 TPS (5,5,6,7,8)
5 Oath (1,3,4,6,6)
4 Steel City Vault (2,7,7,8)
3 MUD (3,3,4)
3 G/x Beats (1, 3, 7)
2 Drain Tendrils (1,3)
2 Ad Nauseam (1,8)
2 Bob Control (5,6)
2 Dragon Combo (5,7) (1 Painter hybrid)
1 Painter Control (5)
1 Counterbalance Control (8)

The numbers in parentheses represent placement within Top 8. So, a (1) means that the deck got first place in one of the tournaments.

Here’s what that looks like using a fancy color-coded bar graph:

Graph

The blue segments represent tournament wins. The red represents 2nd place finishes, and so on.

Compare that, if you would like, to the July-August metagame chart.

Archetypes as a Percentage of Top 8s

Here is a pie chart that shows you the relative proportion of these archetypes in the top 8 field:

Pie

And here are the actual percentages:

September-October Archetypes As a Percentage of Top 8s

Tezzeret: 21.87%
Fish: 10.93%
Stax: 10.93%
Dredge: 9.37%
TPS: 7.81%
Oath: 7.81%
Steel City Vault: 6.25%
Rest of the Field: 25%
– MUD: 4.68%
– G/x Beats: 4.68%
– Drain Tendrils: 3.12%
– Ad Nauseam: 3.12%
– Bob Control: 3.12%
– Dragon Combo: 3.12%
– Painter Control: 1.56%
– CB Control: 1.56%
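These percentages follow directly from the Top 8 counts listed earlier. As a quick check, here is a short sketch that recomputes each archetype’s share of the 64 slots; the dictionary simply restates those counts, and the variable names are mine.

```python
# Top 8 appearances per archetype, September/October (from the list above).
top8_counts = {
    "Tezzeret": 14, "Stax": 7, "Fish": 7, "Dredge": 6, "TPS": 5, "Oath": 5,
    "Steel City Vault": 4, "MUD": 3, "G/x Beats": 3, "Drain Tendrils": 2,
    "Ad Nauseam": 2, "Bob Control": 2, "Dragon Combo": 2,
    "Painter Control": 1, "CB Control": 1,
}

total_slots = sum(top8_counts.values())  # 8 tournaments x 8 slots

# Share of all Top 8 slots, in percent.
shares = {deck: 100 * n / total_slots for deck, n in top8_counts.items()}

for deck, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{deck}: {share:.2f}%")
```

Note that the listed figures appear to be truncated rather than rounded to two decimals, so a printed value can differ from the list by 0.01.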

Here was the July/August dataset:

Tezzeret: 15.3%
Fish: 12.5%
Stax: 11.11%
MUD: 8.33%
TPS: 6.9%
G/x Beats: 6.9%
Drain Tendrils: 5.6%
Dredge: 5.6%
Oath: 5.6%
“Rest of the Field”: 22.22%

As you can see, Tezzeret has actually grown as a percentage of Top 8s. It went from 15% of Top 8s to 21%. But what’s interesting is that it isn’t winning tournaments, at least in this dataset. Tezzeret didn’t win a single tournament despite making Top 8 fourteen times. That’s really incredible. What’s even more incredible is how the deck seemed to cluster at the 2nd place spot. There were no 3rd place finishes either.

The biggest players in Vintage are clear: Tezzeret, Stax and Fish. Fish fell by about one and a half percentage points from its peak last time period.

Graph

Stax fell by about a fifth of a percentage point. What really fell was MUD, which lost almost 4 percentage points, a decline of roughly 44%. TPS went up by about a percentage point. Oath is clearly on the upswing, and went up about 2 percentage points.

Graph

Oath saw a burst of popularity last year at this time following the printing of Hellkite Overlord, and then within six months fell to almost zero. The restrictions clearly contributed to Oath’s recent rise, but Iona and Spell Pierce appear to have contributed even more. I expect to see more Oath in the next time period. Whether it can stay popular remains to be seen.

Dredge saw the most relative growth: it increased by more than three and a half percentage points, roughly two-thirds growth over its July/August share. However, Dredge always tends to fluctuate wildly, as the chart below shows:

Graph

There does not appear to be any rhyme or reason to Dredge’s performance. It fluctuates from 5 to 15% but seems to hit 10% in between the fluctuations. Perhaps it’s that the degree of Dredge hate changes in response to Dredge’s performance, and it rises or falls accordingly.
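The period-over-period comparisons above mix two measures: the change in percentage points and the relative change. A minimal sketch that computes both from the two published datasets follows. Relative change here is measured against the July/August share, which is my choice of baseline; dividing by the newer share instead would give smaller growth figures.

```python
# Published share of Top 8s (percent): (July/August, September/October).
shares = {
    "Tezzeret": (15.3, 21.87),
    "Fish": (12.5, 10.93),
    "Stax": (11.11, 10.93),
    "MUD": (8.33, 4.68),
    "TPS": (6.9, 7.81),
    "G/x Beats": (6.9, 4.68),
    "Drain Tendrils": (5.6, 3.12),
    "Dredge": (5.6, 9.37),
    "Oath": (5.6, 7.81),
}


def change(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    points = new - old
    relative = 100 * points / old  # baseline is the earlier period
    return points, relative


for deck, (old, new) in shares.items():
    points, relative = change(old, new)
    print(f"{deck}: {points:+.2f} points, {relative:+.1f}% relative")
```

Because the inputs are already rounded or truncated shares, the results are approximate; working from the raw Top 8 counts would be slightly more precise.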

The large uptick in Tezzeret, roughly 43% growth over its July/August share, is definitely concerning. It’s still well below the May/June 26% of the field, just before Thirst’s restriction:

Graph

But this data suggests that the restriction may be wearing off. I wouldn’t say that Tezzeret is dominant, particularly because it didn’t win a single tournament, but its numbers are much more concerning than last period’s. It’s below Nov-Dec and May-June, but 21% is still higher than the January through April data. I can only hope that this is an aberration, and that the July-August dip will be restored, but I wouldn’t bet on it.

History has shown that looking at archetypes, while important, may conceal actual problems, or, alternatively, overstate problems. Let’s take a look at the metagame by engine.

September/October Breakdown by Engine

19 Mana Drain decks: 29.68% of the Top 8 Field
10 Mishra’s Workshop decks: 15.62%
7 Dark Ritual decks: 10.93%
9 Bazaar of Baghdad decks: 14.06%

In addition to these engines, I am going to track a few other stats that seem relevant, the % of Time Vault decks, Null Rod decks, and Force of Will decks.

There were:

13 Null Rod Decks: 20.31% of top 8s, almost identical to July-Aug.
21 Time Vault decks: 32.81% of Top 8s, an increase of more than 6 percentage points from 26.38% in July-Aug

And perhaps most relevant for DCI purposes, there were:

43 Force of Will decks: 67.18% of Top 8s, up from 63.88% in July-August.

Time Vault’s numbers are high, but Force of Will’s numbers are twice as high. Null Rod is not just being used by Fish and Beats, but also in a bunch of Workshop decks. Time Vault is also being used by a number of archetypes. 32.81% is not terribly concerning. I’m sure Tinker’s numbers are just as high, and probably more so.

Here is what the Engine trends look like over the last 6 months:

Graph

As you can see, the restriction of Thirst led to a dramatic drop in Mana Drains as a % of Top 8s. What’s really incredible is that despite the uptick in Tezzeret this time period, there hasn’t been a corresponding uptick in Mana Drains more generally. They have even fallen off slightly. Workshops are the second most successful engine, but have lost about 25% of their stock from the last time period. Bazaar shot up because of a couple of Dragon decks in Top 8s, and a Beats deck that uses Bazaar, not to mention the increase in Dredge during this period. Dark Rituals are steady at just about 10%.

Here’s what that looks like so far this year:

Graph

And, just for good measure, here’s the entire graph, from June, 2007:

It’s been well over a year since the DCI’s restriction of Gush, Ponder, Brainstorm, Merchant Scroll, and Flash. The time period we are in now is not as competitive, by engine, as it was during that year, but it’s at least better than it was after those restrictions. In the last four months, Mana Drain has been well below where it was at any time since those restrictions. So long as that trend continues, I am less troubled by any uptick in Tezzeret.

The format is still evolving after the restriction of Thirst, and the printings in Zendikar have started to have an impact. Nothing appears to be in need of restriction. Interestingly, the June unrestrictions appear to have had very little impact so far. Crop Rotation, Entomb, and Grim Monolith have had a very marginal impact, if any. Enlightened Tutor shows up as a two-of in some of the GW Beats decks, and that’s about it.

Further unrestrictions are desirable. I continue to believe that the unrestriction of Gush could continue to bring greater diversity to the format at the engine level. Gush’s presence in the metagame before had two primary effects: 1) it pushed Mana Drains down, and 2) it pushed Workshops up. Unrestricted Gush would not be nearly good enough to bring back full scale use of the Gush-bond engine, but it would have the potential to produce two positive effects, at a much smaller scale: provide another check for Drain decks while reintroducing prey for Workshop decks. Importantly, the now popular Gaddock Teeg would be quite strong against Gush decks, and could boost the G/x Beats decks at the same time.

Until next time…

Stephen Menendian

Appendix:

8.23.09, Spain (36 players)
Dredge Won.
(This tournament was not included in the previous dataset and is close enough in time that I included it).

9.5.09, Philadelphia (52 players)
Dredge Won.

9.12.09, Catalan Vintage League (58 players)
Ad Nauseam Won.

9.13.09, Italy (290 players)
Drain Tendrils Won.

9.29.09, Switzerland (55 players)
Fish Won.

10.18.09, France (39 players)
B/G Beats Won.

10.24.09, New York (53 players)
Oath Won.

10.25.09, Badalona, Spain (52 players)
Stax Won.