I did some digging and compiled past tourney final match results. Is this the right pool of tourneys to collect data from? I was looking for standard CAS tourneys excluding experimental ones.
Fall '17 – Winter '17 are the LDT tournaments? That looks right.
The reason Bashing ends up with such a high Nash weight is partly because of how little it’s used. As the other spec matchups are played, and their uncertainty goes down, the remaining high uncertainty for Bashing matchups means that its sampled Nash weight will sometimes be huge.
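As a rough illustration of why sparse data can inflate a sampled weight, here is a minimal sketch. It is not the model from the site: it assumes a plain Beta posterior with a uniform prior on a single matchup's win rate, and the two records (2-2 and 40-40) are made up for the example. Both records are 50% win rates, but the rarely played one leaves far more room for a sampled win rate to come out looking strong.

```python
import random
import statistics

def sample_win_rates(wins, losses, draws=5000, seed=0):
    """Draw plausible win rates from a Beta(1 + wins, 1 + losses) posterior.

    Assumption: uniform prior on one matchup's win rate. This is a toy
    stand-in, not the model discussed in the thread.
    """
    rng = random.Random(seed)
    return [rng.betavariate(1 + wins, 1 + losses) for _ in range(draws)]

# Hypothetical records: same 50% win rate, very different sample sizes.
rarely_played = sample_win_rates(wins=2, losses=2)
often_played = sample_win_rates(wins=40, losses=40)

for label, samples in [("rarely played", rarely_played), ("often played", often_played)]:
    spread = statistics.pstdev(samples)
    looks_strong = sum(s > 0.65 for s in samples) / len(samples)
    print(f"{label}: spread={spread:.3f}, share of draws above 65%={looks_strong:.3f}")
```

The rarely played matchup has the wider spread, so a larger share of its draws land above 65%. Any quantity computed from those draws, a Nash weight included, would swing more for it too.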
I don’t use the balance tournament in the model. It’s different enough that effectively they’re completely new starters/specs, so I’d be roughly quadrupling the modelling time.
I’m not sure winning player here is a fair statistic to track, as it appears you are only comparing whether they won their last game as P1 or P2. Taken over all the matches of all the tournaments, P1 is slightly favored, as I recall, but not 8-3 favored.
@charnel_mouse I just checked out your website. Very impressive. Although I don’t understand most of it, it looks like a valuable study. Is the overall mean pick distribution predicting the winner? I’m surprised the model weighs EricF’s deck more than Nightmare. Oh, I see. It’s because of Bashing, as you explained above. So, am I reading this right? It looks like the model is predicting EricF’s win by a huge percentage. Wait, isn’t the model’s choice [Demon/Necro]/Bashing then? Ohh, is that why EricF said he is messing with the model? Since we all know Bashing’s plot is prolly not real? I think I get it now. But why not Nightmare? I guess that was just your choice. Was your deck the 3rd strongest? Wait, no, that is not your deck. Did you pick the weakest deck??
(Disclaimer: not a data scientist, been a long time since I took statistics)
I believe the model slightly favors Demon/Necro/Bashing over Nightmare, but just greatly favors EricF’s skill (rightfully so: his past performance against everyone is proof of it, and it’s what the model’s going off to arrive at that conclusion).
The model gives him the best odds of being the best active player, with I believe you next, @bansa, and Boly, Zhav and Persephone rounding out the top 5 but well behind you two. I don’t think that’s particularly far off reality; the skill distribution is wider for those who’ve played fewer games in the sample. I’ve played 170-ish games in that sample, and the model has a fairly proven record that Zhav and Eric have my number, plus a fair bit of data showing that you’ve taken about as many games off me as I have off you.
FrozenStorm’s got it right. EricF’s skill level tends to make him a favourite regardless of the decks, but in this case his deck is the favourite before accounting for players. Again, Bashing gets a bit over-weighted for its high uncertainty, which I consider a feature, since this is the model automatically trying to balance exploration and exploitation. Meanwhile, Nightmare isn’t considered particularly strong, because the model doesn’t account for inter-spec synergies.
Overall mean pick distribution is something like: if you hosted a single match against another person, and you both picked a player/deck to play on your behalf, with what probability should you pick each of them? So it’s the Nash strategy for a single game, not really for winning the tournament. It shouldn’t be taken too seriously, but I thought it was interesting enough to display.
My deck has the highest pick distribution out of all possible multicolour decks, but it has the lowest among the decks in the tournament. Funny how these things work out.
Ha, you lost me there. Clearly, I don’t understand this Nash thing, thanks to my zero statistical background. Maybe I’ll do some reading on your website to try to learn what they are. Thanks for the info! @FrozenStorm I think I tricked you there. Sounds like the model wasn’t predicting the winner. But I’m honored that you think highly of me. I’ve had some success in the past tourneys I entered and I have a theory on this. I think my style of play is under-represented here in the forum and you guys were prolly like “what the xxxx is this guy doing?” and yeah, I might have caught you guys off guard. But you will all get used to it and it will normalize as I generate more samples. My assessment of myself is that I gotta limit my all-in plays to get to the next level.
I know little about statistics, so I’m not sure what can be said about the fairness of the statistic, but I was mainly interested in finding out who the winners and winning decks were in past standard CAS tourneys.
I’m a classic data guy, not a sabermetrician, so to me the top 3 players are clear by most wins. This was the result I wanted to confirm, and it matches my experience.
P1 wins was simply an observation I made and was a byproduct. I didn’t want to draw a conclusion from it since I agree that it is not enough data to put a weight on it. It is in line with my prediction though. Maybe slightly exaggerated. My gut feeling was somewhere between 6-4 and 7-3.
As to collecting just the final match results, again, I’m a big-game-pitcher believer, not a sabermetrician. Don’t hate me for this. I think a dataset on final matches actually represents more fairness between players and decks than data on all the matches, because it makes the P1-P2 data less affected by different skill levels and deck strengths. Yea, this may sound statistically wrong, but my presumption is that the final match is expected to see skilled players with strong decks in that particular season. Although not equally powered, they are more likely to have a fairness value(?). I could be totally wrong and randomly referencing the model here in the hope that it supports my argument.
I think it holds some truth though, even if the model proves me wrong, and I want to expand my research to all the tourney matches between top players to make the dataset bigger for P1-P2 win data.
Nash equilibria are from Game Theory, not Statistics. Simplifying to the main case we look at here, i.e. deck selection, the Nash equilibrium is what you’ll see when all the players are choosing decks (strategies) so as to make their worst-case matchup as good as possible. Each deck gets a Nash weight, and the weights sum up to one. Players randomly pick a deck, and a deck’s weight is the probability of it being picked.
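To make the Nash weights concrete, here is a minimal sketch for a toy three-deck cycle (A beats B, B beats C, C beats A). The deck names and win rates are hypothetical, not numbers from the model, and the closed form used here only applies to a three-deck cycle with 50/50 mirror matches. It uses exact fractions so the check at the end is not blurred by rounding.

```python
from fractions import Fraction

def cycle_nash_weights(a_beats_b, b_beats_c, c_beats_a):
    """Nash weights for a three-deck cycle A > B > C > A.

    Each argument is a win probability above 1/2.
    Assumption: mirror matches are 50/50.
    """
    edge_ab = a_beats_b - Fraction(1, 2)
    edge_bc = b_beats_c - Fraction(1, 2)
    edge_ca = c_beats_a - Fraction(1, 2)
    total = edge_ab + edge_bc + edge_ca
    # Each deck's weight is proportional to the edge in the matchup it is NOT part of.
    return {"A": edge_bc / total, "B": edge_ca / total, "C": edge_ab / total}

def win_chance_against_mix(win_prob, weights):
    """Win chance of each deck against an opponent who picks a deck using `weights`."""
    return {
        deck: sum(weights[opp] * win_prob[deck][opp] for opp in weights)
        for deck in weights
    }

# Hypothetical matchup numbers: A beats B 60%, B beats C 70%, C beats A 55%.
p_ab, p_bc, p_ca = Fraction(3, 5), Fraction(7, 10), Fraction(11, 20)
half = Fraction(1, 2)
win_prob = {
    "A": {"A": half, "B": p_ab, "C": 1 - p_ca},
    "B": {"A": 1 - p_ab, "B": half, "C": p_bc},
    "C": {"A": p_ca, "B": 1 - p_bc, "C": half},
}

weights = cycle_nash_weights(p_ab, p_bc, p_ca)
print(weights)
print(win_chance_against_mix(win_prob, weights))
```

With these made-up numbers the weights come out as A 4/7, B 1/7, C 2/7, and every deck wins exactly half the time against that mix. That is the equilibrium property: once the opponent picks with these weights, no deck choice does better than a coin flip, so nobody gains by deviating.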
Looking at just the finals is fine, but there are a few things it’ll naturally sweep under the rug. Some examples:
Being a good tournament player consists of at least two different measures of skill: skill at playing matches, and skill at constructing / choosing good decks. How much do you care about the latter? Players that are strong at one tend to be strong at the other.
Players that are very good but aren’t active often – or often go AWOL halfway through tournaments – are penalised by just counting final wins. Case in point: the model reckons Marto is most likely to be the best player – EricF is likely to be the best active player – but he doesn’t show up on tournament wins at all. Jadiel barely shows up either, even though he was also a very strong player.
Some of the stronger known decks have a Paper-Scissors-Stone dynamic: Nightmare is disadvantaged against Miracle Grow, which is disadvantaged against [Demonology]/Anarchy/Balance, which is disadvantaged against Nightmare. If you see one of these decks winning a tournament, is it because their counter wasn’t in the line-up that year?
Now, I’m not convinced yet, but it does look like matches in the final round tend to be more even / fair. But what are you trying to measure, that matchup fairness will help you more with? Player skill? Deck strength? A bit of both? Why do you want cases where who wins is closer to being a fair coin flip?
I was trying to argue that a dataset on final matches may reflect the P1-P2 win disparity better than a dataset on all tourney matches, or at least have some meaning, because the finals are less affected by differences in player skill or deck strength: presumably both players are skilled and both decks are strong.
Oh, I see where you’re coming from. That’s fair enough, although tournament winners tend to be more aggressive decks, so they probably favour going first more than average. It’s 9-2, by the way: Player 1 won the CAWS19 final.
Interesting that you mention Marto above. I had a casual match with him a couple of years ago and I got destroyed. It was an eye-opening moment for me. He was just playing defense as Green and any offense attempt by me was completely blocked by his patrol. He was brilliant and I felt powerless. I was still learning the game back then but he felt like a giant wall.
Having said that, I would still put our buddies on top 3 all day long. Model may speak the truth and if Marto shows up one day, he might win with high probability but nothing is proven yet.
Hey guys, get ready to get excited! Here comes more fanboy data, now featuring 29 samples from tourney matches amongst our top 3 players by most wins.
| YEAR | SEASON | WINNER | WINNER DECK | OPPONENT | OPPONENT DECK | WIN PLAYER |
|---|---|---|---|---|---|---|
| 2016 | FALL R4 | EricF | [Past]/Peace/Anarchy | FrozenStorm | [Past]/Peace/Anarchy | P2 |
| 2016 | FALL R6 | FrozenStorm | [Past]/Peace/Anarchy | zhavier | Mono Green | P1 |
| 2016 | FALL FINAL | EricF | [Past]/Peace/Anarchy | FrozenStorm | [Past]/Peace/Anarchy | P1 |
| 2016 | WINTER R2 | zhavier | [Necro]/Blood/Truth | EricF | [Past]/Peace/Anarchy | P2 |
| 2016 | WINTER R7 | EricF | [Past]/Peace/Anarchy | FrozenStorm | [Blood]/Strength/Growth | P2 |
| 2017 | SPRING R5 | EricF | [Peace]/Balance/Anarchy | zhavier | [Anarchy]/Strength/Growth | P2 |
| 2017 | SPRING R8 | FrozenStorm | [Demon/Necro]/Finesse | EricF | [Peace]/Balance/Anarchy | P1 |
| 2017 | SUMMER R2 | EricF | [Future]/Peace/Necro | zhavier | [Discipline]/Present/Anarchy | P1 |
| 2017 | SUMMER R3 | zhavier | [Discipline]/Present/Anarchy | FrozenStorm | [Past]/Peace/Anarchy | P1 |
| 2017 | SUMMER R4 | EricF | [Future]/Peace/Necro | FrozenStorm | [Past]/Peace/Anarchy | P1 |
| 2017 | FALL R3 | FrozenStorm | [Demon/Necro]/Finesse | zhavier | [Finesse]/Demon/Ninjutsu | P1 |
| 2017 | FALL R5 | FrozenStorm | [Demon/Necro]/Finesse | zhavier | [Finesse]/Demon/Ninjutsu | P2 |
| 2017 | WINTER R3 | FrozenStorm | [Future]/Necro/Peace | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2017 | WINTER R6 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Future]/Necro/Peace | P1 |
| 2017 | WINTER R7 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Future]/Necro/Peace | P2 |
| 2017 | WINTER FINAL | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Future]/Necro/Peace | P1 |
| 2018 | SUMMER R3 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Demon/Necro]/Finesse | P2 |
| 2018 | SUMMER R9 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Demon/Necro]/Finesse | P1 |
| 2018 | WINTER R6 | EricF | [Discipline]/Strength/Finesse | FrozenStorm | [Anarchy]/Necro/Growth | P1 |
| 2018 | WINTER R7 | EricF | [Discipline]/Strength/Finesse | zhavier | [Finesse]/Anarchy/Blood | P2 |
| 2019 | SUMMER R1 | EricF | [Anarchy/Blood]/Demonology | FrozenStorm | [Past/Future]/Finesse | P1 |
| 2019 | SUMMER R6 | zhavier | [Anarchy]/Strength/Growth | EricF | [Anarchy/Blood]/Demonology | P1 |
| 2019 | SUMMER R9 | zhavier | [Anarchy]/Strength/Growth | EricF | [Anarchy/Blood]/Demonology | P2 |
| 2019 | WINTER R4 | FrozenStorm | [Future]/Peace/Necromancy | EricF | [Feral]/Law/Fire | P2 |
| 2019 | WINTER R6 | FrozenStorm | [Future]/Peace/Necromancy | zhavier | [Balance]/Growth/Finesse | P1 |
| 2020 | SUMMER R2 | FrozenStorm | [Necromancy]/Blood/Fire | zhavier | [Future]/Necromancy/Peace | P1 |
| 2020 | SUMMER R6 | zhavier | [Future]/Necromancy/Peace | FrozenStorm | [Necromancy]/Blood/Fire | P1 |
| 2020 | SUMMER R8 | zhavier | [Future]/Necromancy/Peace | FrozenStorm | [Necromancy]/Blood/Fire | P2 |
| 2020 | SUMMER FINAL | FrozenStorm | [Necromancy]/Blood/Fire | zhavier | [Future]/Necro/Peace | P2 |

Totals: P1 wins 17, P2 wins 12.
Findings: the P1 win percentage is a lot softer than in the final-match dataset. I was thinking this may be because the choice of decks in some seasons overshadowed their performance. For the next post I think I will expand the sample size by adding the tourney winner’s matches against the top 3 players, or maybe even include the runner-up’s tourney samples. Stay tuned!
Here is the improved dataset with 68 samples. It basically combines the first (final matches) and second (matches amongst the top 3 players) datasets, plus alpha (season winners’ and runners-up’s matches against the top 3).
| YEAR | SEASON | WINNER | WINNER DECK | OPPONENT | OPPONENT DECK | WIN PLAYER |
|---|---|---|---|---|---|---|
| 2016 | FALL_R4 | EricF | [Past]/Peace/Anarchy | FrozenStorm | [Past]/Peace/Anarchy | P2 |
| 2016 | FALL_R6 | FrozenStorm | [Past]/Peace/Anarchy | zhavier | Mono Green | P1 |
| 2016 | FALL_FINAL | EricF | [Past]/Peace/Anarchy | FrozenStorm | [Past]/Peace/Anarchy | P1 |
| 2016 | WINTER_R2 | zhavier | [Necro]/Blood/Truth | EricF | [Past]/Peace/Anarchy | P2 |
| 2016 | WINTER_R5 | petE | [Anarchy]/Strength/Growth | zhavier | [Necro]/Blood/Truth | P2 |
| 2016 | WINTER_R7 | EricF | [Past]/Peace/Anarchy | FrozenStorm | [Blood]/Strength/Growth | P2 |
| 2016 | WINTER_R8 | petE | [Anarchy]/Strength/Growth | EricF | [Past]/Peace/Anarchy | P2 |
| 2016 | WINTER_FINAL | petE | [Anarchy]/Strength/Growth | zhavier | [Necro]/Blood/Truth | P1 |
| 2017 | SPRING_R4 | Bob199 | [Strength]/Blood/Finesse | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2017 | SPRING_R5 | FrozenStorm | [Demon/Necro]/Finesse | Bob199 | [Strength]/Blood/Finesse | P1 |
| 2017 | SPRING_R5 | EricF | [Peace]/Balance/Anarchy | zhavier | [Anarchy]/Strength/Growth | P2 |
| 2017 | SPRING_R6 | EricF | [Peace]/Balance/Anarchy | Bob199 | [Strength]/Blood/Finesse | P1 |
| 2017 | SPRING_R8 | FrozenStorm | [Demon/Necro]/Finesse | EricF | [Peace]/Balance/Anarchy | P1 |
| 2017 | SPRING_FINAL | FrozenStorm | [Demon/Necro]/Finesse | Bob199 | [Strength]/Blood/Finesse | P2 |
| 2017 | SUMMER_R2 | EricF | [Future]/Peace/Necro | zhavier | [Discipline]/Present/Anarchy | P1 |
| 2017 | SUMMER_R3 | Jadiel | [Necro]/Anarchy/Blood | EricF | [Future]/Peace/Necro | P1 |
| 2017 | SUMMER_R3 | zhavier | [Discipline]/Present/Anarchy | FrozenStorm | [Past]/Peace/Anarchy | P1 |
| 2017 | SUMMER_R4 | EricF | [Future]/Peace/Necro | FrozenStorm | [Past]/Peace/Anarchy | P1 |
| 2017 | SUMMER_R5 | zhavier | [Discipline]/Present/Anarchy | Jadiel | [Necro]/Anarchy/Blood | P2 |
| 2017 | SUMMER_R6 | Jadiel | [Necro]/Anarchy/Blood | FrozenStorm | [Past]/Peace/Anarchy | P2 |
| 2017 | SUMMER_R7 | Jadiel | [Necro]/Anarchy/Blood | zhavier | [Discipline]/Present/Anarchy | P2 |
| 2017 | SUMMER_FINAL | Jadiel | [Necro]/Anarchy/Blood | EricF | [Future]/Peace/Necro | P1 |
| 2017 | FALL_R2 | zhavier | [Finesse]/Demon/Ninjutsu | rathyAro | [Feral]/Blood/Truth | P1 |
| 2017 | FALL_R3 | FrozenStorm | [Demon/Necro]/Finesse | zhavier | [Finesse]/Demon/Ninjutsu | P1 |
| 2017 | FALL_R4 | FrozenStorm | [Demon/Necro]/Finesse | rathyAro | [Feral]/Blood/Truth | P2 |
| 2017 | FALL_R5 | FrozenStorm | [Demon/Necro]/Finesse | zhavier | [Finesse]/Demon/Ninjutsu | P2 |
| 2017 | FALL_R6 | rathyAro | [Feral]/Blood/Truth | zhavier | [Finesse]/Demon/Ninjutsu | P1 |
| 2017 | FALL_FINAL | FrozenStorm | [Demon/Necro]/Finesse | rathyAro | [Feral]/Blood/Truth | P1 |
| 2017 | WINTER_R3 | FrozenStorm | [Future]/Necro/Peace | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2017 | WINTER_R6 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Future]/Necro/Peace | P1 |
| 2017 | WINTER_R7 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Future]/Necro/Peace | P2 |
| 2017 | WINTER_FINAL | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Future]/Necro/Peace | P1 |
| 2018 | SUMMER_R1 | FrozenStorm | [Demon/Necro]/Finesse | Dreamfire | [Demon]/Anarchy/Balance | P1 |
| 2018 | SUMMER_R3 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Demon/Necro]/Finesse | P2 |
| 2018 | SUMMER_R5 | Dreamfire | [Demon]/Anarchy/Balance | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2018 | SUMMER_R8 | Dreamfire | [Demon]/Anarchy/Balance | zhavier | [Anarchy]/Strength/Growth | P2 |
| 2018 | SUMMER_R9 | zhavier | [Anarchy]/Strength/Growth | FrozenStorm | [Demon/Necro]/Finesse | P1 |
| 2018 | SUMMER_FINAL | Dreamfire | [Demon]/Anarchy/Balance | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2018 | WINTER_R1 | Marto | [Discipline]/Strength/Finesse | FrozenStorm | [Anarchy]/Necro/Growth | P1 |
| 2018 | WINTER_R3 | Marto | [Discipline]/Strength/Finesse | EricF | [Discipline]/Strength/Finesse | P1 |
| 2018 | WINTER_R3 | Marto | [Discipline]/Strength/Finesse | zhavier | [Finesse]/Anarchy/Blood | P1 |
| 2018 | WINTER_R6 | EricF | [Discipline]/Strength/Finesse | FrozenStorm | [Anarchy]/Necro/Growth | P1 |
| 2018 | WINTER_R7 | EricF | [Discipline]/Strength/Finesse | zhavier | [Finesse]/Anarchy/Blood | P2 |
| 2018 | WINTER_R8 | EricF | [Discipline]/Strength/Finesse | Marto | [Discipline]/Strength/Finesse | P1 |
| 2018 | WINTER_FINAL | EricF | [Discipline]/Strength/Finesse | Marto | [Discipline]/Strength/Finesse | P2 |
| 2018 | WINTER_FINAL | EricF | [Discipline]/Strength/Finesse | Marto | [Discipline]/Strength/Finesse | P1 |
| 2019 | SUMMER_R1 | EricF | [Anarchy/Blood]/Demonology | FrozenStorm | [Past/Future]/Finesse | P1 |
| 2019 | SUMMER_R3 | bansa | [Law/Peace]/Finesse | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2019 | SUMMER_R5 | EricF | [Anarchy/Blood]/Demonology | bansa | [Law/Peace]/Finesse | P2 |
| 2019 | SUMMER_R6 | zhavier | [Anarchy]/Strength/Growth | EricF | [Anarchy/Blood]/Demonology | P1 |
| 2019 | SUMMER_R8 | zhavier | [Anarchy]/Strength/Growth | bansa | [Law/Peace]/Finesse | P1 |
| 2019 | SUMMER_R9 | zhavier | [Anarchy]/Strength/Growth | EricF | [Anarchy/Blood]/Demonology | P2 |
| 2019 | SUMMER_FINAL | bansa | [Law/Peace]/Finesse | zhavier | [Anarchy]/Strength/Growth | P1 |
| 2019 | SUMMER_FINAL | zhavier | [Anarchy]/Strength/Growth | bansa | [Law/Peace]/Finesse | P1 |
| 2019 | WINTER_R1 | bolyarich | [Demon/Necro]/Finesse | EricF | [Feral]/Law/Fire | P2 |
| 2019 | WINTER_R2 | bolyarich | [Demon/Necro]/Finesse | codexnewb | [Anarchy]/Strength/Growth | P1 |
| 2019 | WINTER_R3 | bolyarich | [Demon/Necro]/Finesse | zhavier | [Balance]/Growth/Finesse | P1 |
| 2019 | WINTER_R3 | FrozenStorm | [Future]/Peace/Necromancy | codexnewb | [Anarchy]/Strength/Growth | P2 |
| 2019 | WINTER_R4 | FrozenStorm | [Future]/Peace/Necromancy | EricF | [Feral]/Law/Fire | P2 |
| 2019 | WINTER_R5 | bolyarich | [Demon/Necro]/Finesse | FrozenStorm | [Future]/Peace/Necromancy | P2 |
| 2019 | WINTER_R6 | FrozenStorm | [Future]/Peace/Necromancy | zhavier | [Balance]/Growth/Finesse | P1 |
| 2019 | WINTER_R7 | bolyarich | [Demon/Necro]/Finesse | FrozenStorm | [Future]/Peace/Necromancy | P1 |
| 2019 | WINTER_R7 | codexnewb | [Anarchy]/Strength/Growth | zhavier | [Future]/Necromancy/Peace | P1 |
| 2019 | WINTER_FINAL | bolyarich | [Demon/Necro]/Finesse | codexnewb | [Anarchy]/Strength/Growth | P2 |
| 2020 | SUMMER_R2 | FrozenStorm | [Necromancy]/Blood/Fire | zhavier | [Future]/Necromancy/Peace | P1 |
| 2020 | SUMMER_R6 | zhavier | [Future]/Necromancy/Peace | FrozenStorm | [Necromancy]/Blood/Fire | P1 |
| 2020 | SUMMER_R8 | zhavier | [Future]/Necromancy/Peace | FrozenStorm | [Necromancy]/Blood/Fire | P2 |
| 2020 | SUMMER_FINAL | FrozenStorm | [Necromancy]/Blood/Fire | zhavier | [Future]/Necro/Peace | P2 |

Totals: P1 wins 42, P2 wins 26.
Findings: the improved data shows a 62% P1 win rate, which falls in between the previous two. I think I’m comfortable saying that the actual P1 win expectation is not far from this result. It’s not actually as bad as I thought it would be; I was expecting mid 60s and maybe even higher. I also think there is room for improvement on P2 win rates. As the sample size grows, I hope the P1 rate falls below 60%, which I think would then be a healthy ratio.
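For a feel of how much these P1 win rates can move with sample size, here is a minimal sketch that puts an approximate 95% range around each count quoted in the thread (9 of 11 finals, 17 of 29 top-3 matches, 42 of 68 combined). The Wilson score interval is my choice for the example, not something used elsewhere in the thread. It treats every game as an independent coin flip with the same P1 edge, which ignores player skill and deck differences.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate."""
    rate = wins / games
    denom = 1 + z * z / games
    centre = (rate + z * z / (2 * games)) / denom
    half_width = z * math.sqrt(
        rate * (1 - rate) / games + z * z / (4 * games * games)
    ) / denom
    return centre - half_width, centre + half_width

# (P1 wins, total games) as quoted in the thread.
datasets = {
    "finals": (9, 11),
    "top-3 matches": (17, 29),
    "combined": (42, 68),
}

for name, (p1_wins, games) in datasets.items():
    low, high = wilson_interval(p1_wins, games)
    print(f"{name}: P1 won {p1_wins}/{games} = {p1_wins / games:.0%}, "
          f"plausible range {low:.0%} to {high:.0%}")
```

Even with 68 games the range is more than 20 percentage points wide, so the gap between 62% and a hoped-for sub-60% figure is well within what sample noise alone could produce.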
Took a while to update the model for the XCAPS21 results – I had some tooling issues to sort out – but they’re now up on the site. bansa has edged out EricF as most likely to be the best active player, and is catching up to Marto as all-time best. I haven’t had time to think through the deck matchup changes yet.