Thursday, January 31, 2013

Yea, but how important is it?

Over the last few posts, I've been giving you a look at battles on Province, the crazy low-tier map that's so unbalanced it makes your head hurt. Along with that look, I gave you a list of tanks, each with a number representing how it affects your win chances when you plug your team composition into a fancy formula.

Today I'm going to take a step back and tell you something about how useful those numbers really are: how well do they let us predict the outcome of battles?

I do this for two reasons:

1. The only way to know if these numbers mean anything is to apply them to data and see if you get accurate predictions.
2. People who know their statistics keep harping on me to do some legit hypothesis testing/other statistical things.

So, to evaluate how good my regressions are, I'm going to slip off into a slightly different data set: all Tier 1 battles. 1576 battles were included in the model generation, and another 1506 went into the validation set (the two were separated randomly). Again, T1 Cunninghams were used as the baseline tank.
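(If you're curious how a split like that works, here's a minimal sketch in Python. The file name and column layout are hypothetical stand-ins for my actual battle log, not the real thing.)

```python
import pandas as pd

# Hypothetical file name, standing in for the real battle log.
battles = pd.read_csv("tier1_battles.csv")

# Randomly split the battles roughly in half: one set to build the model,
# another held back to validate it.
train = battles.sample(frac=0.5, random_state=42)
test = battles.drop(train.index)

print(len(train), "battles for model generation,", len(test), "for validation")
```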


Once again we see an interesting distribution: ~51% of games are won by side 1. However, we can attribute that to either:

A. Effects of Province on the total distribution.
B. Random chance.

We know A from the last set of analyses, and we know B from the P-value of the intercept: with a 28.9% chance of getting that value purely by chance, we can't reject the null hypothesis that battles are, on average, 50/50. So I'm going to run the regression again, forcing the intercept to that 50/50 value:
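(If you want to reproduce a fit like this yourself, here's roughly what it looks like in Python with statsmodels. My actual numbers came from my own tools, and the column names below are assumptions continuing the sketch above; "forcing" the 50/50 value just means dropping the intercept term.)

```python
import statsmodels.api as sm

# Per-tank count differences (side 1 minus side 2); hypothetical column names.
tanks = ["leichttraktor", "ms1", "nc31", "ft17", "vickers_mk1"]
X = train[tanks]
y = train["side1_won"]  # 1 if side 1 won, 0 if side 2 won

# Fit with an intercept: its p-value tests whether battles deviate from 50/50.
with_intercept = sm.Logit(y, sm.add_constant(X)).fit()
print(with_intercept.summary())  # check the p-value on 'const'

# Can't reject 50/50? Rerun without the constant, forcing even base odds.
forced = sm.Logit(y, X).fit()
print(forced.params)
```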


This obviously isn't much of a difference, but it will give us a good starting point. In summary, from best to worst:

Tank                 Odds Ratio
Leichttraktor        1.09
MS-1                 1.019
T1 Cunningham        1
Renault NC-31        0.965
Renault FT-17        0.915
Vickers Med. Mk. 1   0.901

Hey look, I made a table! But I've stuck something in here that you probably haven't seen before: the Odds Ratio. It's really not hard to understand; it's just the amount your base odds (in this case 1:1, even odds, corresponding to a 50% win chance) get multiplied by for each unit increase in that variable.

Example Battle: Somebody got Annihilated!
Count 'em up: even Leichttraktors, even FT-17s, side 1 has +1 Vickers Mk. I, side 2 has +1 MS-1, even T1s. So predicting the outcome is pretty simple:

Overall Odds Ratio = (1.09)^0 × (1.019)^(-1) × (0.965)^0 × (0.915)^0 × (0.901)^1 = 0.882, or ~9:10 odds

And since P = (Odds)/(Odds + 1):

46.86% Win Chance
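(Here's that same arithmetic in a few lines of Python, using the ratios as printed in the table above; the tiny gap from my 0.882 is presumably just rounding in the table.)

```python
# Odds ratios from the table above (as printed there).
odds_ratios = {"leichttraktor": 1.09, "ms1": 1.019, "nc31": 0.965,
               "ft17": 0.915, "vickers_mk1": 0.901}

# Tank-count differences for the example battle (side 1 minus side 2).
diffs = {"leichttraktor": 0, "ms1": -1, "nc31": 0, "ft17": 0, "vickers_mk1": 1}

odds = 1.0  # base odds of 1:1, i.e. a 50% win chance
for tank, diff in diffs.items():
    odds *= odds_ratios[tank] ** diff

# Prints odds = 0.884, win chance = 46.93% with these rounded ratios.
print(f"odds = {odds:.3f}, win chance = {odds / (odds + 1):.2%}")
```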

Obviously those aren't the best of odds for our eventual winners, so let's see how well the model does on other data.

If we were to just pick side 1 to win every time, we would get it right ~50% of the time, or 51.4% in the case of this data. How well does the model do? Well, if we compare it to the data we used to generate the model, it gets 735 cases wrong (by "wrong," I mean it gave side 1 a >50% chance of winning but side 2 won, or vice-versa). Remember that we had a total of 1576 battles, so our error rate with this model is 46.6%. In other words, this model only predicts 2% more battles correctly than you would by laying your money all on one side, and only 3.4% more than if battles were truly balanced. That may be good enough for Vegas oddsmakers, but it's not that great for us.
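(Counting those errors is simple enough; continuing the hypothetical sketch from earlier:)

```python
# Predicted probability that side 1 wins, from the forced-intercept fit.
p_hat = forced.predict(X)

# "Wrong" means we gave side 1 a >50% chance but side 2 won, or vice-versa.
wrong = ((p_hat > 0.5) != (y == 1)).sum()
print(f"{wrong} wrong out of {len(y)} ({wrong / len(y):.1%} error rate)")
```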

Applying it to the test set, the results aren't particularly good either. In fact, applied to my test set of 1506 battles, it only gets 50.6% right! That's terrible! Scarcely better than flipping a coin! Surely there has to be some way we can make it better?

Well, the first thing to do is combine the two data sets. Stick them together and what do you get?

On this one I forced a 50% Win Rate
But not on this one.

From these you can clearly see that doubling the amount of data increased the significance of each of the coefficients except for the LoLTraktor's. Further, the difference between the two models is pretty small, which is a sign of stability in the outcomes. The intercept shows that 52% of all battles are won by side 1, but now the significance level is such that we're mostly left with option A as the explanation. And with that we get a whole new set of Odds Ratios:

Tank                 50% Model   52% Model
Leichttraktor        1.0268      1.0239
T1 Cunningham        1           1
MS-1                 0.9516      0.9495
Renault NC-31        0.9179      0.9181
Renault FT-17        0.9106      0.9096
Vickers Med. Mk. 1   0.8583      0.8563
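(For reference, the odds ratios in these tables are just the exponentiated regression coefficients, so you can read them straight off a fit like the sketch above:)

```python
import numpy as np

# An odds ratio is e raised to the logistic-regression coefficient.
print(np.exp(forced.params))
```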

Looking at it this way, we can see a new shift in the data: suddenly the MS-1 drops below the T1. But we still need to do the important part and see how well the model predicts.

So, of the 3082 total tier 1 battles, the previous model, based on half the data, predicted only 1603 correctly, a tepid 52%. With these new models, we should see some drastic improvement, right?


Behold! The mighty power of the 50% Model! 1550 correct predictions! An astounding 50.3%! Wait, that's actually WORSE than having no predictors at all! It's scarcely even an improvement on our 50/50 win-split assumption! That's terrible! Maybe the other model will do better...

The power of the 52% Model will astonish you with 1666 correct predictions! 54.05% correct! That's right, 63 more correct predictions than the old model and about 2% additional accuracy! The slivers of probability that gambling careers are made of, but still not a tool you'd want to bet the house on. Obviously there's more to these battles than tank type.


So basically, the moral of the story is this: at least down here at tier 1, the differences between tanks are so small that building a prediction tool based solely on tank types isn't going to help you much. No tank is so overpowered that it overrides skill or luck in determining the outcome of the match. Rather, the differences between tanks are just as likely to reflect a bias in the players who favor them, which is something we'll have to control for at some point.


As always, if you enjoyed this post, please help us in our data gathering efforts. It's simple, it's easy and you can find out how just by clicking here.

2 comments:

  1. keep this going, I like seeing the info, even if your charts rarely make sense to me

  2. Glad you're enjoying it! Let me know which parts are confusing and I will definitely work to make them easier to understand.
