Saturday, January 26, 2013

That Stiff Upper Lip won't help you down here!

Since my full spreadsheet is still unfinished (give me a break, I've unlocked three tier 8s over the last three days!) and I still need to see just how much information we can develop from the data, I decided to go ahead and take a closer look at how the tanks of different nations perform on Province.

Before I show you the numbers, I want to give you a better understanding of what they mean. All of my numbers are generated using what's called logistic regression, which is a fancy way of saying we can assign a value to each factor that affects your chances of winning. We get to choose the factors we think will be important, gather them into a data set and throw them at the little computer daemons in a package called R, which work their statistical magic and give us these values. We must then interpret them.

When you go to perform a regression in R (or when you do any statistical regression really), you have to give it a function to regress. So far I have used (and for the near term I will continue to use) functions that are pretty simple: WinChanceSide1 = a + b*NetArtillery + c*NetTDs being the one from the previous post (you can see this at the top of the image). I had another variable to add there, NetTanks, that I didn't include because it would 'overdetermine' the system (NetTanks + NetArtillery + NetTDs = 0, so if you know two, you can always find the third). This means that Tanks are the baseline for the regression, and you can figure out the win chances for each team from the values generated.
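To see why the third variable adds nothing, here is a minimal Python sketch (not the R code behind these numbers). The team line-ups and the `net_counts` helper are made up for illustration, and it assumes both teams field the same number of vehicles, which is what forces the three net counts to sum to zero.

```python
from collections import Counter

# Vehicle classes used in the regression; the labels are illustrative.
CLASSES = ("Tank", "Artillery", "TD")

def net_counts(side1, side2):
    """Return side 1's count minus side 2's count for each vehicle class."""
    c1, c2 = Counter(side1), Counter(side2)
    return {cls: c1[cls] - c2[cls] for cls in CLASSES}

# Made-up 5-vehicle teams (real battles are larger).
side1 = ["Tank", "Tank", "Tank", "Artillery", "TD"]
side2 = ["Tank", "Artillery", "Artillery", "TD", "TD"]

net = net_counts(side1, side2)
print(net)  # {'Tank': 2, 'Artillery': -1, 'TD': -1}

# With equal team sizes the three nets always sum to zero,
# so NetTanks can be recovered from the other two.
assert sum(net.values()) == 0
assert net["Tank"] == -(net["Artillery"] + net["TD"])
```

Because NetTanks carries no information beyond the other two, leaving it out loses nothing and makes Tanks the reference category.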

Now that that's all explained (and hopefully makes sense) let's take a look at the breakdown by nationality:


Interesting. Apparently, at least on Province, the British low tiers are REALLY underpowered. That's almost as bad as the artillery from the last post, and much, much more statistically significant. Down here it seems being anything but German is a Bad Idea(TM). Now, once again, I have to caution that these values are fairly insignificant statistically, except for the British, and nationality is mostly a non-effect. Looking at the American values, despite popping up with a slight negative value in this regression, based on the small size of the effect and its low significance, I think it can be safely said that the American sub-tier 3 tanks are almost perfectly balanced. More data is probably necessary to identify whether the Soviet and German biases are real or imagined (insert plug for people to send us replays with our Uploader here).

Now there's one last category I want to take a look at before we go ahead and regress on all of the tier 1-3 tanks. Premium tanks are generally intended to be "better than a stock tank, but worse than a same tier elited tank." So, down at these low tiers, versus other low tiers, on a very specific map, the results may surprise. After all, people who play low-tier premiums tend to be very experienced and have a greater likelihood of having a good crew. But let's take a look:


Holy unbalanced, Batman! Here is a category of tank that gives a HUGE increase in win rates. Having just one extra on your team gives you a whole 5% better chance of winning. That's pretty damn imbalanced if you ask me...

So now we have three good predictors for winning on Province: what side you are on, how many more British tanks you have, and how many more premiums you have. So I'm going to make us a simple model for guessing how likely you are to win on Province, using these three variables:


This gives us a pretty simple, but generally accurate guess of win likelihood from just those three questions:

Chance of Side 1 Winning = (e^(-(0.275 + 0.182*NetPremiums - 0.08*NetBritish)) + 1)^(-1)

Chance of Side 2 Winning = 1 - Chance of Side 1 Winning

In case you forgot which side is which:


Again, this makes no allowances for draws, but as we have demonstrated, almost all of the draws come out of side 2's win rate, so we can estimate it as:

Chance of Side 2 Winning = 1 - Chance of Side 1 Winning - 0.03
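If you'd rather plug numbers in than do the exponent by hand, the formulas above can be sketched in Python. The coefficients and the 0.03 draw share are copied from this post; the function and constant names are my own for illustration, and this is a convenience calculator, not the R model itself.

```python
import math

# Coefficients copied from the formula in the post; names are illustrative.
INTERCEPT = 0.275     # side 1's built-in advantage on Province
PER_PREMIUM = 0.182   # effect of each net premium tank on side 1
PER_BRITISH = -0.08   # effect of each net British tank on side 1
DRAW_SHARE = 0.03     # draws, assumed to come out of side 2's win rate

def side1_win_chance(net_premiums, net_british):
    """Logistic function applied to the linear score from the regression."""
    score = INTERCEPT + PER_PREMIUM * net_premiums + PER_BRITISH * net_british
    return 1.0 / (1.0 + math.exp(-score))

def side2_win_chance(net_premiums, net_british, draws=DRAW_SHARE):
    """Side 2's chance once draws are subtracted from its share."""
    return 1.0 - side1_win_chance(net_premiums, net_british) - draws

even = side1_win_chance(0, 0)
one_premium = side1_win_chance(1, 0)
print(f"Side 1, even teams:        {even:.3f}")
print(f"Side 1, one extra premium: {one_premium:.3f}")
print(f"Side 2, even teams:        {side2_win_chance(0, 0):.3f}")
```

With even teams this puts side 1 at roughly 0.57, and one extra premium lifts that to roughly 0.61, a gain of four to five percentage points.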

Next time I plan to perform a regression over the whole of tier 1-3 tanks, which I expect to have a lot of low significance results until more data is available. Then we can move on to bigger and better things...

Please, if you enjoyed this post and want to see more like them, be sure to help by contributing your replays! It's fast and simple, just see how here.
