Wednesday, January 23, 2013

What is wrong with Province?

To get us started off on the right foot, I'm going to work with a nice, small subset of the 174k battles that vbaddict has supplied us as a seed.

Why Province? In all honesty, the fact that it is tier limited makes it a very attractive way to keep the data set small(ish), and it is only available in standard mode. Being limited to tanks of tier 3 or lower (tier 2 for artillery) means that only 53 different tanks can possibly appear in a battle. That means I can set up my spreadsheets before my morning coffee runs out.

Of the 174k battles that vbaddict supplied, 2950 took place on Province. I am going to simplify things and assume that vbaddict doesn't capture a large portion of the total battles that take place during the day, thus treating the total battle population as infinite relative to our sample here.
Side 1 starts in the Green Area to the south, Side 2 in the Red to the North

The straightforward and simple:

Side 1 Win Rate: 54.7% with a standard error of 0.9%
Side 2 Win Rate: 41.8% with a standard error of 0.9%
Draw Rate: 3.5% with a standard error of 0.3%
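
For anyone who wants to check the standard errors, here is a minimal sketch. It assumes the usual formula for the standard error of a sample proportion, sqrt(p(1-p)/n), and treats the battle population as infinite as described above. The function name is only for illustration.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion p observed over n battles."""
    return math.sqrt(p * (1.0 - p) / n)

n_battles = 2950  # Province battles in the sample
rates = {"side 1 win": 0.547, "side 2 win": 0.418, "draw": 0.035}

for label, p in rates.items():
    se = proportion_standard_error(p, n_battles)
    print(f"{label}: {p:.1%} +/- {se:.1%}")
```

Rounded to one decimal place, these agree with the standard errors quoted above.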

Really, you could have found that on any WoT statistics site. It's widely known that Province is unbalanced in favor of side 1; anyone who has spent significant time at tier 3 or lower is aware of this. But we're not here to look at things as simple as map biases, we're looking for answers.

One possible explanation for the map bias is an issue with the matchmaker when dealing with the tier limitation. Unfortunately, if you look at the distribution of tank tiers and types across the two teams, this theory falls rather flat on its face.
Positive values indicate Side 1 had more of that tank type in battle; negative values, Side 2
That leaves the map itself as the source of the bias.

The Logistic Equation: For us, PI is the probability of a win for side 1
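
In case the equation image does not come through, the standard form of the logistic equation is written out below. The symbols are my notation rather than anything taken from the original figure: π (PI in the caption) is the probability of a side 1 win, the x values are the predictors, and the β values are the fitted coefficients, with β₀ as the intercept.

```latex
\pi = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\qquad\Longleftrightarrow\qquad
\ln\frac{\pi}{1 - \pi} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
```

The second form shows why the coefficients live on the log-odds scale: each β shifts the log of the odds of a side 1 win, not the win percentage directly.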


Next, since we have the data on the types and tiers of tanks, let's take a look at how they affect your chances of winning. This is going to be a logistic regression on all tiers and types of tanks to see if the proportion of any of them is a significant predictor of success on the battlefield. Note that because we can determine certain variables from the others (i.e. Tier 1 from Tier 2 and Tier 3 tanks), not all of them can go into the regression at once.



The first thing to realize is that we are looking at the data from the point of view of side 1. I set up our predictors as net values, i.e. the number of artillery on side 1 minus the number of artillery on side 2.
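
To make the net values concrete, here is a small sketch of the calculation for a single battle. The rosters and type labels are made up for illustration; they are not rows from the vbaddict data, and the function name is hypothetical.

```python
from collections import Counter

def net_counts(side1_types, side2_types):
    """Return side 1 count minus side 2 count for every tank type seen."""
    c1, c2 = Counter(side1_types), Counter(side2_types)
    return {t: c1[t] - c2[t] for t in sorted(set(c1) | set(c2))}

# Hypothetical rosters, shortened to five tanks per side for readability.
side1 = ["light", "light", "artillery", "td", "td"]
side2 = ["light", "artillery", "artillery", "td", "medium"]

print(net_counts(side1, side2))
```

A positive number means side 1 fielded more of that type, a negative number means side 2 did, which matches the sign convention used for the predictors.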

If we look at artillery and tank destroyers, leaving other tanks out to avoid over-determination, there is a slight positive effect from tank destroyers and a bigger negative effect from artillery. Each artillery piece that you have more than the other team decreases your team's chances by about 10%. TDs, by contrast, have a positive effect on win chance, ~1.1%.

The 'Intercept' value tells us what we already know: the base chance of side 1 winning is about 56.7%. As outcomes for the regression can only be an absolute win or loss, this is actually the chance of side 1 winning without draws as an option. Comparing this to the earlier values, it would seem that once draws are an option, most of them come out of Side 2's win chance. This may lead one to a theory that most of the draws on Province come from side 2 teams that can't entirely capitalize on a weak side 1.
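
As a rough cross-check on the 56.7% figure, the sketch below drops draws from the raw win rates quoted earlier and converts the result to log-odds, the scale a logistic regression intercept is reported on. It assumes both net predictors are zero, and the log-odds value is derived from the quoted rates, not copied from the regression output.

```python
import math

side1_win, side2_win = 0.547, 0.418  # rates quoted earlier; draws are the remainder

# Win probability for side 1 once draws are dropped from the sample.
p_no_draw = side1_win / (side1_win + side2_win)

# The same figure on the log-odds scale an intercept is expressed in.
log_odds = math.log(p_no_draw / (1.0 - p_no_draw))

print(f"side 1 win rate without draws: {p_no_draw:.1%}")
print(f"as log-odds: {log_odds:.3f}")
```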

Now all this sounds good, but there is one more column to take note of: P(>|Z|), also known as the P-value. In general, for a result to be statistically significant, you want a P-value of <0.05, meaning there is less than a 5% chance of seeing an effect this large if the predictor had no real effect. Neither net artillery nor net TDs manages that; each has a ~20% chance of giving these results even if it had no actual effect on the outcome. The Intercept's P-value, however, is on the order of 10^-12, which is extremely significant. Thus, from this data set, artillery and TDs are not necessarily good predictors, but the team side is, by a large margin.
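
For a feel of how a z value turns into the P(>|Z|) column, here is a minimal sketch using the standard normal distribution. The z values in the loop are illustrative only and are not taken from the regression table: a z near 1.28 lands around the 20% mark, 1.96 sits at the usual 5% cutoff, and a z around 7 gives a P-value on the order of 10^-12.

```python
import math

def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

for z in (1.28, 1.96, 7.0):  # illustrative z values, not from the post's table
    print(f"z = {z:>4}: p = {two_sided_p_value(z):.2e}")
```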






2 comments:

  1. You know, just a single image with a photo of the map and an explanation of which side is side 1 would have made this piece a 10x better read.

  2. As you have that data, can you check if higher win percentage players have more non-penetrations or more zero damage hits? Of course they should have more penetrating shots, but as we've played a few tanks and gotten a good win rate on them, the shots seem to bounce more or not go through. Would be interesting to know if this is just in our heads or if there is actually statistical evidence for it.
