This text was originally posted on the Symthic forums on 15.1.2015 (reproduced here without the images).
Title: "More nerdy statistics of BF4 players and battlereports."

Table of contents

  1. Distribution of played maps
  2. Distribution of wins per map
  3. Distribution of winner team's sum of score/rank ("rank stacking")
  4. Player's "skill" versus time played (Learning curve)
  5. Effects of capturing/defending flags on win/lose ratio and SPM ("Does PTFOing help?")
  6. Method of gathering the sample & Things to note

Changelog:

  21.1.2015: Added the first 50 h learning-curve analysis (section 4) and the per-game flag capture/defend correlations (section 5).

1. Distribution of played maps

Quite simple: the following table shows the distribution of played maps and the fraction of reports each one accounts for.

Map                  N    Frac     
-----------------------------------------------------------
Siege of Shanghai:   1127 (0.11) ##############################
Golmud Railway:      957  (0.10) #########################
Operation Locker:    912  (0.09) ########################
Zavod 311:           687  (0.07) ##################
Paracel Storm:       532  (0.05) ##############
Hainan Resort:       467  (0.05) ############
Rogue Transmission:  426  (0.04) ###########
Dawnbreaker:         422  (0.04) ###########
Flood Zone:          367  (0.04) #########
Lancang Dam:         354  (0.04) #########
Operation Metro:     347  (0.03) #########
Whiteout:            319  (0.03) ########
Giants of Karelia:   313  (0.03) ########
Hammerhead:          306  (0.03) ########
Hangar 21:           302  (0.03) ########
Caspian Border:      279  (0.03) #######
Pearl Market:        270  (0.03) #######
Silk Road:           266  (0.03) #######
Propaganda:          200  (0.02) #####
Gulf of Oman:        159  (0.02) ####
Operation Firestorm: 155  (0.02) ####
Guilin Peaks:        147  (0.01) ###
Operation Mortar:    97   (0.01) ##
Wave Breaker:        94   (0.01) ##
Lumphini Garden:     91   (0.01) ##
Dragon Pass:         88   (0.01) ##
Lost Islands:        85   (0.01) ##
Sunken Dragon:       79   (0.01) ##
Altai Range:         77   (0.01) ##
Nansha Strike:       75   (0.01) #


Of course, this is not the fairest comparison, since Final Stand was released only about 2-3 months ago while the oldest of these reports are about 4 months old. Still, we can see that the original maps are clearly the most popular, with the Final Stand maps coming next, which can be explained by their freshness.

I can come up with two possible explanations for this distribution:
  1. There are more non-premium/non-DLC players than premium players. Vanilla-map servers are more active because everyone can join them -> more players join -> more games get played.
  2. DLC maps are somewhat "throw-away": premium players mostly move on to the newest DLC's maps when it comes out.

Of course, these are just assumptions and nothing can be proven by these results.
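
For reference, the table above is just a frequency count over the map field of all reports. A minimal sketch (assuming each report has been parsed into a dict with a "map" key; this structure is my own illustration, not the exact layout of the pickle files):

from collections import Counter

def map_distribution(reports):
    """Print map name, count, fraction and a proportional #-bar, like the table above."""
    counts = Counter(report["map"] for report in reports)
    total = sum(counts.values())
    for name, n in counts.most_common():
        frac = n / total
        bar = "#" * round(frac * 270)  # bar length proportional to the fraction
        print("{:<21}{:<5}({:.2f}) {}".format(name + ":", n, frac, bar))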


2. Distribution of wins per map

Next we have the distribution of wins per map. Team nationalities can be changed in the server settings, but a team's number (1 or 2) is fixed by where it spawns on the map.

"95% Conf." tells fraction of wins of the team who has more wins. Smaller interval -> more accurate.
"Unbalance" indicates how much 95% Conf. interval's closer edge differs from the 0.5, if at all. One # = 0.005

Map               Team 1 vs. 2   95% Conf.   "Unbalance"
----------------------------------------------------------------------------------
Gulf of Oman          52 vs. 107 (0.67±0.07) ####################
Operation Mortar      65 vs. 32  (0.67±0.09) ###############
Rogue Transmission   274 vs. 152 (0.64±0.05) ###################
Propaganda            76 vs. 124 (0.62±0.07) ##########
Lumphini Garden       35 vs. 56  (0.62±0.10) ###
Nansha Strike         29 vs. 46  (0.61±0.11) 
Giants of Karelia    188 vs. 125 (0.60±0.05) #########
Flood Zone           147 vs. 220 (0.60±0.05) #########
Hammerhead           127 vs. 179 (0.58±0.06) #####
Sunken Dragon         46 vs. 33  (0.58±0.11) 
Wave Breaker          54 vs. 40  (0.57±0.10) 
Caspian Border       159 vs. 120 (0.57±0.06) ##
Silk Road            118 vs. 148 (0.56±0.06) 
Operation Metro      154 vs. 193 (0.56±0.05) 
Whiteout             175 vs. 144 (0.55±0.05) 
Guilin Peaks          67 vs. 80  (0.54±0.08) 
Lancang Dam          192 vs. 162 (0.54±0.05) 
Hainan Resort        251 vs. 216 (0.54±0.05) 
Golmud Railway       449 vs. 508 (0.53±0.03) 
Lost Islands          40 vs. 45  (0.53±0.11) 
Paracel Storm        279 vs. 253 (0.52±0.04) 
Dragon Pass           42 vs. 46  (0.52±0.10) 
Zavod 311            329 vs. 358 (0.52±0.04) 
Hangar 21            145 vs. 157 (0.52±0.06)
Siege of Shanghai    542 vs. 585 (0.52±0.03) 
Operation Locker     442 vs. 470 (0.52±0.03) 
Pearl Market         131 vs. 139 (0.51±0.06) 
Operation Firestorm   79 vs. 76  (0.51±0.08) 
Altai Range           39 vs. 38  (0.51±0.11) 
Dawnbreaker          210 vs. 212 (0.50±0.05) 

Statistically, most maps look reasonably balanced, with some exceptions. The most unbalanced maps are big vehicle maps, which have a lot of elements to balance, and their sheer size only makes it harder.

For some maps I wouldn't dare to say whether they are unbalanced or not, due to the small number of samples, especially considering how the sample was gathered.
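
For reference, the "95% Conf." values above can be obtained with a plain normal-approximation confidence interval for a binomial proportion. A minimal sketch (I haven't double-checked it against the exact code that produced the table, but it reproduces e.g. the Gulf of Oman row):

from math import sqrt

def win_interval(team1_wins, team2_wins, z=1.96):
    """Fraction of wins of the better team plus a 95% half-width (z = 1.96)."""
    n = team1_wins + team2_wins
    p = max(team1_wins, team2_wins) / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, half_width

# Example: Gulf of Oman, 52 vs. 107 wins
p, hw = win_interval(52, 107)
print("{:.2f}±{:.2f}".format(p, hw))  # 0.67±0.07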


3. Distribution of winner team's sum of score/rank ("rank stacking")

Here we check whether the sum of ranks per team affects the outcome of the game, inspired by my personal hatred of clan/group stacking.

I calculated the sum of ranks and the sum of scores (based on the score needed for each rank) for both teams, then subtracted the loser's sums from the winner's (I simply use the terms "rank" and "score" later on), giving us a nice and simple sample of "how much more summed rank/score the winner had". Why did I include score? Because rank progression is non-linear, so subtracting ranks directly is questionable, but I decided to give it a shot anyway.

The image of the distributions was spoilered due to the bright-white background of the plot (it burns my eyes against the forum's background color, at least).


On the left we have the score difference's distribution (boxplot and histogram), on the right the rank difference's distribution.


Both variables have a mean that differs from 0 with statistical significance (p <<< 0.01, t-test). The 99% confidence interval for rank is 328-351 and for score 46.8-50.8, meaning the mean of the rank/score differences lies inside these intervals with 99% confidence (this applies to the mean, not to individual reports!).

There's definitely an indication that rank/score differences positively affect the chances of winning, especially since the boxplots show that more than 75% of the cases are above zero.

In layman's terms: the team with more high-ranked players has a better chance of winning.
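
For reference, a sketch of the test behind these numbers: a one-sample t-test of the winner-minus-loser sums against zero, plus a confidence interval for their mean (variable names are placeholders for the real per-report arrays):

import numpy as np
from scipy import stats

def test_difference(diffs, confidence=0.99):
    """One-sample t-test of the differences against 0 and a CI for their mean."""
    diffs = np.asarray(diffs, dtype=float)
    t_stat, p_value = stats.ttest_1samp(diffs, 0.0)
    ci = stats.t.interval(confidence, len(diffs) - 1,
                          loc=diffs.mean(), scale=stats.sem(diffs))
    return t_stat, p_value, ci

# rank_diffs = winner's summed rank minus loser's, one value per report
# t, p, (lo, hi) = test_difference(rank_diffs)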


4. Player's "skill" versus time played (Learning curve)

3VerstsNorth requested a look at how players' skill improves as a function of time played, so let's get to it.

First, let's define skill: we have a few measures for this, such as SPM, KD and KPM. Each tells a bit about the player but not everything (e.g. a player with a high KD might just be a camper who doesn't help the team).

Let's combine them all into one variable, as pmax suggested to me, by summing the z-values of SPM, KPM and log(KD) (a small sketch of this is below). Then let's look at a plot of this z-product, with observations rounded to the closest 10 hours.
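
Roughly, assuming the player sample sits in a data frame with spm, kpm and kd columns (my own shorthand, not the exact data layout):

import numpy as np
import pandas as pd

def z_product(players: pd.DataFrame) -> pd.Series:
    """Sum of the standardized (z-scored) SPM, KPM and log(KD) of each player."""
    def z(x):
        return (x - x.mean()) / x.std()
    return z(players["spm"]) + z(players["kpm"]) + z(np.log(players["kd"]))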




A beautiful learning curve. Plotting the individual variables SPM, KPM and log(KD) gives similar results. As the plot shows, there is more variance on the right (more time played), and I limited the graph to a maximum of 1000 h due to the small number of observations above that.

There's a clear pattern here: a player improves rapidly during the first 200 h, but the improvement gradually slows down. The pattern is so clear that we can try fitting a regression model to put some numbers on it:



Formula: z_product = -4.37134 + 0.88833*ln(hours_played)

p-values <<< 0.01 and R² = 0.92

An almost perfect fit, and this is what learning curves often seem to look like (ask 3VerstsNorth, he seems to be well acquainted with this). The tests confirm there is definitely a connection between the two, and the high R² indicates that time played explains most of the variance of the z-product.

Again, in a nutshell: over roughly the first 100 hours a player's "skill" doubles or triples. After that, it takes about 300 h more to double it again. Beyond that point the data is too scattered to say anything for sure, but it seems to follow a logarithmic function.
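
For reference, fitting such a logarithmic model is a plain least-squares regression of the z-product on ln(hours_played); a minimal sketch (not a re-run of the original fit):

import numpy as np
from scipy import stats

def fit_learning_curve(hours_played, z_products):
    """Fit z_product = a + b*ln(hours_played) by ordinary least squares."""
    log_hours = np.log(np.asarray(hours_played, dtype=float))
    fit = stats.linregress(log_hours, np.asarray(z_products, dtype=float))
    # fit.intercept and fit.slope correspond to a and b above; fit.rvalue**2 gives R².
    return fit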

Update 21st Jan

3VN requested an analysis of the first 50 h of gameplay time.



Formula: z_product = -3.46490 + 0.72937*ln(hours_played)

p-values <<< 0.01 and R² = 0.93

As with the larger data set, we get a clearly logarithmic shape in a player's improvement. Judging by this, a player roughly doubles their performance during the first few hours (<10 h), after which it takes ~30 h to double it again.

The regression model agrees with the first one up to slightly different coefficients, which can be explained by the high variance of the observations.


5. Effects of capturing/defending flags on win/lose ratio and SPM ("Does PTFOing help?")

Does PTFOing really help you gain more score? Well, let's find out!

Let's use flag captures/defends from Conquest games as an indicator of "PTFOing" and compare it against win/lose ratio and SPM. Quite simply, we calculate the correlations between these variables.

Correlation in short: the higher the correlation (closer to 1.0 or -1.0), the stronger the dependency/connection between the two variables. Positive means a rising trend, negative a descending one. It does not tell whether one causes the other!


logwlr = logarithm of win/lose ratio
flag_def/flag_cap = number of flag defends/captures
flag_rib/cq_rib = number of conquest flag cap ribbons and conquest win ribbons

            logwlr       SPM  flag_cap  flag_def  flag_rib    cq_rib
logwlr   1.0000000 0.5178820 0.3673040 0.4073538 0.3662498 0.3707486
SPM      0.5178820 1.0000000 0.4886269 0.5318530 0.4863264 0.4147895
flag_cap 0.3673040 0.4886269 1.0000000 0.9425577 0.9873665 0.9342006
flag_def 0.4073538 0.5318530 0.9425577 1.0000000 0.9236739 0.9560230
flag_rib 0.3662498 0.4863264 0.9873665 0.9236739 1.0000000 0.9273325
cq_rib   0.3707486 0.4147895 0.9342006 0.9560230 0.9273325 1.0000000

I used Spearman's correlation to compensate for the scattered values and heavy outliers. All p-values are <<< 0.01, so the values are statistically significant.
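
A minimal sketch of how such a correlation table can be computed with pandas (column names follow the legend above; p-values for individual pairs can be checked with scipy.stats.spearmanr):

import pandas as pd

def correlation_table(players: pd.DataFrame) -> pd.DataFrame:
    """Spearman (rank-based) correlation matrix of the variables above."""
    cols = ["logwlr", "SPM", "flag_cap", "flag_def", "flag_rib", "cq_rib"]
    return players[cols].corr(method="spearman")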

As a result we have quite high correlations: logwlr sits at about ~0.4 against the objective-related variables, which is reasonably high and clearly indicates a connection between the two.

SPM correlates even better with the objective-based variables. Conquest win ribbons correlating very strongly with the flag variables can be explained by the fact that you accumulate both kinds of ribbons simply by playing games, but the correlation is still pretty high.

So basically: yes, PTFOing improves your chances of winning and gives you more score, now with statistical backing. HOWEVER, this doesn't mean PTFOing always makes you win! It just means there's a connection between the two!

Update 21st Jan

People suggested dividing flag captures/defends by games played to get something like flag captures per game. The problem is that the games-played variable also includes rounds of other game modes, but let's give it a shot.
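
Roughly, the per-game variables were built by dividing by total games played; a sketch (again with illustrative column names, and with the caveat that games_played spans all game modes):

import pandas as pd

def per_game_correlations(players: pd.DataFrame) -> pd.DataFrame:
    """Normalize flag captures/defends by games played, then redo the Spearman matrix."""
    df = players.copy()
    df["game_flag_cap"] = df["flag_cap"] / df["games_played"]
    df["game_flag_def"] = df["flag_def"] / df["games_played"]
    return df[["logwlr", "SPM", "game_flag_cap", "game_flag_def"]].corr(method="spearman")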

                 logwlr       SPM game_flag_cap game_flag_def
logwlr        1.0000000 0.3763953     0.2153307     0.3694187
SPM           0.3763953 1.0000000     0.3819391     0.5360870
game_flag_cap 0.2153307 0.3819391     1.0000000     0.4695575
game_flag_def 0.3694187 0.5360870     0.4695575     1.0000000

The correlation between flag captures and win/lose ratio dropped by almost 0.2, which is quite a bit, while flag defends dropped by only ~0.04; the drop could well be explained by the fact that played rounds include game modes other than Conquest.

The correlations with SPM showed the same kind of effect: defending flags barely changed, while capturing flags dropped considerably.

This seems to support the conclusion/idea that defending flags is more beneficial for your score, and possibly for your team.


6. Method of gathering the sample & Things to note

I am just going to shamelessly copy/paste my earlier text here:
Quote:
For the past week I have been working on a script for gathering a sample of BF4 players and battlereports for statistical analysis. pmax did the same a while ago, but he mentions his data might not have been a pure random sample, since the users were taken from the Battlelog forums. That data also didn't include reports, nor some of the variables I would have liked.

So what this script does is: it first scrapes a good bunch of servers and report IDs from BF4DB, taking a maximum of ~50 reports per server. For every report ID it uses a URL left in by the Battlelog devs to get the report data without needing to parse the HTML (thanks, DICE / whoever made that!), collects the soldier IDs from the report and then uses the BF4Stats API to get the soldier data.



@Using BF4Stats: I only later noticed that pmax had posted more of those Battlelog API URLs which return fresh soldier data...

The "rules" I have for selecting reports:


The rules are there to keep unusual outliers out of the sample: unranked games can have any set of rules, servers with a very high ticket count can have 300 different players in one game, and so on.

I had 10k battlereports and ~40k players to analyse. I tried to keep the sampling as random as possible, but I am unsure how bf4stats builds its server listing, so there might be a bias in some direction; however, I believe the data gives us at least a general picture of the game's underlying statistics.

Anybody who is interested in having this data, just contact me. Currently it's in ugly Python pickle files, which are a bit clumsy, so I need to refine it a bit first.

Also if you have more questions regarding statistics of BF4 just go ahead and ask, I'll see what I can do.

And one more thing: Thank you for being such an awesome community! It is nice to share something like this with you guys :).

An especially big thanks to 3VerstsNorth for being an awesome doctor and researcher in general. You keep inspiring me! pmax also inspired me with his own research and with all the help and tips he gave me!