First, this is not about consecutive bad rolls, though that is a factor raw numbers don't show either. 1 6 1 6 1 6 1 6 is a much better roll pattern than 1 1 6 6 1 1 6 6... Skull-pow followed by skull-pow is way better than double skulls followed by double pows (two good blocks, as opposed to one good block with a reroll burned). But data can't really account for that, so we'll leave it.
Ah yes, the magic "I can see it but mathematics and statistics cannot" argument. We see that a lot from people whose excuses don't match the numbers.
I care because I am a stats geek.
No, you're a persistent liar and failed agitator. Not all of us have forgotten you were banned under multiple names, each time coming back and trying (badly) to pretend to be someone different. The problem is you're not very good at starting trouble for Focus/Cyanide.. you mostly just convince people you're not particularly bright.
It's been tried. It's too good.
"Too good" as determined by the same metric that built the rest of the system, which is the BBRC's eyeballing, correct? I'd be interested to know the actual effect as opposed to what a couple of people felt about it.
Those are my rolls for the last game. My opponent's were also quite bad, but he did not have a single turnover over double 1's.
The last two games recorded on goblinSpy at the moment are concession matches for you. So here are the analyses of the last non-concession match available for you via goblinSpy:
d6 rolls: n = 52, χ2 = 4.54, p = 0.4748
d6 ac1: r = 0.1546, p = 0.1350
d6 mean: 3.2692
d6 mean t = -0.9508, p = 0.3462
Block rolls: n = 16, χ2 = 1.25, p = 0.8698
d6 rolls: n = 76, χ2 = 4.37, p = 0.4977
d6 ac1: r = 0.1169, p = 0.1558
d6 mean: 3.6053
d6 mean t = 0.5045, p = 0.6154
Block rolls: n = 44, χ2 = 0.66, p = 0.9563
d6 rolls: n = 128, χ2 = 4.19, p = 0.5227
d6 ac1: r = 0.0972, p = 0.1367
d6 mean: 3.4688
d6 mean t = -0.1973, p = 0.8439
Block rolls: n = 60, χ2 = 0.35, p = 0.9864
Nothing out of the ordinary there. Even if we ignored family-wise error problems (which we shouldn't), we still don't see anything reaching the level of statistical significance. Not a single p value is below 0.05, much less the 0.004 we'd use to maintain the guarantee of only a 5% false-positive rate across the whole battery of tests.
Mike, all that sounds good, but it does not take into account the context of how the plays were made. Double 1's happen too often, which pretty much breaks everything for agility teams, which has been my point from the beginning.
The chi-square test looks at the distribution of the rolls to see how far off they are from the expected perfect distribution. This means that, for example, if we rolled the d6 60 times during a game our expectation is that we'll see roughly 10 of each dice value. Randomness rarely gives us exactly that, so the chi-square goodness of fit test looks at how far off from that expectation the observed values are.
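That goodness-of-fit statistic is simple enough to sketch in a few lines of pure Python. This is an illustration of the technique described above, not the actual tool's code, and the rolls are made-up examples rather than match data:

```python
from collections import Counter

def chi_square_statistic(rolls, sides=6):
    """Chi-square goodness-of-fit statistic for dice rolls against a
    uniform expectation (n / sides occurrences of each face)."""
    n = len(rolls)
    expected = n / sides
    counts = Counter(rolls)
    # Sum (observed - expected)^2 / expected over every face,
    # including faces that never came up (observed = 0).
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, sides + 1))

# 60 hypothetical rolls: a perfectly even spread (10 of each face) gives 0.
even = [face for face in range(1, 7) for _ in range(10)]
print(chi_square_statistic(even))      # 0.0

# A lopsided spread (twenty 1s, eight of everything else) scores higher.
lopsided = [1] * 20 + [f for f in range(2, 7) for _ in range(8)]
print(chi_square_statistic(lopsided))  # 12.0
```

The bigger the statistic, the further the observed counts sit from the fair-die expectation; the p value then tells you how often fair dice would drift at least that far.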
If you were getting an abnormal number of double 1s, we'd expect that to show up in that test as an abnormal distribution of values... unless it's giving a commensurate amount of additional 2's, 3's, 4's, 5's, and 6's. At that point the "dice cheating" is getting pretty complex, since it's having to keep track of the number of times it alters the value it gives you, and compensate with other values in order to avoid being noticed in statistical analysis.
Even if it did that (tinfoil hats on) we look at the pattern of the values using lag-1 autocorrelation (ac1 in the results) which is basic signal analysis meant to differentiate between random noise and numbers that have a non-random pattern to them.
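Lag-1 autocorrelation is also easy to sketch in pure Python (again an illustration, not the tool's actual code). Note how the alternating 1 6 1 6 pattern mentioned earlier in the thread shows up as a strongly negative r, while a streaky 1 1 6 6 pattern comes out positive:

```python
def lag1_autocorrelation(xs):
    """Correlation between each roll and the next one. Near 0 for
    independent rolls; strongly negative for alternating patterns,
    positive for streaky ones."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [x - mean for x in xs]
    numerator = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    denominator = sum(d * d for d in dev)
    return numerator / denominator

# Rigidly alternating 1 6 1 6 ... is maximally anti-correlated.
print(lag1_autocorrelation([1, 6] * 4))     # -0.875 (tends to -1 as n grows)

# Streaky 1 1 6 6 ... comes out positive instead.
print(lag1_autocorrelation([1, 1, 6, 6] * 2))  # 0.125
```

Genuinely independent rolls land near zero, which is why an r that is both large and paired with a low p value would be the fingerprint of a patterned generator.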
Likewise, we look at the mean value of the dice rolls (which have an expected value of 3.5) and see how far off that expectation the observed mean is. Again, if it was throwing you more 1's than anything else, that would push your mean d6 value downward, and a t-test of the observed values against the expected value would almost certainly show abnormality... unless, again, it's keeping track and compensating with enough >1 values to push the mean back into the expected range, while simultaneously tracking which numbers it uses in order to avoid running afoul of the chi-square test on the counts of each value.
I also commented that the roll numbers are usually the same, but my doubt is: how have people tested how often double ones, or quadruple ones, come up?
The ac1 test I mentioned would notice if there were an abnormal number of sequential repeats.
Cyanide, you are fucking retards. I challenge you to come and see these games and tell me the rolls are normal
While I may not be Cyanide, I'm happy to run your replay through a quick battery of analyses:
d6 rolls: n = 153, χ2 = 5.63, p = 0.3442
d6 ac1: r = 0.0188, p = 0.4080
d6 mean: 3.4575
d6 mean t = -0.3192, p = 0.7500
Block rolls: n = 64, χ2 = 2.52, p = 0.6418
d6 rolls: n = 180, χ2 = 3.73, p = 0.5884
d6 ac1: r = -0.0361, p = 0.3143
d6 mean: 3.4667
d6 mean t = -0.2679, p = 0.7891
Block rolls: n = 131, χ2 = 10.92, p = 0.0275
d6 rolls: n = 333, χ2 = 2.66, p = 0.7526
d6 ac1: r = 0.0116, p = 0.4162
d6 mean: 3.4625
d6 mean t = -0.4136, p = 0.6795
Block rolls: n = 195, χ2 = 11.80, p = 0.0189
The block-roll lines (p = 0.0275 and p = 0.0189) are the only ones that even touch upon being abnormal, and even then only if we're ignoring family-wise error. If we want to keep the family-wise false-positive rate at 5% across the entire battery of tests, the lowest p value would need to be below 0.004, which it isn't.
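Assuming that 0.004 threshold is a Bonferroni correction over the twelve p values reported in the battery above (0.05 / 12 ≈ 0.0042), the check is a one-liner; here it is as a sketch, with the twelve p values copied from the results:

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Bonferroni correction: to keep the family-wise false-positive
    rate at alpha across k tests, each individual test must clear the
    stricter per-test threshold alpha / k."""
    threshold = alpha / len(p_values)
    return [p for p in p_values if p < threshold], threshold

# The twelve p values from the three result blocks above
# (d6 chi2, ac1, mean t, and block chi2 for each block).
p_values = [0.3442, 0.4080, 0.7500, 0.6418,
            0.5884, 0.3143, 0.7891, 0.0275,
            0.7526, 0.4162, 0.6795, 0.0189]
hits, threshold = bonferroni_significant(p_values)
print(round(threshold, 4))  # 0.0042
print(hits)                 # [] -- nothing survives the correction
```

Even the two block-roll p values that look interesting on their own fall well short of the corrected threshold.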
Most importantly, though, you're not even complaining about block dice, you're complaining about d6 results, and those show as absolutely within normal expectations as far as value distributions go. You're not getting "way more 1s than 6s" or anything of that sort.
The easiest way to interpret the tests is to look at the p value and read it as a proportion (1 being 100%, 0.5 being 50%, and so on) of games in which we'd expect to see a less normal distribution of values than the one in this match. So a p value of 0.75 means that 75% of all matches can be expected to look LESS normal than this one.
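That reading can be demonstrated by simulation (a sketch, not the tool's actual code): deal out thousands of genuinely fair-dice matches and count what fraction produce a chi-square statistic at least as extreme as the observed one. That fraction is, approximately, the p value:

```python
import random
from collections import Counter

def chi2_stat(rolls):
    """Chi-square goodness-of-fit statistic against a uniform d6."""
    expected = len(rolls) / 6
    counts = Counter(rolls)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, 7))

def empirical_p(observed_stat, n_rolls, trials=10_000, seed=1):
    """Fraction of simulated fair-dice matches whose statistic is at
    least as extreme as the observed one -- an empirical p value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
        if chi2_stat(rolls) >= observed_stat:
            hits += 1
    return hits / trials

# For the 153-roll result above (chi2 = 5.63) this lands near the
# reported p = 0.3442: roughly a third of fair matches look "worse".
print(empirical_p(5.63, 153))
```

So a low p doesn't mean "bad luck", it means "a distribution that fair dice would rarely produce" — and none of the values above are low once the whole battery is accounted for.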