Author Topic: "Average" versus "Expected" in Mathhammer (Read 2257 times)

SeekingOne · « **on:** July 12, 2017, 07:27:36 AM »

Reading the recent post by Blazinghand on long-range anti-tank, I was reminded of one terminology issue that often creeps in when people mathhammer the absolute and relative efficiency of weapons and units. And since I have long wanted to write about it anyway, this may be a good time to finally do it.

What I'm talking about is the use of the term "Expected" like this (highlights are mine):

Quote from: Blazinghand on July 11, 2017, 07:07:14 PM

Damage output:
1 Bright Lance shot, 1 PL giving 2 shots, 6 AEML/Reaper Launcher shots, all at pretty good BS, and with the Reaper-fired weapons suffering no penalties against Hard To Hit:
Total Expected Damage against T7, 3+: 9.4 damage
Total Expected Damage against T8, 3+: 7.1 damage

This is what I see as a significant mistake in the language: these are average values rather than "expected" ones. Now, I know many people who tend to word it like this, and even those who don't specifically use the word "expected" still often tend to see the average as something that can be "expected" in each instance. This might seem a small thing, but imho it is important because it affects the way people understand and interpret the numbers, and often leads to false expectations, which in turn sometimes result in people getting upset and frustrated by actual dice rolls.

Now, this is a mistake because, maths-wise, the average of a random value is NOT what can be expected in each specific realisation of that value. Average is just that - average, a value which you will hit (approximately) if you average out a very long series of instances of a certain random value. However, hitting the average in each particular instance is quite unlikely.
As you probably know, in addition to the average a random value also has a standard deviation. The only thing that can be said with any degree of certainty about what can be "expected" of each specific instance of a random value is this: it will likely (as in "more often than not", or with ~68% chance if the distribution is roughly bell-shaped) land somewhere between (average - standard deviation) and (average + standard deviation). Anything within that range is pretty much equally "expected".

For example, 9 Dark Reapers firing Starshot at a standard vehicle will hit and wound on 3+, followed by a 5+ save. This gives us an effective chance of ~0.296 to inflict an unsaved wound with each shot, and an average of ~2.67 unsaved wounds from 9 shots, which will multiply into ~8 points of damage.
But 9 shots with a 0.296 chance of a wound per shot will also give us a standard deviation of sqrt(9*0.296*(1-0.296)) ~= 1.37. This means that the actual number of wounds from such a volley can be expected to be anywhere between 2.67-1.37=1.3 and 2.67+1.37=4.04. However, the actual number of wounds cannot be fractional, so rounding to the nearest whole we get a result of anything between 1 and 4 wounds, dealing between 3 and 12 points of damage.
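
For anyone who wants to check these figures, here is a minimal Python sketch of the same calculation. It assumes every shot is an independent trial with the same chance of an unsaved wound (a binomial model); the function and variable names are purely illustrative.

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """Probability of exactly k successes, for k = 0..n, in n independent trials."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

shots = 9
p_wound = (2 / 3) * (2 / 3) * (2 / 3)   # hit on 3+, wound on 3+, target fails a 5+ save

mean = shots * p_wound
std_dev = sqrt(shots * p_wound * (1 - p_wound))
pmf = binomial_pmf(shots, p_wound)
p_one_to_four = sum(pmf[1:5])           # chance of exactly 1, 2, 3 or 4 unsaved wounds

print(f"average wounds:     {mean:.2f}")
print(f"standard deviation: {std_dev:.2f}")
print(f"P(1 to 4 wounds):   {p_one_to_four:.1%}")
```

Under these assumptions the 1-to-4 range comes out at roughly 86% rather than ~68%, because rounding the bounds to whole wounds widens the interval.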

Have to admit, when I did such a simple calculation for the first time, it was a bit of a shocking revelation to me - because these simple numbers give a clear explanation of why 40k in general often feels so unpredictable and luck-dependent. Thing is, statistically the range of 1 to 4 unsaved wounds from 9 Starshots corresponds to a very small deviation from the average. In other words, from a statistics standpoint, getting 4 wounds means that your dice went just slightly above average, while getting 1 wound means that your dice went just slightly below average. But think of what a tremendous difference that would make in game terms! 12 damage means a vehicle destroyed outright, while 3 damage means barely scratched paintwork - and both results are well within what can be easily expected from each volley. If you never thought of this before, I suggest you roll this around in your mind for a while...

This shows one fundamental design problem of 40k as a game system: random deviations in the results of actual dice rolls which statistically qualify as really very small often have a disproportionately huge impact on the result of a specific action and on the whole tactical situation in the game. And that is precisely what often drives that subjective feeling of being "lucky" or "unlucky" - while in fact both may be well within the boundaries of "expected".

So, while we can (and should) calculate and use averages for our reference, they have little to do with what can be "expected" - because in actual situations we can expect... well, pretty much anything really.

Thanks for reading!

P.S.
While what I wrote above is mostly academic, it leads us to at least one very practical conclusion:
When you need to accomplish something in-game (e.g. destroy a specific target), and the calculation shows that on average you should just about succeed, this means that your actual chance of success is only about 50%. If you want something done more or less reliably, you should aim for an average that is significantly higher than the result that you need to succeed.
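
To put rough numbers on this, the sketch below reuses the Starshot per-shot chance from above and asks how likely a volley is to reach a hypothetical goal of 3 unsaved wounds as more shots are added. The goal and the shot counts are made up for illustration, and shots are again assumed to be independent.

```python
from math import comb

def chance_at_least(needed, shots, p):
    """Chance of at least `needed` successes in `shots` independent trials."""
    return sum(comb(shots, k) * p**k * (1 - p)**(shots - k)
               for k in range(needed, shots + 1))

p_wound = 8 / 27   # per-shot chance of an unsaved wound in the Starshot example
needed = 3         # hypothetical goal: 3 unsaved wounds

for shots in (9, 12, 15, 18):
    average = shots * p_wound
    print(f"{shots:2d} shots: average {average:.2f} wounds, "
          f"P(at least {needed}) = {chance_at_least(needed, shots, p_wound):.0%}")
```

With 9 shots the average (~2.67) sits close to the goal and the chance of reaching it is only a little over half; the chance climbs as the average moves well past the goal.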

Tweedz · « **Reply #1 on:** July 12, 2017, 08:15:28 AM »

Thank you SeekingOne, this was a very good read. I think it is good to go into mathhammering with a clear mindset as to its limitations.

faitherun (Fay-ith-er-run) · « **Reply #2 on:** July 12, 2017, 12:37:37 PM »

Awesome post and some good points - something I need to start accounting for in my mathhammering sessions.

Hey Iris! Can I respectfully flag this post as another to go into that awesome list of useful posts that I am shamelessly pushing for?

Blazinghand · « **Reply #3 on:** July 12, 2017, 02:03:30 PM »

A classic hypothetical would be this. You're shooting at a model with two wounds, like a terminator. You have two weapons whose stats I have just made up for this, that you can choose from:

Reliant Cannon: S6 AP-3 Damage 3
Fun Cannon: S6 AP-3 Damage 1d6

The "Average Wounds Dealt" with the Fun Cannon is 3.5, which is higher than the Reliant Cannon. Does that make the Fun Cannon better? No, because of the possible outcomes. Reliant Cannon is a way better choice for shooting at Terminators. Assuming you hit and wound and the terminator fails his save, here's what the options look like:

Reliant Cannon: Terminator always dies
Fun Cannon: On a roll of 1, the Terminator takes 1 wound. On a 2+, he dies.

So, as we can see, there is actually a difference! Despite having a "higher" average wounds dealt, the Fun Cannon is actually worse. The reverse is possible, too, where a roll with a potential for big swings (like 2d6) is actually quite different from just the average, a flat 7. The odds of 2d6 landing on 7 are actually only 1 in 6. 5 out of 6 times, you're getting a result that isn't 7. Half of that time it will be below 7.
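
The comparison can be checked by simply enumerating the dice. This is a small sketch using the two made-up weapons above and assuming the shot has already hit, wounded and got past the save; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

D6 = range(1, 7)
TERMINATOR_WOUNDS = 2

# Each list holds the equally likely damage results for one unsaved wound.
reliant_cannon = [3]        # flat Damage 3
fun_cannon = list(D6)       # Damage 1d6

def average(results):
    return Fraction(sum(results), len(results))

def kill_chance(results, wounds):
    kills = sum(1 for damage in results if damage >= wounds)
    return Fraction(kills, len(results))

for name, results in (("Reliant Cannon", reliant_cannon), ("Fun Cannon", fun_cannon)):
    print(name, "- average damage:", float(average(results)),
          "- kill chance:", kill_chance(results, TERMINATOR_WOUNDS))

# 2d6: how often is the total exactly 7, and how often is it below 7?
totals = [a + b for a, b in product(D6, D6)]
p_seven = Fraction(totals.count(7), len(totals))
p_below = Fraction(sum(1 for t in totals if t < 7), len(totals))
print("P(2d6 = 7) =", p_seven, " P(2d6 < 7) =", p_below)
```

The Fun Cannon has the higher average (3.5 against 3) but kills the Terminator only 5 times in 6, while the Reliant Cannon always does.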

Nonetheless, the idea of average wounds against multiwound vehicles from something like 10 Bright Lances, for example, is actually a useful metric. We must keep variance in mind, but it's still a useful tool. I never expect to deal "9.4 Damage" against anything. In fact, that's literally impossible, you can only deal damage in whole numbers.

A truly accurate evaluation of Anti-tank guns would look something like this:
Bright Lance
Bright Lance vs T7, 3+
44% chance of dealing at least 1 damage (assuming you hit and wound)
22% chance of dealing at least 4 damage (chance of rolling a 4+ on damage)
7% chance of dealing at least 6 damage (if you roll a 6 for damage)
Bright Lance vs T8, 3+
33% chance of dealing at least 1 damage
17% chance of dealing at least 4 damage
5.5% chance of dealing at least 6 damage

We can take this info and determine, for example, that the odds of dealing 12 damage against T7 3+ with two Bright Lance shots are about 0.5%. The odds of dealing at least 4 damage against T7 3+ with two Bright Lance shots are at minimum 40% (the odds of at least one of the singular bright lances dealing 4) but somewhat higher, due to the odds of one rolling 3 and the other rolling 1, or both of them rolling 2, or any mix of 2s and 3s that adds up to more. Counting those six combinations increases the odds by about 3.3%.

So, instead of just saying "Two Bright Lances will deal an average of 3.1 damage" we can now make claims like "Two Bright Lances will have a roughly 43% chance of dealing at least 4 damage any time they are fired" and "Two Bright Lances will have a 60% chance of dealing at least two damage when they are fired" and so on.
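
For exact figures rather than back-of-the-envelope ones, the damage distributions of the two shots can be combined directly. The sketch below assumes that the 44% per-shot figure against T7 3+ is 2/3 to hit times 2/3 to wound with no save allowed, and that damage is a plain D6; the function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def single_shot(p_unsaved):
    """Damage distribution of one D6-damage shot: 0 if it fails, else 1..6."""
    dist = {0: 1 - p_unsaved}
    for damage in range(1, 7):
        dist[damage] = p_unsaved / 6
    return dist

def combine(dist_a, dist_b):
    """Distribution of the summed damage of two independent shots."""
    total = defaultdict(Fraction)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            total[a + b] += pa * pb
    return dict(total)

def at_least(dist, threshold):
    return sum(p for damage, p in dist.items() if damage >= threshold)

# Assumption: 44% = 2/3 to hit * 2/3 to wound, and the target gets no save.
p_unsaved = Fraction(2, 3) * Fraction(2, 3)
two_lances = combine(single_shot(p_unsaved), single_shot(p_unsaved))

for threshold in (2, 4, 12):
    p = at_least(two_lances, threshold)
    print(f"P(at least {threshold:2d} damage) = {p} = {float(p):.1%}")
```

Run as written, this gives about 60.9% for at least 2 damage, about 42.8% for at least 4 damage and about 0.5% for the full 12. The same `combine` step can be repeated to add a third or fourth shot.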

volatilegaz · « **Reply #4 on:** July 12, 2017, 03:39:40 PM »

Because statistical deviations and probability maths is hard for me, I tend to follow a simple rule:
If you want it to die, dedicate 2 units that each have at least a 50% average chance of killing it.
Then if it doesn't die, you're unlucky.
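
As a quick check of what this rule buys you, here is a tiny sketch that assumes the units' attempts are independent of each other; the function name is made up for illustration.

```python
def chance_any_succeeds(chances):
    """Chance that at least one of several independent attempts succeeds."""
    all_fail = 1.0
    for p in chances:
        all_fail *= (1.0 - p)
    return 1.0 - all_fail

print(chance_any_succeeds([0.5]))            # one unit at 50%
print(chance_any_succeeds([0.5, 0.5]))       # two units at 50% each
print(chance_any_succeeds([0.5, 0.5, 0.5]))  # three units at 50% each
```

Two 50% units give a 75% chance that at least one of them gets the kill, so "unlucky" here means landing in the remaining quarter.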

Edit:
To expand on my point: there's 2 distinct times you want to be looking at probability of damage or survival or whatever: the first is during list building, when you have the luxury of time to work out statistical deviations and the like, but don't know precisely what variables you'll be working with in game (ie what unit will be firing at what unit). For list building, I focus almost entirely on average results per point cost.
Second is during the game, when you know all the variables, but complex maths is beyond most of us in real time. That's when I follow the "work out what's 50-50 going to work, and then double the resources committed" tactic.

In summary: while I absolutely agree that you should never conflate average with probable, I don't see much real-world use for calculating anything more complex than average results in 40k.

Adrastos · « **Reply #5 on:** July 12, 2017, 08:11:33 PM »

Great read. Thanks for taking the time to write that all out. Really enjoyed it.

As ever it reaffirms my favorite piece of strategic advice for 40K: Eliminate randomness and chance as much as possible. The less you leave to the dice the more you leave to your strategic acumen.

SeekingOne · « **Reply #6 on:** July 13, 2017, 03:09:58 AM »

@volatilegaz

Quote from: volatilegaz on July 12, 2017, 03:39:40 PM

In summary: while I absolutely agree that you should never conflate average with probable, I don't see much real-world use for calculating anything more complex than average results in 40k.

I absolutely agree with you here. Averages are quite fine, especially when you need to compare different units/weapons. I use averages all the time myself, particularly since I also use Lanchester formulas a lot, and ranges are useless in them. I only calculate deviations from time to time, usually when I want to get a better feel of what to expect of a unit or a weapon that I'm about to settle on. As long as we don't confuse "average" with "likely", averages are the most practical way to estimate random things in 40k.

@Blazinghand
Just to clarify - I certainly didn't mean to challenge your maths or criticise the use of averages in general. As I wrote above, averages are fine and very practical. I was just arguing against the use of the word "expected" instead of "average" in the mathhammer tables.

Your calculations of reliant cannon vs fun cannon are quite interesting - I was planning to do something like that myself, just to see how different the picture would be from a simple flat average. Was too lazy to code a full algorithm though.

Fenris · « **Reply #7 on:** July 17, 2017, 02:30:49 AM »

I agree that the average is the most useful measurement; however, it may at times be useful to know the maximum.
Standard deviation is irrelevant, but the spread of outcomes in a set scenario is not. Say you need to kill a model with 4 wounds and your weapon does D6 dmg: knowing that this is a 50% chance can be useful, especially when deciding whether to re-roll the dice or not.

For example, knowing that 3 bright lances cannot kill an unharmed wraithknight, while 4 bright lances could.
It's extremely unlikely; however, shooting a single bright lance at an unharmed war walker and using a command point to re-roll damage makes it much more plausible to kill it outright.

Meanwhile 5 reaper launchers could never kill an unharmed Wraithknight, but deliver a higher average damage.
5 Reaper launchers (starshot) average 3.33 dmg, maximum 15 (20 with sunburst).
4 Bright lances average 3.11 dmg, maximum 24.
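
The effect of a damage re-roll mentioned above is easy to quantify. This sketch assumes the shot has already got through, that damage is a single D6, and that the player re-rolls once whenever the first roll falls short of the needed value; the function name is illustrative.

```python
from fractions import Fraction

def p_damage_at_least(threshold, reroll=False):
    """Chance that a D6 damage roll is >= threshold, with an optional single re-roll of a failure."""
    successes = sum(1 for face in range(1, 7) if face >= threshold)
    p = Fraction(successes, 6)
    if reroll:
        p = 1 - (1 - p) ** 2   # succeed on the first roll, or fail it and succeed on the re-roll
    return p

for threshold in (4, 6):
    print(f"need {threshold}+: {p_damage_at_least(threshold)} without re-roll, "
          f"{p_damage_at_least(threshold, reroll=True)} with re-roll")
```

Needing a 4+ goes from 1/2 to 3/4 with the re-roll, and needing a 6 goes from 1/6 to 11/36.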

Dev Null · « **Reply #8 on:** July 17, 2017, 12:19:20 PM »

<delurk><soapbox>
This is why I always (used to) try to push Mathhammer comparisons to the odds of achieving some specific goal, instead of average damage. With vehicles that's easy; what are the odds that it's dead at the end. Nothing else really matters. With troops, I'll just pick a range of the most likely values; what are the odds that we'll get 2 unsaved wounds? 3? 4?

Once upon a time I wrote an article about calculating the odds to kill a vehicle. That was probably half-a-dozen versions of the rules ago by now, so I doubt the actual mechanics are still 100% accurate, but the theory should still hold; updating the spreadsheet should be pretty straightforward if someone was interested.

http://www.40konline.com/index.php?action=articles;sa=view;article=1017

(Edit: Yeah, checked around a bit and the vehicle rules sure did change! Same theory applies though; calculate the odds of killing something, just using the current rules...)
</soapbox><relurk>

DuckWake · « **Reply #9 on:** July 27, 2017, 09:02:30 AM »

I also think the probability distribution is an important factor when evaluating a unit's perceived effectiveness, especially in small games if you are only including a single unit for a given role.

Back in 3rd edition I was introducing a friend to the game, and as I fired a unit of 6 guardians at a unit of his marines I explained that each of their 12 attacks had a 1 in 12 chance of killing a marine. So a round of fire from them will kill 1 marine on average.

Sure enough those 6 guardians killed 5 marines. Getting 5+ kills in that scenario has a really low chance of happening (http://www.wolframalpha.com/input/?i=more+than+4+12+with+12+12+sided+dice roughly 1 in 518), but those low chance, high success outcomes really skew the distribution curve. Since you have the chance of a great result, the typical result that you see in the majority of your games is going to be at or below the average.

If you look at the probability distribution, there is a ~35% chance that they get 0 kills, a ~38% chance that they get 1 kill, and a ~26% chance that they get 2 or more kills. If you expect 1 kill per round of shooting from this unit and you sample a random set of results from your past battles, there is a greater than 50% probability that any given result is going to be at or below average. This averages out over a larger sample size, but if you are relying on a small number of key units, you need to expect that most of the time they are delivering average or slightly below average results, and then occasionally they shine.
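
These percentages can be reproduced with a few lines of Python, assuming the 12 shots are independent and each has exactly a 1 in 12 chance of a kill; the variable names are illustrative.

```python
from math import comb

shots = 12
p_kill = 1 / 12

# Probability of exactly k kills for k = 0..12 (binomial distribution).
pmf = [comb(shots, k) * p_kill**k * (1 - p_kill)**(shots - k)
       for k in range(shots + 1)]

p_zero = pmf[0]
p_one = pmf[1]
p_two_plus = sum(pmf[2:])
p_five_plus = sum(pmf[5:])

print(f"P(0 kills)  = {p_zero:.1%}")
print(f"P(1 kill)   = {p_one:.1%}")
print(f"P(2+ kills) = {p_two_plus:.1%}")
print(f"P(5+ kills) = 1 in {1 / p_five_plus:.0f}")
```

The single most likely result is exactly 1 kill, with 0 kills more likely than 2 or more, and the long tail of rare big results is what pulls the average up to 1.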

SeekingOne · « **Reply #10 on:** August 1, 2017, 10:56:50 AM »

Quote from: DuckWake on July 27, 2017, 09:02:30 AM

If you look at the probability distribution, there is a ~35% chance that they get 0 kills, a ~38% chance that they get 1 kill, and a ~26% chance that they get 2 or more kills. If you expect 1 kill per round of shooting from this unit and you sample a random set of results from your past battles, there is a greater than 50% probability that any given result is going to be at or below average. This averages out over a larger sample size, but if you are relying on a small number of key units, you need to expect that most of the time they are delivering average or slightly below average results, and then occasionally they shine.

This, exactly.

This logic happens to be particularly apparent with low probability rolls, like weak saving throws of 5+ or 6+. On average they stop 33% and 17% of wounds respectively; however, looking at actual rolls you may notice that if a unit with such a save has a low wound count, it can easily be wiped out without making a single save in the process. But when there are multiple units like that, and handfuls of those saves are rolled time after time, occasionally they shine and you suddenly find yourself successfully making well over half of 5+ saves or almost half of 6+ saves in a single roll, averaging things out.

This is why certain upgrades that give models extra (weak) saves work best when applied to high numbers of models with lots of wounds. This way it is most likely that the points invested in them would actually pay off in every game.
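
A small sketch of why the number of saves rolled matters so much. It assumes each save is an independent roll, and the wound counts (2, 3 and 12) are hypothetical values picked for illustration.

```python
from math import comb

def p_no_saves(wounds, p_save):
    """Chance that every one of `wounds` saving throws fails."""
    return (1 - p_save) ** wounds

def p_at_least_saves(k, wounds, p_save):
    """Chance that at least k of `wounds` saving throws succeed."""
    return sum(comb(wounds, s) * p_save**s * (1 - p_save)**(wounds - s)
               for s in range(k, wounds + 1))

SAVES = {"5+": 2 / 6, "6+": 1 / 6}

for label, p_save in SAVES.items():
    for wounds in (2, 3, 12):   # hypothetical numbers of wounds suffered
        half = (wounds + 1) // 2
        print(f"{label} save, {wounds:2d} wounds: "
              f"P(no saves) = {p_no_saves(wounds, p_save):.1%}, "
              f"P(at least half saved) = {p_at_least_saves(half, wounds, p_save):.1%}")
```

With only 3 wounds to save, a 5+ save fails every single time about 30% of the time, while over 12 wounds a complete blank becomes rare.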

Dev Null · « **Reply #11 on:** August 7, 2017, 11:34:05 AM »

Those low probability rolls really break averages when you start talking about target models/units with low wound counts. You may have only a tiny chance of getting 5 or 6 unsaved wounds, but if the chance exists, then it drives the average up. If your target only has 4 wounds in the first place, that higher average isn't even reflecting the chance of getting lucky and doing exceptionally well...
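
One way to see this effect is to compare the raw average with an average where any damage beyond the target's remaining wounds is thrown away. The sketch below uses a hypothetical single D6-damage hit against a 4-wound target; the function name is illustrative.

```python
from fractions import Fraction

def raw_and_capped_average(damage_results, target_wounds):
    """Average damage rolled, and average damage that actually counts against the target."""
    n = len(damage_results)
    raw = Fraction(sum(damage_results), n)
    capped = Fraction(sum(min(d, target_wounds) for d in damage_results), n)
    return raw, capped

d6 = list(range(1, 7))
raw, useful = raw_and_capped_average(d6, target_wounds=4)
print("raw average damage:   ", float(raw))
print("useful average damage:", float(useful))
```

The raw average is 3.5, but only 3.0 of it can ever matter against 4 wounds, since rolls of 5 and 6 do no more than a 4 would.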
