It is possible that a household which installs the product achieves only a fraction of the promised bill reduction. They may even see their bill increase. Yet even in these scenarios, the statement of expected energy savings printed on the product can remain completely true. In this blog post I explain why a single number is not sufficient, and describe what information is needed to make a sound assessment of a retrofit product.

Always include uncertainty

On day one of my undergraduate lab sessions, we learned that any measured number quoted without something called “uncertainty” is meaningless. It’s an incredibly important point, but one most people are unaware of. To understand what “uncertainty” is, imagine a tailor needs to know your waist measurement. You report it as 34 inches. But in reality, your waist will not measure exactly 34 inches. To begin with, your tape measure only has a certain precision. Perhaps you also rounded the result up or down, or perhaps you took only a single measurement, while you were breathing in. How close you got to your true waist size depends on your measuring device and how many measurements you took. You therefore need to give an estimate of how precise, or uncertain, you believe your measurement to be. By convention, this “uncertainty” is quoted as a range around your number using the “±” symbol (pronounced “plus-minus”). If you believe you measured your waist to within half an inch, you would quote it as 34 ± 0.5 inches. The tailor can then read this as “the person’s waist is between 33.5 and 34.5 inches” and make your clothes accordingly. If you quote your waist as 34 ± 5 inches, the tailor will know you didn’t do a very good job and will ask you to take more measurements with a better tape measure.

From an energy efficiency standpoint, a quoted reduction in bills of 14% is equally meaningless. A measured number always carries an uncertainty, and this uncertainty needs to be stated. It might be 14 ± 0.1%, but it might also be 14 ± 13%*. In the latter scenario, I might get savings close to the upper bound of 27%, but I also risk getting savings close to the lower bound of 1%. I, as a consumer, need to know this. The uncertainty could even be 14 ± 30%, in which case I risk a negative percentage of savings and higher fuel bills because of the product.
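To make the arithmetic concrete, here is a minimal Python sketch (the figures are those from the text) that turns a quoted “mean ± uncertainty” into the range of outcomes a consumer could face:

```python
# Turn a quoted "mean ± uncertainty" into the implied range of outcomes.
# Figures are the illustrative ones from the text, not real product data.
def savings_range(mean_pct, uncertainty_pct):
    """Return the (lower, upper) bounds implied by mean ± uncertainty."""
    return mean_pct - uncertainty_pct, mean_pct + uncertainty_pct

print(savings_range(14, 0.1))  # (13.9, 14.1) - a tight, reassuring range
print(savings_range(14, 13))   # (1, 27) - savings could be almost nothing
print(savings_range(14, 30))   # (-16, 44) - bills could actually go up
```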

Clearly uncertainties are important. I have been telling people for years about the need to include uncertainties in results. Let’s imagine that I successfully lobby the whole retrofit market on their importance, and the product now reads “this product reduces the homeowner’s bills by an average of 14 ± 0.1%”. Can we now make a more informed decision? Not necessarily, as there is a statistical sleight-of-hand at play here which may still mislead the consumer. It comes down to the word average.

Beware of uncertainties on averages

All measured numbers should have an uncertainty. That means that when we calculate an average, we need to estimate how uncertain our average is too. As long as our numbers behave themselves, this is straightforward and follows a simple equation**. One important feature of this equation is that the uncertainty on an average decreases as the sample size increases. For example, say our tailor wants to estimate the average waist size of all the people who walk into the shop. They take a sample of the first 100 people and find their waists are between 30 and 40 inches, with a mean of 35 ± 3 inches***. The next week, another 100 customers are measured, and their waists are very similar to the first group’s, still lying in the range 30 to 40 inches, but the mean is now 35 ± 2 inches. If we were to repeat this week on week, the uncertainty in the mean would keep getting smaller, even if the mean itself remains unchanged. This is because the extra data pins down the average more and more precisely.
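The effect is easy to demonstrate with simulated data. The sketch below uses only Python’s standard library; the waist sizes are drawn from an assumed normal distribution, not real measurements. The “simple equation” here is the standard error of the mean (standard deviation divided by the square root of the sample size), quoted as roughly a 95% range to match the convention in the footnotes:

```python
import random
import statistics

# Sketch: the uncertainty on an average shrinks as more data arrives,
# even though the spread of the data itself stays the same.
# Waist sizes are simulated (assumed mean 35, spread 2.5 inches).
random.seed(42)

def standard_error(sample):
    """Standard error of the mean: stdev / sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5

for weeks in (1, 2, 4, 16):
    sample = [random.gauss(35, 2.5) for _ in range(100 * weeks)]
    half_width = 2 * standard_error(sample)  # ~95% range, as in the footnotes
    print(f"after {weeks:2d} week(s), n={100 * weeks:5d}: "
          f"mean = {statistics.mean(sample):.2f} ± {half_width:.2f} inches")
```

The printed “±” keeps narrowing week on week, while the underlying waists remain just as spread out as before.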

The uncertainty on an average can shrink just by including more data. This points to something important: the uncertainty on an average does not describe how spread out the original data is. If the tailor only makes clothes for the average waist of 35 ± 2 inches, they will clearly exclude many of their customers, whose waist sizes span 30 to 40 inches. Likewise, if a product states that it “reduces the homeowner’s bills by an average of 14 ± 0.1%”, all I know is that the average is well defined. I know nothing about the sample from which that average was calculated. The savings I can expect may still lie anywhere in a range like 14 ± 30%. If I want to be sure my bills won’t increase, I need to know the average savings along with some measure of how dispersed the data is. The sample range, interquartile range, or standard deviation could each serve as that measure. As long as some information is included that describes how spread out the underlying data is, the consumer can make a much more informed decision.
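For illustration, the dispersion measures mentioned above can all be computed with Python’s standard library. The percentage savings below are made up for the sketch, not real retrofit data:

```python
import statistics

# Illustrative (made-up) percentage savings for ten homes after a retrofit.
savings = [1, 5, 8, 11, 13, 15, 16, 19, 22, 30]

mean = statistics.mean(savings)
spread = max(savings) - min(savings)             # sample range
q1, q2, q3 = statistics.quantiles(savings, n=4)  # quartiles
iqr = q3 - q1                                    # interquartile range
sd = statistics.stdev(savings)                   # sample standard deviation

print(f"mean = {mean:.1f}%, range = {spread}%, IQR = {iqr:.1f}%, std dev = {sd:.1f}%")
# -> mean = 14.0%, range = 29%, IQR = 12.5%, std dev = 8.5%
```

The average alone (14%) hides the fact that individual homes in this sample saved anywhere from 1% to 30%; any one of the dispersion measures reveals it.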

Conclusion

There is nothing inherently wrong with quoting an average and the uncertainty on that average. Indeed, some audiences may want exactly that. If the government were installing a retrofit product in lots of homes, the average and its uncertainty would tell them what savings to expect at large scale. But the average and its uncertainty do not paint a full enough picture. All stakeholders, whether consumers, governments, or housing associations, will want to know the full range of possibilities for an individual home. Can I achieve even better savings than those quoted? Is it likely that the product will go wrong and make my efficiency worse? This information is contained in the average together with the sample dispersion. It is this information which, in my view, should be quoted as the headline on packaging and in reports. That would be fairer for both consumers and producers of retrofit products, and it would allow lower-risk retrofit products to be identified.

Other causes of misleading results include cherry-picked data, unrepresentative samples, and incorrect assumptions in models. But the statistical sleight-of-hand of quoting averages and their uncertainties is something I have seen with increasing frequency. And unfortunately, it tends to go unnoticed.

* This is an example of a “symmetric” uncertainty, in which the range of possible values is the same on both sides. The uncertainty does not have to be the same on each side, though. It might be 14% (+5%, −20%), which a customer should read as “the expected savings lie in the range −6% to 19%”.

** “Well behaved” refers to the data following a “normal” or “Gaussian” distribution. Lots of measured quantities do follow this distribution, and it makes the maths much easier when they do.

*** If our data does follow a normal distribution, then when we make statements like this we are actually saying there is a 95% chance that the value lies in that range. You could instead define your uncertainty with a 99.7% chance, or a 99.99% chance, in which case your range would get slightly larger. Note that you can never push it to 100%, though. Statistics is funny that way.
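For the curious, the way the range widens with the chosen confidence level, without ever reaching 100%, can be seen with Python’s `statistics.NormalDist`:

```python
from statistics import NormalDist

# Sketch: for normally distributed data, an uncertainty range is
# "mean ± k standard deviations", and k grows as the chosen confidence
# level approaches 100%. A 100% range would need k to be infinite,
# which is why it can never be reached.
z = NormalDist()  # the standard normal distribution

for confidence in (0.95, 0.997, 0.9999):
    k = z.inv_cdf(0.5 + confidence / 2)  # half-width in standard deviations
    print(f"{confidence:.2%} confidence -> mean ± {k:.2f} standard deviations")
```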
