
Just how accurate are Opinion Polls?

Just after the Brexit referendum result became known, in late June 2016, several newspapers ran stories on how the opinion polls “got it wrong”.

Typical of these was an article in the Guardian from 24th June 2016, with the headline “How the pollsters got it wrong on the EU referendum.”  In it the journalist observed:

“Of 168 polls carried out since the EU referendum wording was decided last September, fewer than a third (55 in all) predicted a leave vote.”

Of course, this is neither the first nor the last time pollsters have come in for criticism from the media.  (Not that this seems to stop journalists writing articles about opinion polls.)

But sensationalism aside, how accurate are polls?  In this article, I’ll explore how close (or far away) the polls came to predicting the Brexit result, and what lessons we might draw from this for the future.

The Brexit Result

On 23rd June 2016, the UK voted by a very narrow margin (51.9% to 48.1%) in favour of Brexit.  However, if we just look at polls conducted near to the referendum, the general pattern was to predict a narrow result.  In that respect the polls were accurate. 

Taking an average of all these polls, the pattern for June showed an almost 50/50 split, with a slight edge in favour of the leave vote.  So, polls taken near to the referendum predicted a narrow result (which it was) and, if averaged, just about predicted a leave result (which happened).

Chart: Brexit referendum result vs. poll predictions

To compare the predictions with the results, I’ve excluded people who were ‘undecided’ at the time of the surveys, since anyone still undecided on the day would presumably not have voted at all.
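
For anyone wanting to replicate this, the re-basing is straightforward.  Here is a minimal sketch in Python, using made-up figures rather than any real poll, that removes the undecided share and re-expresses Leave and Remain as shares of those who had picked a side:

    # Re-base a poll to exclude 'undecided' respondents (illustrative figures only).
    leave, remain, undecided = 44, 45, 11      # raw percentages from a hypothetical poll

    decided = leave + remain
    leave_share = 100 * leave / decided        # ~49.4% of decided voters
    remain_share = 100 * remain / decided      # ~50.6% of decided voters
    print(f"Leave {leave_share:.1f}% vs Remain {remain_share:.1f}%")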

Of course, the polls did not get it spot on.  But that is because we are dealing with samples, and samples always carry a margin of error, so they cannot be expected to be spot on.

Margin of error

The average sample size of the polls run during this period was 1,799 (some had samples as low as 800; others, several thousand).  On a sample of 1,799, a 50/50 result would carry a margin of error of +/- 2.3%.  That means that if such a poll showed 50% of people intending to vote Leave, we could be 95% confident that somewhere between roughly 48% and 52% actually would.
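
As a quick check on that 2.3% figure, here is a minimal sketch of the standard margin-of-error calculation for a proportion at the 95% confidence level (1.96 standard errors), using the average sample size quoted above:

    import math

    def margin_of_error(p, n, z=1.96):
        """95% margin of error for an observed proportion p on a sample of size n."""
        return z * math.sqrt(p * (1 - p) / n)

    print(f"+/- {100 * margin_of_error(p=0.5, n=1799):.1f}%")   # +/- 2.3%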

In the end, the average of all these polls came within 1.7% of the exact result.  That’s close!  It’s certainly within the margin we’d expect.

You might wonder why polls don’t use bigger samples to improve that margin.  If a result looks like it will be close, you’d think it worth using a sample large enough to reduce the error margin.

Why not, for example, use a sample big enough to reduce the statistical error margin to 0.2% – a level that would provide a very accurate prediction?  To achieve that you’d need a sample of around 240,000!  That’s a survey costing a whopping 133 times more than the typical poll!  And that’s a cost people who commission polls would be unwilling to bear.
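
Rearranging the same margin-of-error formula gives the sample size needed to hit a chosen margin, which is roughly where the 240,000 figure comes from (and the 133x comparison simply assumes that cost scales more or less in line with sample size):

    import math

    def sample_size_needed(moe, p=0.5, z=1.96):
        """Sample size required to achieve a given margin of error at 95% confidence."""
        return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

    n = sample_size_needed(moe=0.002)    # for a +/- 0.2% margin
    print(n, round(n / 1799))            # ~240,100 respondents, ~133x the average poll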

Data Collection

Not all polls are conducted in the same way, however.  Different pollsters have different views as to the best ways to sample and weight their data.  Most of these differences are minor, and all reflect the pollster’s experience of which approaches have delivered the most accurate results in the past.  Taking a basket of several polls together creates a prediction more likely to iron out any outliers or oddities arising from such differences.

However, there is one respect in which polls fall into two potentially very different camps when it comes to methodology.  Some are conducted online, using self-completed surveys, where the sample is drawn from online consumer panels.  Others are conducted by telephone, using randomly selected telephone sample lists.

Both have their potential advantages and disadvantages:

  • Online: not everyone is online, and not everyone is easy to contact online.  In particular, older people may use the internet less often.  So any online sample risks under-representing people with limited internet access.
  • Telephone: not everyone is accessible by phone.  Many telephone sample lists are better at reaching people on landlines than on mobiles, which can make it difficult to reach younger people who have no landline, or people registered with the Telephone Preference Service.

But, that said, do these potential gaps make any difference?

Online vs Telephone Polling

So, returning to the Brexit result, is there any evidence to suggest either methodology provides a more accurate result?

Chart: Brexit result vs. online and telephone polls

A simple comparison between the results predicted by the online polls vs the telephone polls conducted immediately prior to the referendum reveals the following:

  • Telephone polls: Overall, the average for these polls predicted a 51% majority in favour of Remain.
  • Online polls: Overall, the average for these polls predicted a win for Leave by 50.5% (in fact it was 51.9%).

On the surface of things, the online polls appear to provide the more accurate prediction.  However, it’s not quite that simple.

Online polls are cheaper to conduct than telephone polls.  As a result, online polls can often afford to use larger samples, which reduces the level of statistical error.  In the run-up to the referendum, the average online poll used a sample of 2,406 vs. an average of 1,038 for telephone polls.

The greater accuracy of the online polls over this period could therefore be largely explained by the fact that they used larger samples.  As telephone is a more expensive medium, it is undeniably easier to achieve a larger sample via the online route.
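
To put rough numbers on that, applying the same margin-of-error calculation to the two average sample sizes suggests roughly +/- 2% for the typical online poll against +/- 3% for the typical telephone poll:

    import math

    def margin_of_error(p, n, z=1.96):
        return z * math.sqrt(p * (1 - p) / n)

    for method, n in [("online", 2406), ("telephone", 1038)]:
        print(f"{method}: +/- {100 * margin_of_error(0.5, n):.1f}%")
    # online: +/- 2.0%, telephone: +/- 3.0%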

Accuracy over time

You might expect that, as people get nearer to the time of an election, they become more likely to have decided how they will vote.

However, our basket of polls from the month leading up to the Brexit vote shows no sign that the proportion of ‘undecided’ respondents changed.  During the early part of the month, around 10% consistently stated they had not decided; closer to the referendum, this figure remained much the same.

However, when we compare polls conducted in early June with those conducted later, we see an interesting contrast.  As it turns out, polls conducted early in June came closer to the actual result than those conducted nearer to the referendum.

In fact, the polls appear to have detected a shift in opinion around the time of the assassination of the MP, Jo Cox.

Chart: Brexit result vs. polls taken before and after the killing of Jo Cox

Clearly, the average of the early-June polls predicts a result very close to the final one.  The basket of later polls, however, despite the advantage of larger samples, is off the mark by a significant margin.  It is these later polls that reinforced the impression in some people’s minds that the country was likely to vote Remain.

But why?

Reasons for mis-prediction

Of course, it is difficult to explain why surveys conducted so close to the event showed a result some way off the final numbers.

If we look at opinion surveys conducted several months before the referendum, differences are easier to explain.  People change their minds over time, and those who are wavering eventually make them up.

A referendum held in January 2016 would have delivered a slightly different result to the one in June 2016, partly because a slightly different mix of people would have voted, and partly because some people held a different opinion in January to the one they held in June.

However, by June 2016, you’d expect that a great many people would have made up their minds.

Logically, however, I can think of four reasons why polls conducted during this period might mis-predict the result:

  1. Explainable statistical error margins.
  2. Unrepresentative approaches.
  3. Expressed intentions did not match actual behaviour.
  4. “Opinion Magnification”.

Explainable statistical error margins

Given the close nature of the vote, this is certainly a factor.  Polls of the size typically used here would find it very difficult to precisely predict a near 50/50 split. 

51.9% voted Leave.  A poll of 2,000 could easily have put Leave on 49.7% (a narrow reverse result) and still be within an acceptable statistical margin of error.
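
As a worked check on that claim, here is the 95% confidence interval around a hypothetical poll of 2,000 showing 49.7% for Leave.  The interval only just stretches up to the actual 51.9% result, which is precisely what makes a near 50/50 vote so hard to call:

    import math

    # 95% confidence interval for a hypothetical poll of 2,000 putting Leave on 49.7%.
    p, n = 0.497, 2000
    moe = 1.96 * math.sqrt(p * (1 - p) / n)
    print(f"+/- {100 * moe:.1f}%: {100 * (p - moe):.1f}% to {100 * (p + moe):.1f}%")
    # +/- 2.2%: 47.5% to 51.9% -- the upper bound only just reaches the actual result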

18 of the 31 polls (58%) conducted in June 2016 returned results within the expected margin of statistical error of the final result.  Where these polls called the outcome wrong (as 3 did), that can be explained purely by the fact that the sample size was not big enough.

However, this means that 13 returned results that can’t be accounted for by expected statistical error alone. 

If we look at surveys conducted in early June, 6 returned results outside the expected bounds of statistical variance.  However, they were typically not far outside those bounds (just 0.27% beyond them, on average).

The same cannot be said of surveys conducted later in June.  Here, polls were missing the mark by an average of 1.28% beyond the expected range, and all 7 of the surveys that fell outside the expected statistical range predicted a Remain win.

This is too much of a coincidence.  Something other than simple statistical error must have been at play.

Unrepresentative approaches

Not everyone is willing (or able) to answer Opinion Polls. 

Sometimes a sample will contain biases.  People without landlines would be harder to reach for a telephone survey.  People who never or rarely go online will be less likely to complete online surveys.

These days many pollsters make a point of promising a ‘quick turnaround’.  Some will boast that they can complete a poll of 2,000 interviews online in a single day.  That kind of turnaround is great news for a fast-paced media world but will under-represent infrequent internet users.

ONS figures for 2016 showed that regular internet use was virtually universal amongst the under-55s.  However, 12% of 55–64-year-olds, 26.9% of 65–74-year-olds and 61.3% of the over-75s had not been online in the three months prior to June 2016.  Older people were more likely to vote Leave.  But were the older people who don’t go online more likely to have voted Leave than those who do?
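
Pollsters try to correct for this kind of under-representation by weighting.  As a minimal sketch, with purely illustrative shares rather than real ONS or poll figures, each respondent can be given a weight equal to their group’s share of the population divided by that group’s share of the achieved sample:

    # Post-stratification weighting with illustrative (not real) age-group shares.
    population_share = {"18-54": 0.60, "55-74": 0.30, "75+": 0.10}
    sample_share     = {"18-54": 0.70, "55-74": 0.25, "75+": 0.05}

    weights = {group: population_share[group] / sample_share[group]
               for group in population_share}
    print(weights)   # {'18-54': ~0.86, '55-74': 1.2, '75+': 2.0}

Weighting only helps, though, if the older people a poll does reach hold broadly the same views as those it cannot reach – which is exactly the question posed above.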

It is hard to measure the effect of such biases.  Was there anything about those who could not / would not answer a survey that means they would have answered differently?  Do they hold different opinions?

However, such biases won’t explain why the surveys of early June proved far better at predicting the result than those undertaken closer to the vote. 

Expressed intention does not match behaviour

Sometimes, what people do and what they say are two different things.  This probably doesn’t apply to most people.  However, we all know there are a few people who are unreliable.  They say they will do one thing and then go ahead and do the opposite.

Also, it is only human to change your mind.  Someone who planned to vote Remain in April, might have voted Leave on the day.  Someone undecided in early June, may have voted Leave on the day.  And some would switch in the other direction.

Without being able to link a survey answer to an actual vote, there is no way to test the extent to which people’s stated intentions fail to match their actual behaviour.

However, again, this kind of switching does not adequately explain the odd phenomenon we see in June polling.  How likely is it that people who planned to vote Leave in early June, switched to Remain later in the month and then switched back to Leave at the very last minute?  A few people maybe, but to explain the pattern we see, it would have to have been something like 400,000 people.  That seems very unlikely.

The Assassination of Jo Cox

This brings us back to the key event on 16 June – the assassination of Jo Cox.  Jo was a Labour politician who strongly supported the Remain campaign and was a well-known champion of ethnic diversity.  Her assassin was a right-wing extremist who held virulently anti-immigration views.

A significant proportion of Leave campaigners cited better immigration control as a key benefit of leaving the EU.  Jo’s assassin was drawn from the most extremist fringe of such politics.

The boost in the Remain vote recorded in the polls that followed her death was attributed at the time to a backlash against the assassination: the idea being that some people, shocked by the implications of the incident, were persuaded to vote Remain, since doing so might be seen as an active rejection of the kind of extreme right-wing politics espoused by Jo’s murderer.

At the time it seemed a logical explanation.  But as we now know, it turned out not to be the case on the day.

Reluctant Advocates

There will be some people who will, by natural inclination, keep their voting intentions secret. 

Such people are rarely willing to express their views in polls, on social media, or even in conversation with friends and family.  In effect they are Reluctant Advocates.  They might support a cause but are unwilling to speak out in favour of it.  They simply don’t like drawing attention to themselves.

There is no reason to suspect that this relatively small minority would necessarily be skewed any more or less to Leave or Remain than everyone else.  So, in the final analysis, it is likely that the Leave and Remain voters among them will cancel each other out. 

The characteristic they share is a reluctance to make their views public.  However, the views they hold beyond this are not necessarily any different from most of the population.

An incident such as the assassination of Jo Cox can have one of two effects on public opinion (indeed it can have both):

  • It can prompt a shift in public opinion which, given the result, we now know did not happen.
  • It can prompt Reluctant Advocates to become vocal, resulting in a phenomenon we might call Opinion Magnification.

Opinion Magnification

Opinion Magnification creates the illusion that public opinion has changed or shifted to a greater extent than it actually has.  This will not only be detected in Opinion Polls but also in social media chatter – indeed via any media through which opinion can be voiced.

The theory is that the assassination of Jo Cox shocked Remain-supporting Reluctant Advocates into becoming more vocal, while having the opposite effect on Leave-supporting Reluctant Advocates.

The vast majority of Leave voters would clearly not have held the kind of extremist views espoused by Jo’s assassin.  Indeed, most would have been shocked and would naturally have tried to distance themselves from the views of the assassin as much as possible.  This fuelled the instinct of Leave-voting Reluctant Advocates to keep a low profile and discouraged them from sharing their views.

If this theory is correct, it would explain the slight uplift in the apparent Remain vote in the polls.  This artificial uplift, or magnification, of Remain-supporting opinion would not have occurred without the trigger event of 16 June 2016.

Of course, it is very difficult to prove that this is what actually occurred.  However, it does appear to be the only explanation that fits the pattern we see in the polls during June 2016.

Conclusions

Given the close result of the 2016 referendum, it was always going to be a tough prediction for pollsters.  Most polls will only be accurate to around +/- 2% anyway, so it was always a knife-edge call.

However, in this case, the polls in the days leading up to the vote were not just out by around 2% in a few cases.  They were out by around 3% on average, predicting a result that was the reverse of the actual outcome.

Neither statistical error, potential biases nor any disconnect between stated and actual voting behaviour can adequately account for the pattern we saw in the polls. 

A more credible explanation is distortion by Opinion Magnification, prompted by an extraordinary event.  However, as the polling average shifted by no more than 2-3%, the potential impact of this phenomenon appears to be quite limited.  Indeed, in a less closely contested vote, it would probably not have mattered at all.

Importantly, all this does not mean that polls should be junked.  But it does mean that they should not be viewed as gospel.  It also means that pollsters and journalists need to be alert for future Opinion Magnification events when interpreting polling results.

About Us

Synchronix Research offers a full range of market research services, polling services and market research training.  We can also provide technical content writing services & content writing services in relation to survey reporting and thought leadership.

For any questions or enquiries, please email us: info@synchronixresearch.com

You can read more about us on our website.  

You can catch up with our past blog articles here.

Sources, references & further reading:

How the pollsters got it wrong on the EU referendum, Guardian 24 June 2016

ONS data on internet users in the UK

Polling results from Opinion Polls conducted prior to the referendum as collated on Wikipedia

FiveThirtyEight – for Nate Silver’s views on polling accuracy

Working with Digital Data Part 2 – Observational data

One of the most important changes brought about by the digital age is the availability of observational data.  By this I mean data that records actual online consumer behaviour.  A good example would be tracing the journey a customer takes when buying a product.

Of course, we can also find a lot of online data relating to attitudes and opinions but that is less revolutionary.  Market Research has been able to provide a wealth of that kind of data, more reliably, for decades.

Observational data is different – it tells us about what people actually do, not what they think (or what they think they do).  This kind of behavioural information was historically very difficult to get at any kind of scale without spending a fortune.  Not so now.

In my earlier piece I looked at attitudinal and sentiment-related digital data.  In this piece I want to focus on observational behavioural data, exploring its power and its limitations.

Memory vs reality

I remember, back in the 90s and early 2000s, it was not uncommon to be asked to design market research surveys aimed at measuring actual behaviour (as opposed to attitudes and opinions). 

Such surveys might aim to establish things like how much people were spending on clothes in a week, or how many times they visited a particular type of retail outlet in a month, etc.  This kind of research was problematic.  The problem lay with people’s memories.  Some people can recall their past behaviour with exceptional accuracy.  However, others literally can’t remember what they did yesterday, let alone recall their shopping habits over the past week.

The resulting data only ever gave an approximate view of what was happening BUT it was certainly better than nothing.  And, for a long time, ‘nothing’ was usually the only alternative.

But now observational data, collected in our brave new digital world, goes some way to solving this old problem (at least in relation to the online world).  We can now know for sure the data we’re looking at reflects actual real-world consumer behaviour, uncorrupted by poor memory.

Silver Bullets

Alas, we humans are indeed a predictable lot.  New technology often comes to be regarded as a silver bullet.  Having access to a wealth of digital data is great – but we still should not automatically expect it to provide us with all the answers.

Observational data represents real behaviour, so that’s a good starting point.  However, even this can be misinterpreted.  It can also be flawed, incomplete or even misleading.

There are several pitfalls we ought to be mindful of when using observational data.  If we keep these in mind, we can avoid jumping to incorrect conclusions.  And, of course, if we avoid drawing incorrect conclusions, we avoid making poor decisions.

Correlation in data is not causation

It may be an old adage in statistics, but it is even more relevant today than ever before.  For my money, Nate Silver hit the nail on the head:

“Ice cream sales and forest fires are correlated because both occur more often in the summer heat. But there is no causation; you don’t light a patch of the Montana brush on fire when you buy a pint of Häagen-Dazs.”

[Nate Silver]

Finding a relationship in data is exciting.  It promises insight.  But, before jumping to conclusions, it is worth taking a step back and asking if the relationship we found could be explained by other factors.  Perhaps something we have not measured may turn out to be the key driver.

Seasonality is a good example.  Did our sales of Christmas decorations go up because of our seasonal ad campaign or because of the time of year?  If our products are affected by seasonality, then our sales will go up at peak season, but so will those of our competitors.  So perhaps we need to look at how market share has changed, rather than raw sales numbers, to see the real impact of our ad campaign.
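
A small worked example, with invented numbers, shows why market share is the better yardstick here: raw sales can rise simply because the whole market rises at peak season.

    # Invented figures: our sales vs the total market, off-season and at Christmas.
    our_sales    = {"off_season": 100, "peak_season": 180}
    market_sales = {"off_season": 1000, "peak_season": 2000}

    for period in our_sales:
        share = 100 * our_sales[period] / market_sales[period]
        print(f"{period}: sales {our_sales[period]}, market share {share:.0f}%")
    # Sales jump 80%, yet market share falls from 10% to 9% --
    # the apparent uplift is seasonality, not the ad campaign.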

Unrepresentative Data

Early work with HRT seemed to suggest that women on HRT were less susceptible to heart disease than other women.  This was based on a large amount of observed data.  Some theorised that HRT treatments might help prevent heart disease. 

The data was real enough.  Women who were on HRT did experience less heart disease than other women.

But the conclusion was utterly wrong.

The problem was that, in the early years of HRT, women who accessed the treatment were not representative of all women. 

As it turned out they were significantly wealthier than average.  Wealthier women tend to have access to better healthcare, eat healthier diets and are less likely to be obese.  Factors such as these explained their reduced levels of heart disease, not the fact that they were on HRT.

Whilst the completeness of digital data sets is improving all the time, we still often find ourselves working with incomplete data.  When that is the case, it is always prudent to ask: is there anything we’re missing that might explain the patterns we are seeing?

Online vs Offline

Naturally, digital data is a measure of life in the online world.  For some brands this will give full visibility of their market, since all, or almost all, of their customers primarily engage with them online.

However, some brands have a complex mix of online and offline interactions with customers.  As such it is often the case that far more data exists in relation to online behaviour than to offline.  The danger is that offline behaviour is ignored or misunderstood because too much is being inferred from data collected online.

This carries a real risk of data myopia, leading to us becoming dangerously over-reliant on insights gleaned from an essentially unrepresentative data set. 

Inferring influence from association

Put simply – do our peers influence our behaviour?  Or do we select our peers because their behaviour matches ours?

Anna goes to the gym regularly and so do most of her friends.  Let’s assume both statements are based on valid observation of their behaviour.

Given such a pattern of behaviour it might be tempting to conclude that Anna is being influenced by ‘herd mentality’. 

But is she? 

Perhaps she chose her friends because they shared similar interests in the first place, such as going to the gym? 

Perhaps they are her friends because she met them at the gym?

To identify the actual influence, we need to understand the full context.  Just because we can observe a certain pattern of behaviour does not necessarily tell us why that pattern exists.  And if we don’t understand why a certain pattern of behaviour exists, we cannot accurately predict how it might change.

Learning from past experiences

Observational data measures past behaviour.  This includes very recent past behaviour of course (which is part of what makes it so useful).  Whilst this is a useful predictor of future behaviour, especially in the short term, it is not guaranteed.  Indeed, in some situations, it might be next to useless. 

But why?

The fact is that people (and therefore markets) learn from their past behaviour.  If past behaviour led to an undesirable outcome, they will likely behave differently when confronted with a similar situation in future.  They will only repeat past behaviour if the outcome was perceived to be beneficial.

It is therefore useful to consider the outcomes of past behaviour in this light.  If you can be reasonably sure that you are delivering high customer satisfaction, then it is less likely that behaviour will change in future.  However, if satisfaction is poor, then there is every reason to expect that past behaviour is unlikely to be repeated. 

If I know I’m being watched…

How data is collected can be an important consideration.  People are increasingly aware their data is being collected and used for marketing purposes.  The awareness of ‘being watched’ in this way can influence future behaviour.  Some people will respond differently and take more steps than others to hide their data.

Whose data is being hidden?  Who is modifying their behaviour to mitigate privacy concerns?  Who is using proxy servers?  These questions will become increasingly pressing as the use of data collected digitally continues to evolve.  Will a technically savvy group of consumers emerge who increasingly mask their online behaviour?  And how significant will this group become?  And how different will their behaviour be to that of the wider online community?

This could create issues with representativeness in the data sets we are collecting.  It may even lead to groups of consumers avoiding engagement with brands that they feel are too intrusive.  Could our thirst for data, in and of itself, put some customers off?  In certain circumstances – certainly yes.  This is already happening.  I certainly avoid interacting with websites with too many ads popping up all over the place.  If a large ad pops up at the top of the screen, obscuring nearly half the page, I click away from the site immediately.  Life is way too short to put up with that annoying nonsense.

Understanding why

By observing behaviour, we can see, often very precisely, what is happening.  However, we can only seek to deduce why it is happening from what we can see. 

We might know that person X saw digital advert Y on site Z and clicked through to our website and bought our product.  Those are facts. 

But why did that happen?

Perhaps the advert was directly responsible for the sale.  Or perhaps person B recommended your product to person X in the bar the night before; person X then saw your ad the next day and clicked on it.  In that case the ad played only a secondary role in selling the product – an offline recommendation was key.  Unfortunately, that key interaction occurred offline, so it went unobserved.

Sometimes the only way to find out why someone behaved in a certain way is to ask them.

Predicting the future

Forecasting the future for existing products using observational data is a sound approach, especially when looking at the short-term future.

Where it can become more problematic is when looking at the longer term.  Market conditions may change, competitors can launch new offerings, fashions shift etc.  And, if we are looking to launch a new product or introduce a new service, we won’t have any data (in the initial instance) that we can use to make any solid predictions.

The question we are effectively asking is about how people will behave and has little to do with how they are behaving today.  If we are looking at a truly ground-breaking new concept then information on past behaviour, however complete and accurate, might well be of little use.

So, in some circumstances, the most accurate way to discover likely future behaviour is to ask people.  What we are trying to do is to understand attitudes, opinions, and preferences as they pertain to an (as yet) hypothetical future scenario.

False starts in data

One problematic area for digital marketing (or indeed all marketing) campaigns is false starts.  AI tools are improving in sophistication all the time.  However, they all work in a similar way (sketched in the short example after this list):

  • The AI is provided with details of the target audience.
  • The AI starts with an initial experiment.
  • It observes the results.
  • It then modifies its approach based on what it learns.
  • The learning process is iterative: the longer a campaign runs, the more the AI learns and the more effective it becomes.
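
To make that loop concrete, here is a minimal sketch of an epsilon-greedy approach to choosing which audience segment to serve an ad to.  This is not how any particular vendor’s AI actually works, and the segments and click rates are made up; it simply illustrates the experiment-observe-adjust cycle described in the list above.

    import random

    # Hypothetical audience segments with unknown 'true' click rates (made up).
    true_click_rate = {"segment_A": 0.02, "segment_B": 0.05, "segment_C": 0.01}
    shows  = {s: 0 for s in true_click_rate}
    clicks = {s: 0 for s in true_click_rate}

    def observed_rate(segment):
        return clicks[segment] / shows[segment] if shows[segment] else 0.0

    random.seed(42)
    for _ in range(10_000):
        # Explore 10% of the time; otherwise exploit the best-performing segment so far.
        if random.random() < 0.1:
            segment = random.choice(list(true_click_rate))
        else:
            segment = max(true_click_rate, key=observed_rate)
        shows[segment] += 1
        if random.random() < true_click_rate[segment]:   # simulate the user's response
            clicks[segment] += 1

    print({s: (shows[s], round(observed_rate(s), 3)) for s in shows})
    # With enough iterations, impressions drift towards segment_B, the best performer.

If the brief points the system at the wrong audience to begin with, it has to spend budget and time discovering that before it can do anything useful – which is the false start problem discussed below.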

However, how does the AI know what target audience it should aim for in the first instance?  In many cases the digital marketing agency determines that based on the client brief.  That brief is usually written by a human and should (ideally) provide a clear answer to the question “what is my target market?”

That tells the Agency and, ultimately, the AI, who it should aim for.

However, many people, unfortunately, confuse the question “what is my target market?” with “what would I like my target market to be in an ideal world?”  This is clearly a problem and can lead to a false start.

A false start is where, at the start of a marketing campaign, the agency is effectively told to target the wrong people.  Therefore, the AI starts by targeting the wrong people and has a lot of learning to do!

A solid understanding of the target market in the first instance can make all the difference between success and failure.

Balancing data inputs

The future will, no doubt, give us access to a greater volume and variety of better-quality digital data.  New tools, such as AI, will help make better sense of this data and put it to work more effectively.  The digital revolution is far from over.

But how, when, and why should we rely on such data to guide our decisions?  And what role should market research (based on asking people questions rather than observing behaviour) play?

Horses for courses

The truth is that observed data acquired digitally is clearly better than market research for certain things. 

Most obviously, it is better at measuring actual behaviour and using it for short-term targeting and forecasting. 

It is also, under the right circumstances, possible to acquire it in much greater (and hence more statistically reliable) quantities.  Crucially, as a rule, it is possible to acquire a large amount of data relatively inexpensively compared to a market research study.

However, observed historic data is better at telling us ‘what’, ‘when’ and ‘how’ than it is at telling us ‘why’ or ‘what next’.  We can only seek to deduce the ‘whys’ and the ‘what nexts’ from the data.  In essence, it measures behaviour very well but captures opinion, and potential shifts in future intention, poorly.

The role of market research

Question-based market research surveys are (or at least should be) based on structured, representative samples.  They can be used to fill in the gaps digital data can’t cover – in particular, they measure opinion very well and are often better equipped to answer the ‘why’ and ‘what next’ questions than observed data (or attitudinal digital data).

Where market research surveys struggle is in measuring detailed past behaviour accurately (due to the limitations of human memory), even if they can measure it approximately.

The only reason for using market research to measure behaviour now is to provide an approximate measure that can be linked to opinion-related questions on the same survey – to tie the ‘why’ to the ‘what’.

Thus, market research can tell us how the opinions of people who regularly buy products in a particular category differ from those of less frequent buyers.  Digital data can usually tell us, more accurately, who has bought what and when – but that data is often not linked to attitudinal data that explains why.

Getting the best of both data worlds

Obviously, it does not need to be an either/or question.  The best insight comes from using digital data in combination with a market research survey.

With a good understanding of the strengths and weaknesses of both approaches it is possible to obtain invaluable insight to support business decisions.

About Us

Synchronix Research offers a full range of market research services and market research training.  We can also provide technical content writing services.

You can read more about us on our website.  

You can catch up with our past blog articles here.

If you’d like to get in touch, please email us.

Sources, references & further reading:

Observational Data Has Problems. Are Researchers Aware of Them? GreenBook Blog, Ray Poynter, October 2020
