Approaches to Big Data analytics when your data is worse than bad

In Data & Analytics by Freya Smale


Bill Beckler, Head of Innovation, joined us at Big Data World yesterday to discuss "Approaches to Big Data analytics when your data is worse than bad". In his presentation he covered:

  • How can you apply analytics to a mess of data?
  • What steps need to be taken before data can be applied?


Guest blogger Paul Booth shares his thoughts.

Bill Beckler, Head of Innovation, began his talk with an interesting metaphor for big data business intelligence drawn from the Battle of Britain. His main point can be summed up as ‘garbage in, garbage out’, referring in his example to the inflated claims by Nazi pilots about the number of British planes they had shot down. Working with low-quality data, or data that is just plain wrong, is a common problem, he suggested.

Working at high granularity in big data introduces issues around non-intuitive dynamics. "Data is evil!" Beckler exclaims. We sit waiting for a good follow-up to such a wild statement, and it is delivered with a further visual example to illustrate the speaker's point. Some people are waiting for more data to arrive to justify the data they already have: is it really 56% of your customers who are buying product X? Waiting for more data is not the answer; there are investments to be made based on the limited data companies already have.

Bill's biggest sell was on Bayesian statistics. It is the new statistics, he says, and the revolution is ‘gigantic’. He certainly did not underplay its importance: in his view, Bayesian statistics is the answer to reasoning about information for many companies. Don't rely on the books, though, because we're told that not all machine learning books work through Bayesian statistics properly. So what should companies do? Use R, and employ people who know how to do Bayesian analysis in R; the packages are all there to use.
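To make the Bayesian point concrete, here is a minimal sketch of the "is it really 56%?" question from earlier in the talk. It is not taken from Beckler's slides: the customer counts, the uniform Beta(1, 1) prior and the function names are all assumptions made for illustration, and it uses Python's standard library rather than the R packages he recommended.

```python
import random


def beta_posterior(successes, trials, prior_alpha=1.0, prior_beta=1.0):
    """Return (alpha, beta) of the Beta posterior for a proportion.

    Assumes a Beta prior; the default Beta(1, 1) is uniform on [0, 1].
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return prior_alpha + successes, prior_beta + (trials - successes)


def credible_interval(alpha, beta, mass=0.95, draws=100_000, seed=0):
    """Approximate an equal-tailed credible interval by sampling the posterior."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - mass) / 2.0
    lo = samples[int(tail * draws)]
    hi = samples[int((1.0 - tail) * draws) - 1]
    return lo, hi


# Hypothetical data: 14 of 25 sampled customers bought product X (56%).
small = beta_posterior(14, 25)
# Same observed rate, from a hypothetical sample 100 times larger.
large = beta_posterior(1400, 2500)

for label, (a, b) in (("n=25", small), ("n=2500", large)):
    mean = a / (a + b)
    lo, hi = credible_interval(a, b)
    print(f"{label}: posterior mean {mean:.3f}, 95% interval {lo:.3f} to {hi:.3f}")
```

Both samples show the same headline figure, but the interval for the small sample is far wider and includes rates below 50%. The sketch does not say whether to wait for more data; it only puts a number on how uncertain the limited data is, which is what an investment decision made now would need.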

Nine out of 10 data disparities are tracking issues and not actual opportunities: changes in the way observations are counted and stored lead to frequent errors when comparing two datasets compiled differently about the same phenomenon. ‘Data is rotten fruit’, says Bill, arguing that data collection is as important as data analysis in picking out the ripe and fruity bits of the data you have. With all these complications, ROI is of course hard to measure. We are taken through examples of advertising sold on video channels (YouTube), and how big data is behind those ads at the bottom of every video and the decisions on what to advertise. Some companies map data about users across other sites to serve better and more relevant adverts. The rule of thumb here is that those who pay the most for ads on YouTube and other sites do so because they have the worst analytics.

Beckler finishes his talk by urging employers to teach listening skills and invest in training, especially for analysts, because "if you're not listening the barriers are difficult to overcome." It's good for relationships with data suppliers, and bringing teams together is part of the solution for everyone who wants to avoid common data roadblocks. The short list to remember:

1. listening = better relationships
2. relationships are power
3. you will rule!

Answering a question at the end, Beckler advises the audience to find new people by running contests and testing people's skills in all the areas you need on a technical level. Perhaps, with enough good data on a person, companies might be able to tell if they have good relationship skills too? Tell us what you think below.


Presentations from Big Data World Europe 2012 will be available from the 24th September.