In a single minute, 100,000 tweets are produced, 2 million searches on Google are made and more than 680,000 pieces of Facebook content are posted. How can we make sense of this staggering amount of data and put it to good use? – Text by Richard Hartung.
Imagine this: Walking along North Bridge Road on your way home from work, you get an SMS from your bank, offering a 20% discount on an anniversary bouquet at a flower shop 50 metres ahead. Having forgotten that it was your wedding anniversary, you quickly pop in to pick up the bouquet.
Here's another scenario: on your way to the airport the next morning to board your flight to Hong Kong, you get an offer from the airline for a restaurant at Pacific Place shopping mall. Since you're attending a conference at the Shangri-La Hotel right next to Pacific Place, it's quite tempting.
The real question, though, is how the companies had any idea that it was your wedding anniversary or that you were going to be near Pacific Place in Hong Kong. Was it simply an overzealous employee checking on you? Or was it your wife's musing on Facebook about what you were going to get her for your anniversary? The real answer is Big Data.
Newfound power in computing means companies are using their own data and sometimes pulling in everything, from GPS location data on your phone to the public postings on your Facebook wall, to find out more about you.
Big Data consists of the three "Vs" – Volume, Velocity and Variety. Volume means that there is an incredible amount of data. Velocity means it's whizzing by all the time, everywhere, from mobile phone chats and Google searches to CCTV cameras and companies churning out financial records. And variety means it ranges from structured data like tables that the Singapore Department of Statistics produces or financial reports, to unstructured data like Facebook musings and Twitter posts.
In fact, technology news portal ZDNet reported in 2012 that as much as 90% of the data in the world today was created in the last couple of years alone.
Rapid-fire advances in technology also enable people to use Big Data like never before. Software developer TIBCO Asia's Chief Technology Officer Kevin Pool told Challenge that in the past, anyone who wanted to use their organisation's vast amount of data usually had to go to a business intelligence analyst and wait a month. Now, cheaper memory chips and better software mean the average person gets the answers he needs by himself almost instantly.
The impact on our lives
The big deal about Big Data, Open Knowledge Foundation Founder Rufus Pollock told The Guardian, is "the mass democratisation of access, storage and processing of data. What matters is having the data that helps us solve a problem or address the question we have."
Because of that, Big Data has the power to change how governments, businesses and even individuals go about their daily lives. Big Data's ability to identify patterns and anomalies in an analysis of a large amount of data means that one can make predictions of wrongdoings and take preventive action. For example, using Big Data for fraud detection, tax collectors in the US would be able to spot anomalies more easily, making those considering improper tax filings think twice.
Elsewhere on the globe, researchers from the University of Ontario Institute of Technology in Canada have tapped Big Data to detect hospital-acquired infections in premature infants early. While machines monitoring the babies can detect subtle changes in body temperature, heart rate or blood pressure, the data stream is too rapid and abundant for humans to process quickly. Hence infections go undetected until they are life-threatening. Researchers have developed algorithms to analyse the Big Data in real-time so infections can be detected 24 hours before symptoms become visible.
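To give a feel for what "analysing the data stream" can mean, here is a minimal sketch of one common approach: comparing each new reading against a rolling baseline of recent readings. It is not the researchers' algorithm, which the article does not describe. The function name, window size, threshold and heart-rate values are all invented for illustration.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate sharply from the recent baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    recent = deque(maxlen=window)  # sliding window of the latest readings
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            baseline, spread = mean(recent), stdev(recent)
            if spread > 0 and abs(value - baseline) > threshold * spread:
                flagged.append(i)
        recent.append(value)
    return flagged


# Hypothetical heart-rate stream (beats per minute); values are made up.
heart_rate = [140, 142, 141, 139, 140, 141, 140, 175, 141, 140]
print(flag_anomalies(heart_rate))  # → [7]
```

A real clinical system would combine many signals and be validated against patient outcomes. The point here is only that a machine can apply such a check to every reading as it arrives, which a human watching the monitors cannot.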
Boosting security with real-time information
The US Department of Energy (DOE) needed to detect, locate and track potential threats to secure its border areas. So it first set up an elaborate sensing system that could collect huge amounts of acoustic data along the border areas. This deluge of data is fed into a programme that scans at hyperspeed and in real-time – it "listens" out for selected key sounds that could indicate human presence. The system is able to analyse 275Mbit of data (about 100 MP3 songs) in a fraction of a second; in contrast, humans would require hours to do the same. Now, the US border security staff use the results of the Big Data analysis, delivered to them on their computers in real-time, to decide how to respond to a threat.
Big Data can also boost the overall efficiency of government operations. In New York, the Department of Environmental Protection wanted to crack down on restaurants that were illegally dumping cooking oil into sewers. In the past, the New York Times reported, the health department would have sent inspectors to restaurants, hoping to catch the culprits in the act. Recently, they did it the smarter and quicker way – the city's Office of Policy and Strategic Planning, "a geek squad of civic-minded number-crunchers", unearthed records of local restaurants with a carting service to remove their grease. Matching restaurants that did not have a carter with geo-spatial data on the sewers, public officers informed inspectors of the statistically likely suspects. The success rate of catching the culprits was 95% and the problem was solved.
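The matching step can be pictured as a simple filter: take every restaurant, drop those with a carting service, and keep the ones sitting on a troubled stretch of sewer. The sketch below is purely illustrative. The restaurant names, sewer segment IDs and the idea of a "blocked segment" list are assumptions, not details of the city's actual method.

```python
# Hypothetical data: every name and ID below is invented for illustration.
# Each restaurant is mapped to its nearest sewer segment (assumed to come
# from a geo-spatial lookup done beforehand).
restaurants = {
    "Alpha Diner": "sewer-12",
    "Beta Grill": "sewer-12",
    "Gamma Cafe": "sewer-07",
    "Delta Noodles": "sewer-03",
}
has_carter = {"Alpha Diner", "Delta Noodles"}  # restaurants with a grease carter
blocked_segments = {"sewer-12", "sewer-03"}    # segments with grease problems


def likely_suspects(restaurants, has_carter, blocked_segments):
    """Return restaurants with no carter that sit on a blocked sewer segment."""
    return sorted(
        name
        for name, segment in restaurants.items()
        if name not in has_carter and segment in blocked_segments
    )


suspects = likely_suspects(restaurants, has_carter, blocked_segments)
print(suspects)  # → ['Beta Grill']
```

The shortlist does not prove anything on its own. It simply tells inspectors where to look first, which is how the approach was used in the New York example.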
Big Data in Singapore
In this country, Big Data is already starting to improve service delivery. The National Environment Agency (NEA) is using Big Data to fight dengue. It pulls in data from dengue cases, public feedback, mosquito inspections, mosquito virus serotypes and other sources for analysis. Making use of geographical information systems to identify high-risk areas, NEA is able to prioritise places for checks. Along with sharing its risk assessment with other government agencies to coordinate dengue control efforts, NEA provides information to the public at the national dengue website and via social media. The agency also puts up notices at high-risk zones.
On the transport front, the Singapore-MIT Alliance for Research and Technology (SMART) compared weather data and 830 million GPS records of 80 million taxi rides to find out why it's so hard to get a taxi on rainy days. The findings: cab drivers pull to the side of the road when it rains for fear of getting into accidents – they could have to pay about $1,000 because of a taxi company policy. This insight could shape policy and make a difference to public transport here.
While in the past only people like statisticians and programmers with PhDs might have been able to access all that data, more powerful software means that the ordinary public service officer, company employee or even the man in the street can now use Big Data.
In Singapore, government-wide initiatives are already underway to turn Big Data into something that public officers, from policymakers to front-line staff, can actually use.
"Big Data can be useful, but only if we know how to convert it into 'actionable information' so we can see what to look for," SMART CEO Rohan Abeyaratne told Challenge. "Researchers at SMART are working together with public agencies to do just this in the domain of transportation. The goal is to solve real world traffic and crowding problems."
To make sure Singapore has the resources it needs, the Infocomm Development Authority (IDA) is seeding early adoption of analytics in key industry sectors, formulating data policies, and collaborating with universities and polytechnics to roll out programmes that will ensure there are enough people with the right skills to work on Big Data. It is also developing a Government Business Analytics Programme to boost public sector capabilities in data analytics.
A few agencies are in the early stages of training their staff not just to use the software correctly, but also to collect the right type of data – there is such a massive amount out there that you need to identify and select what you need – and to know the right patterns to look out for in the analysis stage. Only then will the analysis be useful. (To find out more, check out "How to Read Things Right" in our May/June 2013 issue.)
Similarly, in the private sector, DBS Bank's Managing Director Ed Pinto said the goal is not to give staff data but to give them useful information. "Everything we put out to an operations or marketing person is in a form that tells them an action or gives them a measurement. There are tools they can use."
A PLANET that revolves around Big Data
To find out more about daily commuting behaviours in Singapore, the Land Transport Authority (LTA) rolled out the Planning for Land Transport Network (PLANET) project in 2010. One of the largest government data warehouses in Singapore, PLANET gives the LTA easy and timely access to historical and real-time data captured from more than 12 million public transport transactions every day. The data collected is then analysed and used to improve the commuting experience. For instance, PLANET is used to plan the recently launched Bus Service Enhancement Programme. The result – 180 new buses have been deployed to reduce crowding on buses and increase the frequency of bus services along heavy transport corridors.
Despite the benefits, Big Data does come with risks, such as drawing the wrong conclusions from data analytics. One example is Google Flu Trends, which significantly overestimated flu levels in the US during the winter season this year. If public health officials had overly relied on the Google data, misallocation of public resources could have resulted, said Microsoft Research's Principal Researcher Kate Crawford in a recent Harvard Business Review blog entry.
With all that data floating around, another risk is data privacy (see box below). And as Mr Pool said, consumers can find it "kind of spooky" when a company or government uses all that data.
While privacy protection is clearly important, people seem to accept usage of their data better if it benefits them, such as in the case of public safety, said Mr Pool. If a bomb goes off and law enforcement can instantly access information from in-store cameras, mobile phones and license plate registrations, it may be easier to catch the culprit.
There may be risks, but Big Data can enable us to do our job better. Finding the tools to access data, turning it into action and using it the right way will ensure every public officer rides the Big Data wave to improve public services.
Tackling privacy woes
Obviously, with the Big Data trend, concerns regarding how data about individuals is collected and used need to be addressed.
Besides the recent passing of the new Personal Data Protection Act, which aims to prevent private companies from misusing consumers' information, the Public Service also needs to adhere to a set of rules when managing public data. Officers can check out the rules in the IT Management section of the IM on the Government intranet.
For the individual, the Personal Data Protection Commission, in charge of educational and outreach efforts to encourage data protection, has some tips:
- Think before you give your personal data to anyone. Look at website privacy notices for information on why your data is needed and how it would be disclosed to third parties.
- Destroy any document with your personal details before you bin it. For instance, you can shred it first.
- Tweak your browser settings if you don't want to be tracked.
Visit bit.ly/pdpcdata for more tips on how you can better protect your data.
Richard Hartung is the trainer of the 3 Day MBA in Card and Payments with Terrapinn Training.