Understanding Big Data

Big data is a term that has been thrown around quite extensively over the last couple of years, and in the process has been misused, misaligned, misconceived and misinterpreted.

At its heart, big data describes the sudden explosion of data brought about by the proliferation of smartphones, tablets, sensors, scanners, machines and any other receptacle of electronic information, but the concept is far more encompassing than that.

IDC defines big data as a new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis.

The real problem of big data is not so much volume. Technologies are continually evolving to manage the growth of data, and the Hadoop Distributed File System seems to be the emerging standard that most solutions are adopting. The real problem lies in the variety and velocity of data.

Big data is messy. It is unstructured and does not fit neatly into the rows and columns of a relational database. It is varied and comes in different types and from different sources.

Organisations are now collecting social media feeds, images, streaming video, text files, documents, telemetry data and so on, reading everything from sentiment, to expression, to electronic forms, to genomes, to soil temperatures and pH levels. This variety of data is hard to render into a structured format and almost impossible for a standard query language to interpret.

Data is being created as fast as it is being collected. High-velocity and streaming data can become obsolete minutes after it is created, as in the movement of markets on a stock trading floor or multimedia streaming used for surveillance and security. The challenge is to act on insights from information that is constantly changing.

However, even the variety and velocity of data may be the least of an organisation’s concerns. In a recent IDC study, which polled 300 organisations from all industries across Australia, 47 per cent of respondents revealed they do not have the skill sets required to manage big data.
