Wednesday, June 24

3) Big Data Characteristics - 5 Vs of Big Data


To understand Big Data, let's discuss its characteristics. Big Data has 5 dimensions (or characteristics): Volume, Variety, Velocity, Veracity and Value. Let's briefly go through the popular definitions of the 5 Vs.

1) Volume: Volume refers to the vast amounts of data generated every second. According to one popular estimate, if we take all the data generated in the world between the beginning of time and 2008, the same amount of data will soon be generated every minute. This makes most data sets too large to store and analyze using traditional database technology. New Big Data tools use distributed systems so that we can store and analyze data across many machines and databases, which can be located anywhere in the world.
2) Variety: Variety refers to the many different types of data generated today. Text, audio, video, device data, GPS data, Facebook data, Call Detail Records (CDRs), flight logs and hundreds of other data types contribute to Big Data.
3) Velocity: Velocity refers to the high speed at which data is generated today. For example, stock exchanges generate high-speed data, the GPS of a travelling car or plane generates data at high velocity, and each mobile tower generates CDR data at very high velocity. One of the Big Data challenges is how to process the huge volume of data that is generated at such high velocity.
4) Veracity: Veracity refers to the correctness of the data. Having a lot of data of different types coming in at high speed is worthless if that data is incorrect. Incorrect data can cause a lot of problems for organizations as well as for consumers. Therefore, organizations need to ensure that both the data and the analyses performed on it are correct. This matters especially in automated decision-making, where no human is involved anymore.
5) Value: All the data generated by different devices may or may not have any value for your business. While designing a Big Data solution, you need to decide which data is relevant to the business and filter out the 'noise data' before you store and process it.
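To make the Veracity and Value points a little more concrete, here is a minimal Python sketch of checking records before they are stored. Everything in it is an assumption made up for illustration: the `Reading` record shape, the `RELEVANT_KINDS` set and the two checks are hypothetical, and a real Big Data pipeline would apply rules like these inside a distributed processing framework rather than on an in-memory list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


# Hypothetical record shape, invented for this illustration only.
@dataclass
class Reading:
    device_id: str
    kind: str                # e.g. "gps", "cdr", "heartbeat"
    value: Optional[float]


# Assumption: the business only needs these kinds of data (the "Value" decision).
RELEVANT_KINDS = {"gps", "cdr"}


def is_valid(reading: Reading) -> bool:
    """Veracity check: reject records with a missing device id or value."""
    return bool(reading.device_id) and reading.value is not None


def is_relevant(reading: Reading) -> bool:
    """Value check: keep only the kinds of data the business cares about."""
    return reading.kind in RELEVANT_KINDS


def filter_before_storage(readings: Iterable[Reading]) -> List[Reading]:
    """Drop incorrect records and noise data before storing anything."""
    return [r for r in readings if is_valid(r) and is_relevant(r)]


sample = [
    Reading("car-1", "gps", 12.5),
    Reading("tower-7", "cdr", 3.0),
    Reading("car-1", "heartbeat", 1.0),   # noise: not a relevant kind
    Reading("", "gps", 4.2),              # incorrect: missing device id
    Reading("tower-9", "cdr", None),      # incorrect: missing value
]

kept = filter_before_storage(sample)
print(len(kept))  # prints 2
```

Only the first two sample records survive both checks. The point of the sketch is the order of operations: decide what is correct and relevant first, then store and process only that.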

 The following image from IBM is my favorite Big Data infographic. Picture this image and you will never forget the key characteristics of Big Data. 

Some data scientists consider 'Visualization' the 6th V of Big Data, but I do not agree that Visualization is a characteristic of Big Data. So what is visualization? Is it related to Big Data? What is Big Data Analytics? Visualization is a discipline of business analytics: it is about using tools to explore and analyze your data in order to derive business value. Tableau and QlikView are some of the leading visualization tools. We will discuss Visualization in a future post.
