The Five Vs of Data Analysis

Zahid Un Nabi
3 min read · Apr 15, 2021


So, how do you know if you require a comprehensive data analysis solution or even a basic analysis solution? Well, ask yourself: are you struggling to support sudden increases in the volume of data you’re dealing with? Or the speed at which new data arrives? The variety of data sources? The accuracy of your data? Whether you’re drawing value from your data?

Yes, I’m referring to the 5 Vs: volume, velocity, variety, veracity, and value.

Let’s take each one in turn. When I say volume, I mean the amount of data that a solution must handle. The solution must handle that data efficiently and be able to distribute the load across enough servers to keep up with the next V: velocity.

Velocity is the speed at which data enters and flows through your solution. Many businesses now use large volumes of real-time streaming data. Solutions must be able to rapidly ingest and rapidly process this data.
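To make velocity a little more concrete, here is a minimal sketch of how you might measure it: counting how many events arrived in the last few seconds. The class name `SlidingWindowCounter`, the 10-second window, and the arrival times are all made up for illustration. A real streaming solution would do far more than this.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts the events seen during the most recent `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return how many events are still in the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCounter(window_seconds=10)
arrivals = [0, 1, 2, 11, 12, 30]  # hypothetical arrival times, in seconds
counts = [counter.add(t) for t in arrivals]
print(counts)  # [1, 2, 3, 2, 2, 1]
```

A count that keeps climbing from one window to the next is one sign that data is arriving faster than it used to.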

The third V is variety. Ingesting data of many different types from many different sources creates many different challenges for data analysis. Smart companies build solutions to work with structured, semistructured, and completely unstructured data types. (We will get to data types later in the course.)
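As a rough illustration of variety, the sketch below pulls the same kind of record out of a structured source (CSV), a semistructured source (JSON), and an unstructured source (free text). The sample data, the field names, and the regular expression are all assumptions made up for this example. Real unstructured text is rarely this easy to parse.

```python
import csv
import io
import json
import re

# Hypothetical sample inputs, one per data type.
csv_source = "customer,amount\nalice,30\nbob,12.5\n"
json_source = '[{"customer": "carol", "order": {"amount": 8}}]'
text_source = "Dave called and paid 40 for his order."


def from_csv(text):
    """Structured: every row has the same named columns."""
    return [
        {"customer": row["customer"], "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text):
    """Semistructured: fields are named, but may be nested."""
    return [
        {"customer": item["customer"], "amount": float(item["order"]["amount"])}
        for item in json.loads(text)
    ]


def from_text(text):
    """Unstructured: a naive pattern match that fits only this sample sentence."""
    match = re.search(r"(\w+) called and paid (\d+(?:\.\d+)?)", text)
    if match is None:
        return []
    return [{"customer": match.group(1).lower(), "amount": float(match.group(2))}]


records = from_csv(csv_source) + from_json(json_source) + from_text(text_source)
total = sum(record["amount"] for record in records)
print(len(records), total)  # 4 90.5
```

Once everything is in one common shape, the rest of the analysis no longer has to care where each record came from.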

The fourth V is veracity, which refers to the trustworthiness of your data. Have you ever heard the saying, “My word is my bond”? It’s supposed to instill trust, to let you know that the person saying it is honorable and will do what they say they will. That’s veracity. To have trustworthy data, you have to know the provenance of your data.

What do I even mean by provenance?

Provenance means that you know the chain of custody for that data, so you can say with certainty that the data has not been improperly altered. Collecting data is easy. Making sure it’s accurate and consistent? That’s the hard part, and that’s veracity.
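One small, common building block for this is a checksum: record a fingerprint of the data when you first collect it, then compare it later. The sketch below uses SHA-256 from Python's standard `hashlib` module. The sample data and the helper names `fingerprint` and `is_unaltered` are assumptions for illustration. Keep in mind that a checksum only tells you whether the bytes changed. It says nothing about who changed them or whether the original was accurate to begin with.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected: str) -> bool:
    """Check the data against a fingerprint recorded earlier."""
    return fingerprint(data) == expected


# Hypothetical data set, fingerprinted at collection time.
original = b"order_id,amount\n1001,30\n1002,12.5\n"
recorded = fingerprint(original)

# Later, two copies arrive: one intact, one with a changed amount.
received_ok = b"order_id,amount\n1001,30\n1002,12.5\n"
received_tampered = b"order_id,amount\n1001,300\n1002,12.5\n"

print(is_unaltered(received_ok, recorded))        # True
print(is_unaltered(received_tampered, recorded))  # False
```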

And the fifth V is value — which is the bottom line. The whole point of this effort is getting value from data. That includes creating reports and dashboards that inform critical business decisions. It also includes highlighting areas for improving the business. And it includes making it easier to find and communicate critical details about business operations.

And there we have it: the 5 Vs of big data. Now that we have discussed the indicators that mean you may need a data analysis solution, let’s talk about what you need to know to prepare for one.

First, we need to know where your data is coming from. The majority of analytical data comes from existing on-premises databases and file stores. Streaming data is becoming increasingly popular, as is the use of public data sets to enrich the other data sources that a third-party solution will tap into.

Second, we need to know the options for processing your data. The term processing includes collecting, cleaning, transforming, and loading data into an analytic data store. That’s a lot of work. It can be done manually or automated with the help of applications.
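Here is a minimal sketch of those four processing steps, using only Python's standard library. The sample CSV, the table name `customer_totals`, and the cleaning rules (drop rows with a missing or non-numeric amount, lowercase the customer name) are assumptions chosen for illustration, and an in-memory SQLite database stands in for the analytic data store.

```python
import csv
import io
import sqlite3

# Hypothetical raw input with the usual mess: stray spaces,
# a missing value, inconsistent casing, and a non-numeric amount.
RAW_CSV = """customer,amount
alice, 30
bob,
ALICE,12.5
carol,abc
"""


def collect(text):
    """Collect: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Clean: normalize names and drop rows without a usable amount."""
    cleaned = []
    for row in rows:
        try:
            amount = float((row["amount"] or "").strip())
        except ValueError:
            continue  # missing or non-numeric amount
        cleaned.append({"customer": row["customer"].strip().lower(), "amount": amount})
    return cleaned


def transform(rows):
    """Transform: total the amounts per customer."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


def load(totals, connection):
    """Load: write the totals into the data store."""
    connection.execute(
        "CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total REAL)"
    )
    connection.executemany("INSERT INTO customer_totals VALUES (?, ?)", totals.items())
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(clean(collect(RAW_CSV))), connection)
report = connection.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(report)  # [('alice', 42.5)]
```

Only one customer survives the cleaning step here, which is a small reminder of how much the cleaning rules shape what you end up analyzing.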

Finally, we need to know what you need to learn from your data. The result of collecting and processing all this data should be actionable insights. These insights are often presented in the form of reports and dashboards.
