The 3Vs of Big Data – Volume, Velocity, and Variety

Bringing Big Data to the People (Part 4 of 6)

Beyond Natural Selection – Variety

Data once had to be carefully selected for processing, both in quantity and quality, and it was strictly formatted. At first, its gatekeepers were men in lab coats and pocket protectors; eventually they morphed into the IT guys.

The original Computer lab

As data became more prolific, it also became more personal through the spreadsheets and databases that home computers made possible via Lotus and Microsoft. Anyone with a PC and cheap software could learn the basics with a little effort, and with a lot of effort could accomplish quite a bit (most users tap less than 10% of any MS product's capability). Anyone who has worked with a pivot table, or simply hit the "!" error trying to use a spreadsheet, understands the need to have data in the right format before it can be manipulated.
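The "wrong format" problem that spreadsheets surface can be sketched in a few lines. This is a hypothetical illustration, not any spreadsheet's internals: a numeric aggregation fails when one cell holds text, much like Excel's "#VALUE!" error, and works once the non-numeric entries are scrubbed out.

```python
# Illustrative data: one "cell" holds text instead of a number.
sales = [120.0, 95.5, "n/a", 210.0]

try:
    total = sum(sales)  # mixed types break the aggregation
except TypeError as exc:
    total = None
    print(f"format error: {exc}")

# Scrubbing first -- keeping only numeric cells -- lets it work.
clean = [x for x in sales if isinstance(x, (int, float))]
print(sum(clean))  # prints 425.5
```

The point is the same one spreadsheet users learn the hard way: the tool can only manipulate data that is already in the shape it expects.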

! – the data error

Big Data is a lot more than a big MS tool. BD consumes all data heterogeneously: words, images, audio, telemetry, transactions, scanned analog records, legacy databases, and social media. The data must still be scrubbed, but BD ingests everything – an information jabberwocky of sorts.

more VARIETY in Big Data – even how it's defined

This scrubbing process converts source data into application data, which can then be manipulated. The increase in variety, and the scrubbing it requires, has given rise to a fourth V – veracity – the uncertainty of data.
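What scrubbing looks like in practice can be sketched as a normalization step: heterogeneous source records, with different field names and types, are mapped onto one uniform application schema. The field names and coercion rules below are assumptions for illustration, not any particular platform's API.

```python
def scrub(record: dict) -> dict:
    """Map varied source fields onto a single application schema (illustrative)."""
    return {
        # accept whichever name the source used for the identifier
        "id": str(record.get("id") or record.get("ID") or record.get("user_id", "")),
        # coerce amounts that arrive as strings, ints, or floats
        "amount": float(record.get("amount") or record.get("amt") or 0),
        # normalize free text so downstream matching is consistent
        "note": str(record.get("note", "")).strip().lower(),
    }

# Two records from different sources, each shaped differently.
sources = [
    {"ID": 1, "amt": "19.99", "note": "  First Purchase "},
    {"user_id": "2", "amount": 5, "note": "REFUND"},
]

app_data = [scrub(r) for r in sources]
print(app_data)
```

Every coercion in a step like this is a small bet about what the source meant, which is exactly where the uncertainty behind the fourth V creeps in.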
