In class we learned about the key components of Big Data and the variables that apply to it. Each of these measures helps to analyse the effectiveness of data sets, and each is essential to having accurate, reliable data.
Big Data can be described using 'The 7 V's of Big Data', which are:
- Volume
- Velocity
- Variety
- Veracity
- Value
- Variability
- Visualisations
Volume refers to the size of the data and the scale it represents. Projections show a dramatic increase in the volume of data, which is growing rapidly year on year. That brings us on to the next point:
Velocity is the rate at which new data is generated. Institutions must constantly upgrade their systems to ensure they have the capacity to store the data coming in. Larger units of storage are used today to keep up with the increasing velocity, with exabytes and zettabytes becoming more common at the largest scales.
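To get a feel for how big these units are, here is a minimal Python sketch that turns a raw byte count into the largest sensible unit. The function name `human_readable` and the choice of decimal units (each 1000 times the previous one) are assumptions made for illustration.

```python
# Decimal (SI) storage units, each 1000 times larger than the previous one.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def human_readable(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value at 1 or above."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop once the value is small enough, or when we run out of units.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

print(human_readable(10**18))  # one exabyte
print(human_readable(10**21))  # one zettabyte
```

An exabyte is a billion gigabytes, and a zettabyte is a thousand exabytes.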
Variety is the different types of data collected. Data can be structured, semi-structured, or unstructured. These terms describe how the data is organised, which affects how easy it is to analyse: structured data is the easiest to analyse, while unstructured data is the most difficult. Variety can also refer to the sources from which data is acquired. Some examples include, but are not limited to, science, business, and government statistics.
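The difference between the three types is easier to see with the same piece of information written three ways. This is only a sketch: the customer review below is made up, and CSV and JSON are just common examples of structured and semi-structured formats.

```python
import csv
import io
import json

# Structured: fixed columns, as in a spreadsheet or database table.
structured = "customer_id,product,rating\n101,kettle,4\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: labelled fields, but the shape can vary between records.
semi_structured = '{"customer_id": 101, "review": {"product": "kettle", "rating": 4}}'
record = json.loads(semi_structured)

# Unstructured: free text with no predefined fields.
unstructured = "Customer 101 said the kettle was pretty good, four stars."

print(rows[0]["rating"])              # read straight from a named column
print(record["review"]["rating"])     # read by following nested keys
print("four stars" in unstructured)   # only a crude text search without extra work
```

The rating is easy to pull out of the first two versions, but the free-text version needs much more processing before it can be analysed.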
Veracity is the term that represents the accuracy of data. In other words, it's a test of how reliable the data is. This is crucial when analysing data because if the data isn't accurate, the end result will not be useful. Big Data should always use data sets that are as accurate and relevant as possible. After all, if the data cannot be trusted, then why should we use it? No data set is 100% accurate; however, data-quality measures aim to get it as close to 100% as possible.
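One simple way to put a number on veracity is to count how many values pass some basic checks. The sketch below assumes made-up product ratings that should be whole numbers from 1 to 5; the data and the `is_valid` helper are hypothetical.

```python
# Made-up ratings: None is a missing value and 11 is outside the 1 to 5 scale.
ratings = [4, 5, None, 3, 11, 2]

def is_valid(rating) -> bool:
    """A rating counts as valid if it is present and within the 1 to 5 scale."""
    return rating is not None and 1 <= rating <= 5

valid = [r for r in ratings if is_valid(r)]
share_valid = len(valid) / len(ratings)

print(f"{len(valid)} of {len(ratings)} ratings are valid ({share_valid:.0%})")
```

A check like this does not prove the remaining values are correct, but it flags the ones that clearly cannot be trusted.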
Value refers to how useful data is and how organisations can use it once its value is extracted. If data can be used, it has value, but data can be used in different applications, so some data may be more valuable than other data. A business, for example, could find value in customer data that shows which products should be targeted, how products could be improved, and what feedback customers have given on certain products, to name a few.
Variability is related to veracity but is not quite the same thing. It looks at the consistency of data and the real meaning behind it. Some data may have a different meaning from what was originally intended. If inconsistencies are not found, they can greatly affect the accuracy of results.
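Dates are a common example of data whose meaning can change depending on how it is read. In this small sketch the date string is made up, and the two formats stand for a day-first and a month-first source.

```python
from datetime import datetime

raw = "01/02/2023"  # made-up value; the source does not say which format it uses

# The same text read as day/month/year and as month/day/year.
day_first = datetime.strptime(raw, "%d/%m/%Y").date()
month_first = datetime.strptime(raw, "%m/%d/%Y").date()

print(day_first)    # 1 February 2023
print(month_first)  # 2 January 2023
print(day_first == month_first)
```

Both readings are valid dates, so the inconsistency would go unnoticed unless the format of each source is checked.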
Visualisations refers to the ways data can be displayed and represented. This is commonly done through charts and graphs, which make information more readable than a table of raw figures. A well-chosen format makes the data easier to comprehend.
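Even a very basic chart shows the idea. The sketch below draws a text bar chart using only standard Python; the sales figures and the `bar_chart` function are made up for illustration, and a real project would more likely use a charting library.

```python
# Made-up sales figures for three products.
sales = {"Kettle": 12, "Toaster": 7, "Blender": 3}

def bar_chart(data: dict) -> str:
    """Build one line per item, with a bar of '#' characters as long as its value."""
    width = max(len(name) for name in data)  # pad names so the bars line up
    lines = []
    for name, count in data.items():
        lines.append(f"{name.ljust(width)} | {'#' * count} {count}")
    return "\n".join(lines)

print(bar_chart(sales))
```

The longest bar stands out straight away, which is harder to spot when scanning numbers in a table.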
We made a poster in class to represent all of these terms. How cool is this? (Admittedly, it could be a little better.)
https://ilearn.fife.ac.uk/course/view.php?id=9751#section-5