BIG DATA
What is Data?
Data refers to the quantities, characters, or symbols on which a computer performs operations. It may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.
Now, let's turn to an introduction to Big Data.
What is Big Data?
Big Data is a collection of data that is huge in volume and keeps growing exponentially with time. Its size and complexity are so great that no traditional data management tool can store or process it efficiently. In short, big data is still data, just at a very large scale.
Examples Of Big Data
Social Media
Statistics show that more than 500 terabytes of new data are ingested into the databases of the social media site Facebook every day. This data mainly comes from photo and video uploads, message exchanges, comments, and similar activity.
Jet Engine
A single jet engine can generate more than 10 terabytes of data in 30 minutes of flight time. With many thousands of flights per day, data generation reaches many petabytes.
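To get a feel for these magnitudes, the quoted figures can be turned into a rough estimate. The sketch below is only back-of-the-envelope arithmetic: the 500 TB per day and 10 TB per 30 minutes come from the text, while the number of flights per day and the average flight duration are assumed values chosen purely for illustration.

```python
# Rough estimates based on the figures quoted in the text.
TB_PER_PB = 1000  # decimal units: 1 petabyte = 1000 terabytes

# Social media: at least 500 TB ingested per day (figure from the text).
facebook_tb_per_day = 500
facebook_pb_per_year = facebook_tb_per_day * 365 / TB_PER_PB

# Jet engine: at least 10 TB per 30 minutes of flight (figure from the text).
engine_tb_per_half_hour = 10

# ASSUMPTIONS (not from the text): flights per day and average duration.
flights_per_day = 5_000
avg_flight_hours = 2


def engine_data_pb(flights, hours_per_flight,
                   tb_per_half_hour=engine_tb_per_half_hour):
    """Total data in PB, counting one engine per flight."""
    half_hours = hours_per_flight * 2
    return flights * half_hours * tb_per_half_hour / TB_PER_PB


daily_engine_pb = engine_data_pb(flights_per_day, avg_flight_hours)

print(f"Social media: about {facebook_pb_per_year:.1f} PB per year")
print(f"Jet engines: about {daily_engine_pb:.0f} PB per day")
```

Changing the two assumed values shifts the result proportionally, but under any plausible choice the total lands in the petabyte range, which is the point the example makes.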
Types Of Big Data
Following are the types of Big Data:
- Structured
- Unstructured
- Semi-structured
Structured
Any data that can be stored, accessed, and processed in a fixed format is termed 'structured' data. Over time, computer scientists have had great success in developing techniques for working with this kind of data (where the format is well known in advance) and for deriving value from it. Nowadays, however, issues arise when the size of such data grows very large, with typical sizes in the range of multiple zettabytes.
Examples Of Structured Data
An 'Employee' table in a database is an example of structured data:
Employee_ID | Employee_Name | Gender | Department | Salary_In_lacs |
2365 | Rajesh Kulkarni | Male | Finance | 650000 |
3398 | Pratibha Joshi | Female | Admin | 650000 |
7465 | Shushil Roy | Male | Admin | 500000 |
7500 | Shubhojit Das | Male | Finance | 500000 |
7699 | Priya Sane | Female | Finance | 550000 |
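As a minimal sketch of what "fixed format" means in practice, the table above can be loaded into a relational database whose schema is declared before any row is stored. The example uses Python's built-in sqlite3 module with an in-memory database. The column types are an assumption, and the rows are copied as they appear in the table.

```python
import sqlite3

# In-memory database; the schema is fixed before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employee (
        Employee_ID    INTEGER PRIMARY KEY,
        Employee_Name  TEXT NOT NULL,
        Gender         TEXT NOT NULL,
        Department     TEXT NOT NULL,
        Salary_In_lacs INTEGER NOT NULL
    )
    """
)

# Rows copied from the table above.
rows = [
    (2365, "Rajesh Kulkarni", "Male", "Finance", 650000),
    (3398, "Pratibha Joshi", "Female", "Admin", 650000),
    (7465, "Shushil Roy", "Male", "Admin", 500000),
    (7500, "Shubhojit Das", "Male", "Finance", 500000),
    (7699, "Priya Sane", "Female", "Finance", 550000),
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)

# Because the format is known in advance, a query can address columns by name.
avg_by_dept = dict(
    conn.execute(
        "SELECT Department, AVG(Salary_In_lacs) FROM Employee GROUP BY Department"
    ).fetchall()
)
print(avg_by_dept)
conn.close()
```

The query works without inspecting the data first, because every row is guaranteed to have the same columns.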
Unstructured
Any data whose form or structure is unknown is classified as unstructured data. Besides its huge size, unstructured data poses multiple challenges when it is processed to derive value. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc. Nowadays organizations have a wealth of data available to them, but unfortunately they often don't know how to derive value from it, since the data is in its raw, unstructured form.
Figure: Data growth over the years
Please note that web application data, which is unstructured, consists of log files, transaction history files, etc. OLTP systems, by contrast, are built to work with structured data, which is stored in relations (tables).
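One common way to derive value from raw log files is to extract a structure from them. The sketch below shows the idea with a regular expression. The log format, the sample lines, and the field names are all hypothetical and invented for illustration; real log files vary widely.

```python
import re

# HYPOTHETICAL log format, invented for illustration:
#   <date> <time> <LEVEL> user=<id> action=<name>
LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) user=(?P<user>\d+) action=(?P<action>\w+)"
)

# Made-up sample lines; the last one follows no known format.
raw_lines = [
    "2024-01-15 10:32:01 INFO user=2365 action=login",
    "2024-01-15 10:32:07 ERROR user=3398 action=upload",
    "free-form note that matches no known format",
]


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.fullmatch(line)
    return match.groupdict() if match else None


records = [parse_line(line) for line in raw_lines]
structured = [r for r in records if r is not None]
unparsed = [line for line, r in zip(raw_lines, records) if r is None]

print(f"{len(structured)} structured records, {len(unparsed)} unparsed lines")
```

Lines that match the pattern become records with named fields, which could then be stored in a table. Lines that do not match stay in their raw form, which illustrates why unstructured data is harder to process.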
Characteristics Of Big Data
Big data can be described by the following characteristics:
- Volume
- Variety
- Velocity
- Variability