Introducing big data

According to a 2017 publication, 2.5 exabytes (EB) of data were generated every day:

  • 1 exabyte is 1000 petabytes
  • 1 petabyte is 1000 terabytes
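
To give a feel for the scale, the conversions in the list above can be worked through in a few lines of Python. This is only an illustrative sketch: the constant and function names are made up for this example, and it assumes the decimal (factor of 1000) steps given in the list.

```python
# Decimal unit steps taken from the list above.
PETABYTES_PER_EXABYTE = 1000
TERABYTES_PER_PETABYTE = 1000


def exabytes_to_terabytes(exabytes: float) -> float:
    """Convert exabytes to terabytes by going through petabytes."""
    petabytes = exabytes * PETABYTES_PER_EXABYTE
    return petabytes * TERABYTES_PER_PETABYTE


daily_exabytes = 2.5  # daily figure quoted in the text
daily_petabytes = daily_exabytes * PETABYTES_PER_EXABYTE
daily_terabytes = exabytes_to_terabytes(daily_exabytes)

print(f"{daily_exabytes} EB = {daily_petabytes:,.0f} PB = {daily_terabytes:,.0f} TB per day")
# prints: 2.5 EB = 2,500 PB = 2,500,000 TB per day
```

In other words, the quoted daily volume corresponds to 2.5 million terabytes.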

That figure is consistent with the claim that more data was generated in the preceding two years than in all of previous human history.

Initially, "big data" referred to the very large amounts of data produced in the digital age (see “Big data defined”) by all sources on the internet: emails, websites, and social networks.

Nowadays the term also refers to specific datasets that are large in both size and complexity, and that require new algorithmic techniques to extract useful information from them.

80% of that data is unstructured (see “types of data”).

Reference: Dawn E. Holmes, Big Data: A Very Short Introduction. Oxford University Press, 2017.


