Big data vs small data

  • In the past, statistical methods could only be applied to structured data.
  • In the digital age, the data collected for analysis can cover the whole population and no longer needs to be limited to a sample.
  • Instead of carefully constructed (structured) samples of small data, we are now dealing with enormous quantities of data collected without a specific question in mind; such data is therefore often unstructured.
  • See “Big data defined”.
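
The contrast between a carefully drawn sample and the whole population can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the population is synthetic (randomly generated values, not real measurements), and the function name `compare_sample_to_population` is hypothetical.

```python
import random
import statistics


def compare_sample_to_population(population, sample_size, seed=0):
    """Return (sample mean, population mean) for a simple random sample."""
    rng = random.Random(seed)
    # "Small data": a random sample drawn without replacement.
    sample = rng.sample(population, sample_size)
    # "Big data": every record is available, so nothing has to be estimated.
    return statistics.mean(sample), statistics.mean(population)


# Synthetic population of 100,000 values (an assumption, not real data).
generator = random.Random(42)
population = [generator.gauss(120, 15) for _ in range(100_000)]

sample_mean, population_mean = compare_sample_to_population(population, 500)
print(f"sample mean:     {sample_mean:.2f}")
print(f"population mean: {population_mean:.2f}")
```

The sample gives an estimate that is close to the population value but not identical to it; with the whole population at hand, the value is computed directly.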

What is the use of collecting all that data?

  • Nowadays, data is constantly being collected, and medicine is no exception. This offers a wealth of potential uses, although at the point of collection the data is a worthless rock that needs careful carving: we have to ask the right questions.
  • By merging traditional statistics with computer science, we look for patterns.
  • By linking the patterns, we look for associations. In health, causality is the next step, leading towards the essential step of explainability.
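
As a rough illustration of what looking for an association can mean in practice, the sketch below computes a Pearson correlation coefficient between two variables. The variable names and the five data points are made up for the example, and the correlation coefficient is only one of many possible measures of association. A strong correlation does not by itself establish causality.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariation / math.sqrt(variation_x * variation_y)


# Hypothetical, made-up observations (not real patient data).
hours_active_per_day = [1, 2, 3, 4, 5]
resting_heart_rate = [80, 76, 74, 70, 66]

r = pearson_correlation(hours_active_per_day, resting_heart_rate)
print(f"correlation: {r:.3f}")
```

A value near -1 or +1 indicates a strong linear association in the data; whether one variable causes the other is a separate question that the coefficient cannot answer.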

How does it work?

  • Data is collected and stored, and then it is analyzed. This is what defines the field of data science.
  • This requires the expertise of data scientists, who are in short supply.
  • Data science merges computer science with new methods derived from statistics and artificial intelligence. The results are delivered as solutions to a problem (an equation), known as algorithms.

Reference: Big Data: A very short introduction by Dawn E. Holmes. Oxford University Press, 2017
