- In the past, statistical methods could only be applied to structured data.
- In the digital age, data collection can cover the whole population and no longer needs to be limited to a sample.
- Instead of small, carefully constructed (structured) samples, we are dealing with enormous quantities of data collected without a specific question in mind; this data is therefore often unstructured.
- See “Big data defined”.
What is the use of collecting all that data?
- Nowadays, data is constantly being collected, and medicine is no exception. This offers a wealth of potential uses, although at the collection stage the data is a worthless rock that needs careful carving: we have to ask the right questions.
- By merging traditional statistics with computer science, we look for patterns.
- By linking the patterns, we look for associations. In health, causality is the next step, leading towards the essential goal of explainability.
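To make "looking for associations" concrete, here is a minimal sketch that measures how strongly two variables move together using the Pearson correlation coefficient. The choice of Pearson correlation, the variable names, and the numbers are assumptions made for illustration; they do not come from the reference and are not real patient data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only (hypothetical, not real patient data).
daily_steps_thousands = [2, 4, 5, 7, 9, 11]
resting_heart_rate = [78, 74, 73, 69, 66, 62]

r = pearson(daily_steps_thousands, resting_heart_rate)
print(f"association (Pearson r): {r:.2f}")
```

A value of r close to -1 or +1 signals a strong association in this toy data, but it says nothing about causality: establishing that one variable actually drives the other is the separate, later step the notes mention.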
How does it work?
- Data is collected and stored, and then it is analyzed. This is what defines the field of data science.
- This requires the expertise of data scientists, who are in short supply.
- Data science merges computer science with new methods derived from statistics and artificial intelligence. Results are delivered through procedures for solving a problem (such as an equation), called algorithms.
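The collect, store, analyze sequence described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: the records are made up, the in-memory SQLite database stands in for whatever storage a real project would use, and the function name `mean_temperature_by_ward` is hypothetical.

```python
import sqlite3

# 1. Collect: hypothetical raw readings (made-up values, not real data).
collected = [
    ("ward_a", 36.8),
    ("ward_a", 38.1),
    ("ward_b", 37.0),
    ("ward_b", 37.4),
]

# 2. Store: an in-memory SQLite database stands in for real storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ward TEXT, temperature REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", collected)
conn.commit()


# 3. Analyze: a very small "algorithm" that answers one specific question.
def mean_temperature_by_ward(connection):
    """Return a dict mapping each ward to its mean recorded temperature."""
    rows = connection.execute(
        "SELECT ward, AVG(temperature) FROM readings GROUP BY ward ORDER BY ward"
    ).fetchall()
    return {ward: avg for ward, avg in rows}


summary = mean_temperature_by_ward(conn)
for ward, avg in summary.items():
    print(f"{ward}: mean temperature {avg:.2f}")
```

The stored data only becomes useful in step 3, once a specific question (here, the mean temperature per ward) is asked of it.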
Reference: Dawn E. Holmes, Big Data: A Very Short Introduction. Oxford University Press, 2017.