AI in Health: What are the Types of Data Sources on Which a Model Learns?

  • Data is the critical component of developing solutions using Artificial Intelligence.
  • Data can be divided into 3 types of datasets:
    • a training set, from which the machine learning model learns by tweaking its solution (algorithm).
    • a validation set, which assesses the model fit and performance and is used to tune the model. It is often a subset of the training data.
    • a test set, which is a separate dataset that independently assesses the final model fit and performance.
  • Validation is a term that refers to both:
    • model validation: how well the model performs.
    • clinical validation: how well the model works in a clinical setting. Does it change the outcomes it set out to change? Only if the answer is yes can it be deployed in the real world.
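
The three-way division described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name split_dataset, the 20% fractions, the fixed seed, and the integer "patient record IDs" are all assumptions made for the example. It holds out the test set first, then takes the validation set from the remaining training data, matching the note that validation data is often a subset of training data.

```python
import random


def split_dataset(records, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle records and split them into training, validation and test sets.

    Hypothetical helper for illustration; the fractions are assumptions.
    """
    if not 0 <= val_fraction < 1 or not 0 <= test_fraction < 1:
        raise ValueError("fractions must be in the range [0, 1)")

    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    # Hold out the test set first so it never influences training or tuning.
    n_test = int(len(shuffled) * test_fraction)
    test_set = shuffled[:n_test]
    development = shuffled[n_test:]

    # Take the validation set out of the remaining (training) data.
    n_val = int(len(development) * val_fraction)
    validation_set = development[:n_val]
    training_set = development[n_val:]
    return training_set, validation_set, test_set


patients = list(range(100))  # hypothetical patient record IDs
train, val, test = split_dataset(patients)
print(len(train), len(val), len(test))
```

In this sketch the model would be fitted on train, tuned against val, and scored once on test at the end. Clinical validation is a separate step that a data split like this cannot replace.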