Data is the critical component of developing solutions using Artificial Intelligence.
Data can be divided into 3 types of datasets:
a training set, from which the machine learning model learns by adjusting its solution (algorithm).
a validation set, which is used to assess model fit and performance and to tune the model. It is often a subset of the training data.
a test set, which is a separate dataset used to independently assess the final model's fit and performance.
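The three-way split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name split_dataset, the 15% validation and test fractions, and the fixed random seed are hypothetical choices, not values taken from this text. Real clinical datasets usually need more care, for example keeping all records from one patient in the same set.

```python
import random


def split_dataset(records, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle records and split them into train, validation, and test lists.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(records)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    # The test set is carved out first and never used for training or tuning.
    test = shuffled[:n_test]
    # The validation set is held out from what would otherwise be training data.
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical example: 100 record identifiers standing in for real data.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The key property is that the three lists do not overlap, so the test set stays unseen until the final assessment.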
Validation is a term that refers to both:
model validation: how well the model performs.
clinical validation: how well the model works in a clinical setting. Does it affect the outcomes that it set out to change? Only if the answer is yes can the model be deployed in the real world.
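Model validation, in the first sense above, usually comes down to comparing the model's predictions with known labels on held-out data. The sketch below assumes a binary classifier and uses accuracy, sensitivity, and specificity as example metrics; the function name evaluate_binary and the label lists are hypothetical and chosen only for illustration. Clinical validation, by contrast, requires measuring outcomes in practice and is not captured by a calculation like this.

```python
def evaluate_binary(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Sensitivity: share of actual positives the model detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of actual negatives the model correctly ruled out.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up labels for a held-out test set, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = evaluate_binary(y_true, y_pred)
print(metrics)
```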