The Importance of Good Data in AI for Health

What are some of the challenges, obstacles, and issues that we need to address in realizing this future?

Steps in the development of an AI model:

  • To develop a model, a dataset is needed.
  • The model built from this data is then internally validated and tested.
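
As a rough illustration of the internal validation and testing step, the sketch below splits one dataset into training, validation, and test subsets. The function name `split_dataset`, the 70/15/15 proportions, and the `case_…` record identifiers are assumptions made for this example only; they do not come from the source.

```python
import random


def split_dataset(records, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle records and split them into train, validation, and test subsets.

    The fractions are illustrative defaults, not a recommendation.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("validation and test fractions must be >= 0 and sum to < 1")

    shuffled = list(records)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical record identifiers standing in for real cases.
records = [f"case_{i:03d}" for i in range(100)]
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))
```

All three subsets here come from the same source dataset, which is why this counts only as internal validation; external validation needs data from the clinical setting where the model will be used.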

Deployment into the clinical setting and the real world:

  • The model is deployed, and external validation occurs in the clinical setting.
  • Then it’s disseminated with monitoring of real-world results, which gives rise to new data that can be used to refine the model.

Figure: The creation of an AI model applied to health. First, a model is created. Then it is deployed to the clinical setting. The safeguards in place are shown in orange.

Therefore the quality of the data that’s used to train the AI is crucial:

  • What we put in is what we get out.
  • The validity and capability of AI-based technology depend on the quality and the source of the data that’s used to develop the model.

Imperatives of data:

  • The characteristics of these datasets influence the nature of the algorithms generated, and they can extend bias into the delivery of care if the datasets are not representative of the populations in which the AI algorithms will be used.
  • Therefore the data that’s used to train AI models has to be of high quality, its source has to be identifiable, and, when possible, the data labeling characteristics of the training dataset must be made transparent.
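
One simple way to look at representativeness is to compare each group's share of the training data with its share of the population the model is meant to serve. The sketch below is only an illustration under stated assumptions: the group labels, the counts, the target shares, and the 5-percentage-point tolerance are all hypothetical, and a real audit would need clinically meaningful groupings and thresholds.

```python
from collections import Counter


def representation_gaps(training_groups, target_shares):
    """Return, per group, the training share minus the target population share.

    A negative value means the group is underrepresented in the training data.
    """
    counts = Counter(training_groups)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("training_groups is empty")

    gaps = {}
    for group, target in target_shares.items():
        observed = counts.get(group, 0) / total
        gaps[group] = observed - target
    return gaps


def underrepresented(gaps, tolerance=0.05):
    """List the groups whose training share falls short by more than tolerance."""
    return sorted(group for group, gap in gaps.items() if gap < -tolerance)


# Hypothetical group label for each training record.
training_groups = ["group_a"] * 80 + ["group_b"] * 15 + ["group_c"] * 5

# Hypothetical shares of each group in the population where the model will be used.
target_shares = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}

gaps = representation_gaps(training_groups, target_shares)
print(underrepresented(gaps))
```

A check like this covers only one aspect of data quality. It says nothing about label accuracy or about whether the source of the data is identifiable.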

Reference: Justin Ko, MD. AAD Position Statement on Augmented Intelligence. Fusing Technology with Human Expertise to Enhance Dermatological Care. 8th World Congress of Teledermatology, Skin Imaging and AI in Skin Diseases, November 2020.

