Health Tools Produced from Artificial Intelligence (AI) Depend on the Dataset: the Potential for Bias

  • The adage "garbage in, garbage out" is especially true for machine learning models: the datasets on which models are trained and validated are essential to ensuring the ethical use of the resulting algorithm, and poorly constructed, poorly representative datasets can introduce biases into the algorithms.
  • Two main types of bias stand out, although many others exist:
    • cases in which the data sources themselves do not reflect the true epidemiology within a given demographic; for example, population data biased by the underdiagnosis of inflammatory disease in patients with skin of color. If so, what are the consequences?
      • Looking outside of medicine, the likely result of such bias is the entrenchment and exacerbation of systemic biases. Amazon's AI hiring program, for example, applied algorithms to application files to support HR decisions; in hindsight, it is obvious that training the program on data fraught with systemic gender and other biases would only lead to a model that perpetuated the same.
    • cases in which algorithms are trained on datasets that are not representative. For example, an interesting 2020 study found a striking geographic skew in AI studies, with most of the studies, and the populations they represent, coming from just a few states.
      • Alternatively, consider what happens when a dataset does not contain enough members of a given demographic: facial recognition software, for example, performs exceedingly well on white men but poorly on young women with skin of color. Both for technological performance and for fundamental reasons of equity and justice, the research community, academia, industry, and regulatory bodies need to take steps to ensure that machine learning training data mirror the populations for which the algorithms will be used (one way such an audit might look is sketched after this list).
  • One approach, exemplified by the "All of Us" precision medicine research cohort in the United States, is to proactively fund the development of more representative datasets that can be used for training and validation.
  • Could we imagine a parallel initiative in our field of dermatology, developing a more representative dataset that includes patients who have traditionally been underrepresented?
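
As a rough illustration of what "mirroring the population" can mean in practice, the sketch below (hypothetical data, column names, and group labels, not from the talk) compares the demographic composition of a training set against a reference population and reports model accuracy separately per subgroup, the kind of audit that would surface the representativeness and performance gaps described above.

```python
# Minimal sketch of a dataset-representativeness and subgroup-performance audit.
# All data, column names, and group labels are hypothetical illustrations.
import pandas as pd

def representation_gap(train_df, population_shares, group_col="skin_type"):
    """Compare each group's share of the training set with its share of the
    target population (e.g., census or clinic-level estimates)."""
    train_shares = train_df[group_col].value_counts(normalize=True)
    rows = []
    for group, pop_share in population_shares.items():
        train_share = train_shares.get(group, 0.0)
        rows.append({
            "group": group,
            "population_share": pop_share,
            "training_share": train_share,
            "gap": train_share - pop_share,
        })
    return pd.DataFrame(rows)

def subgroup_accuracy(eval_df, group_col="skin_type",
                      label_col="label", pred_col="prediction"):
    """Report accuracy separately for each demographic group, so a high
    overall score cannot hide poor performance on an underrepresented group."""
    correct = (eval_df[label_col] == eval_df[pred_col])
    return correct.groupby(eval_df[group_col]).mean().rename("accuracy")

# Toy usage: skin-type groups collapsed into "I-III" and "IV-VI" for illustration.
train = pd.DataFrame({"skin_type": ["I-III"] * 90 + ["IV-VI"] * 10})
population = {"I-III": 0.6, "IV-VI": 0.4}
print(representation_gap(train, population))   # reveals under-representation of IV-VI

evaluation = pd.DataFrame({
    "skin_type": ["I-III"] * 8 + ["IV-VI"] * 8,
    "label":      [1, 0, 1, 0, 1, 0, 1, 0] * 2,
    "prediction": [1, 0, 1, 0, 1, 0, 1, 0] + [1, 1, 1, 1, 0, 0, 0, 0],
})
print(subgroup_accuracy(evaluation))           # reveals the per-group accuracy gap
```

Reporting the gap per group, rather than a single aggregate metric, is the design choice that makes this kind of check useful: an overall accuracy number averaged over an unbalanced dataset can look excellent while the smallest subgroup performs no better than chance.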

Justin Ko, MD. Ethical Considerations in AI. 8th World Congress of Teledermatology, Skin Imaging and AI in Skin diseases – November 2020
