Synthetic and augmented data

Last revised by Dimitrios Toumpanakis on 15 Apr 2021

Citation:

Moore C, Toumpanakis D, Knipe H, et al. Synthetic and augmented data. Reference article, Radiopaedia.org (Accessed on 19 Apr 2024) https://doi.org/10.53347/rID-67573

Permalink:

https://radiopaedia.org/articles/67573

rID:

67573

Article created:

12 Apr 2019, Candace Makeda Moore

Disclosures:

At the time the article was created Candace Makeda Moore had no recorded disclosures.

Disclosures:

At the time the article was last revised Dimitrios Toumpanakis had no recorded disclosures.

Revisions:

9 times, by 5 contributors

Sections:

Artificial Intelligence

Synonyms:

Synthetic data
Augmented data

In the context of radiological images, synthetic and augmented data are data that are not completely generated by direct measurement from patients.

Machine learning models generally improve as the amount of training data increases. However, there is a relative lack of open, freely available radiology data sets. Patient privacy concerns and legal restrictions on data use make it challenging to train machine learning models without synthetic and augmented data. Additionally, some diseases are so rare that even large data sets do not contain enough samples to train robust machine learning algorithms.

In addition to supporting the development of algorithms for image identification and classification, synthetic data can also be used to develop algorithms for artifact correction.

Basic principles

synthetic data: partly or completely artificial data, often produced by generative adversarial networks (GANs)
augmented data: derived from real images by applying minor, realistic transformations (such as translation, flipping, rotation, or the addition of noise) in order to increase the diversity of the training set
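
The transformations listed for augmented data can be illustrated with a minimal sketch. It assumes a 2D square image held as a NumPy array; the function name `augment` and the parameters `max_shift` and `noise_sd` are hypothetical and chosen only for illustration. Rotation is restricted to multiples of 90 degrees to keep the example dependency-free, so it is coarser than the minor rotations typically used in practice, and flipping may not be appropriate where laterality matters.

```python
import numpy as np


def augment(image, rng, max_shift=4, noise_sd=0.01):
    """Return one randomly transformed copy of a 2D square image array.

    Illustrative only: the parameter names and defaults are assumptions,
    not values taken from any published pipeline.
    """
    out = np.asarray(image, dtype=np.float32)

    # Flipping: mirror left-right with probability 0.5.
    if rng.random() < 0.5:
        out = np.fliplr(out)

    # Rotation: a random multiple of 90 degrees (no interpolation needed).
    # Small-angle rotation would require an interpolating routine instead.
    out = np.rot90(out, k=int(rng.integers(0, 4)))

    # Translation: shift by a few pixels. np.roll wraps pixels around the
    # border; a real pipeline might pad with background instead.
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))

    # Noise: add zero-mean Gaussian noise with standard deviation noise_sd.
    out = out + rng.normal(0.0, noise_sd, size=out.shape)

    return out.astype(np.float32)


# Usage: generate eight augmented variants of a single image.
rng = np.random.default_rng(0)
image = rng.random((64, 64))  # random stand-in for a real image
augmented = [augment(image, rng) for _ in range(8)]
```

Each call returns a new array of the same shape as the input, so the variants can be added to a training set alongside the original image, which is left unmodified.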

Criticisms

Criticisms of the use of augmented, and especially synthetic, data in medical AI include the potential amplification of statistical biases and a general lack of research on their consequences. Nonetheless, data augmentation is used routinely in the development of AI applications for radiology.