Overfitting

Last revised by Andrew Murphy on 16 Apr 2021

Overfitting is a problem in machine learning that introduces errors based on noise and meaningless data into prediction or classification. It tends to occur when the training data set is of insufficient size, or when it includes parameters and/or unrelated features that are non-randomly correlated with the feature of interest. For example, an algorithm trained to read chest x-rays may correlate the presence or absence of a side marker with the presence or absence of pathology 1.
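
The following is a minimal sketch (not from the original article) of how such a spurious correlation can dominate a model. A hypothetical "side_marker" feature is made to track the label in the training data but not in the test data, and a flexible classifier latches onto it.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def make_data(n, spurious_correlation):
    # One genuinely informative feature plus a "side marker" flag.
    signal = rng.normal(size=n)
    labels = (signal + 0.5 * rng.normal(size=n) > 0).astype(int)
    if spurious_correlation:
        side_marker = labels                      # marker tracks the label
    else:
        side_marker = rng.integers(0, 2, size=n)  # marker is unrelated
    return np.column_stack([signal, side_marker]), labels

X_train, y_train = make_data(500, spurious_correlation=True)
X_test, y_test = make_data(500, spurious_correlation=False)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # near perfect
print("test accuracy: ", model.score(X_test, y_test))    # notably lower
```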

Strictly speaking, overfitting refers to fitting a curve (classically, a polynomial) to data points such that the fitted model is more complex than the true underlying relationship. In the context of neural networks, misclassifications that arise because the network has latched onto irrelevant parameters are likewise referred to as examples of overfitting.
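
A minimal sketch of the classic polynomial example (assuming noisy 1-D data and NumPy only, not taken from the original article): a high-degree fit typically tracks the noise in the training points and generalizes worse than a simpler model.

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 15)
x_test = np.linspace(0, 1, 200)
true_fn = lambda x: np.sin(2 * np.pi * x)
y_train = true_fn(x_train) + 0.2 * rng.normal(size=x_train.size)
y_test = true_fn(x_test)

for degree in (3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # The high-degree fit usually shows a much lower training error
    # but a higher test error than the low-degree fit.
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```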

When we analyze a machine learning model, overfitting can be identified by looking at the model's learning curves and observing high performance on the training set with significantly lower performance on the validation set. In essence, this means that the neural network memorizes the training samples instead of learning the underlying patterns, i.e. it struggles to generalize.
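
Below is a minimal sketch of that check (assuming scikit-learn and a synthetic dataset, neither of which is from the original article): a large gap between training and validation performance suggests the model is memorizing rather than generalizing.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=20, n_informative=4,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

# An unconstrained tree can fit the training set perfectly.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("training accuracy:  ", tree.score(X_train, y_train))  # ~1.0
print("validation accuracy:", tree.score(X_val, y_val))      # clearly lower
```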

There are many techniques that can be used to mitigate overfitting, such as increasing the size of the training data set, data augmentation, regularization (e.g. weight penalties), dropout, and early stopping.
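
As a minimal sketch (scikit-learn assumed, same kind of synthetic data as the previous example), two of these mitigations are shown below: restricting model capacity as a form of regularization, and a validation-based early-stopping rule.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=20, n_informative=4,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

# Regularization by limiting capacity: a shallower tree fits the training
# data less perfectly but tends to generalize better.
pruned = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
print("pruned tree train/val:", pruned.score(X_train, y_train),
      pruned.score(X_val, y_val))

# Early stopping: training halts once the internal validation score
# stops improving.
mlp = MLPClassifier(hidden_layer_sizes=(64,), early_stopping=True,
                    random_state=0, max_iter=1000).fit(X_train, y_train)
print("early-stopped MLP train/val:", mlp.score(X_train, y_train),
      mlp.score(X_val, y_val))
```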

It should be noted that some degree of overfitting is common even in effective models; it can usually only be minimized, not eliminated entirely.
