Overfitting

Last revised by Andrew Murphy on 16 Apr 2021

Overfitting is a problem in machine learning that introduces errors based on noise and meaningless data into prediction or classification. It tends to occur when the training data set is of insufficient size, or when it includes parameters and/or unrelated features that are non-randomly correlated with the feature of interest. For example, an algorithm trained to read chest x-rays may correlate the presence or absence of a side marker with the presence or absence of pathology 1.
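
The following is a minimal sketch (not from the original article) of how such a spurious correlation can dominate a model. A hypothetical "side_marker" feature is made to track the label in the training data but not in the test data, and a flexible classifier latches onto it.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def make_data(n, spurious_correlation):
    # One genuinely informative feature plus a "side marker" flag.
    signal = rng.normal(size=n)
    labels = (signal + 0.5 * rng.normal(size=n) > 0).astype(int)
    if spurious_correlation:
        side_marker = labels                      # marker tracks the label
    else:
        side_marker = rng.integers(0, 2, size=n)  # marker is unrelated
    return np.column_stack([signal, side_marker]), labels

X_train, y_train = make_data(500, spurious_correlation=True)
X_test, y_test = make_data(500, spurious_correlation=False)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # near perfect
print("test accuracy: ", model.score(X_test, y_test))    # notably lower
```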

Strictly speaking, overfitting refers to fitting a curve (classically, a polynomial) to data points such that the fitted model is more complex than the true underlying relationship. In the context of neural networks, misclassifications that arise because the network has latched onto irrelevant parameters are likewise referred to as examples of overfitting.
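
A minimal sketch of the classic polynomial example (assuming noisy 1-D data and NumPy only, not taken from the original article): a high-degree fit typically tracks the noise in the training points and generalizes worse than a simpler model.

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 15)
x_test = np.linspace(0, 1, 200)
true_fn = lambda x: np.sin(2 * np.pi * x)
y_train = true_fn(x_train) + 0.2 * rng.normal(size=x_train.size)
y_test = true_fn(x_test)

for degree in (3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # The high-degree fit usually shows a much lower training error
    # but a higher test error than the low-degree fit.
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```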

When we analyze a machine learning model, overfitting can be identified by looking at the model's learning curves and observing high performance on the training set with significantly lower performance on the validation set. In essence, this means that the neural network memorizes the training samples instead of learning the underlying patterns, i.e. it struggles to generalize.
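
Below is a minimal sketch of that check (assuming scikit-learn and a synthetic dataset, neither of which is from the original article): a large gap between training and validation performance suggests the model is memorizing rather than generalizing.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=20, n_informative=4,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

# An unconstrained tree can fit the training set perfectly.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("training accuracy:  ", tree.score(X_train, y_train))  # ~1.0
print("validation accuracy:", tree.score(X_val, y_val))      # clearly lower
```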

There are many techniques that can be used to mitigate overfitting, such as increasing the size of the training data set, data augmentation, regularization (e.g. weight penalties), dropout, and early stopping.
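
As a minimal sketch (scikit-learn assumed, same kind of synthetic data as the previous example), two of these mitigations are shown below: restricting model capacity as a form of regularization, and a validation-based early-stopping rule.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=20, n_informative=4,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

# Regularization by limiting capacity: a shallower tree fits the training
# data less perfectly but tends to generalize better.
pruned = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
print("pruned tree train/val:", pruned.score(X_train, y_train),
      pruned.score(X_val, y_val))

# Early stopping: training halts once the internal validation score
# stops improving.
mlp = MLPClassifier(hidden_layer_sizes=(64,), early_stopping=True,
                    random_state=0, max_iter=1000).fit(X_train, y_train)
print("early-stopped MLP train/val:", mlp.score(X_train, y_train),
      mlp.score(X_val, y_val))
```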

It should be noted that some degree of overfitting is common even in effective models; it can usually only be minimized, not eliminated entirely.
