Stochastic gradient descent is an optimization algorithm which improves the efficiency of the gradient descent algorithm. Similar to batch gradient descent, stochastic gradient descent performs a series of steps to minimize a cost function. Unlike batch gradient descent, which is computationally expensive to run on large data sets, stochastic gradient descent is able to take smaller steps to be more efficient while achieving the same result.
After randomization of the data set, stochastic gradient descent performs gradient descent based on one example, and start to change the cost function. This takes less computational power compared to the batch gradient descent, which iterates through all the examples in a data set before aiming to reduce the cost function.
However in stochastic gradient descent, as one example is processed per iteration, thus there is no guarantee that the cost function reduces with every step. Over time, the general direction of the stochastic gradient descent will converge to close to the minimum. Stochastic gradient descent does not always converge to the minimum of the cost function, instead, it will continuously circulate the minimum. With large data sets with millions of examples, and after a reasonable amount of iterations, the value of the cost function will be extremely close to the minimum that any differences will be negligible.
- 1. NG A. "Machine Learning | Coursera". Coursera, 2018. [Link].
Related Radiopaedia articles
- artificial intelligence (AI)
- imaging data sets
- computer-aided diagnosis (CAD)
- natural language processing
machine learning (overview)
- machine learning processes
- machine learning models
- visualizing and understanding neural networks
- common data preparation/preprocessing steps
- DICOM to bitmap conversion
- dimensionality reduction
- principal component analysis
- training, testing and validation datasets
- loss function
- optimization algorithms
- linear and quadratic
- batch normalization
- rule-based expert systems