Dimensionality reduction is the process of combining the information from a large number of features into a smaller number of features, either to reduce computational cost or to visualize the data.
Achieving accurate results often requires many features. For example, when images are analyzed pixel by pixel, each pixel is a feature: even a modest 1-megapixel image (1000 x 1000 pixels) yields one million features. More features bring greater computational cost, so it is important to reduce their number to maintain reasonable speed. Simply discarding features discards valuable information; techniques such as principal component analysis instead reduce the number of features of a dataset while preserving most of the information.
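As a minimal sketch of this idea, the example below (assuming scikit-learn is available; the toy data and variance threshold are illustrative choices, not from the original text) uses principal component analysis to compress 50 correlated features into the few components that explain most of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy dataset: 100 samples whose 50 observed features are mixtures
# of only 3 underlying factors plus a little noise (illustrative choice)
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 3))
mixing = rng.normal(size=(3, 50))
X = base @ mixing + 0.01 * rng.normal(size=(100, 50))

# Keep just enough principal components to explain 95% of the variance
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print(X.shape)          # 50 original features
print(X_reduced.shape)  # far fewer features, most information preserved
```

Because the 50 features are driven by only a few underlying factors, PCA can shrink the feature count dramatically while retaining almost all of the dataset's variance.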
Dimensionality reduction is also relevant for data visualization: to visualize how an algorithm behaves, the data must be reduced to 2 or 3 dimensions so it can be plotted.
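A short sketch of the visualization use case (again assuming scikit-learn; the data here is synthetic and purely illustrative): projecting high-dimensional data onto its first two principal components yields coordinates that can be fed directly to a 2-D scatter plot.

```python
import numpy as np
from sklearn.decomposition import PCA

# 200 samples with 20 features -- too many dimensions to plot directly
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))

# Project onto the first two principal components for plotting
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape)

# X_2d[:, 0] and X_2d[:, 1] can now serve as x and y coordinates, e.g.
# plt.scatter(X_2d[:, 0], X_2d[:, 1])
```

The same two-component projection works for inspecting clusters, decision boundaries, or learned embeddings that originally live in much higher-dimensional spaces.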
Related Radiopaedia articles
- artificial intelligence (AI)
- imaging data sets
- computer-aided diagnosis (CAD)
- natural language processing
- machine learning (overview)
- machine learning processes
- machine learning models
- visualizing and understanding neural networks
- common data preparation/preprocessing steps
- DICOM to bitmap conversion
- dimensionality reduction
- principal component analysis
- training, testing and validation datasets
- loss function
- optimization algorithms
- linear and quadratic
- batch normalization
- rule-based expert systems