Dimensionality reduction

Last revised by Andrew Murphy on 23 Jul 2019

Dimensionality reduction is the process of combining the information from a large number of features to create a smaller number of features, either to reduce computational cost or to make the data easier to visualize.

Achieving an accurate result often requires many features. For example, when images are analyzed pixel by pixel, even a simple 1 megapixel image (1000 x 1000) produces 1 million features. More features mean a greater computational cost, so it is important to reduce the number of features to maintain reasonable computational speed. Simply eliminating features discards valuable information; techniques such as principal component analysis instead reduce the number of features of a dataset while preserving most of the information.
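As a rough illustration of how principal component analysis can shrink a dataset while keeping most of its information, the following is a minimal sketch using NumPy. The function name `pca_reduce`, the synthetic data, and the choice of 3 components are assumptions made for this example only and do not come from the article.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal components.

    Returns (X_reduced, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    # Center each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Thin SVD: the rows of Vt are the principal directions,
    # ordered from largest to smallest singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    X_reduced = X_centered @ components.T
    # Squared singular values are proportional to the variance
    # captured along each principal direction.
    variance = S ** 2
    explained_ratio = variance[:n_components] / variance.sum()
    return X_reduced, explained_ratio


# Synthetic data (an assumption for illustration): 200 samples with 50
# features that are driven by only 3 underlying factors plus slight noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

X_reduced, ratio = pca_reduce(X, n_components=3)
print(X_reduced.shape)  # (200, 3): 50 features reduced to 3
print(ratio.sum())      # fraction of the total variance that is retained
```

In this constructed case the 50 features are highly redundant, so 3 components retain nearly all of the variance. Real data will usually need more components to keep a comparable share, and the retained fraction is a common way to choose how many to keep.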

Dimensionality reduction is also relevant for data visualization. To visualize how an algorithm works, the data needs to be reduced to 2 or 3 dimensions so that it can be plotted.
