PCA in Machine Learning: Your Complete Guide to the Key Components You Need

Mohamed Bakrey
Oct 5, 2022

Introduction

Principal Component Analysis (PCA) is a powerful statistical method for reducing the number of variables we have to work with, and it is most useful when those variables are highly correlated. PCA has become an essential tool for unsupervised multivariate data analysis and dimensionality reduction, and it has been combined with AI technologies to improve the performance of many applications such as image processing, pattern recognition, classification, and anomaly detection. The objective of this article is to give an overview of principal component analysis (PCA).

What is PCA?

PCA is a very useful statistical technique with many applications in areas such as face recognition and image compression, and it is a common technique for finding patterns in high-dimensional data. The main problem with mining scientific data sets is that the data is often high-dimensional: in many cases, a large number of features represent each object. When the number of dimensions reaches hundreds or even thousands, the computational time of pattern recognition algorithms can become prohibitive.

In the figure above, several points are plotted on a two-dimensional plane, along with two principal components. PC1 is the first principal component: the direction that explains the maximum variance in the data. PC2 is the second principal component, and it is orthogonal to PC1.
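The two-component picture described here can be reproduced with a small sketch. The data below is synthetic and every variable name is an assumption made for illustration, not the article's own data. It finds the two principal directions of a 2-D point cloud with NumPy's SVD and checks that they are perpendicular.

```python
import numpy as np

# Synthetic 2-D points (an assumption for illustration, not the article's data):
# y depends strongly on x, so the cloud is stretched along a diagonal.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + 0.3 * rng.normal(size=200)
points = np.column_stack([x, y])

# Center the points, then take the SVD; the rows of vt are the principal directions,
# ordered from largest to smallest singular value.
centered = points - points.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
pc1, pc2 = vt[0], vt[1]

# Variance captured along each direction, and each direction's share of the total.
variances = singular_values**2 / (len(points) - 1)
explained_ratio = variances / variances.sum()

print("PC1 direction:", pc1)
print("PC2 direction:", pc2)
print("Share of variance (PC1, PC2):", explained_ratio)
print("PC1 . PC2 =", float(pc1 @ pc2))  # dot product of perpendicular vectors is ~0
```

Because the points are stretched along one diagonal, PC1 should carry most of the variance and PC2 the small remainder.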

What are the principal components in PCA?

A principal component is a direction, a straight line through the data, along which the data varies the most. Each principal component has a direction and a magnitude. The principal components are orthogonal (perpendicular) to one another, and projecting the data onto the first few of them maps it into a space of fewer dimensions.
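In symbols, this idea can be sketched as follows. The notation (X, C, w, W_k, mu) is introduced here for illustration and does not come from the article: X is the centered data matrix with n samples and d features, and C is its covariance matrix.

```latex
% Assumed notation: X is the centered n x d data matrix, C = X^T X / (n - 1),
% mu is the vector of feature means, and w_1, ..., w_d are the principal directions.
\[
  w_1 = \arg\max_{\lVert w \rVert = 1} \operatorname{Var}(Xw)
      = \arg\max_{\lVert w \rVert = 1} w^{\top} C \, w ,
  \qquad
  w_i^{\top} w_j = 0 \quad \text{for } i \neq j ,
\]
\[
  z = W_k^{\top} (x - \mu),
  \qquad
  W_k = [\, w_1 \;\cdots\; w_k \,] \in \mathbb{R}^{d \times k},
  \quad k < d .
\]
```

The first line says that the first component is the unit direction of greatest variance and that all components are mutually perpendicular. The second line is the projection of a sample x onto the first k components, which gives its lower-dimensional representation z.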

Now that we understand the basics of PCA, let’s look at the next topic: PCA in machine learning.

The Multiple Applications of PCA in Machine Learning

To begin with, PCA has many uses, including the following:

1. It is used to visualize multidimensional data.

2. It is used to reduce the number of dimensions in healthcare data.

3. It is used to resize images.

4. It is used in finance to analyze stock data and forecast returns.

5. It helps to find patterns in high-dimensional data sets.

What are the steps by which PCA works?

1. Data normalization

Here we standardize the data before applying PCA. This ensures that each feature has mean = 0 and variance = 1, so that features measured on larger scales do not dominate the result.
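A minimal sketch of this step with NumPy, assuming a small made-up table where rows are samples and columns are features (the values and names are hypothetical):

```python
import numpy as np

# Toy data set (hypothetical values): rows are samples, columns are features
# measured on very different scales.
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 72.0, 45.0],
    [180.0, 80.0, 25.0],
    [175.0, 68.0, 38.0],
    [165.0, 59.0, 52.0],
])

# Standardize each column: subtract its mean, then divide by its standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0)  # population standard deviation (ddof=0)
X_std = (X - mean) / std

print(X_std.mean(axis=0))  # each entry is approximately 0
print(X_std.var(axis=0))   # each entry is approximately 1
```

This sketch assumes no column is constant; a constant column has zero standard deviation and would need to be dropped or handled separately before dividing.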

2. Construction of the covariance matrix

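Here we construct a square matrix that expresses how the features in a multidimensional data set vary together. The entry in row i and column j is the covariance between feature i and feature j, and the diagonal holds the variance of each feature.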
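As a sketch of this step, the covariance matrix can be built by hand from centered data and compared against NumPy's `np.cov`. The random data and variable names below are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 100 samples, 3 features.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
X[:, 1] += 0.9 * X[:, 0]  # make feature 1 correlated with feature 0

# Manual construction: center the columns, then average the outer products.
X_centered = X - X.mean(axis=0)
n_samples = X.shape[0]
cov_manual = X_centered.T @ X_centered / (n_samples - 1)

# NumPy equivalent; rowvar=False tells np.cov that columns are the features.
cov_numpy = np.cov(X, rowvar=False)

print(cov_manual.shape)                    # one row and one column per feature
print(np.allclose(cov_manual, cov_numpy))  # the two constructions agree
```

The result is symmetric, and the entry linking features 0 and 1 should be clearly larger than the entries involving the independent feature 2.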

3. Find the eigenvectors and eigenvalues

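Now we can calculate the eigenvectors (as unit vectors) and the eigenvalues of the covariance matrix. An eigenvector is a direction that the covariance matrix only stretches without rotating, and its eigenvalue is the scalar stretch factor. The eigenvectors give the directions of the principal components, and each eigenvalue gives the variance of the data along its eigenvector.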
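The three steps can be put together in a short NumPy sketch. The data is synthetic, the names are hypothetical, and the final projection onto the top `k` eigenvectors is an assumption about how the eigenvectors are typically used for dimensionality reduction; the article itself stops at step 3.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 150 samples, 4 features.
rng = np.random.default_rng(7)
X = rng.normal(size=(150, 4))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=150)  # a nearly redundant feature

# Step 1: standardize. Step 2: covariance matrix.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
cov = np.cov(X_std, rowvar=False)

# Step 3: eigen-decomposition. eigh is meant for symmetric matrices and returns
# eigenvalues in ascending order, so reorder them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]  # column i pairs with eigenvalues[i]

# Check the defining relation cov @ v = lambda * v for the first pair.
v1, lam1 = eigenvectors[:, 0], eigenvalues[0]
print(np.allclose(cov @ v1, lam1 * v1))

# Assumed follow-up step: keep the top k directions and project the data onto them.
k = 2
X_reduced = X_std @ eigenvectors[:, :k]
explained = eigenvalues[:k].sum() / eigenvalues.sum()
print(X_reduced.shape, explained)
```

Choosing `k` is a judgment call; a common heuristic is to keep enough components to reach a chosen share of the total variance.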

Conclusion

Principal component analysis is a widely used unsupervised learning method for performing dimensionality reduction. We hope this article has helped you understand what PCA is and where it is applied. We have looked at the main steps of PCA and how they work.

Mohamed B Mahmoud. Data Scientist.
