Classification Algorithms in Machine Learning.

In this article, we will focus on one of the most important topics in artificial intelligence: classification. It is used in a wide range of projects, and many machine learning systems rely on it to assign items to categories.

A classifier in machine learning is an algorithm that automatically sorts data into one or more categories (classes). One of the most common examples is an email filter that scans each message and labels it as spam or not spam.

A classification algorithm is a supervised learning technique that assigns a new observation to a category based on training data. In classification, the program learns from a given set of labeled observations and then assigns each new observation to one of a number of classes or groups, such as yes or no, 0 or 1, or spam or not spam.

In contrast to regression, the output variable in classification is a class rather than a numeric value, for example “green or blue” or “fruit or animal”. Because classification is a supervised learning technique, it requires labeled input data, meaning that each input comes with its corresponding output.

A classification algorithm learns a function f that maps an input variable (x) to a discrete output (y): y = f(x), where y is a categorical output.
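
To make y = f(x) concrete, here is a minimal Python sketch of a hand-written classifier. The function name, the word list, and the threshold of two matches are all assumptions made for illustration; a real classifier would learn its rule from labeled data rather than have it hard-coded.

```python
# Hypothetical rule-based f(x): maps an email's text (x) to a category (y).
SPAM_WORDS = {"winner", "free", "prize"}  # assumed word list, illustration only


def classify_email(text: str) -> str:
    """Return a categorical label (a class), not a numeric value."""
    words = text.lower().split()
    hits = sum(1 for word in words if word in SPAM_WORDS)
    # Assumed rule: two or more suspicious words means "spam".
    return "spam" if hits >= 2 else "not spam"


print(classify_email("You are a winner claim your free prize"))
print(classify_email("Meeting moved to 3pm"))
```

The output is always one of a fixed set of labels, which is what separates classification from regression.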

Binary Classifier: A classifier for a problem with only two possible outcomes, such as spam or not spam, is called a binary classifier.

Multi-class Classifier: A classifier for a problem with more than two possible outcomes is called a multi-class classifier.

Depending on your needs and data, you should be familiar with the following five widely used classification algorithms:

  1. Decision tree
  2. Naive Bayes classifier
  3. K-Nearest Neighbors
  4. Support vector machines
  5. Artificial neural networks

Lazy Learners: A lazy learner simply stores the training dataset and waits until it receives test data. Classification is then done on the basis of the most similar examples stored in the training dataset. Lazy learners take less time to train but more time to make predictions.

Example: K-NN algorithm, case-based reasoning.
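
As a rough illustration of lazy learning, the sketch below implements a tiny K-NN classifier using only the Python standard library. The class name, the toy points, and the choice of k = 3 are assumptions made for this example. Note that fit() only stores the data, while predict_one() does all the distance computations.

```python
import math
from collections import Counter


class KNNClassifier:
    """Minimal k-nearest-neighbors sketch (illustrative, not optimized)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # Lazy: no model is built here, the training data is just stored.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict_one(self, x):
        # All the work happens at prediction time: measure the distance
        # to every stored point, then let the k closest ones vote.
        neighbors = sorted(
            zip(self.X, self.y),
            key=lambda pair: math.dist(x, pair[0]),
        )
        votes = Counter(label for _, label in neighbors[: self.k])
        return votes.most_common(1)[0][0]


# Toy data (assumed): two clusters labeled "blue" and "green".
X_train = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
y_train = ["blue", "blue", "blue", "green", "green", "green"]

model = KNNClassifier(k=3).fit(X_train, y_train)
print(model.predict_one((2.0, 2.0)))
print(model.predict_one((6.0, 7.0)))
```

Because every prediction scans the whole training set, prediction cost grows with the amount of stored data, which is the trade-off described above.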

Eager Learners: An eager learner builds a classification model from the training dataset before it receives any test data. Unlike lazy learners, eager learners take more time to train and less time to predict.

Example: Decision trees, Naïve Bayes, ANN.
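
For contrast, here is a minimal sketch of an eager learner: a decision stump, which is a decision tree of depth 1 on a single numeric feature. The class name and the toy data are assumptions for illustration, and the sketch assumes the training data contains at least two distinct feature values. All the searching happens in fit(); predict() is a single comparison.

```python
class DecisionStump:
    """Depth-1 decision tree on one numeric feature with 0/1 labels (sketch)."""

    def fit(self, xs, ys):
        # Eager: search for the best threshold now, during training.
        best = None
        values = sorted(set(xs))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            # Try both ways of assigning labels to the two sides.
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if x <= threshold else right) != y
                    for x, y in zip(xs, ys)
                )
                if best is None or errors < best[0]:
                    best = (errors, threshold, left, right)
        _, self.threshold, self.left, self.right = best
        return self

    def predict(self, x):
        # The training data is no longer needed: one comparison is enough.
        return self.left if x <= self.threshold else self.right


# Toy data (assumed): small values belong to class 0, large values to class 1.
stump = DecisionStump().fit([1.0, 2.0, 3.0, 7.0, 8.0, 9.0], [0, 0, 0, 1, 1, 1])
print(stump.threshold, stump.predict(2.5), stump.predict(8.2))
```

After training, only the learned threshold and the two labels are kept, so prediction is fast regardless of how much data was used to train.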

Classification algorithms can be further divided into two main categories:

  • Linear Models
      • Logistic Regression
      • Support Vector Machines
  • Non-linear Models
      • K-Nearest Neighbours
      • Kernel SVM
      • Naïve Bayes
      • Decision Tree Classification
      • Random Forest Classification
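
One way to see the difference between the two families is the sketch below. It is a toy demonstration with assumed data (the logical AND and XOR problems, which are not from this article) and a brute-force grid of candidate weights. A linear model separates classes with a straight line, so it can solve AND but no straight line can solve XOR.

```python
from itertools import product


def linear_predict(w1, w2, b, x1, x2):
    # A linear model draws a straight boundary: w1*x1 + w2*x2 + b = 0.
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0


def separable_on_grid(points, labels):
    """Try a small grid of weights; True if any line classifies every point."""
    grid = [i / 2 for i in range(-4, 5)]  # -2.0, -1.5, ..., 2.0 (assumed range)
    for w1, w2, b in product(grid, repeat=3):
        if all(
            linear_predict(w1, w2, b, x1, x2) == y
            for (x1, x2), y in zip(points, labels)
        ):
            return True
    return False


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]  # linearly separable
xor_labels = [0, 1, 1, 0]  # not linearly separable

print(separable_on_grid(points, and_labels))  # True
print(separable_on_grid(points, xor_labels))  # False
```

The grid search only checks a finite set of lines, but for XOR it can be shown that no line works at all, which is why non-linear models such as decision trees or kernel SVMs are needed for data of that shape.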

Once a model has been built, whether it is a classification or a regression model, it is necessary to evaluate its performance. To evaluate a classification model, we have the following methods:

1. Log loss or cross-entropy loss:

  • It is used to evaluate the performance of a classifier whose output is a probability value between 0 and 1.
  • For a good binary classification model, the log loss value should be close to 0.
  • The log loss value increases as the predicted probability deviates from the actual label.
  • The lower the log loss, the more accurate the model's predicted probabilities.
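  • For binary classification, the log loss of a single observation is -(y log(p) + (1 - y) log(1 - p)), where y is the actual label (0 or 1) and p is the predicted probability of class 1.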
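
The points above can be sketched in a few lines of Python using the standard binary cross-entropy definition, averaged over all observations. The function name, the clipping constant eps, and the sample probabilities are assumptions chosen for illustration.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Average binary cross-entropy; y_true holds 0/1, p_pred holds P(class 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip p away from exactly 0 or 1 so that log() stays finite.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]
confident_and_right = [0.9, 0.1, 0.8, 0.2]  # assumed predictions
leaning_wrong = [0.4, 0.6, 0.3, 0.7]        # assumed predictions

print(log_loss(y_true, confident_and_right))  # smaller value
print(log_loss(y_true, leaning_wrong))        # larger value
```

Predictions that put high probability on the correct class give a loss near 0, while predictions that lean toward the wrong class are penalized more heavily.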

2. Confusion Matrix:

  • The confusion matrix is a table that describes the performance of the model.
  • It is also known as the error matrix.
  • It summarizes the model's predictions, showing the number of correct and incorrect predictions for each class.
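
For a binary problem, the confusion matrix has four cells. The sketch below counts them in plain Python; the function name, the dictionary layout, and the sample labels are assumptions for illustration, with 1 treated as the positive class.

```python
def confusion_matrix(y_true, y_pred):
    """Count the four outcomes of a binary classifier (1 = positive class)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1  # correctly predicted positive
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1  # predicted positive, actually negative
        elif actual == 1 and predicted == 0:
            counts["FN"] += 1  # predicted negative, actually positive
        else:
            counts["TN"] += 1  # correctly predicted negative
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # assumed actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # assumed model predictions

cm = confusion_matrix(y_true, y_pred)
accuracy = (cm["TP"] + cm["TN"]) / len(y_true)
print(cm)        # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
print(accuracy)  # 0.75
```

Correct predictions are the TP and TN cells, incorrect ones are FP and FN, so accuracy and other metrics can be read directly from the table.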

3. AUC-ROC curve:

  • ROC stands for Receiver Operating Characteristic, and AUC stands for Area Under the Curve.
  • The ROC curve is a graph that shows the performance of the classification model at different thresholds.
  • The ROC curve is defined for binary problems; for a multi-class model, it is typically drawn once per class, treating that class as positive and all others as negative.
  • The ROC curve is plotted with TPR and FPR, where TPR (True Positive Rate) is on the Y-axis and FPR (False Positive Rate) is on the X-axis.
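
As a rough sketch of how the curve is built, the code below sweeps a threshold over the predicted scores, records one (FPR, TPR) point per threshold, and estimates the area with the trapezoidal rule. The function names and the sample scores are assumptions for illustration, and the sketch assumes both classes are present in y_true.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [1, 1, 0, 1, 0, 0]              # assumed actual labels
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]  # assumed predicted probabilities

curve = roc_points(y_true, scores)
print(curve)
print(auc(curve))  # roughly 0.89: 8 of 9 positive/negative pairs are ranked correctly
```

An AUC of 1.0 means every positive example is scored above every negative one, while 0.5 is what random scoring would give on average.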

This article introduced classification in machine learning, the main algorithms used for it, and the metrics for checking that a finished model is fit for use.

Mohamed B Bakrey. Data Scientist.
