Detailed Explanation of the Interview Questions in Machine Learning and Data Science

Mohamed Bakrey Mahmoud
10 min read · Feb 3, 2023


Introduction

Welcome, fellow data scientists and machine learning engineers. This article collects the most important questions you are likely to face when preparing for a job interview. The questions range from easy to difficult, and each one is explained in a way that should leave you with a clear, lasting understanding. Let's get into the content.

The questions we’ve included cover a wide range of topics, including:

  • How to Prepare for the interview?
  • How to Prepare for Coding Interviews
  • Bias and variance.
  • Machine Learning.

Each of these points is explained in a simple, straightforward way.

Q1. How to Prepare for the interview?

Know your data structures:

Here are the most important data structures to review:

1. List: one of the most fundamental data structures. It provides:

  • An ordered arrangement of the elements stored in it.
  • The position of each element is given by the index that accompanies it.
  • Easy access to items in any order (see the sketch below).
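A minimal Python sketch of these properties (the values are made up for illustration):

```python
# A Python list keeps elements in the order they were inserted and lets you
# access any position directly by its index.
items = [10, 20, 30]
items.append(40)   # add to the end
print(items[0])    # access by index -> 10
print(items[-1])   # last element -> 40
items[1] = 25      # overwrite an element by position
print(items)       # [10, 25, 30, 40]
```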

2. Linked List:

Another important data structure. Its key properties are:

  • The order of the elements is not determined by their physical placement in memory.
  • Consecutive elements of the linked list are not necessarily adjacent to each other in memory.

As the figure above shows, each value is paired with a pointer to the next element. Traversing the linked list therefore means following these pointers, visiting one element at a time (a sketch follows below).
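Here is a small sketch of a singly linked list in Python (the Node class and the values are illustrative, not from any specific library):

```python
class Node:
    """A singly linked list node: a value plus a pointer to the next node."""
    def __init__(self, value):
        self.value = value
        self.next = None

# Build 1 -> 2 -> 3; the nodes need not sit next to each other in memory.
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)

# Traversal follows the pointers, visiting one element at a time.
node = head
while node is not None:
    print(node.value)
    node = node.next
```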

3. Stack:

The stack is one of the most widely used data structures: a sequential structure that maintains the order in which elements are inserted. It works as follows:

  • Last In, First Out (LIFO) order: elements can only be accessed in the reverse order of their insertion.
  • The element inserted last is the first to be removed from the stack.
  • push() adds an element to the top of the stack, while pop() removes the element at the top (see the sketch below).
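As a quick sketch, a plain Python list already behaves as a stack:

```python
# append() pushes onto the top of the stack, pop() removes from the top (LIFO).
stack = []
stack.append('a')   # push
stack.append('b')
stack.append('c')
print(stack.pop())  # 'c' -- the last element in is the first out
print(stack.pop())  # 'b'
```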

4. Queue:

The queue is another widely used sequential structure. It maintains the order of elements in First In, First Out (FIFO) fashion: the element inserted first is the first one to be removed.
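A minimal sketch using Python's collections.deque, which gives O(1) operations at both ends:

```python
from collections import deque

# append() enqueues at the back, popleft() dequeues from the front (FIFO).
queue = deque()
queue.append('first')
queue.append('second')
queue.append('third')
print(queue.popleft())  # 'first' -- the first element in is the first out
print(queue.popleft())  # 'second'
```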

5. Hash Table:

A hash table stores key-value pairs. Each key is passed through a hash function, which produces a unique address where the corresponding value is stored; the key is then used to access its value directly. When the hash function produces the same physical address for two different keys, this is called a collision.
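In Python, the built-in dict is a hash table; a small sketch (the keys and values are made up for illustration):

```python
# Each key is hashed to find the slot where its value is stored.
ages = {"alice": 30, "bob": 25}
ages["carol"] = 41      # insert a new key/value pair
print(ages["bob"])      # average O(1) lookup by key -> 25

# Two different keys can land in the same slot (a collision); the dict
# resolves this internally, so lookups still return the correct value.
print(hash("alice") % 8, hash("bob") % 8)  # slot indices may coincide
```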

6. Tree:

A tree is a hierarchical structure. The node at the top of the tree is called the root node. A parent node is any node that has at least one child, and a child node is any node that descends from a parent. The root can never be the child of another node.
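A minimal sketch of a tree node in Python (the class and node names are illustrative):

```python
class TreeNode:
    """A tree node: a value plus references to its child nodes."""
    def __init__(self, value):
        self.value = value
        self.children = []

# The root has no parent; every other node has exactly one parent.
root = TreeNode("root")
child_a = TreeNode("A")
child_b = TreeNode("B")
root.children.extend([child_a, child_b])
child_a.children.append(TreeNode("A1"))  # child of A, grandchild of the root
```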

7. Graph:

A graph is a pair of sets (V, E), where V is the set of all vertices and E is the set of all edges. The neighbours of a node are the vertices connected to it by an edge.
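A common way to code this is an adjacency list; a small Python sketch with made-up vertices:

```python
# Each vertex maps to the set of its neighbours (the vertices it shares an edge with).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(graph["C"])  # neighbours of C -> {'A', 'B', 'D'}
```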

Q2. How to Prepare for Coding Interviews?

There are a few important parts to preparing for a coding interview:

A study schedule is one of the things that will set you apart at this stage: plan what you will code and when. The accompanying diagram illustrates this preparation process.

Reviewing data structures is essential here: know the time and space complexity of lists/arrays, linked lists, hash tables/dictionaries, trees, graphs, heaps, and queues.
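As a quick illustration of why these complexities matter, the sketch below (with made-up sizes) compares membership testing in a list, which is O(n), with a set, which is O(1) on average:

```python
import timeit

data_list = list(range(100_000))
data_set = set(data_list)

# Searching a list scans element by element; a set jumps straight to the
# hashed slot, so the second timing should be far smaller.
print(timeit.timeit(lambda: 99_999 in data_list, number=1_000))
print(timeit.timeit(lambda: 99_999 in data_set, number=1_000))
```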

Q3. What is Bias?

Bias is the error that arises from the difference between the model's average prediction and the ground truth. It tells us how far off the model's predictions are, on average, from the true values.

Q4. What is Variance?

Variance describes how much the model's predictions vary across the dataset we are working with. It tells you how sensitive the estimated function is to changes in the data it is trained on.

Q5. What happens if high bias occurs?

The model becomes overly simplified and does not fit the data well (underfitting), producing a large error on both the training and the test data.

Q6. What happens if high variance occurs?

The model becomes overly complex and overfits heavily: the error on the training data is small, but the error on the test data is large, because the model starts modeling the noise in the input.

The accompanying graphic illustrates this difference and how it arises.

Bias-variance trade-off

  • Increasing the bias (usually, though not always) decreases the variance, and vice versa.
  • Error = Bias² + Variance + Irreducible Error
  • The best model is the one for which this total error is minimized.
  • That requires a compromise between bias and variance (see the sketch after this list).
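A minimal sketch of the trade-off, using made-up data from a sine curve and plain NumPy polynomial fits:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples from a sine curve, split into train and test sets
# (illustrative data, not from the article).
x_train = np.sort(rng.uniform(0, 1, 20))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 20)
x_test = np.sort(rng.uniform(0, 1, 20))
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.2, 20)

for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)  # fit a polynomial of this degree
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # Degree 1 tends to underfit (high bias: both errors large); degree 12
    # tends to overfit (high variance: small train error, larger test error);
    # a middle degree usually sits near the sweet spot of the trade-off.
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Increasing the polynomial degree trades bias for variance; the degree with the lowest test error is the compromise described above.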

Imbalanced data in classification:

This problem significantly affects both the data and the results of any model trained on it. The model gravitates toward the labels (classes) that appear most often in the data and learns them disproportionately well, which skews the results at test time: predictions are biased toward the majority class.

In the picture shown above, we can clearly see that one label is far more frequent than the other.

An important note: accuracy alone does not always give a correct picture of the trained model.

Possible solutions:

First, look at the data you are working with and see how the classes are distributed. Then you can apply one of the following remedies:

  • Duplicate (oversample) the minority-class samples so that the class counts become comparable.
  • Generate synthetic data; with image data, for example, you can crop the images or add noise to create new samples.
  • Use a modified loss, adjusting it so that misclassifying the smaller class incurs a larger error.
  • Change the algorithm altogether (see the sketch below for one of these options).
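As one concrete sketch of the "modified loss" option, scikit-learn's class_weight="balanced" reweights the loss so mistakes on the rare class cost more; the dataset below is synthetic, purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# A 9:1 imbalanced binary classification problem.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" rescales the loss so misclassifying the minority
# class incurs a larger penalty.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_tr, y_tr)

# Per-class precision and recall are far more informative than accuracy here.
print(classification_report(y_te, model.predict(X_te)))
```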

Q7. Clarifying Principal Component Analysis (PCA)

What is PCA?

Given the dataset you are working with, PCA looks for a new set of feature vectors (directions) along which the data is most spread out. The data points have maximum variance along the first feature vector and minimum variance along the last one. The variance of the data points in the direction of a feature vector can be seen as a measure of the information carried by that direction.

Explanation steps:

First, standardize (center) the data points. Then compute the covariance matrix from the given points, perform an eigenvalue analysis of that covariance matrix, and finally sort the eigenvalues together with their corresponding eigenvectors.

Dimensionality Reduction with PCA

Keep the first m out of the n feature vectors ranked by PCA. These m vectors are the best choice of m directions: no other set of m vectors preserves more of the information (variance) in the given dataset.
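A minimal sketch with scikit-learn, keeping m = 2 components out of n = 5 made-up correlated features:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 200 samples with 5 correlated features (synthetic, for illustration).
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

# Keep the first m = 2 principal components, the directions of maximum variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)        # shape (200, 2)
print(pca.explained_variance_ratio_)    # share of the variance kept by each component
```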

Explanation:

Figure 1: Data points with the feature vectors as the x- and y-axes

Figure 2: The coordinate system is rotated so that the standard deviation is maximized along one axis (new feature #2)

Figure 3: The feature vector along which the data points have minimum standard deviation (new feature #1) is removed, and the data is shown along new feature #2 only

Q8. Bayes' Theorem and Classifier

What is Bayes' Theorem?

It is a theorem that describes the probability of occurrence of an event based on prior knowledge of the conditions that may be related to that event.

This is the Bayes' theorem equation:

P(A|B) = P(B|A) · P(A) / P(B)

It shows how to compute the probability of one event occurring when we know that another event has occurred.

Example

• Probability of fire: P(F) = 1%

• Probability of smoke: P(S) = 10%

• Probability of smoke given there is a fire: P(S|F) = 90%

• What is the probability that there is a fire given that we see smoke, P(F|S)?
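Plugging the numbers above into Bayes' theorem gives the answer:

P(F|S) = P(S|F) · P(F) / P(S) = (0.9 × 0.01) / 0.1 = 0.09

So there is a 9% chance of a fire given that we see smoke.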

Here we consider maximum a posteriori (MAP) estimation.

The MAP estimate of the random variable y, given that we observed iid samples (x1, x2, x3, …), is

y_MAP = argmax over y of P(y) · P(x1|y) · P(x2|y) · P(x3|y) · …

In other words, we try to incorporate our prior knowledge into the estimate.

Note: y_MAP is the value of y that maximizes the product of the prior and the likelihood.

We also consider the maximum likelihood estimate (MLE). The MLE of the random variable y, given iid observations (x1, x2, x3, …), is

y_MLE = argmax over y of P(x1|y) · P(x2|y) · P(x3|y) · …

Here we assume that we have no prior knowledge about the quantity being estimated.

Note: y_MLE maximizes only the likelihood, and MLE is a special case of MAP in which the prior is uniform (all values are equally likely).
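A small sketch contrasting the two, estimating a coin's probability of heads; the Beta(2, 2) prior and the flip data are illustrative assumptions, not something from the article:

```python
import numpy as np

flips = np.array([1, 1, 1, 0, 1])   # 4 heads, 1 tail (made-up observations)
heads = flips.sum()
tails = len(flips) - heads

# MLE: maximize the likelihood only (equivalent to a uniform prior).
p_mle = heads / len(flips)

# MAP with a Beta(2, 2) prior: the posterior is Beta(heads + 2, tails + 2),
# and its mode pulls the estimate toward 0.5.
a, b = 2, 2
p_map = (heads + a - 1) / (heads + tails + a + b - 2)

print(f"MLE estimate: {p_mle:.2f}")   # 0.80
print(f"MAP estimate: {p_map:.2f}")   # 5/7 ~ 0.71, closer to the prior's 0.5
```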

Q9. What are regression and classification problems?

Regression is a method for modeling the relationship between a dependent variable and one or more independent variables. Its aim is to understand how the variables relate to each other and to predict the value of the dependent variable from new values of the independent variables.
A classification problem, on the other hand, is a type of machine learning problem whose goal is to predict which of a set of discrete categories (classes) a sample belongs to.

Q10. What is regression analysis?

Regression analysis fits a function f(·) to data points yi ≈ f(xi) under some error function, and then makes predictions based on the estimated function and its error.

Here is an explanation of the most common regression types:

1. Linear Regression:

Fits a line that minimizes the sum of squared errors over the data points.

2. Bayesian Regression:

Fits a Gaussian distribution around each prediction rather than a single value, under mean squared error minimization. As the number of data points increases, the distributions narrow and converge toward the point estimates.

3. Polynomial Regression:

Fits a polynomial of order k (with k + 1 unknowns) that minimizes the sum of squared errors over the data points.

4. Logistic Regression:

Fits a line or a polynomial passed through a sigmoid activation, minimizing the binary cross-entropy loss for each data point. The y labels here are binary class labels.
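A short sketch contrasting linear and logistic regression with scikit-learn; the data is made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(100, 1))

# Linear regression: minimizes squared error on a continuous target.
y_cont = 2.0 * x.ravel() + rng.normal(0, 0.5, 100)
lin = LinearRegression().fit(x, y_cont)
print(lin.coef_, lin.intercept_)          # close to slope 2, intercept 0

# Logistic regression: a sigmoid over a linear score, trained with
# binary cross-entropy on 0/1 class labels.
y_bin = (x.ravel() + rng.normal(0, 0.5, 100) > 0).astype(int)
log = LogisticRegression().fit(x, y_bin)
print(log.predict_proba([[1.5]]))         # class probabilities at x = 1.5
```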

Visual Representation:

If we look at the plots, we can see that each of these types differs in the kind of curve it fits, how that fit is constructed, the form of its equation, and so on.

Conclusion

Preparing for the interview is one of the most important stages of your career, and you should be ready for it in every respect: know the basics and be able to explain them in detail. Here are the main takeaways we discussed in today's article:

  • How to prepare for the interview, which is one of the most important points you should be aware of.

  • How to prepare for the practical (coding) part of the interview.

  • The issues that come up when you work with data, such as bias, variance, and class imbalance.

  • Regression and its main types.

I hope you land the job soon. See you in a new article. Thanks, and have a happy day.
