
Frequently Asked Interview Questions and Answers on Linear Regression

  • Writer: Vikash Singh
  • Jan 10
  • 5 min read

Updated: Jan 17



Linear regression is a topic that comes up frequently in interviews, especially for freshers and professionals starting out on their machine learning journey.


Though linear regression may seem innocuous, some of the conceptual questions around it can be a bit tricky.


When I mentor data science and machine learning aspirants, I often find gaps in their understanding of the fundamentals.


So, let’s explore some common conceptual questions and answers on linear regression.


1. Multiple Linear Regression Fundamentals


1.1. What distinguishes multiple linear regression from simple linear regression?


a) The number of dependent variables

b) The number of independent variables

c) The use of a different least squares formula


Answer: b) The number of independent variables


1.2. In a multiple linear regression model, if you have ‘p’ predictors, what is ‘p’?


a) The number of coefficients

b) The number of intercepts

c) The number of residuals

d) The number of independent variables


Answer: d) The number of independent variables.

Example: Consider the first five rows of the mpg dataset, where the target variable is mpg and we have 8 predictors, or “p” (see the sketch below).


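Since the original snapshot was an image, here is a minimal sketch of the same idea in Python, assuming seaborn's built-in mpg dataset as a stand-in for the table in the screenshot:

import seaborn as sns

# Load the mpg dataset (assumption: seaborn's copy of the Auto MPG data).
df = sns.load_dataset("mpg")
print(df.head())  # first five rows, as in the snapshot

target = "mpg"
predictors = [col for col in df.columns if col != target]
print("p =", len(predictors), "predictors:", predictors)  # p = 8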


2. Significance and Meaning of Intercepts and Coefficients


2.1. What does the intercept term represent in a linear regression model? 


a) The change in the dependent variable for a one-unit change in the independent variable 

b) The expected value of the dependent variable when all independent variables are zero 

c) The standard error of the coefficients 

d) The sum of the squared residuals


Answer: b) The expected value of the dependent variable when all independent variables are set to zero. Recall the equation of a straight line you learned in school (y = mx + c); the intercept is c.


2.2. How do you interpret the coefficient of an independent variable in a linear regression model? 


a) It represents the p-value of the variable. 

b) It quantifies the change in the dependent variable for a one-unit change in the independent variable, holding all other variables constant. 

c) It indicates the strength of the correlation between the variables. 

d) It represents the variable’s variance.


Answer: b) It quantifies the change in the dependent variable for a one-unit change in the independent variable, holding all other variables constant.
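To make the interpretation of the intercept and coefficients concrete, here is a minimal sketch, assuming statsmodels and the seaborn mpg data; the predictor names are illustrative:

import seaborn as sns
import statsmodels.api as sm

df = sns.load_dataset("mpg").dropna()
X = sm.add_constant(df[["weight", "horsepower"]])  # add_constant adds the intercept term
y = df["mpg"]

model = sm.OLS(y, X).fit()
print(model.params)
# const      -> expected mpg when weight and horsepower are zero (the intercept)
# weight     -> change in mpg for a one-unit increase in weight, holding horsepower constant
# horsepower -> change in mpg for a one-unit increase in horsepower, holding weight constant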


3. Evaluation Metrics — RMSE and R-squared


3.1. What is the main purpose of RMSE (Root Mean Square Error) in linear regression? 


a) To measure the proportion of variance explained by the model 

b) To quantify the change in the dependent variable for a one-unit change in the independent variable 

c) To assess the model’s prediction error 

d) To calculate the confidence interval for the coefficients


Answer: c) To assess the model’s prediction error


3.2. R-squared is a measure that: 


a) Represents the proportion of variance explained by the model 

b) Measures the model’s prediction error 

c) Quantifies the change in the dependent variable for a one-unit change in the independent variable 

d) Evaluates the statistical significance of the coefficients 


Answer: a) Represents the proportion of variance explained by the model. In simple terms, R-squared quantifies the proportion of variance in the dependent variable that can be explained by the independent variables in the model.
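As a hedged illustration of both metrics, here is a minimal sketch, assuming scikit-learn and the seaborn mpg data:

import numpy as np
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

df = sns.load_dataset("mpg").dropna()
X, y = df[["weight", "horsepower"]], df["mpg"]

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

rmse = np.sqrt(mean_squared_error(y, y_pred))  # typical prediction error, in units of mpg
r2 = r2_score(y, y_pred)                       # proportion of variance in mpg explained
print("RMSE:", round(rmse, 2), "R-squared:", round(r2, 3))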


4. Difference Between R-squared and Adjusted R-squared


4.1. What does Adjusted R-squared account for that R-squared does not? 


a) The interaction between independent variables 

b) The presence of multicollinearity 

c) The number of predictors and penalizes excessive variables 

d) The residuals’ distribution


Answer: c) The number of predictors and penalizes excessive variables. Adjusted R-squared considers the number of predictors in the model and adjusts for excessive variables, providing a more reliable measure of model fit.


4.2. In the context of R-squared and Adjusted R-squared, which one tends to increase when adding more predictors to the model? 


a) Both R-squared and Adjusted R-squared

b) R-squared 

c) Adjusted R-squared 

d) Neither


Answer: b) R-squared. To understand this better, remember that R-squared tends to increase when more predictors are added to the model, while Adjusted R-squared may increase or decrease depending on the impact of the additional predictors.
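The relationship between the two is easy to verify by hand. Below is a minimal sketch of the standard adjusted R-squared formula; the numbers passed in are made up for illustration:

def adjusted_r2(r2, n, p):
    # Penalizes R-squared for the number of predictors p, given n observations.
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(adjusted_r2(r2=0.70, n=200, p=8))   # about 0.687, slightly below R-squared
print(adjusted_r2(r2=0.70, n=200, p=20))  # about 0.666, a larger penalty for more predictors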


5. Multicollinearity


5.1. What is multicollinearity in the context of linear regression? 


a) It’s the tendency of multiple regression models to collinearize variables. 

b) It occurs when independent variables are highly correlated with each other. 

c) It represents the simultaneous use of multiple collinear regression models. 

d) It indicates that the model has too many predictors.


Answer: b) It occurs when independent variables are highly correlated with each other. This makes it challenging to distinguish their individual effects. Look at the correlation matrix in the sketch below, which shows how to spot multicollinearity among the predictors.
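Since the original correlation matrix was an image, here is a minimal sketch that computes one, assuming the seaborn mpg data; the specific predictor names are illustrative:

import seaborn as sns

df = sns.load_dataset("mpg").dropna()
predictors = ["cylinders", "displacement", "horsepower", "weight", "acceleration"]

# Pairs with correlations close to +/-1 (e.g. displacement vs. cylinders or weight)
# signal multicollinearity among the predictors.
print(df[predictors].corr().round(2))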


5.2. How does multicollinearity affect the stability of regression coefficients? 


a) It makes the coefficients more stable and robust. 

b) It has no effect on the stability of coefficients. 

c) It makes the coefficients more sensitive and less stable. 

d) It only affects the significance of the intercept. 


Answer: c) It makes the coefficients more sensitive and less stable.


6. Variance Inflation Factor and Its Interpretation


6.1. What is the Variance Inflation Factor (VIF) used for in linear regression? 


a) To measure the proportion of variance explained by the model. 

b) To quantify the change in the dependent variable for a one-unit change in the independent variable. 

c) To assess the model’s prediction error. 

d) To detect and quantify multicollinearity.


Answer: d) To detect and quantify multicollinearity.


6.2. How is VIF interpreted in the context of multicollinearity? 


a) A higher VIF indicates less multicollinearity. 

b) A VIF of 1 suggests the absence of multicollinearity. 

c) A higher VIF indicates stronger multicollinearity. 

d) VIF values are not interpretable.


Answer: c) A higher VIF (usually greater than 5 or 10) indicates stronger multicollinearity.
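For reference, here is a minimal sketch of computing VIF values, assuming statsmodels and the seaborn mpg data:

import seaborn as sns
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = sns.load_dataset("mpg").dropna()
X = sm.add_constant(df[["cylinders", "displacement", "horsepower", "weight"]])

# VIF near 1 suggests little multicollinearity; values above roughly 5-10 suggest a problem.
for i, col in enumerate(X.columns):
    if col != "const":
        print(col, round(variance_inflation_factor(X.values, i), 1))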


7. Assumptions of Linear Regression


7.1. Which of the following is NOT one of the key assumptions of linear regression? 


a) Linearity 

b) Independence of errors 

c) Normality of residuals 

d) Heteroscedasticity


Answer: d) Heteroscedasticity. Instead, the key assumption is homoscedasticity, which means the variance of the error term is constant.


7.2. Why is it important to validate the assumptions of linear regression? 


a) To improve the distribution of the independent variables 

b) To increase the number of predictors in the model 

c) To ensure the reliability of regression results and make accurate inferences 

d) To simplify the regression equation.


Answer: c) To ensure the reliability of regression results and make accurate inferences


7.3. What is the main purpose of residual analysis in linear regression? 


a) To validate the assumptions of the model 

b) To determine the optimal learning rate 

c) To improve the multicollinearity of predictors 

d) To estimate the model intercept 


Answer: a) To validate the assumptions of the model
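A typical residual check is a residuals-versus-fitted plot. Here is a minimal sketch, assuming statsmodels, matplotlib, and the seaborn mpg data:

import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm

df = sns.load_dataset("mpg").dropna()
X = sm.add_constant(df[["weight", "horsepower"]])
model = sm.OLS(df["mpg"], X).fit()

# A random scatter around zero supports linearity and homoscedasticity;
# a funnel shape or a curve suggests a violated assumption.
plt.scatter(model.fittedvalues, model.resid, alpha=0.5)
plt.axhline(0, color="red")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()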


Conclusion

Linear regression is a powerful tool, and mastering it is essential for data scientists and analysts, not just from an interview point of view, but also because it forms the basis of several important concepts in statistical modeling and machine learning.


Additionally, by understanding these interview questions and answers, you can demonstrate your proficiency in linear regression and enhance your chances of success in interviews. Be prepared to delve deeper into these topics during your interviews, and you’ll be well-prepared to tackle linear regression questions.


Hope this guide gave you a good starting point and a structure to prepare for machine learning interview questions. If there is a specific topic you want me to cover, please post it in the comments section.


A collection of my other blogs can be found here.


You can also connect with me on LinkedIn.


Happy Learning!
