In this video we will be discussing the concepts of multicollinearity and autocorrelation. Now, what is multicollinearity? Multicollinearity generally occurs when there are high correlations between two or more predictor variables. In other words, one predictor variable can be used to predict another predictor variable. An easy way to detect multicollinearity is to calculate correlation coefficients for all pairs of predictor variables. The predictors in a regression model are often called the independent variables, but this term does not imply that the predictors are themselves statistically independent of one another. In fact, for natural systems, the predictors can be highly intercorrelated; multicollinearity is the term reserved to describe the case when the intercorrelation of the predictor variables is high. It has been noted that the variance of the estimated regression coefficients depends on the intercorrelation of the predictors.
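As a quick illustration of that pairwise check, here is a minimal sketch (the variable names and data are invented for the example) that builds a small set of predictors and prints their correlation matrix with pandas; a high absolute correlation between any pair is a warning sign.

```python
import numpy as np
import pandas as pd

# Hypothetical predictor data; x2 is constructed to be strongly related to x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=100)   # nearly a linear function of x1
x3 = rng.normal(size=100)                    # unrelated predictor

predictors = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Correlation coefficients for all pairs of predictor variables.
print(predictors.corr())
```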
Therefore, we can say that multicollinearity occurs when the predictor variables, or the independent variables, influence each other, that is, they are very highly correlated with each other. Now, multicollinearity has the following negative effects. The first effect is that the variance of the regression coefficients can be inflated so much that the individual coefficients are not statistically significant, even though the overall regression equation is strong and the predictive ability is good. The second effect is that the relative magnitudes and even the signs of the coefficients may defy interpretation. The third effect is that the values of the individual regression coefficients may change radically with the removal or addition of a predictor variable in the equation; in fact, the sign of a coefficient might even switch. Now, let us discuss the signs of multicollinearity. The first sign is high correlations between pairs of predictor variables. Regression coefficients whose signs or magnitudes do not make good physical sense are another sign of multicollinearity. Statistically non-significant regression coefficients on important predictors of our classical linear regression model are another sign of multicollinearity, as is extra sensitivity of the sign or magnitude of the regression coefficients to the insertion or deletion of predictor variables.
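To make the first effect concrete, here is a small simulated sketch (the data-generating process is made up purely for illustration) that fits the same linear model twice, once with essentially independent predictors and once with a highly correlated pair, and compares the standard errors of the estimated coefficients.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
beta = np.array([1.0, 2.0, -1.5])  # intercept, b1, b2

def fit_and_report(corr_strength, label):
    # Build two predictors whose correlation is controlled by corr_strength.
    x1 = rng.normal(size=n)
    x2 = corr_strength * x1 + np.sqrt(1 - corr_strength**2) * rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    y = X @ beta + rng.normal(size=n)
    res = sm.OLS(y, X).fit()
    print(label, "coefficient std errors:", np.round(res.bse, 3))

fit_and_report(0.0, "uncorrelated predictors")   # small standard errors
fit_and_report(0.99, "highly correlated pair ")  # inflated standard errors
```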
So, this is another sign of multicollinearity. Now, what is VIF? When there is multicollinearity in your classical linear regression model, that is, when the predictor variables are influenced by each other, the concept of the variance inflation factor (VIF) comes in, because the variance gets inflated. So, the VIF measures how much the variance of the estimated regression coefficients is inflated as compared to when the predictor variables are not linearly related. It is used to explain how much multicollinearity, that is, correlation between predictors, exists in a regression analysis. So, the variance inflation factor is a statistic that can be used to identify multicollinearity in a matrix of predictor variables. "Variance inflation" refers here to the already mentioned effect of multicollinearity on the variance of the estimated regression coefficients. Multicollinearity depends not just on the bivariate correlations, but also on the multivariate predictability of any one predictor from the other predictors.
Accordingly, the VIF is based on the multiple coefficient of determination of a regression of each predictor in the multivariate linear regression model on all the other predictors. Now the VIF, or variance inflation factor, can be represented by the following formula: VIF_i = 1 / (1 - R_i^2), where R_i^2 is the multiple coefficient of determination in a regression of the i-th predictor on all the other predictors, and VIF_i is the variance inflation factor associated with the i-th predictor. If the i-th predictor is independent of the other predictors, the variance inflation factor is one. If the i-th predictor can be almost perfectly predicted from the other predictors, the variance inflation factor approaches infinity, and the variance of the estimated regression coefficients is unbounded. Multicollinearity is said to be a problem when the variance inflation factor of one or more predictors becomes very large.
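This quantity can be computed directly. Here is a minimal sketch (again with made-up predictors) that applies the formula via statsmodels' variance_inflation_factor, which regresses each column on the others and returns 1 / (1 - R_i^2).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=100)  # nearly collinear with x1
x3 = rng.normal(size=100)

# Include a constant column so each auxiliary regression has an intercept.
X = sm.add_constant(np.column_stack([x1, x2, x3]))

for i, name in enumerate(["const", "x1", "x2", "x3"]):
    print(name, round(variance_inflation_factor(X, i), 2))
# x1 and x2 show very large VIFs; x3 stays close to 1.
```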
Some researchers use a VIF of five or ten as a critical threshold. The VIF is closely related to a statistic called tolerance, which is 1 / VIF. Now, what are the remedies, or solutions, for multicollinearity? The first is to obtain more data so as to reduce the standard errors. Next, obtain better data where the predictors are less correlated, for example by conducting an experiment. Another remedy is to recode the predictors in a way that reduces the correlations. Now, let's discuss the concept of autocorrelation. Autocorrelation is a statistical measure that indicates the degree of correlation of a random variable with itself over time; we could say it measures the relationship between a value in a time series and those that occur before and after it. So, autocorrelation is a mathematical representation of the degree of similarity between a given time series and a lagged version of itself over successive time intervals.
It is the same as calculating the correlation between two different time series, except that the same time series is used twice, once in its original form and once lagged by one or more time periods. Autocorrelation is calculated to detect patterns in the data. According to the assumptions of the classical linear regression model, the error terms should not be correlated with respect to time, that is, the error term at time t, e_t, should not be correlated with e_(t-1), should not be correlated with e_(t-2), and so on. There should not be any autocorrelation in the error terms. Now, let's move to the concept of the Durbin-Watson test. The Durbin-Watson test is done to check for autocorrelation. The null hypothesis H0 is that there is no autocorrelation, and the alternative hypothesis H1 is that autocorrelation exists. The DW statistic, denoted by d, is d = Σ (e_t - e_(t-1))^2 / Σ e_t^2, or approximately d = 2 (1 - ρ), where ρ is my autocorrelation coefficient.
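As a rough illustration of this test, here is a sketch (with simulated residuals, purely for the example) that computes the statistic using statsmodels' durbin_watson; a value near 2 suggests no first-order autocorrelation in the residuals.

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)

# Independent residuals: the DW statistic should come out near 2.
e_independent = rng.normal(size=500)

# Positively autocorrelated residuals (AR(1) with rho = 0.8): DW drops toward 0.
rho = 0.8
e_ar1 = np.zeros(500)
for t in range(1, 500):
    e_ar1[t] = rho * e_ar1[t - 1] + rng.normal()

print("independent residuals:", round(durbin_watson(e_independent), 2))
print("AR(1) residuals:      ", round(durbin_watson(e_ar1), 2))
```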
Now, if my d value is equal to 2, then there is no autocorrelation, that is, ρ equals 0. If d equals 0, then the autocorrelation is +1, that is, ρ equals 1. If d equals 4, then the autocorrelation is -1, that is, ρ equals -1. If the value of my DW statistic lies between 1.5 and 2.5, then it denotes that there is no autocorrelation. I am ending this video here. Thank you, goodbye. See you all in the next one.