Let's start our practical session. Before we start, we first need to get the data sets into our SAS environment, and for that we first have to execute our LIBNAME statement. So my code to execute the LIBNAME statement will be libname. My library name is mylib; that is the one I've given, and you can give any name to your library. Then, within double quotes, I have to give the path of my data sets. This is the path I have given within double quotes, and then I've given a semicolon.
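As a minimal sketch of the statement just described: the libref mylib is the name as heard in this session, and the folder path is only a placeholder, not the actual path used in the video.

```sas
/* Assign a library reference (libref) to the folder holding the data sets. */
/* ASSUMPTION: "mylib" is the libref as heard in the session; the path      */
/* below is a hypothetical placeholder. Replace it with your own folder.    */
libname mylib "C:\path\to\your\datasets";
```

After this runs, the data sets in that folder show up under the library in the Explorer window.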
As you know, every SAS statement should end with a semicolon; without the semicolon it cannot be executed. So now let's execute the LIBNAME statement. This is my Explorer window, and inside the mylib library these are my data sets. So I've got my data sets, and the data set we are going to work with over here is new_reg_retail. So let's run the regression procedure.
Proc reg, data equals mylib. mylib is my library name, and new_reg_retail is my data set name. I'm writing the classical linear regression model using the keyword model. My dependent variable is customer satisfaction. I have to write the variable names exactly the way they are given in the data set; otherwise SAS cannot identify them. So customer satisfaction is my dependent variable, and my independent variables start from product quality and run to delivery speed, so I will be giving the range of the independent variables instead of specifying each of them.
I'm giving the range of the independent variables: product quality is my first independent variable and delivery speed is my last independent variable. I'm using the keyword VIF, which stands for variance inflation factor, to check for multicollinearity among all the independent variables, starting from product quality to delivery speed. So let's run this code. These are the values of the variance inflation factor for all the independent variables. As you can see, the variable delivery speed has got the maximum value of VIF, that is 47.1188.
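The first regression run described above could look roughly like this. The data set name and the variable spellings are assumptions reconstructed from what was said aloud, so match them to the exact names in your own data set.

```sas
/* First run: regress customer satisfaction on all independent variables. */
/* ASSUMPTION: data set and variable names are guessed spellings; use the */
/* names exactly as they appear in your data set.                         */
proc reg data=mylib.new_reg_retail;
    /* "--" is a name range list: it picks up every variable stored between   */
    /* the two names, in data set order. The VIF option prints the variance   */
    /* inflation factor for each independent variable.                        */
    model customer_satisfaction = product_quality -- delivery_speed / vif;
run;
quit;
```

The range only works as intended if the independent variables sit next to each other in the data set, with product quality first and delivery speed last, as described in the session.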
So this is the maximum value of VIF, for delivery speed. So we will be excluding this variable in the next regression procedure and checking for multicollinearity again, because, as we know, according to the assumptions of the classical linear regression model, the independent variables of my data set should have minimal multicollinearity. Here the R-squared and adjusted R-squared values are displayed: R-squared is 0.802, that is 80%, and adjusted R-squared is 79%. You know that we always consider the value of adjusted R-squared, because it reflects the efficiency of the model: it penalizes redundant variables, so only variables that genuinely contribute improve it. So we will only consider adjusted R-squared as the measure of goodness of fit. This is my analysis of variance table. And the total number of observations used to run this regression procedure is 200; that is, there are 200 observations in total in the data set.
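To make the point about adjusted R-squared concrete, here is the standard textbook formula. Here n is the number of observations (200 in this session) and k is the number of independent variables in the model; k is not stated in the session, so it is left as a symbol.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\,\frac{n - 1}{n - k - 1}
```

Because the factor (n - 1)/(n - k - 1) grows with k, adding a variable raises adjusted R-squared only if it raises R-squared enough to offset that penalty.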
Now let's do the next regression procedure. We'll be checking for multicollinearity in our data excluding the variable delivery speed, because delivery speed had the maximum multicollinearity before. So now we'll exclude the variable delivery speed and check for multicollinearity again for the rest of the independent variables of my data set. My procedure will be the same. My dependent variable will be customer satisfaction, and my independent variables will run from product quality to price flexibility, which is now my last variable, because we have excluded delivery speed, which had the maximum value of VIF. So I am giving the range from product quality to price flexibility.
Price flexibility is the second-last variable according to my data set and delivery speed is the last, so my last variable over here will be price flexibility. Again, I'm using the keyword VIF. So let's run the code and check for multicollinearity among this set of independent variables. The VIF values for all the variables from product quality to price flexibility are given. As you can see, the VIF values are quite low compared to before, so now I can be said to have minimal multicollinearity in my data. I also got the R-squared and adjusted R-squared values, which are the same. This is my analysis of variance table.
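The second run described above, with delivery speed left out, could be sketched like this. As before, the data set name and variable spellings are assumptions, not the exact names from the video.

```sas
/* Second run: same model, but the range now stops at price flexibility,  */
/* so delivery speed (the variable with the highest VIF) is excluded.     */
/* ASSUMPTION: data set and variable names are guessed spellings.         */
proc reg data=mylib.new_reg_retail;
    model customer_satisfaction = product_quality -- price_flexibility / vif;
run;
quit;
```

Stopping the range one variable earlier works here only because delivery speed is the last variable in the data set; a variable in the middle of the range would have to be dropped by listing the remaining variables explicitly.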
And we have used 200 observations, that is, the total number of observations in our data set. All my VIF values are within five. Generally five or 10 is taken as the critical threshold for VIF. So now we can move on to the next regression procedures. In this video, we have done two.
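To see what these thresholds mean, it helps to write out the standard definition of the variance inflation factor. Here R_j squared is the R-squared obtained by regressing the j-th independent variable on all the other independent variables; the worked numbers use only the thresholds and the 47.1188 value quoted in this session.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\qquad\Longleftrightarrow\qquad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}

% Thresholds mentioned in the session:
\mathrm{VIF}_j = 5 \;\Rightarrow\; R_j^2 = 0.80,
\qquad
\mathrm{VIF}_j = 10 \;\Rightarrow\; R_j^2 = 0.90

% Delivery speed in the first run:
\mathrm{VIF} = 47.1188 \;\Rightarrow\; R^2 = 1 - \frac{1}{47.1188} \approx 0.979
```

So a VIF of 47.1188 means that about 98% of the variation in delivery speed was already explained by the other independent variables, which is why it was dropped.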
So for now, let's end the video over here. Thank you. Goodbye and see you all for the next video.