Now in this video we will be generating a classification table for every level of probability from zero to one with a gap of 0.01. So let's start writing the logistic code to generate the classification table. Our code will be: proc logistic data equals mylib1 dot logistic_reg_german_bank. Mylib1 is our library name and logistic_reg_german_bank is the name of our dataset. Then I'm using the keyword desc and a semicolon. Model is the keyword to build the logistic regression model, and response is my dependent variable. I hope you remember that in our dataset we had one dependent variable, that is response, and there were 30 independent variables in total.
After doing the stepwise selection, the insignificant variables are removed automatically and we are left with only 14 significant variables, which we are going to specify in this code. That is, in this code, after model response equals, we will be specifying the names of all the significant variables. The names of the significant variables are all displayed in the last results viewer, in the summary of stepwise selection table. Out of the 30 independent variables, only 14 are significant, that is: check account, duration, history, save account, new car, education, guarantor, male single, other install, install rate, amount, used car, foreign, and rent. These are the significant variables which we are going to use for our further analysis. So we will be specifying these variable names; I will copy the variable names from here and write them over here in the code. The first variable is check account.
Next duration, then history, then save account, then new car. You have to provide the variable names exactly the same way they are given in the dataset, otherwise SAS will throw an error. Then education, guarantor, male single, other install, install rate, amount, then used car, then foreign, and then rent.
So this is the list of significant variables. There are 14 significant independent variables in total that we are going to use for further analysis, that is: check account, duration, history, save account, new car, education, guarantor, male single, other install, install rate, amount, used car, foreign, and rent. Now we are going to use the keyword ctable. Ctable is the keyword to generate the classification table. Then pprob, which stands for probability. As we know, a probability value lies between zero and one, so we are specifying the range zero to one, and our gap will be 0.01, then a semicolon, and then run. So let's run the code. Before I run it, let me explain all of the code to you.
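Putting the pieces described above together, the full procedure looks roughly like the sketch below. This is a reconstruction from the spoken walkthrough, not a verbatim copy of the instructor's screen; the library name, dataset name, and variable spellings (chk_acct, sav_acct, and so on) are assumptions, so match them to however your own German bank dataset spells them.

```sas
/* Sketch of the classification-table code as described in the video.
   Library, dataset, and variable names are assumptions taken from the
   narration -- adjust them to your dataset. */
proc logistic data=mylib1.logistic_reg_german_bank desc;
  model response = chk_acct duration history sav_acct new_car
                   education guarantor male_single other_install
                   install_rate amount used_car foreign rent
        / ctable pprob=(0 to 1 by 0.01);
run;
```

The desc option makes SAS model the probability of Y = 1 (the event), and ctable with pprob=(0 to 1 by 0.01) prints one classification-table row per cutoff, 101 rows in all.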
Proc logistic data is the procedure name; mylib1 is my library name, and logistic_reg_german_bank is my dataset name. Desc is the keyword which is used to build the model for Y equals one. Model: response is my dependent variable, and check account, duration, history, save account, new car, education, guarantor, male single, other install, install rate, amount, used car, foreign, and rent are all my significant independent variables. Out of the 30 independent variables, these are the only significant ones, chosen by our stepwise selection in the last results viewer for our further analysis. So we got the names of the significant variables from the summary of stepwise selection. I hope you remember that before we run this proc logistic procedure, we had first brought our dataset into the SAS environment using the libname statement; we already did that in the last video, where I showed you how to execute the libname statement.
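As a reminder of that earlier step, the libname statement simply binds a library name to a folder on disk so that datasets inside it become visible to procedures. A minimal sketch, where the folder path is purely an assumption:

```sas
/* Bind the library name mylib1 to the folder holding the dataset.
   The path here is a placeholder -- point it at your own data folder. */
libname mylib1 'C:\data\german_bank';
```

After this runs, mylib1.logistic_reg_german_bank resolves to the logistic_reg_german_bank dataset stored in that folder.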
We execute the libname statement to get our datasets into the SAS environment; only then can we run the procedure. Now let us run this procedure to generate the classification table. Okay, so this is our classification table. The classification table at every level of probability from zero to one with a gap of 0.01 is displayed. We also got the different measures of the classification table, that is: correctly classified events, correctly classified non-events, incorrectly classified events, incorrectly classified non-events, percentage correct, sensitivity, specificity, false positive, and false negative, at every level of probability. We also got the percentages of concordant, discordant, and tied pairs.
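To make those measures concrete, here is a small sketch showing how each one is derived from the four classification counts at a single cutoff. The counts below are made-up numbers for illustration only, not output from the video's model:

```sas
/* Hypothetical counts at one probability cutoff, to show how the
   classification-table measures are computed. All numbers are made up. */
data measures;
  correct_event      = 550;  /* events correctly predicted as events       */
  correct_nonevent   = 180;  /* non-events correctly predicted as non-events */
  incorrect_event    = 120;  /* non-events wrongly predicted as events     */
  incorrect_nonevent = 150;  /* events wrongly predicted as non-events     */

  total = correct_event + correct_nonevent + incorrect_event + incorrect_nonevent;
  pct_correct = 100 * (correct_event + correct_nonevent) / total;
  sensitivity = 100 * correct_event    / (correct_event + incorrect_nonevent);
  specificity = 100 * correct_nonevent / (correct_nonevent + incorrect_event);
  false_pos   = 100 * incorrect_event    / (correct_event + incorrect_event);
  false_neg   = 100 * incorrect_nonevent / (correct_nonevent + incorrect_nonevent);
run;
```

Sensitivity is the share of actual events caught by the model, specificity the share of actual non-events, and the false positive and false negative rates measure how often the model's event and non-event calls are wrong.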
These tell us about the amount of misclassification in our model: the higher the percentage of concordance, the lesser the misclassification and the better our model. Our percentage of concordance is high, which is good. We also got the odds ratio estimates and the analysis of maximum likelihood estimates; that is, we use maximum likelihood estimation to estimate the parameters of our logistic regression.
We have built a model for response with Y equal to one, that is, we are calculating the probability of Y equals one. We have used 1000 observations in total: 700 observations with response variable value equal to one and 300 observations with response variable value equal to zero, where one denotes the event and zero denotes the non-event. And the number of response levels is two, because the response variable is binary in nature, taking a value of either zero or one, so the number of levels, or number of categories, is two. We will be learning till here in this video, so let's end the video here. Goodbye, thank you, and see you in the next video.