In this video, we will discuss the concept of logistic regression. So what is logistic regression? In logistic regression there is one dependent variable, and there can be one independent variable or multiple independent variables. But unlike linear regression, where my dependent and independent variables are all continuous in nature, in logistic regression my dependent variable is binary, that is, dichotomous in nature. Here we are calculating the probability of y equals event, that is, the probability of occurrence of an event. So in logistic regression, instead of predicting the value of the dependent variable y from the predictor variables, we calculate the probability of y equals event given known values of the predictors. There can be one independent variable or multiple independent variables, and using those independent variables we predict the probability of the occurrence of an event, which is our dependent variable.
Unlike linear regression, the dependent variable in logistic regression is dichotomous in nature, and the model calculates the probability of y equals event. Since a probability value lies between zero and one, the value the model gives for the dependent variable also lies between zero and one, where zero denotes an impossible event and one denotes a sure event. So logistic regression is an extension of simple linear regression where the dependent variable is dichotomous or binary in nature. Logistic regression is a statistical technique used to model the relationship between the predictors and the predicted variable where the dependent variable is binary. There might be one or more than one independent variable or predictor in a logistic regression model, and we need to remember that in logistic regression our independent variables can be continuous or categorical in nature. In linear regression our independent variables were all continuous and our dependent variable was also continuous; in logistic regression our dependent variable is binary or dichotomous and our independent variables can be continuous as well as categorical.
Now let's understand the standard form of the logistic regression equation. It is given by P(y = 1) = 1 / (1 + e^-(b0 + b1 x1 + ei)), and P(y = 0) = 1 - P(y = 1), which is equal to 1 / (1 + e^(b0 + b1 x1 + ei)). Here x1 is my independent variable; in this case I have considered that there is only one independent variable. b1 is my regression coefficient or slope, b0 is my intercept and ei are my error terms. P(y = 1) denotes the probability of an event, that is, the probability of occurrence of the event, and P(y = 0) denotes the probability of a non-event. In a logistic regression model the output variable is a probability, which lies between zero and one, and my input variables, that is, my independent variables, can be categorical as well as continuous in nature.
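The equation above can be tried out with a minimal Python sketch. The coefficient values B0 and B1 and the function names are hypothetical, chosen only for illustration, and the sketch leaves out the error term ei and works with the b0 + b1 x1 part only.

```python
import math

def prob_event(x1, b0, b1):
    """P(y = 1) for a logistic model with a single predictor x1."""
    z = b0 + b1 * x1                      # intercept + slope * predictor
    return 1.0 / (1.0 + math.exp(-z))

def prob_non_event(x1, b0, b1):
    """P(y = 0), written in the second form of the equation."""
    z = b0 + b1 * x1
    return 1.0 / (1.0 + math.exp(z))

# Hypothetical coefficients, not estimated from any real data.
B0, B1 = -1.5, 0.8

for x1 in (-4.0, 0.0, 2.0, 6.0):
    p1 = prob_event(x1, B0, B1)
    p0 = prob_non_event(x1, B0, B1)
    # The two probabilities should always add up to one.
    print(f"x1={x1:5.1f}  P(y=1)={p1:.3f}  P(y=0)={p0:.3f}  sum={p1 + p0:.3f}")
```

Whatever value of x1 goes in, both probabilities stay strictly between zero and one, and they add up to one.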
The curve which represents a logistic regression equation is called a sigmoid curve. On it we can see that the output variable, which is a probability, lies between zero and one. As we know, a probability value can never be negative, so no value of the output variable exists below zero; when the probability is equal to zero it is an impossible event. A probability above one also does not exist; when the probability is equal to one it is a sure event. So my output variable is confined between zero and one, and therefore the curve which best fits a logistic regression model is a sigmoid curve.
Now let's discuss the key differences between linear regression and logistic regression. So here we will do a comparative study of the two. Linear regression models data with continuous numeric values, whereas logistic regression models data with binary values.
As we discussed, in linear regression my dependent and independent variables are both continuous in nature, but in a logistic regression model my dependent variable is binary, or we can say dichotomous, and my independent variables can be continuous as well as categorical. Next, linear regression requires a linear relationship between the dependent and independent variables, whereas that is not necessary for logistic regression. Next, the line of best fit for a classical linear regression model is a straight line, whereas the curve of best fit for logistic regression is a sigmoid curve. This is quite obvious, because in the classical linear regression model my dependent and independent variables are linearly related to each other, so the line of best fit is a straight line. But in logistic regression my dependent variable is binary and the model calculates the probability of y equals event. Since probability values lie between zero and one, no value can fall below zero or rise above one, and therefore the curve which represents a logistic regression model is a sigmoid curve.
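To see why a straight line does not suit a probability while a sigmoid curve does, here is a small Python sketch. The coefficients and the function names are hypothetical and used only to compare the two shapes; nothing here is fitted to data.

```python
import math

def straight_line(x, b0, b1):
    """Linear form: the output is unbounded."""
    return b0 + b1 * x

def sigmoid_curve(x, b0, b1):
    """Logistic form: the same linear part passed through the sigmoid."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical coefficients shared by both forms, for comparison only.
B0, B1 = 0.5, 0.25
xs = [-10, -5, 0, 5, 10]

line_values = [straight_line(x, B0, B1) for x in xs]
curve_values = [sigmoid_curve(x, B0, B1) for x in xs]

for x, line_v, curve_v in zip(xs, line_values, curve_values):
    print(f"x={x:4d}  straight line={line_v:6.2f}  sigmoid={curve_v:.4f}")
```

At the extreme x values the straight line gives values below zero and above one, which cannot be probabilities, while the sigmoid values stay between zero and one.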
In linear regression we use the ordinary least squares method to generate the model, that is, to estimate the parameters, whereas in logistic regression we use a technique called maximum likelihood estimation. Now let me explain what the ordinary least squares method is and what maximum likelihood estimation is. Ordinary least squares, or OLS, is a type of linear least squares method used for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the given data set and the values predicted by the linear function. Geometrically, this is seen as the sum of the squared distances, parallel to the axis of the dependent variable, between each data point in the set and the corresponding point on the regression surface.
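The least squares principle just described can be sketched in Python for the simplest case of one explanatory variable, where the estimates have a well-known closed form. The data points and the function names are made up for illustration.

```python
def ols_fit(xs, ys):
    """Closed-form OLS estimates (intercept, slope) for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared differences between observed and predicted y."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, roughly following a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)

# Moving either parameter away from the OLS estimate increases the sum.
shifted_intercept = sum_squared_residuals(xs, ys, b0 + 0.1, b1)
shifted_slope = sum_squared_residuals(xs, ys, b0, b1 + 0.1)

print(f"intercept={b0:.3f}  slope={b1:.3f}  sum of squares={best:.4f}")
print(f"with shifted intercept={shifted_intercept:.4f}  with shifted slope={shifted_slope:.4f}")
```

The comparison at the end is only a spot check at two nearby parameter values, not a proof that the estimate is the minimum.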
The smaller the differences, the better the model fits the data. Next, let's move to the concept of maximum likelihood estimation. MLE gives an estimate by using the following logic. Suppose we are considering some samples. When we estimate the parameter by maximizing the probability of obtaining those samples from the population, the estimate is known as the MLE. This process is done by iteration. When the Gauss-Markov assumptions, that is, the assumptions of the classical linear regression model, are not fulfilled, we cannot use the OLS estimate, and at that time we use the concept of MLE, that is, maximum likelihood estimation.
Now let's discuss the different applications of logistic regression. The first application of logistic regression is one of the most important ones, that is, credit risk analytics, where we decide whether a particular customer is going to repay a loan on time or not.
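As a rough illustration of the iterative maximum likelihood idea, applied to the loan repayment case just mentioned, here is a minimal Python sketch. Everything in it is an assumption made for the example: the income scores and repayment outcomes are made up, and plain gradient ascent is only one simple iterative scheme, not necessarily what a statistical package uses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(xs, ys, b0, b1):
    """Log of the probability of observing the sample under (b0, b1)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_by_mle(xs, ys, learning_rate=0.1, n_iterations=5000):
    """Gradient ascent on the average log-likelihood, starting from zero."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        grad_b0 = sum(errors)
        grad_b1 = sum(e * x for e, x in zip(errors, xs))
        b0 += learning_rate * grad_b0 / n
        b1 += learning_rate * grad_b1 / n
    return b0, b1

# Hypothetical sample: an income score and whether the loan was repaid on time.
incomes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
repaid = [0, 0, 1, 0, 1, 1, 1, 1]   # 1 = event (repaid), 0 = non-event

initial_ll = log_likelihood(incomes, repaid, 0.0, 0.0)
b0, b1 = fit_by_mle(incomes, repaid)
final_ll = log_likelihood(incomes, repaid, b0, b1)

print(f"log-likelihood before={initial_ll:.3f}  after={final_ll:.3f}")
for income in (1.0, 4.0, 8.0):
    print(f"income score={income:.1f}  P(repay)={sigmoid(b0 + b1 * income):.3f}")
```

Each iteration moves the parameters in the direction that makes the observed sample more probable, so the log-likelihood after fitting is higher than at the starting values.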
So my probability of y equals event is the probability that the customer is going to repay the loan on time, and the probability of y equals non-event is the probability that the customer will not be able to repay the loan on time. That we do by building a logistic regression model. Next is sales, where we want to calculate the probability that a particular potential consumer is going to buy a particular product. So the probability of y equals event in the case of sales is the probability that the potential consumer is going to buy the product, and the probability of y equals non-event is the probability that the potential consumer will not buy the product. As we discussed before, probability of y equals event means probability of y equals one, and probability of y equals non-event means probability of y equals zero. Then telecommunication analytics is another important application of logistic regression, where telecommunication companies want to predict the probability of a customer's retention with a particular telephone connection.
So the probability of y equals event in this case is the probability that the customer is going to stay with that particular connection, and the probability of y equals non-event is the probability that the customer will not stay with that particular telephone connection. Next is HR analytics, where we are finding the attrition rate or employee retention. The probability of employee retention in a particular company is calculated, where the probability of y equals event is the probability that the employee will stay in that particular company, and the probability of y equals non-event is the probability that the employee will not stay in that particular company. So these are the different applications of logistic regression. In this video we will stop here.
So, goodbye. Thank you. See you all in the next video.