Lecture 23 is about the central limit theorem and sampling distributions. If you remember, in the first lecture, with the shooting test example, we discussed normal graphs. We also discussed that normal distribution graphs have certain characteristics which are very useful when analyzing the process under study. The major property of the normal distribution is that it can be completely described by two statistical measures: the mean and the standard deviation. Calculating the mean and standard deviation will be covered in later lectures. However, we will understand their effect on normal curves in this lecture itself. Let me illustrate it for your better understanding. Let us say we have three processes, namely A, B, and C. The mean for all three processes is the same, which is equal to 50.
However, the standard deviations of the processes differ from each other: they are seven, four and one respectively. Let us see how the normal distributions of the processes differ from each other. First, let us analyze process A. Could you recall this curve? Was it somewhat similar to shooter Pat, who had the highest variation in his shooting process? Now, let us see what the curve for process B looks like. Recall now: the shape is somewhat similar to shooter Bob, with lesser variation. Finally, process C. It is somewhat similar to the best shooter, Bill: the desired shape. What could you conclude from this?
We can say that, for the same mean, the lesser the standard deviation of a process, the better that process will be. Now, let us see what processes with different means and the same standard deviation look like. Again the same three processes A, B and C, but this time the standard deviation is the same, which is equal to two, and the means differ from each other: 10, 15 and 20 respectively. Now, let us see what they look like: the normal distribution of process A, the normal distribution of process B, and the normal distribution of process C. Now analyze. Is there any change in the shape? No, there is no change in the shape. Am I right? Then what is the difference? Only the center shifts from one place to another. Could you remember the shooter Joe?
This was similar to his shooting: reduced variation, but shifted away from the center of the target. Well, we are now aware that the normal distribution has many properties which play an important role in Six Sigma problem solving. But the question before us is whether we get a normal curve with every set of data. The answer is a big no. It is not necessary that we get the shape of a normal curve whenever we plot a set of data. It could be one of many other shapes, such as right skewed, left skewed, double peaked, et cetera. In such cases, we won't be able to use the properties of the normal distribution curve. What is the alternative then?
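Before we move on, the two comparisons of processes A, B and C can be checked with a few numbers. What follows is only a minimal Python sketch, not part of the original slides. The helper name `describe` is hypothetical, and the choice to report the range of mean plus or minus three standard deviations (which covers roughly 99.7 percent of a normal curve) is an assumption made for illustration.

```python
from statistics import NormalDist


def describe(mean, sd):
    """Return (peak height, lower, upper) for a normal curve.

    The peak height is the density at the mean; lower and upper
    are mean - 3*sd and mean + 3*sd (an illustrative choice).
    """
    curve = NormalDist(mu=mean, sigma=sd)
    peak = curve.pdf(mean)
    return peak, mean - 3 * sd, mean + 3 * sd


# Values taken from the lecture: (mean, standard deviation) per process.
same_mean = {"A": (50, 7), "B": (50, 4), "C": (50, 1)}
same_sd = {"A": (10, 2), "B": (15, 2), "C": (20, 2)}

for title, group in (("Same mean, different standard deviation", same_mean),
                     ("Different mean, same standard deviation", same_sd)):
    print(title)
    for name, (mean, sd) in group.items():
        peak, lower, upper = describe(mean, sd)
        print(f"  process {name}: peak height {peak:.3f}, "
              f"range {lower} to {upper}")
```

In the first group the range narrows and the peak rises as the standard deviation falls, which is the tall, narrow curve of the best shooter. In the second group the peak height and the width of the range stay the same and only the center moves.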
The central limit theorem comes to our help here. The central limit theorem states that the distribution of the means of samples approaches a normal distribution as the sample size increases. In simple terms, if we take samples from a population and plot the graph of their means, it approximately follows a normal curve when the sample size is large. Let's have an illustration of this. What you see are different shapes of population distribution. Now, samples of size two are drawn from each population and the mean is calculated. What you see now is the distribution of means with sample size two. Now, from the same populations, samples of size 25 are drawn to calculate the mean. Can you see the sampling distributions now? Are they not similar to the shape of a normal curve? What is the application of the central limit theorem as far as a Green Belt professional is concerned?
Well, always try to analyze samples rather than going for analysis of individual data. The distributions of sample means are called sampling distributions. The sampling distribution has its own mean, which is called the mean of sample means and is denoted as x double bar. Similarly, the sampling distribution has its own standard deviation, which is named the standard error. The standard error is denoted as sigma x bar and is equal to the standard deviation of the population, that is sigma, divided by the square root of the sample size, where the sample size is denoted as n. Let us discuss one more concept called confidence level. We have discussed that in inferential statistics we draw conclusions, in other words predict, about the population based on sample measures.
When we predict about something, there is always a risk associated with the prediction. The confidence level determines the risk of our decision being wrong. This will be discussed in detail in the analysis phase of this course. Similarly, we will discuss control charts in the control phase of this course. That's all for this lecture. I'm aware that you still have some doubts at the back of your mind. Don't worry, all of these are going to be discussed at many places in the coming lectures for your better understanding.
Let us now proceed to the next lecture, on basic probability concepts. Thank you.
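For those who would like to try the central limit theorem and the standard error for themselves, here is a small simulation sketch in Python. It is an illustration under stated assumptions, not the data behind the lecture's graphs: the population is assumed to be a right-skewed exponential distribution with mean 1 and standard deviation 1, the number of samples (5000) and the seed are arbitrary, and the function names `sample_means`, `skewness` and `summarize` are hypothetical.

```python
import random
import statistics
from math import sqrt

# Assumed population: exponential with rate 1, so mean = 1 and sigma = 1.
POP_SIGMA = 1.0


def sample_means(rng, n, num_samples=5000):
    """Draw num_samples samples of size n and return the mean of each."""
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def summarize(n, seed=0):
    """Summarize the sampling distribution of the mean for sample size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = sample_means(rng, n)
    return {
        "n": n,
        "x_double_bar": statistics.fmean(means),  # mean of sample means
        "observed_se": statistics.stdev(means),   # spread of sample means
        "theoretical_se": POP_SIGMA / sqrt(n),    # sigma / sqrt(n)
        "skewness": skewness(means),
    }


for n in (2, 25):
    print(summarize(n))
```

With sample size 25 the observed spread of the sample means should sit close to sigma divided by the square root of n, and the skewness should be much nearer to zero than with sample size two, which is the move toward the normal shape that the theorem describes.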