Welcome to lecture 38. As I mentioned in the previous lecture, this is an important topic in Six Sigma. In this lecture, we will learn concepts such as the significance level, type one and type two errors, determining the sample size, and distinguishing between statistical and practical significance. Hypothesis testing is part of inferential statistics, where we estimate the parameters of a population based on the descriptive measures of a sample. In the Define phase, we prioritized problems and selected one of them as the improvement project. In the Measure phase, we measured the current performance level of the project and identified the key input parameters, 10 to 15 critical Xs. In the Analyze phase, we studied the variation in these critical Xs and then narrowed them down to two or three critical Xs, which we assumed to be the root causes.
But how can we validate that we assumed the right root causes? Imagine the risk of assuming the wrong root causes in a soft drink manufacturing process where 1.5 to 2 million bottles are produced in a month. How can you win the confidence of top management to implement your solutions? No one can predict with hundred percent certainty; there is always a risk involved. What is the alternative, then? Statisticians provide a method to check whether the assumptions are right with the help of a sample data set. This is called hypothesis testing.
In statistical decision making, any prediction involves some risk. Let us discuss the risk in detail. Suppose that during the Analyze phase our team has arrived at two or three critical Xs as the root causes of the problem. The team assumes something and then takes a decision based on that assumption. So there are two different perspectives, the assumption and the decision, each of which needs to be evaluated by considering a third factor, the risk. Let us review them. On one axis, we can say either that those are not the real causes or that those are the real causes. On the other axis, the team either decides that they are not the causes or decides that they are the real causes. Now, let us see what risks are associated with these decisions.
Suppose the team has arrived at the wrong root causes but decides not to treat them as root causes. Is there any risk involved? No, there is no risk in this case. Similarly, suppose the team has identified the real root causes and also decides to treat them as the real root causes. Then also there is no risk involved. However, think of the reverse: the team has identified the wrong root causes and somehow decides to treat them as the real root causes. Is there any risk involved then?
Yes, there is a risk of losing a huge amount of money by implementing a wrong suggestion that is not going to reap any benefit, as well as the loss of production during the implementation phase. Similarly, suppose the team has identified the right root causes, but somehow no decision is made that they are the real root causes. There is also a risk involved here: the risk of losing prospective gains, as well as the loss incurred by continuing with the current methods. Let me convert the above scenario into statistical terms. Statistically, these statements are known as the null hypothesis and the alternate hypothesis, and the decisions are known as accepting the null hypothesis or rejecting the null hypothesis.
When the null hypothesis is true and the team accepts the null hypothesis, will there be any risk? No, there is no risk in that decision. But what about when the null hypothesis is true and the team rejects it? There is a risk involved in that decision, and this risk is known as alpha risk or type one error. Let me repeat: type one error is the error of rejecting a null hypothesis when it is true. Similarly, when the null hypothesis is false and it gets rejected, there is no risk involved in the decision, whereas when the null hypothesis is false and it gets accepted, there is a risk involved. This risk is known as type two error or beta error.
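The four combinations of truth and decision described above can be written out as a small lookup. This is only an illustrative sketch: the function name `classify_decision` and the wording of the returned labels are assumptions made for this example, not part of the lecture.

```python
def classify_decision(null_is_true: bool, reject_null: bool) -> str:
    """Map a (truth, decision) pair to its cell in the decision matrix.

    The function name and label strings are hypothetical, chosen for illustration.
    """
    if null_is_true and reject_null:
        # Null is true but was rejected: alpha risk.
        return "Type I error (alpha risk)"
    if not null_is_true and not reject_null:
        # Null is false but was accepted: beta risk.
        return "Type II error (beta risk)"
    # Either a true null was accepted or a false null was rejected.
    return "Correct decision (no risk)"


# Print all four cells of the matrix.
for null_is_true in (True, False):
    for reject_null in (False, True):
        outcome = classify_decision(null_is_true, reject_null)
        print(f"null true={null_is_true}, rejected={reject_null}: {outcome}")
```

Only two of the four cells carry a risk, which is why the lecture names exactly two kinds of error.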
Statisticians always prefer that the type one error be low. Normally, Six Sigma practitioners keep this risk at 5%. In a Six Sigma project, the null hypothesis is rejected when the probability of this risk is less than 5%, that is, when the P value is less than 0.05. What is the P value? The P value of a statistical hypothesis test is the probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one observed in the sample; loosely speaking, it is the risk of wrongly rejecting the null hypothesis when it is true. The P value is compared with the desired significance level of the test. If it is smaller, the result is significant, which means the null hypothesis is unlikely to be true. If the null hypothesis were to be rejected at the 5% significance level, this would be reported as p less than 0.05.
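As a rough sketch of the comparison between the P value and the significance level, the snippet below computes a P value from a z statistic using Python's standard library. It assumes a two-sided z test, which the lecture does not specify; the function names `two_sided_p_value` and `reject_null` and the sample z values are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability that a standard normal value is at least as extreme as z.

    Assumption: a two-sided test, so both tails are counted.
    """
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Decision rule: reject the null hypothesis when p is below alpha."""
    return p_value < alpha


# Illustrative z values, not taken from the lecture.
for z in (1.0, 1.96, 2.5):
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}  p = {p:.4f}  reject null at 5%: {reject_null(p)}")
```

A z value of about 1.96 sits right at the 5% boundary, which is why that number appears so often alongside the 95% confidence level.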
Usually, the significance level is taken as 5% or 0.05. We reject the null hypothesis when the P value is less than the significance level. To remember this, Six Sigma practitioners learn by heart: when P is low, the null must go. Let us identify the steps involved in hypothesis testing. Step one, define the null and alternate hypothesis statements. Step two, specify a suitable test statistic z and its probability distribution, where z equals the sample statistic minus the population parameter, divided by the standard error. Step three, decide the significance level and determine the P value and the rejection region.
Step four, draw inferences from the sample based on the P value. For a good hypothesis test, it is important to select the proper sample size. How do we define the appropriate sample size for a good hypothesis test? We get it from the equation discussed on the previous slide. The test statistic is z = (x̄ − μ0) / (σ / √n), where n denotes the sample size. By rearranging the equation, we get the formula for the proper sample size: n = (z σ / (x̄ − μ0))². If we write x̄ − μ0 as E, this becomes n = (z σ / E)².
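The test statistic and its rearrangement for n can be sketched in Python as follows. The function names, the rounding up to the next whole sample, and the use of a two-sided z value are assumptions made for this sketch; the numbers in the usage line are the ones from the worked example that the lecture goes through next (sigma of 2 mm, E of 0.5 mm, 95% confidence).

```python
import math
from statistics import NormalDist


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))


def required_sample_size(sigma: float, e: float, confidence: float = 0.95) -> int:
    """n = (z * sigma / E)^2, rounded up to a whole number of samples.

    Assumption: z is the two-sided critical value for the given
    confidence level (about 1.96 for 95%).
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / e) ** 2)


# Values from the lecture's example: sigma = 2 mm, E = 0.5 mm, 95% confidence.
print(required_sample_size(sigma=2.0, e=0.5, confidence=0.95))  # 62
```

Rounding up rather than to the nearest integer is a conservative choice: a sample smaller than the computed value would not reach the stated confidence level for the given E.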
Let us consider an example. A Six Sigma team wants to identify the sample size for conducting a test at the 95% confidence level. The standard deviation is known to be 2 mm, and the test is to find whether the mean differs by 0.5 mm. So E is equal to 0.5 mm, sigma is 2 mm, and the confidence level is 95%. To find the z value, we need the confidence level; the z value corresponding to the 95% confidence level is 1.96. Putting the values into the equation gives n = (1.96 × 2 / 0.5)² ≈ 61.5, which we round up, so the appropriate sample size is 62.
Next, let us distinguish statistical from practical significance. With hypothesis testing, we test for statistical significance between two populations. However, at times it is important to look at the practical significance of the process under study. For example, two processes A and B show a minor but statistically significant difference in performance, with A better than B.
However, process B is capable of meeting the customer specification at a lower cost, so the difference in performance has no practical significance. Hence, before taking a decision based on statistical significance, study the practical significance, which involves the level of customer specification requirements, the cost, and the effort involved in changing the process. That's all for this lecture. Let us move to the next lecture to continue with the concepts of