Welcome to lecture 32 on continuous probability distributions. In this lecture, we will cover continuous probability distributions such as the normal distribution, the chi-square distribution, Student's t distribution, and the F distribution. In the previous lectures, I described that normal distribution curves have certain properties. The curve can be divided into two halves, with an equal set of values falling on either side. The peak of the curve represents the center of the process. The area under the curve represents one hundred percent of the product or output the process is capable of producing. We also learned that a normal distribution can be defined by knowing two factors: the mean, mu, and the standard deviation, sigma, of a process. However, the most important property of a normal distribution is that we can predict the percentage of data falling within mean plus or minus one sigma, mean plus or minus two sigma, and mean plus or minus three sigma. Suppose we know the mean and standard deviation of a process. We can divide the curve into six parts: mean plus or minus one multiplied by the standard deviation, mean plus or minus two multiplied by the standard deviation, and mean plus or minus three multiplied by the standard deviation.
When data follows a normal distribution, we can assume that about 68 percent of the data falls within mean plus or minus one standard deviation, about 95 percent falls within mean plus or minus two standard deviations, and about 99.7 percent falls within mean plus or minus three standard deviations. Now, what are the practical applications of these properties? Let us review an example. A screw manufacturer knows his process is normally distributed with a mean of 50 mm and a standard deviation, sigma, of 2 mm. A random sample of 1000 screws was collected, and he needs to know what percentage of screws are expected to fall between 48 and 52 mm, 46 and 54 mm, and 44 and 56 mm. Let us use the properties of the normal distribution to find out. If we calculate mean plus or minus one sigma with this data, we see that values between 48 and 52 fall within mean plus or minus one sigma, and as per the properties of the normal distribution, 68 percent of data falls within mean plus or minus one sigma. That means 68 percent of 1000, which is equal to 680.
So, we can predict that 680 screws will be within the diameter range of 48 to 52 mm. Similarly, we can predict the expected number of screws that may fall within the diameter range of 46 to 54 mm. Values from 46 to 54 fall within mean plus or minus two standard deviations, and as per the properties of the normal distribution, 95 percent of data falls within mean plus or minus two standard deviations. It means 950 screws are expected to fall within the diameter range of 46 to 54 mm. Similarly, we can find the percentage of data expected to fall within the diameter range of 44 to 56 mm, which is mean plus or minus three standard deviations: 99.7 percent, or about 997 screws. That was fine: if we know the mean and standard deviation of a normally distributed process, we are able to predict the percentage of data falling within one, two, and three standard deviations from the mean, isn't it? But what about the data falling in between them?
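The screw example above can be checked with a short Python sketch. It uses `statistics.NormalDist` from the standard library; the function and constant names are illustrative assumptions, not part of the lecture. The exact areas are slightly different from the rounded 68, 95, and 99.7 percent figures, so the expected counts differ by a few screws from the lecture's 680 and 950.

```python
from statistics import NormalDist

MEAN_MM = 50.0       # process mean from the lecture example
SIGMA_MM = 2.0       # process standard deviation from the lecture example
SAMPLE_SIZE = 1000   # number of screws sampled


def fraction_within(k, mean=MEAN_MM, sigma=SIGMA_MM):
    """Fraction of a normal population within mean +/- k standard deviations."""
    dist = NormalDist(mu=mean, sigma=sigma)
    return dist.cdf(mean + k * sigma) - dist.cdf(mean - k * sigma)


def expected_counts(sample_size=SAMPLE_SIZE):
    """Map k = 1, 2, 3 to (low limit, high limit, fraction, expected count)."""
    result = {}
    for k in (1, 2, 3):
        frac = fraction_within(k)
        low = MEAN_MM - k * SIGMA_MM
        high = MEAN_MM + k * SIGMA_MM
        result[k] = (low, high, frac, round(frac * sample_size))
    return result


if __name__ == "__main__":
    for k, (low, high, frac, count) in expected_counts().items():
        print(f"+/-{k} sigma: {low:.0f} to {high:.0f} mm, "
              f"{frac:.2%} of screws, about {count} of {SAMPLE_SIZE}")
```

The rounded percentages in the lecture are the usual rule of thumb; the sketch simply shows where they come from.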
Say, for example, the percentage of data falling within the dimension range of 46.5 to 51.5 mm. Is there any way to find it out? Yes, of course. We can predict it by linking this normal distribution with the standard normal distribution. Now the question arises: what is the standard normal distribution, and how do I link my normal distribution with the standard normal distribution? Well, the standard normal distribution is a normal distribution with mean, mu, equal to zero and standard deviation equal to one. We link any normal distribution to the standard normal distribution with the help of the z score, where z = (x - mean) / standard deviation and x is any data point in the normal distribution. The z value can be either positive or negative.
Let us link our normal distribution with the standard normal distribution and see the corresponding z values. The z value corresponding to x = 46.5 is (46.5 - 50) / 2 = -1.75. Similarly, the z value corresponding to 51.5 is (51.5 - 50) / 2, which comes to 0.75. Now, what is the use of finding the z score? Let me explain in the coming slides. For a standard normal distribution, we can divide the total area under the curve into two halves: 50 percent on the plus side of the mean and 50 percent on the minus side of the mean. When we know the z value, we can find the area corresponding to that particular z value.
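As a minimal sketch of the z score calculation just described, the following Python function applies z = (x - mean) / standard deviation to the two screw diameters. The function name `z_score` is an assumption made for illustration.

```python
def z_score(x, mean, std_dev):
    """Convert a data point x to a z score on the standard normal scale."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Screw example: mean 50 mm, standard deviation 2 mm.
z_low = z_score(46.5, 50.0, 2.0)    # lower limit of the range
z_high = z_score(51.5, 50.0, 2.0)   # upper limit of the range

if __name__ == "__main__":
    print(z_low, z_high)  # -1.75 0.75
```

A negative z value simply means the data point lies below the mean.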
The area corresponding to a particular z value is nothing but a probability. In the z table used here, it is the probability of data falling between the mean and that particular value of z. Let us predict the percentage of data falling within 46.5 to 51.5 mm. The prediction is nothing but a probability of occurrence. To find the probability of data falling between these z values, we need to use a z table. A z table is a standard table that can be used to find the area under the curve corresponding to a particular z value. To find the area under the curve for z = 0.75, find the row that contains the value 0.7 and the column that contains the value 0.05. The area under the curve for z = 0.75 is 0.2734.
Similarly, to find the area under the curve for z = 1.75, find the row that contains the value 1.7 and the column that contains the value 0.05; we get the area corresponding to z = 1.75 as 0.4599. Because the curve is symmetric, this is also the area between z = -1.75 and the mean. Now, we can find the total area under the curve as the area corresponding to z = 0.75 plus the area corresponding to z = 1.75, that is, 0.2734 + 0.4599 = 0.7333. This area is nothing but the probability: 73.33 percent of screws are expected to be within the diameter range of 46.5 to 51.5 mm. However, what we have discussed is a normal z table. There are cumulative z tables also. With cumulative tables, the method is different: the area between z1 and z2 is the cumulative area up to z2 minus the cumulative area up to z1.
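The two table methods can be compared in a short Python sketch. Instead of a printed table, it computes the areas with `statistics.NormalDist`; the helper names `area_mean_to_z` and `area_between` are assumptions for illustration. Both routes should give the same probability for the screw example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_mean_to_z(z):
    """Area between the mean (z = 0) and z, as read from a '0 to z' table."""
    return STD_NORMAL.cdf(abs(z)) - 0.5


def area_between(z1, z2):
    """Area between two z values, as computed from a cumulative table."""
    low, high = sorted((z1, z2))
    return STD_NORMAL.cdf(high) - STD_NORMAL.cdf(low)


# Method 1: the limits lie on opposite sides of the mean, so add the two areas.
left_area = area_mean_to_z(-1.75)    # about 0.4599
right_area = area_mean_to_z(0.75)    # about 0.2734
total_by_addition = left_area + right_area

# Method 2: subtract cumulative areas.
total_by_subtraction = area_between(-1.75, 0.75)

if __name__ == "__main__":
    print(f"{total_by_addition:.4f} {total_by_subtraction:.4f}")
```

Adding the two areas works here only because the two z values lie on opposite sides of the mean; the cumulative subtraction works for any pair of z values.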
Normal probability can also be determined using Microsoft Excel. Excel gives the cumulative probability. For finding normal probability, we need to know two factors: the mean and the standard deviation. The formula in Excel is =NORMDIST(x, mean, standard deviation, TRUE). For the example we just discussed, type in one cell of Excel =NORMDIST(51.5, 50, 2, TRUE); we get the answer as 0.7734. Similarly, type in another cell =NORMDIST(46.5, 50, 2, TRUE); we get the answer as 0.0401. Since these are cumulative values, subtract 0.0401 from 0.7734, and we get the probability of data lying between 46.5 and 51.5 as 73.33 percent.
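For readers without Excel, the same cumulative calculation can be sketched in Python. The function `norm_dist_cumulative` is a hypothetical stand-in for the spreadsheet formula with the cumulative flag set to TRUE; it works directly on the original scale, so no z conversion is needed.

```python
from statistics import NormalDist


def norm_dist_cumulative(x, mean, std_dev):
    """Probability that a normal variable with this mean and sd is <= x."""
    return NormalDist(mu=mean, sigma=std_dev).cdf(x)


upper = norm_dist_cumulative(51.5, 50.0, 2.0)   # cumulative area up to 51.5 mm
lower = norm_dist_cumulative(46.5, 50.0, 2.0)   # cumulative area up to 46.5 mm
probability = upper - lower                     # area between the two limits

if __name__ == "__main__":
    print(f"{upper:.4f} - {lower:.4f} = {probability:.4f}")
```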
Now, let us move to the other continuous probability distributions. The t distribution is similar to the normal distribution. The major difference is that the normal distribution is used when the population parameters are known, whereas Student's t distribution is used when the population parameters are unknown. In practical situations, population parameters are seldom known. Student's t distribution is also suitable when the sample size is less than 30. The formula for the t value is similar to that of the z value: z = (x - mean) / sigma, and t = (x-bar - mean) / (s / square root of n), where x-bar is the sample mean, s is the sample standard deviation, and n is the sample size. There are two tables for finding the area under the t distribution curve, similar to those for the normal distribution curve.
Apart from the normal and Student's t distributions, the other two popularly used continuous probability distributions are the chi-square and F distributions. They are used for comparing the variances of two samples or populations. The major input for these distributions is the degrees of freedom, which is nothing but the sample size minus one, n - 1. These distributions are commonly used during hypothesis testing. The formula for the chi-square statistic is chi-square = (n - 1) s^2 / sigma^2, where s^2 is the sample variance and sigma^2 is the population variance.
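The t and chi-square formulas above can be sketched in Python as follows. The sample values, the hypothesized mean of 49 mm, and the hypothesized variance of 4 are made-up numbers for illustration only, and the function names are assumptions. The sketch computes the test statistics and the degrees of freedom; looking up the corresponding probabilities would still require a t table or a chi-square table.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_statistic(sample, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - hypothesized_mean) / (stdev(sample) / sqrt(n))


def chi_square_statistic(sample, hypothesized_variance):
    """chi-square = (n - 1) * s^2 / sigma^2."""
    n = len(sample)
    return (n - 1) * variance(sample) / hypothesized_variance


def degrees_of_freedom(sample):
    """Degrees of freedom for a single sample: n - 1."""
    return len(sample) - 1


# Hypothetical screw diameters in mm (not from the lecture).
diameters = [49.0, 51.0, 50.0, 52.0, 48.0]

t_value = t_statistic(diameters, hypothesized_mean=49.0)
chi_value = chi_square_statistic(diameters, hypothesized_variance=4.0)
dof = degrees_of_freedom(diameters)

if __name__ == "__main__":
    print(f"t = {t_value:.4f}, chi-square = {chi_value:.4f}, df = {dof}")
```

Note that `stdev` and `variance` in the standard library use the n - 1 divisor, which matches the sample standard deviation s used in both formulas.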
Similarly, the formula for the F distribution is F = (x / variance one) / (y / variance two). That's all for this lecture. This is an important lecture. You may repeat this lecture to understand it thoroughly.