Welcome to lecture 22 on drawing valid statistical conclusions. In this lecture, we will learn to distinguish between descriptive and inferential statistics, and between a population parameter and a sample statistic. Let us first understand what statistics is. Well, statistics is a way of getting meaningful information from data. Then what is data? Data is the record of actual observations, for example, the marks of 35 students of a class in science. But data as it is is not useful unless we generate meaningful information from it. What is information, then? Information is the answer to a question.
For example, from the above data set, the marks of 35 students, we can generate information by asking questions such as: How many students scored below 50%? How many students scored above 90%? What is the trend of the pass percentage for the last three years? What percentage of students is expected to pass in the upcoming exams, based on the three-year trend? And so on. Meaningful information from a data set can be obtained broadly in two ways. Firstly, by describing a set of data: arranging the data in ascending order, understanding the central tendency, which is mean, median, mode, etc., and understanding the spread of the data, which is range, standard deviation, etc. Secondly, by drawing some conclusions based on the above set of descriptions: predicting the outcome of a process based on various patterns, finding the probability or percentage of data falling between a range of values, testing assumptions or hypotheses, etc.
In statistics, these methods are descriptive statistics and inferential statistics respectively. These are the two branches of statistics. Descriptive statistics measures the central tendency and variation. Various metrics of descriptive statistics are mean, median, mode, range, variance, standard deviation, etc. Inferential statistics predicts the behavior of the whole data set based on descriptive measures. Various techniques involved in inferential statistics are probability distributions, hypothesis testing, etc. Now, let us come to the concepts of population and sample. These have a significant role in handling a Six Sigma project.
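As a minimal sketch of the descriptive measures listed above, the Python snippet below uses only the standard `statistics` module. The marks are hypothetical values made up for illustration, since the lecture does not give the actual marks, and the names `marks` and `describe` are assumptions.

```python
import statistics

# Hypothetical marks (out of 100) of 35 students in science; not real data.
marks = [
    35, 42, 48, 51, 55, 58, 60, 61, 63, 65,
    66, 68, 70, 70, 71, 72, 73, 74, 75, 75,
    75, 76, 78, 79, 80, 82, 83, 85, 86, 88,
    90, 91, 93, 95, 98,
]


def describe(data):
    """Return common descriptive measures for a list of numbers."""
    ordered = sorted(data)  # arrange the data in ascending order
    return {
        "count": len(ordered),
        # Central tendency
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        # Spread
        "range": ordered[-1] - ordered[0],
        # pvariance/pstdev treat the data as the complete group of interest
        "variance": statistics.pvariance(ordered),
        "std_dev": statistics.pstdev(ordered),
    }


summary = describe(marks)

# Two of the questions asked of the data set in the lecture.
below_50 = sum(1 for m in marks if m < 50)
above_90 = sum(1 for m in marks if m > 90)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print(f"students below 50%: {below_50}")
print(f"students above 90%: {above_90}")
```

Everything computed here only describes the 35 marks in hand; it makes no prediction about any other students or exams, which is the job of inferential statistics.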
Let us imagine a situation wherein you are assigned to study the demographic profiles of, say, 20 to 30 employees in your organization. It seems to be easy, doesn't it? Because there are only a few people to be studied. But what about the situation if your assignment is to study the profiles of employees in, say, another branch? It becomes a little difficult, but you could still manage 50 to 100 profiles of employees of your organization. Now consider an assignment of studying the profiles of, say, all the employees who are spread across the world in various countries, say, 1 million employees. Is it possible to study the profiles of each and every member now? Or, in other words, is it always possible to study one hundred percent of a large group? Of course, it is not possible at all.
What is the alternative, then? As an alternative, we can take a sample from the whole group under study. Now, let us define population and sample respectively. A population is the group of all items under study. It is not only a group of people, but any large group of interest, for example, the population of Coca-Cola produced at one of its plants. A sample is a set of data drawn from the population.
Now, instead of studying one hundred percent of the population, we can draw a sample set and use it to study the population. It is always better to study a large number of small representative samples. Now, the descriptive measures of a population and of a sample are known as parameter and statistic respectively. Parameter: a descriptive measure, such as the mean, variance, or standard deviation, of a population is known as a parameter. Statistic: a descriptive measure, such as the mean, variance, or standard deviation, of a sample is known as a statistic.
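The difference between a parameter and a statistic can be sketched in Python. This is an illustration under stated assumptions, not part of the lecture: the population is synthetic (10,000 simulated values), the sample size of 100 is arbitrary, and names such as `population` and `sample` are hypothetical. The standard `statistics` module provides `pvariance` and `pstdev` for a complete population, and `variance` and `stdev` for a sample; the sample versions divide by n - 1 rather than n.

```python
import random
import statistics

random.seed(22)  # fixed seed so the sketch is repeatable

# Synthetic population: 10,000 simulated measurements (assumed, not real data).
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameters: descriptive measures of the whole population.
pop_mean = statistics.mean(population)
pop_var = statistics.pvariance(population)
pop_sd = statistics.pstdev(population)

# Draw a simple random sample of 100 items without replacement.
sample = random.sample(population, 100)

# Statistics: the same measures computed from the sample only.
sample_mean = statistics.mean(sample)
sample_var = statistics.variance(sample)  # divides by n - 1
sample_sd = statistics.stdev(sample)

print(f"parameter: mean={pop_mean:.2f} variance={pop_var:.2f} sd={pop_sd:.2f}")
print(f"statistic: mean={sample_mean:.2f} variance={sample_var:.2f} sd={sample_sd:.2f}")
```

The sample values come out close to, but not exactly equal to, the population values. That gap is why anything said about a population on the basis of a sample is an estimate rather than an exact figure.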
Here, don't confuse statistics with a statistic. Both are different. The central tendency, or mean, of a population, which is a parameter, is denoted as mu, whereas the mean of a sample is denoted as x bar. Similarly, the variance of a population is denoted as sigma square, and the variance of a sample is denoted as s square. And the standard deviation of a population is denoted as sigma.
Whereas the standard deviation of a sample is denoted as small s. To conclude, let me depict the relation between sample measures and population measures. As it is not possible to study the complete population, as an alternative we select homogeneous samples from the population and find out the sample measures, such as the sample mean and sample standard deviation. Then, with the help of these sample measures, we try to predict, let me repeat, we can only predict with some level of confidence, the parameters of the population, such as the population mean and population standard deviation. That's all for this lecture. We have now entered into statistics and started moving towards the more complex analysis tools. Repeat the lecture if you have not understood it well; else, let's proceed to the next lecture on the central limit theorem and sampling distribution. Thank you.