Hey everyone, welcome to this lecture. What is a scatter plot? A scatter plot, also known as a scatter diagram, is a powerful visual tool used to display relationships or associations between two variables, such as a suspected cause and its effect. While plotting the scatter diagram, the independent variable goes on the x axis, or horizontal axis, and the dependent variable goes on the y axis, or vertical axis. The pattern of the plot shows whether there is a positive correlation, a negative correlation, or no correlation. If the data points trend in an upward direction from left to right, the two variables are considered to be positively correlated.
That is, as the x variable increases, the y variable tends to increase, and vice versa. On the other hand, if the data points trend in a downward direction from left to right, the two variables are considered to be negatively correlated. That is, as the x variable increases, the y variable tends to decrease, and vice versa. If you get a diagram like this, with the points scattered and no clear trend, there is no correlation between the x and y variables. One important point to note is that the scatter plot can only be constructed when both x and y variables are continuous in nature. Let us look at this example.
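Before the example, here is a minimal sketch of how the three patterns just described could be checked numerically rather than by eye. It uses the Pearson correlation coefficient, which the lecture does not name, so treat it as one possible way to quantify the trend. The function names `pearson_r` and `describe_trend` and the 0.3 cut-off are assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def describe_trend(xs, ys, threshold=0.3):
    """Label the trend as positive, negative, or no clear correlation.

    The threshold is an arbitrary illustrative cut-off, not a standard value.
    """
    r = pearson_r(xs, ys)
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "no clear correlation"


# Small made-up data sets, one for each pattern.
x = [1, 2, 3, 4, 5]
print(describe_trend(x, [2, 4, 5, 4, 6]))   # points rise from left to right
print(describe_trend(x, [10, 8, 7, 5, 1]))  # points fall from left to right
print(describe_trend(x, [3, 1, 4, 1, 3]))   # no overall direction
```

A coefficient near +1 corresponds to dots rising from left to right, a coefficient near -1 to dots falling, and a coefficient near 0 to no visible linear trend.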
You are given the height and weight data for the players of a football team. Your job is to identify whether there is any relationship or association between these two variables, height and weight. You draw a scatter plot and get this diagram. As you can see, the dots move in an upward direction from left to right. This indicates that the x and y variables, that is, the weight and the height, are positively correlated. Also note that a positive or negative correlation between variables does not mean there is a cause-and-effect relationship.
For example, if I change the headers of these variables to sale of umbrellas and crime rate, the scatter plot will still show that the two variables are positively correlated. But you will have to use your judgment to confirm whether the two variables are genuinely related.
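To make the football example concrete, here is a rough sketch of how such a scatter plot could be drawn in Python. The numbers are invented for illustration and are not the data from the lecture. The helper names `scatter_points` and `draw_scatter` are hypothetical, and the drawing step assumes the third-party matplotlib library is installed.

```python
# Hypothetical values invented for illustration; not the lecture's data.
weights_kg = [64, 70, 69, 75, 78, 77, 83, 86, 85, 92]
heights_cm = [168, 172, 175, 178, 180, 183, 185, 188, 191, 194]


def scatter_points(xs, ys):
    """Pair the two variables into (x, y) points, one per player."""
    if len(xs) != len(ys):
        raise ValueError("both variables need one value per observation")
    return list(zip(xs, ys))


def draw_scatter(xs, ys, x_label, y_label, path="scatter.png"):
    """Save a scatter plot as an image file (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    fig.savefig(path)
    plt.close(fig)
    return path


points = scatter_points(weights_kg, heights_cm)
print(points[:3])
```

Calling `draw_scatter(weights_kg, heights_cm, "Weight (kg)", "Height (cm)")` would place weight on the x axis and height on the y axis, as in the lecture. Calling it again with the labels "Sale of umbrellas" and "Crime rate" produces exactly the same dots, which is why the picture alone cannot tell you whether one variable causes the other.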