Please Help! My book and instructor are no help. Please help me understand why I need to make a scatter diagram, when dealing with bivariate data, and how to find SS(x), SS(y), SS(xy), and r. I have all the formulas, but don't understand. Thanks!
I don't think you are strictly required to draw the diagram if you already have all the (xi, yi) pairs.
In Excel, you can make the scatter plot, add a fitted line to it, and work toward the SS values from there.
So I guess I can't offer much more help. Is there another student in your class you can team up with? That is my recommendation.
Why a scatter diagram?
Well, to repeat a phrase, a picture is worth a thousand words. With a scatter diagram, one can usually see at a glance whether there is any relationship between the variables x and y.
SS(x), SS(y), and SS(xy) are, by themselves, not very interesting or informative. However, they are components of calculations that are very helpful. For example, dividing SS(x) by n-1 gives the sample variance of the x values, a very helpful statistic.
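To make the link between SS(x) and the sample variance concrete, here is a minimal Python sketch. The x values are made up for illustration, and the helper name `ss` is my own, not something from your book.

```python
from statistics import mean, variance

def ss(values):
    """Sum of squared deviations from the mean, i.e. SS(x)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# Hypothetical x values, invented for this example
x = [1, 2, 3, 4, 5]

ss_x = ss(x)                      # (-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2 = 10
sample_var = ss_x / (len(x) - 1)  # 10 / 4 = 2.5

print(ss_x, sample_var)
print(variance(x))  # the standard library's sample variance, for comparison
```

The last line is only a cross-check: `statistics.variance` computes the sample variance directly, so it should agree with SS(x) / (n-1).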
Combining the three statistics, according to your formula, gives r, the correlation coefficient. r provides a measure of the linear relationship (correlation) between x and y. With a high degree of correlation, one can use values of x to predict values of y, which is what regression analysis is all about.
I hope this helps.
I understand that you are struggling to understand why you need to make a scatter diagram and how to calculate SS(x), SS(y), SS(xy), and r when dealing with bivariate data. Let me break it down for you.
1. Scatter Diagram:
A scatter diagram, also known as a scatter plot, is a visual representation of the relationship between two variables, x and y. It is used to determine if there is a correlation or pattern between the two variables. By plotting the data points on a graph, you can quickly see the overall trend of the data and identify any outliers or clusters. This visual representation can help in understanding and interpreting the relationship between the variables.
2. SS(x), SS(y), and SS(xy):
SS(x) is the sum of squares for x. More precisely, it is the sum of squared deviations from the mean: take the difference between each x value and the mean of the x values, square it, and add up all these squared differences. SS(y) is the same quantity computed for the y values. SS(xy) is the sum of cross-products: for each data pair, multiply the difference between x and the mean of x by the difference between y and the mean of y, then add up these products.
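For reference, those verbal definitions correspond to the formulas below, written in common textbook notation (your book's notation may differ slightly). Here \bar{x} and \bar{y} are the sample means and n is the number of data pairs. The second form on each line is the usual computational shortcut, which gives the same number.

```latex
\begin{align*}
SS(x)  &= \sum_{i=1}^{n} (x_i - \bar{x})^2
        = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \\
SS(y)  &= \sum_{i=1}^{n} (y_i - \bar{y})^2
        = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} \\
SS(xy) &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
        = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}
\end{align*}
```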
These calculations are useful because they provide information about the variability and dispersion of the data. SS(x) and SS(y) are used to calculate the sample variance of the x and y values respectively, which is a measure of how spread out the data points are around the mean. SS(xy) is used in the calculation of the correlation coefficient, r.
3. Calculation of r:
The correlation coefficient, denoted by r, measures the strength and direction of the linear relationship between the two variables, x and y. It ranges from -1 to +1, where values close to +1 indicate a strong positive correlation, values close to -1 indicate a strong negative correlation, and values close to 0 indicate little or no linear relationship. Note that r near 0 does not rule out a curved relationship, which is one reason the scatter diagram is worth drawing.
To calculate r, you divide SS(xy) by the square root of the product of SS(x) and SS(y). SS(xy) measures how x and y vary together (dividing it by n-1 gives the sample covariance), and dividing by the square root of SS(x) times SS(y) rescales it into a unit-free measure of the linear relationship between the variables.
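As a rough sketch of that calculation, the Python below computes the three sums and then r = SS(xy) / sqrt(SS(x) * SS(y)). Both data sets are invented for illustration, and the function names are my own. The second data set follows a curve exactly (v is u squared), yet its r comes out as 0, which shows why looking at the scatter diagram matters and not only at r.

```python
from math import sqrt

def sums_of_squares(x, y):
    """Return (SS(x), SS(y), SS(xy)) for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_x, ss_y, ss_xy

def correlation(x, y):
    """r = SS(xy) / sqrt(SS(x) * SS(y))."""
    ss_x, ss_y, ss_xy = sums_of_squares(x, y)
    return ss_xy / sqrt(ss_x * ss_y)

# Hypothetical data with a roughly linear upward trend
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(sums_of_squares(x, y))  # SS(x) = 10, SS(y) = 6, SS(xy) = 6
r_linear = correlation(x, y)  # 6 / sqrt(60), about 0.77
print(r_linear)

# Hypothetical data with a perfect curved pattern: v = u squared
u = [-2, -1, 0, 1, 2]
v = [4, 1, 0, 1, 4]
r_curved = correlation(u, v)  # SS(uv) = 0, so r = 0
print(r_curved)
```

In the first data set, r of about 0.77 matches the upward trend you would see in the plot. In the second, the cross-products cancel out exactly, so r says "no linear relationship" even though a scatter diagram would show an obvious U shape.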
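In Excel, you can create a scatter plot by selecting the data and using the "Insert" tab to insert a scatter chart. Then, you can add a trendline (fitted line) to the plot to visualize the linear relationship. Excel also provides functions that help with the calculations: DEVSQ returns the sum of squared deviations, so it gives SS(x) or SS(y) directly, and CORREL returns r. SS(xy) can be obtained by multiplying the result of COVARIANCE.S by n-1.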
If you are still struggling to understand these concepts, I would encourage you to seek help from your instructor or reach out to a classmate for further explanation and clarification. It can be helpful to work with others who are studying the same material to deepen your understanding.