A sales manager has collected the following data on annual sales (y) and years of experience (x).
| Salesperson | Years of experience (x) | Annual sales, K’000 (y) |
|---|---|---|
| 1 | 1 | 80 |
| 2 | 3 | 97 |
| 3 | 4 | 92 |
| 4 | 4 | 102 |
| 5 | 6 | 103 |
| 6 | 8 | 111 |
| 7 | 10 | 119 |
| 8 | 10 | 123 |
| 9 | 11 | 117 |
| 10 | 13 | 136 |
(a) Draw a scatter diagram. Does a linear relationship between x and y seem appropriate?
(b) Estimate the simple linear regression line. Interpret the parameters in the model.
(c) What practical use could be made of this equation?
(d) Use the estimated regression equation to predict annual sales for a salesperson with 9 years of experience.
(e) At the 5% level of significance, would you conclude that there is a linear relationship between x and y?
(f) Construct a 95% confidence interval for the slope parameter β1.
(g) Find the correlation coefficient.
(h) Find the coefficient of determination and interpret the value.
(i) Give the ANOVA table.
(j) Use the ANOVA table to test for a significant linear relationship between years of experience and annual sales.

(a) Scatter diagram:

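The scatter diagram (x on the horizontal axis, y on the vertical axis) shows points rising roughly along a straight line, with no obvious curvature: annual sales generally increase as years of experience increase. A linear relationship between x and y therefore seems appropriate.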
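For readers without plotting software, a rough text rendering of the scatter diagram can be produced in plain Python. This is only a sketch: the helper name `text_scatter` and the 5-unit row height are illustrative choices, not part of the problem, and a proper plotting library would give a cleaner figure.

```python
# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

def text_scatter(xs, ys, y_step=5):
    """Return a crude character-grid scatter plot as a list of lines."""
    y_lo = min(ys) // y_step * y_step   # label of the bottom row
    y_hi = max(ys) // y_step * y_step   # label of the top row
    width = 2 * max(xs) + 1             # two characters per unit of x
    lines = []
    for level in range(y_hi, y_lo - 1, -y_step):
        cells = [" "] * width
        for xi, yi in zip(xs, ys):
            # A point is drawn in the row whose band [level, level + y_step) contains it.
            if level <= yi < level + y_step:
                cells[2 * xi] = "*"
        lines.append(f"{level:>4} |" + "".join(cells).rstrip())
    lines.append("     +" + "-" * width)
    # Label every second x value so the labels line up with the columns.
    lines.append("      " + "".join(f"{v:<4}" for v in range(0, max(xs) + 1, 2)).rstrip())
    return lines

print("\n".join(text_scatter(x, y)))
```

The stars climb from the bottom-left to the top-right of the grid, which is the pattern expected for a positive linear relationship.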

(b) Estimation of the simple linear regression line:

The simple linear regression line can be estimated using the formula:
\[ y = a + bx \]
Where:
\[ b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \]
and
\[ a = \frac{\sum y - b(\sum x)}{n} \]

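For the given data, n = 10, \( \sum x = 70 \), \( \sum y = 1080 \), \( \sum xy = 8128 \) and \( \sum x^2 = 632 \), so:
\[ b = \frac{10(8128) - (70)(1080)}{10(632) - (70)^2} = \frac{5680}{1420} = 4 \]
\[ a = \frac{1080 - 4(70)}{10} = 80 \]
Therefore, the estimated regression line is:
\[ y = 80 + 4x \]

The parameter 'a' is the intercept: the expected annual sales for a salesperson with 0 years of experience, here 80 (K’000). Since x = 0 lies just outside the observed range of 1 to 13 years, this value is an extrapolation and should be read with caution.
The parameter 'b' is the slope: the average change in annual sales for a one-year increase in experience. Here each additional year of experience is associated with an average increase of 4 (K’000) in annual sales.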
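The arithmetic behind the slope and intercept is easy to get wrong by hand, so a short Python check may be useful. This is a minimal sketch that applies the raw-sum formulas for b and a to the ten data points; the variable names are illustrative.

```python
# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Slope and intercept from the raw-sum least-squares formulas.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(f"sums: x={sum_x}, y={sum_y}, xy={sum_xy}, x^2={sum_x2}")
print(f"fitted line: y = {a:.3f} + {b:.3f} x")
```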

(c) Practical use of the equation:

The regression equation can be used to predict the annual sales of salespersons based on their years of experience, helping in setting sales targets, performance evaluations, and identifying training needs.

(d) Prediction for a salesperson with 9 years of experience:

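Substituting x = 9 into the least-squares line for this data, y = 80 + 4x:
\[ y = 80 + 4(9) = 116 \]

Therefore, the predicted annual sales for a salesperson with 9 years of experience is 116 in the units of the table (K’000), that is, K116,000. Since 9 years lies within the observed range of 1 to 13 years, this is an interpolation rather than an extrapolation.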

(e) Hypothesis testing:

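To test whether there is a linear relationship between x and y, test the slope:
\[ H_0: \beta_1 = 0 \]
\[ H_a: \beta_1 \neq 0 \]

The test statistic is \( t = b / SE(b) \) with n - 2 = 8 degrees of freedom, where \( SE(b) = \sqrt{MSE / S_{xx}} \). For the given data, \( S_{xx} = \sum x^2 - (\sum x)^2 / n = 142 \), \( S_{xy} = \sum xy - (\sum x)(\sum y)/n = 568 \), so \( b = S_{xy}/S_{xx} = 4 \), and the error sum of squares is SSE = 170, giving MSE = 170/8 = 21.25:
\[ SE(b) = \sqrt{21.25 / 142} \approx 0.387, \qquad t = \frac{4}{0.387} \approx 10.34 \]
Since 10.34 exceeds the critical value \( t_{0.025,\,8} = 2.306 \), reject \( H_0 \) at the 5% level of significance: there is a significant linear relationship between years of experience and annual sales. An F-test based on the ANOVA table is equivalent.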
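The t-test on the slope can be carried out numerically with a few lines of Python. The sketch below assumes the usual simple-regression standard error, SE(b) = sqrt(MSE / Sxx), and takes the two-sided 5% critical value for 8 degrees of freedom (2.306) from a standard t-table rather than computing it; the constant name `T_CRIT_8DF` is illustrative.

```python
import math

# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross-products.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx            # least-squares slope
a = y_bar - b * x_bar      # least-squares intercept

# Error sum of squares and mean square error (n - 2 degrees of freedom).
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

se_b = math.sqrt(mse / s_xx)   # standard error of the slope
t_stat = b / se_b              # test statistic for H0: slope = 0

T_CRIT_8DF = 2.306             # assumption: value read from a t-table (alpha = 0.05, two-sided)
reject_h0 = abs(t_stat) > T_CRIT_8DF

print(f"SE(b) = {se_b:.4f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```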

(f) 95% confidence interval for the slope parameter β1:

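The 95% confidence interval can be calculated using the formula:
\[ b \pm t_{\alpha/2,\,n-2} \cdot SE(b) \]
For the given data, the least-squares slope is b = 4, \( SE(b) = \sqrt{MSE/S_{xx}} = \sqrt{21.25/142} \approx 0.387 \), and \( t_{0.025,\,8} = 2.306 \):
\[ 4 \pm 2.306(0.387) = 4 \pm 0.892 \]
The interval is therefore approximately (3.11, 4.89). It does not contain 0, which is consistent with a significant slope at the 5% level.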
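A small helper makes the interval calculation reusable. This is a sketch under the same assumptions as the formula above: the function name `slope_confidence_interval` is hypothetical, and the critical value 2.306 (t with 8 degrees of freedom, 2.5% in each tail) is supplied by the caller from a t-table.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) limits for the slope of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / s_xx)   # sqrt(MSE / Sxx)
    margin = t_crit * se_b
    return b - margin, b + margin

# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

lower, upper = slope_confidence_interval(x, y, t_crit=2.306)
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```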

(g) Correlation coefficient:

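The correlation coefficient can be calculated using the formula:
\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2} \, \sqrt{n\sum y^2 - (\sum y)^2}} \]
For the given data, \( \sum y^2 = 119082 \), so:
\[ r = \frac{10(8128) - (70)(1080)}{\sqrt{10(632) - 70^2} \, \sqrt{10(119082) - 1080^2}} = \frac{5680}{\sqrt{1420} \, \sqrt{24420}} \approx 0.965 \]
This indicates a strong positive linear association between years of experience and annual sales.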
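The raw-sum formula for r can be checked directly in Python. The following is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import math

# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Pearson correlation coefficient from the raw sums.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(f"sum of y^2 = {sum_y2}, r = {r:.4f}")
```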

(h) Coefficient of determination:

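The coefficient of determination, \( R^2 \), represents the proportion of the variance in the dependent variable (annual sales) that is predictable from the independent variable (years of experience). It ranges from 0 to 1, with 1 indicating a perfect fit. For the given data, \( R^2 = SSR/SST = 2272/2442 \approx 0.930 \), where SST is the total sum of squares of y about its mean and SSR is the part explained by the regression. About 93% of the variation in annual sales is explained by the linear relationship with years of experience; the remaining 7% is due to other factors or random variation.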
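As a cross-check, \( R^2 \) can also be computed from the residuals as 1 - SSE/SST. The sketch below fits the least-squares line to the data and then applies that definition; it is an illustration rather than the only way to obtain the value.

```python
# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares fit.
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

# Unexplained (error) and total variation in y.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1 - sse / sst
print(f"SSE = {sse:.0f}, SST = {sst:.0f}, R^2 = {r_squared:.4f}")
```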

(i) ANOVA table:

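The ANOVA table lists the sums of squares (SS), degrees of freedom (df), mean squares (MS) and the F-value for testing the significance of the regression model. For the given data:

| Source | SS | df | MS | F |
|---|---|---|---|---|
| Regression | 2272 | 1 | 2272 | 106.92 |
| Error | 170 | 8 | 21.25 | |
| Total | 2442 | 9 | | |

The p-value for this F-value is below 0.001.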

(j) Testing for a significant linear relationship:

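The ANOVA table is used to test \( H_0: \beta_1 = 0 \) against \( H_a: \beta_1 \neq 0 \) with the statistic F = MSR/MSE, which has 1 and n - 2 = 8 degrees of freedom under \( H_0 \). For the given data, F = 2272/21.25 ≈ 106.92, which far exceeds the 5% critical value \( F_{0.05;\,1,\,8} = 5.32 \). \( H_0 \) is therefore rejected: years of experience has a significant linear relationship with annual sales. In simple linear regression this F-test is equivalent to the two-sided t-test on the slope, since \( F = t^2 \).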
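The ANOVA decomposition and the F-test can be reproduced with a short Python sketch. The 5% critical value for 1 and 8 degrees of freedom (5.32) is an assumption taken from a standard F-table rather than computed, and the constant name `F_CRIT_1_8` is illustrative.

```python
# Data from the problem statement.
x = [1, 3, 4, 4, 6, 8, 10, 10, 11, 13]                # years of experience
y = [80, 97, 92, 102, 103, 111, 119, 123, 117, 136]   # annual sales (K'000)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares fit and fitted values.
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar
fitted = [a + b * xi for xi in x]

# Sums of squares: regression, error, total.
ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - y_bar) ** 2 for yi in y)

df_reg, df_err = 1, n - 2
msr = ssr / df_reg
mse = sse / df_err
f_stat = msr / mse

F_CRIT_1_8 = 5.32   # assumption: value read from an F-table (alpha = 0.05, df = 1 and 8)

print(f"{'Source':<12}{'SS':>10}{'df':>4}{'MS':>10}{'F':>10}")
print(f"{'Regression':<12}{ssr:>10.2f}{df_reg:>4}{msr:>10.2f}{f_stat:>10.2f}")
print(f"{'Error':<12}{sse:>10.2f}{df_err:>4}{mse:>10.2f}")
print(f"{'Total':<12}{sst:>10.2f}{n - 1:>4}")
print("Reject H0 at the 5% level:", f_stat > F_CRIT_1_8)
```

The regression and error sums of squares add up to the total, which is a quick consistency check on the table.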