# The following data represent the asking price of a simple random sample of homes for sale. Construct a 99% confidence interval with and without the outlier included. Comment on the effect the outlier has on the confidence interval.

Here is the information: $162,000 $279,900 $219,900 $143,000 $205,800 $225,000 $459,900 $190,000 $187,500 $276,900 $147,800 $264,900

A) Construct a 99% confidence interval with the outlier included: ($_____ , $ _____)

B) Construct a 99% confidence interval with the outlier removed: ($_____ , $ _____)

## 99% = mean ± 2.575 SEm

SEm = SD/√n

Find the mean first = sum of scores/number of scores (n)

Subtract the mean from each of the scores and square each difference. Find the sum of these squares. Because the data are a sample, divide that sum by n − 1 (one less than the number of scores) to get the variance.

Standard deviation = square root of variance

I'll let you do the calculations.
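For anyone who wants to check the hand calculation, the steps above can be sketched in Python. This is only an illustrative sketch, not part of the original answer: the variable names are made up here, and it assumes the sample convention of dividing by n − 1.

```python
import math

prices = [162000, 279900, 219900, 143000, 205800, 225000,
          459900, 190000, 187500, 276900, 147800, 264900]

n = len(prices)
mean = sum(prices) / n

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in prices)

# Sample variance divides by n - 1; dividing by n instead would give
# the population variance.
variance = ss / (n - 1)
sd = math.sqrt(variance)

# Standard error of the mean: SEm = SD / sqrt(n).
sem = sd / math.sqrt(n)

print(f"mean = {mean:,.2f}, SD = {sd:,.2f}, SEm = {sem:,.2f}")
```

Multiplying `sem` by the chosen critical value gives the margin of error on either side of the mean.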

## 1. A

2. A

3. C

4. B

5. C

6. D

7. B

8. C

9. D

10. C

11. C

12. B

13. C

14. C

15. D

16. B

17. B

18. C

19. C

20. A

## To construct the confidence intervals, we first need to find the mean (x̄) and the standard deviation (s) of the sample. Then we can calculate the confidence intervals using the formulas:

\[ \text{Confidence Interval} = \left( \bar{x} - \text{margin of error}, \bar{x} + \text{margin of error} \right) \]

where,

\[ \text{margin of error} = Z \cdot \left(\frac{s}{\sqrt{n}}\right) \]

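and Z is the z-value obtained from the z-table for the desired confidence level and n is the sample size. (With a sample this small and the population standard deviation unknown, a t critical value is the more standard choice; the z-value is used here as an approximation.)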
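The margin-of-error formula can be sketched with Python's standard library. The function name `z_interval` and its `confidence` parameter are illustrative assumptions, not something from the original answer. `NormalDist().inv_cdf` stands in for the z-table lookup, and `statistics.stdev` uses the n − 1 denominator.

```python
import math
from statistics import NormalDist, mean, stdev

def z_interval(data, confidence=0.99):
    """Return (lower, upper) for mean ± z * s / sqrt(n)."""
    n = len(data)
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * stdev(data) / math.sqrt(n)  # stdev divides by n - 1
    center = mean(data)
    return center - margin, center + margin

# The z-table value for 99% confidence (cumulative probability 0.995).
z99 = NormalDist().inv_cdf(0.995)
print(round(z99, 3))
```

Calling `z_interval(prices)` on the twelve asking prices returns the z-based interval for the full sample.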

Let's calculate the confidence intervals:

A) With the outlier included:

Sample size (n): 12

Sum of the sample values: $2,762,600

Mean (x̄):

\[ \bar{x} = \frac{\text{Sum of the sample values}}{n} = \frac{2,762,600}{12} \approx \$230,217 \]

Standard deviation (s):

To find the standard deviation, we need to calculate the sum of the squared deviations from the mean (SSD), divide it by (n-1), and take the square root. So,

\[ SSD = \sum{(x_i - \bar{x})^2} \]

Substituting the values:

\[ SSD = (162,000 - 230,217)^2 + (279,900 - 230,217)^2 + \ldots + (264,900 - 230,217)^2 \approx 81,829,816,667 \]

Now,

\[ s = \sqrt{\frac{SSD}{n-1}} = \sqrt{\frac{81,829,816,667}{11}} \approx \$86,250 \]

Z-value for a 99% confidence level:

Since we want a 99% confidence interval, the remaining 1% is split between the two tails (0.5% in each), so we need the z-value with a cumulative probability of 0.995. By referring to the z-table, we can find that the z-value is approximately 2.576.

Margin of error:

\[ \text{margin of error} = Z \cdot \left(\frac{s}{\sqrt{n}}\right) = 2.576 \cdot \left(\frac{86,250}{\sqrt{12}}\right) \approx \$64,138 \]

Confidence interval (with outlier):

\[ \text{Confidence Interval} = (\bar{x} - \text{margin of error}, \bar{x} + \text{margin of error}) = (\$230,217 - \$64,138, \$230,217 + \$64,138) = (\$166,079, \$294,355) \]

B) With the outlier removed:

In this case, the outlier to be removed is $459,900.
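The answer names $459,900 as the outlier without showing a check. One common rule of thumb (an assumption here, since the question does not say which rule to use) flags values more than 1.5 × IQR beyond the quartiles. Quartile conventions differ between textbooks; this sketch uses the default "exclusive" method of `statistics.quantiles`.

```python
from statistics import quantiles

prices = [162000, 279900, 219900, 143000, 205800, 225000,
          459900, 190000, 187500, 276900, 147800, 264900]

# Quartiles with the default method="exclusive".
q1, _, q3 = quantiles(prices, n=4)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in prices if x < lower_fence or x > upper_fence]
print(outliers)  # [459900]
```

Under this rule only the $459,900 listing falls outside the fences.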

New sample size (n): 11

New sum of the sample values: $2,302,700

New mean (x̄):

\[ \bar{x} = \frac{\text{New sum of the sample values}}{n} = \frac{2,302,700}{11} \approx \$209,336 \]

New standard deviation (s):

To find the standard deviation, we again calculate the sum of the squared deviations from the mean (SSD), divide it by (n-1), and take the square root. So,

\[ SSD = (162,000 - 209,336)^2 + (279,900 - 209,336)^2 + \ldots + (264,900 - 209,336)^2 \approx 24,279,525,455 \]

Now,

\[ s = \sqrt{\frac{SSD}{n-1}} = \sqrt{\frac{24,279,525,455}{10}} \approx \$49,274 \]

Z-value for a 99% confidence level:

The z-value depends only on the confidence level, not on the sample size, so it is still approximately 2.576.

New margin of error:

\[ \text{margin of error} = Z \cdot \left(\frac{s}{\sqrt{n}}\right) = 2.576 \cdot \left(\frac{49,274}{\sqrt{11}}\right) \approx \$38,271 \]

Confidence interval (without outlier):

\[ \text{Confidence Interval} = (\bar{x} - \text{margin of error}, \bar{x} + \text{margin of error}) = (\$209,336 - \$38,271, \$209,336 + \$38,271) = (\$171,065, \$247,607) \]

Effect of the outlier on the confidence interval:

The presence of the outlier raises the sample mean (about $230,217 versus $209,336) and greatly increases the standard deviation (about $86,250 versus $49,274), and therefore increases the margin of error. Consequently, the confidence interval with the outlier is centered higher and is much wider. This is evident from the comparison of the two confidence intervals: ($166,079, $294,355) with the outlier and ($171,065, $247,607) without the outlier. Removing the outlier reduces the variability in the sample and results in a narrower confidence interval.
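The comparison can be reproduced with a short script. This is a sketch under the same assumptions as this answer (z = 2.576 and the n − 1 standard deviation); the helper name `summarize` is made up for illustration.

```python
import math
from statistics import mean, stdev

prices = [162000, 279900, 219900, 143000, 205800, 225000,
          459900, 190000, 187500, 276900, 147800, 264900]
without = [x for x in prices if x != 459900]

def summarize(data, z=2.576):
    """Return the mean, sample SD, and z-based margin of error."""
    m, s = mean(data), stdev(data)
    return m, s, z * s / math.sqrt(len(data))

for label, data in (("with outlier", prices), ("without outlier", without)):
    m, s, e = summarize(data)
    print(f"{label:16s} mean={m:>10,.0f}  s={s:>9,.0f}  "
          f"CI=({m - e:,.0f}, {m + e:,.0f})  width={2 * e:,.0f}")
```

The `width` column makes the effect easy to see: the interval that includes the outlier is considerably wider.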

## To construct a confidence interval, we need to find the mean and standard deviation of the data set. Let's start by calculating each of these values.

First, let's calculate the mean (average) of the data set by summing up all the values and dividing by the number of data points:

Mean = (162,000 + 279,900 + 219,900 + 143,000 + 205,800 + 225,000 + 459,900 + 190,000 + 187,500 + 276,900 + 147,800 + 264,900) / 12

Mean = 2,762,600 / 12 = $230,216.67 (rounded to the nearest cent)

Next, we need to calculate the standard deviation, which measures how spread out the data are around the mean. To find it, we calculate the squared difference from the mean for each data point, sum them, divide by the sample size minus one (12 - 1 = 11), and finally take the square root of that result.

Let's calculate the squared differences from the mean for each data point:

(162,000 - 230,216.67)^2, (279,900 - 230,216.67)^2, (219,900 - 230,216.67)^2, (143,000 - 230,216.67)^2, (205,800 - 230,216.67)^2, (225,000 - 230,216.67)^2, (459,900 - 230,216.67)^2, (190,000 - 230,216.67)^2, (187,500 - 230,216.67)^2, (276,900 - 230,216.67)^2, (147,800 - 230,216.67)^2, (264,900 - 230,216.67)^2

Now, sum up these squared differences and divide by 11:

Sum = sum of [(data point - mean)^2] = 81,829,816,666.67

Variance = 81,829,816,666.67 / 11 = 7,439,074,242.42

Standard Deviation = sqrt(7,439,074,242.42) = $86,250.07

Now, we can construct the confidence intervals.

A) With the outlier included:

We need to find the critical value associated with a 99% confidence level. Since the sample size is relatively small (n = 12), we'll use a t-distribution.

The critical value for a t-distribution with n - 1 = 11 degrees of freedom at 99% confidence is approximately t(0.995, 11) = 3.106.
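The t critical value is normally read from a table or a statistics package. As a rough check that needs only the standard library, it can also be found numerically. The sketch below is an illustration only: it integrates the t density with Simpson's rule and inverts the result by bisection, and the function names are made up here.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Find t >= 0 with P(T <= t) = p (for p >= 0.5) by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_quantile(0.995, 11), 3))  # 11 degrees of freedom
print(round(t_quantile(0.995, 10), 3))  # 10 degrees of freedom
```

Standard t tables list 3.106 for 11 degrees of freedom and 3.169 for 10, so the critical value changes when a data point is removed.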

The margin of error, E, can be calculated as E = critical value * standard deviation / sqrt(n):

E = 3.106 * 86,250.07 / sqrt(12) = 77,333.96

The confidence interval is given by: (mean - E, mean + E)

CI = ($230,216.67 - $77,333.96, $230,216.67 + $77,333.96)

CI = ($152,882.71, $307,550.63)

B) With the outlier removed:

Let's remove the outlier, which is $459,900, from the data set.

New mean = (162,000 + 279,900 + 219,900 + 143,000 + 205,800 + 225,000 + 190,000 + 187,500 + 276,900 + 147,800 + 264,900) / 11

New mean = 2,302,700 / 11 = $209,336.36 (rounded to the nearest cent)

New standard deviation: the squared differences are recalculated from the new mean without the outlier, summed, and divided by 11 - 1 = 10, and then the square root is taken.

The new standard deviation is sqrt(2,427,952,545.45) = $49,274.26 (rounded to the nearest cent)

Now, we need to recalculate the critical value and the margin of error and construct the new confidence interval. With n = 11 there are 10 degrees of freedom, so the critical value becomes t(0.995, 10) = 3.169.

New margin of error = 3.169 * 49,274.26 / sqrt(11) = 47,081.03

The new confidence interval is: (new mean - new E, new mean + new E)

New CI = ($209,336.36 - $47,081.03, $209,336.36 + $47,081.03)

New CI = ($162,255.33, $256,417.39)
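Both t-based intervals can be checked with a few lines of Python. This is a sketch, not a definitive implementation: the critical values are hard-coded from a standard t table (3.106 for 11 degrees of freedom, 3.169 for 10), and the names `T_CRIT_99` and `t_interval_99` are illustrative.

```python
import math
from statistics import mean, stdev

# Two-sided 99% critical values from a standard t table, keyed by
# degrees of freedom (only the two needed here).
T_CRIT_99 = {10: 3.169, 11: 3.106}

def t_interval_99(data):
    """Return (lower, upper) for mean ± t * s / sqrt(n) at 99% confidence."""
    n = len(data)
    margin = T_CRIT_99[n - 1] * stdev(data) / math.sqrt(n)
    m = mean(data)
    return m - margin, m + margin

prices = [162000, 279900, 219900, 143000, 205800, 225000,
          459900, 190000, 187500, 276900, 147800, 264900]

with_outlier = t_interval_99(prices)
without_outlier = t_interval_99([x for x in prices if x != 459900])

print("with outlier:   ", tuple(round(v) for v in with_outlier))
print("without outlier:", tuple(round(v) for v in without_outlier))
```

Looking up the critical value by `n - 1` keeps the degrees of freedom in step with the sample size when the outlier is dropped.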

Comment on the effect the outlier has on the confidence interval:

The outlier, which is much higher than the other values ($459,900), has a substantial impact on the confidence interval. In part A, with the outlier included, the confidence interval ranges from $152,882.71 to $307,550.63. In part B, with the outlier removed, the confidence interval narrows to $162,255.33 to $256,417.39. Including the outlier raises the sample mean from $209,336.36 to $230,216.67 and the standard deviation from $49,274.26 to $86,250.07, so the interval is centered higher and is much wider. Removing the outlier decreases the variability of the data, resulting in a narrower confidence interval.