Directions: In this portfolio project, you will collect a set of data and analyze the characteristics of its distribution. Allow yourself plenty of time to complete Part 1.

Part 1
Collect a set of data with at least 30 data points. The data should be quantitative, which means that it should be measured using numbers. You can be as creative as you’d like, but here are some suggestions for things that you can survey.
• the heights of a large number of people
• the number of pages in a set of books on a bookshelf
• the number of hits earned by different professional baseball players in a season
Part 2
Create a visual representation of your data. If the data is continuous, use a histogram. If the data is discrete, use a bar graph. Make sure to label the axes with appropriate titles and incorporate the appropriate scale on each axis.
Part 3
Respond to the following questions.
1. What are the mean and standard deviation of the set of data?
2. Does the data follow a normal distribution? Be sure to mathematically justify your answer.
3. Answer one of the following questions.
a. If your sample follows a normal distribution, does this make sense to you? Explain why.
b. If your sample does not follow a normal distribution (e.g., it could be skewed left or right, have a uniform distribution, or have some other shape), why might that be?
4. Describe your survey process. What are some sources of possible bias in your sample? Alternatively, what did you do to ensure a random sample?
5. What is a set of data that you would like to study in the future? How could you go about ensuring an unbiased random sample?
Submission
Make sure to submit your data set from Part 1, your histogram or bar graph from Part 2, and your responses to the questions in Part 3.

Part 1:

I have collected data on the heights of 50 people in my neighborhood. The heights are measured in inches.

Data set:
65, 67, 70, 63, 68, 72, 66, 71, 69, 64, 67, 70, 75, 68, 73, 66, 72, 70, 64, 69, 70, 67, 71, 68, 67, 74, 72, 67, 66, 73, 68, 71, 69, 72, 65, 68, 70, 73, 71, 67, 68, 69, 71, 72, 70, 66, 68, 73, 69

Part 2:
Histogram of heights of people in the neighborhood

Height (in inches) | Frequency
---------------------|-------------
60-64 | 3
65-69 | 24
70-74 | 21
75-79 | 1
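
A frequency table like this one can be tallied with a short script instead of by hand. The following is a minimal Python sketch, assuming the values listed in Part 1 and 5-inch bins that include their lower edge and exclude their upper edge (so 65 falls in the bin starting at 65). The function name bin_counts and the bin settings are illustrative choices, not part of the assignment.

```python
# Heights in inches, copied from the data set in Part 1.
heights = [
    65, 67, 70, 63, 68, 72, 66, 71, 69, 64, 67, 70, 75, 68, 73, 66, 72,
    70, 64, 69, 70, 67, 71, 68, 67, 74, 72, 67, 66, 73, 68, 71, 69, 72,
    65, 68, 70, 73, 71, 67, 68, 69, 71, 72, 70, 66, 68, 73, 69,
]

# Assumed bin layout: bins start at 60 inches and are 5 inches wide.
START = 60
WIDTH = 5


def bin_counts(values, start=START, width=WIDTH):
    """Return {(low, high): count} for half-open bins low <= v < high."""
    counts = {}
    for v in values:
        low = start + ((v - start) // width) * width
        key = (low, low + width)
        counts[key] = counts.get(key, 0) + 1
    # Sort by lower bin edge so the table prints in order.
    return dict(sorted(counts.items()))


counts = bin_counts(heights)
for (low, high), n in counts.items():
    # Heights are whole inches, so the bin 65 <= v < 70 is labeled 65-69.
    print(f"{low}-{high - 1} | {n}")
```

Checking that the frequencies add up to the number of data points is a quick way to catch a miscounted bin.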

Part 3:
1. Using the 49 values listed in the data set, the mean height is (sum of all heights) / 49 = 3384 / 49 ≈ 69.06 inches. The sample standard deviation is approximately 2.82 inches.
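2. To test whether the data follow a normal distribution, I can calculate the skewness and kurtosis. If the skewness is close to 0 and the kurtosis is close to 3, the data are consistent with a normal distribution. For the values listed, the skewness is about -0.05 and the kurtosis is about 2.34.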
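3. My sample does not follow a normal distribution exactly. It is nearly symmetric rather than skewed, but it is somewhat flatter than a normal curve, with fewer values in the tails than a normal distribution would predict. This could be because of the small sample size or due to inherent characteristics of the population in my neighborhood.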
4. To conduct the survey, I simply asked individuals for their height. This could introduce bias if some people chose not to disclose their height or if I inadvertently surveyed only certain groups. To obtain a random sample, I could have numbered every resident of the neighborhood and used a random number generator to choose whom to ask.
5. In the future, I would like to study the distribution of household incomes in my city. To ensure an unbiased random sample, I could use a stratified random sampling method where I divide the population into income brackets and then randomly select individuals from each bracket to ensure representation across all income levels.
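
The statistics discussed in answers 1 and 2 can be checked with a short script. The following is a minimal Python sketch, assuming the values listed in Part 1. The skewness and kurtosis use the simple moment-based formulas with no bias correction, which is only one of several conventions, so other tools may report slightly different numbers.

```python
import statistics

# Heights in inches, copied from the data set in Part 1.
heights = [
    65, 67, 70, 63, 68, 72, 66, 71, 69, 64, 67, 70, 75, 68, 73, 66, 72,
    70, 64, 69, 70, 67, 71, 68, 67, 74, 72, 67, 66, 73, 68, 71, 69, 72,
    65, 68, 70, 73, 71, 67, 68, 69, 71, 72, 70, 66, 68, 73, 69,
]

n = len(heights)
mean = statistics.mean(heights)
sample_sd = statistics.stdev(heights)       # divides by n - 1
population_sd = statistics.pstdev(heights)  # divides by n

# Central moments about the mean (population form, no bias correction).
m2 = sum((x - mean) ** 2 for x in heights) / n
m3 = sum((x - mean) ** 3 for x in heights) / n
m4 = sum((x - mean) ** 4 for x in heights) / n

skewness = m3 / m2 ** 1.5  # 0 for a perfectly symmetric distribution
kurtosis = m4 / m2 ** 2    # 3 for a normal distribution

print(f"n = {n}")
print(f"mean = {mean:.2f} in")
print(f"sample SD = {sample_sd:.2f} in, population SD = {population_sd:.2f} in")
print(f"skewness = {skewness:.2f}, kurtosis = {kurtosis:.2f}")
```

With a sample this small, moderate departures from 0 and 3 are not strong evidence against normality, so these numbers are best read alongside the histogram.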