STAT 220: Basic Statistics for Quantitative Students
Spring 2006
Assignment due Apr. 14
Type or write your answers to the following questions and turn them in during class on Apr. 14.
As always, show all your work.
- Do exercises 14.2, 14.8, 14.11, 14.38
- You will probably find this sample R code helpful in completing this question.
(Note: You should read the accompanying example and then copy and paste the
sample code into R in order to understand what it does.)
Use the STAT 100 dataset at
http://www.stat.psu.edu/~dhunter/220/files/datasets/survey.csv
to do the following:
- Produce a scatterplot of Idealwt vs. Weight (print and turn in this plot).
Use Idealwt as the response variable in
your plot. Instead of dots for each point, use the symbol "x" for each male and "o" for
each female. Add two regression lines to your plot, one for the males and one for the females.
- Print out the regression output for females' ideal weight vs. weight (turn in this output).
Based on this output,
answer the following questions:
- What is the predicted increase in ideal weight for each increase of one pound in actual weight?
- Why does the intercept have no useful interpretation in this case?
- What proportion of the variation in ideal weight can be explained by its linear relationship
with actual weight among females in this sample? How do you know this?
- What is the value of the correlation coefficient? How do you know whether it is positive or negative?
- Consider the hypothesis H0: There is no linear relationship between actual weight and
ideal weight in the population. What is the t-statistic for testing this hypothesis? How many
degrees of freedom are associated with this statistic?
- Produce a normal Q-Q plot of the residuals for both the male and the female regressions (you do not
have to include these plots with your homework). Which of
the assumptions on page 512 do these plots check? One of the two plots indicates that there are two extreme
residuals. Which plot is it? Finally, what do the overall straight-line patterns observed in these
plots indicate?
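If it helps to see the general pattern for the plotting and regression steps above, here is a minimal sketch in R. It uses simulated data rather than the actual survey, and it is not the linked sample code. The column name Sex, the labels "Male" and "Female", and all of the numbers are assumptions made up for illustration; after reading in the real file, check its column names with names() before adapting anything.

```r
# Simulated stand-in for the survey data. The column name Sex, its labels,
# and every number below are made up for illustration only.
set.seed(220)
n <- 40
Sex <- rep(c("Male", "Female"), each = n / 2)
Weight <- ifelse(Sex == "Male", rnorm(n, 175, 25), rnorm(n, 135, 20))
Idealwt <- 40 + 0.7 * Weight + rnorm(n, 0, 8)
survey <- data.frame(Sex, Weight, Idealwt)

# Scatterplot with the response (Idealwt) on the vertical axis.
# pch accepts a character vector, so "x" marks males and "o" marks females.
plot(survey$Weight, survey$Idealwt,
     pch = ifelse(survey$Sex == "Male", "x", "o"),
     xlab = "Weight", ylab = "Idealwt")

# Separate least-squares fits for each group.
fit.m <- lm(Idealwt ~ Weight, data = survey, subset = Sex == "Male")
fit.f <- lm(Idealwt ~ Weight, data = survey, subset = Sex == "Female")

# Add one regression line per group, distinguished by line type.
abline(fit.m, lty = 1)
abline(fit.f, lty = 2)
legend("topleft", legend = c("Male", "Female"),
       pch = c("x", "o"), lty = c(1, 2))

# Regression output: coefficients, t-statistics, degrees of freedom, R-squared.
print(summary(fit.f))

# Normal Q-Q plot of the residuals from the female regression,
# with a reference line for comparison.
qqnorm(resid(fit.f))
qqline(resid(fit.f))
```

The same qqnorm() and qqline() calls applied to resid(fit.m) give the plot for the male regression. Because the data here are simulated, the numbers in this output say nothing about the answers for the real survey.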
As always, email me if you have questions.