STAT 220: Basic Statistics for Quantitative Students

Spring 2006

Assignment due Apr. 14

Type or write out your answers to the following questions and turn them in during class on Apr. 14. As always, show all your work.

  1. Do exercises 14.2, 14.8, 14.11, 14.38

  2. You will probably find this sample R code helpful in completing this question. (Note: You should read the accompanying example and then copy and paste the sample code into R in order to understand what it does.) Use the STAT 100 dataset at http://www.stat.psu.edu/~dhunter/220/files/datasets/survey.csv to do the following:

    1. Produce a scatterplot of Idealwt vs. Weight (print and turn in this plot). Use Idealwt as the response variable in your plot. Instead of dots for each point, use the symbol "x" for each male and "o" for each female. Add two regression lines to your plot, one for the males and one for the females.

    2. Print out the regression output for females' ideal weight vs. weight (turn in this output). Based on this output, answer the following questions:

      1. What is the predicted increase in ideal weight for each increase of one pound in actual weight?

      2. Why does the intercept have no useful interpretation in this case?

      3. What proportion of the variation in ideal weight can be explained by its linear relationship with actual weight among females in this sample? How do you know this?

      4. What is the value of the correlation coefficient? How do you know whether it is positive or negative?

      5. Consider the hypothesis H0: There is no linear relationship between actual weight and ideal weight in the population. What is the t-statistic for testing this hypothesis? How many degrees of freedom are associated with this statistic?

    3. Produce a normal Q-Q plot of the residuals for both the male and the female regressions (you do not have to include these plots with your homework). Which of the assumptions on page 512 do these plots check? One of the two plots indicates that there are two extreme residuals. Which plot is it? Finally, what do the overall straight-line patterns observed in these plots indicate?
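
The mechanics of question 2 (group-specific plotting symbols, one regression line per group, regression output, and Q-Q plots of residuals) might be sketched in R roughly as follows. This is only an illustration on simulated data. It is not the sample code mentioned above and not a solution. The column name Sex, its coding as "Male"/"Female", and every number generated here are assumptions made up for the example, so check the real column names with names() after reading survey.csv.

```r
# Simulated stand-in for the survey data. All values are invented for
# illustration; the Sex column and its coding are assumptions.
set.seed(220)
n <- 40
Sex <- rep(c("Male", "Female"), each = n / 2)
Weight <- ifelse(Sex == "Male", rnorm(n, 175, 25), rnorm(n, 135, 20))
Idealwt <- 40 + 0.7 * Weight + rnorm(n, 0, 8)
survey <- data.frame(Sex, Weight, Idealwt)

# With the real data you would start from something like:
# survey <- read.csv("http://www.stat.psu.edu/~dhunter/220/files/datasets/survey.csv")

males <- subset(survey, Sex == "Male")
females <- subset(survey, Sex == "Female")

# Scatterplot with Idealwt as the response: "x" for males, "o" for females.
plot(Idealwt ~ Weight, data = survey,
     pch = ifelse(survey$Sex == "Male", "x", "o"),
     xlab = "Weight", ylab = "Ideal weight")

# Fit one simple linear regression per group and add both lines to the plot.
fit_m <- lm(Idealwt ~ Weight, data = males)
fit_f <- lm(Idealwt ~ Weight, data = females)
abline(fit_m, lty = 1)
abline(fit_f, lty = 2)
legend("topleft", legend = c("Male", "Female"),
       pch = c("x", "o"), lty = c(1, 2))

# Regression output for the female fit: coefficient table, R-squared,
# t-statistics, and residual degrees of freedom.
print(summary(fit_f))

# Normal Q-Q plots of the residuals, side by side.
par(mfrow = c(1, 2))
qqnorm(resid(fit_m), main = "Males: residuals")
qqline(resid(fit_m))
qqnorm(resid(fit_f), main = "Females: residuals")
qqline(resid(fit_f))
```

Splitting the data with subset() before calling lm() keeps each fit and its residuals separate, which makes it easy to compare the two Q-Q plots.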

As always, email me if you have questions.