STAT 220: Basic Statistics for Quantitative Students

Spring 2006

Assignment due Feb. 17

Type or write your answers to the following questions and turn them in during class on Feb. 17.

  1. There is a dataset on the web that includes the responses to many questions that were administered to the students in three different sections of STAT 100 (one in spring 2004, one in spring 2005, and one in fall 2005). The dataset is stored as a .csv file, which means that the columns are separated by commas. The dataset, called survey.csv, and another file describing the variables, called survey.txt, are in the datasets directory. Read this dataset into R using the following line:
    s = read.csv("http://www.stat.psu.edu/~dhunter/220/files/datasets/survey.csv", na.strings="")
    
    Note: The na.strings="" argument is there so that any blanks are read in as NA. Assuming this sample is representative of some population of interest, answer the following questions about this population. Express your answers as formal hypothesis tests. (That is, give hypotheses, calculate test statistics and p-values, and express your conclusions in plain language.)
    1. Is the proportion of students with pierced ears different in the spring semesters (SP04 and SP05) than in the fall semester (FA05)? The relevant columns are "Class" and "Earprc".
    2. Is the mean GPA for nonsmokers different from that of smokers? The relevant columns are "Cigpacks" and "GPA". The former gives the weekly number of packs of cigarettes smoked, so you may want to create a new TRUE/FALSE variable that tells whether each person is a smoker.
  2. Do exercises 13.12, 13.29, 13.37, 13.54, 13.70, and 13.75.
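
For question 1, the R mechanics can be sketched on a small toy data frame. Everything in the sketch is an assumption made for illustration: the rows are made up, the "Yes"/"No" coding of Earprc is a guess (check survey.txt for the real coding), and the names toy, Smoker, and Spring are hypothetical. The functions prop.test and t.test are one reasonable way to carry out the two comparisons, not necessarily the only acceptable one.

```r
# Made-up rows for illustration only; they are NOT taken from survey.csv.
# Blank fields are read in as NA because of na.strings = "".
toy_csv <- "Class,Earprc,Cigpacks,GPA
SP04,Yes,0,3.4
SP04,No,1,2.9
SP05,Yes,0,3.7
SP05,Yes,,3.1
SP05,No,0,3.0
FA05,No,2,2.8
FA05,Yes,0,3.6
FA05,No,0,
FA05,Yes,3,3.2
FA05,No,0,3.3"
toy <- read.csv(text = toy_csv, na.strings = "")

# TRUE/FALSE smoker indicator; it is NA wherever Cigpacks is missing.
toy$Smoker <- toy$Cigpacks > 0

# TRUE for the two spring sections, FALSE for the fall section.
toy$Spring <- toy$Class %in% c("SP04", "SP05")

# Two-sample test of proportions: pierced-ear counts and group sizes
# for fall (FALSE) and spring (TRUE). With counts this small, R warns
# that the chi-squared approximation may be incorrect.
tab <- table(toy$Spring, toy$Earprc)
prop_result <- prop.test(tab[, "Yes"], rowSums(tab))

# Welch two-sample t-test of GPA by smoking status; rows with a
# missing GPA or a missing Smoker value are dropped automatically.
t_result <- t.test(GPA ~ Smoker, data = toy)

print(prop_result)
print(t_result)
```

On the real data, the same steps would be applied to s in place of toy, after checking how each column is actually coded.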
As always, email me if you have questions.