- The sample variance $s^2$ is given by
(a) $s^2 = \sum (x - \bar{x})^2 / (n-1)$.
Alternatively,
(b) $s^2 = [\sum x^2 - (\sum x)^2 / n] / (n-1)$.
Example. Suppose the number of chocolate chips is counted in
each of five cookies (chocolate chip cookies of course), with the following
results:
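Cookie   x = # chips   deviation $(x - \bar{x})$   $(x - \bar{x})^2$   $x^2$
1        5             5 - 7 = -2                  4                   25
2        11            11 - 7 = 4                  16                  121
3        6             6 - 7 = -1                  1                   36
4        8             8 - 7 = 1                   1                   64
5        5             5 - 7 = -2                  4                   25
Totals: $\sum x = 35$, $\sum (x - \bar{x})^2 = 26$, $\sum x^2 = 271$
The sample mean is $\bar{x} = 35 / 5 = 7$, which is the value subtracted in the deviation column.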
Using (a), $s^2 = \sum (x - \bar{x})^2 / (n-1) = 26 / (5-1) = 26 / 4 = 6.5$
Using (b), $s^2 = [\sum x^2 - (\sum x)^2 / n] / (n-1) = [271 - (35)^2 / 5] / (5-1) = (271 - 245) / 4 = 26 / 4 = 6.5$
The sample standard deviation $s$ is the square root of
the sample variance:
$s = \sqrt{s^2} = \sqrt{6.5} = 2.55$
The quantities $s$ and $s^2$ are statistics,
that is, they are numerical values calculated from a sample.
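The cookie example can also be checked numerically. The following is a minimal Python sketch (Python is an assumption here; the worksheet itself uses Minitab) that evaluates formulas (a) and (b) on the five chip counts. The variable names are illustrative only.

```python
import math

# Chocolate chip counts for the five cookies in the example.
chips = [5, 11, 6, 8, 5]
n = len(chips)
mean = sum(chips) / n

# Formula (a): sum of squared deviations from the mean, divided by n - 1.
var_a = sum((x - mean) ** 2 for x in chips) / (n - 1)

# Formula (b): shortcut form using the sum of x and the sum of x squared.
var_b = (sum(x ** 2 for x in chips) - sum(chips) ** 2 / n) / (n - 1)

# The sample standard deviation is the square root of the sample variance.
sd = math.sqrt(var_a)

print(mean, var_a, var_b, round(sd, 2))  # prints: 7.0 6.5 6.5 2.55
```

Both formulas give the same variance, as they should; formula (b) simply avoids computing each deviation separately.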
- Find the sample mean $\bar{x}$, the sample variance $s^2$, and the
sample standard deviation (sd) of the following data on the number x of
dates 4 males had in the past month, using both formulas:
Male          1   2   3   4
x = # dates   3   2   6   1
Calculate $s$ and $s^2$ in Minitab to check your
calculations.
- Find the sample mean, sample variance, and sample standard deviation (sd)
for the following number x of tattoos that 3 randomly selected students had:
Student         George   Roberta   Mary
x = # Tattoos   2        4         5
Also, find the median. Mean = ______; Median = ______; $s^2$
= ______; $s$ = ______ .
Hint: it is easier to use the second formula above to find
$s^2$. Calculate $s$ and $s^2$ in Minitab to check your
calculations.
- Change the last number (5) in ii above to 20 and recalculate the
same 4 quantities. Then discuss briefly the effect of putting in a value far
from the middle value (median). The median is a statistic that is resistant
to outliers, in the sense that its value is not greatly affected by
outliers. The Interquartile Range is also a resistant statistic, whereas the
mean and sample standard deviation/variance are not resistant. Calculate $s$
and $s^2$ in Minitab to check your calculations.
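The idea of a resistant statistic can be illustrated with a short Python sketch. The data below are hypothetical (they are not the tattoo data from the exercise), and `summarize` is an illustrative helper name; it relies only on Python's standard `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor.

```python
import statistics

def summarize(data):
    """Return (mean, median, sample variance, sample sd) for a list of numbers."""
    return (
        statistics.mean(data),
        statistics.median(data),
        statistics.variance(data),  # sample variance, divides by n - 1
        statistics.stdev(data),     # square root of the sample variance
    )

# Hypothetical data, not taken from the exercises.
original = [3, 4, 5, 6, 7]
with_outlier = [3, 4, 5, 6, 47]  # last value replaced by a far-away value

for label, data in (("original", original), ("with outlier", with_outlier)):
    mean, median, var, sd = summarize(data)
    print(f"{label}: mean={mean}, median={median}, s^2={var}, s={sd:.2f}")
```

In this sketch the median is the same for both lists, while the mean and the sample variance change substantially once the far-away value is introduced.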
- The weights of students in the Physical Measurements Data Set have the
following statistics:
Males: Average = 175.92, SD s = 27.40
Females: Average = 130.46, SD s = 17.62
Using the Empirical Rule, find the endpoints of the
interval of weights which should contain about 68%, 95%, and 99.7% of the
weights for both males and females. It turns out that the actual percents in
the three intervals for males are 72.24%, 97.34%, and 98.86%, while for
females the actual percents are 75.2%, 95.73%, and 98.58%. See section 2.6 in
Heckard/Utts for a similar example on heights.
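For bell-shaped data, the Empirical Rule intervals are the mean plus or minus 1, 2, and 3 standard deviations. A minimal Python sketch of that arithmetic follows; the function name `empirical_rule_intervals` and the mean and sd passed to it are hypothetical and are not taken from the Physical Measurements Data Set.

```python
def empirical_rule_intervals(mean, sd):
    """Return the mean +/- 1, 2, and 3 sd intervals keyed by approximate coverage."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {coverage[k]: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

# Hypothetical bell-shaped data with mean 100 and sd 15 (not from the data set).
for pct, (low, high) in empirical_rule_intervals(100, 15).items():
    print(f"about {pct} of values between {low} and {high}")
```

The same function can be called with the male and female averages and SDs given in the problem to check hand calculations.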
- The probability distribution of the number X of times students change
their major(s) is as follows:
x    p(x)
0    .50
1    .30
2    .10
3    .07
4    .03
i. What proportion of students change their majors exactly 2
times? Answer: ____
ii. What proportion of students change their majors at most 2 times (at most 2
means 2 or fewer times)? Answer: ____
iii. Find the mean $\mu$ of the
probability distribution (show work above). Answer: ____
iv. Find the population variance $\sigma^2$ and SD $\sigma$, respectively.
Answer: ____ and ____
4. The probability distribution of the number X of
days/week college students have at least one alcoholic drink is given below.
Find the mean $\mu$, variance $\sigma^2$, and SD $\sigma$ of X.
x    p(x)
0    .15
1    .15
2    .35
3    .25
4    .08
5    .02
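For a discrete probability distribution, the standard formulas are $\mu = \sum x \, p(x)$ and $\sigma^2 = \sum (x - \mu)^2 p(x)$, with $\sigma = \sqrt{\sigma^2}$. The Python sketch below applies them to a hypothetical three-value distribution (not one of the exercises); `distribution_summary` is an illustrative name.

```python
import math

def distribution_summary(values, probs):
    """Return (mean, variance, sd) of a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Mean: each value weighted by its probability.
    mu = sum(x * p for x, p in zip(values, probs))
    # Variance: squared deviation from the mean, weighted by probability.
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var, math.sqrt(var)

# Hypothetical distribution, not one of the exercises.
values = [0, 1, 2]
probs = [0.25, 0.50, 0.25]
mu, var, sd = distribution_summary(values, probs)
print(mu, var, round(sd, 3))  # prints: 1.0 0.5 0.707
```

Note that there is no n - 1 divisor here: the probabilities describe the whole population, so the result is a population variance rather than a sample variance.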
- Write a short paragraph explaining the difference between the sample mean
and the population mean, using the random variable X in problem 4 above.
Assume that the population consists of the numbers 0, 1, 2, 3, 4, and 5
corresponding to the number of days per week each of the 30,000 PSU students
has at least one alcoholic beverage. Assume also that one has a sample of
size 800 from this population.
- Write a short paragraph explaining the difference between the sample sd
and the population sd, using the random variable X in problem 4 above.
Assume that the population consists of the numbers 0, 1, 2, 3, 4, and 5
corresponding to the number of days per week each of the 30,000 PSU students
has at least one alcoholic beverage. Assume also that one has a sample of
size 800 from this population. Also answer the following questions:
- Is the population mean a fixed quantity? Or does it change with each
sample that is taken?
- Is the sample mean a fixed quantity? Or does it change with each sample
that is taken?
- What is the formula used to calculate the sample sd?
- What is the formula used to calculate the population sd?
- Is the sample mean a statistic or a parameter?
- Is the population mean a statistic or parameter?
- What is the relation between the sample mean and population mean?
- What is the relation between the sample sd and the population sd?
Short paragraph:
Answer to questions:
- ------------------------------------------
- ------------------------------------------
- Formula:
- Formula:
- ------------------------------------------
- -------------------------------------------
- ----------------------------------------------------------------------------------------------------------------
- ----------------------------------------------------------------------------------------------------------------