Review Topics for Mid-term: Basic Concepts and Things to Know

  1. Basic ideas of statistical inference
    1. Types of hypotheses (research, null, and alternative)
    2. Type I and Type II errors
    3. Rejection regions (ex. reject H0 : p = p0 in favor of the alternative Ha : p < p0 if the sample proportion is much smaller than p0).
    4. P-values: definition and how they are used (for example, reject H0 : p = p0 in favor of the alternative Ha : p < p0 if the p-value is less than .05. The p-value here is the probability, computed assuming H0 is true, of getting the sample proportion you got or an even smaller one).
    5. Specification of null and alternative hypotheses for one proportion, two proportions, three or more proportions, one mean, two means, three or more means, in terms of symbols. Ex. Test about a population proportion: H0 : p = p0, Ha : p < p0; the symbol here is p (for a proportion). For two means, we would use H0 : μ1 = μ2, Ha : μ1 ≠ μ2, a two-sided alternative with the rejection region being: reject H0 if the difference between the two sample means is either large and negative or large and positive; the symbol here is μ, with subscripts to denote the population means.
    6. Confidence Intervals: their components and how they are put together, margin of error, confidence level
    7. Deciding whether the test is about proportions or means (look at the response variable: if it is categorical, it is about proportions; if it is numerical (quantitative), it is about means).

Suggested materials to look at: Handout on Testing Statistical Hypotheses (worked on in week 2), Cyberstats C-1, Project I.
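
As a rough illustration of items 4 and 6 above, here is a short Python sketch of a lower-tail p-value and a confidence interval for one proportion. The data (13 successes in 50 trials), the null value p0 = 0.4, and the function names are made up for this example. The p-value is computed exactly from the binomial distribution, and the interval uses the usual large-sample formula with z = 1.96 for 95% confidence; the handout or Cyberstats may use a normal approximation instead, so treat this as one possible way to do the calculation.

```python
from math import comb, sqrt

def lower_tail_p_value(successes, n, p0):
    """P(X <= successes) when X ~ Binomial(n, p0), i.e. computed assuming H0: p = p0 is true."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(successes + 1))

def proportion_confidence_interval(successes, n, z=1.96):
    """Large-sample interval: sample proportion +/- margin of error (z = 1.96 for 95%)."""
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up data for illustration: 13 successes in 50 trials,
# testing H0: p = 0.4 against Ha: p < 0.4.
p_value = lower_tail_p_value(13, 50, 0.4)
reject_h0 = p_value < 0.05            # decision rule from item 4
low, high = proportion_confidence_interval(13, 50)

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

The interval has the two components named in item 6: the estimate (the sample proportion) and the margin of error, which depends on the confidence level through z.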

2.   Numerical and graphical methods for describing data
    1. One categorical variable: numerical description is a count and a sample proportion. Graphical displays include pie charts and bar charts.
    2. Two categorical variables: numerical summaries include tallies for each variable and a cross-tab of both (with explanatory variable forming rows of the table). Graphical displays include side-by-side pie charts of each variable separately, using percents.
    3. Numerical (quantitative) variables: numerical descriptions include the sample mean and standard deviation, five-number summary, percentiles, and interquartile range (IQR). Graphical displays include histograms, stem-and-leaf diagrams, and boxplots.
    4. Measures of location (center): mean and median.
    5. Measures of variability/spread: standard deviation, range, and interquartile range, and how they are calculated.
    6. Measures of position: percentiles
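
The numerical summaries in this section can be sketched in Python with the standard library alone. The data below are made up for illustration. Note that textbooks and software use slightly different rules for quartiles, so the values from `statistics.quantiles` may differ a little from a hand calculation done the way we did it in class.

```python
from collections import Counter
from statistics import mean, median, stdev, quantiles

# Made-up categorical responses (illustration only).
answers = ["yes", "no", "yes", "yes", "no", "yes"]
counts = Counter(answers)                      # count for each category
proportion_yes = counts["yes"] / len(answers)  # sample proportion

# Made-up numerical data (illustration only).
scores = [62, 70, 71, 75, 78, 80, 84, 85, 91, 98]

# Quartiles; the rule used here is the library default and may differ from the textbook's.
q1, q2, q3 = quantiles(scores, n=4)
five_number = (min(scores), q1, q2, q3, max(scores))
iqr = q3 - q1                                  # interquartile range
data_range = max(scores) - min(scores)         # range

print("counts:", dict(counts), "proportion of yes:", round(proportion_yes, 3))
print("mean:", mean(scores), "median:", median(scores))
print("sample standard deviation:", round(stdev(scores), 2))
print("five-number summary:", five_number)
print("IQR:", iqr, "range:", data_range)
```
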
3.   Random Variables, Probability Distributions, and Expected Values
    1. Definition of 'random variable'
    2. Definition of 'Probability Distribution' and how to answer questions about the probability of values of the random variable. Definition of 'cumulative probability distribution' and what it gives.
    3. Relation between a 'population' and a 'probability distribution' (probability distribution summarizes the values in the population).
    4. Mean of a population/probability distribution/random variable (three equivalent concepts).
    5. How the mean of a probability distribution is calculated [μ = Σ x p(x)].
    6. How the population variance σ² is calculated [σ² = Σ x² p(x) − μ²]; population standard deviation σ = square root of the population variance.
    7. Difference between a statistic and a parameter.
    8. Difference between the sample mean and the population mean, and the difference between the sample standard deviation s and the population standard deviation σ.
    9. The binomial distribution: what it describes, its mean and standard deviation.
    10. The normal distribution: types of random variables it often describes.
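
To make the formulas in items 5, 6, and 9 concrete, here is a small Python sketch that computes the mean and standard deviation of a probability distribution stored as a table of values and probabilities. The function names and the example distributions are chosen only for illustration. The last part checks numerically that a binomial distribution with n trials and success probability p has mean np and standard deviation sqrt(np(1 − p)).

```python
from math import comb, sqrt

def distribution_mean(dist):
    """mu = sum of x * p(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def distribution_variance(dist):
    """sigma^2 = (sum of x^2 * p(x)) - mu^2."""
    mu = distribution_mean(dist)
    return sum(x**2 * p for x, p in dist.items()) - mu**2

# Example distribution: number of heads in two tosses of a fair coin.
heads = {0: 0.25, 1: 0.50, 2: 0.25}
mu = distribution_mean(heads)
sigma = sqrt(distribution_variance(heads))
print("mean:", mu, "standard deviation:", round(sigma, 4))

# Binomial(n, p) written out as a probability distribution (n and p chosen for illustration).
n, p = 10, 0.3
binomial = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# Compare the table-based results with the shortcut formulas np and sqrt(np(1 - p)).
print("binomial mean:", round(distribution_mean(binomial), 6), "vs n*p =", n * p)
print("binomial sd:", round(sqrt(distribution_variance(binomial)), 6),
      "vs sqrt(n*p*(1-p)) =", round(sqrt(n * p * (1 - p)), 6))
```
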
4.  Types of variables
    1. Categorical
    2. Numerical (quantitative)

It would probably be helpful to review all of the RATs, Homework 2, and the CSL activities (especially CSL 4A, 4B, 5A, and Project I). We will have two more working activities in the labs: one on the binomial and normal distributions, and a second on sampling distributions (which I will not hold you responsible for on the Mid-term).