Let's get started! Here is what you will learn in this lesson.
Learning objectives for this lesson
Upon completion of this lesson, you should be able to do the following:
- Understand the relationship between the slope of the regression line and correlation,
- Comprehend the meaning of the Coefficient of Determination, R²,
- Know how to determine which variable is the response and which is the explanatory variable in a regression equation,
- Understand that correlation measures the strength of a linear relationship between two variables,
- Realize how outliers can influence a regression equation, and
- Determine if variables are categorical or quantitative.
Examining Relationships Between Two Variables
Previously we considered the distribution of a single quantitative variable. Now we will study the relationship between two variables, where each variable is either qualitative (i.e. categorical) or quantitative. When we consider the relationship between two variables, there are three possibilities:
- Both variables are categorical. We analyze the association by comparing conditional probabilities, and we summarize the data in contingency tables. Examples of categorical variables are gender and class standing.
- Both variables are quantitative. To analyze this situation we consider how one variable, called the response variable, changes in relation to changes in the other variable, called the explanatory variable. Graphically, we use scatterplots to display two quantitative variables. Examples are age, height, and weight (i.e. things that are measured).
- One variable is categorical and the other is quantitative, for instance height and gender. These are best compared by using side-by-side boxplots to display any differences or similarities in the center and variability of the quantitative variable (e.g. height) across the categories (e.g. Male and Female).
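The three cases above can be sketched in a few lines of Python. This is only an illustration, not part of the lesson's own materials: the `students` records are made-up values, and the names `conditional`, `correlation`, and `summary` are hypothetical helpers chosen for this example. It uses only the standard library and prints numeric summaries in place of the plots (contingency table, scatterplot, side-by-side boxplots) that the lesson describes.

```python
from collections import Counter, defaultdict
from statistics import mean, median, stdev

# Hypothetical records (invented for illustration):
# (gender, class standing, height in inches, weight in pounds)
students = [
    ("Female", "Freshman", 64, 125),
    ("Male", "Freshman", 70, 160),
    ("Female", "Senior", 66, 135),
    ("Male", "Senior", 72, 180),
    ("Female", "Freshman", 62, 115),
    ("Male", "Senior", 68, 155),
]

# Case 1: both variables categorical.
# Build contingency-table counts, then the conditional proportion of each
# class standing given gender (cell count divided by its row total).
counts = Counter((g, c) for g, c, _, _ in students)
row_totals = Counter(g for g, _, _, _ in students)
conditional = {(g, c): n / row_totals[g] for (g, c), n in counts.items()}

# Case 2: both variables quantitative.
# Height is treated as the explanatory variable and weight as the response.
def correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

heights = [h for _, _, h, _ in students]
weights = [w for _, _, _, w in students]
r = correlation(heights, weights)

# Case 3: one categorical and one quantitative variable.
# Group heights by gender and compare center (median) and variability
# (standard deviation), the same features a side-by-side boxplot displays.
by_gender = defaultdict(list)
for g, _, h, _ in students:
    by_gender[g].append(h)
summary = {g: (median(hs), stdev(hs)) for g, hs in by_gender.items()}

print("Conditional proportions:", conditional)
print("Correlation of height and weight:", round(r, 3))
print("Median and SD of height by gender:", summary)
```

With real data, the same three summaries would normally be paired with the corresponding graph; the numbers here only show which calculation goes with which combination of variable types.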