# Factor Analysis

### Varimax Rotation

Varimax rotation is the most commonly used of the available rotations. It begins by scaling the loadings: each loading is divided by the square root of the corresponding communality, as shown below:

$$\tilde{l}^*_{ij} = \frac{\hat{l}^*_{ij}}{\hat{h}_i}$$

Here $\hat{l}^*_{ij}$ is the loading of the ith variable on the jth factor after rotation, and $\hat{h}_i^2$ is the communality for variable i, so that $\hat{h}_i$ is its square root. The goal is to find the rotation that maximizes a criterion built from these scaled loadings.
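
The scaling step can be sketched in a few lines of Python. This is only an illustration, assuming the usual Kaiser convention in which each variable's loadings are divided by the square root of its communality; the function name `kaiser_scale` is hypothetical, and the two example rows are taken from the rotated loadings reported later in this lesson.

```python
import math

def kaiser_scale(loadings):
    """Divide each variable's loadings by h_i, the square root of its communality."""
    scaled = []
    for row in loadings:
        # Communality = sum of squared loadings across factors (assumed nonzero).
        h = math.sqrt(sum(value ** 2 for value in row))
        scaled.append([value / h for value in row])
    return scaled

# Loadings for Climate and Housing, used purely as an illustration.
example = [[0.021, 0.239, 0.859], [0.438, 0.547, 0.166]]
scaled = kaiser_scale(example)
for row in scaled:
    # After scaling, every row has a sum of squares equal to 1 (up to rounding).
    print([round(value, 3) for value in row], round(sum(v * v for v in row), 6))
```

After scaling, every variable carries the same weight in the rotation, regardless of how much of its variance the factors explain.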

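The Varimax procedure selects the rotation that maximizes the quantity

$$V = \frac{1}{p}\sum_{j=1}^{m}\left\{\sum_{i=1}^{p}\left(\tilde{l}^*_{ij}\right)^4 - \frac{1}{p}\left(\sum_{i=1}^{p}\left(\tilde{l}^*_{ij}\right)^2\right)^2\right\}$$

where $\tilde{l}^*_{ij}$ is the scaled loading of variable i on factor j after rotation, p is the number of variables, and m is the number of factors.

This is the sample variance of the squared standardized loadings for each factor, summed over the m factors. Our objective is to find the factor rotation that maximizes this variance.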
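
To make the criterion concrete, here is a minimal Python sketch. It assumes the criterion is the variance of the squared scaled loadings within each factor, summed over factors, and it searches for the maximum by rotating one pair of factors at a time using the closed-form angle for a planar rotation. The function names are hypothetical, and this is not a description of how SAS carries out the rotation.

```python
import math

def varimax_criterion(scaled):
    """Variance of the squared scaled loadings per factor, summed over factors."""
    p, m = len(scaled), len(scaled[0])
    total = 0.0
    for j in range(m):
        squares = [row[j] ** 2 for row in scaled]
        total += sum(s * s for s in squares) / p - (sum(squares) / p) ** 2
    return total

def varimax_rotate(scaled, tol=1e-10, max_sweeps=100):
    """Rotate pairs of factor columns until no pair can raise the criterion."""
    p, m = len(scaled), len(scaled[0])
    rotated = [list(row) for row in scaled]
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for j in range(m - 1):
            for k in range(j + 1, m):
                u = [row[j] ** 2 - row[k] ** 2 for row in rotated]
                v = [2.0 * row[j] * row[k] for row in rotated]
                a, b = sum(u), sum(v)
                c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the criterion for this pair of columns.
                angle = 0.25 * math.atan2(d - 2.0 * a * b / p,
                                          c - (a * a - b * b) / p)
                cos_a, sin_a = math.cos(angle), math.sin(angle)
                for row in rotated:
                    x, y = row[j], row[k]
                    row[j] = x * cos_a + y * sin_a
                    row[k] = -x * sin_a + y * cos_a
                largest_angle = max(largest_angle, abs(angle))
        if largest_angle < tol:  # no pair moved: a stationary point was reached
            break
    return rotated

# A made-up example: simple structure deliberately blurred by a 45 degree rotation.
a = 1 / math.sqrt(2)
mixed = [[a, a], [a, a], [-a, a], [-a, a]]
recovered = varimax_rotate(mixed)
print(varimax_criterion(mixed))      # close to 0.0
print(varimax_criterion(recovered))  # close to 0.5
```

In the made-up example every loading has the same magnitude before rotation, so the criterion is zero. After rotation each variable loads on a single factor and the criterion rises to 0.5.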

Returning to the options of the factor procedure:

"rotate" asks for factor rotation, and here we have specified the Varimax rotation of our factor loadings.

"plot" asks for the same kind of plot that we were just looking at for the rotated factors. The result of our rotation is a new factor pattern which is given below (page 11 of SAS output):

Here is a copy of page 10 from the SAS output.

At the top of page 10 of the output is the orthogonal rotation matrix T. The rotated factor loadings from the SAS output are copied here:

| Variable | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|
| Climate | 0.021 | 0.239 | 0.859 |
| Housing | 0.438 | 0.547 | 0.166 |
| Health | 0.829 | 0.127 | 0.137 |
| Crime | 0.031 | 0.702 | 0.139 |
| Transportation | 0.652 | 0.289 | -0.028 |
| Education | 0.734 | -0.094 | -0.117 |
| Arts | 0.738 | 0.432 | 0.150 |
| Recreation | 0.301 | 0.656 | 0.099 |
| Economics | -0.022 | 0.651 | -0.551 |

We now look at this new set of loadings to see whether the rotation makes the data easier to interpret. Focusing on the values that are large in magnitude, we can make the following interpretation. Note that it is much cleaner than the interpretation of the original analysis.

• Factor 1: primarily a measure of Health, but also increases with increasing scores for Transportation, Education, and the Arts. As each of these variables increases, so do the other three.
• Factor 2: primarily a measure of Crime, Recreation, and the Economy. As one variable increases, so do the other two.
• Factor 3: primarily a measure of Climate alone.
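
Reading off the large loadings can also be done mechanically. The sketch below flags every loading whose absolute value reaches a cutoff. The cutoff of 0.6 is an assumption chosen here only because it reproduces the grouping in the list above; the lesson does not state what cutoff, if any, was used. The function name `salient_variables` is hypothetical.

```python
# Rotated factor loadings copied from the SAS output table above.
LOADINGS = {
    "Climate": (0.021, 0.239, 0.859),
    "Housing": (0.438, 0.547, 0.166),
    "Health": (0.829, 0.127, 0.137),
    "Crime": (0.031, 0.702, 0.139),
    "Transportation": (0.652, 0.289, -0.028),
    "Education": (0.734, -0.094, -0.117),
    "Arts": (0.738, 0.432, 0.150),
    "Recreation": (0.301, 0.656, 0.099),
    "Economics": (-0.022, 0.651, -0.551),
}

def salient_variables(loadings, cutoff):
    """Group variables by the factors on which |loading| >= cutoff."""
    n_factors = len(next(iter(loadings.values())))
    groups = {j + 1: [] for j in range(n_factors)}
    for name, row in loadings.items():
        for j, value in enumerate(row):
            if abs(value) >= cutoff:
                groups[j + 1].append(name)
    return groups

# The cutoff 0.6 is an illustrative assumption, not a value given in the lesson.
groups = salient_variables(LOADINGS, cutoff=0.6)
for factor, names in groups.items():
    print(f"Factor {factor}: {', '.join(names)}")
```

A lower cutoff such as 0.5 would also flag Housing on Factor 2 and Economics (with a negative sign) on Factor 3, so the grouping depends on where the line is drawn.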

This is just the pattern that exists in the data and no causal inferences should be made from this interpretation. It does not tell us why this pattern exists. It could very well be that there are other essential factors that are not seen at work here.

Let's compare the amount of variation explained by each factor under the rotated model with what it was under the original model:

| Factor | Original | Rotated |
|---|---|---|
| 1 | 3.2978 | 2.4798 |
| 2 | 1.2136 | 1.9835 |
| 3 | 1.1055 | 1.1536 |
| Total | 5.6169 | 5.6169 |

The total amount of variation explained by the three factors is identical under both models. Rotation among a fixed number of factors does not change how much of the variation the model explains, so the fit is equally good regardless of which rotation is used.
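
The invariance of the total can be checked numerically. In the sketch below, the variance explained by a factor is taken to be the sum of squared loadings in its column, and the loadings are the three-decimal values from the rotated loadings table, so the sums only approximate the figures printed by SAS. The 30 degree angle and the helper names are arbitrary choices made for illustration.

```python
import math

# Rotated loadings from the table in this lesson (rounded to three decimals).
ROTATED = [
    [0.021, 0.239, 0.859], [0.438, 0.547, 0.166], [0.829, 0.127, 0.137],
    [0.031, 0.702, 0.139], [0.652, 0.289, -0.028], [0.734, -0.094, -0.117],
    [0.738, 0.432, 0.150], [0.301, 0.656, 0.099], [-0.022, 0.651, -0.551],
]

def variance_by_factor(loadings):
    """Sum of squared loadings in each factor column."""
    return [sum(row[j] ** 2 for row in loadings) for j in range(len(loadings[0]))]

def rotate_pair(loadings, j, k, angle):
    """Apply an orthogonal rotation to factor columns j and k only."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for row in loadings:
        new_row = list(row)
        new_row[j] = row[j] * c + row[k] * s
        new_row[k] = -row[j] * s + row[k] * c
        out.append(new_row)
    return out

before = variance_by_factor(ROTATED)
# Any orthogonal rotation will do; 30 degrees is an arbitrary choice.
after = variance_by_factor(rotate_pair(ROTATED, 0, 1, math.radians(30)))

print([round(v, 4) for v in before], round(sum(before), 4))
print([round(v, 4) for v in after], round(sum(after), 4))
```

The per-factor values change with the rotation while their total does not, because an orthogonal rotation leaves each variable's communality (its row sum of squares) unchanged.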

However, notice what happened to the first factor: the amount of variation it explains drops considerably, from 3.2978 to 2.4798. The cleaner interpretation comes at a cost. The rotation takes variation explained by the first factor and redistributes it among the other two factors, in this case mostly to the second factor.

The total amount of variation explained by the rotated factor model is unchanged, but the contributions of the individual factors are not. We gain a cleaner interpretation, but the first factor no longer explains as much of the variation. This is not a particularly large cost as long as we remain interested in all three factors.

What we are trying to do here is clean up our interpretation. Ideally, if this works well, the numbers in each column of the rotated loading matrix will be either far away from zero or close to zero. A column in which every loading is close to 1, -1, or 0 would give the cleanest interpretation one could obtain, and this is what a rotation of the data tries to find. However, data are seldom this cooperative!

Reminder: our objective here is not hypothesis testing but data interpretation. The success of the analysis can be judged by how well it helps you make your interpretation. If it does not help you, then the analysis is a failure. If it does give you some insight into the pattern of variability in the data, then the analysis is a success.
