Tuesday, November 4, 2008

Colinearity and Tolerance

When describing colinearity for part C of question 2, we are using tolerance statistics from SPSS (which have an inverse relationship with colinearity; i.e., the higher the tolerance value, the less colinear the variables are). SPSS gives tolerance statistics for each variable in every step (or block) of the model. My question: for each specified variable that is excluded (in the first step there are 4, in the second there are 3, and in the third there is 1), does the tolerance statistic associated with that variable refer to colinearity between that variable and the model at that point in time? For example, at Step 1, is the tolerance level reported for physical appearance the colinearity between that variable and the overall model, which up to that point has used intellectual ability and school competence (and does it therefore refer only to colinearity between physical appearance and these variables used together)? If you could help delineate what the tolerance levels mean for each block, it would be helpful.

Chris

6 comments:

Kris said...

Chris-
Each variable that you remove can change the tolerance either positively or negatively, based on how highly correlated that variable is with the remaining variables in the block. As you remove variables in successive blocks, you will likely see the tolerance either increase (if the removed variable is correlated with the remaining variables) or stay the same (if it is not). Therefore, if a highly collinear variable is removed, you should see an increase in tolerance for the next block in your analysis. Thus, if you see a large increase in tolerance (e.g., from .20 to .80) after removing a variable from block one and rerunning the analysis in block two without it, then you know that you have removed a variable that is highly correlated with the variables that remained in the analysis. This could be checked with a simple correlation matrix, which will show which particular variables are correlated with one another. You should expect the more highly correlated variables to produce an increase in tolerance when they are removed from the analysis blocks. If I understand your question correctly, in Step 1, your tolerance level should be indicative of the collinearity of physical appearance with all the other variables that are in that regression analysis block and not just the whole model. Only when you remove that variable in Step 2 do you compare the values and see the effect of physical appearance on the remaining model. Does that answer your question?

Kris said...

Actually, I should clarify one other thing. When I said "change the tolerance either positively or negatively," a more accurate statement would be that removing a variable should only increase your tolerance or keep its level the same (as increasing your tolerance value means that you have less collinearity in the model). If the variable that is removed is not correlated with the others, then you know your tolerance level is based on the relationship between the variables that remained in your model. If the tolerance increases, you know that you have removed a variable that is correlated with at least one other variable in the model.
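
The link between a correlation matrix and tolerance can be made concrete. The following is a minimal sketch on simulated data, not SPSS output: it relies on the standard identity that the variance inflation factor of each predictor is the corresponding diagonal entry of the inverse correlation matrix, so tolerance is its reciprocal. The variables `a`, `b`, and `c` are hypothetical stand-ins, with `b` built to be strongly correlated with `a`.

```python
import numpy as np

def tolerances(X):
    """Tolerance of each column of X given all the other columns.

    Uses VIF_j = [inverse of the correlation matrix]_jj and tolerance = 1 / VIF.
    """
    corr = np.corrcoef(X, rowvar=False)   # predictors are in columns
    vif = np.diag(np.linalg.inv(corr))
    return 1.0 / vif

# Simulated (hypothetical) predictors: b is nearly a copy of a, c is unrelated.
rng = np.random.default_rng(1)
n = 300
a = rng.normal(size=n)
b = a + 0.3 * rng.normal(size=n)
c = rng.normal(size=n)

full = np.column_stack([a, b, c])
reduced = np.column_stack([a, c])         # same model with b removed

tol_full = tolerances(full)               # order: a, b, c
tol_reduced = tolerances(reduced)         # order: a, c

print("with b:   ", np.round(tol_full, 3))
print("without b:", np.round(tol_reduced, 3))
```

In this setup the tolerance of `a` should rise sharply once `b` is dropped, while the tolerance of `c` can only stay the same or rise slightly, which matches the "increase or stay the same" statement above.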

Mari said...

Tolerance = 1 - R^2, where R^2 comes from predicting the variable for which the tolerance level is reported from all of the X variables already in the equation.

So, in step 1, the reported tolerance levels come from predicting each new variable from only those X variables already in the equation in step 1.

In step 2, the tolerance is computed using the four X variables already in the equation in step 2.
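
The definition above can be sketched in a few lines. This is an illustration on simulated data under stated assumptions, not a reproduction of the SPSS computation: the variable names (`intellect`, `school`, `appearance`, `unrelated`) are hypothetical stand-ins for the predictors in the question, and the regression is fit with ordinary least squares via `numpy.linalg.lstsq`.

```python
import numpy as np

def tolerance(candidate, in_equation):
    """Tolerance = 1 - R^2 from regressing `candidate` on the columns of `in_equation`."""
    n = len(candidate)
    X = np.column_stack([np.ones(n), in_equation])     # add an intercept column
    coef, *_ = np.linalg.lstsq(X, candidate, rcond=None)
    resid = candidate - X @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((candidate - candidate.mean()) ** 2)
    return ss_res / ss_tot                             # same as 1 - R^2

# Simulated (hypothetical) data: appearance depends on school, unrelated does not.
rng = np.random.default_rng(0)
n = 200
intellect = rng.normal(size=n)
school = 0.6 * intellect + rng.normal(size=n)
appearance = 0.5 * school + rng.normal(size=n)
unrelated = rng.normal(size=n)

# Step 1: only intellect and school are in the equation.
step1 = np.column_stack([intellect, school])
tol_appearance = tolerance(appearance, step1)
tol_unrelated = tolerance(unrelated, step1)

print("tolerance of appearance at step 1:", round(tol_appearance, 3))
print("tolerance of unrelated at step 1: ", round(tol_unrelated, 3))
```

A candidate that overlaps with the variables already entered gets a lower tolerance than one that does not, and the value depends only on what is in the equation at that step.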

Chris said...

Thanks! This was helpful. I really wanted to confirm that the tolerance reported by SPSS is the tolerance for the reported variables not currently in the equation against the combined variables in each block. The model at first, however, was additive; I don't think we were removing any variables throughout the regression. Therefore I believe the converse would be true: adding variables in successive blocks would either leave the tolerance the same or decrease it for the variables not currently in the equation (as reported in the "Excluded Variables" table).
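
The additive case can be checked numerically. The sketch below uses simulated data and hypothetical names (`x1`, `x2`, `x3`, `excluded`), and assumes tolerance is computed as 1 - R^2 from an ordinary least squares fit of the excluded variable on whatever is in the equation at each block. Because each block's predictors contain the previous block's, R^2 cannot go down, so the tolerance cannot go up.

```python
import numpy as np

def tolerance(candidate, predictors):
    """1 - R^2 from an OLS fit (with intercept) of `candidate` on `predictors`."""
    X = np.column_stack([np.ones(len(candidate)), predictors])
    coef, *_ = np.linalg.lstsq(X, candidate, rcond=None)
    resid = candidate - X @ coef
    return np.sum(resid ** 2) / np.sum((candidate - candidate.mean()) ** 2)

# Simulated (hypothetical) data: the excluded variable overlaps with x1 and x3.
rng = np.random.default_rng(2)
n = 250
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.7 * x1 + rng.normal(size=n)
excluded = 0.5 * x1 + 0.5 * x3 + rng.normal(size=n)

# Additive blocks: each block keeps the earlier predictors and enters one more.
blocks = [[x1], [x1, x2], [x1, x2, x3]]
tols = [tolerance(excluded, np.column_stack(cols)) for cols in blocks]

for step, tol in enumerate(tols, start=1):
    print(f"block {step}: tolerance of excluded variable = {tol:.3f}")
```

Entering an unrelated predictor such as `x2` should leave the tolerance nearly unchanged, while entering a related one such as `x3` should lower it.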