How to use eigenvectors to identify which variables are involved in collinearity?

The question involves a regression of $Y$ on $11$ predictor variables $X_1$ through $X_{11}$.

The problem asks me to identify the variables involved in the collinearity using the eigenvectors that correspond to small eigenvalues.

Using R, I have calculated the $11$ eigenvalues and eigenvectors, but I am not sure how to use an eigenvector to see which predictor variables are involved in the collinearity.

     [1] 7.702574847 1.403077880 0.773435643 0.577055424 0.211498935 0.141941470 0.095142049
     [8] 0.050092536 0.033266309 0.008417705 0.003497202

    $vectors
                [,1]         [,2]        [,3]         [,4]         [,5]        [,6]        [,7]
     [1,] -0.3529639 -0.112431387  0.03114403 -0.006932422  0.026272973 -0.09512815  0.26787382
     [2,] -0.3299718 -0.260762001  0.07836539 -0.194970349 -0.142783457 -0.23889898  0.34910433
     [3,] -0.3510109 -0.139829772  0.04294522 -0.004153543 -0.084990459 -0.18488343  0.35518667
     [4,]  0.1610427 -0.552726480  0.11863260  0.785849610  0.096920435  0.09122188  0.09287761
     [5,]  0.2663779 -0.346997347 -0.43309789 -0.352178691  0.516283052  0.07200995  0.06450059
     [6,] -0.2047881 -0.548146807  0.41844801 -0.380746710 -0.007176897  0.38287792 -0.37681067
     [7,]  0.3040550 -0.352222407 -0.22122179 -0.134117215 -0.050372348 -0.57691563 -0.02079064
     [8,] -0.3232988 -0.078466513 -0.36961713  0.180329365 -0.200485930 -0.20407455 -0.67496023
     [9,] -0.3026624  0.006019985 -0.54645511  0.094905101  0.106514020  0.51959464  0.19659254
    [10,] -0.3446125 -0.100475266 -0.26679114  0.040652506 -0.028959499 -0.14008874 -0.06284718
    [11,] -0.3117090  0.181885175  0.24279993  0.119155548  0.800493659 -0.27479473 -0.16382124
                 [,8]        [,9]        [,10]        [,11]
     [1,] -0.25888638  0.49677393 -0.290946296  0.617904045
     [2,]  0.05057424 -0.65243209  0.290811120  0.258528596
     [3,] -0.06800437  0.03290868 -0.466442937 -0.681570251
     [4,] -0.06188507 -0.06292276  0.051311641  0.012735988
     [5,] -0.43886854 -0.13804308 -0.086127357 -0.045372936
     [6,]  0.16574908  0.13359309 -0.004651702 -0.059626414
     [7,]  0.55944398  0.24949398 -0.055978181  0.049028663
     [8,] -0.15486222 -0.25287357 -0.294111256  0.091346835
     [9,]  0.52415223 -0.01482782 -0.055178229  0.052597726
    [10,] -0.20261712  0.39402290  0.714256660 -0.259679096
    [11,]  0.22167146 -0.06274209  0.017189710 -0.009773591

Above are the $11$ eigenvalues and then their corresponding eigenvectors.

Any help would be greatly appreciated.

I don't believe there is a magical statistical test that says these four variables are villains and those seven are upstanding citizens. The territory is all shades of grey, not black and white. However, we can find some dark grey pretty easily.

Your first eigenvalue is 7.7, with a condition index (CI) of 1, as the largest eigenvalue always has. Looking at the corresponding eigenvector, column [,1] (I think), you can see that the components are all of more or less the same magnitude (though with different signs). Now examine the eigenvector for the smallest eigenvalue, 0.003497202, in column [,11]: its components range in absolute value from about 0.01 to 0.68. An eigenvalue near zero means that the linear combination of the predictors with those weights has almost no variance, so the heavily weighted variables nearly cancel one another. The larger components therefore point to the variables that are still clinging together within the orthonormal basis of eigenvectors. It looks like variables 1 and 3 and, to a lesser extent, 2 and 10. Moving up to the eigenvector associated with eigenvalue 0.008417705, column [,10], we again see 10 and 3 looking ugly, with 1, 2 and now 8 homely. Proceed in this fashion until depression sets in.

You really want to go back to the actual variable definitions and verify that the relationships revealed make intuitive sense. Presumably you would then address the problem by dropping redundant variables, clustering them, or finding a good instrument to replace the entangled collinear variables.
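The inspection described above can be sketched in R. This is only an illustration: the numbers are copied from the output in the question, the condition index is assumed to be defined as $\sqrt{\lambda_{\max}/\lambda_j}$, `big_loadings` is a hypothetical helper, and the 0.25 cutoff is an arbitrary choice rather than any standard threshold.

```r
# Eigenvalues copied from the output in the question.
lambda <- c(7.702574847, 1.403077880, 0.773435643, 0.577055424,
            0.211498935, 0.141941470, 0.095142049, 0.050092536,
            0.033266309, 0.008417705, 0.003497202)

# Condition index of each eigenvalue.
# Assumed definition: sqrt(largest eigenvalue / this eigenvalue).
cond_index <- sqrt(max(lambda) / lambda)

# Eigenvectors for the two smallest eigenvalues,
# i.e. columns [,10] and [,11] of $vectors in the question.
v10 <- c(-0.290946296,  0.290811120, -0.466442937,  0.051311641,
         -0.086127357, -0.004651702, -0.055978181, -0.294111256,
         -0.055178229,  0.714256660,  0.017189710)
v11 <- c( 0.617904045,  0.258528596, -0.681570251,  0.012735988,
         -0.045372936, -0.059626414,  0.049028663,  0.091346835,
          0.052597726, -0.259679096, -0.009773591)

# Hypothetical helper: positions of the predictors whose loading
# exceeds the cutoff in absolute value. The default cutoff of 0.25
# is illustrative only.
big_loadings <- function(v, cutoff = 0.25) which(abs(v) > cutoff)

involved_11 <- big_loadings(v11)
involved_10 <- big_loadings(v10)

print(round(cond_index, 1))
print(involved_11)
print(involved_10)
```

With these numbers, the condition indices of the two smallest eigenvalues come out at roughly 30 and 47, and the cutoff flags variables 1, 2, 3 and 10 in column [,11] and variables 1, 2, 3, 8 and 10 in column [,10], which matches the reading by eye. A different cutoff would move the boundary between "dark grey" and "light grey", so treat the flagged set as a starting point for looking at the variable definitions.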
