# Solved – Small sample linear regression: Where to start

FULL DISCLOSURE: This is homework.

I have been given a small data set (n = 21). The data are messy, and looking at them in a scatterplot matrix gives me little to no insight. There are 8 variables that are metrics created from a longitudinal study (BI, CONS, CL, CR, …, VOBI). The other measurements are of mutual fund sales, returns, asset levels, market share, share of sales, and proportion of sales to assets.

Correlations are everywhere.

```
               BI       CONS           CL          CR         QT        COM        CONV        VOBI          s            r           a          ms         ss       share      share2
BI      1.0000000  0.7620445  0.639830594  0.70384322  0.7741463  0.8451500  0.84704440  0.85003686  0.2106773 -0.238431047  0.36184548  0.40007830  0.4076563  0.31643802 -0.28283564
CONS    0.7620445  1.0000000  0.933595967  0.96979599  0.9892533  0.9069803  0.96781703  0.93416972  0.2316209 -0.074351798  0.31952292  0.40259511  0.4442877  0.24783884 -0.14788906
CL      0.6398306  0.9335960  1.000000000  0.88297431  0.8993748  0.8133169  0.89922684  0.81132166  0.1200420 -0.001107093  0.22132116  0.26729067  0.3033221  0.07650924 -0.25595278
CR      0.7038432  0.9697960  0.882974312  1.00000000  0.9788150  0.8965754  0.92335363  0.90848199  0.2934774 -0.119340914  0.35973640  0.46409570  0.5012178  0.32832247 -0.09005985
QT      0.7741463  0.9892533  0.899374782  0.97881497  1.0000000  0.9216887  0.95458369  0.94848419  0.2826278 -0.108430256  0.35520090  0.43290221  0.4823314  0.31761015 -0.12903075
COM     0.8451500  0.9069803  0.813316918  0.89657544  0.9216887  1.0000000  0.90302002  0.89682825  0.4305866 -0.255581594  0.50724121  0.55718441  0.5773171  0.40378679 -0.12085524
CONV    0.8470444  0.9678170  0.899226843  0.92335363  0.9545837  0.9030200  1.00000000  0.96097892  0.1993837 -0.065237725  0.32010735  0.41843335  0.4531298  0.28873934 -0.19668858
VOBI    0.8500369  0.9341697  0.811321664  0.90848199  0.9484842  0.8968283  0.96097892  1.00000000  0.2424889 -0.087126942  0.30390489  0.40390750  0.4845432  0.36588655 -0.07137107
s       0.2106773  0.2316209  0.120041993  0.29347742  0.2826278  0.4305866  0.19938371  0.24248894  1.0000000 -0.173034217  0.91766914  0.84673519  0.8596887  0.61299987  0.32072790
r      -0.2384310 -0.0743518 -0.001107093 -0.11934091 -0.1084303 -0.2555816 -0.06523773 -0.08712694 -0.1730342  1.000000000 -0.22512978 -0.18337773 -0.1030943 -0.17650579  0.51768144
a       0.3618455  0.3195229  0.221321163  0.35973640  0.3552009  0.5072412  0.32010735  0.30390489  0.9176691 -0.225129778  1.00000000  0.92445370  0.8656139  0.63049461  0.03876774
ms      0.4000783  0.4025951  0.267290668  0.46409570  0.4329022  0.5571844  0.41843335  0.40390750  0.8467352 -0.183377734  0.92445370  1.00000000  0.9572730  0.77582501  0.08435813
ss      0.4076563  0.4442877  0.303322147  0.50121775  0.4823314  0.5773171  0.45312978  0.48454322  0.8596887 -0.103094325  0.86561394  0.95727301  1.0000000  0.83931302  0.24371447
share   0.3164380  0.2478388  0.076509240  0.32832247  0.3176102  0.4037868  0.28873934  0.36588655  0.6129999 -0.176505786  0.63049461  0.77582501  0.8393130  1.00000000  0.20313930
share2 -0.2828356 -0.1478891 -0.255952782 -0.09005985 -0.1290307 -0.1208552 -0.19668858 -0.07137107  0.3207279  0.517681444  0.03876774  0.08435813  0.2437145  0.20313930  1.00000000
```
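To put a number on how strongly the eight study metrics overlap, one option is to compute variance inflation factors straight from their correlations. The following is a minimal sketch, not part of the original analysis: it retypes the 8 x 8 block for BI through VOBI from the matrix above (at the printed precision) and uses the standard identity that the VIFs are the diagonal of the inverse correlation matrix. The names `R`, `vif`, and `cond_ratio` are illustrative.

```r
# Pairwise correlations among the eight study metrics, copied from the matrix above.
vars <- c("BI", "CONS", "CL", "CR", "QT", "COM", "CONV", "VOBI")
R <- diag(length(vars))
dimnames(R) <- list(vars, vars)

# upper.tri() is filled column by column: (1,2), (1,3), (2,3), (1,4), ...
R[upper.tri(R)] <- c(
  0.7620445,
  0.6398306, 0.9335960,
  0.7038432, 0.9697960, 0.882974312,
  0.7741463, 0.9892533, 0.899374782, 0.97881497,
  0.8451500, 0.9069803, 0.813316918, 0.89657544, 0.9216887,
  0.8470444, 0.9678170, 0.899226843, 0.92335363, 0.9545837, 0.9030200,
  0.8500369, 0.9341697, 0.811321664, 0.90848199, 0.9484842, 0.8968283, 0.96097892
)
R[lower.tri(R)] <- t(R)[lower.tri(R)]  # mirror the upper triangle to make R symmetric

# VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal entry of solve(R).
vif <- diag(solve(R))

# Ratio of largest to smallest eigenvalue; large values signal near-linear dependence.
ev <- eigen(R, symmetric = TRUE)$values
cond_ratio <- max(ev) / min(ev)

print(round(vif, 1))
print(cond_ratio)
```

Large VIFs here would suggest that the eight metrics carry far less than eight variables' worth of independent information, which matters a great deal with only 21 observations.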

Now, I've tried running a number of "tests", for example:

``summary(lm(share2 ~ BI + ...))  ``

However, none of them provide any reasonable result (mostly negative adjusted R^2).
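For context on what a negative adjusted R^2 means at this sample size, here is a small sketch. It assumes, hypothetically, that all eight study metrics enter the model as predictors, and it uses simulated noise rather than the real data (which are not reproduced here). The helper `adj_r2` is an illustrative name; it implements the usual formula 1 - (1 - R^2)(n - 1)/(n - p - 1).

```r
# Adjusted R^2 from plain R^2, sample size n, and number of predictors p.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

n <- 21  # sample size from the question
p <- 8   # assumption: all eight study metrics are used as predictors

# Adjusted R^2 crosses zero at R^2 = p / (n - 1).
threshold <- p / (n - 1)
print(threshold)
print(adj_r2(threshold, n, p))

# Cross-check the formula against lm() on simulated, unrelated data.
set.seed(1)
X <- matrix(rnorm(n * p), n, p)
d <- data.frame(y = rnorm(n), X)
s <- summary(lm(y ~ ., data = d))
print(c(r2 = s$r.squared, adj = s$adj.r.squared,
        by_hand = adj_r2(s$r.squared, n, p)))
```

Under that assumption, any fit with R^2 below 8/20 = 0.4 reports a negative adjusted R^2, so the negative values say the predictors explain less than would be expected from that many arbitrary regressors.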

If you had data that showed no apparent relationships (no linear ones, at least), what would your next steps be?

P.S.: I did try a number of model formulas that contained interaction effects and got much better results (R^2 and adjusted R^2 above 80%, and significant F-tests), but not all of the interaction effects were significant.
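One way to judge how much weight to give a high R^2 from an interaction model at n = 21 is to see what the same model shape produces on pure noise. The sketch below is illustrative only: the choice of five predictors with all two-way interactions (15 terms) is an assumption, not the formula actually tried, and `sim_r2` is a hypothetical helper.

```r
set.seed(42)
n <- 21  # sample size from the question
k <- 5   # assumption: five predictors, so 5 main effects + 10 two-way interactions

# Fit y ~ (all main effects and two-way interactions) on data with no real signal.
sim_r2 <- function() {
  d <- as.data.frame(matrix(rnorm(n * k), n, k))  # columns V1..V5
  d$y <- rnorm(n)                                 # response unrelated to predictors
  summary(lm(y ~ .^2, data = d))$r.squared
}

r2_noise <- replicate(500, sim_r2())

# With 15 terms and n = 21, the expected R^2 under pure Gaussian noise is 15 / 20.
print(mean(r2_noise))
print(quantile(r2_noise, c(0.05, 0.5, 0.95)))
```

If the R^2 values seen on the real data are not clearly above what this noise baseline produces for a model of comparable size, the improvement from adding interactions may reflect the number of terms more than any real structure.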
