It's sometimes said (e.g. in this talk) that running univariate analysis before multiple regression can lead to discarding useful features, among other mistakes. So, my questions are as follows.
- Are there any simple example models showing that such an effect may occur?
- If so, does it mean I should not do univariate analysis at all? Or does it just mean that I can't reject a feature solely because of a poor result in a univariate test?
EDIT: by univariate analysis I mean univariate feature selection, i.e. screening each feature against the response one at a time.
In an article available here, Sun and colleagues describe what they call "Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis", which may be of interest to you. By "bivariable" they mean what you refer to as univariate: testing one candidate predictor at a time against the outcome.
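As for a simple example model, here is a minimal simulated sketch of the effect. It is an illustration only, not taken from the article: the coefficients, noise levels, sample size, and the helper `r_squared` are all assumptions chosen for the demo. The response is y = x1 - x2 + noise, where x1 is x2 plus a small independent perturbation. By construction x2 is uncorrelated with y, so a univariate screen would drop it, yet the model needs it to cancel the shared part of x1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Assumed toy setup: x1 and x2 are strongly correlated,
# and y depends only on their difference.
x2 = rng.normal(size=n)
x1 = x2 + 0.3 * rng.normal(size=n)
y = x1 - x2 + 0.1 * rng.normal(size=n)


def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept included)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - resid.var() / y.var()


r2_x1 = r_squared(x1, y)                          # x1 alone: weak
r2_x2 = r_squared(x2, y)                          # x2 alone: essentially nothing
r2_both = r_squared(np.column_stack([x1, x2]), y)  # both together: strong

print(f"R^2, x1 only:   {r2_x1:.3f}")
print(f"R^2, x2 only:   {r2_x2:.3f}")
print(f"R^2, x1 and x2: {r2_both:.3f}")
```

In this setup each feature looks weak or useless on its own, while the joint fit explains most of the variance. That suggests the milder reading of your second question: univariate summaries can still be useful for getting to know the data, but a poor univariate result alone is not sufficient grounds for rejecting a feature.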