First I should explain what I did, and it might not be right.

I have a variable that represents a test outcome, which is either positive or negative.

I have a set of observations of one important variable (my point of interest) for the last 5 days before the test was undertaken. I have computed its average over the last 3 days and over all 5 days.
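A minimal Python sketch of the two summaries, for concreteness. The readings are made up, and it assumes "the last 3 days" means the three days immediately before the test.

```python
from statistics import mean

# Hypothetical readings, ordered from 5 days before the test to 1 day before.
readings = [4.0, 5.0, 6.0, 7.0, 8.0]

avg_last_3 = mean(readings[-3:])  # days 3, 2 and 1 before the test
avg_all_5 = mean(readings)        # all five days

print(avg_last_3, avg_all_5)
```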

Then I have some other variables that are less important to me. Some of them are binary (yes or no) and some are continuous.

I want to create a logit model for my dependent variable (DV), the test outcome, based on these variables.

Because the histograms don't look normal, I ran a Mann-Whitney U test on each variable, with the test outcome as the grouping variable. The result was most significant for the 5-day average (the second most significant was the value from 1 day before the test). So I chose the 5-day average for a univariate logit model, and it was significant. Then I added all the other variables to the model and ran a stepwise model selection in R based on AIC. Now I have a model that contains 5 variables: the 5-day average is highly significant, two other variables are still significant, and two variables are non-significant at my chosen level (0.05). I have two questions:

- Is the way I selected one version of the "important" variable for the model acceptable?
- How do I interpret the two non-significant variables? They have not been shown to have a significant influence on the DV, but in a model without them, the other variables become insignificant. Can I actually use a model that contains non-significant variables? Or can I say that the model fits the available data well and that the two variables contribute to the model but are not significant at my chosen level?

#### Best Answer

It is not appropriate to pre-test variables in order to select which ones should be modeled. Instead, use subject-matter expertise or possibly data reduction (blinded to $Y$). There is absolutely nothing wrong with having "insignificant" variables in a model, and in fact this is a sign that you are doing things correctly.
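One way to see why pre-testing is a problem is a small simulation, sketched here in Python with the standard library only (the question used R). Everything in it is an illustrative assumption: five candidate variables that are pure noise and independent of each other, 30 subjects per outcome group, and a hand-written Mann-Whitney U test that uses the normal approximation without a tie correction. It compares how often a variable looks "significant" when it is fixed in advance versus when it is picked because it had the smallest p-value.

```python
import math
import random


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value, normal approximation, no tie correction."""
    n1, n2 = len(x), len(y)
    # U counts pairs (xi, yj) with xi > yj; a tie counts as one half.
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate(n_sims=400, n_per_group=30, n_candidates=5, alpha=0.05, seed=1):
    """Return (rate for a pre-specified variable, rate for the best-screened one)."""
    rng = random.Random(seed)
    first_hits = 0  # candidate 0, chosen before looking at the outcome
    best_hits = 0   # candidate chosen because its p-value was smallest
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_candidates):
            # Pure noise: both outcome groups come from the same distribution.
            neg = [rng.gauss(0, 1) for _ in range(n_per_group)]
            pos = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(mann_whitney_p(neg, pos))
        first_hits += p_values[0] < alpha
        best_hits += min(p_values) < alpha
    return first_hits / n_sims, best_hits / n_sims


prespecified_rate, screened_rate = simulate()
print(f"pre-specified variable: {prespecified_rate:.3f}")
print(f"best-of-five screening: {screened_rate:.3f}")
```

With five independent noise variables, the chance that at least one falls below 0.05 is $1 - 0.95^5 \approx 0.23$, so the screened variable should look "significant" far more often than the nominal 5%. In the question the candidates (daily values and their averages) are correlated, so the inflation would be smaller than in this sketch, but the selected variable's p-value is still biased downward, and any model built on it inherits that bias.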