# Solved – randomForest same formula different results (with same seed!)

I am calling the `randomForest` function with the same seed, but it gives different results (using the Boston dataset).

```
set.seed=500
regressor = randomForest(x = training_set,
                         y = training_set$medv,
                         ntree = 100)

Call:
 randomForest(x = training_set, y = training_set$medv, ntree = 100)
               Type of random forest: regression
                     Number of trees: 100
No. of variables tried at each split: 4

          Mean of squared residuals: 0.03206772
                    % Var explained: 96.78
```

Or:

```
set.seed=500
regressor = randomForest(medv ~ ., data = training_set, ntree = 100)

Call:
 randomForest(formula = medv ~ ., data = training_set, ntree = 100)
               Type of random forest: regression
                     Number of trees: 100
No. of variables tried at each split: 4

          Mean of squared residuals: 0.1248719
                    % Var explained: 87.48
```

The two calls give different results.
Any help?

Thanks


`set.seed=500` initializes a variable called `set.seed` and sets it to 500. It does not set the random number generator seed.

Use `set.seed(500)` instead.

You can look at the help page by `?set.seed`.
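To see the difference concretely, here is a minimal base R sketch. The names `a`, `b`, `c1`, and `c2` are only for illustration, and the sketch assumes it is run at the top level of a fresh session.

```r
# `=` is assignment here: this creates a numeric variable named set.seed
set.seed = 500
is.numeric(set.seed)   # the name now holds the number 500

a <- runif(3)
set.seed = 500         # "re-seeding" this way does nothing to the RNG
b <- runif(3)
same_without_seed <- identical(a, b)   # the two draws are not reproducible

rm(set.seed)           # remove the stray variable again

set.seed(500)          # function call: this actually seeds the RNG
c1 <- runif(3)
set.seed(500)
c2 <- runif(3)
same_with_seed <- identical(c1, c2)    # same seed, same draws
```

Only the second pair of draws is reproducible, because only `set.seed(500)` touches the random number generator.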

In addition, note that your first model (`x = training_set`) uses all columns of the training data set as predictors, including the dependent variable `medv`. In contrast, the second one (`medv ~ .`) tells R to exclude the dependent variable from the predictors. These two calls will give different results even with a correctly set seed, since the predictor sets differ.
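A quick way to check which columns end up as predictors is sketched below. It assumes `training_set` has the same columns as `MASS::Boston`, which may not hold for your actual data, and the helper names (`predictors`, `mtry_all`, and so on) are hypothetical.

```r
library(MASS)            # ships with R; provides the Boston data set
training_set <- Boston   # assumption: same columns as the asker's data

# medv is one of the columns, so x = training_set makes it a predictor
has_target <- "medv" %in% names(training_set)
target_col <- which(names(training_set) == "medv")

# Drop the response by name rather than by position
predictors <- training_set[, names(training_set) != "medv"]

n_all  <- ncol(training_set)   # number of columns including medv
n_pred <- ncol(predictors)     # number of columns without medv

# randomForest's default mtry for regression is max(floor(p / 3), 1)
mtry_all  <- max(floor(n_all / 3), 1)
mtry_pred <- max(floor(n_pred / 3), 1)
```

With 14 versus 13 columns, both defaults come out as 4, which is why the line "No. of variables tried at each split: 4" looks identical in both printouts even though the predictor sets differ.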

Below, I give a reproducible example. The last model is an adaptation of your first model, and it indeed gives the same results as your second one.

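```r
library(randomForest)
library(MASS)  # provides the Boston data set
training_set <- Boston

# Model 1: x contains every column, so medv is also used as a predictor
set.seed(500)
regressor = randomForest(x = training_set,
                         y = training_set$medv,
                         ntree = 100)
regressor

# Model 2: formula interface, medv is excluded from the predictors
set.seed(500)
regressor = randomForest(medv ~ ., data = training_set, ntree = 100)
regressor

# Model 3: like model 1, but with medv (column 14) dropped from x
set.seed(500)
regressor = randomForest(x = training_set[, -14],
                         y = training_set$medv,
                         ntree = 100)
regressor
```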
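Instead of comparing the printed summaries by eye, the agreement between the formula call and the corrected `x`/`y` call can be checked programmatically. This is a sketch under the assumption that both interfaces pass the same 13 predictors in the same order; the names `fit_formula`, `fit_xy`, and `models_match` are hypothetical.

```r
library(randomForest)
library(MASS)
training_set <- Boston

set.seed(500)
fit_formula <- randomForest(medv ~ ., data = training_set, ntree = 100)

set.seed(500)
fit_xy <- randomForest(x = training_set[, names(training_set) != "medv"],
                       y = training_set$medv,
                       ntree = 100)

# $mse holds the out-of-bag mean squared error after each tree
models_match <- isTRUE(all.equal(fit_formula$mse, fit_xy$mse))
models_match
```

If the two fits really are equivalent, `models_match` should be `TRUE`. A `FALSE` here would suggest the predictor sets or the seeds still differ.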

Finally, note that you will typically get better help if you include a minimal reproducible example like the one I gave here.
