Solved – How to know which class the random forest model is predicting

I have a random forests model with which I am trying to predict species presence or absence.
This is my code:

library(randomForest)
# read in dataframe containing observations of species presence/absence & predictor variables
mydata <- read.csv('mydata.csv')
# fit random forests model (the arguments are spelled mtry and ntree; misspelled names are silently ignored)
fitmodelA <- randomForest(SPECIESA ~ var1 + var2 + var3 + var4 + var5 + var6 + var7 + var8 + var9 + var10,
                          data=mydata, mtry=3, ntree=500, replace=TRUE, importance=TRUE, keep.forest=TRUE)
# predict to new data
predictmodelA <- predict(fitmodelA, newdata, type="prob")
# save as raster image
writeRaster(predictmodelA, "predictSPECIESA.tif")

Apparently I should get back a matrix that has the probability for both classes, i.e., in two columns. Do I understand correctly that it is also possible to add the "index" argument to predict only one class?

With my code as it is, the output raster has a single layer with probabilities from 0 to 1, but no other attributes. Which class is this predicting? I would like my map to show predictions of presence rather than absence.
Probably a simple solution to this…? Thanks!

I don't understand your confusion at all. When you choose type="prob" the output is, naturally, the probability of each class, so you should be getting two columns. You can simply choose either one with predictmodelA[,1] or predictmodelA[,2] in your two-class example. Since the vanilla R implementation of randomForest needs a factor response for classification, the order of these probability columns just follows the order of the factor levels, and the columns are named after those levels.
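To make the column order concrete, here is a minimal sketch on made-up data. The data frame toy, the level names "absent" and "present", and the reduced set of predictors are assumptions for illustration only; they are not from the question.

```r
library(randomForest)

set.seed(42)

# Hypothetical toy data standing in for mydata (two predictors only)
n <- 200
toy <- data.frame(var1 = rnorm(n), var2 = rnorm(n))
toy$SPECIESA <- factor(ifelse(toy$var1 + rnorm(n, sd = 0.5) > 0,
                              "present", "absent"))

fit <- randomForest(SPECIESA ~ var1 + var2, data = toy, ntree = 100)

# One row per observation, one column per class
probs <- predict(fit, newdata = toy[1:5, ], type = "prob")

levels(toy$SPECIESA)  # the factor levels, in order
colnames(probs)       # the probability columns carry the same names

# Selecting by name avoids guessing whether presence is column 1 or 2
presence <- probs[, "present"]
```

Selecting the column by its level name rather than by position should keep the result correct even if the levels are coded differently (for example 0/1) or reordered.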

By the way, I don't think your last bit of code will get you the raster file you want. You would first need to determine its coordinates and add them to your prediction vector, for example:

raster_species <- data.frame(species_prediction=predictmodelA[,2], x=X_coordinate, y=Y_coordinate)  # [,2] assumes presence is the second factor level

then explicitly tell R which variables are the coordinates (I'm guessing you are using the raster package):

coordinates(raster_species)<-~x+y 

indicate that it is gridded:

gridded(raster_species)=TRUE 

and then simply convert it to a raster:

raster_species <- raster(raster_species) 

you can then plot it and save it

plot(raster_species)
writeRaster(raster_species, filename="raster_species.tif", format="GTiff", overwrite=TRUE)
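If the predictor variables are already available as raster layers, the predict method in the raster package can write the probabilities straight into a raster, and its index argument selects which column of the probability matrix is kept (it defaults to 1, i.e. the first factor level). That would be consistent with a single-layer output like the one described in the question. The sketch below uses two made-up layers and invented level names, so treat it as an illustration under those assumptions rather than the asker's actual setup.

```r
library(raster)
library(randomForest)

set.seed(1)

# Hypothetical predictor layers on a 10 x 10 grid
r1 <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r1) <- rnorm(ncell(r1))
r2 <- r1
values(r2) <- rnorm(ncell(r2))
predictors <- stack(r1, r2)
names(predictors) <- c("var1", "var2")  # must match the model's variable names

# Hypothetical training data taken from the same cells
train <- as.data.frame(getValues(predictors))
train$SPECIESA <- factor(ifelse(train$var1 > 0, "present", "absent"))

fit <- randomForest(SPECIESA ~ var1 + var2, data = train, ntree = 100)

# Position of the presence class among the factor levels
presence_col <- which(levels(train$SPECIESA) == "present")

# index picks that column of the type = "prob" matrix for the output layer
presence_map <- predict(predictors, fit, type = "prob", index = presence_col)
```

The resulting presence_map can then be passed to plot() and writeRaster() in the same way as raster_species above.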
