I'm doing machine learning in R. I would like to know how to create a model object that can be passed to the "predict" function along with new data to obtain predicted values.
To elaborate, I'm trying to write a new machine learning algorithm in R. Until now I have only used the predict function, and I don't know how to create "model" objects to pass to it. If we're doing a linear regression, calling lm creates an "lm" object. If we're using the naiveBayes classifier from the e1071 package, the call creates a naiveBayes object, which we then pass to predict. Now, if I'm writing a new algorithm, how do I create an object for that algorithm? And how exactly will predict process it? What class variables/methods should the "model" object have so that it can be processed by the predict function available in R? I know this is a somewhat open-ended question, but I couldn't find any proper documentation. A basic/prototype code example would be highly appreciated. Though I've been using R, I'm not familiar with the classes/objects concept in R. Thank you very much.
Best Answer
You should check out Hadley Wickham's Advanced R Programming guide, especially the part about object oriented programming in R. To provide a couple of useful keywords: you are talking about S3 methods and generic functions.
Basically, predict is a generic function, so when you call predict(obj, data) for, say, an lm model obj, R looks through its list of known names to find the function predict.lm, which provides the code necessary to make predictions from an lm model.
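To see this dispatch in action, here is a minimal sketch using the built-in cars data set. The names fit and newdata are chosen only for illustration; getS3method (from the utils package) is used to fetch the lm method explicitly so it can be compared with what the generic returns.

```r
# Fit a small linear model on the built-in 'cars' data set
fit <- lm(dist ~ speed, data = cars)

# The class attribute is what predict() dispatches on
class(fit)  # "lm"

# New data to predict on (column name must match the predictor)
newdata <- data.frame(speed = c(10, 20))

# Calling the generic: R finds and runs the "lm" method for us
p_generic <- predict(fit, newdata)

# Fetching the method explicitly and calling it directly
predict_lm <- getS3method("predict", "lm")
p_method <- predict_lm(fit, newdata)

# Both routes should give the same predictions
all.equal(p_generic, p_method)
```

Calling the generic is the normal route; fetching the method by hand is only done here to show that nothing else is going on behind the scenes.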
The model object itself is just a list, with an attribute identifying its class (lm, in the example I'm citing). You'll need to return this list from the function that runs your algorithm, via code like
result <- list(obj=model, data=data, parameters=params)
class(result) <- 'whatever'
return(result)
…or whatever is appropriate output from your algorithm. OK, now you call your function like
myobj <- whatever(y~X, data=dat, weights=wt)
And finally if you run
predict(myobj, newdata, wt.new)
R will look for and call your prediction function as
predict.whatever(myobj, newdata, wt.new)
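Putting the pieces together, here is a minimal, self-contained sketch of the whole pattern. The "algorithm" is deliberately trivial and is my own assumption for illustration: it just stores the mean of the response and predicts that value for every new row. The function name whatever follows the example above; the list fields mean and formula, and the omission of the weights argument, are simplifications made here, not requirements of R.

```r
# Hypothetical fitting function: the "model" is just the mean of the response
whatever <- function(formula, data) {
  mf <- model.frame(formula, data)   # evaluate the formula against the data
  y <- model.response(mf)            # extract the response variable
  result <- list(mean = mean(y), formula = formula)
  class(result) <- "whatever"        # this is what predict() dispatches on
  result
}

# S3 method for predict(): the name must be <generic>.<class>
predict.whatever <- function(object, newdata, ...) {
  # Return one prediction per row of newdata
  rep(object$mean, nrow(newdata))
}

# Toy data, made up for this example
dat <- data.frame(y = c(1, 2, 3, 6), X = c(10, 20, 30, 40))

myobj <- whatever(y ~ X, data = dat)
preds <- predict(myobj, newdata = data.frame(X = c(50, 60)))
preds  # 3 3
```

The first two arguments of predict.whatever are named object and ... to stay compatible with the signature of the predict generic, predict(object, ...); any extra arguments your algorithm needs (such as new weights) can be added after object.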
Read Hadley's page for more details – it can get kind of confusing.