I watched online courses about multivariable linear regression which addresses the problem of determining values from numeric inputs. Like, predict prices for houses based on age, size, number of floors…
I also learned about things like logistic regression which concerns classification problems based on numerical inputs. Like, predict if an email is spam or not spam.
But I'm having trouble to find informations about predicting continuous values (like a price) based on a mix of discrete and continuous parameters.
For example, if I have several models of laptops classified with these categories:
[discrete] color: gray, blue, black... [discrete] backlit_keyboard: yes, no [discrete] material: plastic, aluminium [continuous] weight: (numeric) [discrete] brand: apple, sony, lenovo...
Now, assuming that the price will be a function of all these variables, and that I have a large training set of actual prices for each described object, I want to determine this function so that I'll be able to tell an estimated price for an object which has no previous example in the training set.
Which machine learning methods could/should be used to fit such a purpose?
Best Answer
One standard thing to do is to use one-hot encoding, and then run any regression algorithm you'd like (e.g. a variant of linear regression, or maybe kernel ridge regression). That is, the first dimension can be color_is_gray
(which would be 1 if the color is gray, and 0 if not); the second color_is_blue
, and so on. Then concatenate features for the other attributes.
If you have some notion of distance between the discrete attributes, you can instead perform multidimensional scaling and use the features it obtains.
Similar Posts:
- Solved – How to predict the next number in a series while having additional series of data that might affect it
- Solved – How to apply coefficient term for factors and interactive terms in a linear equation
- Solved – Regression model with multimodal outcome
- Solved – Regression model with multimodal outcome
- Solved – Providing latitude and longitude to a house price model