Prediction Model - Regression
This post is a preview for week 2 of an educational data mining class. Regression analysis itself covers much more and takes more effort to learn.
Usage of Prediction Models
- predict a single variable in the dataset using other variables.
- predict the future.
- make inferences about the present.
Components of regression
- the dataset used to build a regression model: each value of the predicted variable in the dataset is called a training label, and the set of variables associated with a training label is called a set of features. Once built, the model still exists without the training labels.
- regressors (predictors) + predicted variable
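To make the vocabulary concrete, here is a minimal sketch of fitting a one-regressor model by ordinary least squares. The data (hours studied and quiz scores) and the function names are made up for illustration; they are not from the class.

```python
from statistics import mean

def fit_simple_regression(features, labels):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(features), mean(labels)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(features, labels))
    sxx = sum((x - x_bar) ** 2 for x in features)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical training data: one feature (regressor) per row,
# plus the training label (the known value of the predicted variable).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

intercept, slope = fit_simple_regression(hours, scores)

def predict(x):
    """Apply the fitted model to a new feature value; no label is needed."""
    return intercept + slope * x

print(intercept, slope, predict(6))
```

After fitting, only `intercept` and `slope` are needed to predict, which is one way to read the point that the model still exists without the training labels.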
Procedures
- transform
- unitization
- using other transform functions, e.g., sqrt(x)
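The two transforms can be sketched as below. This assumes that unitization means rescaling a variable to mean 0 and standard deviation 1 (a z-score); the function names and sample values are illustrative only.

```python
import math
from statistics import mean, pstdev

def unitize(values):
    """Rescale values to mean 0 and standard deviation 1 (assumed meaning of unitization)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

def sqrt_transform(values):
    """Apply sqrt(x) to every value; values must be non-negative."""
    return [math.sqrt(v) for v in values]

raw = [1, 4, 9, 16, 25]  # hypothetical feature values
z = unitize(raw)
roots = sqrt_transform(raw)

print(z)
print(roots)
```

Unitizing puts regressors on a common scale, which makes their coefficients easier to compare.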
Caveat (warning)
A regression model: y = 4 + 2x - 0.1x^2
Here the negative coefficient does not mean that x^2 is negatively correlated with y. We need to look at the bigger picture: it means that once x is included in the model, the relationship between x^2 and y becomes negative.
When the regressors in the model are not independent, we need to be more careful when interpreting their coefficients.
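A quick numeric check of this caveat, as a sketch: generate y from the model above for x between 0 and 10 (a range chosen here for illustration) and compute the plain Pearson correlation between x^2 and y. The `pearson` helper is written out by hand to keep the example self-contained.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b))
    var_a = sum((p - a_bar) ** 2 for p in a)
    var_b = sum((q - b_bar) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

xs = [i * 0.5 for i in range(21)]            # 0.0, 0.5, ..., 10.0 (assumed range)
x_squared = [x ** 2 for x in xs]
ys = [4 + 2 * x - 0.1 * x ** 2 for x in xs]  # the model from the caveat

r = pearson(x_squared, ys)
print(r > 0)  # x^2 and y rise together on this range
```

On this range y keeps increasing with x, so x^2 and y are positively correlated even though the coefficient on x^2 is negative.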
Regression trees
- non-linear (RepTree): if x > 4, y = 5, else y = 2
- linear (M5 prime, M5’): if x > 4, y = 2A + B, else y = 3A + 3B
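The two toy rules above can be written out directly as functions. This is only a hand-coded sketch of what the fitted trees predict, not the RepTree or M5' learning algorithms themselves; the function names are made up.

```python
def rep_tree_predict(x):
    """RepTree-style leaf: each branch predicts a constant."""
    if x > 4:
        return 5
    return 2

def m5_prime_predict(x, a, b):
    """M5'-style leaf: each branch predicts with its own linear formula in A and B."""
    if x > 4:
        return 2 * a + b
    return 3 * a + 3 * b

print(rep_tree_predict(5), rep_tree_predict(3))
print(m5_prime_predict(5, a=1, b=2), m5_prime_predict(3, a=1, b=2))
```

In both cases the split on x picks the branch. The difference is whether the branch returns a constant or a linear model.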
Written on January 28, 2017