Predicting Car Acceleration





devdatprod

A data products development exercise: building a random forest prediction model

On GitHub: kawenks/devdatprod

Predicting Car Acceleration

A Random Forest Prediction Exercise

for the Johns Hopkins Bloomberg School of Public Health

Data Science Specialization -- Developing Data Products

via Coursera.org

May 2015

What does it do?

This Shiny app uses the Auto dataset from the ISLR package. It fits a Random Forest model to predict a car's acceleration from several of its attributes.

Data Exploration - Auto dataset

  • 392 observations (vs. 32 in mtcars)
  • 9 variables (vs. 11 in mtcars)
  • 2 numeric variables (year, origin) that should be ordinal
  • 3 variables have relatively higher correlation to acceleration
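The correlation screening mentioned above can be sketched roughly as follows. The app itself is written in R; this is a standard-library Python sketch of the arithmetic only. The function names `pearson` and `rank_by_correlation` are hypothetical, and the toy values are made up for illustration rather than taken from the Auto dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_by_correlation(columns, target):
    """Return (name, r) pairs sorted by |r| with the target, strongest first."""
    scores = [(name, pearson(values, columns[target]))
              for name, values in columns.items() if name != target]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical toy values (assumption: NOT rows from the Auto dataset).
toy = {
    "horsepower":   [120, 160, 140, 90, 80, 60],
    "weight":       [3500, 3700, 3400, 2400, 2100, 1900],
    "acceleration": [12.0, 11.0, 11.5, 15.0, 16.0, 18.0],
}

ranking = rank_by_correlation(toy, "acceleration")
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

Ranking by absolute value keeps strongly negative relationships, which a ranking on the signed coefficient would push to the bottom.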

The prediction model

Feature Selection and Cross Validation Fine-Tuning

  • horsepower
  • displacement
  • weight
  • cylinders

Fine-tune with 10-fold cross-validation
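As a rough illustration of what 10-fold cross-validation does with the 392 observations, the sketch below partitions row indices into ten folds and holds each one out in turn. It is a standard-library Python sketch and not the app's own R code. The helper names `k_fold_indices` and `train_test_splits` are hypothetical, and the shuffling and fold assignment are assumptions.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Split row indices 0..n_rows-1 into k shuffled, near-equal folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_rows, k=10, seed=0):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = k_fold_indices(n_rows, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test

# 392 is the number of observations in the Auto dataset.
splits = list(train_test_splits(392, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} training rows, {len(test)} held-out rows")
```

During tuning, each candidate setting would be fit on the training rows of every split and scored on the held-out rows. The scores are then averaged across the ten folds.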

Sample Error Rate

Of the three models compared, Random Forest has the lowest RMSE and the highest R^2, so it explains the most variability in the data.

Model                               RMSE    R^2     RMSE sd  R^2 sd
Random Forest                       1.4171  0.7508  0.2087   0.0730
Bayesian Generalized Linear Model   1.5208  0.7210  0.2411   0.0970
Generalized Additive Model          1.4838  0.7121  0.1437   0.0850
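For reference, the metrics in this comparison can be computed along the following lines. This is a standard-library Python sketch, not the app's R code. R^2 is taken here as 1 - SS_res / SS_tot, which is an assumption: some toolkits report the squared correlation between predictions and observations instead, so their values can differ. The per-fold numbers are hypothetical and only show how a mean and an sd column would be derived.

```python
from math import sqrt
from statistics import mean, stdev

def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def summarize(per_fold_scores):
    """Return (mean, sample standard deviation) of a metric across folds."""
    return mean(per_fold_scores), stdev(per_fold_scores)

# Hypothetical per-fold RMSE values (assumption: not results from the app).
fold_rmse = [1.2, 1.5, 1.4, 1.6, 1.3]
rmse_mean, rmse_sd = summarize(fold_rmse)
print(f"RMSE = {rmse_mean:.4f}, RMSE sd = {rmse_sd:.4f}")
```

A lower RMSE means predictions sit closer to the observed accelerations. The sd columns indicate how much each metric varied from fold to fold.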