Use the ALS (Case Study 15) data to:
Detect and impute missing values, if any.
Use the ALSFRS_slope as a clinically relevant outcome variable.
Randomly split data into training (70%) and testing (30%) datasets.
Fit a LASSO model, using cross-validation to optimize the regularization parameter, and visualize the result.
Similarly, train a ridge regression model.
Train an OLS model and improve it with stepwise variable selection.
Report the coefficient estimates for the OLS, stepwise OLS (with AIC), ridge, and LASSO models.
Calculate the predicted values for all 4 models and report the models' performance metrics (RMSE and \(R^2\)).
Apply knockoff filtering for variable selection, controlling the false discovery rate.
Compare the variables selected by Stepwise OLS, LASSO and knockoff.
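The data-preparation and evaluation steps above (imputation, the 70/30 split, and the RMSE and \(R^2\) metrics) can be sketched in a few lines. This is only a minimal illustration, not the course's reference solution: it uses plain Python with no modeling library, the helper names (`impute_mean`, `train_test_split`, `rmse`, `r_squared`) and the seed value are hypothetical, and mean imputation is just one simple assumed strategy among several reasonable ones.

```python
import math
import random


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values.

    Assumption: mean imputation is used purely for illustration.
    """
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def train_test_split(rows, train_frac=0.7, seed=1234):
    """Randomly split rows into training (70%) and testing (30%) sets.

    The seed is an arbitrary, hypothetical value chosen for reproducibility.
    """
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_frac * len(rows)))
    train = [rows[i] for i in idx[:cut]]
    test = [rows[i] for i in idx[cut:]]
    return train, test


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted outcomes."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Both metrics should be computed on the testing set, using predictions from a model fit only on the training set, so that the four models are compared on data they have not seen.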