Use some of the methods below to perform classification, prediction, and model performance evaluation.

| Model | Learning Task | Method | Parameters |
|---|---|---|---|
| KNN | Classification | `knn` | `k` |
| Naïve Bayes | Classification | `nb` | `fL`, `usekernel` |
| Decision Trees | Classification | `C5.0` | `model`, `trials`, `winnow` |
| OneR Rule Learner | Classification | `OneR` | None |
| RIPPER Rule Learner | Classification | `JRip` | `NumOpt` |
| Linear Regression | Regression | `lm` | None |
| Regression Trees | Regression | `rpart` | `cp` |
| Model Trees | Regression | `M5` | `pruned`, `smoothed`, `rules` |
| Neural Networks | Dual use | `nnet` | `size`, `decay` |
| Support Vector Machines (Linear Kernel) | Dual use | `svmLinear` | `C` |
| Support Vector Machines (Radial Basis Kernel) | Dual use | `svmRadial` | `C`, `sigma` |
| Random Forests | Dual use | `rf` | `mtry` |

\[\textbf{Table 1}\]
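
As a minimal sketch of how the Method and Parameters columns of Table 1 can be used, the snippet below assumes the `caret` package is installed and that the method names in the table are `caret` method strings. It looks up the tunable parameters for two methods and builds a candidate tuning grid; the specific grid values are arbitrary illustrations, not recommendations.

```r
# Sketch: inspect the tunable parameters behind a Table 1 method name.
# Assumes the caret package is installed.
library(caret)

# modelLookup() returns one row per tunable parameter of a caret method
knn_info <- modelLookup("knn")
print(knn_info[, c("model", "parameter", "forReg", "forClass")])

svm_info <- modelLookup("svmRadial")
print(svm_info$parameter)

# A tuning grid is a data frame whose column names match those parameters.
# The candidate values below are illustrative assumptions only.
svm_grid <- expand.grid(sigma = c(0.01, 0.1), C = c(0.5, 1, 2))
print(svm_grid)
```

A grid built this way can be passed to `train()` through its `tuneGrid` argument.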

1 Model improvement case study

From the course datasets, use the 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv case-study data to perform a multi-class prediction.

Use `ResearchGroup` as the response; it has three classes: “PD”, “Control”, and “SWEDD”.

Delete the ID column, then impute missing values with the mean or the median and justify your choice.
Normalize the covariates.
Implement an automated parameter tuning process and report the optimal accuracy and \(\kappa\).
Set the arguments and rerun the tuning, trying different `method` and `number` settings.
Train a random forest with the tuned parameters, report the results, and output the cross table.
Use a bagging algorithm and report the accuracy and \(\kappa\).
Fit a model with `randomForest` and report the accuracy and \(\kappa\).
Report the accuracy achieved by AdaBoost, making sure to try all three methods.
Finally, give a brief summary of all the model improvement approaches.
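
The first steps above (imputation, normalization, automated tuning, and the cross table) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the built-in `iris` data as a stand-in three-class problem rather than the PPMI file, renames the response to `ResearchGroup` for readability, injects a few missing values artificially, and assumes the `caret` and `randomForest` packages are installed. Median imputation is used here because the median is less sensitive to outliers than the mean; whether that is the right choice for the real data still has to be justified from its distributions.

```r
# Sketch of the preprocessing and tuning steps on stand-in data (iris),
# NOT the PPMI case-study file. Assumes caret and randomForest are installed.
library(caret)

set.seed(1234)

# Stand-in data with a 3-class response, renamed to mirror the assignment
dat <- iris
names(dat)[names(dat) == "Species"] <- "ResearchGroup"
# Artificially inject missing values so the imputation step has work to do
dat$Sepal.Length[c(5, 50, 120)] <- NA

# 1. Median imputation of the numeric covariates
num_cols <- sapply(dat, is.numeric)
dat[num_cols] <- lapply(dat[num_cols], function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
})

# 2. Min-max normalization of each covariate to [0, 1]
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
dat[num_cols] <- lapply(dat[num_cols], normalize)

# 3. Automated tuning: 10-fold cross-validation over candidate mtry values,
#    selecting the model with the best Kappa
ctrl <- trainControl(method = "cv", number = 10)
grid <- expand.grid(mtry = c(1, 2, 3, 4))
rf_fit <- train(ResearchGroup ~ ., data = dat, method = "rf",
                metric = "Kappa", trControl = ctrl, tuneGrid = grid)

# Resampled Accuracy and Kappa for the selected mtry
best <- rf_fit$results[rf_fit$results$mtry == rf_fit$bestTune$mtry, ]
print(best[, c("mtry", "Accuracy", "Kappa")])

# 4. Cross table of predicted vs. observed classes
#    (computed on the training data, so it is optimistic)
pred <- predict(rf_fit, dat)
print(table(Predicted = pred, Observed = dat$ResearchGroup))
```

Rerunning with other resampling settings only requires changing the `method` and `number` arguments of `trainControl()`, for example `method = "repeatedcv"` or `method = "boot"`.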

Try the same procedure on other data from the list of case studies, e.g., the Traumatic Brain Injury Study and its corresponding dataset.

Data Science and Predictive Analytics (UMich HS650)