
Use some of the methods below to perform classification, prediction, and model performance evaluation.

Model                                          Learning Task   Method     Parameters
KNN                                            Classification  knn        k
Naïve Bayes                                    Classification  nb         fL, usekernel
Decision Trees                                 Classification  C5.0       model, trials, winnow
OneR Rule Learner                              Classification  OneR       None
RIPPER Rule Learner                            Classification  JRip       NumOpt
Linear Regression                              Regression      lm         None
Regression Trees                               Regression      rpart      cp
Model Trees                                    Regression      M5         pruned, smoothed, rules
Neural Networks                                Dual use        nnet       size, decay
Support Vector Machines (Linear Kernel)        Dual use        svmLinear  C
Support Vector Machines (Radial Basis Kernel)  Dual use        svmRadial  C, sigma
Random Forests                                 Dual use        rf         mtry

\[\textbf{Table 1}\]

1 Model improvement case study

From the course datasets, use the 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv case-study data to perform a multi-class prediction.

Use ResearchGroup as the response variable; it has three classes: “PD”, “Control”, and “SWEDD”.

  • Delete the ID column, impute missing values with the mean or median, and justify your choice.

  • Normalize the covariates.

  • Implement an automated parameter-tuning process and report the optimal accuracy and \(\kappa\).

  • Set the arguments and rerun the tuning, trying different method and number settings.

  • Train a random forest with the tuned parameters, report the results, and output the cross table.

  • Use a bagging algorithm and report the accuracy and \(\kappa\).

  • Fit a randomForest model and report the accuracy and \(\kappa\).

  • Report the accuracy of AdaBoost, making sure to try all three methods.

  • Finally, give a brief summary of all the model improvement approaches.
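
Several of the steps above ask for accuracy and \(\kappa\) from a cross table. As a reminder of what those two numbers mean, here is a minimal, language-agnostic sketch in plain Python (not the R workflow the exercise itself expects). The function name `accuracy_and_kappa` and the 3×3 counts are hypothetical, made up purely for illustration; they are not results from the PPMI data. It assumes rows are predicted classes and columns are actual classes, listed in the same order.

```python
from typing import Sequence, Tuple


def accuracy_and_kappa(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) for a square cross table.

    Assumes rows = predicted classes, columns = actual classes, same order.
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("cross table must be square")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("cross table is empty")

    # Observed agreement: share of cases on the diagonal (this is the accuracy).
    observed = sum(table[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts for the classes Control, PD, SWEDD (illustration only).
example_table = [
    [20, 5, 1],
    [4, 50, 6],
    [1, 5, 8],
]
acc, kappa = accuracy_and_kappa(example_table)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Because \(\kappa\) subtracts the agreement expected by chance, it is lower than the raw accuracy whenever the classes are unbalanced, which is why the exercise asks for both.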

Try the procedure on other data in the list of Case-Studies, e.g., Traumatic Brain Injury Study and the corresponding dataset.
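
When repeating the procedure on another dataset, the preprocessing steps (imputation, then normalization) are the same each time. The sketch below shows them on a single numeric column in plain Python, under two assumptions that the exercise does not fix: missing values are represented as `None`, and "normalize" means min-max rescaling to [0, 1]. The helper names `impute` and `min_max` and the toy values are hypothetical.

```python
from statistics import mean, median
from typing import List, Optional


def impute(column: List[Optional[float]], how: str = "median") -> List[float]:
    """Replace None entries with the mean or median of the observed values."""
    if how not in ("mean", "median"):
        raise ValueError("how must be 'mean' or 'median'")
    observed = [x for x in column if x is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill = median(observed) if how == "median" else mean(observed)
    return [fill if x is None else x for x in column]


def min_max(column: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Toy column with one missing value and one extreme value (illustration only).
raw = [1.0, None, 3.0, 100.0]
by_median = impute(raw, how="median")  # fills the gap with 3.0
by_mean = impute(raw, how="mean")      # fills the gap with 104/3
print(by_median, by_mean)
print(min_max(by_median))
```

In the toy column the single extreme value pulls the mean fill far above the typical observations while the median fill stays close to them; checking each covariate for that kind of skew is one way to justify the choice the exercise asks about.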
