
Demonstrate cross-validation on the two case-studies, ALS and Case06_QoL_Symptom_ChronicIllness.csv, independently.

Go through the following protocol:

  • Review each case-study.
  • Choose appropriate dichotomous, polytomous, or continuous outcome variables, e.g., use ALSFRS_slope for ALS and CHRONICDISEASESCORE for Case06_QoL_Symptom_ChronicIllness.csv, binarizing the latter at a cutoff of 1.2.
  • Apply proper data preprocessing.
  • Perform regression modeling (OLS, glmnet, Forward or Backward model selection, etc.) for continuous outcomes.
  • Perform classification and prediction using various methods (e.g., LDA, QDA, AdaBoost, SVM, Neural Network, KNN) for discrete outcomes.
  • Apply cross-validation on these regression and classification methods, respectively.
  • Report the standard error for the regression-type approaches.
  • Report appropriate quality metrics that can be used to rank the forecasting approaches based on the predictive power of their results.
  • Compare the results of model-driven and data-driven (e.g., KNN) techniques.
  • Compare sensitivity and specificity.
  • Use unsupervised classification methods, e.g., k-means and spectral clustering.
  • Evaluate and justify the k-means model and measure the level of agreement between the model's clusters and the real class labels.
  • Report the discrepancy (difference in agreement) between k-means and k-means++, and include diagnostics for k-means++.
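
The regression steps of the protocol (fit a model, cross-validate it, report a standard error) might be sketched as follows. This is only a minimal illustration in plain Python on synthetic data, not on the ALS or QoL case-studies. The helper names fit_ols and cv_mse, the 5-fold split, and the data-generating coefficients are all assumptions made for the example. The standard error shown is the fold-to-fold standard error of the cross-validated MSE, which is one possible reading of the reporting step.

```python
import math
import random
import statistics


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def cv_mse(xs, ys, folds=5, seed=0):
    """k-fold CV: returns (mean MSE, standard error of fold MSEs, fold MSEs)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    fold_mse = []
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index forms one fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in test]
        fold_mse.append(statistics.fmean(errors))
    se = statistics.stdev(fold_mse) / math.sqrt(folds)
    return statistics.fmean(fold_mse), se, fold_mse


# Synthetic data (assumption): y = 2 + 0.5 x + Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 + 0.5 * x + rng.gauss(0, 1) for x in xs]

mean_mse, se_mse, fold_mse = cv_mse(xs, ys)
print(f"CV MSE = {mean_mse:.3f} (SE {se_mse:.3f})")
```

The same cv_mse loop could wrap any other regression fit (for example a penalized model), so that competing approaches are ranked on identical folds.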
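
For the classification steps, one way to cross-validate a data-driven method such as KNN and compare sensitivity with specificity is sketched below. The data are synthetic: a hypothetical continuous score is binarized at 1.2 to mimic the cutoff mentioned in the protocol, but nothing here is taken from the actual case-studies. The function names knn_predict, cross_validate_knn, and summarize are assumptions made for the example.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((p - q) ** 2 for p, q in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_validate_knn(X, y, folds=5, k=5, seed=0):
    """Pool the confusion-matrix counts over all held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test:
            pred = knn_predict(train_X, train_y, X[i], k)
            if y[i] == 1:
                counts["tp" if pred == 1 else "fn"] += 1
            else:
                counts["tn" if pred == 0 else "fp"] += 1
    return counts


def summarize(c):
    """Sensitivity, specificity, and accuracy from pooled counts."""
    return {
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),
        "specificity": c["tn"] / (c["tn"] + c["fp"]),
        "accuracy": (c["tp"] + c["tn"]) / sum(c.values()),
    }


# Synthetic data (assumption): two features drive a noisy score, cut at 1.2.
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
scores = [1.2 + a + 0.5 * b + rng.gauss(0, 0.3) for a, b in X]
y = [1 if s > 1.2 else 0 for s in scores]

counts = cross_validate_knn(X, y)
metrics = summarize(counts)
print(counts)
print(metrics)
```

Pooling the counts across folds gives a single confusion matrix per method, which makes it straightforward to rank model-driven and data-driven classifiers on the same metrics.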
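
The unsupervised steps (k-means, k-means++, and agreement with the real labels) might look like the sketch below. It is a plain-Python illustration on three synthetic Gaussian blobs, not on the case-study data. The Rand index is used here as the agreement measure, and the within-cluster sum of squares as a simple diagnostic. Both choices, along with the helper names, are assumptions made for the example rather than the metrics the protocol prescribes.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_random(points, k, rng):
    """Plain k-means seeding: k distinct points chosen uniformly at random."""
    return rng.sample(points, k)


def init_plus_plus(points, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def kmeans(points, k, init, seed=0, iters=100):
    """Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = list(init(points, k, rng))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    wss = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, wss


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same/different)."""
    n = len(a)
    agree = sum(
        (a[i] == a[j]) == (b[i] == b[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return agree / (n * (n - 1) / 2)


# Synthetic data (assumption): three 2-D Gaussian blobs with known labels.
rng = random.Random(7)
points, truth = [], []
for label, (cx, cy) in enumerate([(0.0, 0.0), (6.0, 0.0), (3.0, 5.0)]):
    for _ in range(50):
        points.append((rng.gauss(cx, 0.8), rng.gauss(cy, 0.8)))
        truth.append(label)

labels_rand, wss_rand = kmeans(points, 3, init_random)
labels_pp, wss_pp = kmeans(points, 3, init_plus_plus)
ri_rand = rand_index(truth, labels_rand)
ri_pp = rand_index(truth, labels_pp)
discrepancy = abs(ri_rand - ri_pp)
print(f"k-means:   Rand index {ri_rand:.3f}, WSS {wss_rand:.1f}")
print(f"k-means++: Rand index {ri_pp:.3f}, WSS {wss_pp:.1f}")
print(f"discrepancy in agreement: {discrepancy:.3f}")
```

Because the Rand index compares pairs of points rather than cluster IDs, it does not depend on how the clusters happen to be numbered, so the two seedings can be compared directly against the real labels.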
