Start by reviewing Chapter 13 (Model Performance Assessment). Cross-validation is a strategy for validating predictive methods, classification models, and clustering techniques by assessing the reliability and stability of the results of the corresponding statistical analyses (e.g., predictions, classifications, forecasts) on independent datasets. For prediction of trend, association, clustering, and classification, a model is usually trained on one dataset (training data) and subsequently tested on new data (testing or validation data). Statistical internal cross-validation defines a test dataset to evaluate the model's predictive performance and to assess its ability to avoid overfitting. Overfitting occurs when a predictive or classification model describes random error, i.e., fits the noise components of the observations, instead of identifying the actual relationships and salient features in the data.
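The train/test protocol above can be sketched with a minimal k-fold cross-validation loop. This is a hypothetical Python illustration (the data and the trivial mean-only predictor are invented for demonstration), not code from the case-studies: the data are shuffled once, split into k disjoint folds, and each fold in turn serves as the held-out test set while the model is fit on the remaining folds.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Partition the indices 0..n-1 into k roughly equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)      # shuffle once so folds are random
    return [idx[i::k] for i in range(k)]  # fold i takes every k-th index

def cross_validate(ys, k=5):
    """Estimate out-of-sample RMSE of a mean-only predictor via k-fold CV."""
    folds = k_fold_indices(len(ys), k)
    fold_rmse = []
    for test_idx in folds:
        train_idx = [i for i in range(len(ys)) if i not in test_idx]
        # "Training": the simplest possible model just memorizes the mean.
        y_hat = sum(ys[i] for i in train_idx) / len(train_idx)
        # "Testing": evaluate only on the held-out fold.
        mse = sum((ys[i] - y_hat) ** 2 for i in test_idx) / len(test_idx)
        fold_rmse.append(mse ** 0.5)
    return sum(fold_rmse) / len(fold_rmse)  # average error across folds

ys = [2.1, 1.9, 2.0, 2.2, 1.8, 2.05, 1.95, 2.15, 1.85, 2.0]
print(round(cross_validate(ys, k=5), 3))
```

Averaging the per-fold errors, rather than relying on a single train/test split, is what gives cross-validated performance estimates their stability.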
In this chapter, we will use Google Flu Trends, Autism, and Parkinson's disease case-studies to illustrate (1) alternative forecasting types using linear and non-linear predictions, (2) exhaustive and non-exhaustive internal statistical cross-validation, and (3) complementary predictor functions.
In Chapter 6 we discussed the types of classification and prediction methods, including supervised and unsupervised learning. The former are direct and predictive (there are known outcome variables that can be predicted, and the corresponding forecasts can be evaluated); the latter are indirect and descriptive (there are no a priori labels or specific outcomes).
There are alternative metrics for evaluating model performance; see Chapter 13. For example, the assessment of supervised prediction and classification methods depends on the type of the labeled outcome responses: categorical (binary or polytomous) vs. continuous. Confusion matrices reporting accuracy, FP, FN, PPV, NPV, LOR, and other metrics may be used to assess predictions of dichotomous (binary) or polytomous outcomes. \(R^2\), correlations (between predicted and observed outcomes), and RMSE measures may be used to quantify the performance of supervised forecasting methods on continuous outcomes.
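The two families of metrics above can be computed directly from their definitions. The following Python sketch (with invented toy labels and predictions, purely for illustration) derives the confusion-matrix summaries for a binary outcome and RMSE/\(R^2\) for a continuous one.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix summary for a binary classifier (labels 0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "ppv": tp / (tp + fp),  # positive predictive value (precision)
        "npv": tn / (tn + fn),  # negative predictive value
        "fp": fp, "fn": fn,
    }

def regression_metrics(y_true, y_pred):
    """RMSE and R^2 for a continuous outcome."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    y_bar = sum(y_true) / n
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)  # total variability
    return {"rmse": mse ** 0.5, "r2": 1 - mse * n / ss_tot}

# Toy examples (hypothetical data)
print(binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
print(regression_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

Note that accuracy alone can be misleading for imbalanced classes, which is why PPV, NPV, and the raw FP/FN counts are usually reported alongside it.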
Before we go into the cross-validation of predictive analytics, we will present several examples of overfitting that illustrate why a certain amount of skepticism and mistrust may be appropriate when dealing with forecasting models built on large and complex data.
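To make overfitting concrete, the following Python/NumPy sketch (simulated data, not from the case-studies) fits polynomials of degree 1 and 10 to a response that is pure noise. Because the degree-10 model nests the degree-1 model, its least-squares training error can only be lower or equal, even though there is no signal to learn; its out-of-sample error, in contrast, typically worsens.

```python
import numpy as np

# Simulated data with NO real signal: the response is pure noise.
rng = np.random.default_rng(20)
x_train = np.linspace(0, 1, 15)
y_train = rng.normal(size=15)
x_test = np.linspace(0, 1, 200)
y_test = rng.normal(size=200)   # fresh noise from the same process

train_rmse, test_rmse = {}, {}
for degree in (1, 10):
    coef = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    resid_tr = np.polyval(coef, x_train) - y_train
    resid_te = np.polyval(coef, x_test) - y_test
    train_rmse[degree] = float(np.sqrt(np.mean(resid_tr ** 2)))
    test_rmse[degree] = float(np.sqrt(np.mean(resid_te ** 2)))
    print(f"degree {degree:2d}: train RMSE {train_rmse[degree]:.3f}, "
          f"test RMSE {test_rmse[degree]:.3f}")
```

Training error shrinking while the model only chases noise is exactly the failure mode that internal cross-validation is designed to expose.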
By 2017, there had been only 57 US presidential elections and 45 presidents. That is a small dataset, and learning from it may be challenging. For instance:
This cartoon illustrates some of the (unique) noisy presidential characteristics that are thought to be unimportant to presidential elections or presidential performance.