- Hypotheses generated based on the exploratory dataset.
- Sensitivity analyses that determine the smallest effect size detectable given the sample size of the hold-out dataset and the data-analytic technique proposed to test the hypotheses.
- Machine learning often, but not always, requires cross-validation. Where cross-validation is required, the same logic applies. For some machine learning approaches (e.g., conditional random forests or autoencoding), we will not require a hold-out set. However, authors will need to justify why they do not have a hold-out set and, as in the cross-validation approach, they will still need to generate a hypothesis from their analyses.
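
As an illustration of the sensitivity analysis requested above, the following minimal sketch estimates the smallest standardized mean difference (Cohen's d) that a hold-out dataset could detect. It rests on assumptions that are not part of these guidelines: a two-sided comparison of two independent groups of equal size, a normal approximation to the t-test, and the conventional defaults of alpha = 0.05 and power = 0.80. The function name `min_detectable_d` is hypothetical, and authors should substitute the calculation that matches their own proposed analysis.

```python
import math
from statistics import NormalDist


def min_detectable_d(n_per_group: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate the smallest Cohen's d detectable with the given sample size.

    Assumes a two-sided test comparing two independent, equally sized groups,
    and uses a normal approximation (slightly optimistic for small samples).
    """
    if n_per_group < 2:
        raise ValueError("n_per_group must be at least 2")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power

    # Standard error of d under equal group sizes is sqrt(2 / n_per_group).
    return (z_alpha + z_power) * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Hypothetical hold-out sizes, chosen only for illustration.
    for n in (50, 100, 400):
        print(f"n per group = {n}: smallest detectable d is about {min_detectable_d(n):.2f}")
```

Because the detectable effect shrinks with the square root of the sample size, quadrupling the hold-out sample halves the smallest detectable effect. For small samples or other designs (correlations, regression coefficients, mixed models), an exact or simulation-based calculation would be more appropriate than this approximation.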