# Summary

In cluster-randomised experiments, participants are randomly assigned to the conditions not on an individual basis but in entire groups. For instance, all pupils in a class are assigned to the same condition. This article reports on a series of simulations that were run to determine how the clusters (e.g., classes) in such experiments should be assigned to the conditions if a relevant covariate is available at the outset of the study (e.g., a pretest) and how the data it produces should be analysed if researchers want to maximise their statistical power while retaining nominal Type-I error rates.

The R code used for the simulation is accessible here, allowing researchers who need to plan and analyse a cluster-randomised design to tailor the simulation to the specifics of their study and determine which approach is likely to work best.

# Organisation

The article summarising the setup of the simulations and their results can be found in the directory `article`. The simulated dataset used to demonstrate the different analyses is also available in this directory.

The directory `functions` contains the workhorse functions (written in R) for the simulation:

* `generated_clustered_data.R` generates a single dataset reflecting a cluster-randomised experiment. If you want to change assumptions about how the data are generated, you need to change this file.
* `analyse_clustered_data.R` takes the dataset output by the previous function and runs a bunch of different analyses on it. If you want to check the Type-I error rates and the power of analyses I didn't consider, you need to add them to this file.

The directory `scripts` contains the scripts that you can rerun in R if you want to reproduce the simulation. You can also change the simulation parameters here. The scripts should be run in the following order.

1. `simulation_type_I_error.R` runs the simulations to assess the Type-I error rates. The results are saved to the directory `results`.
2. `simulation_power.R` runs the simulations to assess power. The results are saved to the directory `results`.

   Please note that running these scripts takes a while (about 18 hours per script on my machine). If you want to run your own simulations, you will probably be able to reduce the number of parameter combinations, which will speed up the simulation.
3. `combine_results.R` reads in the simulation results stored to `results`, tabulates the relevant information, and stores this tabulation, again to `results`.
4. `typeI_error.Rmd` draws plots showing the estimated Type-I error rates for different analytic strategies in several contexts. These plots are saved to the directory `figures`. They can also be viewed in the file `typeI_error.html`, which also contains a tabular summary.
5. `power.Rmd` does the same, but for the power estimates. Also see `power.html`.

The script `additional_simulations.Rmd` demonstrates two minor points alluded to in the article.
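
To give a rough idea of what a simulation of this kind involves, here is a minimal sketch in base R. It is not the code in `functions` or `scripts`: the helper names `simulate_clustered()` and `analyse_cluster_means()`, their parameters, and the default values are all hypothetical and chosen for illustration only. The sketch also leaves out the covariate and considers just one analysis (a t-test on the cluster means), whereas the actual simulation compares several assignment and analysis strategies.

```r
# Hypothetical helper (not the repository's function): simulate one
# cluster-randomised experiment without a covariate.
simulate_clustered <- function(n_clusters = 10, cluster_size = 20,
                               icc = 0.2, effect = 0) {
  # Assumption: total outcome variance is 1, split according to the ICC.
  sd_between <- sqrt(icc)
  sd_within <- sqrt(1 - icc)

  # Randomly assign whole clusters to the two conditions.
  condition <- sample(rep(c("control", "intervention"),
                          length.out = n_clusters))

  # One random effect per cluster, shared by all of its members.
  cluster_effect <- rnorm(n_clusters, sd = sd_between)
  cluster <- rep(seq_len(n_clusters), each = cluster_size)

  outcome <- cluster_effect[cluster] +
    effect * (condition[cluster] == "intervention") +
    rnorm(n_clusters * cluster_size, sd = sd_within)

  data.frame(cluster = cluster,
             condition = condition[cluster],
             outcome = outcome)
}

# Hypothetical helper: analyse the cluster means with a t-test, which
# respects the fact that clusters, not individuals, were randomised.
analyse_cluster_means <- function(d) {
  cluster_means <- aggregate(outcome ~ cluster + condition,
                             data = d, FUN = mean)
  t.test(outcome ~ condition, data = cluster_means,
         var.equal = TRUE)$p.value
}

# Estimate the Type-I error rate: simulate under a null effect and
# count how often the analysis returns p < 0.05.
set.seed(2024)
p_values <- replicate(
  1000,
  analyse_cluster_means(simulate_clustered(effect = 0))
)
type_I_error <- mean(p_values < 0.05)
type_I_error
```

Setting `effect` to a non-zero value in the same loop yields a power estimate instead of a Type-I error rate. Under these assumptions, the run time grows with the number of replications and parameter combinations, which is why trimming the parameter grid is the easiest way to speed things up.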