<p>Two MATLAB functions are provided for estimating, through simulations, the sample size required for the cluster-based permutation test used on EEG/MEG data. They can be used for a <strong>t-test</strong> between two conditions, a <strong>one-way ANOVA</strong> with three or more conditions, or <strong>2×N interactions</strong>.</p> <p><strong>Cite as</strong>: <a href="https://doi.org/10.1111/psyp.13775" rel="nofollow">Wang, & Zhang. (2021). Word frequency effect in written production: Evidence from ERPs and neural oscillations. Psychophysiology, 58: e13775</a>.</p> <p><strong>Please cite the above paper</strong> if you use these functions, which were written for, and first used in, this paper.</p> <p>The MATLAB functions are stored in the <em>functions</em> folder:</p> <ul> <li><strong>sampleSize_erp.m</strong> is for ERP analysis</li> <li><strong>sampleSize_timefreq.m</strong> is for time-frequency analysis</li> </ul> <p>The following two demo files demonstrate how to use these functions:</p> <ul> <li><strong>demo_erp.m</strong></li> <li><strong>demo_timefreq.m</strong></li> </ul> <p>Take a simple design with two conditions as an example. The functions simulate a dataset with two conditions of ERP (2 dimensions: channel×time) or time-frequency (3 dimensions: channel×frequency×time) data for each sample size, starting from 10 and increasing in steps of 1 until the power reaches 0.80. In each sample of the ERP/time-frequency data, we simulate a cluster of interest within a predefined time window (e.g., 50-250 ms) and frequency band (e.g., 4-8 Hz) in neighboring channels (e.g., C1, CZ, CP1, CPZ). The size of the cluster can be chosen to be similar to that of the cluster displaying the effect of interest in prior or pilot studies. The cluster's peak values in the two conditions are sampled from two normal distributions (for a between-subjects design) or a bivariate normal distribution (for a within-subjects design). The means and standard deviations of the distributions can be chosen to be similar to those in prior or pilot studies. Then a cluster-based permutation test is performed on the simulated dataset to test whether any cluster shows a significant difference between the two conditions at an alpha level of 0.05. For each sample size, the simulation is run 1000 times, and the power is calculated as the proportion of these 1000 simulations in which a significant cluster is found.</p> <p>I also wrote a detailed introduction to this method on the <a href="https://www.fieldtriptoolbox.org/example/samplesize/" rel="nofollow">FieldTrip website</a> (<strong>also see below</strong>, but updated to include more complex designs).</p> <hr> <p>It is recommended, and sometimes even required, to justify the sample size before starting a study and when reporting on it <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/psyp.13437" rel="nofollow">(Clayson et al., 2019)</a>. Many researchers use G*Power to estimate the sample size required for their studies. However, although very useful and popular, this software is not suitable for multivariate data or for non-parametric tests. For EEG and MEG we often use a cluster-based permutation test, which is a non-parametric test that exploits the multivariate structure of the data.</p> <p>Here, we demonstrate two easy-to-use MATLAB functions that use simulations to estimate the sample size for cluster-based permutation tests. These functions can be used for EEG/MEG research involving a <strong>t-test</strong> between two conditions, a <strong>one-way ANOVA</strong> with three or more conditions, or <strong>2×N interactions</strong>. The experiment can have a within-subjects, between-subjects, or mixed design. The functions were written for, and first used in, <a href="https://doi.org/10.1111/psyp.13775" rel="nofollow">Wang and Zhang (2021)</a>.
Please cite this paper where appropriate.</p> <p>From this <a href="https://osf.io/rmqhc/files/" rel="nofollow">OSF project</a>, you can download the functions and the corresponding demo files. The MATLAB functions are stored in the <em>functions</em> folder.</p> <ul> <li><strong>sampleSize_erp.m</strong> is for ERP data</li> <li><strong>sampleSize_timefreq.m</strong> is for time-frequency data</li> </ul> <p>Two demo files demonstrating how to use the functions are in the <em>demo</em> folder:</p> <ul> <li><strong>demo_erp.m</strong></li> <li><strong>demo_timefreq.m</strong></li> </ul> <p>In this example, we first briefly introduce how to estimate sample size through simulations using the intuitive example of coin-tossing. We then demonstrate how to estimate sample size through simulations for t-tests and for cluster-based permutation tests.</p> <h2>Power analysis through simulations – an intuitive example of coin-tossing</h2> <p>Let's start with a brief introduction to estimating sample size, using the intuitive example of coin-tossing. For a more detailed and excellent introduction, which inspired the following section, see <a href="https://julianquandt.com/post/power-analysis-by-data-simulation-in-r-part-i/" rel="nofollow">Dr. Julian Quandt’s website</a>.</p> <p>Suppose we are presented with a coin and want to know whether it is fair (i.e., has a 50% chance of landing on heads). We can test this by running coin-tossing experiments. Consider the following two scenarios: in Experiment 1, we toss the coin 2 times and observe 2 heads; in Experiment 2, we toss it 5 times and observe 5 heads. After either experiment, we are inclined to think the coin is unfair, but intuitively we will be more confident of the coin’s unfairness after Experiment 2 than after Experiment 1. Still, we don’t know how sure we are about our decision to call the coin unfair.
We can quantify this using inferential statistics.</p> <ul> <li>Null hypothesis H0: the coin is fair, i.e., a 50% chance of heads and a 50% chance of tails</li> <li>Alternative hypothesis H1: the coin is totally unfair, e.g., a 100% chance of landing on heads</li> <li>Sample size: 2 or 5 samples (tosses)</li> <li><strong>Power: the probability of rejecting the null hypothesis when H0 is false</strong></li> </ul> <p>The power of a statistical test quantifies how sure we can be in deciding that the coin is unfair (i.e., in rejecting the null hypothesis). By definition, we can calculate power by running the same experiment a large number of times (e.g., 5000 times) and then calculating the proportion of those 5000 experiments in which the null hypothesis is rejected. However, even a simple task such as coin-tossing becomes very time-consuming and tedious when it has to be repeated 5000 times. Luckily, we don’t actually have to toss a coin, as MATLAB can simulate the results for us. We know that the outcome of tossing a coin follows a binomial distribution. Using this distribution, we can simulate the outcome of each experiment (i.e., the number of heads out of a certain number of tosses).</p> <p>First, we need to “guess”, usually based on pilot studies or similar prior studies, the key parameters of the binomial distributions under each of the null and alternative hypotheses. In the current example, the chance of the coin landing on heads under the null hypothesis is, by definition, 50%. Under the alternative hypothesis, we assume that the coin has a 100% chance of landing on heads, based on our observations so far. Only with these parameters specified can data be sampled from the distributions. Note that the difference (50% vs. 100%) between the two hypotheses can be construed as the effect size, which strongly influences power: statistical power tends to be greater with a larger effect size.</p> <p>Second, we need to decide on the criterion for rejecting the null hypothesis.
Conventionally, we reject the null hypothesis when the probability of observing an outcome at least as extreme as the one obtained, assuming the null hypothesis is true, is less than 0.05 or 5%, the so-called alpha level. In each experiment, if the outcome is that unlikely (i.e., probability &lt; 0.05) under the null hypothesis, we reject the null hypothesis.</p> <p>Finally, power is calculated as the proportion of experiments in which the null hypothesis is rejected over a very large number (say 5000) of repetitions. In the current case, for example, the outcome of tossing a 100% unfair coin 5 times can be sampled from the binomial distribution, and the result will always be 5 heads. The probability of 5 heads out of 5 tosses under the null hypothesis is 1/(2^5) = 0.0313 &lt; 0.05. So, in all 5000 experiments, the null hypothesis is rejected, and the power is 5000/5000 = 1.</p> <p>All of the above steps can be carried out with the following block of MATLAB code:</p> <pre class="highlight"><code class="language-matlab">% settings
p_heads_h1 = 1;     % chance to land on heads under H1, i.e., the effect size
n_sample = 5;       % sample size, i.e., number of tosses
alpha_level = 0.05; % alpha level
n_sim = 5000;       % number of simulations/experiments

% run 5000 experiments of tossing a 100% unfair coin 5 times,
% and store the number of heads obtained in each experiment
n_heads = binornd(n_sample, p_heads_h1, [1,n_sim]);

% calculate the probability of getting at least that many heads if the coin were fair
p_h0 = 1-binocdf(n_heads-1, n_sample, 0.5);

% calculate power
power = sum(p_h0 &lt; alpha_level)/n_sim</code></pre> <p>A 100% unfair coin is an extreme and simplified example. You can try other settings in the above MATLAB code and see what you get.
For example, if we estimate that the coin has a 55% chance of landing on heads and the sample size is 100, we estimate the power to be 0.241, which is close to the result from G*Power.</p> <p>The previous code only shows how to calculate power for a given sample size through simulation. The following code shows how to estimate the <strong>required sample size for a desired power</strong> (e.g., 0.9). Basically, the code calculates the power through simulations for each sample size, starting from 21 and increasing in steps of 1, until the power reaches 0.90. The code stops when the power reaches the desired level, and the corresponding sample size is the one we need for this test.</p> <pre class="highlight"><code class="language-matlab">% settings
alpha_level = 0.001; % alpha level
power_level = 0.9;   % desired power level
p_heads_h1 = 0.55;   % chance to land on heads under H1
n_start = 20;        % sample size to start
n_sim = 1000;        % number of experiments

% run simulations for each sample size until power reaches the desired level
power = 0;                     % initialize the variable
power_at_n = zeros(1,n_start); % initialize
n_sample = n_start;            % sample size
while power &lt; power_level % keep increasing the sample size until power reaches the desired level
    n_sample = n_sample + 1; % sample size in the current iteration
    disp(n_sample);
    n_heads = binornd(n_sample, p_heads_h1, [1,n_sim]);
    p_h0 = 1-binocdf(n_heads-1, n_sample, 0.5);
    power = sum(p_h0 &lt; alpha_level)/n_sim; % power for the current sample size
    power_at_n(n_sample) = power;
end

% plot the result
plot(1:n_sample, power_at_n, 'ko'); hold on
plot([0 n_sample],[power_level power_level],'r');
xlim([n_start+1 n_sample]); ylim([0 1])
xlabel('Sample size'); ylabel('Power')
title(['Required sample size is ' num2str(n_sample)])</code></pre> <p><img src="https://www.fieldtriptoolbox.org/assets/img/example/samplesize/Fig1.png" alt="figure 1" width="500" height="500"></p> <p>From the above example, we can summarize the basic steps for estimating sample size through simulation:</p> <ol> <li>Specify a null hypothesis, an alpha level, and a desired power.</li> <li>Specify an alternative hypothesis, which is itself a theoretical distribution from which data can be sampled or simulated. This also determines the effect size.</li> <li>Start from a small sample size and simulate data 1000 times from the distribution with this sample size. Each time, estimate the probability of the simulated data occurring under the null hypothesis; if this probability is less than the alpha level, reject the null hypothesis. Then calculate power as the proportion of the 1000 simulations in which the null hypothesis is rejected.</li> <li>Keep increasing the sample size until the power reaches the desired level.</li> </ol> <h2>Estimating sample size through simulations – t-test</h2> <p>A more realistic problem that we encounter in our daily research is comparing two conditions. Suppose we have two (independent) groups of data; how can we estimate the required sample size through simulations? We can follow the four steps above.</p> <ol> <li>Null hypothesis H0: no difference between the two conditions; for instance, alpha = 0.05 and desired power = 0.8.</li> <li>Alternative hypothesis H1: a significant difference between the two conditions. The key is to define the two normal distributions from which the two groups of data are sampled. So, the means and standard deviations need to be specified for each of the two conditions, based on prior or pilot studies. Usually, the standard deviations are assumed to be equal, whereas the means are different.
The distance between the two means relative to the standard deviation reflects the effect size.</li> <li>Starting from a small sample size, we use the theoretical distributions to simulate two groups of data 1000 times. Each time, we estimate the significance (i.e., the p-value) of the difference between the two conditions using a t-test; if p &lt; 0.05, we reject the null hypothesis. Then, we calculate power as the proportion of simulations in which the null hypothesis is rejected.</li> <li>Keep increasing the sample size until the power reaches the desired level.</li> </ol> <p>The following block of MATLAB code implements these steps.</p> <pre class="highlight"><code class="language-matlab">% settings
alpha_level = 0.05; % alpha level
power_level = 0.80; % desired power level
mu = [3 4];         % means for the two conditions
sd = [2 2];         % standard deviations for the two conditions
n_start = 10;       % sample size to start
n_sim = 1000;       % number of experiments

% run simulations for each sample size until power reaches the desired level
power = 0;                     % initialize the variable
power_at_n = zeros(1,n_start); % initialize
n_sample = n_start;            % sample size
while power &lt; power_level % keep increasing the sample size until power reaches the desired level
    n_sample = n_sample + 1; % sample size in the current iteration
    p_vals = ones(1,n_sim);  % initialize
    for i=1:n_sim
        rd1 = normrnd(mu(1), sd(1), n_sample, 1); % sample data from a normal distribution for group 1
        rd2 = normrnd(mu(2), sd(2), n_sample, 1); % sample data from a normal distribution for group 2
        [~,p_vals(i),~,~] = ttest2(rd1,rd2);      % t-test, store the p-value
    end
    power = sum(p_vals &lt; alpha_level)/n_sim; % power for the current sample size
    power_at_n(n_sample) = power;
end

% plot the result
figure
plot(1:n_sample, power_at_n, 'ko'); hold on
plot([0 n_sample],[power_level power_level],'r');
xlim([n_start+1 n_sample]); ylim([0 1])
xlabel('Sample size'); ylabel('Power')
title(['Required sample size is ' num2str(n_sample)])</code></pre> <p>The result shows that the sample size required for 80% power in an independent-samples t-test (means: 3 vs. 4; SDs = 2) is 65, which is very close to the result obtained from G*Power.</p> <p><img src="https://www.fieldtriptoolbox.org/assets/img/example/samplesize/Fig2.png" alt="figure 2" width="500" height="500"></p> <p><img src="https://www.fieldtriptoolbox.org/assets/img/example/samplesize/Fig3.png" alt="figure 3" width="500" height="500"></p> <h2>Estimating sample size through simulations – cluster-based permutation test</h2> <p>For cluster-based permutation tests on MEG/EEG data, the method of estimating sample size through simulations is the same as for t-tests, except that Step 3 differs somewhat. In Step 3 for cluster-based permutation tests, several groups of ERP (2 dimensions: channel×time) or time-frequency (3 dimensions: channel×frequency×time) data are simulated. In each sample of the ERP/time-frequency data, we simulate a cluster of interest within a predefined time window (e.g., 50-250 ms) and frequency band (e.g., 4-8 Hz) in neighboring channels (e.g., C1, CZ, CP1, CPZ). The time, frequency, and spatial ranges of the cluster can be chosen to be similar to those of the cluster displaying the effect of interest in your pilot studies or prior studies. The cluster's peak values in the two conditions are sampled from two normal distributions (for a between-subjects design) or a bivariate normal distribution (for a within-subjects design). The means and standard deviations of the distributions can be chosen to be similar to those in prior or pilot studies.</p> <p>Then a cluster-based permutation test is performed on the simulated dataset to test whether there is a significant difference between the two conditions.
We run the simulation 1000 times, and the power is calculated as the proportion of simulations in which the null hypothesis is rejected. These simulations are then repeated for an incrementally increasing sample size, starting from 10 and increasing in steps of 1, until the power reaches the desired threshold (e.g., 0.8). This procedure is implemented in the two aforementioned MATLAB functions <code>sampleSize_erp.m</code> and <code>sampleSize_timefreq.m</code>.</p> <p>Next, we demonstrate how to use the <code>sampleSize_timefreq.m</code> function; the usage of <code>sampleSize_erp.m</code> is very similar. See also the two demo files.</p> <p>First, you need a time-frequency dataset generated by the <strong>ft_freqanalysis</strong> function in FieldTrip. You can also use the <em>exampleData_timefreq.mat</em> dataset, which can be downloaded together with the functions from <a href="https://osf.io/rmqhc" rel="nofollow">OSF</a>. This dataset is used only to retrieve the data structure for simulating time-frequency data. The <code>sampleSize_timefreq.m</code> function takes two sets of parameters, which are stored in <code>cfg</code> and <code>stat_cfg</code>, respectively.
The structure <code>cfg</code> stores the configuration for simulating the data, and <code>stat_cfg</code> stores the configuration for the cluster-based permutation test.</p> <pre class="highlight"><code class="language-matlab">%%
clear; close all;
addpath('F:\SampleSize\functions')
load('exampleData_timefreq.mat'); % load time-freq data obtained from the ft_freqanalysis function,
                                  % used to retrieve the FieldTrip data structure

%=========== set configuration for data simulation ================
% parameters for the power analysis
cfg = [];
cfg.alpha_level = 0.05; % desired alpha level
cfg.power_level = 0.8;  % desired power
cfg.num_sims = 500;     % number of randomizations, should be &gt;= 500
cfg.n_start = 10;       % sample size to start with, should be &lt;= 10

% parameters for the normal distributions from which the simulated data are sampled
cfg.ExpDesign = 'within-subjects'; % 'within-subjects' or 'between-subjects'
cfg.mu = [5 3]; % mean of each condition; can have two or more conditions. The function automatically
                % selects a t-test for two conditions and an F-test for three or more conditions
                % for the cluster-based permutation test
cfg.sd = [2 2]; % standard deviation of each condition
cfg.cor = 0.7;  % minimum correlation between paired samples, ONLY needed for a within-subjects design
% mu, sd, and cor should correspond to the values you expect for your own data;
% they can be set to be similar to those in pilot or prior studies.
% mu is particularly important, as its entries determine the size of the differences between conditions

% parameters for the time-freq data, should match your own time-freq data
cfg.time = exampleData.time; % exampleData is the variable loaded from 'exampleData_timefreq.mat'
cfg.freq = exampleData.freq;
cfg.label = exampleData.label;
cfg.dimord = exampleData.dimord;

% parameters for the simulated cluster of interest
cfg.clusterfreq = [4 8];       % freq range to include in the cluster displaying the effect of interest
cfg.clustertime = [0.03 0.25]; % time range to include in the cluster displaying the effect of interest
cfg.clusterchan = {'CZ','C1','CPZ','CP1'}; % channels to include in the cluster displaying the effect of interest
% these three parameters should be set to be similar to those of the cluster in pilot or prior studies
cfg.bufferchan = {'FC3','FC1','FCZ','FC2','C2','CP2','P2','PZ','P1','P3','CP3','C3'};
% channels surrounding the cluster channels, used as a buffer zone from the peak values in the cluster channels down to 0
cfg.layout = 'NeuroScan_quickcap64_layout.lay'; % your layout file

%=========== set configuration for cluster permutation test ================
neighbour_cfg = [];
neighbour_cfg.method = 'triangulation';
neighbour_cfg.layout = cfg.layout;
neighbour_cfg.feedback = 'no';
neighbours = ft_prepare_neighbours(neighbour_cfg, exampleData);

stat_cfg = [];
stat_cfg.neighbours = neighbours;
stat_cfg.minnbchan = 3;
stat_cfg.channel = {'all','-HEOG','-VEOG'};
stat_cfg.avgoverchan = 'no';
stat_cfg.latency = [0 0.5]; % in seconds
stat_cfg.avgovertime = 'no';
stat_cfg.frequency = 'all';
stat_cfg.avgoverfreq = 'no';
stat_cfg.method = 'montecarlo';
stat_cfg.correctm = 'cluster';
stat_cfg.clusterstatistic = 'maxsum'; % 'maxsum' or 'maxsize'
stat_cfg.clustertail = 0; % 0 for t-test (two-tailed); 1 for F-test (right-tailed)
stat_cfg.clusteralpha = 0.05;
stat_cfg.tail = 0;        % 0 for t-test (two-tailed); 1 for F-test (right-tailed)
stat_cfg.alpha = 0.05;
stat_cfg.numrandomization = 500; % number of randomizations, should be &gt;= 500

%%% run the function
MyPar = parpool; % start a parallel pool
res = sampleSize_timefreq(cfg,stat_cfg);
save('results_timefreq_pairedSamples.mat','res')
delete(MyPar)</code></pre> <p>Running this function can be quite time-consuming, as many simulations need to be run; starting a <code>parpool</code> to use parallel computing saves time. The results are stored in <code>res</code>. The following block of code plots the results.</p> <pre class="highlight"><code class="language-matlab">%% plot the results of the power analysis
cfg = res.cfg;
figure
scatter(cfg.n_start:res.sample_size, res.power_at_n(cfg.n_start:end),400,[0.5 0.5 0.5],'Marker','.');
hold on
plot([cfg.n_start res.sample_size],[cfg.power_level cfg.power_level],'r')
text(cfg.n_start+0.5, 0.8,'0.8')
xlabel('Sample size'); ylabel('Power'), xlim([cfg.n_start res.sample_size])</code></pre> <p><img src="https://www.fieldtriptoolbox.org/assets/img/example/samplesize/Fig4.png" alt="figure 4" width="500" height="500"></p> <p>We can also have a look at the simulated data for the required sample size.
The datasets for the two conditions are stored in <code>res.condA</code> and <code>res.condB</code>.</p> <pre class="highlight"><code class="language-matlab">% compute the grand average
cfg1 = [];
cfg1.channel = 'all';
cfg1.toilim = 'all';
cfg1.foilim = 'all';
for ci=1:length(cfg.mu)
    grand = ft_freqgrandaverage(cfg1, res.allsub{ci}{:}); % grand average
    clusterchan_idx = match_str(cfg.label,cfg.clusterchan);
    pow(ci,:,:) = squeeze(mean(grand.powspctrm(clusterchan_idx,:,:),1)); % average across cluster channels
end

figure
lim = round(max(abs(pow(:))),1);
nrow = ceil(sqrt(length(cfg.mu)));
for ci=1:length(cfg.mu)
    subplot(nrow,nrow,ci)
    contourf(cfg.time,cfg.freq,squeeze(pow(ci,:,:)),40,'linecolor','none')
    set(gca,'clim',[-lim lim]);
    colormap(jet);
    xlabel('Time (ms)'); ylabel('Frequency (Hz)')
    colorbar('YTick',[-lim 0 lim]);
end</code></pre> <p><img src="https://www.fieldtriptoolbox.org/assets/img/example/samplesize/Fig5.png" alt="figure 5" width="500" height="500"></p> <p><strong>The two functions can also be used to estimate the sample size for a 2-by-N (N&gt;=2) interaction effect</strong>; however, the first factor must be a within-subjects factor, while the second can be a within- or between-subjects factor.</p> <p>Take a 2×3 design as an example: denote the first and second factors A and B, so the six cells in this design are A1B1, A2B1, A1B2, A2B2, A1B3 and A2B3. For each subject, we can compute the difference between A1 and A2 at each level of B, denoted diff11_21, diff12_22, and diff13_23. Testing the interaction between A and B can then be treated as comparing diff11_21, diff12_22, and diff13_23, which can be done with a one-way ANOVA (using ft_statfun_depsamplesFmultivariate if B is a within-subjects factor, or ft_statfun_indepsamplesF if B is a between-subjects factor). If B has only two levels, simply use a t-test to compare the two differences (i.e., diff11_21 and diff12_22). Using this approach, we can test an interaction effect with a cluster-based permutation test. See <a href="https://www.fieldtriptoolbox.org/faq/how_can_i_test_an_interaction_effect_using_cluster-based_permutation_tests/" rel="nofollow">this page</a> for more details.</p> <p>Thus, we can treat a two-way ANOVA as a comparison of differences using t-tests or one-way ANOVAs. As for the configuration of the functions, simply enter the means, standard deviations, and correlations of these differences that you expect for your data into the <code>cfg.mu</code>, <code>cfg.sd</code>, and <code>cfg.cor</code> fields in the demo m-files.</p>
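<p>The difference-based logic above can be sketched in a few lines of plain MATLAB. This is a minimal illustration with simulated, made-up numbers, not using the FieldTrip data structures or the provided functions; only the variable names diff11_21 and diff12_22 follow the notation above:</p> <pre class="highlight"><code class="language-matlab">% minimal sketch: test a 2x2 within-subjects interaction as a paired t-test
% on the A1-A2 differences (simulated, illustrative data)
n_sub = 30;                 % number of subjects (illustrative)
A1B1 = 5 + randn(n_sub,1);  % per-subject values in each of the four cells
A2B1 = 3 + randn(n_sub,1);
A1B2 = 4 + randn(n_sub,1);
A2B2 = 4 + randn(n_sub,1);

diff11_21 = A1B1 - A2B1;    % A1-A2 difference at level B1
diff12_22 = A1B2 - A2B2;    % A1-A2 difference at level B2

% the A-by-B interaction is tested by comparing the two differences
[~, p_interaction] = ttest(diff11_21, diff12_22) % paired-samples t-test</code></pre> <p>In the provided functions this reduction happens implicitly: <code>cfg.mu</code>, <code>cfg.sd</code>, and <code>cfg.cor</code> describe these condition differences rather than the raw conditions.</p>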