**Welcome to the 'same data, different analysts' project!**

The results of this study are available in a preprint: https://doi.org/10.32942/X2GG62

**Here is the abstract from the draft of our Stage 2 Registered Report** (this project is a registered report, meaning it was provisionally accepted for publication based on the quality of the introduction and methods prior to data collection; it is currently under Stage 2 review, which assesses conformity to our originally accepted plan):

Although variation in effect sizes and predicted values among studies of similar phenomena is inevitable, such variation far exceeds what might be produced by sampling error alone. One possible explanation for variation among results is differences among researchers in the decisions they make regarding statistical analyses. A growing array of studies has explored this analytical variability in different (mostly social science) fields and has found substantial variability among results, despite analysts having the same data and research question. We implemented an analogous study in ecology and evolutionary biology, fields in which there has been no empirical exploration of the variation in effect sizes or model predictions generated by the analytical decisions of different researchers. We used two unpublished datasets, one from evolutionary ecology (blue tit, Cyanistes caeruleus, relating sibling number to nestling growth) and one from conservation ecology (Eucalyptus, relating grass cover to tree seedling recruitment), and the project leaders recruited 174 analyst teams, comprising 246 analysts, to investigate the answers to prespecified research questions. Analyses conducted by these teams yielded 141 usable effects for the blue tit dataset and 85 usable effects for the Eucalyptus dataset. We found substantial heterogeneity among results for both datasets, although the patterns of variation differed between them. For the blue tit analyses, the average effect was convincingly negative, with less growth for nestlings living with more siblings, but there was nearly continuous variation in effect size from large negative effects to effects near zero, and even some effects crossing the traditional threshold of statistical significance in the opposite direction. In contrast, the average relationship between grass cover and Eucalyptus seedling number was only slightly negative and not convincingly different from zero, and most effects ranged from weakly negative to weakly positive, with about a third of effects crossing the traditional threshold of significance in one direction or the other. However, there were also several striking outliers in the Eucalyptus dataset, with effects far from zero. For both datasets, we found substantial variation in variable selection and random effects structures among analyses, as well as in the ratings of the analytical methods by peer reviewers, but we found no strong relationship between any of these and deviation from the meta-analytic mean. In other words, analyses with results far from the mean were no more or less likely to have dissimilar variable sets, to use random effects in their models, or to receive poor peer reviews than analyses with results close to the mean. The existence of substantial variability among analysis outcomes raises important questions about how ecologists and evolutionary biologists should interpret published results, and how they should conduct analyses in the future.
**Here is the abstract from our Stage 1 Registered Report** (this project is a registered report, meaning it has been provisionally accepted for publication based on the quality of the introduction and methods prior to data collection):

Although variation in effect sizes and predicted values among studies of similar phenomena is inevitable, there is evidence that such variation may far exceed what might be produced by sampling error. This evidence comes from a growing meta-research agenda that seeks to describe and explain variation in the reliability of scientific results. One possible explanation for variation among results is differences among researchers in the decisions they make regarding statistical analyses. The best evidence for this comes from a recent social science study that asked 29 different research teams to answer the same question independently by analyzing the same dataset. Although many of the effect sizes were similar, some differed substantially from the average. We plan to implement an analogous study in ecology and evolutionary biology, fields in which there has been no empirical exploration of the variation in effect sizes or model predictions of dependent variables generated by the analytical decisions of different researchers. We have obtained two unpublished datasets, one from evolutionary ecology and one from conservation ecology, and we will recruit as many independent scientists as possible to conduct analyses of these data to answer prespecified research questions. We will also recruit peer reviewers to rate the analyses based on their methodological descriptions, so that we have multiple ratings of each analysis. Next we will quantify the variability in choices of independent variables among analyses and, using meta-analytic techniques, describe and quantify the degree of variability among effect sizes and predicted values for each of the datasets. Finally, we will quantify the extent to which the deviation of individual effect sizes and predicted values from the meta-analytic mean for that dataset is explained by peer review ratings and by the ‘uniqueness’ of the set of variables chosen for the analysis by each team.

**What follows is the information that appeared on this page when we were recruiting analysts and reviewers**

We are hoping to find collaborators willing to analyse one of our two datasets (click through to the components of this project) or to review analyses of these datasets. Project information can be found in the Google Doc in the Files section of this page.

You can sign up to take part through this link: http://eepurl.com/gWD42n

Check out our Frequently Asked Questions here: https://osf.io/j94fy/

**Analysts**

Once you complete your analysis, you will need to:

1) write up a journal-ready methods section
2) write up a journal-ready results section
3) answer a structured survey, providing your analysis technique, explanations of your analytical choices, quantitative results, and a statement describing your conclusions
4) share your analysis files, including the dataset formatted for your analyses (if relevant) and code or procedural files, depending on your analysis program of choice

**Reviewers**

We would love you to help us out by reviewing people's analyses, starting from July or August 2020, when we hope all of the analyses will be complete. The timeline is subject to the state of the world. We will ask each reviewer to assess at least four analyses of the same dataset.

You will receive the narrative methods section, the analysis team's answers to our survey questions about their methods (including analysis code), and the data, for one dataset at a time. You will then be asked to rate the appropriateness of the analysis from 0 to 100, to specify whether the analysis is (a) publishable as is, (b) publishable with minor revision, (c) publishable with major revision, or (d) deeply flawed and unpublishable, and to answer some more detailed questions about the suitability of the methods.

**For more information, email** Tim Parker (parkerth@whitman.edu)
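The Stage 1 plan above mentions using meta-analytic techniques to quantify variability among the effect sizes produced by different analyst teams and each team's deviation from the meta-analytic mean. As a purely illustrative sketch, not the project's actual analysis pipeline, the following Python snippet shows one standard way to estimate a random-effects meta-analytic mean and the between-analysis heterogeneity (tau² and I²) from a set of effect sizes and standard errors, using the DerSimonian–Laird estimator. All variable names and numbers here are hypothetical.

```python
import numpy as np

def random_effects_meta(effects, ses):
    """DerSimonian-Laird random-effects meta-analysis.

    effects : effect sizes, one per analyst team (hypothetical values)
    ses     : their standard errors
    Returns the random-effects mean, its SE, tau^2 (between-analysis
    variance), and I^2 (share of total variance due to heterogeneity).
    """
    effects = np.asarray(effects, dtype=float)
    v = np.asarray(ses, dtype=float) ** 2      # within-analysis variances
    w = 1.0 / v                                # fixed-effect weights

    # Fixed-effect mean and Cochran's Q statistic
    mu_fe = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - mu_fe) ** 2)
    df = len(effects) - 1

    # DerSimonian-Laird estimate of between-analysis variance tau^2
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights, pooled mean, and its standard error
    w_re = 1.0 / (v + tau2)
    mu_re = np.sum(w_re * effects) / np.sum(w_re)
    se_re = np.sqrt(1.0 / np.sum(w_re))

    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"mean": mu_re, "se": se_re, "tau2": tau2, "I2": i2}

# Hypothetical effect sizes and SEs from five analyst teams
effects = [-0.35, -0.22, -0.41, -0.05, -0.28]
ses = [0.08, 0.10, 0.07, 0.12, 0.09]
print(random_effects_meta(effects, ses))
```

In this sketch, an individual team's deviation from the meta-analytic mean would simply be its effect size minus the pooled mean; the project itself describes relating such deviations to peer review ratings and to the uniqueness of each team's variable set.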