Open Science Literature
**Open Science in the Literature**

----------

*These Zotero libraries contain a growing list of empirical and theoretical papers about open science.*

- [Open Science in the Scientific Literature]
- [Open Science: News and Editorials]

Other great collections of items are in this [Reproducibility Bibliography] and this [group library].

*Below is an annotated selection of papers related to the need for more transparent research and the effectiveness of such research practices. To add to this collection, please do so by [editing the wiki]. Email David Mellor firstname.lastname@example.org for questions.*

@[toc]

----------

## Benefits of Transparency ##

Transparent and reproducible research practices help researchers better organize their work and become more efficient, increase the impact and citation rate of their work, and help the scientific community build more quickly upon preliminary discoveries.

### Citation Advantages ###

#### Open Data ####

- [The citation advantage of linking publications to research data]
    - Articles in which data were made available in a repository showed a clear citation advantage of up to 25%.
- [Sharing Detailed Research Data Is Associated with Increased Citation Rate]
    - "The 48% of trials with publicly available microarray data received 85% of the aggregate citations. Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations..."
- [On the Citation Advantage of linking to data: Astrophysics]
    - "I find that the Citation Advantage presently (at the least since 2009) amounts to papers with links to data receiving on the average 50% more citations per paper per year, than the papers without links to data"
- [Data reuse and the open data citation advantage]
    - "...we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available."
- [Linking to Data - Effect on Citation Rates in Astronomy]
    - "... articles with data links on average acquired 20% more citations (compared to articles without these links) over a period of 10 years."
- [A study of the impact of data sharing on article citations using journal policies as a natural experiment]
    - "...published articles with posted data enjoyed an increase of 97.04 (standard error 34.12) to 109.28 (standard error 41.15) total citations over a mean of approximately 100, suggesting nearly a doubling."

#### Preprints ####

- [Altmetric Scores, Citations, and Publication of Studies Posted as Preprints]
    - "Articles with a preprint received higher Altmetric scores and more citations than articles without a preprint."
- [Releasing a preprint is associated with more attention and citations for the peer-reviewed article]
    - "Articles with a preprint had, on average, a 49% higher Altmetric Attention Score and 36% more citations than articles without a preprint"
- [The effect of bioRxiv preprints on citations and altmetrics]
    - "BioRxiv-deposited journal articles received a sizeable citation and altmetric advantage over non-deposited articles."

#### Registered Reports ####

- [Evaluating Registered Reports: A Naturalistic Comparative Study of Article Impact]
    - **This is an ongoing study, where data is expected to be added on an annual basis. Reported results are preliminary.**
    - Citation counts (from Web of Science) and Altmetric scores of Registered Reports and comparable, traditional articles suggest that there is no citation penalty from publishing in this format, and even show a slight increase in both measures so far.
### Comparing Preprints to Peer Reviewed Articles ###

- [Comparing published scientific journal articles to their pre-print versions]
- [Comparing quality of reporting between preprints and peer-reviewed articles in the biomedical literature]

### Benefits to Trust, Science, and Society ###

- "[Trust and Mistrust in Americans’ Views of Scientific Experts]"
    - Trust in research scientists has increased in recent years, and "...a majority of U.S. adults (57%) say they trust scientific research findings more if the researchers make their data publicly available. Another 34% say that makes no difference, and just 8% say they are less apt to trust research findings if the data is released publicly."
- "[How (and Whether) to Teach Undergraduates About the Replication Crisis in Psychological Science]"
    - "We developed and validated a 1-hr lecture communicating issues surrounding the replication crisis and current recommendations to increase reproducibility. Pre- and post-lecture surveys suggest that the lecture serves as an excellent pedagogical tool. Following the lecture, students trusted psychological studies slightly less but saw greater similarities between psychology and natural science fields."
- [Real-Time Sharing of Zika Virus Data in an Interconnected World]
    - A case study demonstrating how real-time data sharing benefited researchers and possibly patients.
- [A quick release of genome data from a deadly E. coli outbreak led to faster and better health benefits.]
- [A long journey to reproducible results]
    - Replicating our work took four years and 100,000 worms but brought surprising discoveries, explain Gordon J. Lithgow, Monica Driscoll and Patrick Phillips.
- [Benefits of open and high-powered research outweigh costs.]
([OA])
- [A randomized trial of a lab-embedded discourse intervention to improve research ethics]
    - "We demonstrate that, compared with the control laboratories, treatment laboratory members [who received project-based training curriculum intended to make ethics discourse a routine practice in university laboratories] perceived improvements in the quality of discourse on research ethics within their laboratories as well as enhanced awareness of the relevance and reasons for that discourse for their work."
- [Psychologists update their beliefs about effect sizes after replication studies]
    - "We examined belief updating in action by tracking research psychologists’ beliefs in psychological effects before and after the completion of four large-scale replication projects. We found that psychologists did update their beliefs; they updated as much as they predicted they would, but not as much as our Bayesian model suggests they should if they trust the results."

### Research Participant Attitudes Toward Data Sharing ###

- [Clinical Trial Participants’ Views of the Risks and Benefits of Data Sharing]
    - "A total of 93% were very or somewhat likely to allow their own data to be shared with university scientists..."
- [Public Attitudes toward Consent and Data Sharing in Biobank Research: A Large Multi-site Experimental Survey in the US]

### Costs of closed practices ###

- [The war over supercooled water]
    - How a hidden coding error fueled a seven-year dispute between two of condensed matter’s top theorists (which ended after code became open).
- [The Economics of Reproducibility in Preclinical Research]
    - "An analysis of past studies indicates that the cumulative (total) prevalence of irreproducible preclinical research exceeds 50%, resulting in approximately US$28B/year spent on preclinical research that is not reproducible—in the United States alone."

----------

## Data sharing policies and practices ##

- [Badges to Acknowledge Open Practices: A Simple, Low-Cost, Effective Method for Increasing Transparency]
    - Without imposing any mandates on authors, a journal was able to substantially increase the rate of data sharing by giving authors the opportunity to signal these actions to their peers with [Open Practice Badges].
- [Mandated data archiving greatly improves access to research data] ([preprint])
    - Policies that merely encourage data archiving do not affect actions. Policies that mandate data archiving are associated with higher rates of such actions, especially when combined with data accessibility statements.
- [Are We Wasting a Good Crisis? The Availability of Psychological Research Data after the Storm]
    - Data sharing policies that require researchers to make data available only when requested are ineffective.
- [An empirical analysis of journal policy effectiveness for computational reproducibility]
    - "We found that we were able to obtain artifacts from 44% of our sample and were able to reproduce the findings for 26%. We find this policy—author remission of data and code postpublication upon request—an improvement over no policy, but currently insufficient for reproducibility."
- [Reproducible and transparent research practices in published neurology research]
    - "Our results indicate that 9.4% [of neurology articles] provided access to materials, 9.2% provided access to raw data, 0.7% provided access to the analysis scripts, 0.7% linked the protocol, and 3.7% were preregistered."
- [The ethics of secondary data analysis: Considering the application of Belmont principles to the sharing of neuroimaging data]
    - Applicable to any non-clinical human subjects research.
The authors lay out the Belmont principles of justice, respect for persons, and beneficence, apply those principles to responsible data sharing and privacy practices, and conclude with how they should inform data sharing decisions.
- [Data policies of highly-ranked social science journals.]
    - "We conclude that a little more than half of the journals in our study have data policies. A greater share of the economics journals have data policies and mandate sharing, followed by political science/international relations and psychology journals."
- [Data sharing in PLOS ONE: An analysis of Data Availability Statements]
    - The proportion of articles in PLOS ONE with data availability statements has increased. The proportion of articles that comply with desired policy (shared data in persistent repository) is relatively low but increasing.
- [Data availability, reusability, and analytic reproducibility: Evaluating the impact of a mandatory open data policy at the journal Cognition.]
    - The authors found that a data sharing mandate was effective at increasing the rates of data availability, with some exceptions. Approximately two-thirds of the data could be used to computationally reproduce the reported findings, though that often required assistance from the original authors.
- [Authors of trials from high-ranking anesthesiology journals were not willing to share raw data]
    - "Among 619 randomized controlled trials published in seven high-impact anesthesiology journals, only 24 (4%) had data sharing statements in the manuscript. When asked to share de-identified raw data from their trial, authors of only 24 (4%) manuscript shared data. Among 24 trials with data sharing statements in the manuscript, only one author actually shared raw data."
- [Transparency of CHI Research Artifacts: Results of a Self-Reported Survey]
    - "We surveyed authors of [Computer-Human Interaction] 2018–2019 papers, asking if they share their papers' research materials and data, how they share them, and why they do not. The results (N = 460/1356, 34% response rate) show that sharing is uncommon... This paper and all data and materials are freely available at https://osf.io/csy8q "
- [The Evolution of Data Sharing Practices in the Psychological Literature]
    - "Descriptive results ... were coherent with previous findings: following the policies in Cognition and Psychological Science, data sharing statement rates increased immediately and continued to increase beyond the timeframes examined previously, until reaching close to 100%."
- [The Transparency of Quantitative Empirical Legal Research (2018-2020)]
    - "...we assessed 300 empirical articles from highly ranked law journals including both faculty-edited journals and student-edited journals. We found high levels of article accessibility (86% could be accessed without a subscription, 95% CI = [82%, 90%]), especially among student-edited journals (100% accessibility). Few articles stated that a study’s data are available, (19%, 95% CI = [15%, 23%]), and only about half of those datasets are reportedly available without contacting the author. Preregistration (3%, 95% CI = [1%, 5%]) and availability of analytic scripts (6%, 95% CI = [4%, 9%]) were very uncommon."

----------

## Reporting Standards, Guidelines, and Checklists ##

- [Does use of the CONSORT Statement impact the completeness of reporting of randomised controlled trials published in medical journals? A Cochrane review]
    - "The results of this review suggest that journal endorsement of CONSORT may benefit the completeness of reporting of RCTs they publish."
- [Authorization of Animal Experiments Is Based on Confidence Rather than Evidence of Scientific Rigor]
    - Few published studies or applications for animal experiments report key details of the experimental protocols.
- [Findings of a retrospective, controlled cohort study of the impact of a change in Nature journals' editorial policy for life sciences research on the completeness of reporting study design and execution.]
- [A checklist is associated with increased quality of reporting preclinical biomedical research: A systematic review]
- [Two Years Later: Journals Are Not Yet Enforcing the ARRIVE Guidelines on Reporting Standards for Pre-Clinical Animal Studies]
- [ARRIVE has not ARRIVEd: Support for the ARRIVE (Animal Research: Reporting of in vivo Experiments) guidelines does not improve the reporting quality of papers in animal welfare, analgesia or anesthesia]
- [Reducing waste from incomplete or unusable reports of biomedical research]
    - "...inadequate reporting occurs in all types of studies—animal and other preclinical studies, diagnostic studies, epidemiological studies, clinical prediction research, surveys, and qualitative studies. In this report, and in the Series more generally, we point to a waste at all stages in medical research."
- [A randomized trial of a lab-embedded discourse intervention to improve research ethics]
    - The intervention is a project-based research ethics curriculum that was designed to enhance the ability of science and engineering research laboratory members to engage in reason giving and interpersonal communication necessary for ethical practice. The randomized trial was fielded in active faculty-led laboratories at two US research-intensive institutions.
Here, we show that laboratory members perceived improvements in the quality of discourse on research ethics within their laboratories and enhanced awareness of the relevance and reasons for that discourse for their work as measured by a survey administered over 4 mo after the intervention.

----------

## Effects of preregistration or Registered Reports ##

- [Initial evidence of research quality of registered reports compared with the standard publishing model]
    - "353 researchers peer reviewed a pair of papers from 29 published RRs from psychology and neuroscience and 57 non-RR comparison papers. RRs numerically outperformed comparison papers on all 19 criteria..."
- [An excess of positive results: Comparing the standard Psychology literature with Registered Reports]
    - "We compared the results in the full population of published Registered Reports in Psychology (N = 71 as of November 2018) with a random sample of hypothesis-testing studies from the standard literature (N = 152) .... [W]e found 96% positive results in standard reports, but only 44% positive results in Registered Reports... This large gap suggests that psychologists underreport negative results to an extent that threatens cumulative science"
- [Comparing meta-analyses and preregistered multiple-laboratory replication projects]
    - "We find that meta-analytic effect sizes are significantly different from replication effect sizes... These differences are systematic and, on average, meta-analytic effect sizes are almost three times as large as replication effect sizes."
- [The Meaningfulness of Effect Sizes in Psychological Research: Differences Between Sub-Disciplines and the Impact of Potential Biases]
    - "The median effect of studies published without pre-registration (i.e., potentially affected by those biases) of Mdnr = 0.36 stands in stark contrast to the median effect of studies published with pre-registration (i.e., very unlikely to be affected by the biases) of Mdnr = 0.16.
Hence, if we consider the effect size estimates from replication studies or studies published with pre-registration to represent the true population effects we notice that, overall, the published effects are about twice as large."
- [Systematic Review of the Empirical Evidence of Study Publication Bias and Outcome Reporting Bias — An Updated Review]
    - Research that reports the results of statistically significant findings is more likely to be published than null findings.
- [Association between trial registration and treatment effect estimates: a meta-epidemiological study]
    - "Lack of trial prospective registration may be associated with larger treatment effect estimates."
- [Association between trial registration and positive study findings: cross sectional study (Epidemiological Study of Randomized Trials ESORT)]
    - "Among published RCTs, there was little evidence of a difference in positive study findings between registered and non-registered clinical trials, even with stratification by timing of registration."
- [Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time]
    - "The number NHLBI trials reporting positive results declined after the year 2000. Prospective declaration of outcomes in RCTs, and the adoption of transparent reporting standards, as required by clinicaltrials.gov, may have contributed to the trend toward null findings."
- [The Chrysalis Effect: How Ugly Initial Results Metamorphosize Into Beautiful Articles] ([OA])
    - "...from dissertation to journal article, the ratio of supported to unsupported hypotheses more than doubled (0.82 to 1.00 versus 1.94 to 1.00)."
- [Registered trials report less beneficial treatment effects than unregistered ones: a meta-epidemiological study in orthodontics]
    - "Signs of bias from lack of trial protocol registration were found with non-registered trials reporting more beneficial intervention effects than registered ones."
- [Registered reports: an early example and analysis]
    - "Although [Registered Reports] is usually seen as a relatively recent development, we note that a prototype of this publishing model was initiated in the mid-1970s by parapsychologist Martin Johnson in the European Journal of Parapsychology (EJP). A retrospective and observational comparison of Registered and non-Registered Reports published in the EJP during a seventeen-year period provides circumstantial evidence to suggest that the approach helped to reduce questionable research practices."
- [Open Science challenges, benefits and tips in early career and beyond]
    - "We assessed the percentage of hypotheses that were not supported and compared it to percentages previously reported within the wider literature. 61% of the studies we surveyed did not support their hypothesis (https://osf.io/wy2ek/)" See Nature News article [here].
- [Association of Trial Registration With Reporting of Primary Outcomes in Protocols and Publications]
    - Discrepancies between the protocol and publication were more common in unregistered trials (6 of 11 trials [55%]) than registered trials (3 of 47 [6%]) (P < .001). Only 1 published article acknowledged the changes to primary outcomes.
- [Pre-analysis Plans: A Stocktaking]
    - "We analyze a representative sample of 195 pre-analysis plans (PAPs) ... to assess whether PAPs are sufficiently clear, precise and comprehensive to be able to achieve their objectives of preventing “fishing” and reducing the scope for post-hoc adjustment of research hypotheses. We also analyze a subset of 93 PAPs from projects that have resulted in publicly available papers to ascertain how faithfully they adhere to their pre-registered specifications and hypotheses. We find significant variation in the extent to which PAPs are accomplishing the goals they were designed to achieve."
- [Discrepancies in the Registries of Diet vs Drug Trials]
    - "Our literature search retrieved 148 drug studies and 343 diet studies, from which 9 and 21, respectively, were included in our sample after applying exclusion criteria. ...[Two] drug trials (22%) and 18 diet trials (86%) had a substantive discrepancy from initial registration, typically involving a change in time frame of the primary outcome or the number of co–primary outcomes."

### Publication Bias ###

- [Publication bias in the social sciences: Unlocking the file drawer] ([preprint])
    - The authors find a strong bias toward statistically significant findings in reported outcomes, even within a body of work where methodology and rigor did not vary.
- [Of Bias and Blind Selection: Pre-registration and Results-Free Review in Observational and Qualitative Research]
    - "Across the 94 articles examined, there is not a single example of a null finding: of conclusions that directly undermine the primary explanatory or theoretical claim that the article develops."
- [The cumulative effect of reporting and citation biases on the apparent efficacy of treatments: the case of depression]
    - Access to unpublished results via the FDA reviews allowed the authors to discover the size of publication bias within this field.
- [Outcome reporting bias in randomized-controlled trials investigating antipsychotic drugs]
    - "Of the 48 RCTs [from ClinicalTrials.gov], 85% did not fully adhere to the prespecified outcomes [in the published article]."
- [P values in display items are ubiquitous and almost invariably significant: A survey of top science journals]
    - "...the rapid growth of reliance on P values and implausibly high rates of reported statistical significance are worrisome."

----------

## Questionable research practices ##

- [HARKing: Hypothesizing After the Results are Known]
    - Using a dataset to generate and then test a hypothesis is circular reasoning that invalidates the test statistics.
- [Why Most Published Research Findings Are False]
    - Small sample sizes, small effect sizes, and unreported data analysis flexibility invalidate most statistical tests.
- [False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant]
    - The authors show how even absurd claims can be supported by presenting only a subset of analyses and provide six concrete solutions to this problem.
- [Current Incentives for Scientists Lead to Underpowered Studies with Erroneous Conclusions]
    - Higginson and Munafo demonstrate how researchers are rewarded for underpowered research and how valuation of different research methods can affect researchers' self-interested actions.
- [The natural selection of bad science]
    - "The persistence of poor methods results partly from incentives that favour them, leading to the natural selection of bad science. ... In order to improve the culture of science, a shift must be made away from correcting misunderstandings and towards rewarding understanding. We support this argument with empirical evidence and computational modelling."

### Surveys of QRP prevalence ###

*Note: See a quick summary of these surveys here, via [Hannah Fraser]*

- [Questionable research practices among Italian research psychologists]
    - "Nearly all researchers (88%) admitted using at least one [QRP]."
- [Questionable research practices in ecology and evolution]
    - "...we found 64% of surveyed researchers reported they had at least once failed to report results because they were not statistically significant (cherry picking); 42% had collected more data after inspecting whether results were statistically significant (a form of p hacking) and 51% had reported an unexpected finding as though it had been hypothesised from the start (HARKing)."
- [Questionable and Open Research Practices in Education Research]
    - "Broadly, our results suggest that both questionable and open research practices are part of the typical research practices of many educational researchers."
- [Measuring the Prevalence of Questionable Research Practices With Incentives for Truth Telling]
    - "...we found that the percentage of respondents who have engaged in questionable practices was surprisingly high"
- [Questionable and open research practices: attitudes and perceptions among quantitative communication researchers]
    - "A non-trivial percent of researchers report using one or more QRPs. While QRPs are generally considered unacceptable, researchers perceive QRPs to be common among their colleagues."

----------

## The Reproducibility Crisis ##

- [The Reproducibility Project: Cancer Biology]
    - An initiative to independently replicate selected experiments from a number of high-profile papers in the field of cancer biology. In the end 50 experiments from 23 papers were repeated.
- **Many Labs 1** "[Investigating Variation in Replicability]" A “Many Labs” Replication Project
    - "This research tested variation in the replicability of 13 classic and contemporary effects across 36 independent samples totaling 6,344 participants. In the aggregate, 10 effects replicated consistently. One effect – imagined contact reducing prejudice – showed weak support for replicability. And two effects – flag priming influencing conservatism and currency priming influencing system justification – did not replicate. We compared whether the conditions such as lab versus online or US versus international sample predicted effect magnitudes. By and large they did not. The results of this small sample of effects suggest that replicability is more dependent on the effect itself than on the sample and setting used to investigate the effect."
- "**Many Labs 2**: [Investigating Variation in Replicability Across Sample and Setting]"
    - "Cumulatively, variability in observed effect sizes was more attributable to the effect being studied than the sample or setting in which it was studied."
- "**Many Labs 3**: [Evaluating participant pool quality across the academic semester via replication]"
    - "The university participant pool is a key resource for behavioral research, and data quality is believed to vary over the course of the academic semester. This crowdsourced project examined time of semester variation in 10 known effects, 10 individual differences, and 3 data quality indicators over the course of the academic semester in 20 participant pools (N = 2696) and with an online sample (N = 737)."
- [Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say "Usually Not"]
    - Using original data and code when available, the authors were able to computationally reproduce fewer than half of the original findings from their target sample of 67 studies.
- [Evaluating replicability of laboratory experiments in economics]
    - Of 18 experimental studies published in economics, 11 (61%) replicated the primary findings.
- [Estimating the reproducibility of psychological science] ([preprint])
    - The authors attempted to replicate 100 studies from the published literature using higher powered designs and original materials, and were able to replicate fewer than 40 of the original findings.
- The [Reproducibility Project: Cancer Biology]
- [Drug development: Raise standards for preclinical cancer research]
    - Commercial attempts to confirm 53 landmark, novel studies resulted in 6 (11%) confirmed research findings.
- [Believe it or not: how much can we rely on published data on potential drug targets?]
    - Of 67 target-validation projects in oncology and cardiovascular medicine conducted at Bayer, 14 (20%) produced results that matched the published findings, whereas results in 43 were highly inconsistent with them.
- [Repeatability of published microarray gene expression analyses]
    - In this study, Ioannidis et al. attempted to repeat the analyses of 18 experiments using data from the original studies. The results of eight experiments were reproduced or partially reproduced.
- [Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015]
    - "We replicate 21 systematically selected experimental studies in the social sciences published in Nature and Science between 2010 and 2015. The replications follow analysis plans reviewed by the original authors and pre-registered prior to the replications... We find a significant effect in the same direction as the original study for 13 (62%) studies, and the effect size of the replications is on average about 50% of the original effect size."
- [Estimating the Reproducibility of Experimental Philosophy]
    - "Drawing on a representative sample of 40 x-phi studies published between 2003 and 2015, we enlisted 20 research teams across 8 countries to conduct a high-quality replication of each study in order to compare the results to the original published findings. We found that x-phi studies – as represented in our sample – successfully replicated about 70% of the time."
- "On the reproducibility of science: unique identification of research resources in the biomedical literature"
    - [50% of scientific resources used in previously published articles were unidentifiable]
- "The Economics of Reproducibility in Preclinical Research"
    - [$28 billion annually in the US alone wastefully spent on research that cannot be replicated.]
- [Rate and success of study replication in ecology and evolution]
    - "Approximately 0.023% of ecology and evolution studies are described by their authors as replications."
- "[No Support for Historical Candidate Gene or Candidate Gene-by-Interaction Hypotheses for Major Depression Across Multiple Large Samples]"

----------

## Evaluating Journal Policies ##

- [Are Psychology Journals Anti-replication? A Snapshot of Editorial Practices]
    - "Thirty three journals [out of 1151] (3%) stated in their aims or instructions to authors that they accepted replications."
- [Effect of impact factor and discipline on journal data sharing policies]
    - "[We analyzed] the data sharing policies of 447 journals across several scientific disciplines, including biology, clinical sciences, mathematics, physics, and social sciences. Our results showed that only a small percentage of journals require data sharing as a condition of publication..."
- [Evaluation of Journal Registration Policies and Prospective Registration of Randomized Clinical Trials of Nonregulated Health Care Interventions]
    - "Few journals in behavioral sciences or psychology, nursing, nutrition and dietetics, rehabilitation, and surgery require prospective trial registration, and those with existing registration policies rarely enforce them; this finding suggests that strategies for encouraging prospective registration of clinical trials not subject to FDA regulation should be developed and tested."

----------

## Recommendations for Increasing Reproducibility ##

- [General Principles of Preclinical Study Design]
- [A Practical Guide for Transparency in Psychological Science]
    - "Here we provide a practical guide to help researchers navigate the process of preparing and sharing the products of their research (e.g., choosing a repository, preparing their research products for sharing, structuring folders, etc.)."
- [Detecting and avoiding likely false-positive findings – a practical guide]
- [A manifesto for reproducible science]
- [An Agenda for Purely Confirmatory Research]
- [Striving for transparent and credible research: practical guidelines for behavioral ecologists]
- [Performing high-powered studies efficiently with sequential analyses]
    - Sequential analyses give the researcher a tool to minimize sample size and "peek" at incoming results without invalidating the test statistics or increasing the false positive rate.
- [Standard Operating Procedures: A Safety Net for Pre-Analysis Plans]
    - SOPs allow you to provide rationale for your decisions by citing a document that lives outside of your preregistration. This keeps preregs concise, and serves as a lab notebook of "lessons learned" over many years.
- [The Psychological Science Accelerator: Advancing Psychology through a Distributed Collaborative Network]
- [Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations]

### Reproducible Workflows, Computing, and Data Transparency ###

- "[Good enough practices in scientific computing]"
    - "This paper presents a set of good computing practices that every researcher can adopt, regardless of their current level of computational skill. These practices... encompass data management, programming, collaborating with colleagues, organizing projects, tracking work, and writing manuscripts..."
- [Enhancing transparency of the research process to increase accuracy of findings: A guide for relationship researchers] ([OA])
- [Practical Tips for Ethical Data Sharing] ([OA])
    - "This Tutorial provides practical dos and don’ts for sharing research data in ways that are effective, ethical, and compliant with the federal Common Rule."
- [Analysis of Open Data and Computational Reproducibility in Registered Reports in Psychology] After attempting to reproduce the analyses and results of 62 Registered Reports (20 of which were fully reproducible), the authors provide several specific recommendations:
    - Data is easier to understand and more reusable if variables and their values are clearly described (e.g., in a [codebook])
    - Code should be well-annotated, so that it is understandable for researchers who did not write the code
    - For R code, list all the packages that the code needs to run at the top of the script (and their versions)
    - Use relative locations (and not “c:/user/myfolder/code”).
    - When multiple scripts are used in the analysis, include the order in which scripts should be performed on the data in a README file
    - SPSS users should take care to clearly organize their analysis scripts by adding comments or a README file that links results generated by the SPSS script to the analyses reported in the manuscript

### Preregistration ###

- [The Preregistration Revolution]
- [General overview and resources page on preregistration]
- [Guidelines for the Content of Statistical Analysis Plans in Clinical Trials]

### Specification Curves or Multiverse Analysis ###

- [Specification Curve: Descriptive and Inferential Statistics on All Reasonable Specifications]
    - "Empirical results often hinge on data analytic decisions that are simultaneously defensible, arbitrary, and motivated. To mitigate this problem we introduce Specification-Curve Analysis, which consists of three steps: (i) identifying the set of theoretically justified, statistically valid, and non-redundant analytic specifications, (ii) displaying alternative results graphically, allowing the identification of decisions producing different results, and (iii) conducting statistical tests to determine whether as a whole results are inconsistent with the null hypothesis."
- [Run All the Models!
Dealing With Data Analytic Flexibility]
    - multiverse paper: https://ppw.kuleuven.be/okp/_pdf/Steegen2016ITTAM.pdf
    - multiverse shiny app: https://r.tquant.eu/KULeuven/Multiverse/
    - multiverse code: https://osf.io/zj68b/

### Training Resources ###

- [COS webinars]
- [Data carpentry]
- [Improving your statistical inference] by Daniel Lakens
- NIH [Clearinghouse for Training Modules to Enhance Data Reproducibility]
- [Best practices in open science]

### Split samples, holdout data, or training and validation data sets ###

- “[Split-Sample Strategies for Avoiding False Discoveries],” by Michael L. Anderson and Jeremy Magruder ([ungated here])
- “[Using Split Samples to Improve Inference on Causal Effects],” by Marcel Fafchamps and Julien Labonne ([ungated and updated here])
- [The reusable holdout: Preserving validity in adaptive data analysis]

----------

## Attitudes about open science ##

The following studies report on opinions and self-reported frequencies of various open science practices. All include links to the questionnaires and collected data.

- [Normative Dissonance in Science: Results from a National Survey of U.S. Scientists]
    - Norms of behavior in scientific research represent ideals to which most scientists subscribe. Our analysis of the extent of dissonance between these widely espoused ideals and scientists' perceptions of their own and others' behavior is based on survey responses from 3,247 [scientists]. We found substantial normative dissonance, particularly between espoused ideals and respondents' perceptions of other scientists' typical behavior. Also, respondents on average saw other scientists' behavior as more counternormative than normative. ... The high levels of normative dissonance documented here represent a persistent source of stress in science.
- [A study of the determinants of psychologists' data sharing and open data badge adoption]
    - "The results...
demonstrate that psychologists' (n = 338) data sharing and open data badge adoption intentions are commonly influenced by perceived community benefit, norm of data sharing, and perceived effort involved to share datasets. Additionally, psychologists' data sharing intentions are affected by additional, normative, and control factors including the norm of reciprocity, IRB requirements, and availability of data repositories. As it concerns open data badge adoption, psychologists are affected by additional attitudinal factors, including perceived academic reputation and risk. This research suggests psychologists' motivations to share data and for open data badge adoption differ..."
- [Open Science Practices are on the Rise: The State of Social Science (3S) Survey]
    - "Data from a recent representative survey of scholars in four large social science disciplines... indicates that the adoption of open science practices has been increasing rapidly over the past decade... Behaviors such as posting data and materials that were nearly unknown in some fields as recently as 2005 are now practiced by the majority of scholars. Other newer practices, such as study pre-registration, have experienced a sharp rise in adoption just in recent years, especially among scholars who engage in experimental research... The second main finding of the analysis is that stated support for open science practices is outpacing both their actual adoption and respondents’ beliefs about others’ support..."
- [Attitudes towards animal study registries and their characteristics: An online survey of three cohorts of animal researchers]
    - "The respondents indicated, that some aspects of ASRs can increase administrative burden but could be outweighed by other aspects decreasing this burden. Animal researchers found it more important to register studies that involved animal species with higher levels of cognitive capabilities."
- [Why Do Some Psychology Researchers Resist Adopting Proposed Reforms to Research Practices? A Description of Researchers’ Rationales]
    - Our results suggest that (a) researchers have adopted some of the proposed reforms (e.g., reporting effect sizes) more than others (e.g., preregistering studies) and (b) rationales for not adopting them reflect a need for more discussion and education about their utility and feasibility.
- [Data Sharing in Psychology: A Survey on Barriers and Preconditions]
    - "The results confirmed that data are shared only infrequently. Perceived barriers included respondents’ belief that sharing is not a common practice in their fields, their preference to share data only upon request, their perception that sharing requires extra work, and their lack of training in sharing data."
- [The state of social and personality science: Rotten to the core, not so bad, getting better, or getting worse?]
- [1,500 scientists lift the lid on reproducibility] | [Questionnaire and Data]
    - 90% of scientists feel that there is a significant or slight reproducibility crisis. 3% feel that there is no crisis.
- [Survey on open peer review: Attitudes and experience amongst editors, authors and reviewers] | [Materials and data]
- [The State of Open Data] | [Data and Survey]
- [Open Data: The Researcher Perspective] | [Data and Survey]

: https://www.zotero.org/groups/osf/items/collectionKey/6NTIIMHN : https://www.zotero.org/groups/osf/items/collectionKey/QK9TP9B9 : https://reproducibility.dash.umn.edu/ : https://www.zotero.org/groups/2526436/meta-research_on_os-related_surveys : https://help.osf.io/hc/en-us/articles/360019737474-Edit-the-Wiki : https://arxiv.org/pdf/1907.02565.pdf : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000308 : https://hal-hprints.archives-ouvertes.fr/hprints-00714715 : https://peerj.com/articles/175/ : https://arxiv.org/pdf/1111.3618.pdf : https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0225883 : https://jamanetwork.com/journals/jama/fullarticle/2670247 : https://elifesciences.org/articles/52646 : https://www.biorxiv.org/content/10.1101/673665v1 : https://osf.io/5y8w7/ : https://link.springer.com/article/10.1007/s00799-018-0234-1 : https://www.biorxiv.org/content/10.1101/581892v3 : https://www.pewresearch.org/science/wp-content/uploads/sites/16/2019/08/PS_08.02.19_trust.in_.scientists_FULLREPORT_8.5.19.pdf : https://journals.sagepub.com/doi/abs/10.1177/0098628318762900?journalCode=topa : https://jamanetwork.com/journals/jamapediatrics/fullarticle/2511238 : http://opendatahandbook.org/value-stories/en/open-sourcing-genomes/ : https://www.nature.com/news/a-long-journey-to-reproducible-results-1.22478 : https://www.ncbi.nlm.nih.gov/pubmed/28714729 : https://psyarxiv.com/fcxge : https://www.pnas.org/content/early/2020/01/08/1917848117 : https://www.nature.com/articles/s41562-021-01220-7 : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6057615/ : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5339111/ : https://physicstoday.scitation.org/do/10.1063/PT.6.1.20180822a/full/ :
http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165 : http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002456 : http://cos.io/badges : http://www.fasebj.org/content/27/4/1304.short : https://arxiv.org/pdf/1301.3744.pdf : http://www.collabra.org/articles/10.1525/collabra.13/ : http://www.pnas.org/content/early/2018/03/08/1708290115 : https://researchintegrityjournal.biomedcentral.com/articles/10.1186/s41073-020-0091-5 : http://www.sciencedirect.com/science/article/pii/S1053811913001742 : https://osf.io/preprints/socarxiv/9h7ay : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194768 : https://osf.io/preprints/bitss/39cfb/ : https://www.jclinepi.com/article/S0895-4356%2818%2930606-1/pdf : https://osf.io/3bu6t : https://psyarxiv.com/3xdja : https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4046154 : https://systematicreviewsjournal.biomedcentral.com/articles/10.1186/2046-4053-1-60 : http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2000598 : https://www.biorxiv.org/content/early/2017/09/12/187245 : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0183591 : http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001756 : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0197882 : https://www.thelancet.com/journals/lancet/article/PIIS0140-6736%2813%2962228-X/fulltext : https://www.pnas.org/doi/full/10.1073/pnas.1917848117 : https://www.nature.com/articles/s41562-021-01142-4 : https://psyarxiv.com/p6e9c : https://www.nature.com/articles/s41562-019-0787-z : https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00813/full : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066844 : http://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-016-0639-x : http://www.bmj.com/content/356/bmj.j917 : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132382 : 
https://journals.sagepub.com/doi/abs/10.1177/0149206314527133 : http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1027.8580&rep=rep1&type=pdf : https://www.sciencedirect.com/science/article/pii/S0895435617311381 : https://peerj.com/articles/6232/ : https://psyarxiv.com/3czyt/ : https://www.nature.com/articles/d41586-018-07118-1 : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5818784/ : https://www.georgeofosu.com/files/Ofosu-Posner-191007-1.pdf : https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2755303 : http://science.sciencemag.org/content/345/6203/1502 : http://www.law.nyu.edu/sites/default/files/upload_documents/September%209%20Neil%20Malhotra.pdf : https://politics.sites.olt.ubc.ca/files/2018/11/Jacobs-Prereg-and-RBR-in-Qualitative-and-Observational-1.pdf : https://www.cambridge.org/core/journals/psychological-medicine/article/cumulative-effect-of-reporting-and-citation-biases-on-the-apparent-efficacy-of-treatments-the-case-of-depression/71D73CADE32C0D3D996DABEA3FCDBF57#fndtn-information : http://www.nature.com/tp/journal/v7/n9/full/tp2017203a.html : http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0197440 : http://psr.sagepub.com/cgi/doi/10.1207/s15327957pspr0203_4 : http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124 : http://pss.sagepub.com/lookup/doi/10.1177/0956797611417632 : http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2000995 : https://royalsocietypublishing.org/doi/10.1098/rsos.160384 : https://twitter.com/HannahSFraser/status/1190063017804718080 : https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0172792 : http://dx.plos.org/10.1371/journal.pone.0200303 : https://edarxiv.org/f7srb/ : https://journals.sagepub.com/doi/abs/10.1177/0956797611430953 : https://psyarxiv.com/7uyn5/ : https://www.cos.io/rpcb : https://econtent.hogrefe.com/doi/full/10.1027/1864-9335/a000178 : https://psyarxiv.com/9654g : 
https://www.sciencedirect.com/science/article/pii/S0022103115300123 : http://www.federalreserve.gov/econresdata/feds/2015/files/2015083pap.pdf : http://science.sciencemag.org/content/351/6280/1433 : http://science.sciencemag.org/content/349/6251/aac4716 : https://osf.io/447b3/ : https://elifesciences.org/collections/reproducibility-project-cancer-biology : https://www.nature.com/nature/journal/v483/n7391/full/483531a.html : http://www.nature.com/nrd/journal/v10/n9/full/nrd3439-c1.html : http://www.nature.com/ng/journal/v41/n2/full/ng.295.html : https://www.nature.com/articles/s41562-018-0399-z : https://link.springer.com/article/10.1007/s13164-018-0400-9 : https://peerj.com/articles/148/ : https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165 : https://peerj.com/articles/7654/ : https://www.ncbi.nlm.nih.gov/pubmed/30845820 : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5387793/ : https://www.tandfonline.com/doi/abs/10.1080/08989621.2019.1591277?journalCode=gacr20 : https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2727849?guestAccessKey=6db00880-5864-4de7-8e82-a107a6323d98&utm_source=silverchair&utm_medium=email&utm_campaign=article_alert-jamainternalmedicine&utm_content=etoc&utm_term=050619 : https://link.springer.com/chapter/10.1007/164_2019_277 : https://www.collabra.org/articles/10.1525/collabra.158/ : http://onlinelibrary.wiley.com/doi/10.1111/brv.12315/full : http://www.nature.com/articles/s41562-016-0021 : https://journals.sagepub.com/doi/full/10.1177/1745691612463078 : https://academic.oup.com/beheco/article-abstract/doi/10.1093/beheco/arx003/3069145/Striving-for-transparent-and-credible-research?redirectedFrom=fulltext : http://onlinelibrary.wiley.com/doi/10.1002/ejsp.2023/abstract : https://www.stat.berkeley.edu/~winston/sop-safety-net.pdf : https://psyarxiv.com/785qu/ : https://link.springer.com/article/10.1007/s10654-016-0149-3 : https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510 : 
http://psycnet.apa.org/record/2014-55649-001 : https://etiennelebel.com/documents/cl&l%282014,pr%29.pdf : http://journals.sagepub.com/doi/pdf/10.1177/2515245917747656 : http://louisville.edu/mobileelsi/wgm-2-thought-leader-input-and-regulatory-framework/wgm-2-background-materials/practical-tips-for-ethical-data-sharing/view : https://psyarxiv.com/fk8vh/ : https://help.osf.io/hc/en-us/articles/360019739054-How-to-Make-a-Data-Dictionary : http://www.pnas.org/content/early/2018/03/08/1708274114 : http://cos.io/prereg : https://jamanetwork.com/journals/jama/fullarticle/2666509 : https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2694998 : https://www.psychologicalscience.org/observer/run-all-the-models-dealing-with-data-analytic-flexibility : https://cos.io/our-services/training-services/cos-training-tutorials/ : https://datacarpentry.org/lessons/ : https://www.coursera.org/learn/statistical-inferences : https://www.nigms.nih.gov/training/pages/clearinghouse-for-training-modules-to-enhance-data-reproducibility.aspx : https://help.osf.io/hc/en-us/categories/360001530634-Best-Practices : http://www.nber.org/papers/w23544 : https://are.berkeley.edu/~jmagruder/split-sample.pdf : http://www.nber.org/papers/w21842 : https://julienlabonne.files.wordpress.com/2017/06/sample_split_simulations_web.pdf : http://science.sciencemag.org/content/349/6248/636 : https://www.jstor.org/stable/10.1525/jer.2007.2.4.3?seq=1#page_scan_tab_contents : https://onlinelibrary.wiley.com/doi/abs/10.1002/leap.1388 : https://osf.io/preprints/metaarxiv/5rksu/ : https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226443 : http://journals.sagepub.com/doi/abs/10.1177/2515245918757427 : https://journals.sagepub.com/doi/full/10.1177/2515245917751886 : http://psycnet.apa.org/fulltext/2017-18565-001.html : https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970 : https://figshare.com/articles/Nature_Reproducibility_survey/3394951/1 : 
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0189311 : https://zenodo.org/record/439531 : https://figshare.com/articles/The_State_of_Open_Data_Report/4036398 : https://figshare.com/articles/Open_Data_Survey/4010541 : https://www.elsevier.com/__data/assets/pdf_file/0004/281920/Open-data-report.pdf : https://data.mendeley.com/datasets/bwrnfb4bvh/1