# Open Science in CSD: Survey Data Metadata
## General info
**Link to dataset**
https://osf.io/5yvnx
**Dataset name**
Open Science in CSD: Survey Data
**Dataset description**
This dataset was collected as part of a study aimed at describing the knowledge, practices, and barriers to open science among researchers in the field of Communication Sciences and Disorders (CSD). The data are cross-sectional and were acquired via an online survey using Qualtrics.
The raw data were preprocessed to simplify variable names, remove irrelevant variables generated by Qualtrics (e.g., duration taken to complete the survey), and include only participants who met the inclusion criteria. The preprocessed data are shared.
**Dataset contributors**
Mariam El Amin<sup>1</sup>, James Borders<sup>2</sup>, Helen Long<sup>3</sup>, Mary Alice Keller<sup>4</sup>, & Elaine Kearney<sup>5</sup>
**Organization(s) affiliated to the dataset**
<sup>1</sup>University of Georgia
<sup>2</sup>Teachers College, Columbia University
<sup>3</sup>Waisman Center, University of Wisconsin - Madison
<sup>4</sup>HCA Healthcare, Graduate Medical Education
<sup>5</sup>Boston University
**Location of data collection**
Online
**Participants**
Active researchers (PhD students, post-doctoral researchers, faculty, research scientists) engaged in research in CSD and residing in the USA. Engagement in research was defined as participation in any aspect of the research process.
**Variable types**
Individual-level demographic info, item-level data
**Time points in dataset**
1
**Date of data collection**
July 1, 2021 to August 12, 2021
**License**
CC-By Attribution 4.0 International
## Codebook
Note: Any variable that has a _free_text suffix is a character variable and corresponds to the text input of the previous variable where “other” was an option.
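As a rough illustration of the note above, the following Python sketch pairs each _free_text column with the column that precedes it in the file. The column order in `example_columns` is hypothetical and is not read from the dataset; the real header should be taken from the shared file.

```python
def pair_free_text_columns(columns):
    """Map each *_free_text column to the column immediately before it."""
    pairs = {}
    for index, name in enumerate(columns):
        # A free-text column holds the "other" text for the preceding variable.
        if name.endswith("_free_text") and index > 0:
            pairs[name] = columns[index - 1]
    return pairs


# Hypothetical column order, for illustration only (not read from the dataset).
example_columns = [
    "id",
    "carnegie_classification",
    "carnegie_classification_free_text",
]
free_text_pairs = pair_free_text_columns(example_columns)
```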
The first set of variables corresponds to questions determining the eligibility of participants:
**participate**
- Binary variable (I want to participate/I do not want to participate)
- Indicating interest in participation
**engaged_research_binary**
- Binary variable (Yes/No)
- Indicating engagement in research in CSD
**location_usa**
- Binary variable (Yes/No)
- Indicating residence in the USA
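The three eligibility variables above can be used for a quick sanity check. The sketch below is a minimal example that assumes each row is a dictionary (for instance, as produced by `csv.DictReader`) and that the responses are stored as the text labels listed in this codebook; the actual coding in the file may differ. Because the shared data are described as including only participants who met the inclusion criteria, a check like this would presumably flag no rows.

```python
# Assumed response labels, taken from the codebook entries above.
ELIGIBLE_VALUES = {
    "participate": "I want to participate",
    "engaged_research_binary": "Yes",
    "location_usa": "Yes",
}


def is_eligible(row):
    """Return True if a row has the eligible response on all three variables."""
    return all(row.get(name) == value for name, value in ELIGIBLE_VALUES.items())


def find_ineligible(rows):
    """Return the rows that fail at least one eligibility condition."""
    return [row for row in rows if not is_eligible(row)]


# Hypothetical rows, for illustration only.
example_rows = [
    {"id": "1", "participate": "I want to participate",
     "engaged_research_binary": "Yes", "location_usa": "Yes"},
    {"id": "2", "participate": "I want to participate",
     "engaged_research_binary": "Yes", "location_usa": "No"},
]
flagged = find_ineligible(example_rows)
```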
The next set of variables corresponds to questions about participant demographics:
**research_position_description**
- Categorical variable
- Describing research position
- Selected from pre-populated drop-down menu
**phd_year**
- Numerical variable
- Year PhD was awarded
- NA = missing data (likely due to PhD not yet being completed)
**research_years_experience**
- Numerical variable
- Years of research experience in CSD including time spent as research assistant, PhD student, postdoc, faculty
**carnegie_classification**
- Categorical variable
- Describing Carnegie classification of current institution
- Selected from pre-populated drop-down menu
**slp_research_area**
- Categorical variable
- Describing research area
- Selected from pre-populated drop-down menu
- Was possible to select more than one area
**number_publications**
- Numerical variable
- Number of manuscripts submitted for peer review as author or co-author in past 3 years
**research_engagement**
- Categorical variable
- Describing type of research engagement
- Selected from pre-populated drop-down menu
- Was possible to select more than one type of engagement
**authoring_background**
- Categorical variable
- Describing experience in scientific authorship
- Selected from pre-populated drop-down menu
- Was possible to select more than one experience
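Several of the categorical variables above allow more than one selection. The sketch below shows one way such a cell might be split into individual selections. It assumes that multiple selections are stored in a single cell separated by commas, which this codebook does not state; the separator and the option labels in the example are hypothetical, and the approach would fail if any option label itself contains the separator.

```python
def split_multiselect(value, separator=","):
    """Split a multi-select cell into a list of trimmed selections.

    The separator is an assumption; check the shared file before relying on it.
    """
    if value is None or value.strip() in ("", "NA"):
        return []
    return [part.strip() for part in value.split(separator) if part.strip()]


# Hypothetical cell content, for illustration only.
selections = split_multiselect("Data collection, Data analysis")
```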
The next set of variables corresponds to questions about four different open science practices, namely preregistration (prereg), self-archiving (self_arch), gold open access (open_access), and open data (open_data). A number of variables are common across the practices; for example, all practices had a question asking participants how knowledgeable they are about the specific practice. For those common questions, the variables are named [practice]_[construct]. These variables are explained first, followed by the practice-specific variables. For Likert scale variables, 1 represents the lower end of the spectrum (e.g., not at all knowledgeable) and 6 represents the higher end (e.g., extremely knowledgeable).
**[practice]_knowledgable**
- Discrete variable
- 1-6 Likert scale
- Indicating degree of knowledge about a given practice
**[practice]_learn_more**
- Discrete variable
- 1-6 Likert scale
- Indicating interest in learning more about a given practice
**[practice]_before_binary**
- Binary variable (Yes/No)
- Indicating whether they had implemented a given practice in the past 12 months
**[practice]_future_binary**
- Binary variable (Yes/No)
- Indicating whether they intended to implement a given practice in the next 12 months
**[practice]_beneficial_daily**
- Discrete variable
- 1-6 Likert scale
- Indicating perceived benefit of a given practice for daily life as a researcher
**[practice]_beneficial_research**
- Discrete variable
- 1-6 Likert scale
- Indicating perceived benefit of a given practice for research field
**[practice]_beneficial_public**
- Discrete variable
- 1-6 Likert scale
- Indicating perceived benefit of a given practice for public society
**[practice]_barriers_extent**
- Discrete variable
- 1-6 Likert scale
- Indicating extent of perceived barriers in implementing a given practice
**[practice]_barriers**
- Categorical variable
- Describing type of barriers faced in implementing a given practice
- Selected from pre-populated drop-down menu
- Was possible to select more than one barrier
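The [practice]_[construct] naming pattern can be expanded programmatically. The sketch below builds the candidate names from the four practice prefixes and the nine common constructs listed above. It is only a starting point: the actual column names in the file may contain additional tokens (this codebook elsewhere refers to prereg_projects_before_binary, for example), so the generated names should be checked against the file header.

```python
PRACTICES = ["prereg", "self_arch", "open_access", "open_data"]
CONSTRUCTS = [
    "knowledgable",  # spelled as in the variable names
    "learn_more",
    "before_binary",
    "future_binary",
    "beneficial_daily",
    "beneficial_research",
    "beneficial_public",
    "barriers_extent",
    "barriers",
]


def common_variable_names(practices=PRACTICES, constructs=CONSTRUCTS):
    """Return every practice/construct combination as '<practice>_<construct>'."""
    return [
        f"{practice}_{construct}"
        for practice in practices
        for construct in constructs
    ]


candidate_names = common_variable_names()
```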
**prereg_projects_before_percentage**
- Continuous variable
- Applicable if prereg_projects_before_binary = Yes
- Estimate of percentage of previous projects preregistered
**prereg_projects_before_where**
- Categorical variable
- Applicable if prereg_projects_before_binary = Yes
- Describing platforms used when preregistering
- Selected from pre-populated drop-down menu
- Was possible to select more than one platform
**self_arch_papers_before_where**
- Categorical variable
- Applicable if self_arch_papers_before_binary = Yes
- Describing platforms used when self-archiving
- Selected from pre-populated drop-down menu
- Was possible to select more than one platform
**open_access_papers_before_percentage**
- Continuous variable
- Applicable if open_access_papers_before_binary = Yes
- Estimate of percentage of previous papers made gold open access
**open_access_valuable**
- Discrete variable
- 1-6 Likert scale
- Indicating perceived value of open access publishing
**open_access_reasons**
- Categorical variable
- Describing reasons for publishing in open access journals
- Selected from pre-populated drop-down menu
The final set of variables are calculated variables:
**years_since_phd**
- Numerical variable
- Calculated as 2021 - phd_year
- NA = missing data
**id**
- Discrete variable
- Unique identifier for each subject; assigned one per row; not linked to any identifying subject information
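The years_since_phd calculation described above can be reproduced with a few lines of Python. This is a minimal sketch that assumes phd_year has already been parsed to an integer, with `None` standing in for NA; the function and constant names are illustrative and are not taken from the authors' preprocessing code.

```python
SURVEY_YEAR = 2021  # year of data collection, per the codebook formula


def years_since_phd(phd_year):
    """Return 2021 - phd_year, or None when phd_year is missing."""
    if phd_year is None:
        return None
    return SURVEY_YEAR - phd_year


# Hypothetical values, for illustration only.
example_values = [years_since_phd(2015), years_since_phd(None)]
```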