**Enhanced Supporting Information Publisher Pilot - NSF FAIR Chemical Data Publishing Guidelines Workshop: Chemical Structures and Spectra**

***Overview of Pilot***

The purpose of the Enhanced Supporting Information Publisher Pilot is to test a plan for disseminating machine-readable chemical structures and spectra alongside published journal articles. Disseminating enhanced supporting information contributes to the chemical community's desire for open data that are Findable, Accessible, Interoperable and Reusable (FAIR) across the broader scientific community. The Pilot seeks to build on existing publisher supporting information workflows by adapting them to include an organized package file containing machine-readable structure and spectral data files, along with corresponding metadata. Because current machine-readable chemical structure file formats best represent and support small-molecule chemistry, the initial target audience is organic chemistry authors and editors. To incentivize author participation, we envision that publishers and editors would offer value-added advertising and promotion for articles containing enhanced supporting information (e.g., more FAIR data).
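The organized package file described above could take many forms; the Pilot text does not specify one. As a minimal sketch, assuming a zip archive with a JSON manifest, the following bundles structure and spectral files together with a short description and checksum for each. The function name `build_package`, the manifest layout, the file names, the choice of molfile and JCAMP-DX as formats, and the DOI `10.0000/example` are all illustrative assumptions rather than part of the Pilot.

```python
import hashlib
import io
import json
import zipfile


def build_package(files, article_doi):
    """Bundle data files and a JSON manifest into one in-memory zip archive.

    `files` maps an archive path to a (content_bytes, description) pair.
    The manifest layout is a hypothetical example, not a Pilot specification.
    """
    manifest = {"article_doi": article_doi, "files": []}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, (content, description) in sorted(files.items()):
            archive.writestr(path, content)
            manifest["files"].append({
                "path": path,
                "description": description,
                # A checksum lets downstream users detect corrupted files.
                "sha256": hashlib.sha256(content).hexdigest(),
            })
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


# Placeholder file contents; real packages would hold actual data files.
example_files = {
    "structures/compound_1.mol": (
        b"placeholder molfile content\n",
        "Structure of compound 1",
    ),
    "spectra/compound_1_1H.jdx": (
        b"##TITLE=placeholder\n##END=\n",
        "1H NMR spectrum of compound 1",
    ),
}
package_bytes = build_package(example_files, article_doi="10.0000/example")
```

Keeping the manifest inside the archive means the descriptions travel with the data files wherever the package is copied.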
***Selected Example Benefits for Authors, Reviewers, and Publishers***

Authors

- Receive credit for sharing FAIR data
- Archived data extends the usefulness of research and is available for future researchers
- Increase reputation for integrity and quality science
- Demonstrate robust data sharing methods to funders
- Gain authority in the field and increase metrics/citations

Reviewers and Readers

- More easily reproduce the science
- Enhanced ability to critically evaluate the data and conclusions
- Ability to repurpose and reuse the data
- Improve own scientific data sharing

Publishers and Editors

- Increase readership
- Improve the image and reputation of the journal
- Increase citations
- Ensure rigor/quality/reputation of authors and journal
- Attract higher-profile authors to publish
- Be competitive with other journal publishers

***Overview of FAIR data and reuse examples of machine-readable structure and spectral data***

Data are considered FAIR when they are Findable, Accessible, Interoperable and Reusable by both human experts and machines. FAIR data can be discovered, ingested, and compiled programmatically into downstream applications, such as computational analysis, with minimal corruption and loss. Making data FAIR involves sharing data files in open and accessible formats as much as possible, ideally in both raw and processed forms. Processed data are generally easier to reuse directly in applications, while including raw data as part of the supplemental information promotes transparency in research and provides full signal information for re-analysis, for example to compare techniques or develop new methodologies. It is also critical to describe the data files adequately so that users can accurately assess the parameters of the samples and measurement techniques. This descriptive information, represented so that machines can process it, is referred to as metadata.
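To make the idea of machine-processable metadata concrete, here is a minimal sketch of a metadata record for a single spectrum file, serialized to JSON, with a simple check for fields left empty. The class `SpectrumMetadata`, its field names, and the example values are hypothetical and chosen only for illustration; they do not reflect a published schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SpectrumMetadata:
    # All field names are illustrative assumptions, not a published schema.
    file_name: str
    technique: str
    nucleus: str
    solvent: str
    frequency_mhz: float
    temperature_k: float
    data_state: str  # "raw" or "processed"


def missing_fields(record):
    """Return the names of fields left empty, so gaps are caught before sharing."""
    return [name for name, value in asdict(record).items() if value in ("", None)]


# Hypothetical example values for one processed 1H NMR spectrum.
record = SpectrumMetadata(
    file_name="compound_1_1H.jdx",
    technique="NMR",
    nucleus="1H",
    solvent="CDCl3",
    frequency_mhz=400.0,
    temperature_k=298.0,
    data_state="processed",
)
metadata_json = json.dumps(asdict(record), indent=2)
```

A record like this can be read by software without human interpretation, which is what distinguishes metadata from a free-text description in a PDF.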
In the digital environment, persistent identifiers are increasingly used to connect data files, descriptions, and other research outputs such as articles. DOIs are regularly used for the official copy of record for articles and can also be used for datasets. In addition to a unique alphanumeric identifier, a DOI carries a basic set of metadata: a general-level description that establishes provenance and facilitates discovery. The full description needed for analysis and reproducibility of specific measurements is usually included in the data file itself, as part of the file format.

An example of data published in a FAIR manner that is familiar to many chemists is crystallographic data for small organic molecules. Processed crystallographic data associated with articles are saved in the open CIF standard file format and deposited in the Cambridge Structural Database (CSD), where they can be searched and analyzed further.

Some compelling reuse cases:

- McAlpine, J. B. et al. The value of universally available raw NMR data for transparency, reproducibility, and integrity in natural product research. Nat. Prod. Rep. 2019, 36, 35-107. DOI: 10.1039/C7NP00064B.
- Coley, C. W.; Green, W. H.; Jensen, K. F. Machine Learning in Computer-Aided Synthesis Planning. Accounts of Chemical Research 2018, 51, 1281-1289. DOI: 10.1021/acs.accounts.8b00087.