Scripts, data and output to reproduce "Addressing the Challenges of Reconstructing Systematic Reviews Datasets: A Case Study and a Noisy Label Filter Procedure"

doi:10.17605/OSF.IO/PJR97



Software used: Zotero (managing RIS-files, list of records), ASReview (NLF procedure), ASReview Makita (simulation study), ASReview Datatools (deduplication and labeling).

1. Reconstruction
   - The document ‘*reconstructed_data_after_reproducting_search_queries.ris*’ contains the list of records after exporting but before deduplication. In other words, these are the data after exporting the relevant search queries, before any correction was performed.
   - In the document ‘*reconstructed_data_after_quality_check_2.ris*’, the reconstructed data have been adjusted following the steps ‘quality check 1, deduplication, and quality check 2’. This means the data are deduplicated, the initially relevant records are labeled as such (‘ASReview_relevant’), and all the other records (the noisy labels) are labeled as ‘ASReview_not_seen’ (by the *ASReview datatools* script referred to in the paper).
2. Applying NLF procedure
   - The document ‘*nlf_procedure_test_trimbos.asreview*’ can be opened in ASReview. The NLF procedure was performed in ASReview, and the ASReview file was exported.
   - In the document ‘*reconstructed_data_after_NLF_procedure.ris*’, following the results of the NLF procedure, all noisy labels were labeled as irrelevant. The relevant records are now labeled as ‘ASReview_relevant’ and the irrelevant records as ‘ASReview_irrelevant’.
3. Simulation
   - The folder ‘*Simulation_Makita*’ contains all relevant data concerning the simulation study. For example, one can see the list of records used, the ASReview statistics and a graph in which different simulation study modes can be compared.

Information about steps from NLF procedure to Simulation Study:

Following the classification of scenario 2d in the NLF procedure’s methods section, one simulation study was conducted with n=20 relevant records and n=1033 irrelevant records.
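As a quick sanity check on the label counts mentioned above, the following minimal sketch tallies ASReview label strings per record in RIS text. It assumes only that each record ends with an `ER  -` line (standard RIS) and that the label string appears somewhere inside the record. The function name `count_labels`, the placement of the label in an `N1` field, and the sample records are illustrative assumptions, not taken from the deposited files.

```python
from collections import Counter

# Label strings as named in this wiki page
LABELS = ("ASReview_relevant", "ASReview_irrelevant", "ASReview_not_seen")


def count_labels(ris_text):
    """Count records per label; records without a known label count as 'unlabeled'."""
    counts = Counter()
    record_lines = []
    for line in ris_text.splitlines():
        record_lines.append(line)
        if line.startswith("ER  -"):  # end-of-record tag in RIS
            record = "\n".join(record_lines)
            label = next((name for name in LABELS if name in record), "unlabeled")
            counts[label] += 1
            record_lines = []
    return counts


# Hypothetical sample; storing the label in N1 is an assumption for illustration
sample = """TY  - JOUR
TI  - Hypothetical record A
N1  - ASReview_relevant
ER  -
TY  - JOUR
TI  - Hypothetical record B
N1  - ASReview_irrelevant
ER  -
TY  - JOUR
TI  - Hypothetical record C
ER  -
"""

print(count_labels(sample))
```

Applied to the text of ‘reconstructed_data_after_NLF_procedure.ris’, a tally of 20 relevant and 1033 irrelevant records would be expected if the file matches the description above.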
Before the simulation study could start, the reconstructed dataset had to be adapted so that all noisy labels were labeled irrelevant. This was done by running the following command in the command line interface (requires ASReview Datatools (De Bruin et al., 2022)):

`asreview data compose output.ris -l "na_dedup_na2ededup.ris" -i "na_dedup_na2ededup.ris"`

The meaning of this command:

- ‘compose’ assembles a dataset with different labels into a new, single dataset (De Bruin et al., 2022).
- ‘-l’ means that the existing labels from the previous dataset are used first. In this case, the relevant records were already labeled as relevant, so in the new dataset these records are labeled relevant too.
- ‘-i’ means that all records should be labeled irrelevant. However, because ‘-l’ comes before ‘-i’ in the command, existing labels are favored over labeling all records as irrelevant.

In this way, all unlabeled items (noisy labels) became labeled as irrelevant.

Data were extracted from this original study (Oud et al., 2018): https://doi.org/10.1177/0004867418791257
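The label precedence described for the compose command can be illustrated with a minimal sketch: existing labels are kept, and every record without one falls back to irrelevant. This is not the ASReview Datatools implementation; the function `compose_labels`, the record identifiers and the label strings are hypothetical and only mirror the behavior described on this page.

```python
def compose_labels(records, existing_labels, default_label="irrelevant"):
    """Keep existing labels; give every record without one the default label."""
    composed = {}
    for record_id in records:
        # An existing label takes priority over the blanket default
        composed[record_id] = existing_labels.get(record_id, default_label)
    return composed


# Hypothetical data: one record already labeled relevant, two noisy labels
records = ["rec1", "rec2", "rec3"]
existing = {"rec1": "relevant"}

result = compose_labels(records, existing)
print(result)
# {'rec1': 'relevant', 'rec2': 'irrelevant', 'rec3': 'irrelevant'}
```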
