# Preprocessing systematic review data of Brouwer et al. (2019) for use in ASReview
This repository contains the steps and scripts used to produce a single dataset from the records that were screened in subsets as part of the
systematic review conducted by Brouwer et al. (2019).
The raw files used to produce the dataset are stored in `/raw`.
The dataset is spread out in different subsets:
- Additional search (update of searches at the end of the review process)
We went through the following steps to build the dataset:
## Retrieve abstracts
Some records in the raw data did not contain an abstract. We imported the `bib` files into an EndNote library to retrieve missing abstracts.
The following steps were taken for each `bib` file:
1. Import the `bib` file into EndNote.
2. Sort the records on the abstract field and select the records without abstracts.
3. Update the references without abstracts (update empty fields only).
4. Export as a .ris file to the folder `raw/1 - retrieve abstracts/ris-with-abstracts`.
See also `raw/1 - retrieve abstracts/datasets bockting.xlsx` for a log.
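The result of the EndNote step can be checked with a short script. The following is a minimal sketch, not part of the original workflow, that lists the records in an exported RIS file that still lack an abstract. It assumes the abstract is stored under the `AB` or `N2` tag and the title under `TI` or `T1`; the function names and the sample records are hypothetical.

```python
def split_ris_records(text):
    """Split RIS text into records; each record is a list of (tag, value) pairs."""
    records, current = [], []
    for line in text.splitlines():
        # A tagged RIS line looks like "TY  - JOUR": 2-character tag, 2 spaces, hyphen.
        # Untagged continuation lines are ignored in this sketch.
        if len(line) >= 5 and line[2:5] == "  -":
            tag, value = line[:2], line[5:].strip()
            if tag == "ER":  # end-of-record marker
                records.append(current)
                current = []
            else:
                current.append((tag, value))
    return records


def records_without_abstract(text):
    """Return the titles of all records that have no non-empty abstract field."""
    missing = []
    for record in split_ris_records(text):
        has_abstract = any(value for tag, value in record if tag in ("AB", "N2"))
        if not has_abstract:
            titles = [value for tag, value in record if tag in ("TI", "T1")]
            missing.append(titles[0] if titles else "<no title>")
    return missing


# Hypothetical sample data, for illustration only.
sample = """TY  - JOUR
TI  - Study with abstract
AB  - Some abstract text.
ER  - 
TY  - JOUR
TI  - Study without abstract
ER  - 
"""

still_missing = records_without_abstract(sample)
print(still_missing)
```

For a real file, the text would be read from one of the exports in `raw/1 - retrieve abstracts/ris-with-abstracts` and passed to `records_without_abstract`.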
## Search and label inclusions
The raw files do not contain information on whether a record is relevant or not. Therefore, the data had to be manually inspected to identify the relevant records.
We then imported the `RIS` files into Zotero. For each of the subsets, it was necessary to check which of the relevant records are present in the respective subset. The appendix from the original systematic review contained the reference list of relevant records.
If a relevant record was found in a subset, the 'Tag' functionality of Zotero was used to assign a label "Included" to the record. A log of the process can be found in the folder `raw/2 - detect inclusions/zotero-labeling`.
Three relevant records in the reference list were missing (see
`missing_records.docx`, marked yellow). After consulting Marlies Brouwer, these were manually added to the Zotero library.
The Zotero libraries were exported as separate `.csv` files, to be found in the
`raw/2 - detect inclusions/zotero-export/csv-included-tag` folder. The Zotero library as is can be found in `raw/2 - detect inclusions/zotero-export/rdf-included-tag`.
## Combining the subsets into one dataset
The R-script `scripts/Detect_inclusions2.R` combines all subsets into one dataset and subsequently adds a column `included` to indicate the relevant records in the dataset, and a column `search` to indicate the subset from which the record originates.
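The combining step can be illustrated with a small example. This is a minimal sketch in Python, not a translation of `scripts/Detect_inclusions2.R`. It assumes that each Zotero CSV export has `Title` and `Manual Tags` columns, that multiple tags are separated by semicolons, and that `included` is coded as 1/0; the function name and the sample rows are hypothetical.

```python
import csv
import io


def combine_subsets(subsets):
    """Combine Zotero CSV exports into one list of rows.

    `subsets` maps a subset name (used for the `search` column)
    to the CSV text of that subset's export.
    """
    combined = []
    for search, csv_text in subsets.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            # Assumption: tags are stored in "Manual Tags", separated by ";".
            tags = [t.strip() for t in (row.get("Manual Tags") or "").split(";")]
            combined.append({
                "title": row.get("Title") or "",
                "search": search,
                "included": 1 if "Included" in tags else 0,
            })
    return combined


# Hypothetical sample exports, for illustration only.
behavior_csv = "Title,Manual Tags\nRecord A,Included\nRecord B,\n"
cognitive_csv = "Title,Manual Tags\nRecord C,Included; other tag\n"

rows = combine_subsets({"behavior": behavior_csv, "cognitive": cognitive_csv})
for row in rows:
    print(row)
```

Deriving `search` from the subset name and `included` from the tag keeps the origin of every record traceable after the subsets are merged.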
The output is [`brouwer_2019.csv`], which can be found in the main
directory of the OSF repository.
## Second round of DOI-retrieval and de-duplication
To retrieve missing DOIs, we used the script `scripts/crossref_doi_retrieval.ipynb` from [van den Brand et al. (2021)]. The output is stored in the file [`brouwer_2019_doi_retrieved.xlsx`].
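As an illustration of what a title-based DOI lookup can look like, here is a minimal sketch that queries the public Crossref REST API. It is not the notebook by van den Brand et al. and may differ from it in every detail. The helper names are hypothetical; the sketch assumes the `/works` endpoint with the `query.bibliographic` and `rows` parameters, and a JSON response with the matches under `message.items`.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works"


def build_query_url(title, rows=1):
    """Build a Crossref search URL for a bibliographic title query."""
    params = urllib.parse.urlencode({"query.bibliographic": title, "rows": rows})
    return f"{CROSSREF_WORKS}?{params}"


def first_doi(response):
    """Return the DOI of the first match in a parsed Crossref response, or None."""
    items = response.get("message", {}).get("items", [])
    return items[0].get("DOI") if items else None


def lookup_doi(title):
    """Query Crossref for a title (performs a network request)."""
    with urllib.request.urlopen(build_query_url(title), timeout=30) as resp:
        return first_doi(json.load(resp))


# Offline demonstration with a hand-made response (placeholder DOI, not a real record).
example_url = build_query_url("Psychological theories")
fake_response = {"message": {"items": [{"DOI": "10.1000/placeholder"}]}}
print(example_url)
print(first_doi(fake_response))
```

The top-ranked match is not guaranteed to be the right record, so a retrieved DOI should only be accepted after comparing the returned title with the original one.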
To remove duplicates we ran the script `scripts/master_script_deduplication.R` from [van den Brand et al. (2021)]. The output is stored in the file [`brouwer_2019_deduplicated.xlsx`].
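The general idea of de-duplication can be sketched as follows. This is a simplified illustration, not the logic of `scripts/master_script_deduplication.R`: it treats two records as duplicates when they share a DOI or a normalised title, and keeps the first occurrence. The field names `doi` and `title`, the function names, and the sample records (with placeholder DOIs) are assumptions.

```python
import re


def normalise_title(title):
    """Lower-case a title and collapse everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each DOI or normalised title."""
    seen, unique = set(), []
    for record in records:
        keys = []
        doi = (record.get("doi") or "").strip().lower()
        if doi:
            keys.append(("doi", doi))
        title = normalise_title(record.get("title") or "")
        if title:
            keys.append(("title", title))
        if any(key in seen for key in keys):
            continue  # duplicate of an earlier record
        seen.update(keys)
        unique.append(record)
    return unique


# Hypothetical sample records, for illustration only.
sample_records = [
    {"title": "Theories of Relapse", "doi": "10.1000/ABC"},
    {"title": "A different title", "doi": "10.1000/abc"},   # same DOI, different case
    {"title": "theories of relapse.", "doi": ""},           # same title after normalising
    {"title": "Unique record", "doi": None},
]

unique_records = deduplicate(sample_records)
print([r["title"] for r in unique_records])
```

Matching on titles alone can merge distinct records that happen to share a title, and a real pipeline also has to decide which copy to keep when duplicates carry different labels.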
The number of records in the de-duplicated datafile is:
| search | n_inclusions | n before / after de-duplication |
|---|---|---|
| additional | 7 | 4,259 / 3,766 |
| behavior | 5 | 10,245 / 9,472 |
| cognitive | 25 | 8,369 / 7,266 |
| cognitive_additional | 3 | 5,461 / 4,962 |
| diathesis | 3 | 5,336 / 4,823 |
| missing | 3 | 3 / 3 |
| personality | 16 | 6,624 / 5,944 |
| psychodynamic | 1 | 10,639 / 10,140 |
| grand total | 63 | 50,936 / 46,376 |
The final file is [`brouwer_2019_deduplicated.csv`].
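The grand totals in the table above can be cross-checked against the per-subset counts. The short script below copies the counts from the table and recomputes the sums; the variable names are arbitrary.

```python
# Per subset: (inclusions, records before de-duplication, records after de-duplication),
# copied from the table above.
counts = {
    "additional": (7, 4259, 3766),
    "behavior": (5, 10245, 9472),
    "cognitive": (25, 8369, 7266),
    "cognitive_additional": (3, 5461, 4962),
    "diathesis": (3, 5336, 4823),
    "missing": (3, 3, 3),
    "personality": (16, 6624, 5944),
    "psychodynamic": (1, 10639, 10140),
}

total_inclusions = sum(c[0] for c in counts.values())
total_before = sum(c[1] for c in counts.values())
total_after = sum(c[2] for c in counts.values())
removed = total_before - total_after

print(total_inclusions, total_before, total_after, removed)
# prints: 63 50936 46376 4560
```

The recomputed sums match the grand total row, and the difference shows that 4,560 records were removed as duplicates.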
## Reference to the original study
Brouwer, M. E., Williams, A. D., Kennis, M., Fu, Z., Klein, N. S., Cuijpers, P., & Bockting, C. L. H. (2019). Psychological theories of depressive relapse and recurrence: A systematic review and meta-analysis of prospective studies. Clinical Psychology Review, 74, 101773. https://doi.org/10.1016/j.cpr.2019.101773
This dataset has a CC-BY 4.0 license.
This project is funded by a grant from the Centre for Urban Mental Health, University of Amsterdam, The Netherlands.
For any questions or remarks, please send an email to [Marlies Brouwer](https://orcid.org/0000-0002-9972-9058).