The literature folder contains processed data from the Lens. The data consists of:

- antarctic_affiliation_edited.csv.bz2: A bzip2-compressed file containing the available Microsoft Academic Graph (MAG) affiliation data. The table has been edited to improve the coverage of affiliations per paper, using the original author affiliation data where available. The join field is paperid.
- antarctic_authors.csv.bz2: A table of author information from Microsoft Academic Graph (January 2019 release). The join field is paperid.
- fos.csv: The Fields of Study table of article labels from Microsoft Academic Graph (January 2019 release). Use paperid as the join field. A single record may carry more than one label.
- literature.csv: The raw literature table plus additional columns for use as filters.
- literature.rda: The same table as literature.csv, saved for R users.
- textfields.csv: A csv file consisting of Lens identifiers (lens_id) and a joined text field made up of the title, abstract, author keywords, fields of study and MeSH fields, converted to lowercase for text mining. The separator between the parts of the joined field is "_". Be aware of junk in the MAG data, such as #R##R, and of na_na_na values produced when empty text fields were united.
- literature_add_filters.R: An R script setting out the processing steps used to add the filters.
- query.csv, query.rda and query_string.txt: The search query used with the Lens. In query_string.txt the query is adapted for use in R, with the OR operator "|" as the separator between terms and word boundaries ("\\b") around phrases.
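As a rough illustration of how these files might be used together, the following Python sketch reads a table, groups its rows on the paperid join field, tidies the "_"-joined text field, and builds a word-bounded OR pattern in the style described for query_string.txt. It is a minimal sketch under stated assumptions, not the processing used to build the dataset (that is set out in literature_add_filters.R). The function names are hypothetical, as are the column names other than paperid and the example terms. The junk pattern assumes that the MAG debris takes the form of markers such as #R#, #N# or #TAB#, which may not cover every case.

```python
import bz2
import csv
import re
from collections import defaultdict


def read_rows(path):
    """Read a .csv or .csv.bz2 file into a list of dict rows."""
    opener = bz2.open if str(path).endswith(".bz2") else open
    with opener(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def group_by_paperid(rows, key="paperid"):
    """Group rows on the join field; one paper can have several rows
    (for example several authors or several Fields of Study labels)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return grouped


def clean_joined_text(text):
    """Tidy a "_"-joined text field.

    Assumption: junk looks like #R#, #N# or #TAB# (possibly truncated),
    and empty fields appear as the literal placeholder "na".
    """
    text = re.sub(r"#(?:r|n|tab)#?", " ", text, flags=re.IGNORECASE)
    parts = [part.strip() for part in text.split("_")]
    kept = [part for part in parts if part and part != "na"]
    # Collapse the runs of whitespace left behind by the removed junk.
    return " ".join(" ".join(kept).split())


def build_query_pattern(terms):
    """Join terms with '|' and wrap each one in word boundaries."""
    return re.compile("|".join(r"\b" + re.escape(term) + r"\b" for term in terms))


if __name__ == "__main__":
    # Hypothetical in-memory rows standing in for antarctic_authors.csv.bz2;
    # the "author" column name is an assumption.
    authors = [
        {"paperid": "1", "author": "first author"},
        {"paperid": "1", "author": "second author"},
        {"paperid": "2", "author": "third author"},
    ]
    by_paper = group_by_paperid(authors)
    print(len(by_paper["1"]), "author rows for paper 1")

    # Illustrative terms only; the real query is in query_string.txt.
    pattern = build_query_pattern(["antarctic", "southern ocean"])
    text = clean_joined_text("sea ice in the southern ocean #R##R_na_na")
    print(text, bool(pattern.search(text)))
```

Grouping on paperid rather than assuming one row per paper reflects the note above that a single record may carry more than one label. With real files, `read_rows("fos.csv")` or `read_rows("antarctic_authors.csv.bz2")` would supply the rows, provided the files are UTF-8 encoded, which is also an assumption.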