This folder contains taxonomic data for Antarctica and the Southern Ocean. If you use these datasets, please cite them using the [GBIF citation file](https://osf.io/85byh/). The taxonomy data consists of four files.
- [gbif_antarctica](https://osf.io/x3cvh/download). GBIF occurrence records for Antarctica. This dataset combines two occurrence datasets. The first consists of locality information linked to country code AQ. The second consists of occurrence records at latitude 60 degrees South or further south. The locality data has been cleaned using SCAR place names, with Geonames place names used in support. Records with a valid Antarctic locality (`has_locality`) are flagged in the `antarctic_species` column of the other datasets for the scientific literature and patents. See R/gbif.R for details of the processing.
- [lit_taxonomy](https://osf.io/htxaf/download). A table of taxonomic names (uninomial or binomial) extracted by text mining the metadata fields of the scientific literature. The names are then mapped to gbif_antarctica to mark `antarctic_species`.
- [pat_taxonomy](https://osf.io/7z68h/download). The equivalent table for the patent data, based on text mining of the available full texts and mapped to gbif_antarctica in the same way.
- [gbif_citation](https://osf.io/85byh/). Please use this file to reference the datasets to give due credit to the data providers.
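
The relationship between the files can be illustrated with a minimal Python sketch. It is not the processing in R/gbif.R. The field names (`country_code`, `latitude`, `species`), the function names, the lower-case name matching, and the sample records are all assumptions made up for illustration; only `has_locality` and `antarctic_species` come from the descriptions above. Check the real column names in the downloaded files before adapting it.

```python
# Illustrative sketch only: field names other than has_locality and
# antarctic_species are hypothetical, and the sample records are invented.

def is_antarctic_record(record):
    """Return True if a record has country code AQ or lies at or south of 60 S."""
    if record.get("country_code") == "AQ":
        return True
    lat = record.get("latitude")
    return lat is not None and lat <= -60.0


def antarctic_names(records):
    """Collect normalised names from Antarctic records with a valid locality."""
    return {
        r["species"].strip().lower()
        for r in records
        if r.get("species") and r.get("has_locality") and is_antarctic_record(r)
    }


def mark_antarctic_species(names, reference):
    """Flag each extracted name that appears in the Antarctic reference set."""
    return [
        {"name": n, "antarctic_species": n.strip().lower() in reference}
        for n in names
    ]


# Invented occurrence records standing in for gbif_antarctica.
sample_records = [
    {"species": "Pygoscelis adeliae", "country_code": "AQ",
     "latitude": -77.5, "has_locality": True},
    {"species": "Aptenodytes forsteri", "country_code": None,
     "latitude": -66.0, "has_locality": False},
    {"species": "Passer domesticus", "country_code": "GB",
     "latitude": 51.5, "has_locality": False},
]

reference = antarctic_names(sample_records)
# Names as they might appear in lit_taxonomy or pat_taxonomy.
flagged = mark_antarctic_species(
    ["Pygoscelis adeliae", "Passer domesticus"], reference
)
for row in flagged:
    print(row["name"], row["antarctic_species"])
```

Exact matching on a lower-cased name is the simplest possible join; real taxonomic names often need synonym and authorship handling, which this sketch does not attempt.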