Authority Management of People Names (a working meeting)
========================================================
Biodiversity Next Preconference Workshop
----------------------------------------
Date: Monday October 21st 2019
Participants:
- Quentin Groom (Meise Botanic Garden) https://orcid.org/0000-0002-0596-5376
- Elspeth Haston (Royal Botanic Garden Edinburgh) https://orcid.org/0000-0001-9144-2848
- Anne Thessen (Oregon State University, USA) https://orcid.org/0000-0002-2908-3327
- Anton Güntsch (Free University of Berlin, Germany) https://orcid.org/0000-0002-4325-4030
- Brenda Daly (South African National Biodiversity Institute) https://orcid.org/0000-0002-3732-8339
- Chloé Besombes (National Museum of Natural History, Paris, France)
- Christian Bräuchler (Naturhistorisches Museum Wien) https://orcid.org/0000-0002-6176-1669
- David Shorthouse (AAFC, Ottawa, Canada) https://orcid.org/0000-0001-7618-5230
- Dominik Röpert (Botanic Garden and Botanical Museum Berlin) https://orcid.org/0000-0001-6565-8450
- Frederik Berger (Museum of Natural Science, Berlin, Germany) https://orcid.org/0000-0001-8400-3337
- Heather Lindon (Royal Botanic Gardens, Kew) https://orcid.org/0000-0002-0414-7398
- Iris Sampaio (University of the Azores / Senckenberg am Meer) https://orcid.org/0000-0003-3305-7567
- Jiri Frank (National Museum, Prague, Czech Republic)
- Jonathan Krieger (Royal Botanic Gardens, Kew, UK)
- Laurence Livermore (Natural History Museum, London) https://orcid.org/0000-0002-7341-1842
- Matt Woodburn (Natural History Museum, London) https://orcid.org/0000-0001-6496-1423
- Nicky Nicolson (Royal Botanic Gardens, Kew, UK) https://orcid.org/0000-0003-3700-4884
- Nicole Kearney (Biodiversity Heritage Library, Australia) https://orcid.org/0000-0003-2883-0906
- Paul Braun (National Museum of Natural History, Luxembourg) https://orcid.org/0000-0002-3620-6188
- Rafaël Govaerts (Royal Botanic Gardens, Kew, UK) https://orcid.org/0000-0003-2991-5282
- Robert Cubey (Royal Botanic Garden Edinburgh, UK) https://orcid.org/0000-0001-7902-3843
- Ron Canepa (iDigBio, USA) https://orcid.org/0000-0002-4756-1070
- Rod Page (Glasgow University) https://orcid.org/0000-0002-7101-9767
- Sarah Phillips (Royal Botanic Gardens, Kew, UK) https://orcid.org/0000-0002-9155-8573
- Simon Chagnoux (National Museum of Natural History, Paris, France) https://orcid.org/0000-0002-4210-484X
- Sharif Islam (Naturalis, Netherlands) https://orcid.org/0000-0001-8050-0299
Aims:
Major Challenges identified:
- Multiple identifiers - Wikidata, one ID to rule them all?
- Multiple individuals in record
- Order of names
- Disambiguation
- Transliteration
- The needs and motivations of the authors/collectors/institutions
- Ownership of your ID (for the living) & updating metadata such as affiliations, publications
- Identifiers for living collectors who do not want ORCID or Wikidata
- Volume of unknown people, converting strings to things
- Mobilising existing datasets and resources
Activities:
Break up into smaller task groups. The groups listed below are current suggestions.
**1. Analysis & visualisations (Tech Group)**
- Creating visualisations of data from Wikidata to illustrate the coverage of biological collectors and the links to other identifiers and different data types.
- Number of specimens per linked person
- Number of identifiers per person
- Number of people without identifiers
- Without identifier, but with biographical details
- Totally anonymous people
- Demography of linked versus unlinked people
- Creating visualisations that show the value of connecting collections, collectors and authors with identifiers
- Reveal a collector's travels
- Uncover missteps in digitization of specimens when cross-referenced against external information about people, produce recommendations for data quality filters
- https://docs.google.com/document/d/1oiAYFFKdc46eoOGXtwFGKK5QNweKEFqXZQ4xyKbdunA/edit?usp=sharing
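
As a starting point for the coverage figures listed above, the counts could be tallied before any plotting. The following is a minimal sketch only: the record layout (`name`, `identifiers`, `specimens`, `has_biography`), the placeholder identifier values and the function name `coverage_summary` are all hypothetical, and the sketch reads the two sub-cases as people without identifiers who either do or do not have biographical details, which is an interpretation of the list rather than an agreed definition.

```python
from collections import Counter


def coverage_summary(people):
    """Tally identifier coverage for a list of person records (hypothetical layout)."""
    linked = [p for p in people if p["identifiers"]]
    unlinked = [p for p in people if not p["identifiers"]]
    return {
        # Number of specimens per linked person
        "specimens_per_linked_person": {p["name"]: p["specimens"] for p in linked},
        # Number of identifiers per person, as a histogram {identifier count: people}
        "identifiers_per_person": Counter(len(p["identifiers"]) for p in people),
        # Number of people without identifiers, split into the two sub-cases
        "without_identifiers": len(unlinked),
        "unlinked_with_biography": sum(1 for p in unlinked if p["has_biography"]),
        "totally_anonymous": sum(1 for p in unlinked if not p["has_biography"]),
    }


# Made-up sample data; identifier values are placeholders, not real IDs.
sample = [
    {"name": "Collector A", "identifiers": {"wikidata": "Q_PLACEHOLDER_1", "orcid": "ORCID_PLACEHOLDER"},
     "specimens": 120, "has_biography": True},
    {"name": "Collector B", "identifiers": {"wikidata": "Q_PLACEHOLDER_2"},
     "specimens": 40, "has_biography": True},
    {"name": "Collector C", "identifiers": {}, "specimens": 15, "has_biography": True},
    {"name": "Collector D", "identifiers": {}, "specimens": 3, "has_biography": False},
]

summary = coverage_summary(sample)
```

In practice the records would come from Wikidata or institutional exports rather than a hand-written list; the tallies themselves would stay the same.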
**2. Engagement group**
- Draft a position/opinion piece on why all taxonomists should have an ORCID iD.
- How do we encourage uptake of ORCID IDs among biological collectors and taxonomists?
- How do we encourage the linking and accumulation of biographical details of biological collectors?
- https://docs.google.com/document/d/1n34t9fkjFlJIeQi1msjf0SqpVuxo5RQpKTvE9P4ITrA/edit#heading=h.mwbg5swumzck
**3. Darwin Core and TDWG group**
- Writing a charter for a TDWG Task Group on Person identifiers under the Attribution Interest Group. Defining its rationale and aims.
- Best practices for storing people names
- Collector teams
- Where should identifiers be stored?
- Where should bibliographic details be stored?
- [Extension to Darwin Core Archive][1]: produce definitions for actions (e.g. collected, identified, georeferenced), decide what to do and how to reconcile the relationship with other extensions where people names are recorded (e.g. Darwin Core Identification History)
**4. Paper writing group**
- Developing the introduction to the draft paper titled "Identifiers for people working on biodiversity". Reviewing what has already been published and what approaches other disciplines use to identify people uniquely.
- develop thesis: challenges, solutions, next steps
- identify cultural biases
- reconcile with GDPR
**5. Disambiguation group**
- Drafting a best practice for the disambiguation of people
- How can disambiguation of people be improved? Can disambiguation be automated? Are there suitable algorithms? Are there statistical methods that can indicate the likelihood of a match?
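
To make the question of automated matching concrete, one very simple baseline is a normalised string-similarity score. The sketch below uses only the Python standard library (`unicodedata`, `difflib.SequenceMatcher`); the function names `normalise` and `match_score` are hypothetical, and the score is a similarity heuristic between 0 and 1, not a statistical likelihood. It is offered as an illustration of the problem, not as a recommended method.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case, strip diacritics and punctuation, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in without_marks.lower())
    # Sorting tokens makes "Surname, Forename" and "Forename Surname" compare equal.
    return " ".join(sorted(cleaned.split()))


def match_score(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalised name strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


# Name order and diacritics no longer affect the comparison.
same_order_insensitive = match_score("Groom, Quentin", "Quentin Groom")
same_diacritic_insensitive = match_score("Güntsch, Anton", "Anton Guntsch")
different_people = match_score("Rod Page", "Elspeth Haston")
```

A score like this cannot separate two different people who share a name, and it handles initials, transliteration variants and collector teams poorly. Those cases are where additional evidence such as dates, places and taxa would need to come in.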
**6. [Datasets][3] for Challenges group**
Expanding existing tests and pilots (look at options for including some of this work within SYNTHESYS+)
- BGBM Model: Participating institutes could select their most common collectors and add identifiers
- MNHN Model: Participating institutes could try the protocol within their own collections for top collectors
- Develop a communications plan to approach proprietors of relevant datasets that could be made openly available and mobilized
Prior to workshop:
Increase implementation of Wikidata identifiers in RDF within Stable URI implementations.
### Reading Materials ###
Groom, Q.J., C. O’Reilly, and T. Humphrey. 2014. Herbarium specimens reveal the exchange network of British and Irish botanists, 1856–1932. New Journal of Botany 4: 95–103. https://doi.org/10.1179/2042349714Y.0000000041
Penn, M.G., S. Cafferty, and M. Carine. 2017. Mapping the history of botanical collectors: spatial patterns, diversity, and uniqueness through time. Systematics and Biodiversity 16: 1–13. https://doi.org/10.1080/14772000.2017.1355854
----------
## Timetable ##
09:00-09:30 Introductions
- Analysis & visualisations (Nicole, Íris, Rod, Ron, David, Dominik)
- Disambiguation guidelines (Quentin, Paul, Chloé, Anton, Rob)
- Writing the introduction to the paper (Elspeth, Sarah, Simon, Anton, Anne, Jiri)
10:30-11:00 Coffee Break
11:00-11:10 Regroup/Report
- Analysis & visualisations (Rod, Ron, David, Dominik)
- Disambiguation guidelines (Quentin, Paul, Chloé, Rob)
- Writing the introduction to the paper (Elspeth, Íris, Sarah, Simon, Anton, Anne, Jiri)
12:30-13:30 Lunch
13:30-13:40 Regroup/Report
- Analysis & visualisations (Ron, David, Dominik)
- TDWG Task Group Proposal (Quentin, Íris, Chloé, Paul..)
- Engagement (Nicole, Anne, Jiri)
- Datasets Group (Elspeth, Sarah, Simon, Anton, Rob)
15:00-15:30 Coffee Break
15:30-15:40 Regroup/Report
- Analysis & visualisations (Ron, David, Dominik)
- TDWG Task Group Proposal (Quentin, Íris, Chloé, Paul)
- Engagement (Nicole, Anne, Jiri)
- Datasets Group (Elspeth, Sarah, Simon, Anton, Rob)
16:30-17:00 Wrap-up
- What is the next set of challenges?
[3]: https://docs.google.com/document/d/1k3NWS0cGM3jnLEcXYxP7wA6OxVENxnSBCsgymJX0Obo/edit
[1]: https://github.com/tdwg/attribution/tree/master/dwc