**Ontologies for Psychological Science**
----------------------------------------
**Ontology-related sessions at SIPS 2019**
There were three explicitly ontology-related sessions at SIPS: An unconference on ontologies for psychological science[1], an unconference bringing together projects working on metadata, ontologies, and standards[2], and a hackathon dedicated to developing an approach to collaborative ontology development[3]. A number of other sessions touched on potential uses of ontologies, e.g. in the standardisation of datasets (psych-DS) and analysis reporting (Scienceverse).
***Uses of ontologies***
The unconference on ontologies for psychological science identified a number of uses for ontologies, including:
- Monitor the use of measures (for same and different constructs) and identify patterns in the use of constructs, methods, and measures
- Monitor changes in the field (in methods, use of measures, research questions, etc.)
- Identify and remove redundancies and contradictions in measures
- Identify inconsistencies in theories and contradictions arising when theories are extended
- Identify mismatches between theories and measures
- Identify close replications of own and others’ work in the literature
- Synthesise research, e.g. facilitating meta-analyses
- In publishing, providing a machine-readable version of articles alongside the standard article[4]
- In teaching, e.g. to help students find constructs and measures
***Challenges in developing and using ontologies***
The unconference also identified a number of challenges to the development and adoption of ontologies in psychology. On one hand, researchers may disagree with classifications made in an ontology, meaning that the ontology may not serve to facilitate communication between researchers. There is no way to force researchers to accept any particular ontology. As a consequence, promises to resolve theoretical disagreements with the help of ontologies may be far-fetched. On the other hand, ontologies risk reifying constructs. At worst, the ontology becomes dogma, and constructs not contained in an ontology are deemed beyond the boundaries of scholarly inquiry. Both challenges require ontology developers to seek external input, both in initially defining constructs and in expanding and updating the ontology as the field develops.
**Related projects**
***Ontology projects at SIPS***
Present at SIPS were members of the Cooperation Databank (CoDa)[5] and Human Behaviour Change Project[6,7] teams. Another ontology project that was brought up is the Cognitive Atlas[8].
***Related projects at SIPS***
Among a number of ontology-related projects at SIPS, two provide direct connection points with ontologies such as the CoDa ontology. These are psych-DS[9,10], a project aimed at standardising the reporting of datasets, and Scienceverse[11], which seeks to standardise the reporting of analysis pipelines.
*psych-DS*
psych-DS seeks to define a standard for formatting and documenting (scientific) datasets. It combines standards for formatting spreadsheets and data dictionaries with folder structure and metadata. As an end product, it makes data available in a standardised, computer-readable format and findable on the web.
psych-DS standardises the formatting of both spreadsheets and data dictionaries. This includes standardising the naming pattern of variables. However, the project by and large does not seek to specify a dictionary relating variable names to constructs (i.e., researchers would be free to use any variable names, as long as they fit the specified format). There are some plans to specify the naming of very commonly used variables, typically demographic ones.
psych-DS could benefit from integration with an ontology that links variable names in a given data dictionary to a definition of constructs. This could be achieved in a modular way, i.e. by letting users specify any ontology in the metadata.
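The modular linking described above might look roughly like the following sketch. The keys `ontology` and `construct`, the variable names, and the ontology URL are all hypothetical placeholders chosen for illustration; none of them are defined by psych-DS or by any existing ontology.

```python
import json

# Hypothetical data dictionary. The keys "ontology" and "construct" are
# illustrative assumptions, not fields defined by psych-DS.
data_dictionary = {
    "ontology": "https://example.org/ontology/",  # placeholder URL
    "variables": [
        {"name": "contribution", "construct": "cooperation"},
        {"name": "age", "construct": None},  # no construct linked
    ],
}


def construct_links(dictionary):
    """Map each variable name to the URL of its construct, if one is given."""
    base = dictionary["ontology"]
    return {
        var["name"]: base + var["construct"]
        for var in dictionary["variables"]
        if var["construct"] is not None
    }


links = construct_links(data_dictionary)
print(json.dumps(links, indent=2))
```

Because the ontology is named in the metadata rather than fixed by the standard, a dataset could point to a different ontology by changing a single field.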
CoDa could benefit from psych-DS by using its data reporting standard for data contained in the databank. In particular, data extracted from the databank could be output in a format specified by psych-DS. This would mean that CoDa would adopt the formatting standards (including naming patterns) and the metadata standard from psych-DS.
*Scienceverse*
Scienceverse seeks to define a standard for reporting analysis pipelines (as a first among other goals).
Because Scienceverse draws on a lexicon, it may benefit from integration with ontologies that specify statistical and methodological constructs, such as the statistical methods ontology Stato[13]. It could also be linked to ontologies for different substantive domains (such as the CoDa and HBC ontologies), e.g. in specifying hypotheses. This could be achieved in a modular manner given appropriate APIs.
CoDa could benefit from Scienceverse by using its reporting standard for documenting analysis steps users take when running meta-analyses on the platform. This would allow CoDa to provide a standardised, reproducible description of any analysis conducted on the platform.
*Decentralized Construct Taxonomies*
DCTs are a standard for describing any construct in a way that does not require central oversight yet enables unequivocal reference to a given construct specification. A DCT specifies:
1. Construct label
2. Construct definition
3. Instructions for developing a measurement instrument
4. Instructions for coding measurement instrument (e.g. for reviews)
5. Instructions for developing a manipulation
6. Instructions for coding manipulations (e.g. for reviews)
7. Instructions for eliciting 'aspects' (construct 'content' in a given context/population/etc.)
8. Instructions for coding aspects (e.g. for qualitative research)
These are in a way the opposite of centrally curated ontologies, but aim to serve partly the same goal: achieving clear communication. In addition, construct definitions as included in ontologies can be expressed as DCTs, and vice versa.
See [here](https://r-packages.gitlab.io/dct/articles/decentralized-construct-taxonomies.html) for more information.
(Also see [this thread](https://twitter.com/matherion/status/1151066031852597248) on Twitter)
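As a rough illustration of the eight elements listed above, a construct specification could be held in a small record like the one below. This is a sketch under stated assumptions: the field names are invented here, and the content-hash identifier is only one conceivable way to get unambiguous references without a central registry. It is not the identifier scheme used by the DCT standard or the `dct` R package.

```python
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass(frozen=True)
class ConstructSpec:
    """Hypothetical container for the eight parts of a construct specification."""
    label: str
    definition: str
    measure_dev: str = ""      # instructions for developing a measurement instrument
    measure_code: str = ""     # instructions for coding measurement instruments
    manipulate_dev: str = ""   # instructions for developing a manipulation
    manipulate_code: str = ""  # instructions for coding manipulations
    aspect_dev: str = ""       # instructions for eliciting aspects
    aspect_code: str = ""      # instructions for coding aspects

    def identifier(self) -> str:
        # Assumption: derive the identifier from the content itself, so that
        # identical specifications share an identifier and any change to the
        # specification yields a new one.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()[:8]
        slug = "".join(ch for ch in self.label.lower() if ch.isalnum())
        return f"{slug}_{digest}"


spec_a = ConstructSpec(label="Trust", definition="Placeholder definition A.")
spec_b = ConstructSpec(label="Trust", definition="Placeholder definition B.")
print(spec_a.identifier(), spec_b.identifier())
```

Two researchers using the label "Trust" with different definitions would obtain different identifiers, so a reference always points to exactly one specification.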
*Projects at an early conceptualization stage*
Willem Sleegers (Tilburg University) is planning to develop an ontology of statistical tests in order to implement this knowledge in R packages and to properly differentiate statistics coming from different analysis techniques.
CoDa could benefit from such an effort because we currently do not have a refined coding for statistical tests (e.g., we do not differentiate t statistics coming from different tests). Such a coding would help us calculate effect sizes.
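To see why the type of test matters for effect size calculation, consider the standard conversions from a t statistic to a standardised mean difference. The sketch below is illustrative only: the labels `"independent"` and `"paired"` are hypothetical and do not reflect CoDa's coding scheme. It assumes a Student's (pooled-variance) t test in the independent case, and in the paired case it returns d_z, which is standardised by the standard deviation of the difference scores.

```python
import math


def t_to_d(t, test, n1, n2=None):
    """Convert a t statistic to a standardised mean difference.

    test: "independent" (two groups of size n1 and n2, pooled variance)
          or "paired" (n1 pairs; returns d_z).
    """
    if test == "independent":
        if n2 is None:
            raise ValueError("n2 is required for an independent-samples test")
        return t * math.sqrt(1 / n1 + 1 / n2)
    if test == "paired":
        return t / math.sqrt(n1)
    raise ValueError(f"unknown test type: {test}")


# The same t value implies different effect sizes depending on the test.
d_between = t_to_d(2.0, "independent", n1=50, n2=50)
d_within = t_to_d(2.0, "paired", n1=50)
print(d_between, d_within)
```

Without a record of which test produced a given t, there is no way to choose between these formulas.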
***Common standards and APIs***
During the unconference on metadata, ontologies, and standards, it became clear that many projects would benefit from (a) choosing (shared) metadata standards and (b) providing APIs allowing them to be linked to other projects.
RDF (Resource Description Framework)[14] has been proposed as one candidate metadata standard, i.e. a format in which metadata can be represented.
Some projects (such as psych-DS) already use schema.org[15] as a metadata standard. This is particularly helpful for making datasets searchable on the web. As another example in psychology, DataWiz[16] provides guidelines for storing research data in a standardised form and for choosing metadata.
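For readers unfamiliar with schema.org metadata, the following sketch builds a small dataset description and serialises it as JSON-LD, one of the serialisations of RDF. `Dataset`, `variableMeasured`, and `PropertyValue` are schema.org terms; the dataset name, description, and variable names are placeholders, and the selection of fields is an assumption rather than the set required by psych-DS.

```python
import json

# Minimal schema.org description of a hypothetical dataset.
dataset_metadata = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example study dataset",  # placeholder
    "description": "Placeholder description of a hypothetical dataset.",
    "variableMeasured": [
        {"@type": "PropertyValue", "name": "contribution"},
        {"@type": "PropertyValue", "name": "age"},
    ],
}

# Serialise to JSON-LD text, e.g. for inclusion in a metadata file.
json_ld = json.dumps(dataset_metadata, indent=2)
print(json_ld)
```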
**References**
[1] https://osf.io/4b2hu/
[2] https://osf.io/c9txy/
[3] https://osf.io/5fujp/
[4] https://www.addictionpat.org/
[5] https://amsterdamcooperationlab.com/databank/
[6] https://www.humanbehaviourchange.org/
[7] https://osf.io/86m75/
[8] https://www.cognitiveatlas.org
[9] https://github.com/psych-ds/psych-DS
[10] https://docs.google.com/document/d/1lD7554E99mkMdy-STFqkfCvtqMRzX-bBwIAwXxnm2iY/edit
[11] https://scienceverse.github.io/scienceverse/
[12] https://docs.google.com/document/d/1DKhnypsG__XG9k_16smU3IJDYGgnxFP5LHw4P6Qh50g/edit
[13] http://www.obofoundry.org/ontology/stato.html
[14] https://www.w3.org/TR/PR-rdf-syntax/
[15] http://schema.org/
[16] https://datawizkb.leibniz-psychology.org/