**About the CHITA Data Model and Data Dictionary**

Data models structure and organize the data elements (fields) and the necessary information (metadata) about a data set to facilitate computational analyses. We can think of a data model as a series of connected tables of information. For cell-based assays or in vitro NAMs data, a vast amount of information could be collected to describe the data set, particularly if one wishes to fully describe protocol details and reproduce experiments. Our goal is more focused: to capture the information that research biologists need to find assays and to use and understand the resulting data. Our data model reflects this objective.

For in vitro NAMs data, our data model includes fields that describe (1) the assay in its entirety, which is the combination of the assay system and the endpoint measured (e.g., assay source, analysis methods, biological relevance, tissue relevance, etc.), (2) the assay system (e.g., cell types, conditions, time point, culture methods, etc.), (3) the endpoint measurement (e.g., measurement type and, if relevant, gene or protein identifiers, etc.), (4) test agents, and (5) experimental results.

**Key tables**

• Assay Information
• Assay System Information
• Assay Endpoint Information
• Test Agent Information
• Experimental Results Information

**The CHITA Data Dictionary**

Data dictionaries list the terms (e.g., field or table column headers), definitions, and information types (string, text, numeric, etc.) used in the data model. The data model incorporates these terms and, in a relational (SQL) database for example, organizes them in tables with a structure that describes how the tables are connected to one another. Note that the current data dictionary does not include database field labels, which have limited human readability.

**The FAIR approach**

Data that meet the FAIR data principles are findable, accessible, interoperable, and reusable. In developing our data dictionary and constructing our data model, we therefore want to follow data governance best practices. This includes the use of unique identifiers and registries for terms that have been standardized, such as gene IDs. Public ontologies, such as the BioAssay Ontology, Cellosaurus, and BRENDA, are good sources for such terms. We can also take advantage of related work on assay metadata templates, such as the Data FAIRy template developed by the Pistoia Alliance (Makarov, 2024).

**A work in progress**

The current model was designed for a pilot set of assays and data from diverse in vitro NAMs assay types. We expect this data model to evolve over time to accommodate new assay types and input from additional stakeholders and potential users of the CHITA database.
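
**An illustrative sketch**

To make the idea of "connected tables" and a data dictionary more concrete, the following is a minimal Python sketch using the standard-library `sqlite3` module. It is not the CHITA schema: every table name, column name, dictionary entry, and inserted value is a hypothetical placeholder chosen for illustration. It assumes one row per assay that points to one assay system and one endpoint, and one row per experimental result that points to an assay and a test agent.

```python
import sqlite3

# Hypothetical data dictionary: term -> (definition, information type).
# These terms are illustrative only; they are NOT the CHITA field names.
DATA_DICTIONARY = {
    "assay_source": ("Organization or lab that provides the assay", "string"),
    "cell_type": ("Cell type used in the assay system", "string"),
    "time_point_hours": ("Exposure duration before measurement", "numeric"),
    "measurement_type": ("Kind of readout recorded for the endpoint", "string"),
    "gene_id": ("Registry identifier for a measured gene, if relevant", "string"),
}

# Five hypothetical tables mirroring the five key table groups.
# Foreign keys express how the tables are connected to one another.
SCHEMA = """
CREATE TABLE assay_system (
    system_id INTEGER PRIMARY KEY,
    cell_type TEXT NOT NULL,
    time_point_hours REAL
);
CREATE TABLE assay_endpoint (
    endpoint_id INTEGER PRIMARY KEY,
    measurement_type TEXT NOT NULL,
    gene_id TEXT
);
CREATE TABLE assay (
    assay_id INTEGER PRIMARY KEY,
    assay_source TEXT,
    system_id INTEGER NOT NULL REFERENCES assay_system(system_id),
    endpoint_id INTEGER NOT NULL REFERENCES assay_endpoint(endpoint_id)
);
CREATE TABLE test_agent (
    agent_id INTEGER PRIMARY KEY,
    agent_name TEXT NOT NULL
);
CREATE TABLE experimental_result (
    result_id INTEGER PRIMARY KEY,
    assay_id INTEGER NOT NULL REFERENCES assay(assay_id),
    agent_id INTEGER NOT NULL REFERENCES test_agent(agent_id),
    result_value REAL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one placeholder row per table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the table connections
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO assay_system VALUES (1, 'placeholder cell type', 24.0)")
    conn.execute("INSERT INTO assay_endpoint VALUES (1, 'placeholder readout', NULL)")
    conn.execute("INSERT INTO assay VALUES (1, 'placeholder source', 1, 1)")
    conn.execute("INSERT INTO test_agent VALUES (1, 'placeholder agent')")
    conn.execute("INSERT INTO experimental_result VALUES (1, 1, 1, 0.5)")
    conn.commit()
    return conn


def results_with_context(conn: sqlite3.Connection) -> list:
    """Join each result to its test agent, assay system, and endpoint."""
    query = """
        SELECT t.agent_name, s.cell_type, e.measurement_type, r.result_value
        FROM experimental_result AS r
        JOIN assay AS a ON a.assay_id = r.assay_id
        JOIN assay_system AS s ON s.system_id = a.system_id
        JOIN assay_endpoint AS e ON e.endpoint_id = a.endpoint_id
        JOIN test_agent AS t ON t.agent_id = r.agent_id
        ORDER BY r.result_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in results_with_context(build_demo_db()):
        print(row)
```

Keeping the assay system and endpoint in their own tables means a result row only needs identifiers, and a single join recovers the biological context a researcher would need to interpret the value. In a real deployment, free-text placeholders such as the cell type would presumably be replaced by identifiers from public registries, in line with the FAIR approach described above.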