<p><strong>Metadata</strong></p> <p>The metadata is based on existing data description standards, including:</p> <ul> <li><a href="https://figshare.com/articles/Common_Metadata_Elements_for_Cataloging_Biomedical_Datasets/1496573" rel="nofollow">Common Metadata Elements for Cataloging Biomedical Datasets</a></li> <li><a href="http://schema.datacite.org/" rel="nofollow">DataCite Metadata</a></li> <li><a href="http://wiki.datadryad.org/Metadata_Profile" rel="nofollow">Dryad Metadata</a></li> <li><a href="https://www.w3.org/TR/vocab-dcat/" rel="nofollow">W3C Data Catalog Vocabulary</a></li> <li><a href="https://zenodo.org/record/28019-dcat/" rel="nofollow">NIH BioCADDIE Metadata v1</a></li> </ul> <p>All schemas were analyzed and compared, and specific metadata elements were selected based on their relevance and applicability to the datasets described within the data catalog. One of our main goals was to align our metadata with <a href="https://biocaddie.org/about" rel="nofollow">BioCADDIE</a>'s so that, when that system becomes available, metadata transfer to the national data discovery index will be seamless.</p> <p><strong>Data Model</strong></p> <p>In total, our model required 24 separate entities and 54 database tables. Displaying a full Dataset record requires at least 20-30 database calls, as reported by Symfony’s built-in debugging tools. The Doctrine Object-Relational Mapping (ORM) tool that ships with Symfony (<a href="http://www.doctrine-project.org/" rel="nofollow">http://www.doctrine-project.org/</a>) handles most of the database communication behind the scenes. At the database level, the built-in MySQL query cache speeds up a repeat query by more than half in our case. The Doctrine ORM also maintains its own query cache on the web server, as well as a result cache and a metadata cache.</p>
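<p>To illustrate why a result cache reduces the number of database calls for repeat requests, here is a minimal sketch in Python. It is an analogy only, not Doctrine's API or this project's code: the <code>CachedQueryRunner</code> class, the <code>dataset</code> table, and the in-memory SQLite database are all hypothetical stand-ins chosen to keep the example self-contained.</p>

```python
import sqlite3


class CachedQueryRunner:
    """Toy result cache: repeat queries are served from memory, not the database.

    Hypothetical illustration only; this is not how Doctrine is implemented.
    """

    def __init__(self, connection):
        self.connection = connection
        self.result_cache = {}   # maps (sql, params) -> fetched rows
        self.database_calls = 0  # counts queries that actually hit the database

    def fetch_all(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self.result_cache:
            # Cache miss: run the query once and remember the rows.
            self.database_calls += 1
            self.result_cache[key] = self.connection.execute(sql, params).fetchall()
        return self.result_cache[key]


def build_demo_runner():
    # Assumption: a single-table stand-in for the real 54-table schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO dataset (title) VALUES (?)", ("Example dataset",))
    conn.commit()
    return CachedQueryRunner(conn)


runner = build_demo_runner()
query = "SELECT id, title FROM dataset WHERE id = ?"
first = runner.fetch_all(query, (1,))   # goes to the database
second = runner.fetch_all(query, (1,))  # served from the in-memory cache
```

<p>In this sketch the second call never reaches the database, so <code>runner.database_calls</code> stays at 1. A real cache also needs an invalidation strategy (for example, a lifetime per entry) so that edited records are not served stale; the sketch leaves that out.</p>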
