Training materials about developing and applying data specifications

cidgoh/specification-training

Specification Training Materials (notes)

Last Update: 2023-12-15 Rhiannon & Charlie

Training materials about developing and applying data specifications. Many of the examples use ontologies, but people shouldn't be limited to using them; the underlying principles should be reused either way.

Where to Start

  • The Rules (not just for standardizing terms, but across all stages/modules)
    • How they apply to the challenges
    • How are they FAIR / OBO compatible
    • Knowing the above, how we will use them to develop PHA4GE registry metrics for different results / scenarios
  • The different stages of the "spec pipeline"
    • Broken up into modules
    • Rules are reused across the modules, but the "hands-on" exercises/guidance will differ.
    • Rhiannon has focused a lot of energy on the "standardizing fields and terms" parts of the pipeline, but Charlie can take over diving in and drafting the manuscript (based on Emma's outline and collaboration) for improving data descriptions.

Considerations

  • What is a specification?
  • Starting Resources - i.e. what are the inputs/outputs of different modules?
  • Data need assessment
  • Data descriptions (which is a whole lesson/module in its own right)
  • Going from data descriptions to tabular data
  • Forming your fields, which are informed by the fields
    • You need this understanding before you can map to, and compare against, existing standards.
  • Standardize your terms (based on the 10 simple rules paper)
    • Many of these should be broken into sublessons, as individual rules cover several concepts.
  • The importance of consultation and consensus

Having the individual lessons map to a master list of challenges and rules (and the associations between them, which are not one-to-one) informs not only how we build the training materials but also the metrics of evaluation, since these rules/challenges are tied to FAIR and ontological principles. When we apply the rules, we know which part of FAIR / OBO is being applied and can use this to help us define registry metrics.
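As a rough illustration of that many-to-many mapping, here is a minimal Python sketch. Everything in it is an assumption made for the example: the challenge names are placeholders, the rule-to-FAIR assignments are invented, and `invert` and `fair_coverage` are hypothetical helpers. The real associations still need to be worked out.

```python
from collections import defaultdict

# Hypothetical associations, for illustration only. These notes do not
# define the actual mapping between rules, challenges, and FAIR components.
RULE_TO_CHALLENGES = {
    "Rule 1: Reuse Existing Terms": ["challenge A", "challenge B"],
    "Rule 6: Use IRIs": ["challenge B"],
    "Rule 7: Provide Definitions": ["challenge C"],
}
RULE_TO_FAIR = {
    "Rule 1: Reuse Existing Terms": ["Interoperable", "Reusable"],
    "Rule 6: Use IRIs": ["Findable"],
    "Rule 7: Provide Definitions": ["Reusable"],
}


def invert(mapping):
    """Invert a one-to-many mapping, e.g. rules -> challenges into challenges -> rules."""
    inverse = defaultdict(list)
    for key, values in mapping.items():
        for value in values:
            inverse[value].append(key)
    return dict(inverse)


def fair_coverage(applied_rules):
    """Count how many of the applied rules touch each FAIR component."""
    counts = defaultdict(int)
    for rule in applied_rules:
        for component in RULE_TO_FAIR.get(rule, []):
            counts[component] += 1
    return dict(counts)


challenge_to_rules = invert(RULE_TO_CHALLENGES)
coverage = fair_coverage(["Rule 1: Reuse Existing Terms", "Rule 7: Provide Definitions"])
```

Keeping the associations in plain lookup tables means the same data could drive both the lesson outline (which rules address which challenge) and a simple registry metric (which FAIR components a specification's applied rules cover).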

For reference, the 10 simple rules for standardizing your terms:

Note: many, if not all, of these rules also apply to developing fields. So perhaps the rules need to be simplified to be more reusable across both field and term components, with narrower examples provided in the hands-on cases for fields and terms.

  • Rule 1: Reuse Existing Terms
  • Rule 2: Keep Concepts Simple & Specific
  • Rule 3: Use Technical (not Colloquial) Words
  • Rule 4: Keep Terms as Universal as Possible
  • Rule 5: Avoid Abbreviations
  • Rule 6: Use IRIs
  • Rule 7: Provide Definitions
  • Rule 8: Update and Deprecate
  • Rule 9: Use Tags to Avoid Ambiguity
  • Rule 10: Use Well-Maintained Ontologies
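For a hands-on exercise, a few of these rules could be partially checked by machine. The sketch below is only a starting point under stated assumptions: the `Term` record and `check_term` function are hypothetical, the abbreviation test is a crude heuristic (any run of two or more capital letters), and the IRI test only checks for an `http(s)://` prefix. The `example.org` IRI is a placeholder, not a real term.

```python
import re
from dataclasses import dataclass


@dataclass
class Term:
    """Hypothetical record for a picklist term."""
    label: str
    iri: str = ""
    definition: str = ""


def check_term(term):
    """Return a list of rule-based warnings for one term (heuristics only)."""
    problems = []
    # Rule 5 (heuristic): a token of 2+ consecutive capitals looks like an abbreviation.
    if re.search(r"\b[A-Z]{2,}\b", term.label):
        problems.append("Rule 5: label may contain an abbreviation")
    # Rule 6 (heuristic): only checks that the IRI looks like an http(s) address.
    if not re.match(r"https?://\S+$", term.iri):
        problems.append("Rule 6: missing or malformed IRI")
    # Rule 7: a definition must be present and not just whitespace.
    if not term.definition.strip():
        problems.append("Rule 7: missing definition")
    return problems


draft = Term(label="PCR result")
revised = Term(
    label="polymerase chain reaction result",
    iri="http://example.org/term/0001",  # placeholder IRI
    definition="The outcome of a polymerase chain reaction assay.",
)
draft_problems = check_term(draft)
revised_problems = check_term(revised)
```

Here `draft_problems` lists warnings for Rules 5, 6, and 7, while `revised_problems` is empty. Rules such as "Keep Concepts Simple & Specific" need human judgement and are deliberately left out.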

There is also the FSCI course, which focuses on ontologizing picklists but has good hands-on examples.

Emma's specification development pipeline (image).
