diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 00000000..e69de29b diff --git a/404.html b/404.html new file mode 100644 index 00000000..1d8d73e3 --- /dev/null +++ b/404.html @@ -0,0 +1,1573 @@ + + + +
+ + + + + + + + + + + + + +The uPheno project aims to unify the annotation of phenotypes across species in a manner analogous to unification of gene function annotation by the Gene Ontology.
+uPheno 2.0 builds on earlier efforts with a strategy that directly leverages the work of the phenotype ontology development community and incorporates phenotypes from a much wider range of species.
+We have organised a collaborative community effort, including representatives of all major model organism databases, to document and align formal design patterns for representing phenotypes and further develop reference ontologies, such as PATO, which are used in these patterns.
+A common development infrastructure makes it easy to use these design patterns to generate both species-specific ontologies and a species-independent layer that subsumes them.
+The resulting community-curated ontology for the representation and integration of phenotypes across species serves two general purposes:
+- Providing a community-developed framework for ontology editors to bootstrap, maintain and extend their phenotype ontologies in a scalable and standardised manner.
+- Facilitating the retrieval and comparative analysis of species-specific phenotypes through a deep layer of species-independent phenotypes.
Currently, the development of uPheno is organized by a group that meets biweekly. See the meetings page for more info, including how to participate.
+ + + + + + +EQ definitions are powerful tools for reconciling phenotypes across species and driving reasoning. However, they are not all that useful for many "normal" users of our ontologies.
+We have developed a little workflow extension to take care of that.
+src/ontology/mp-odk.yaml
):
+http://purl.obolibrary.org/obo/YOURONTOLOGY/components/eq-relations.owl
. For example, for MP, the IRI is http://purl.obolibrary.org/obo/mp/components/eq-relations.owl
.prepare_release
).
The custom uPheno Makefile is an extension to your normal custom Makefile (for example, hp.Makefile, mp.Makefile, etc.), located in the src/ontology directory of your ODK setup.
+To install it:
+(1) Open your normal custom Makefile and add a line in the very end:
+ +(2) Now download the custom Makefile:
+https://raw.githubusercontent.com/obophenotype/upheno/master/src/ontology/config/pheno.Makefile
+and save it in your src/ontology
directory.
Feel free to use, for example, wget:
+cd src/ontology
+wget https://raw.githubusercontent.com/obophenotype/upheno/master/src/ontology/config/pheno.Makefile -O pheno.Makefile
+
From now on you can simply run
+ +whenever you wish to synchronise the Makefile with the uPheno repo.
+(Note: it would probably be good to add a GitHub action that does that automatically.)
+ + + + + + +brew install yamllint
error line too long
yaml syntax errors for dos-dp yaml templates.
+ You can create a custom configuration file for yamllint in your home folder:
+
+ The content of the config file should look like this:
+ # Custom configuration file for yamllint
+# It extends the default conf by adjusting some options.
+
+extends: default
+
+rules:
+  line-length:
+    max: 80  # 80 chars should be enough, but don't fail if a line is longer
+#   max: 140  # allow long lines
+    level: warning
+    allow-non-breakable-words: true
+    allow-non-breakable-inline-mappings: true
+
error line too long
errors to warnings.pip install dosdp
Patternisation is the process of ensuring that all entity quality (EQ) descriptions from textual phenotype term definitions have a logical definition pattern. A pattern is a standard format for describing a phenotype that includes a quality and an entity. For example, "increased body size" is a pattern that includes the quality "increased" and the entity "body size." The goal of patternisation is to make the EQ descriptions more uniform and machine-readable, which facilitates downstream analysis.
+The first step in the Phenotype Ontology Editors' Workflow is to identify a group of related phenotypes from diverse organisms. This can be done by considering proposals from phenotype editors or by using the pattern suggestion pipeline. +The phenotype editors may propose a group of related phenotypes based on their domain knowledge, while the pattern suggestion pipeline uses semantic similarity and shared Phenotype And Trait Ontology (PATO) quality terms to identify patterns in phenotype terms from different organism-specific ontologies.
+Once a group of related phenotypes is identified, the editors propose a phenotype pattern. To do this, they create a GitHub issue to request the phenotype pattern template in the uPheno repository. +Alternatively, a new template can be proposed at a phenotype editors' meeting, which can lead to the creation of a new term request as a GitHub issue. +Ideally, the proposed phenotype pattern should include an appropriate PATO quality term for logical definition, use cases, term examples, and a textual definition pattern for the phenotype terms.
+The next step is to discuss the new phenotype pattern draft at the regular uPheno phenotype editors meeting. During the meeting, the editors' comments and suggestions for improvements are collected as comments on the DOS-DP yaml
template in the corresponding GitHub pull request. Based on the feedback and discussions, a consensus on improvements should be achieved.
+The DOS-DP yaml
template name should start with a lower-case letter, should be informative, and must include the PATO quality term.
+A GitHub pull request is created for the DOS-DP yaml
template.
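The mechanical part of the naming rule can be checked before opening the pull request. The following is a minimal sketch, not part of the uPheno tooling: the function name `pattern_name_ok` is hypothetical, and it only tests that the file name starts with a lower-case letter and ends in `.yaml`. Whether the name is informative and includes the PATO quality term still needs a human reviewer.

```shell
# Hypothetical helper (an assumption, not an existing uPheno script).
# Returns 0 if the file name starts with a lower-case ASCII letter
# and carries the .yaml extension, 1 otherwise.
pattern_name_ok() {
  name=$(basename "$1")
  # LC_ALL=C keeps [a-z] restricted to ASCII lower-case letters.
  printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z].*\.yaml$'
}
```

For example, `pattern_name_ok src/patterns/dosdp-patterns/abnormalAnatomicalEntity.yaml && echo ok` prints `ok`, while a file name beginning with an upper-case letter makes the function return a non-zero status.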
---
+pattern_name: ??pattern_and_file_name
+
+pattern_iri: http://purl.obolibrary.org/obo/upheno/patterns-dev/??pattern_and_file_name.yaml
+
+description: 'A description that helps people choose this pattern for the appropriate scenario.'
+
+# examples:
+# - example_IRI-1 # term name
+# - example_IRI-2 # term name
+# - example_IRI-3 # term name
+# - http://purl.obolibrary.org/obo/XXXXXXXXXX # XXXXXXXX
+
+contributors:
+ - https://orcid.org/XXXX-XXXX-XXXX-XXXX # Yyy Yyyyyyyyy
+
+classes:
+  process_quality: PATO:0001236
+  abnormal: PATO:0000460
+  anatomical_entity: UBERON:0001062
+
+relations:
+  characteristic_of: RO:0000052
+  has_modifier: RO:0002573
+  has_part: BFO:0000051
+
+annotationProperties:
+  exact_synonym: oio:hasExactSynonym
+  related_synonym: oio:hasRelatedSynonym
+  xref: oio:hasDbXref
+
+vars:
+  var??: "'anatomical_entity'"  # "'variable_range'"
+
+name:
+  text: "trait ?? %s"
+  vars:
+    - var??
+
+annotations:
+  - annotationProperty: exact_synonym
+    text: "? of %s"
+    vars:
+      - var??
+
+  - annotationProperty: related_synonym
+    text: "? %s"
+    vars:
+      - var??
+
+  - annotationProperty: xref
+    text: "AUTO:patterns/patterns/chemical_role_attribute"
+
+def:
+  text: "A trait that ?? %s."
+  vars:
+    - var??
+
+equivalentTo:
+  text: "'has_part' some (
+    'XXXXXXXXXXXXXXXXX' and
+    ('characteristic_of' some %s) and
+    ('has_modifier' some 'abnormal')
+    )"
+  vars:
+    - var??
+...
+
Once a consensus on the improvements for a particular template is achieved, they are incorporated into the DOS-DP yaml
file. Typically, the improvements are applied to the template some time before a subsequent ontology editors' meeting. There should be enough time for off-line review of the proposed pattern to allow community feedback.
+The improved phenotype pattern candidate draft should get approval from the community at one of the regular ontology editors' calls or in a GitHub comment.
+The ontology editors who approve the pattern provide their ORCIDs and they are credited as contributors in an appropriate field of the DOS-DP pattern template.
Once the community-approved phenotype pattern template is created, it is added to the uPheno GitHub repository.
+The approved DOS-DP yaml
phenotype pattern template should pass quality control (QC) steps.
+1. Validate yaml syntax: yamllint
+2. Validate DOS-DP
+Use DOSDP Validator.
+* To validate a template using the command line interface, execute:
+```sh
+yamllint
After successfully passing QC, the responsible editor merges the approved pull request, and the phenotype pattern becomes part of the uPheno phenotype pattern template collection.
+ + + + + + +This document describes how to merge new DOSDP design patterns into an ODK ontology and then how to replace the old classes with the new ones.
+$ODK-ONTOLOGY/src/patterns/data/default/
+
Make sure that the tsv filenames match those of the relevant yaml DOSDP pattern files.
+$ODK-ONTOLOGY/src/patterns/dosdp-patterns/external.txt
+
external.txt
list from external sources into the current working repository
cd ODK-ONTOLOGY/src/ontology
+sh run.sh make update_patterns
+
cd ODK-ONTOLOGY/src/ontology
+sh run.sh make ../patterns/definitions.owl IMP=false
+
cd ODK-ONTOLOGY/src/ontology
+sh run.sh make remove_patternised_classes
+
For example:
+++ + + + + + +I have migrated the ... table and changed the tab colour to blue. +You can delete the tab if you wish.
+
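The requirement earlier in this workflow, that every TSV file name matches the name of a DOSDP yaml pattern file, can be checked automatically before running the make targets. This is a minimal sketch under stated assumptions: the function name `check_tsv_patterns` is hypothetical, and it assumes patterns carry the `.yaml` extension and live in a single directory.

```shell
# Hypothetical helper (an assumption, not an ODK target).
# Prints every TSV in DATA_DIR that has no same-named .yaml pattern
# in PATTERN_DIR and returns 1 if at least one is unmatched.
check_tsv_patterns() {
  data_dir=$1
  pattern_dir=$2
  status=0
  for tsv in "$data_dir"/*.tsv; do
    [ -e "$tsv" ] || continue   # the glob matched nothing
    base=$(basename "$tsv" .tsv)
    if [ ! -f "$pattern_dir/$base.yaml" ]; then
      echo "no pattern for $base.tsv"
      status=1
    fi
  done
  return "$status"
}
```

From the root of the ODK repository it could be called as `check_tsv_patterns src/patterns/data/default src/patterns/dosdp-patterns`.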
In order to run a release you will have to have completed the steps to set up s3.
+cd src/scripts
sh upheno_pipeline.sh
cd ../ontology
make prepare_upload S3_VERSION=2022-06-19
make deploy S3_VERSION=2022-06-19
To be able to upload a new uPheno release to the uPheno S3 bucket, you need to set yourself up for S3 first.
+ +The most convenient way to interact with S3 is the AWS Command Line Interface (CLI). You can find the installers and install instructions on that page (different depending on your Operating System): +- For Mac +- For Windows
+Next, you need to ask someone at BBOP (such as Chris Mungall or Seth Carbon) to provide you with an account that gives you access to the BBOP s3 buckets. You will have to provide a username. You will receive: +- User name +- Access key ID +- Secret access key +- Console link to sign into bucket
+You will now have to set up your local system. You will create two files:
+ +and
+ +in ~/.aws/credentials
make sure you add the correct keys as provided above.
Now, you should be set up to write to your s3 bucket. Note that in order for your data to be accessible through https
after your upload, you need to add --acl public-read
.
aws s3 sync --exclude "*.DS_Store*" my/data-dir s3://bbop-ontologies/myproject/data-dir --acl public-read
+
If you have previously pushed data to the same location, you won't be able to set it to "publicly readable" by simply rerunning the sync command. If you want to publish previously private data, follow the instructions here, e.g.:
+ + + + + + + +Historically, most repos have been using Travis CI for continuous integration testing and building, but due to +runtime restrictions, we recently switched a lot of our repos to GitHub Actions. You can set up your repo with CI by adding +this to your configuration file (src/ontology/upheno-odk.yaml):
+ +When updating your repo, you will notice a new file being added: .github/workflows/qc.yml
.
This file contains your CI logic, so if you need to change, or add anything, this is the place!
+Alternatively, if your repo is in GitLab instead of GitHub, you can set up your repo with GitLab CI by adding +this to your configuration file (src/ontology/upheno-odk.yaml):
+ +This will add a file called .gitlab-ci.yml
in the root of your repo.
The editors workflow is one of the formal workflows to ensure that the ontology is developed correctly according to ontology engineering principles. There are a few different editors workflows:
+This document only covers the first editing workflow, but more will be added in the future.
+Workflow requirements:
+Ensure that there is a ticket on your issue tracker that describes the change you are about to make. While this seems optional, this is a very important part of the social contract of building an ontology - no change to the ontology should be performed without a good ticket, describing the motivation and nature of the intended change.
+In your local environment (e.g. your laptop), make sure you are on the main (previously master) branch and ensure that you have all the upstream changes, for example:
Create a new branch. Per convention, we try to use meaningful branch names such as: +- issue23removeprocess (where issue 23 is the related issue on GitHub) +- issue26addcontributor +- release20210101 (for releases)
+On your command line, this looks like this:
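As a minimal sketch of the branch naming convention above, the helper below builds a name such as `issue23removeprocess` from an issue number and a short description. The function name `branch_name` is hypothetical and not part of the ODK; the `git checkout -b` call in the comment is the standard way to create and switch to a new branch.

```shell
# Hypothetical helper (an assumption, not an ODK command).
# Builds a branch name from an issue number and a short description,
# keeping only letters and digits and lower-casing the result.
branch_name() {
  issue=$1
  shift
  desc=$(printf '%s' "$*" | tr -cd '[:alnum:]' | tr '[:upper:]' '[:lower:]')
  printf 'issue%s%s\n' "$issue" "$desc"
}

# Example, to be run inside your repository:
#   git checkout -b "$(branch_name 23 remove process)"
```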
+ +Using your editor of choice, perform the intended edit. For example:
+Protégé
+src/ontology/upheno-edit.owl
in ProtégéTextEdit
+src/ontology/upheno-edit.owl
in TextEdit (or Sublime, Atom, Vim, Nano)Consider the following when making the edit.
+src/ontology/upheno-edit.owl
src/ontology/components
), see here.This step is very important. Rather than simply trusting your change had the intended effect, we should always use a git diff as a first pass for sanity checking.
+In our experience, having a visual git client like GitHub Desktop or sourcetree is really helpful for this part. In case you prefer the command line:
+ +Now it's time to run your quality control checks. This can either happen locally (5a) or through your continuous integration system (7/5b).
+If you chose to run your test locally:
+ +This will run the whole set of configured ODK tests, including your change. If you have a complex DOSDP pattern pipeline, you may want to add PAT=false
to skip the potentially lengthy process of rebuilding the patterns.
+
+When you are happy with the changes, you commit your changes to your feature branch, push them upstream (to GitHub) and create a pull request. For example:
+git add NAMEOFCHANGEDFILES
+git commit -m "Added biological process term #12"
+git push -u origin issue23removeprocess
+
Then you go to your project on GitHub, and create a new pull request from the branch, for example: https://github.com/INCATools/ontology-development-kit/pulls
+There is a lot of great advice on how to write pull requests, but at the very least you should:
+- mention the tickets affected: see #23
to link to a related ticket, or fixes #23
if, by merging this pull request, the ticket is fixed. Tickets in the latter case will be closed automatically by GitHub when the pull request is merged.
+- summarise the changes in a few sentences. Consider the reviewer: what would they want to know right away?
+- If the diff is large, provide instructions on how to review the pull request best (sometimes, there are many changed files, but only one important change).
If you didn't run any local quality control checks (see 5a), you should have Continuous Integration (CI) set up, for example: +- Travis +- GitHub Actions
+More on how to set this up here. Once the pull request is created, the CI will automatically trigger. If all is fine, it will show up green, otherwise red.
+Once all the automatic tests have passed, it is important to put a second set of eyes on the pull request. Ontologies are inherently social - as in that they represent some kind of community consensus on how a domain is organised conceptually. This seems high brow talk, but it is very important that as an ontology editor, you have your work validated by the community you are trying to serve (e.g. your colleagues, other contributors etc.). In our experience, it is hard to get more than one review on a pull request - two is great. You can set up GitHub branch protection to actually require a review before a pull request can be merged! We recommend this.
+This step seems daunting to some hopelessly under-resourced ontologies, but we recommend putting this high up on your list of priorities - train a colleague, reach out!
+When the QC is green and the reviews are in (approvals), it is time to merge the pull request. After the pull request is merged, remember to delete the branch as well (this option will show up as a big button right after you have merged the pull request). If you have not done so, close all the associated tickets fixed by the pull request.
+It is sometimes difficult to keep track of changes made to an ontology. Some ontology teams opt to document changes in a changelog (simply a text file in your repository) so that when release day comes, you know everything you have changed. This is advisable at least for major changes (such as a new release system, a new pattern or template etc.).
+ + + + + + +We can define custom checks using SPARQL. SPARQL queries define bad modelling patterns (missing labels, misspelt URIs, and many more) in the ontology. If these queries return any results, then the build will fail. Custom checks are designed to be run as part of GitHub Actions Continuous Integration testing, but they can also run locally.
+src/sparql
. The name of the file should end with -violation.sparql
. Please give a name that helps to understand which violation the query wants to check.src/ontology/uberon-odk.yaml
:-violation.sparql
part) to the list inside the key custom_sparql_checks
that is inside robot_report
key.If the robot_report
or custom_sparql_checks
keys are not available, please add this code block to the end of the file.
The documentation for UPHENO is managed in two places (relative to the repository root):
+docs
directory contains all the files that pertain to the content of the documentation (more below)mkdocs.yaml
file contains the documentation config, in particular its navigation bar and theme.The documentation is hosted using GitHub pages, on a special branch of the repository (called gh-pages
). It is important that this branch is never deleted - it contains all the files GitHub pages needs to render and deploy the site. It is also important to note that the gh-pages branch should never be edited manually. All changes to the docs happen inside the docs
directory on the main
branch.
All the documentation is contained in the docs
directory, and is managed in Markdown. Markdown is a very simple and convenient way to produce text documents with formatting instructions, and is very easy to learn - it is also used, for example, in GitHub issues. This is a normal editing workflow:
.md
file you want to change in an editor of choice (a simple text editor is often best). IMPORTANT: Do not edit any files in the docs/odk-workflows/
directory. These files are managed by the ODK system and will be overwritten when the repository is upgraded! If you wish to change these files, make an issue on the ODK issue tracker.The documentation is not automatically updated from the Markdown, and needs to be deployed deliberately. To do this, perform the following steps:
+If everything was successful, you will see a message similar to this one:
+ +3. Just to double check, you can now navigate to your documentation pages (usually https://obophenotype.github.io/upheno/). + Just make sure you give GitHub 2-5 minutes to build the pages! + + + + + + +The release workflow recommended by the ODK is based on GitHub releases and works as follows:
+These steps are outlined in detail in the following.
+Preparation:
+git status
should say that there are no modified files)git pull
)git checkout -b release-2021-01-01
)docker pull obolibrary/odkfull
To actually run the release, you:
+cd upheno/src/ontology
)sh run.sh make prepare_release -B
. Note that for some ontologies, this process can take up to 90 minutes - especially if there are large ontologies you depend on, like PRO or CHEBI.Release files are now in ../.. - now you should commit, push and make a release on your git hosting site such as GitHub or GitLab
.This will create all the specified release targets (OBO, OWL, JSON, and the variants, ont-full and ont-base) and copy them into your release directory (the top level of your repo).
+upheno.obo
- this reflects a useful subset of the whole ontology (everything that can be covered by OBO format). OBO format has one thing going for it: it is very easy to review!
upheno-base.owl
- this reflects the asserted axioms in your ontology that you have actually edited.upheno-full.owl
, which may reveal interesting new inferences you did not know about. Note that the diff of this file is sometimes quite large.Once your CI checks have passed, and your reviews are completed, you can now merge the branch into your main branch (don't forget to delete the branch afterwards - a big button will appear after the merge is finished).
+upheno.obo file and check the data-version: property. The date needs to be prefixed with a v, so, for example, v2020-02-06.
When you are dealing with large ontologies, you need a lot of memory. When you see error messages relating to large ontologies such as CHEBI, PRO, NCBITAXON, or Uberon, you should think of memory first, see here.
+Sometimes you will get cryptic error messages when using legacy tools using OBO format, such as the ontology release tool (OORT), which is also available as part of the ODK docker container. In these cases, you need to track down what axiom or annotation actually caused the breakdown. In our experience (in about 60% of the cases) the problem lies with duplicate annotations (def
, comment
) which are illegal in OBO. Here is an example recipe of how to deal with such a problem:
make: *** [cl.Makefile:84: oort] Error 255
you might have an OORT error. sh run.sh make IMP=false PAT=false oort -B
(assuming you are already in the ontology folder in your directory) upheno-edit.owl
in Protégé, find the offending term, delete all offending annotations (e.g. delete ALL definitions, if the problem was "multiple def tags not allowed") and save.
+*While this is not ideal, as it will remove all definitions from that term, they will be added back when the term is fixed in the ontology it was imported from.
sh run.sh make IMP=false PAT=false oort -B
and if it all passes, commit your changes to a branch and make a pull request as usual.
Your ODK repository's configuration is managed in src/ontology/upheno-odk.yaml
. Once you have made your changes, you can run the following to apply your changes to the repository:
There are a large number of options that can be set to configure your ODK, but we will only discuss a few of them here.
+NOTE for Windows users:
+You may get a cryptic failure such as Set Illegal Option -
if the update script located in src/scripts/update_repo.sh
+was saved using Windows line endings. These need to be changed to Unix line endings. In Notepad++, for example, you can
+click on Edit->EOL Conversion->Unix LF to change this.
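If you prefer the command line (for example in Git Bash or WSL), the same conversion can be done with standard tools. This is a minimal sketch: the function name `to_unix_line_endings` is hypothetical, and it removes every carriage return in the file, which is normally what you want for a shell script.

```shell
# Hypothetical helper (an assumption, not part of the ODK).
# Rewrites FILE in place with Unix (LF) line endings by deleting
# all carriage-return characters. Writing back with cat keeps the
# original file's permissions.
to_unix_line_endings() {
  file=$1
  tmp=$(mktemp) || return 1
  if tr -d '\r' < "$file" > "$tmp"; then
    cat "$tmp" > "$file"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}
```

For the script mentioned above, this would be called as `to_unix_line_endings src/scripts/update_repo.sh`.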
You can use the update repository workflow described on this page to perform the following operations to your imports:
+We will discuss all these workflows in the following.
+To add a new import, you first edit your odk config as described above, adding an id
to the products
list in the import_group
section (for the sake of this example, we assume you already import RO, and your goal is to also import GO):
Note: our ODK file should only have one import_group
which can contain multiple imports (in the products
section). Next, you run the update repo workflow to apply these changes. Note that by default, this module is going to be an SLME Bottom module, see here. To change that or customise your module, see section "Customise an import". To finalise the addition of your import, perform the following steps:
src/ontology/upheno-edit.owl
file. We suggest doing this using a text editor, by simply copying an existing import declaration and renaming it to the new ontology import, for example as follows:
+ src/ontology/catalog-v001.xml
, for example:
+ Note: The catalog file src/ontology/catalog-v001.xml
has one purpose: redirecting
+imports from URLs to local files. For example, if you have
in your editors file (the ontology) and
+<uri name="http://purl.obolibrary.org/obo/upheno/imports/go_import.owl" uri="imports/go_import.owl"/>
+
in your catalog, tools like robot
or Protégé will recognize the statement
+in the catalog file to redirect the URL http://purl.obolibrary.org/obo/upheno/imports/go_import.owl
+to the local file imports/go_import.owl
(which is in your src/ontology
directory).
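To confirm that the catalog really contains the redirect for an import, a quick grep is often enough. This is a minimal sketch under stated assumptions: the function name `catalog_has_import` is hypothetical, and it assumes the `<uri>` element sits on one line with the `name` attribute before the `uri` attribute, as in the example above.

```shell
# Hypothetical helper (an assumption, not an ODK command).
# Returns 0 if CATALOG has a <uri> entry that redirects
# .../imports/<id>_import.owl to the local file imports/<id>_import.owl.
catalog_has_import() {
  catalog=$1
  id=$2
  grep -Eq "<uri [^>]*name=\"[^\"]*/imports/${id}_import\.owl\"[^>]*uri=\"imports/${id}_import\.owl\"" "$catalog"
}
```

For example, `catalog_has_import src/ontology/catalog-v001.xml go || echo "go redirect missing"` reports a missing entry for the GO import.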
If you simply wish to refresh your import in light of new terms, see here. If you wish to change the type of your module see section "Customise an import".
+To remove an existing import, perform the following steps:
+src/ontology/upheno-edit.owl
.src/ontology/upheno-odk.yaml
, e.g. - id: go
from the list of products
in the import_group
.src/imports/go_import.owl
src/imports/go_terms.txt
src/ontology/catalog-v001.xml
file.
By default, an import module extracted from a source ontology will be an SLME module, see here. There are various options to change the default.
+The following change to your repo config (src/ontology/upheno-odk.yaml
) will switch the go import from an SLME module to a simple ROBOT filter module:
A ROBOT filter module is, essentially, importing all external terms declared by your ontology (see here on how to declare external terms to be imported). Note that the filter
module does
+not consider terms/annotations from namespaces other than the base-namespace of the ontology itself. For example, in the
+example of GO above, only annotations / axioms related to the GO base IRI (http://purl.obolibrary.org/obo/GO_) would be considered. This
+behaviour can be changed by adding additional base IRIs as follows:
import_group:
+ products:
+ - id: go
+ module_type: filter
+ base_iris:
+ - http://purl.obolibrary.org/obo/GO_
+ - http://purl.obolibrary.org/obo/CL_
+ - http://purl.obolibrary.org/obo/BFO
+
If you wish to customise your import entirely, you can specify your own ROBOT command to do so. To do that, add the following to your repo config (src/ontology/upheno-odk.yaml
):
Now add a new goal in your custom Makefile (src/ontology/upheno.Makefile
, not src/ontology/Makefile
).
imports/go_import.owl: mirror/go.owl imports/go_terms_combined.txt
+	if [ $(IMP) = true ]; then $(ROBOT) query -i $< --update ../sparql/preprocess-module.ru \
+		extract -T imports/go_terms_combined.txt --force true --individuals exclude --method BOT \
+		query --update ../sparql/inject-subset-declaration.ru --update ../sparql/postprocess-module.ru \
+		annotate --ontology-iri $(ONTBASE)/$@ $(ANNOTATE_ONTOLOGY_VERSION) --output $@.tmp.owl && mv $@.tmp.owl $@; fi
+
Now feel free to change this goal to do whatever you wish it to do! It probably makes sense (although it is not strictly necessary) to leave most of the goal intact and replace only:
+ +to another ROBOT pipeline.
+A component is an import which belongs to your ontology, i.e. it is managed by +you and your team.
+src/ontology/upheno-odk.yaml
components
components
section, add a new section called products
.
+This is where all your components are specifiedproducts
section, add a new component, e.g. - filename: mycomp.owl
Example
+ +When running sh run.sh make update_repo
, a new file src/ontology/components/mycomp.owl
will
+be created which you can edit as you see fit. Typical ways to edit:
components/mycomp.owl:
make target in src/ontology/upheno.Makefile
+and provide a custom command to generate the componentWARNING
: Note that the custom rule to generate the component MUST NOT depend on any other ODK-generated file such as seed files and the like (see issue).src/ontology/upheno-odk.yaml
, source
,
+to specify that this component should simply be downloaded from somewhere on the web.Since ODK 1.3.2, it is possible to simply link a ROBOT template to a component without having to specify any of the import logic. In order to add a new component that is connected to one or more template files, follow these steps:
+src/ontology/upheno-odk.yaml
.use_templates: TRUE
is set in the global project options. You should also make sure that use_context: TRUE
is set in case you are using prefixes in your templates that are not known to robot
, such as OMOP:
, CPONT:
and more. All non-standard prefixes you are using should be added to config/context.json
.products
section.use_template: TRUE
. This will create an empty template for you in the templates directory, which will automatically be processed when recreating the component (e.g. run.bat make recreate-mycomp
).templates
field to add as many template names as you wish. ODK will look for them in the src/templates
directory.template_options
field. This should be a string with options from robot template. One typical example for additional options you may want to provide is --add-prefixes config/context.json
to ensure the prefix map of your context is provided to robot
, see above.Example:
+components:
+ products:
+ - filename: mycomp.owl
+ use_template: TRUE
+ template_options: --add-prefixes config/context.json
+ templates:
+ - template1.tsv
+ - template2.tsv
+
Note: if your mirror is particularly large and complex, read this ODK recommendation.
+ + + + + + +The main kinds of files in the repository:
+Release files are the files that are considered part of the official ontology release and are to be used by the community. A detailed description of the release artefacts can be found here.
+Imports are subsets of external ontologies that contain terms and axioms you would like to re-use in your ontology. These are considered "external", like dependencies in software development, and are not included in your "base" product, which is the release artefact which contains only those axioms that you personally maintain.
+These are the current imports in UPHENO
| Import | URL | Type |
|---|---|---|
| go | https://raw.githubusercontent.com/obophenotype/pro_obo_slim/master/pr_slim.owl | None |
| nbo | http://purl.obolibrary.org/obo/nbo.owl | None |
| uberon | http://purl.obolibrary.org/obo/uberon.owl | None |
| cl | http://purl.obolibrary.org/obo/cl.owl | None |
| pato | http://purl.obolibrary.org/obo/pato.owl | None |
| mpath | http://purl.obolibrary.org/obo/mpath.owl | None |
| ro | http://purl.obolibrary.org/obo/ro.owl | None |
| omo | http://purl.obolibrary.org/obo/omo.owl | None |
| chebi | https://raw.githubusercontent.com/obophenotype/chebi_obo_slim/main/chebi_slim.owl | None |
| oba | http://purl.obolibrary.org/obo/oba.owl | None |
| ncbitaxon | http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl | None |
| pr | https://raw.githubusercontent.com/obophenotype/pro_obo_slim/master/pr_slim.owl | None |
| bspo | http://purl.obolibrary.org/obo/bspo.owl | None |
| ncit | http://purl.obolibrary.org/obo/ncit.owl | None |
| fbbt | http://purl.obolibrary.org/obo/fbbt.owl | None |
| fbdv | http://purl.obolibrary.org/obo/fbdv.owl | None |
| hsapdv | http://purl.obolibrary.org/obo/hsapdv.owl | None |
| wbls | http://purl.obolibrary.org/obo/wbls.owl | None |
| wbbt | http://purl.obolibrary.org/obo/wbbt.owl | None |
| plana | http://purl.obolibrary.org/obo/plana.owl | None |
| zfa | http://purl.obolibrary.org/obo/zfa.owl | None |
| xao | http://purl.obolibrary.org/obo/xao.owl | None |
| hsapdv-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-hsapdv.owl | custom |
| zfa-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-zfa.owl | custom |
| zfs-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-zfs.owl | custom |
| xao-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-xao.owl | custom |
| wbbt-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-wbbt.owl | custom |
| wbls-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-wbls.owl | custom |
| fbbt-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-fbbt.owl | custom |
| xao-cl | http://purl.obolibrary.org/obo/uberon/bridge/cl-bridge-to-xao.owl | custom |
| wbbt-cl | http://purl.obolibrary.org/obo/uberon/bridge/cl-bridge-to-wbbt.owl | custom |
| fbbt-cl | http://purl.obolibrary.org/obo/uberon/bridge/cl-bridge-to-fbbt.owl | custom |
Components, in contrast to imports, are considered full members of the ontology. This means that any axiom in a component is also included in the ontology base - which means it is considered native to the ontology. While this sounds complicated, consider this: conceptually, no component should be part of more than one ontology. If that seems to be the case, we are most likely talking about an import. Components are often not needed for ontologies, but there are some use cases:
+These are the components in UPHENO
+Filename | +URL | +
---|---|
phenotypes_manual.owl | +None | +
upheno-mappings.owl | +None | +
cross-species-mappings.owl | +None | +
One of the most frequent problems with running the ODK for the first time is failure because of lack of memory. This can look like a Java OutOfMemory exception,
+but more often than not it will appear as something like an Error 137
. There are two places you need to consider to set your memory:
robot_java_args: '-Xmx8G'
to your src/ontology/upheno-odk.yaml file, see for example here.robot_java_args
variable. You can manage your memory settings
+by right-clicking on the docker whale in your system bar-->Preferences-->Resources-->Advanced, see picture below.
+This page discusses how to update the contents of your imports, like adding or removing terms. If you are looking to customise imports, like changing the module type, see here.
+Note: some ontologies now use a merged-import system to manage dynamic imports; for these, please follow the instructions in the section titled "Using the Base Module approach".
+Importing a new term is split into two sub-phases:
+There are three ways to declare terms that are to be imported from an external ontology. Choose the appropriate one for your particular scenario (all three can be used in parallel if need be):
+This workflow is to be avoided, but may be appropriate if the editor does not have access to the ODK docker container. +This approach also applies to ontologies that use the base module import approach.
+Now you can use this term, for example, to construct logical definitions. The next time the imports are refreshed (see how to refresh here), the metadata (labels, definitions, etc.) for this term are imported from the respective external source ontology and become visible in your ontology.
+Every import has, by default, a term file associated with it, which can be found in the imports directory. For example, if you have a GO import in src/ontology/go_import.owl
, you will also have an associated term file src/ontology/go_terms.txt
. You can add terms in there simply as a list:
Now you can run the refresh imports workflow and the two terms will be imported.
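Because a term file is just a list of term IDs, adding terms can also be scripted. The following is a minimal sketch, assuming one ID per line; the helper name `add_terms`, the temporary file location and the GO IDs are hypothetical and used for illustration only.

```python
import os
import tempfile


def add_terms(term_file, new_terms):
    """Append term IDs to a term file, skipping IDs that are already listed."""
    existing = []
    if os.path.exists(term_file):
        with open(term_file) as handle:
            existing = [line.strip() for line in handle if line.strip()]
    added = []
    for term in new_terms:
        if term not in existing and term not in added:
            added.append(term)
    with open(term_file, "w") as handle:
        handle.write("\n".join(existing + added) + "\n")
    return added


# Demonstration on a throwaway term file (hypothetical location).
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "go_terms.txt")
first = add_terms(demo_file, ["GO:0008150", "GO:0008151"])
second = add_terms(demo_file, ["GO:0008151", "GO:0003674"])
with open(demo_file) as handle:
    contents = handle.read().splitlines()
```

The script only edits the list; the imports still have to be refreshed afterwards for the terms to appear in the ontology.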
+This workflow is appropriate if:
+To enable this workflow, you add the following to your ODK config file (src/ontology/upheno-odk.yaml
), and update the repository:
Now you can manage your imported terms directly in the custom external terms template, which is located at src/templates/external_import.owl
. Note that this file is a ROBOT template, and can, in principle, be extended to include any axioms you like. Before extending the template, however, read the following carefully.
The main purpose of the custom import template is to enable the management of all terms to be imported in a centralised place. To enable that, you do not have to do anything other than maintaining the template. So if you, say, currently import APOLLO_SV:00000480
, and you wish to import APOLLO_SV:00000532
, you simply add a row like this:
When the imports are refreshed (see imports refresh workflow), the term(s) will simply be imported from the configured ontologies.
+Now, if you wish to extend the Makefile (which is beyond these instructions) and add, say, synonyms to the imported terms, you can do that, but you need to (a) preserve the ID
and ENTITY
columns and (b) ensure that the ROBOT template is valid otherwise, see here.
WARNING. Note that doing this is a widespread antipattern (see related issue). You should not change the axioms of terms that do not belong in your ontology unless necessary - such changes should always be pushed into the ontology where they belong. However, since people are doing it, whether the OBO Foundry likes it or not, at least using the custom imports module as described here localises the changes to a single simple template and ensures that none of the annotations added this way are merged into the base file.
+If you want to refresh the import yourself (this may be necessary to pass the travis tests), and you have the ODK installed, you can do the following (using go as an example):
+First, you navigate in your terminal to the ontology directory (underneath src in your hpo root directory). +
+Then, you regenerate the import that will now include any new terms you have added. Note: You must have docker installed.
+ +Since ODK 1.2.27, it is also possible to simply run the following, which is the same as the above:
+ +Note that in case you changed the defaults, you need to add IMP=true
and/or MIR=true
to the command below:
If you wish to skip refreshing the mirror, i.e. skip downloading the latest version of the source ontology for your import (e.g. go.owl
for your go import) you can set MIR=false
instead, which will do the exact same thing as the above, but is easier to remember:
Since ODK 1.2.31, we support an entirely new approach to generate modules: Using base files. +The idea is to only import axioms from ontologies that actually belong to them. +A base file is a subset of the ontology that only contains those axioms that nominally +belong there. In other words, the base file does not contain any axioms that belong +to another ontology. An example would be this:
+Imagine this being the full Uberon ontology:
+Axiom 1: BFO:123 SubClassOf BFO:124
+Axiom 2: UBERON:123 SubClassOf BFO:123
+Axiom 3: UBERON:124 SubClassOf UBERON:123
+
The base file is the set of all axioms that are about UBERON terms:
+ +I.e.
+ +Gets removed.
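The idea can be sketched in a few lines of Python. This is only a toy illustration, assuming axioms are simple (subject, relation, object) triples and that "about UBERON terms" means "the subject carries the UBERON prefix"; actual base files are produced by the ODK/ROBOT tooling, not by code like this, and the function name `base_axioms` is hypothetical.

```python
# Toy axioms as (subject, relation, object) triples, mirroring the example above.
AXIOMS = [
    ("BFO:123", "SubClassOf", "BFO:124"),
    ("UBERON:123", "SubClassOf", "BFO:123"),
    ("UBERON:124", "SubClassOf", "UBERON:123"),
]


def base_axioms(axioms, prefix):
    """Keep only the axioms whose subject belongs to the given ID prefix."""
    return [axiom for axiom in axioms if axiom[0].startswith(prefix + ":")]


uberon_base = base_axioms(AXIOMS, "UBERON")
removed = [axiom for axiom in AXIOMS if axiom not in uberon_base]
```

In this sketch the two axioms with an UBERON subject are kept, and the axiom that only talks about BFO terms is dropped.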
+The base file pipeline is a bit more complex than the normal pipelines, because +of the logical interactions between the imported ontologies. This is solved by first +merging all mirrors into one huge file and then extracting one mega module from it.
+Example: Let's say we are importing terms from Uberon, GO and RO in our ontologies. +When we use the base pipelines, we
+1) First obtain the base (usually by simply downloading it, but there is also an option now to create it with ROBOT)
+2) We merge all base files into one big pile
+3) Then we extract a single module imports/merged_import.owl
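The merge-then-extract order of these steps can be illustrated with a toy sketch. It assumes axioms are (subject, relation, object) triples and replaces real module extraction with a much simpler "follow superclass links from the seed terms" rule; all IDs and function names are hypothetical, and the real pipeline uses ROBOT rather than code like this.

```python
def merge(*bases):
    """Step 2: pile all base files together, dropping exact duplicates."""
    merged = []
    for base in bases:
        for axiom in base:
            if axiom not in merged:
                merged.append(axiom)
    return merged


def extract_module(axioms, seeds):
    """Step 3 (simplified): keep axioms reachable upwards from the seed terms."""
    wanted = set(seeds)
    module = []
    changed = True
    while changed:
        changed = False
        for axiom in axioms:
            subject, _, parent = axiom
            if subject in wanted and axiom not in module:
                module.append(axiom)
                wanted.add(parent)
                changed = True
    return module


# Step 1 (assumed already done): one toy base per source ontology.
uberon_base = [("UBERON:124", "SubClassOf", "UBERON:123"),
               ("UBERON:123", "SubClassOf", "BFO:123")]
bfo_base = [("BFO:123", "SubClassOf", "BFO:124")]
go_base = [("GO:1", "SubClassOf", "BFO:123")]

merged = merge(uberon_base, bfo_base, go_base)
merged_import = extract_module(merged, seeds={"UBERON:124"})
```

Because the extraction runs on the merged pile, the module for `UBERON:124` can pick up the BFO axiom even though it lives in a different base file, which is the interaction the single merged module is meant to capture.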
The first implementation of this pipeline is PATO, see https://github.com/pato-ontology/pato/blob/master/src/ontology/pato-odk.yaml.
+To check if your ontology uses this method, check src/ontology/upheno-odk.yaml to see if use_base_merging: TRUE
is declared under import_group
If your ontology uses the Base Module approach, please use the following steps:
+First, add the term to be imported to the term file associated with it (see above "Using term files" section if this is not clear to you)
+Next, you navigate in your terminal to the ontology directory (underneath src in your hpo root directory). +
+Then refresh imports by running
+ +Note: if your mirrors are updated, you can run sh run.sh make no-mirror-refresh-merged
+This requires quite a bit of memory on your local machine, so if you encounter an error, it might be due to a lack of memory on your computer. A solution would be to create a ticket in an issue tracker requesting that the term be imported, and one of the local devs should pick this up and run the import for you.
+Lastly, restart Protégé, and the term should be imported and ready to be used.
+ + + + + + +For details on what components are, please see the component section of the repository file structure document.
+To add custom components to an ODK repo, please follow these steps:
+1) Locate your odk yaml file and open it with your favourite text editor (src/ontology/upheno-odk.yaml)
+2) Check whether there is already a component section in the yaml file; if not, add it accordingly, adding the name of your component:
+ +3) Add the component to your catalog file (src/ontology/catalog-v001.xml)
+ <uri name="http://purl.obolibrary.org/obo/upheno/components/your-component-name.owl" uri="components/your-component-name.owl"/>
+
4) Add the component to the edit file (src/ontology/upheno-edit.obo) +for .obo formats:
+ +for .owl formats:
+ +5) Refresh your repo by running sh run.sh make update_repo
- this should create a new file in src/ontology/components.
+6) In your custom makefile (src/ontology/upheno.Makefile) add a goal for your custom component. In this example, the goal is built from a ROBOT template.
$(COMPONENTSDIR)/your-component-name.owl: $(SRC) ../templates/your-component-template.tsv
+ $(ROBOT) template --template ../templates/your-component-template.tsv \
+ annotate --ontology-iri $(ONTBASE)/$@ --output $(COMPONENTSDIR)/your-component-name.owl
+
(If using a ROBOT template, do not forget to add your template tsv in src/templates/)
+7) Make the file by running sh run.sh make components/your-component-name.owl
The uPheno editors call is held every second Thursday (bi-weekly) on Zoom, provided by members of the Monarch Initiative and co-organised by members of the Alliance of Genome Resources. If you wish to join the meeting, you can open an issue on https://github.com/obophenotype/upheno/issues with the request to be added, or send an email to phenotype-ontologies-editors@googlegroups.com.
+The meeting coordinator (MC) is the person charged with organising the meeting. The current MC is Ray, @rays22.
+The uPheno project organises an outreach call every four weeks to listen to external stakeholders describing their need for cross-species phenotype integration.
+Date | +Lesson | +Notes | +Recordings | +
---|---|---|---|
2024/04/05 | +TBD | +TBD | ++ |
2024/03/08 | +Computational identification of disease models through cross-species phenotype comparison | +Diego A. Pava, Pilar Cacheiro, Damian Smedley (IMPC) | +Recording | +
2024/02/09 | +Use cases for uPheno in the Alliance of Genome Resources and MGI | +Sue Bello (Alliance of Genome Resources, MGI) | +Recording | +
* The Drosophila phenotype ontology Osumi-Sutherland et al, J Biomed Sem.
The DPO is formally a subset of FBcv, made available from +http://purl.obolibrary.org/obo/fbcv/dpo.owl
+Phenotypes in FlyBase may either be assigned to FBcv (dpo) classes, or +they may have a phenotype_manifest_in to FBbt (anatomy).
+For integration we generate the following ontologies:
+* http://purl.obolibrary.org/obo/upheno/imports/fbbt_phenotype.owl
+* http://purl.obolibrary.org/obo/upheno/imports/uberon_phenotype.owl
+* http://purl.obolibrary.org/obo/upheno/imports/go_phenotype.owl
+* http://purl.obolibrary.org/obo/upheno/imports/cl_phenotype.owl
(see Makefile)
+This includes a phenotype class for every anatomy class - the IRI is +suffixed with "PHENOTYPE". Using these ontologies, Uberon and CL +phenotypes make the groupings.
+We include
+* http://purl.obolibrary.org/obo/upheno/dpo/dpo-importer.owl
Which imports dpo plus auto-generated fbbt phenotypes.
+The dpo-importer is included in the [MetazoanImporter]
+We create a local copy of fbbt that has "Drosophila " prefixed to all +labels. This gives us a hierarchy:
+* eye phenotype (defined using Uberon)
+* compound eye phenotype (defined using Uberon)
+* drosophila eye phenotype (defined using FBbt)
* http://code.google.com/p/cell-ontology/issues/detail?id=115 ensure all CL to FBbt equiv axioms are present (we have good coverage for Uberon)
* project page - https://sourceforge.net/apps/trac/pombase/wiki/FissionYeastPhenotypeOntology
+* FYPO: the fission yeast phenotype ontology Harris et al, Bioinformatics
Note that the OWL axioms for FYPO are managed directly in the FYPO +project repo, we do not duplicate them here
+ + + + + + +* http://www.human-phenotype-ontology.org/
+* Köhler S, Doelken SC, Mungall CJ, Bauer S, Firth HV, Bailleul-Forestier I, Black GC, Brown DL, Brudno M, Campbell J, FitzPatrick DR, Eppig JT, Jackson AP, Freson K, Girdea M, Helbig I, Hurst JA, Jähn J, Jackson LG, Kelly AM, Ledbetter DH, Mansour S, Martin CL, Moss C, Mumford A, Ouwehand WH, Park SM, Riggs ER, Scott RH, Sisodiya S, Van Vooren S, Wapner RJ, Wilkie AO, Wright CF, Vulto-van Silfhout AT, de Leeuw N, de Vries BB, Washington NL, Smith CL, Westerfield M, Schofield P, Ruef BJ, Gkoutos GV, Haendel M, Smedley D, Lewis SE, Robinson PN. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res. 2014 Jan; 42 (Database issue):D966-74 [pubmed]
* HPO browser
+* HP in OntoBee
+* HP in OLSVis
The OWL axioms for HP are in the +src/ontology/hp +directory on this site.
+The structure is analogous to that of the [MP].
+The OWL axiomatization is updated frequently to stay in sync with +changes in the MP
+The edit file is currently:
+* http://purl.obolibrary.org/obo/hp/hp-equivalence-axioms-subq-ubr.owl
Edit this in protege.
+ + + + + + +* The Mammalian Phenotype Ontology: enabling robust annotation and comparative analysis Smith CL, Eppig JT
+* MP browser at MGI
+* MP in OntoBee
+* MP in OLSVis
The OWL axioms for MP are in the +src/ontology/mp +directory on this site.
+* http://purl.obolibrary.org/obo/mp.owl - direct conversion of MGI-supplied obo file
+* http://purl.obolibrary.org/obo/mp/mp-importer.owl - imports additional axioms, including the following ones below:
+* http://purl.obolibrary.org/obo/mp.owl
+* http://purl.obolibrary.org/obo/upheno/imports/chebi_import.owl
+* http://purl.obolibrary.org/obo/upheno/imports/uberon_import.owl
+* http://purl.obolibrary.org/obo/upheno/imports/pato_import.owl
+* http://purl.obolibrary.org/obo/upheno/imports/go_import.owl
+* http://purl.obolibrary.org/obo/upheno/imports/mpath_import.owl
+* http://purl.obolibrary.org/obo/mp/mp-equivalence-axioms-subq-ubr.owl
The OWL axiomatization is updated frequently to stay in sync with +changes in the MP
+The edit file is currently:
+* http://purl.obolibrary.org/obo/mp/mp-equivalence-axioms-edit.owl
Edit this in protege.
+The file mp-equivalence-axioms.obo is DEPRECATED!
+* http://mp.termgenie.org/
+* http://mp.termgenie.org/TermGenieFreeForm
* Schindelman, Gary, et al. Worm Phenotype Ontology: integrating phenotype data within and beyond the C. elegans community. BMC bioinformatics 12.1 (2011): 32.
+* WBPhenotype in OntoBee
+* WBPhenotype in OLSVis
The OWL axioms for WBPhenotype are in the +src/ontology/wbphenotype +directory on this site.
+* http://purl.obolibrary.org/obo/wbphenotype.owl - direct conversion of WormBase-supplied obo file
+* http://purl.obolibrary.org/obo/wbphenotype/wbphenotype-importer.owl - imports additional axioms.
The structure roughly follows that of the [MP]. The worm anatomy is +used.
+Currently the source is wbphenotype/wbphenotype-equivalence-axioms.obo, +the OWL is generated from here. We are considering switching this +around, so the OWL is edited, using Protege.
+ + + + + + +This page describes the generation of the zebrafish phenotype ontology
+The ZP differs considerably from [HP], [MP] and others. ZFIN do not +annotate with a pre-composed phenotype ontology - all annotations +compose phenotypes on-the-fly using a combination of PATO, ZFA, GO and +other ontologies.
+We use these combinations to construct ZP on the fly, by naming each +distinct combination, assigning it an ID, and placing it in the +hierarchy.
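The "name each distinct combination and assign it an ID" step can be sketched as a small registry. This is a minimal illustration, not the actual ZP build: the class name, the sequential ID format and the entity/quality identifiers below are placeholders chosen for the example.

```python
class PhenotypeRegistry:
    """Hands out one stable ID per distinct (entity, quality) combination."""

    def __init__(self, prefix="ZP"):
        self.prefix = prefix
        self._ids = {}

    def get_id(self, entity, quality):
        key = (entity, quality)
        if key not in self._ids:
            # Assumed ID scheme: prefix plus a zero-padded running number.
            self._ids[key] = f"{self.prefix}:{len(self._ids) + 1:07d}"
        return self._ids[key]


registry = PhenotypeRegistry()
# Placeholder identifiers standing in for ZFA/GO entities and PATO qualities.
first = registry.get_id("ENTITY:1", "QUALITY:1")
second = registry.get_id("ENTITY:1", "QUALITY:2")
again = registry.get_id("ENTITY:1", "QUALITY:1")
```

The point of keying on the combination is that repeated annotations with the same entity and quality resolve to the same class, while any new combination gets a new one.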
+The process is described here:
+The OWL formalism for ZFIN annotations is described here:
+The java implementation is here:
+The OWL axioms for ZP are in +zp.owl +that is built on our Hudson server.
+ + + + + + +Here, we discuss the core concepts of the computational phenotype model underpinning the uPheno effort.
+"Characteristics" or "qualities" refer to an inherent or distinguishing characteristic or attribute of something or someone. +It represents a feature that defines the nature of an object, organism, or entity and can be used to describe, compare, and categorize different things. +Characteristics can be either qualitative (such as color, texture, or taste) or quantitative (such as height, weight, or age).
+The Phenotype And Trait Ontology (PATO) is the reference ontology for general characteristics in the OBO world.
+Some of the most widely used characteristics can be seen in the following tables
+quality | +description | +example | +
---|---|---|
Length (PATO:0000122) | +A 1-D extent quality which is equal to the distance between two points. | ++ |
Mass (PATO:0000128) | +A physical quality that inheres in a bearer by virtue of the proportion of the bearer's amount of matter. | ++ |
Amount (PATO:0000070) | +The number of entities of a type that are part of the whole organism. | ++ |
Morphology (PATO:0000051) | +A quality of a single physical entity inhering in the bearer by virtue of the bearer's size or shape or structure. | ++ |
Note from the authors: The descriptions above have been taken from PATO, but they are not very user friendly.
+ +Characteristics such as the ones above can be used to describe a variety of entities such as biological, environmental and social. +We are specifically concerned with biological traits, which are characteristics that refer to an inherent characteristic of a biological entity, such as an organ (the heart), a process (cell division), or a chemical entity (lysine) in the blood.
+The Ontology of Biological Attributes (OBA) is the reference ontology for biological characteristics in the OBO world. +There are a few other ontologies that describe biological traits, such as the Vertebrate Trait Ontology (VT) and the Ascomycete Phenotype Ontology (APO), but these are more species specific, and, more importantly, are not integrated in the wider EQ modelling framework.
+Property | +Example term | +Definition | +
---|---|---|
Length | +OBA:VT0002544 | +The length of a digit. | +
Mass | +OBA:VT0001259 | +The mass of a multicellular organism. | +
Level | +OBA:2020005 | +The amount of lysine in blood. | +
Morphology | +OBA:VT0005406 | +The size of a heart. | +
In biological contexts, the term "bearer" refers to the entity that possesses or carries a particular characteristic or quality. +The bearer can be any biological entity, such as an organism, an organ, a cell, or even a molecular structure, that exhibits a specific trait or feature. +Some examples:
+In each example, the "bearer" is the entity that has, carries, or exhibits a particular biological characteristic. This concept is fundamental in biology and bioinformatics for linking specific traits, qualities, or features to the entities that possess them, thereby enabling a clearer understanding and categorization of biological diversity and functions.
+ +A phenotypic change refers to some deviation from reference morphology, physiology, or behavior. +This is the most widely used, and most complicated category of phenotype terms for data specialists to understand.
+Conceptually, a phenotypic change comprises:
+Biological attributes such as blood lysine amount
(OBA:2020005) have been discussed earlier in this document.
+The most widely used change modifier in practice is abnormal
(PATO:0000460).
+This modifier signifies that the phenotypic change term describes a deviation that is abnormal, such as "Hyperlysinemia" (HP:0002161), which describes an increased concentration of lysine in the blood.
+Other modifiers include normal
(PATO:0000461), which describes a change within the normal range (sometimes interpreted as "no change").
+There are also directional modifiers like increased
(PATO:0040043) or decreased
(PATO:0040042). In practice, most of our "characteristic" terms have specialised directional variants such as decreased amount
(PATO:0001997) which can be used to describe phenotypes.
Comparators are the most confusing aspects of phenotypic change.
+The first question someone has to ask when they see a concept describing a change like increased blood lysine levels
is "compared to what?".
+Depending on biological context, the assumed comparators vary widely.
+For example, in clinical phenotyping, it is mostly assumed that
+a phenotypic feature corresponds to a deviation from the normal range, see HPO docs.
+However, it is just as easily imaginable that HPO terms are used to describe change compared to a previous state of the same individual (increased tumor size compared to last time we checked).
+In research settings such as GWAS study annotations, HPO terms are used to annotate variants where a statistically significant change was observed compared to the general population.
+The same is true for many model organism phenotyping efforts such as MGI, where the situation is further complicated by the fact that the comparator is not "the general population", but a control group. In summary, comparators can be:
And the compared characteristics could be
+No matter how much we want it - concepts describing phenotypic change will be used in many creative ways, and unfortunately, once the data hits your data analysis pipeline, you will likely not know for sure the nature of the comparator. +Where you can, you should try to figure it out from the metadata.
+This sounds like bad news. However, keep one thing in mind: +Phenotype associations (to anything, including genes) are rarely strictly causal. +Even if a change is observed "compared to some non-representative control" there is likely to be some signal useful for downstream inference - somehow, the "gene has something to do with the phenotype".
+In the clinical domain, many ontologies exist that define concepts that are very strongly related to our notion of "phenotype". +In SNOMED, for example, "clinical findings" are defined as normal/abnormal observations, judgments, or assessments of patients (e.g. Abnormal urinalysis (finding)). +For most analytic purposes, we think of SNOMED's (and other medical terminologies') notion of clinical finding as something analogous to our notion of "phenotype" (and their "observable entity" as a trait/biological attribute). +However, if one gets into the weeds, many discrepancies in judgement can be observed, in particular when it comes to the separation from disease.
+"Phenotype" is typically used in its "singular" form to describe the set of all observable characteristics of a subject. +However, because we have over time gotten used to talking about "cardiovascular phenotype" and "increased blood glucose level", we have started using the plural form more, i.e. "phenotypes". +We now tend to use the term "phenotypic profile" to describe the set of phenotypes that an organism exhibits at some point in time.
+"Phenotypic feature" is a commonly used term that refers to the same idea, but mostly in the context of disease to describe an observable characteristic commonly associated with a disease.
+"Phenotypic abnormality" is the formal term to describe a concept in the HPO, and is sometimes used to refer to the same idea in HPO-related papers. +There is a bit of an assumption here, compared to the more general concepts described in this section, which is that the term should refer to a "deviation from the normal range", but, as described in the section of "implicit comparators", this assumption does not always hold in practice.
+"Phenotypic change" is a recent invention by David Osumi-Sutherland in an attempt to subsume the ideas above, in particular to explicitly step back from the concept of "deviation from normal" to "statistically significant deviation" (which includes the normal range).
+The Unified Phenotype Ontology (uPheno) is the reference ontology for biological abnormalities in the OBO world. +There are many species-specific ontologies in the OBO world, such as the Mammalian Phenotype Ontology (MP), the Human Phenotype Ontology (HPO) and the Drosophila Phenotype Ontology (DPO), see here.
+Property | +Example term | +Definition | +
---|---|---|
Length | +UPHENO:0072215 | +Increased length of the digit. | +
Mass | +UPHENO:0054299 | +Decreased multicellular organism mass. | +
Level | +UPHENO:0034327 | +Decreased level of lysine in blood. | +
Morphology | +UPHENO:0001471 | +Increased size of the heart. | +
Diseases are among the most important concepts in the phenotype data space. Phenotypes relate +One big source of confusion in our community is the separation of "phenotypic features" or changes from diseases. +The HPO docs provide an explanation geared at clinicians to help them distinguish between the two. +The quest to develop an operational definition is still ongoing, but for now, we recommend going with the following basic assumptions:
+In biological data curation, it’s essential to differentiate between traits (observable characteristics such as "blood glucose level") and measurements (a process to observe such characteristics, e.g. "blood glucose level assay", "BMI").
+Just from the term itself this is often difficult.
+"Blood glucose level" can refer to both a measurement and a trait when taken out of context, but the ontologies they appear in should differentiate cleanly between the two.
+Here are some ways to distinguish them:
+- traits are
+  - observable characteristics of an organism
+  - can be qualitative ("red eye colour") or quantitative ("35 cm tail length")
+- measurements are
+  - activities performed by an agent (such as a researcher)
+  - involve the quantification or qualification of a specific trait
+  - correspond to measurement instruments / techniques (such as assays, BMIs)
+In practice, it is true that a lot of data records a wild mix of the two. +It is the job of (semantic) data modeling specialists to clearly distinguish the two when integrating annotated data from sources with divergent curation practices.
+Characteristics (A) and bearers of characteristics (B) are the core constituents of traits/biological attributes (C). Phenotypes comprise trait terms (C) combined with a modifier (D). Species-specific phenotypes (F), including phenotypic abnormalities defined in the Human Phenotype Ontology (HPO), are features of diseases (G). Measurements (H), such as assays, quantify or qualify (measure) traits (C).
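The relationships in this summary can be written down as a small data model. The following is a sketch for orientation only, assuming plain text labels instead of ontology IDs; the class and field names are hypothetical and do not correspond to any uPheno tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trait:
    """A trait/biological attribute (C): a characteristic (A) of a bearer (B)."""
    characteristic: str
    bearer: str

    def label(self):
        return f"{self.characteristic} of {self.bearer}"


@dataclass(frozen=True)
class Phenotype:
    """A phenotype: a trait (C) combined with a modifier (D)."""
    trait: Trait
    modifier: str

    def label(self):
        return f"{self.modifier} {self.trait.label()}"


lysine_level = Trait(characteristic="amount", bearer="lysine in blood")
low_lysine = Phenotype(trait=lysine_level, modifier="decreased")
phenotype_label = low_lysine.label()
```

Keeping the modifier outside the trait mirrors the split described above: the same trait can be reused with different modifiers such as "increased", "decreased" or "abnormal".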
+ + + + + + +Before we get started, let's remind ourselves of the basic structure of phenotype data.
+ +Characteristics (A) and bearers of characteristics (B) are the core constituents of traits/biological attributes (C). Phenotypes comprise trait terms (C) combined with a modifier (D). Species-specific phenotypes (F), including phenotypic abnormalities defined in the Human Phenotype Ontology (HPO), are features of diseases (G). Measurements (H), such as assays, quantify or qualify (measure) traits (C).
+Integrating all kinds of phenotype data into the "uPheno framework" is a complex process which we will break down in the following.
+Imports directory:
+* http://purl.obolibrary.org/obo/upheno/imports/
Currently the imports include:
+* imports/chebi_import.owl
+* imports/doid_import.owl
+* imports/go_import.owl
+* imports/mpath_import.owl
+* imports/pato_import.owl
+* imports/pr_import.owl
+* imports/uberon_import.owl
+* imports/wbbt_import.owl
To avoid multiple duplicate classes for heart, lung, skin etc., we map all +classes to [Uberon] where this is applicable. For more divergent species +such as fly and C. elegans we use the appropriate species-specific +ontology.
+Currently there are a small number of highly specific classes in FMA +that are being used and have no corresponding class in Uberon
+We use the OWLAPI SyntacticLocalityModularityExtractor, via [OWLTools]. +See the http://purl.obolibrary.org/obo/upheno/Makefile for details
+ + + + + + +The current design patterns are such that the abnormal qualifier is only +added when the quality class in the definition is neutral.
+However, we still need to be able to infer
+* Hypoplasia of right ventricle SubClassOf Abnormality of right ventricle
Because the latter class definition includes qualifier some abnormal, +the SubClassOf axiom will not be entailed unless the qualifier is +explicitly stated or inferred
+We achieve this by including an axiom to PATO such that decreased sizes +etc are inferred to be qualifier some abnormal.
+We do this with an axiom in imports/extra.owl
+* 'deviation(from normal)' SubClassOf qualifier some abnormal
Anything under 'increased', 'decreased' etc in PATO is pre-reasoned in +PATO to be here.
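The effect of this axiom can be mimicked with a toy lookup. The sketch below assumes a hand-written, single-parent class hierarchy that only loosely resembles PATO; it is meant to show why classifying a quality under 'deviation(from normal)' is enough to give it the abnormal qualifier, not how an OWL reasoner works.

```python
ABNORMAL_ROOT = "deviation(from normal)"

# Toy superclass links (child -> parent); the real PATO hierarchy differs.
SUPERCLASS = {
    "decreased size": "decreased",
    "increased size": "increased",
    "decreased": ABNORMAL_ROOT,
    "increased": ABNORMAL_ROOT,
    "size": "quality",
}


def ancestors(cls):
    """Walk the superclass links upwards and collect every ancestor."""
    found = []
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        found.append(cls)
    return found


def has_abnormal_qualifier(quality):
    """A quality counts as abnormal if it sits under the deviation class."""
    return quality == ABNORMAL_ROOT or ABNORMAL_ROOT in ancestors(quality)


decreased_is_abnormal = has_abnormal_qualifier("decreased size")
neutral_is_abnormal = has_abnormal_qualifier("size")
```

In the sketch, "decreased size" picks up the abnormal qualifier through its ancestors, while the neutral "size" does not, which matches the pattern where the qualifier only needs to be stated explicitly for neutral qualities.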
+See the following explanation:
+http://phenotype-ontologies.googlecode.com/svn/trunk/doc/images/has-qualifier-inference.png
+For this strategy to work it requires the PATO classes themselves to be +classified under deviation from normal. This may not always be the case
+Do not be distracted by the fact the has-qualifier relation is named +has-component at the moment
+https://code.google.com/p/phenotype-ontologies/issues/detail?id=45
+Much has been written on the subject of representing absence. Before +diving into the logical issues it is worth examining patterns in +existing phenotype ontologies to understand what user expectations may +typically be for absence.
+* Absence_Phenotypes_in_OWL (Phenoscape Wiki)
+* (outdated) material on the old PATO wiki.
It is not uncommon to see patterns such as
+From a strict logical perspective, this is inverted. "absent incisors" +surely means "absence of all incisors", or put another way "the animal +has no incisors". Yet it would be possible to have an animal with +*absent* lower incisors and *present* upper incisors, yielding what +seems a contradiction (because the subClass axiom would say this +partial-incisor animal lacked all incisors).
+If the ontology were in fact truly modeling "absence of *all* S" then +it would lead to a curious ontology structure, with the typical tree +structure of the anatomy ontology representing S inverted into a +polyhierarchical fan in the absent-S ontology.
+From this it can be cautiously inferred that the intent of the phenotype +ontology curator and user is in fact to model "absence of *some* S" +rather than "absence of *all* S". This is not necessarily a universal +rule, and the intent may vary depending on whether we are talking about +a serially repeated structure or one that typically occurs in isolation. +The intent may also be to communicate that a *significant number* of S +is missing.
+It is also not uncommon to see patterns such as:
+Again, from a strict logical perspective this is false. If the spleen is +absent then what does the "morphology" of the parent refer to?
+However, this inference is clearly a desirable one from the point of +view of the phenotype ontology editors and users, as it is common in +ontologies for a variety of structures. For example:
+And:
+These patterns can be formally defended on developmental biology +grounds. "absence" here is _not_ equivalent to logical absence. It +refers specifically to developmental absence.
+Furthermore, strict logical absence leads to undesirable inferences. It +would be odd to include a nematode worm as having the phenotype "spleen +absent", because worms have not evolved spleens. But the logical +description of not having a spleen as part fits a worm.
+Similarly, if the strict cardinality interpretation were intended, we +would expect to see:
+i.e. if you're missing your entire hindlegs, you're *necessarily* +missing your femurs. But it must be emphasized that this is *not* +how phenotype ontologies are classified. This goes for a wide range of +structures and other relationship types. In MP, "absent limb buds" are +*not* classified under "absent limbs", even though it is impossible +for a mammal to have limbs without having had limb buds.
+The existing treatment of absence can be formally defended +morphologically by conceiving of a morphological value space, with +"large" at one end and "small" at the other. As we get continuously +smaller, there may come an arbitrary point whereby we say "surely this +is no longer a limb" (and of course, we are not talking about a pure +geometrical size transformation here - as a limb reaches extreme edges +of a size range various other morphological changes necessarily happen). +But this cutoff is arguably arbitrary, and the resulting discontinuity +causes problems. It is simpler to treat absence as being one end of a +size scale.
+This is barely touching the subject, and is intended to illustrate that things may be more subtle than naively treating words like "absent" as precisely equivalent to cardinality=0. An understanding of the medical, developmental and evolutionary contexts is absolutely required, together with an understanding of the entailments of different logical formulations.
+Even though existing phenotype ontologies may not be conceived of formally, it is implicit that they do not model absence as being equivalent to cardinality=0 / not(has_part), because the structure of these ontologies would otherwise look radically different.
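The nematode example above can be made concrete with a small sketch. The toy part sets and both function names are assumptions made purely for illustration; they are not taken from any phenotype ontology or reasoner. The sketch only shows that a strict `not(has_part)` test cannot distinguish an organism that failed to develop a spleen from one whose species never has one, whereas a test relative to species-typical anatomy can.

```python
# Toy model: each organism is reduced to the set of parts it has.
# All names and data here are illustrative assumptions.
SPECIES_TYPICAL_PARTS = {
    "mouse": {"spleen", "heart", "femur"},
    "nematode": {"pharynx", "intestine"},
}


def logically_absent(parts, structure):
    """Strict reading: cardinality = 0, i.e. not(has_part structure)."""
    return structure not in parts


def developmentally_absent(species, parts, structure):
    """Absence relative to what the species normally develops."""
    expected = SPECIES_TYPICAL_PARTS[species]
    return structure in expected and structure not in parts


asplenic_mouse = {"heart", "femur"}
wild_type_worm = {"pharynx", "intestine"}

# The strict reading classifies both organisms as "spleen absent".
strict = (
    logically_absent(asplenic_mouse, "spleen"),
    logically_absent(wild_type_worm, "spleen"),
)

# The developmental reading only flags the mouse.
developmental = (
    developmentally_absent("mouse", asplenic_mouse, "spleen"),
    developmentally_absent("nematode", wild_type_worm, "spleen"),
)
```

Under these assumptions the strict reading flags the wild-type worm as well, which is exactly the undesirable inference described above.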
+Link to Jim Balhoff's PhenoDay paper and discussion
+Here's the link: http://phenoday2014.bio-lark.org/pdf/11.pdf
+ + + + + + +The goals of this document are:
+Category | +Example datasets | +Example phenotype | +
---|---|---|
Gene to phenotype associations | +Online Mendelian Inheritance in Man (OMIM), Human Phenotype Ontology (HPO), Gene Ontology (GO) | +Achondroplasia (associated with FGFR3 gene mutations) | +
Gene to disease associations | +The Cancer Genome Atlas (TCGA), Online Mendelian Inheritance in Man (OMIM), GWAS Catalog | +Breast invasive carcinoma (associated with BRCA1/BRCA2 mutations) | +
Phenotype-phenotype semantic similarity | +Human Phenotype Ontology (HPO), Unified Medical Language System (UMLS), Disease Ontology (DO) | +Cardiac abnormalities (semantic similarity with congenital heart defects) | +
Quantified trait data (QTL etc) | +NHGRI-EBI GWAS Catalog, Genotype-Tissue Expression (GTEx), The Human Protein Atlas | +Height (quantified trait associated with SNPs in genomic regions) | +
Electronic health records | +Medical Information Mart for Intensive Care III (MIMIC-III), UK Biobank, IBM Watson Health | +Acute kidney injury (recorded diagnosis during ICU stay) | +
Epidemiological datasets | +Framingham Heart Study, National Health and Nutrition Examination Survey (NHANES), Global Burden of Disease Study (GBD) | +Cardiovascular disease (epidemiological study of risk factors and disease incidence) | +
Clinical trial datasets | +ClinicalTrials.gov, European Union Clinical Trials Register (EUCTR), International Clinical Trials Registry Platform (ICTRP) | +Treatment response (clinical trial data on efficacy and safety outcomes) | +
Environmental exposure datasets | +Environmental Protection Agency Air Quality System (EPA AQS), Global Historical Climatology Network (GHCN), National Centers for Environmental Information Climate Data Online (NCEI CDO) | +Respiratory diseases (association with air pollutant exposure) | +
Population surveys e.g., UK Biobank | +UK Biobank, National Health Interview Survey (NHIS), National Health and Nutrition Examination Survey (NHANES) | +Chronic diseases (population-based study on disease prevalence and risk factors) | +
Behavioral observation datasets | +National Survey on Drug Use and Health (NSDUH), Add Health, British Cohort Study (BCS) | +Substance abuse disorders (survey data on drug consumption and addiction) | +
Phenotype data comes in many different shapes and forms. In the following, we will describe some of the most common features of such data:
+ + +Pre-coordinated phenotype data is popular in the clinical domain, where a lot of observations are taken by a clinician and recorded as "phenotypic abnormalities" with the goal of eventual diagnosis.
+Phenopackets such as the one below are an emerging standard to capture and share disease and phenotype information. Phenotypic features in particular are captured as so-called "pre-coordinated phenotype terms" such as "Attenuation of retinal blood vessels" (HP:0007843). "Pre-coordinated" in this context means that the various aspects of the phenotype term, such as the bearer ("retinal blood vessels"), the characteristic ("Attenuation", or "thinning/narrowing"), and the modifier (in the case of HPO terms, simply abnormal), are combined ("coordinated") into a single term.
+{
+"id": "PMID:23559858-Ajmal-2013-BBS1-IV-5/family_A",
+"subject": {
+ "id": "IV-5/family A",
+ "timeAtLastEncounter": {
+ "age": {
+ "iso8601duration": "P26Y"
+ }
+ },
+ "sex": "MALE",
+ "taxonomy": {
+ "id": "NCBITaxon:9606",
+ "label": "Homo sapiens"
+ }
+},
+"phenotypicFeatures": [
+ {
+ "type": {
+ "id": "HP:0007843",
+ "label": "Attenuation of retinal blood vessels"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ },
+ {
+ "type": {
+ "id": "HP:0001513",
+ "label": "Obesity"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ },
+ {
+ "type": {
+ "id": "HP:0000608",
+ "label": "Macular degeneration"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ },
+ {
+ "type": {
+ "id": "HP:0000486",
+ "label": "Strabismus"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ },
+ {
+ "type": {
+ "id": "HP:0001328",
+ "label": "Specific learning disability"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ },
+ {
+ "type": {
+ "id": "HP:0000510",
+ "label": "Rod-cone dystrophy"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ },
+ {
+ "type": {
+ "id": "HP:0001263",
+ "label": "Global developmental delay"
+ },
+ "evidence": [
+ {
+ "evidenceCode": {
+ "id": "ECO:0000033",
+ "label": "author statement supported by traceable reference"
+ },
+ "reference": {
+ "id": "PMID:23559858",
+ "description": "A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T."
+ }
+ }
+ ]
+ }
+],
+"interpretations": [
+ {
+ "id": "PMID:23559858-Ajmal-2013-BBS1-IV-5/family_A",
+ "progressStatus": "SOLVED",
+ "diagnosis": {
+ "disease": {
+ "id": "OMIM:209900",
+ "label": "BARDET-BIEDL SYNDROME 1; BBS1"
+ },
+ "genomicInterpretations": [
+ {
+ "subjectOrBiosampleId": "IV-5/family A",
+ "interpretationStatus": "CAUSATIVE",
+ "variantInterpretation": {
+ "variationDescriptor": {
+ "id": "clinvar:1324292",
+ "geneContext": {
+ "valueId": "ENSG00000174483",
+ "symbol": "BBS1",
+ "alternateIds": [
+ "HGNC:966",
+ "entrez:582",
+ "ensembl:ENSG00000174483",
+ "symbol:BBS1"
+ ]
+ },
+ "vcfRecord": {
+ "genomeAssembly": "GRCh37",
+ "chrom": "11",
+ "pos": "66278178",
+ "ref": "G",
+ "alt": "T"
+ },
+ "allelicState": {
+ "id": "GENO:0000136",
+ "label": "homozygous"
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+],
+"metaData": {
+ "created": "1970-01-01T00:00:00Z",
+ "submittedBy": "HPO:probinson",
+ "resources": [
+ {
+ "id": "hp",
+ "name": "human phenotype ontology",
+ "url": "http://purl.obolibrary.org/obo/hp.owl",
+ "version": "2018-03-08",
+ "namespacePrefix": "HP",
+ "iriPrefix": "http://purl.obolibrary.org/obo/HP_"
+ },
+ {
+ "id": "pato",
+ "name": "Phenotype And Trait Ontology",
+ "url": "http://purl.obolibrary.org/obo/pato.owl",
+ "version": "2018-03-28",
+ "namespacePrefix": "PATO",
+ "iriPrefix": "http://purl.obolibrary.org/obo/PATO_"
+ },
+ {
+ "id": "geno",
+ "name": "Genotype Ontology",
+ "url": "http://purl.obolibrary.org/obo/geno.owl",
+ "version": "19-03-2018",
+ "namespacePrefix": "GENO",
+ "iriPrefix": "http://purl.obolibrary.org/obo/GENO_"
+ },
+ {
+ "id": "ncbitaxon",
+ "name": "NCBI organismal classification",
+ "url": "http://purl.obolibrary.org/obo/ncbitaxon.owl",
+ "version": "2018-03-02",
+ "namespacePrefix": "NCBITaxon",
+ "iriPrefix": "http://purl.obolibrary.org/obo/NCBITaxon_"
+ },
+ {
+ "id": "eco",
+ "name": "Evidence and Conclusion Ontology",
+ "url": "http://purl.obolibrary.org/obo/eco.owl",
+ "version": "2018-11-10",
+ "namespacePrefix": "ECO",
+ "iriPrefix": "http://purl.obolibrary.org/obo/ECO_"
+ },
+ {
+ "id": "omim",
+ "name": "Online Mendelian Inheritance in Man",
+ "url": "https://www.omim.org",
+ "version": "2018-03-08",
+ "namespacePrefix": "OMIM",
+ "iriPrefix": "https://omim.org/entry/"
+ },
+ {
+ "id": "clinvar",
+ "name": "Clinical Variation",
+ "url": "https://www.ncbi.nlm.nih.gov/clinvar/",
+ "version": "2023-04-06",
+ "namespacePrefix": "clinvar",
+ "iriPrefix": "https://www.ncbi.nlm.nih.gov/clinvar/variation/"
+ }
+ ],
+ "phenopacketSchemaVersion": "2.0.0"
+}
+}
+
Apart from clinical diagnostics, pre-coordinated phenotype terms are used in many other contexts such as model organism research (e.g. IMPC) or the curation of Genome Wide Association Studies.
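As a minimal sketch of how such pre-coordinated terms might be read back out of a phenopacket, the snippet below parses a trimmed-down copy of the example above. The field names (`phenotypicFeatures`, `type`, `id`, `label`) are those shown in the example; the helper name `phenotype_terms` is a hypothetical one introduced here.

```python
import json

# A trimmed-down phenopacket using the same field names as the example above.
PHENOPACKET = json.loads("""
{
  "id": "PMID:23559858-Ajmal-2013-BBS1-IV-5/family_A",
  "phenotypicFeatures": [
    {"type": {"id": "HP:0007843", "label": "Attenuation of retinal blood vessels"}},
    {"type": {"id": "HP:0001513", "label": "Obesity"}}
  ]
}
""")


def phenotype_terms(packet):
    """Return (id, label) pairs for every pre-coordinated phenotype term."""
    return [
        (feature["type"]["id"], feature["type"]["label"])
        for feature in packet.get("phenotypicFeatures", [])
    ]


terms = phenotype_terms(PHENOPACKET)
```

Each entry is a single ontology term; bearer, characteristic and modifier are not available as separate fields, which is what distinguishes this style from the post-coordinated patterns.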
+ +Post-coordinated phenotype curation simply means that the different constituents of phenotype (characteristic, bearer, modifier etc) are captured individually.
+This has certain advantages.
+For example, the phenotype space is enormous, as you can measure variations in many observable characteristics, from chemical entities present in the blood, to the microbiome, to a host of morphological and developmental abnormalities. Instead of having individual (controlled vocabulary) terms for increased level of X
, decreased level X
, abnormal level of X
, increased level of X in blood
for thousands of chemical compounds synthesized by the human body, you just have "increased level", "blood" and all the chemical compounds.
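The saving can be illustrated with a small counting sketch. The three short vocabularies below are assumptions chosen for illustration only; real ontologies contain thousands of chemical compounds.

```python
from itertools import product

# Illustrative vocabularies; the entries are assumptions, not ontology content.
modifiers = ["increased level of", "decreased level of", "abnormal level of"]
chemicals = ["lysine", "glucose", "cholesterol"]
locations = [None, "blood"]  # None means "no location specified"


def compose(modifier, chemical, location):
    """Build the label a pre-coordinated term would need."""
    label = f"{modifier} {chemical}"
    return f"{label} in {location}" if location else label


# Pre-coordination needs one term per combination.
pre_coordinated = [
    compose(m, c, loc) for m, c, loc in product(modifiers, chemicals, locations)
]

# Post-coordination only needs the building blocks themselves.
post_coordinated_vocab = (
    len(modifiers) + len(chemicals) + len([loc for loc in locations if loc])
)
```

With these toy lists, pre-coordination requires 18 terms while post-coordination gets by with 7 building blocks; the gap grows multiplicatively as each vocabulary grows.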
There are at least three flavours of post-coordinated phenotype curation prevalent in the biomedical domain, four if you count quantified phenotypes:
+ + +Trait + modifier pattern is used for example by databases such as the Saccharomyces Genome Database (SGD). Here are some examples:
+dateAssigned | +evidence/publicationId | +objectId | +phenotypeStatement | +phenotypeTermIdentifiers/0/termId | +phenotypeTermIdentifiers/1/termId | +conditionRelations/0/conditions/0/chemicalOntologyId | +conditionRelations/0/conditions/0/conditionClassId | +
---|---|---|---|---|---|---|---|
2010-07-08T00:07:00-00:00 | +PMID:1406694 | +SGD:S000003901 | +abnormal RNA accumulation | +APO:0000002 | +APO:0000224 | ++ | + |
2006-05-05T00:05:00-00:00 | +PMID:785224 | +SGD:S000000854 | +decreased resistance to chemicals | +APO:0000003 | +APO:0000087 | +CHEBI:78661 | +ZECO:0000111 | +
2010-07-07T00:07:00-00:00 | +PMID:10545447 | +SGD:S000000969 | +decreased cell size | +APO:0000003 | +APO:0000052 | ++ | + |
APO:0000002
(abnormal) and APO:0000003
(decreased) are modifiers.APO:0000087
(resistance to chemicals), APO:0000224
(RNA accumulation), APO:0000052
(cell size) are biological attributes/traits.CHEBI:78661
is recorded as an experimental condition, but should probably be interpreted as part of the bearer expression.Data was obtained from the Alliance of Genome Resources on the 30.03.2023 and simplified for illustration.
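A minimal sketch of how a consumer might recombine the two term identifiers of a row into a readable statement. The identifier-to-label mappings are copied from the rows above; the function names are hypothetical, and the sketch assumes each row carries exactly one modifier and one trait.

```python
# Labels taken from the example rows above.
MODIFIERS = {"APO:0000002": "abnormal", "APO:0000003": "decreased"}
TRAITS = {
    "APO:0000087": "resistance to chemicals",
    "APO:0000224": "RNA accumulation",
    "APO:0000052": "cell size",
}


def split_annotation(term_ids):
    """Split a row's term identifiers into (modifier, trait) labels."""
    modifier = next(MODIFIERS[t] for t in term_ids if t in MODIFIERS)
    trait = next(TRAITS[t] for t in term_ids if t in TRAITS)
    return modifier, trait


def phenotype_statement(term_ids):
    """Recombine modifier and trait into a human-readable statement."""
    modifier, trait = split_annotation(term_ids)
    return f"{modifier} {trait}"


statement = phenotype_statement(["APO:0000003", "APO:0000052"])
```

The experimental condition (for example a ChEBI identifier) is deliberately left out here, since how it should be folded into the bearer expression is an open interpretation question.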
+The bearer-only pattern is used by many databases, such as FlyBase. In the data, we only find references to bearers, such as anatomical entities or biological processes. Phenotypic modifiers (abnormal, morphology, changed) are not stated explicitly; they are implicit in the definition of the dataset.
+dateAssigned | +evidence/crossReference/id | +evidence/publicationId | +objectId | +phenotypeStatement | +phenotypeTermIdentifiers/0/termId | +
---|---|---|---|---|---|
2024-01-05T11:54:24-05:00 | +FB:FBrf0052655 | +PMID:2385293 | +FB:FBal0016988 | +embryonic telson | +FBbt:00000184 | +
2024-01-05T11:54:24-05:00 | +FB:FBrf0058077 | +PMID:8223248 | +FB:FBal0001571 | +larva | +FBbt:00001727 | +
FBbt:00000184
(embryonic telson) and FBbt:00001727
(larva) are bearer terms.Data was obtained from the Alliance of Genome Resources on the 30.03.2023 and simplified for illustration.
+The most complex pattern for phenotype descriptions, which essentially decomposes the entire phenotype expression into atomic constituents, can be found, for example, in the Zebrafish Information Network (ZFIN).
+Examples:
+Fish ID | +Affected Structure or Process 1 subterm ID | +Affected Structure or Process 1 subterm Name | +Post-composed Relationship ID | +Post-composed Relationship Name | +Affected Structure or Process 1 superterm ID | +Affected Structure or Process 1 superterm Name | +Phenotype Keyword ID | +Phenotype Keyword Name | +Phenotype Tag | +Affected Structure or Process 2 subterm ID | +Affected Structure or Process 2 subterm name | +Post-composed Relationship (rel) ID | +Post-composed Relationship (rel) Name | +Affected Structure or Process 2 superterm ID | +Affected Structure or Process 2 superterm name | +Publication ID | +
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ZDB-FISH-150901-29105 | +ZFA:0009366 | +hair cell | +BFO:0000050 | +part_of | +ZFA:0000051 | +otic vesicle | +PATO:0000374 | +increased distance | +abnormal | +ZFA:0009366 | +hair cell | +BFO:0000050 | +part_of | +ZFA:0000051 | +otic vesicle | +ZDB-PUB-171025-12 | +
ZDB-FISH-150901-29105 | +ZFA:0009366 | +hair cell | +BFO:0000050 | +part_of | +ZFA:0000051 | +otic vesicle | +PATO:0000374 | +increased distance | +abnormal | +ZFA:0009366 | +hair cell | +BFO:0000050 | +part_of | +ZFA:0000051 | +otic vesicle | +ZDB-PUB-171025-12 | +
ZDB-FISH-150901-11537 | ++ | + | + | + | ZFA:0000051 | +otic vesicle | +PATO:0001905 | +has normal numbers of parts of type | +normal | +ZFA:0009366 | +hair cell | +BFO:0000050 | +part_of | +ZFA:0000051 | +otic vesicle | +ZDB-PUB-150318-1 | +
ZDB-FISH-150901-18770 | ++ | + | + | + | ZFA:0000119 | +retinal inner nuclear layer | +PATO:0002001 | +has fewer parts of type | +abnormal | +ZFA:0009315 | +horizontal cell | +BFO:0000050 | +part_of | +ZFA:0000119 | +retinal inner nuclear layer | +ZDB-PUB-130222-28 | +
ZDB-FISH-190806-7 | +BSPO:0000084 | +ventral region | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +PATO:0002001 | +has fewer parts of type | +abnormal | +ZFA:0009301 | +dopaminergic neuron | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +ZDB-PUB-190216-5 | +
ZDB-FISH-190807-7 | +BSPO:0000084 | +ventral region | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +PATO:0001905 | +has normal numbers of parts of type | +normal | +ZFA:0009301 | +dopaminergic neuron | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +ZDB-PUB-190216-5 | +
ZDB-FISH-190807-8 | +BSPO:0000084 | +ventral region | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +PATO:0002001 | +has fewer parts of type | +abnormal | +ZFA:0009301 | +dopaminergic neuron | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +ZDB-PUB-190216-5 | +
ZDB-FISH-150901-29105 | ++ | + | + | + | ZFA:0000101 | +diencephalon | +PATO:0001555 | +has number of | +normal | +ZFA:0009301 | +dopaminergic neuron | +BFO:0000050 | +part_of | +ZFA:0000101 | +diencephalon | +ZDB-PUB-161120-7 | +
ZDB-FISH-210421-9 | +ZFA:0009290 | +glutamatergic neuron | +BFO:0000050 | +part_of | +ZFA:0000008 | +brain | +PATO:0040043 | +increased proportionality to | +abnormal | +ZFA:0009276 | +GABAergic neuron | +BFO:0000050 | +part_of | +ZFA:0000008 | +brain | +ZDB-PUB-191011-2 | +
ZDB-FISH-210421-9 | +ZFA:0009290 | +glutamatergic neuron | +BFO:0000050 | +part_of | +ZFA:0000008 | +brain | +PATO:0040043 | +increased proportionality to | +abnormal | +ZFA:0009276 | +GABAergic neuron | +BFO:0000050 | +part_of | +ZFA:0000008 | +brain | +ZDB-PUB-191011-2 | +
Let's break down the second to last row:
+Data was obtained from ZFIN (Phenotype of Zebrafish Genes) on the 30.03.2023 and simplified for illustration.
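The second to last row of the table above can be modelled as a small data structure. This is a sketch only: the class names `Entity` and `EQStatement` and the rendering format are assumptions made for illustration and do not reflect ZFIN's own data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A (possibly post-composed) affected structure or process."""
    superterm: str
    subterm: Optional[str] = None
    relation: Optional[str] = None

    def render(self) -> str:
        if self.subterm is None:
            return self.superterm
        return f"{self.subterm} {self.relation} {self.superterm}"


@dataclass(frozen=True)
class EQStatement:
    """Entity 1 + quality (+ optional entity 2) + normal/abnormal tag."""
    entity_1: Entity
    quality: str
    tag: str
    entity_2: Optional[Entity] = None

    def render(self) -> str:
        text = f"[{self.tag}] ({self.entity_1.render()}) {self.quality}"
        if self.entity_2 is not None:
            text += f" ({self.entity_2.render()})"
        return text


# Values copied from the row for ZDB-FISH-210421-9.
row = EQStatement(
    entity_1=Entity("brain", "glutamatergic neuron", "part_of"),
    quality="increased proportionality to",
    tag="abnormal",
    entity_2=Entity("brain", "GABAergic neuron", "part_of"),
)
rendered = row.render()
```

Rows without a subterm, such as the `otic vesicle` rows, would simply use `Entity("otic vesicle")` for the first entity.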
+As one can see in the last example, bearers can be anything from simple atomic entities to arbitrarily complex compositions:
+lysine
)lysine
part_of blood
)lysine
part_of cell
part_of (muscle
part of heart
))lysine
part_of (cytoplasm
part_of (cell
part_of (muscle
part of heart
))))Phenotype data can be standardised to varying degrees. It is not uncommon for data to be completely unstandardised. +Unfortunately, only a fraction of the available data is actually annotated using terms from controlled phenotype ontologies. +Here are some of the more "typical" kinds of data on the standardised/non-standardised spectrum:
+Qualitative and quantitative phenotype data represent two fundamental ways of describing characteristics or traits in biology, each providing different types of information:
+Qualitative Phenotype Data:
+Quantitative Phenotype Data:
+Qualitative data is descriptive and categorical, while quantitative data is numerical and measurable. Both types are essential for a comprehensive understanding of phenotypic traits, each offering unique insights into biological variation and complexity.
+ + + + + + +Phenotype ontologies use different reference ontologies for their EQs. Everything in uPheno is integrated towards a common set of reference ontologies, in particular Uberon and CL. In order to integrate species-independent anatomy ontologies we employ the following workflow for phenotype ontologies:
+When two classes are merged in uPheno based on a cross-species mapping, we assert the most general common ancestor as parent.
+ + + + + + +Cross-species data in biomedical knowledge graphs (Kids First)
+Association prediction
+Welcome to the UPHENO documentation!
It is entirely empty at the moment so look no further!
You can find descriptions of the standard ontology engineering workflows here.
"},{"location":"about/","title":"About uPheno","text":"The uPheno project aims to unify the annotation of phenotypes across species in a manner analogous to unification of gene function annotation by the Gene Ontology. uPheno 2.0 builds on earlier efforts with a strategy that directly leverages the work of the phenotype ontology development community and incorporates phenotypes from a much wider range of species. We have organised a collaborative community effort, including representatives of all major model organism databases, to document and align formal design patterns for representing phenotypes and further develop reference ontologies, such as PATO, which are used in these patterns. A common development infrastructure makes it easy to use these design patterns to generate both species-specific ontologies and a species-independent layer that subsumes them. The resulting community-curated ontology for the representation and integration of phenotypes across species serves two general purposes: - Providing a community-developed framework for ontology editors to bootstrap, maintain and extend their phenotype ontologies in a scalable and standardised manner. - Facilitating the retrieval and comparative analysis of species-specific phenotypes through a deep layer of species-independent phenotypes.
Currently, the development of uPheno is organized by a group that meets biweekly. See the meetings page for more info, including how to participate.
"},{"location":"cite/","title":"How to cite uPheno","text":""},{"location":"cite/#papers","title":"Papers","text":""},{"location":"cite/#upheno-2","title":"uPheno 2","text":"EQ definitions are powerful tools for reconciling phenotypes across species and driving reasoning. However, they are not all that useful for many \"normal\" users of our ontologies.
We have developed a little workflow extension to take care of that.
src/ontology/mp-odk.yaml
): components:\n products:\n - filename: eq-relations.owl\n
http://purl.obolibrary.org/obo/YOURONTOLOGY/components/eq-relations.owl
. For example, for MP, the IRI is http://purl.obolibrary.org/obo/mp/components/eq-relations.owl
.sh run.sh make components/eq-relations.owl\n
This command will be run automatically during a release (prepare_release
).The custom uPheno Makefile is an extension to your normal custom Makefile (for example, hp.Makefile, mp.Makefile, etc), located in the src/ontology directory of your ODK set up.
To install it:
(1) Open your normal custom Makefile and add a line in the very end:
include pheno.Makefile\n
(2) Now download the custom Makefile:
https://raw.githubusercontent.com/obophenotype/upheno/master/src/ontology/config/pheno.Makefile
and save it in your src/ontology
directory.
Feel free to use, for example, wget:
cd src/ontology\nwget https://raw.githubusercontent.com/obophenotype/upheno/master/src/ontology/config/pheno.Makefile -O pheno.Makefile\n
From now on you can simply run
sh run.sh make update_pheno_makefile\n
whenever you wish to synchronise the Makefile with the uPheno repo.
(Note: it would probably be good to add a GitHub action that does that automatically.)
"},{"location":"howto/editors_workflow/","title":"Phenotype Ontology Editors' Workflow","text":""},{"location":"howto/editors_workflow/#useful-links","title":"Useful links","text":"brew install yamllint
error line too long
yaml syntax errors for dos-dp yaml templates. You can create a custom configuration file for yamllint in your home folder: touch ~/.config/yamllint/config\n
The content of the config file should look like this: # Custom configuration file for yamllint\n# It extends the default conf by adjusting some options.\nextends: default\nrules:\nline-length:\nmax: 80 # 80 chars should be enough, but don't fail if a line is longer\n# max: 140 # allow long lines\nlevel: warning\nallow-non-breakable-words: true\nallow-non-breakable-inline-mappings: true\n
The custom config should turn the error line too long
errors to warnings.pip install dosdp
Patternisation is the process of ensuring that all entity quality (EQ) descriptions from textual phenotype term definitions have a logical definition pattern. A pattern is a standard format for describing a phenotype that includes a quality and an entity. For example, \"increased body size\" is a pattern that includes the quality \"increased\" and the entity \"body size.\" The goal of patternisation is to make the EQ descriptions more uniform and machine-readable, which facilitates downstream analysis.
"},{"location":"howto/editors_workflow/#1-identify-a-group-of-related-phenotypes-from-diverse-organisms","title":"1. Identify a group of related phenotypes from diverse organisms","text":"The first step in the Phenotype Ontology Editors' Workflow is to identify a group of related phenotypes from diverse organisms. This can be done by considering proposals from phenotype editors or by using the pattern suggestion pipeline. The phenotype editors may propose a group of related phenotypes based on their domain knowledge, while the pattern suggestion pipeline uses semantic similarity and shared Phenotype And Trait Ontology (PATO) quality terms to identify patterns in phenotype terms from different organism-specific ontologies.
"},{"location":"howto/editors_workflow/#2-propose-a-phenotype-pattern","title":"2. Propose a phenotype pattern","text":"Once a group of related phenotypes is identified, the editors propose a phenotype pattern. To do this, they create a Github issue to request the phenotype pattern template in the uPheno repository. Alternatively, a new template can be proposed at a phenotype editors' meeting which can lead to the creation of a new term request as a Github issue. Ideally, the proposed phenotype pattern should include an appropriate PATO quality term for logical definition, use cases, term examples, and a textual definition pattern for the phenotype terms.
"},{"location":"howto/editors_workflow/#3-discuss-the-new-phenotype-pattern-draft-at-the-regular-upheno-phenotype-editors-meeting","title":"3. Discuss the new phenotype pattern draft at the regular uPheno phenotype editors meeting","text":"The next step is to discuss the new phenotype pattern draft at the regular uPheno phenotype editors meeting. During the meeting, the editors' comments and suggestions for improvements are collected as comments on the DOS-DP yaml
template in the corresponding Github pull request. Based on the feedback and discussions, a consensus on improvements should be achieved. The DOS-DP yaml
template name should start with a lower case letter, should be informative, and must include the PATO quality term. A Github pull request is created for the DOS-DP yaml
template.
---\npattern_name: ??pattern_and_file_name\npattern_iri: http://purl.obolibrary.org/obo/upheno/patterns-dev/??pattern_and_file_name.yaml\ndescription: 'A description that helps people chose this pattern for the appropriate scenario.'\n# examples:\n# - example_IRI-1 # term name\n# - example_IRI-2 # term name\n# - example_IRI-3 # term name\n# - http://purl.obolibrary.org/obo/XXXXXXXXXX # XXXXXXXX\ncontributors:\n- https://orcid.org/XXXX-XXXX-XXXX-XXXX # Yyy Yyyyyyyyy\nclasses:\nprocess_quality: PATO:0001236\nabnormal: PATO:0000460\nanatomical_entity: UBERON:0001062\nrelations:\ncharacteristic_of: RO:0000052\nhas_modifier: RO:0002573\nhas_part: BFO:0000051\nannotationProperties:\nexact_synonym: oio:hasExactSynonym\nrelated_synonym: oio:hasRelatedSynonym\nxref: oio:hasDbXref\nvars:\nvar??: \"'anatomical_entity'\" # \"'variable_range'\"\nname:\ntext: \"trait ?? %s\"\nvars:\n- var??\nannotations:\n- annotationProperty: exact_synonym\ntext: \"? of %s\"\nvars:\n- var??\n- annotationProperty: related_synonym\ntext: \"? %s\"\nvars:\n- var??\n- annotationProperty: xref\ntext: \"AUTO:patterns/patterns/chemical_role_attribute\"\ndef:\ntext: \"A trait that ?? %s.\"\nvars:\n- var??\nequivalentTo:\ntext: \"'has_part' some (\n'XXXXXXXXXXXXXXXXX' and\n('characteristic_of' some %s) and\n('has_modifier' some 'abnormal')\n)\"\nvars:\n- var??\n...\n
"},{"location":"howto/editors_workflow/#4-review-the-candidate-phenotype-pattern","title":"4. Review the candidate phenotype pattern","text":"Once a consensus on the improvements for a particular template is achieved, they are incorporated into the DOS-DP yaml
file. Typically, the improvements are applied to the template some time before a subsequent ontology editors' meeting. There should be enough time for off-line review of the proposed pattern to allow community feedback. The improved phenotype pattern candidate draft should get approval from the community at one of the regular ontology editors' calls or in a Github comment. The ontology editors who approve the pattern provide their ORCIDs and they are credited as contributors in an appropriate field of the DOS-DP pattern template.
Once the community-approved phenotype pattern template is created, it is added to the uPheno Github repository. The approved DOS-DP yaml
phenotype pattern template should pass quality control (QC) steps. 1. Validate yaml syntax: yamllint 2. Validate DOS-DP Use DOSDP Validator. * To validate a template using the command line interface, execute: ```sh yamllint dosdp validate -i
After successfully passing QC, the responsible editor merges the approved pull request, and the phenotype pattern becomes part of the uPheno phenotype pattern template collection.
"},{"location":"howto/pattern-merge-replace-workflow/","title":"Pattern merge - replace workflow","text":"This document is on how to merge new DOSDP design patterns into an ODK ontology and then how to replace the old classes with the new ones.
"},{"location":"howto/pattern-merge-replace-workflow/#1-you-need-the-tables-in-tsv-format-with-the-dosdp-filler-data-download-the-tsv-tables-to","title":"1. You need the tables in tsv format with the DOSDP filler data. Download the tsv tables to","text":"$ODK-ONTOLOGY/src/patterns/data/default/\n
Make sure that the tsv filenames match that of the relevant yaml DOSDP pattern files.
"},{"location":"howto/pattern-merge-replace-workflow/#2-add-the-new-matching-pattern-yaml-filename-to","title":"2. Add the new matching pattern yaml filename to","text":"$ODK-ONTOLOGY/src/patterns/dosdp-patterns/external.txt\n
"},{"location":"howto/pattern-merge-replace-workflow/#3-import-the-new-pattern-templates-that-you-have-just-added-to-the-externaltxt-list-from-external-sources-into-the-current-working-repository","title":"3. Import the new pattern templates that you have just added to the external.txt
list from external sources into the current working repository","text":"cd ODK-ONTOLOGY/src/ontology\nsh run.sh make update_patterns\n
"},{"location":"howto/pattern-merge-replace-workflow/#4-make-definitionsowl","title":"4. make definitions.owl","text":"cd ODK-ONTOLOGY/src/ontology\nsh run.sh make ../patterns/definitions.owl IMP=false\n
"},{"location":"howto/pattern-merge-replace-workflow/#5-remove-old-classes-and-replace-them-with-the-equivalent-and-patternised-new-classes","title":"5. Remove old classes and replace them with the equivalent and patternised new classes","text":"cd ODK-ONTOLOGY/src/ontology\nsh run.sh make remove_patternised_classes\n
"},{"location":"howto/pattern-merge-replace-workflow/#6-announce-the-pattern-migration-in-an-appropriate-channel-for-example-on-the-phenotype-ontologies-slack-channel","title":"6. Announce the pattern migration in an appropriate channel, for example on the phenotype-ontologies Slack channel.","text":"For example:
I have migrated the ... table and changed the tab colour to blue. You can delete the tab if you wish.
"},{"location":"howto/run-upheno2-release/","title":"How to run a uPheno 2 release","text":"In order to run a release you will have to have completed the steps to set up s3.
cd src/scripts
sh upheno_pipeline.sh
cd ../ontology
make prepare_upload S3_VERSION=2022-06-19
make deploy S3_VERSION=2022-06-19
To be able to upload new uPheno release to the uPheno S3 bucket, you need to set yourself up for S3 first.
The most convenient way to interact with S3 is the AWS Command Line Interface (CLI). You can find the installers and install instructions on that page (different depending on your Operation System): - For Mac - For Windows
"},{"location":"howto/set-up-s3/#2-obtain-secrets-from-bbop","title":"2. Obtain secrets from BBOP","text":"Next, you need to ask someone at BBOP (such as Chris Mungall or Seth Carbon) to provide you with an account that gives you access to the BBOP s3 buckets. You will have to provide a username. You will receive: - User name - Access key ID - Secret access key - Console link to sign into bucket
"},{"location":"howto/set-up-s3/#3-add-configuration-for-secrets","title":"3. Add configuration for secrets","text":"You will now have to set up your local system. You will create two files:
$ less ~/.aws/config \n[default]\nregion = us-east-1\n
and
$ less ~/.aws/credentials\n[default]\naws_access_key_id = ***\naws_secret_access_key = ***\n
in ~/.aws/credentials
make sure you add the correct keys as provided above.
Now, you should be set up to write to your s3 bucket. Note that in order for your data to be accessible through https
after your upload, you need to add --acl public-read
.
aws s3 sync --exclude \"*.DS_Store*\" my/data-dir s3://bbop-ontologies/myproject/data-dir --acl public-read\n
If you have previously pushed data to the same location, you won't be able to set it to \"publicly readable\" by simply rerunning the sync command. If you want to publish previously private data, follow the instructions here, e.g.:
aws s3api put-object-acl --bucket bbop-ontologies --key myproject/data-dir/exampleobject --acl public-read\n
"},{"location":"odk-workflows/","title":"Default ODK Workflows","text":"Historically, most repos have been using Travis CI for continuous integration testing and building, but due to runtime restrictions, we recently switched a lot of our repos to GitHub actions. You can set up your repo with CI by adding this to your configuration file (src/ontology/upheno-odk.yaml):
ci:\n - github_actions\n
When updating your repo, you will notice a new file being added: .github/workflows/qc.yml
.
This file contains your CI logic, so if you need to change, or add anything, this is the place!
Alternatively, if your repo is in GitLab instead of GitHub, you can set up your repo with GitLab CI by adding this to your configuration file (src/ontology/upheno-odk.yaml):
ci:\n - gitlab-ci\n
This will add a file called .gitlab-ci.yml
in the root of your repo.
The editors workflow is one of the formal workflows to ensure that the ontology is developed correctly according to ontology engineering principles. There are a few different editors workflows:
This document only covers the first editing workflow, but more will be added in the future
"},{"location":"odk-workflows/EditorsWorkflow/#local-editing-workflow","title":"Local editing workflow","text":"Workflow requirements:
Ensure that there is a ticket on your issue tracker that describes the change you are about to make. While this seems optional, this is a very important part of the social contract of building an ontology - no change to the ontology should be performed without a good ticket, describing the motivation and nature of the intended change.
"},{"location":"odk-workflows/EditorsWorkflow/#2-update-main-branch","title":"2. Update main branch","text":"In your local environment (e.g. your laptop), make sure you are on the main
(prev. master
) branch and ensure that you have all the upstream changes, for example:
git checkout master\ngit pull\n
"},{"location":"odk-workflows/EditorsWorkflow/#3-create-feature-branch","title":"3. Create feature branch","text":"Create a new branch. Per convention, we try to use meaningful branch names such as: - issue23removeprocess (where issue 23 is the related issue on GitHub) - issue26addcontributor - release20210101 (for releases)
On your command line, this looks like this:
git checkout -b issue23removeprocess\n
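If you want to derive branch names from tickets consistently, the convention above can be sketched in a few lines of Python. The function name branch_name and its arguments are assumptions made for illustration; nothing in the ODK requires or provides it:

```python
import re


def branch_name(issue_number, summary):
    # Build a name like issue23removeprocess: lower-case, letters and digits only.
    slug = re.sub('[^a-z0-9]', '', summary.lower())
    return 'issue' + str(issue_number) + slug


print(branch_name(23, 'Remove process'))
```

The result can be passed straight to git checkout -b.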
"},{"location":"odk-workflows/EditorsWorkflow/#4-perform-edit","title":"4. Perform edit","text":"Using your editor of choice, perform the intended edit. For example:
Prot\u00e9g\u00e9
src/ontology/upheno-edit.owl
in Prot\u00e9g\u00e9TextEdit
src/ontology/upheno-edit.owl
in TextEdit (or Sublime, Atom, Vim, Nano)
Consider the following when making the edit.
src/ontology/upheno-edit.owl
src/ontology/components
), see here.
This step is very important. Rather than simply trusting your change had the intended effect, we should always use a git diff as a first pass for sanity checking.
In our experience, having a visual git client like GitHub Desktop or sourcetree is really helpful for this part. In case you prefer the command line:
git status\ngit diff\n
"},{"location":"odk-workflows/EditorsWorkflow/#5-quality-control","title":"5. Quality control","text":"Now it's time to run your quality control checks. This can either happen locally (5a) or through your continuous integration system (7/5b).
"},{"location":"odk-workflows/EditorsWorkflow/#5a-local-testing","title":"5a. Local testing","text":"If you chose to run your test locally:
sh run.sh make IMP=false test\n
This will run the whole set of configured ODK tests, including your change. If you have a complex DOSDP pattern pipeline you may want to add PAT=false
to skip the potentially lengthy process of rebuilding the patterns. sh run.sh make IMP=false PAT=false test\n
"},{"location":"odk-workflows/EditorsWorkflow/#6-pull-request","title":"6. Pull request","text":"When you are happy with the changes, you commit your changes to your feature branch, push them upstream (to GitHub) and create a pull request. For example:
git add NAMEOFCHANGEDFILES\ngit commit -m \"Added biological process term #12\"\ngit push -u origin issue23removeprocess\n
Then you go to your project on GitHub, and create a new pull request from the branch, for example: https://github.com/INCATools/ontology-development-kit/pulls
There is a lot of great advice on how to write pull requests, but at the very least you should: - mention the tickets affected: see #23
to link to a related ticket, or fixes #23
if, by merging this pull request, the ticket is fixed. Tickets in the latter case will be closed automatically by GitHub when the pull request is merged. - summarise the changes in a few sentences. Consider the reviewer: what would they want to know right away? - If the diff is large, provide instructions on how best to review the pull request (sometimes, there are many changed files, but only one important change).
If you didn't run any local quality control checks (see 5a), you should have Continuous Integration (CI) set up, for example: - Travis - GitHub Actions
More on how to set this up here. Once the pull request is created, the CI will automatically trigger. If all is fine, it will show up green, otherwise red.
"},{"location":"odk-workflows/EditorsWorkflow/#8-community-review","title":"8. Community review","text":"Once all the automatic tests have passed, it is important to put a second set of eyes on the pull request. Ontologies are inherently social - as in that they represent some kind of community consensus on how a domain is organised conceptually. This seems high brow talk, but it is very important that as an ontology editor, you have your work validated by the community you are trying to serve (e.g. your colleagues, other contributors etc.). In our experience, it is hard to get more than one review on a pull request - two is great. You can set up GitHub branch protection to actually require a review before a pull request can be merged! We recommend this.
This step may seem daunting to under-resourced ontology projects, but we recommend putting it high up on your list of priorities - train a colleague, reach out!
"},{"location":"odk-workflows/EditorsWorkflow/#9-merge-and-cleanup","title":"9. Merge and cleanup","text":"When the QC is green and the reviews are in (approvals), it is time to merge the pull request. After the pull request is merged, remember to delete the branch as well (this option will show up as a big button right after you have merged the pull request). If you have not done so, close all the associated tickets fixed by the pull request.
"},{"location":"odk-workflows/EditorsWorkflow/#10-changelog-optional","title":"10. Changelog (Optional)","text":"It is sometimes difficult to keep track of changes made to an ontology. Some ontology teams opt to document changes in a changelog (simply a text file in your repository) so that when release day comes, you know everything you have changed. This is advisable at least for major changes (such as a new release system, a new pattern or template etc.).
"},{"location":"odk-workflows/ManageAutomatedTest/","title":"Manage automated tests","text":""},{"location":"odk-workflows/ManageAutomatedTest/#constraint-violation-checks","title":"Constraint violation checks","text":"We can define custom checks using SPARQL. SPARQL queries define bad modelling patterns (missing labels, misspelt URIs, and many more) in the ontology. If these queries return any results, then the build will fail. Custom checks are designed to be run as part of GitHub Actions Continuous Integration testing, but they can also run locally.
"},{"location":"odk-workflows/ManageAutomatedTest/#steps-to-add-a-constraint-violation-check","title":"Steps to add a constraint violation check:","text":"src/sparql
. The name of the file should end with -violation.sparql. Please give it a name that makes clear which violation the query checks for. Then, in the ODK configuration file src/ontology/upheno-odk.yaml, add the name of the file (without the -violation.sparql part) to the list inside the key custom_sparql_checks, which is inside the robot_report key. If the robot_report or custom_sparql_checks keys are not available, please add this code block to the end of the file.
robot_report:\n  release_reports: False\n  fail_on: ERROR\n  use_labels: False\n  custom_profile: True\n  report_on:\n    - edit\n  custom_sparql_checks:\n    - name-of-the-file-check\n
3. Update the repository so your new SPARQL check will be included in the QC. sh run.sh make update_repo\n
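The naming rule for custom checks (the query file ends in -violation.sparql, the entry in custom_sparql_checks omits that suffix) can be sketched as a small Python helper. The function name check_name and the example file name are hypothetical:

```python
SUFFIX = '-violation.sparql'


def check_name(filename):
    # Return the entry for custom_sparql_checks: the file name minus the suffix.
    if not filename.endswith(SUFFIX):
        raise ValueError('file name must end with ' + SUFFIX)
    return filename[:-len(SUFFIX)]


print(check_name('missing-label-violation.sparql'))
```

A file that does not follow the naming rule raises an error instead of silently producing a check name that the build would not find.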
"},{"location":"odk-workflows/ManageDocumentation/","title":"Updating the Documentation","text":"The documentation for UPHENO is managed in two places (relative to the repository root):
- The docs directory contains all the files that pertain to the content of the documentation (more below) - The mkdocs.yaml file contains the documentation config, in particular its navigation bar and theme.
The documentation is hosted using GitHub Pages, on a special branch of the repository (called gh-pages). It is important that this branch is never deleted - it contains all the files GitHub Pages needs to render and deploy the site. It is also important to note that the gh-pages branch should never be edited manually. All changes to the docs happen inside the docs directory on the main branch.
All the documentation is contained in the docs
directory, and is managed in Markdown. Markdown is a very simple and convenient way to produce text documents with formatting instructions, and is very easy to learn - it is also used, for example, in GitHub issues. This is a normal editing workflow:
1. Open the .md file you want to change in an editor of choice (a simple text editor is often best). IMPORTANT: Do not edit any files in the docs/odk-workflows/ directory. These files are managed by the ODK system and will be overwritten when the repository is upgraded! If you wish to change these files, make an issue on the ODK issue tracker.
The documentation is not automatically updated from the Markdown, and needs to be deployed deliberately. To do this, perform the following steps:
cd upheno/src/ontology\n
sh run.sh make update_docs\n
Mkdocs now sets off to build the site from the Markdown pages. If everything was successful, you will see a message similar to this one:
INFO - Your documentation should shortly be available at: https://obophenotype.github.io/upheno/ \n
3. Just to double check, you can now navigate to your documentation pages (usually https://obophenotype.github.io/upheno/). Just make sure you give GitHub 2-5 minutes to build the pages!"},{"location":"odk-workflows/ReleaseWorkflow/","title":"The release workflow","text":"The release workflow recommended by the ODK is based on GitHub releases and works as follows:
These steps are outlined in detail in the following.
"},{"location":"odk-workflows/ReleaseWorkflow/#run-a-release-with-the-odk","title":"Run a release with the ODK","text":"Preparation:
- Make sure your working copy is clean (git status should say that there are no modified files) - Get the latest changes (git pull) - Create a release branch (e.g. git checkout -b release-2021-01-01) - Make sure you have the latest ODK image by running docker pull obolibrary/odkfull
To actually run the release, you:
- Navigate to the ontology directory (cd upheno/src/ontology) - Run the release pipeline: sh run.sh make prepare_release -B. Note that for some ontologies, this process can take up to 90 minutes - especially if there are large ontologies you depend on, like PRO or CHEBI. - If everything went well, you should see the following output: Release files are now in ../.. - now you should commit, push and make a release on your git hosting site such as GitHub or GitLab.
This will create all the specified release targets (OBO, OWL, JSON, and the variants, ont-full and ont-base) and copy them into your release directory (the top level of your repo).
"},{"location":"odk-workflows/ReleaseWorkflow/#review-the-release","title":"Review the release","text":"upheno.obo
- this reflects a useful subset of the whole ontology (everything that can be covered by OBO format). OBO format has one thing going for it: it is very easy to review! upheno-base.owl
- this reflects the asserted axioms in your ontology that you have actually edited. upheno-full.owl
, which may reveal interesting new inferences you did not know about. Note that the diff of this file is sometimes quite large.
Once your CI checks have passed, and your reviews are completed, you can now merge the branch into your main branch (don't forget to delete the branch afterwards - a big button will appear after the merge is finished).
"},{"location":"odk-workflows/ReleaseWorkflow/#create-a-github-release","title":"Create a GitHub release","text":"upheno.obo
file and check the data-version:
property. The date needs to be prefixed with a v
, so, for example v2020-02-06
.
When you are dealing with large ontologies, you need a lot of memory. When you see error messages relating to large ontologies such as CHEBI, PRO, NCBITAXON, or Uberon, you should think of memory first, see here.
"},{"location":"odk-workflows/ReleaseWorkflow/#problems-when-using-obo-format-based-tools","title":"Problems when using OBO format based tools","text":"Sometimes you will get cryptic error messages when using legacy tools using OBO format, such as the ontology release tool (OORT), which is also available as part of the ODK docker container. In these cases, you need to track down what axiom or annotation actually caused the breakdown. In our experience (in about 60% of the cases) the problem lies with duplicate annotations (def
, comment
) which are illegal in OBO. Here is an example recipe of how to deal with such a problem:
1. If you get a message like make: *** [cl.Makefile:84: oort] Error 255 you might have an OORT error. 2. To debug this, run sh run.sh make IMP=false PAT=false oort -B (assuming you are already in the ontology folder in your directory). 3. Open upheno-edit.owl in Prot\u00e9g\u00e9, find the offending term, delete all offending annotations (e.g. delete ALL definitions, if the problem was \"multiple def tags not allowed\") and save. While this is not ideal, as it removes all definitions from that term, the definition will be added back once the term is fixed in the ontology it was imported from. 4. Rerun sh run.sh make IMP=false PAT=false oort -B and if it all passes, commit your changes to a branch and make a pull request as usual.
Your ODK repository's configuration is managed in src/ontology/upheno-odk.yaml
. Once you have made your changes, you can run the following to apply your changes to the repository:
sh run.sh make update_repo\n
There are a large number of options that can be set to configure your ODK, but we will only discuss a few of them here.
NOTE for Windows users:
You may get a cryptic failure such as Set Illegal Option -
if the update script located in src/scripts/update_repo.sh
was saved using Windows line endings. These need to be changed to Unix line endings. In Notepad++, for example, you can click on Edit->EOL Conversion->Unix LF to change this.
You can use the update repository workflow described on this page to perform the following operations to your imports:
We will discuss all these workflows in the following.
"},{"location":"odk-workflows/RepoManagement/#add-new-import","title":"Add new import","text":"To add a new import, you first edit your odk config as described above, adding an id
to the products
list in the import_group
section (for the sake of this example, we assume you already import RO, and your goal is to also import GO):
import_group:\n products:\n - id: ro\n - id: go\n
Note: your ODK file should only have one import_group
which can contain multiple imports (in the products
section). Next, you run the update repo workflow to apply these changes. Note that by default, this module is going to be an SLME Bottom module, see here. To change that or customise your module, see section \"Customise an import\". To finalise the addition of your import, perform the following steps:
1. Add an import statement to your src/ontology/upheno-edit.owl file. We suggest doing this using a text editor, by simply copying an existing import declaration and renaming it to the new ontology import, for example as follows: ...\nOntology(<http://purl.obolibrary.org/obo/upheno.owl>\nImport(<http://purl.obolibrary.org/obo/upheno/imports/ro_import.owl>)\nImport(<http://purl.obolibrary.org/obo/upheno/imports/go_import.owl>)\n...\n
2. Add your imports redirect to your catalog file src/ontology/catalog-v001.xml, for example: <uri name=\"http://purl.obolibrary.org/obo/upheno/imports/go_import.owl\" uri=\"imports/go_import.owl\"/>\n
Note: The catalog file src/ontology/catalog-v001.xml
has one purpose: redirecting imports from URLs to local files. For example, if you have
Import(<http://purl.obolibrary.org/obo/upheno/imports/go_import.owl>)\n
in your editors file (the ontology) and
<uri name=\"http://purl.obolibrary.org/obo/upheno/imports/go_import.owl\" uri=\"imports/go_import.owl\"/>\n
in your catalog, tools like robot
or Prot\u00e9g\u00e9 will recognize the statement in the catalog file to redirect the URL http://purl.obolibrary.org/obo/upheno/imports/go_import.owl
to the local file imports/go_import.owl
(which is in your src/ontology
directory).
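To check what a catalog actually redirects, the lookup described in the note can be sketched in Python with the standard library XML parser. This is only an illustration of the redirect idea, not how robot or Prot\u00e9g\u00e9 implement it; the function name resolve_import is hypothetical, and the sketch assumes the catalog stores each redirect as a uri element with name and uri attributes, as in the example above:

```python
import xml.etree.ElementTree as ET


def resolve_import(catalog_xml, iri):
    # Return the local path the catalog maps this import IRI to, or None.
    root = ET.fromstring(catalog_xml)
    for element in root.iter():
        # Tags may look like {namespace}uri; compare the local name only.
        local_name = element.tag.split('}')[-1]
        if local_name == 'uri' and element.get('name') == iri:
            return element.get('uri')
    return None
```

Pass it the contents of src/ontology/catalog-v001.xml and the IRI from the Import statement; a result of None means the import has no local redirect and would be fetched from the web.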
If you simply wish to refresh your import in light of new terms, see here. If you wish to change the type of your module see section \"Customise an import\".
"},{"location":"odk-workflows/RepoManagement/#remove-an-existing-import","title":"Remove an existing import","text":"To remove an existing import, perform the following steps:
1. Remove the import statement from src/ontology/upheno-edit.owl. 2. Remove the import from src/ontology/upheno-odk.yaml, e.g. - id: go from the list of products in the import_group. 3. Delete the associated files: src/ontology/imports/go_import.owl and src/ontology/imports/go_terms.txt. 4. Remove the corresponding redirect from the src/ontology/catalog-v001.xml file.
By default, an import module extracted from a source ontology will be an SLME module, see here. There are various options to change the default.
The following change to your repo config (src/ontology/upheno-odk.yaml
) will switch the go import from an SLME module to a simple ROBOT filter module:
import_group:\n  products:\n    - id: ro\n    - id: go\n      module_type: filter\n
A ROBOT filter module is, essentially, importing all external terms declared by your ontology (see here on how to declare external terms to be imported). Note that the filter
module does not consider terms/annotations from namespaces other than the base-namespace of the ontology itself. For example, in the example of GO above, only annotations / axioms related to the GO base IRI (http://purl.obolibrary.org/obo/GO_) would be considered. This behaviour can be changed by adding additional base IRIs as follows:
import_group:\n  products:\n    - id: go\n      module_type: filter\n      base_iris:\n        - http://purl.obolibrary.org/obo/GO_\n        - http://purl.obolibrary.org/obo/CL_\n        - http://purl.obolibrary.org/obo/BFO\n
If you wish to customise your import entirely, you can specify your own ROBOT command to do so. To do that, add the following to your repo config (src/ontology/upheno-odk.yaml
):
import_group:\n  products:\n    - id: ro\n    - id: go\n      module_type: custom\n
Now add a new goal in your custom Makefile (src/ontology/upheno.Makefile
, not src/ontology/Makefile
).
imports/go_import.owl: mirror/go.owl imports/go_terms_combined.txt\n if [ $(IMP) = true ]; then $(ROBOT) query -i $< --update ../sparql/preprocess-module.ru \\\n extract -T imports/go_terms_combined.txt --force true --individuals exclude --method BOT \\\n query --update ../sparql/inject-subset-declaration.ru --update ../sparql/postprocess-module.ru \\\n annotate --ontology-iri $(ONTBASE)/$@ $(ANNOTATE_ONTOLOGY_VERSION) --output $@.tmp.owl && mv $@.tmp.owl $@; fi\n
Now feel free to change this goal to do whatever you wish it to do! It probably makes sense (although it is not strictly necessary) to leave most of the goal intact and replace only:
extract -T imports/go_terms_combined.txt --force true --individuals exclude --method BOT \\\n
to another ROBOT pipeline.
"},{"location":"odk-workflows/RepoManagement/#add-a-component","title":"Add a component","text":"A component is an import which belongs to your ontology, e.g. is managed by you and your team.
1. Open src/ontology/upheno-odk.yaml 2. If you don't have it yet, add a new top-level section components 3. Under the components section, add a new section called products. This is where all your components are specified 4. Under the products section, add a new component, e.g. - filename: mycomp.owl
Example:
components:\n products:\n - filename: mycomp.owl\n
When running sh run.sh make update_repo
, a new file src/ontology/components/mycomp.owl
will be created which you can edit as you see fit. Typical ways to edit:
- Define a components/mycomp.owl: make target in src/ontology/upheno.Makefile and provide a custom command to generate the component. WARNING: Note that the custom rule to generate the component MUST NOT depend on any other ODK-generated file such as seed files and the like (see issue). - Add a parameter, source, to the component in src/ontology/upheno-odk.yaml to specify that this component should simply be downloaded from somewhere on the web.
Since ODK 1.3.2, it is possible to simply link a ROBOT template to a component without having to specify any of the import logic. In order to add a new component that is connected to one or more template files, follow these steps:
1. Open src/ontology/upheno-odk.yaml. 2. Make sure that use_templates: TRUE is set in the global project options. You should also make sure that use_context: TRUE is set in case you are using prefixes in your templates that are not known to robot, such as OMOP:, CPONT: and more. All non-standard prefixes you are using should be added to config/context.json. 3. Add another component to the products section. 4. To make this component template-driven, set use_template: TRUE. This will create an empty template for you in the templates directory, which will automatically be processed when recreating the component (e.g. run.bat make recreate-mycomp). 5. To use more than one template, use the templates field to add as many template names as you wish. ODK will look for them in the src/templates directory. 6. To pass additional arguments to the template generation, use the template_options field. This should be a string with options from robot template. One typical example for additional options you may want to provide is --add-prefixes config/context.json to ensure the prefix map of your context is provided to robot, see above.
Example:
components:\n  products:\n    - filename: mycomp.owl\n      use_template: TRUE\n      template_options: --add-prefixes config/context.json\n      templates:\n        - template1.tsv\n        - template2.tsv\n
Note: if your mirror is particularly large and complex, read this ODK recommendation.
"},{"location":"odk-workflows/RepositoryFileStructure/","title":"Repository structure","text":"The main kinds of files in the repository:
Release files are the files that are considered part of the official ontology release and are intended to be used by the community. A detailed description of the release artefacts can be found here.
"},{"location":"odk-workflows/RepositoryFileStructure/#imports","title":"Imports","text":"Imports are subsets of external ontologies that contain terms and axioms you would like to re-use in your ontology. These are considered \"external\", like dependencies in software development, and are not included in your \"base\" product, which is the release artefact which contains only those axioms that you personally maintain.
These are the current imports in UPHENO
| Import | URL | Type |
| --- | --- | --- |
| go | https://raw.githubusercontent.com/obophenotype/pro_obo_slim/master/pr_slim.owl | None |
| nbo | http://purl.obolibrary.org/obo/nbo.owl | None |
| uberon | http://purl.obolibrary.org/obo/uberon.owl | None |
| cl | http://purl.obolibrary.org/obo/cl.owl | None |
| pato | http://purl.obolibrary.org/obo/pato.owl | None |
| mpath | http://purl.obolibrary.org/obo/mpath.owl | None |
| ro | http://purl.obolibrary.org/obo/ro.owl | None |
| omo | http://purl.obolibrary.org/obo/omo.owl | None |
| chebi | https://raw.githubusercontent.com/obophenotype/chebi_obo_slim/main/chebi_slim.owl | None |
| oba | http://purl.obolibrary.org/obo/oba.owl | None |
| ncbitaxon | http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl | None |
| pr | https://raw.githubusercontent.com/obophenotype/pro_obo_slim/master/pr_slim.owl | None |
| bspo | http://purl.obolibrary.org/obo/bspo.owl | None |
| ncit | http://purl.obolibrary.org/obo/ncit.owl | None |
| fbbt | http://purl.obolibrary.org/obo/fbbt.owl | None |
| fbdv | http://purl.obolibrary.org/obo/fbdv.owl | None |
| hsapdv | http://purl.obolibrary.org/obo/hsapdv.owl | None |
| wbls | http://purl.obolibrary.org/obo/wbls.owl | None |
| wbbt | http://purl.obolibrary.org/obo/wbbt.owl | None |
| plana | http://purl.obolibrary.org/obo/plana.owl | None |
| zfa | http://purl.obolibrary.org/obo/zfa.owl | None |
| xao | http://purl.obolibrary.org/obo/xao.owl | None |
| hsapdv-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-hsapdv.owl | custom |
| zfa-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-zfa.owl | custom |
| zfs-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-zfs.owl | custom |
| xao-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-xao.owl | custom |
| wbbt-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-wbbt.owl | custom |
| wbls-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-wbls.owl | custom |
| fbbt-uberon | http://purl.obolibrary.org/obo/uberon/bridge/uberon-bridge-to-fbbt.owl | custom |
| xao-cl | http://purl.obolibrary.org/obo/uberon/bridge/cl-bridge-to-xao.owl | custom |
| wbbt-cl | http://purl.obolibrary.org/obo/uberon/bridge/cl-bridge-to-wbbt.owl | custom |
| fbbt-cl | http://purl.obolibrary.org/obo/uberon/bridge/cl-bridge-to-fbbt.owl | custom |"},{"location":"odk-workflows/RepositoryFileStructure/#components","title":"Components","text":"Components, in contrast to imports, are considered full members of the ontology. This means that any axiom in a component is also included in the ontology base - which means it is considered native to the ontology. While this sounds complicated, consider this: conceptually, no component should be part of more than one ontology. If that seems to be the case, we are most likely talking about an import. Components are often not needed for ontologies, but there are some use cases:
These are the components in UPHENO
| Filename | URL |
| --- | --- |
| phenotypes_manual.owl | None |
| upheno-mappings.owl | None |
| cross-species-mappings.owl | None |"},{"location":"odk-workflows/SettingUpDockerForODK/","title":"Setting up your Docker environment for ODK use","text":"One of the most frequent problems with running the ODK for the first time is failure because of lack of memory. This can look like a Java OutOfMemory exception, but more often than not it will appear as something like an Error 137
. There are two places you need to consider to set your memory:
1. Add robot_java_args: '-Xmx8G' to your src/ontology/upheno-odk.yaml file, see for example here. 2. Give Docker itself enough memory to cover what you set in the robot_java_args variable. You can manage your memory settings by right-clicking on the docker whale in your system bar-->Preferences-->Resources-->Advanced, see picture below.
This page discusses how to update the contents of your imports, like adding or removing terms. If you are looking to customise imports, like changing the module type, see here.
"},{"location":"odk-workflows/UpdateImports/#importing-a-new-term","title":"Importing a new term","text":"Note: some ontologies now use a merged-import system to manage dynamic imports, for these please follow instructions in the section title \"Using the Base Module approach\".
Importing a new term is split into two sub-phases:
There are three ways to declare terms that are to be imported from an external ontology. Choose the appropriate one for your particular scenario (all three can be used in parallel if need be):
This workflow is to be avoided, but may be appropriate if the editor does not have access to the ODK docker container. This approach also applies to ontologies that use the base module import approach.
Now you can use this term, for example, to construct logical definitions. The next time the imports are refreshed (see how to refresh here), the metadata (labels, definitions, etc.) for this term are imported from the respective external source ontology and become visible in your ontology.
"},{"location":"odk-workflows/UpdateImports/#using-term-files","title":"Using term files","text":"Every import has, by default a term file associated with it, which can be found in the imports directory. For example, if you have a GO import in src/ontology/go_import.owl
, you will also have an associated term file src/ontology/imports/go_terms.txt
. You can add terms in there simply as a list:
GO:0008150\nGO:0008151\n
Now you can run the refresh imports workflow and the two terms will be imported.
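Since a term file is just one CURIE per line, adding terms without creating duplicates is easy to script. The following Python sketch shows one way to do it; the function name merge_terms is hypothetical, and reading and writing the file is left to the caller:

```python
def merge_terms(existing_text, new_terms):
    # Combine the current term file content with new CURIEs,
    # dropping blank lines and duplicates while keeping the original order.
    terms = [line.strip() for line in existing_text.splitlines() if line.strip()]
    for term in new_terms:
        if term not in terms:
            terms.append(term)
    return terms


for term in merge_terms('GO:0008150', ['GO:0008150', 'GO:0008151']):
    print(term)
```

Write the returned list back to the term file, one term per line, before refreshing the import.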
"},{"location":"odk-workflows/UpdateImports/#using-the-custom-import-template","title":"Using the custom import template","text":"This workflow is appropriate if:
To enable this workflow, you add the following to your ODK config file (src/ontology/upheno-odk.yaml
), and update the repository:
use_custom_import_module: TRUE\n
Now you can manage your imported terms directly in the custom external terms template, which is located at src/templates/external_import.tsv
. Note that this file is a ROBOT template, and can, in principle, be extended to include any axioms you like. Before extending the template, however, read the following carefully.
The main purpose of the custom import template is to enable the management of all terms to be imported in a centralised place. To enable that, you do not have to do anything other than maintaining the template. So if you, say, currently import APOLLO_SV:00000480
, and you wish to import APOLLO_SV:00000532
, you simply add a row like this:
ID Entity Type\nID TYPE\nAPOLLO_SV:00000480 owl:Class\nAPOLLO_SV:00000532 owl:Class\n
When the imports are refreshed see imports refresh workflow, the term(s) will simply be imported from the configured ontologies.
Now, if you wish to extend the Makefile (which is beyond these instructions) and add, say, synonyms to the imported terms, you can do that, but you need to (a) preserve the ID
and ENTITY
columns and (b) ensure that the ROBOT template is valid otherwise, see here.
WARNING. Note that doing this is a widespread antipattern (see related issue). You should not change the axioms of terms that do not belong to your ontology unless necessary - such changes should always be pushed into the ontology where they belong. However, since people are doing it, whether the OBO Foundry likes it or not, at least using the custom imports module as described here localises the changes to a single simple template and ensures that none of the annotations added this way are merged into the base file.
"},{"location":"odk-workflows/UpdateImports/#refresh-imports","title":"Refresh imports","text":"If you want to refresh the import yourself (this may be necessary to pass the travis tests), and you have the ODK installed, you can do the following (using go as an example):
First, you navigate in your terminal to the ontology directory (underneath src in your upheno root directory).
cd src/ontology\n
Then, you regenerate the import that will now include any new terms you have added. Note: You must have docker installed.
sh run.sh make PAT=false imports/go_import.owl -B\n
Since ODK 1.2.27, it is also possible to simply run the following, which is the same as the above:
sh run.sh make refresh-go\n
Note that in case you changed the defaults, you need to add IMP=true
and/or MIR=true
to the command below:
sh run.sh make IMP=true MIR=true PAT=false imports/go_import.owl -B\n
If you wish to skip refreshing the mirror, i.e. skip downloading the latest version of the source ontology for your import (e.g. go.owl
for your go import) you can set MIR=false
instead, which runs the same command but reuses the mirror you already have:
sh run.sh make IMP=true MIR=false PAT=false imports/go_import.owl -B\n
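The variants above differ only in the ontology id and the MIR flag. A small Python sketch that assembles the command string makes that explicit; the function name refresh_command is hypothetical, and the sketch only builds the text of the command shown in this section without running anything:

```python
def refresh_command(ontology_id, refresh_mirror=True):
    # Assemble the make call for one import.
    # MIR controls whether the mirror of the source ontology is downloaded again.
    mir = 'true' if refresh_mirror else 'false'
    target = 'imports/' + ontology_id + '_import.owl'
    return 'sh run.sh make IMP=true MIR=' + mir + ' PAT=false ' + target + ' -B'


print(refresh_command('go'))
print(refresh_command('go', refresh_mirror=False))
```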
"},{"location":"odk-workflows/UpdateImports/#using-the-base-module-approach","title":"Using the Base Module approach","text":"Since ODK 1.2.31, we support an entirely new approach to generate modules: Using base files. The idea is to only import axioms from ontologies that actually belong to it. A base file is a subset of the ontology that only contains those axioms that nominally belong there. In other words, the base file does not contain any axioms that belong to another ontology. An example would be this:
Imagine this being the full Uberon ontology:
Axiom 1: BFO:123 SubClassOf BFO:124\nAxiom 2: UBERON:123 SubClassOf BFO:123\nAxiom 3: UBERON:124 SubClassOf UBERON:123\n
The base file is the set of all axioms that are about UBERON terms:
Axiom 2: UBERON:123 SubClassOf BFO:123\nAxiom 3: UBERON:124 SubClassOf UBERON:123\n
I.e.
Axiom 1: BFO:123 SubClassOf BFO:124\n
Gets removed.
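The same filtering can be written as a toy Python sketch. It is an illustration of the idea only: axioms are modelled here as plain tuples and matched by CURIE prefix, whereas the actual base files are produced by ROBOT on the real ontology. The function name base_axioms is hypothetical:

```python
def base_axioms(axioms, prefix):
    # Keep only axioms whose subject (first element) belongs to the ontology prefix.
    return [axiom for axiom in axioms if axiom[0].startswith(prefix + ':')]


uberon = [
    ('BFO:123', 'SubClassOf', 'BFO:124'),
    ('UBERON:123', 'SubClassOf', 'BFO:123'),
    ('UBERON:124', 'SubClassOf', 'UBERON:123'),
]
for axiom in base_axioms(uberon, 'UBERON'):
    print(*axiom)
```

The BFO axiom is dropped because its subject is not an UBERON term, while references to BFO terms on the right-hand side are kept.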
The base file pipeline is a bit more complex than the normal pipelines, because of the logical interactions between the imported ontologies. This is solved by first merging all mirrors into one huge file and then extracting one mega module from it.
Example: Let's say we are importing terms from Uberon, GO and RO in our ontologies. When we use the base pipelines, we
1) First obtain the base (usually by simply downloading it, but there is also an option now to create it with ROBOT) 2) We merge all base files into one big pile 3) Then we extract a single module imports/merged_import.owl
The first implementation of this pipeline is PATO, see https://github.com/pato-ontology/pato/blob/master/src/ontology/pato-odk.yaml.
To check if your ontology uses this method, check src/ontology/upheno-odk.yaml to see if use_base_merging: TRUE
is declared under import_group.
If your ontology uses the Base Module approach, please use the following steps:
First, add the term to be imported to the term file associated with it (see above \"Using term files\" section if this is not clear to you)
Next, navigate in your terminal to the ontology directory (underneath src in your upheno root directory).
cd src/ontology\n
Then refresh imports by running
sh run.sh make imports/merged_import.owl\n
Note: if your mirrors are updated, you can run sh run.sh make no-mirror-refresh-merged
This requires quite a bit of memory on your local machine, so if you encounter an error, it might be due to a lack of memory on your computer. A solution would be to create a ticket in an issue tracker requesting that the term be imported, and one of the local devs should pick this up and run the import for you.
Lastly, restart Prot\u00e9g\u00e9, and the term should be imported and ready to be used.
"},{"location":"odk-workflows/components/","title":"Adding components to an ODK repo","text":"For details on what components are, please see the component section of the repository file structure document.
To add custom components to an ODK repo, please follow these steps:
1) Locate your odk yaml file and open it with your favourite text editor (src/ontology/upheno-odk.yaml) 2) Search if there is already a component section to the yaml file, if not add it accordingly, adding the name of your component:
components:\n products:\n - filename: your-component-name.owl\n
3) Add the component to your catalog file (src/ontology/catalog-v001.xml)
<uri name=\"http://purl.obolibrary.org/obo/upheno/components/your-component-name.owl\" uri=\"components/your-component-name.owl\"/>\n
4) Add the component to the edit file (src/ontology/upheno-edit.obo) for .obo formats:
import: http://purl.obolibrary.org/obo/upheno/components/your-component-name.owl\n
for .owl formats:
Import(<http://purl.obolibrary.org/obo/upheno/components/your-component-name.owl>)\n
5) Refresh your repo by running sh run.sh make update_repo
- this should create a new file in src/ontology/components. 6) In your custom makefile (src/ontology/upheno.Makefile) add a goal for your custom component. In this example, the goal is a ROBOT template.
$(COMPONENTSDIR)/your-component-name.owl: $(SRC) ../templates/your-component-template.tsv \n $(ROBOT) template --template ../templates/your-component-template.tsv \\\n annotate --ontology-iri $(ONTBASE)/$@ --output $(COMPONENTSDIR)/your-component-name.owl\n
(If using a ROBOT template, do not forget to add your template tsv in src/templates/)
7) Make the file by running sh run.sh make components/your-component-name.owl
The uPheno editors call is held every second Thursday (bi-weekly) on Zoom, provided by members of the Monarch Initiative and co-organised by members of the Alliance of Genome Resources. If you wish to join the meeting, you can open an issue on https://github.com/obophenotype/upheno/issues with the request to be added, or send an email to phenotype-ontologies-editors@googlegroups.com.
The meeting coordinator (MC) is the person charged with organising the meeting. The current MC is Ray, @rays22.
"},{"location":"organization/meetings/#meeting-preparation","title":"Meeting preparation","text":"The uPheno project organises an outreach call every four weeks to listen to external stakeholders describing their need for cross-species phenotype integration.
"},{"location":"organization/outreach/#schedule","title":"Schedule","text":"Date Lesson Notes Recordings 2024/04/05 TBD TBD 2024/3/08 Computational identification of disease models through cross-species phenotype comparison Diego A. Pava, Pilar Cacheiro, Damian Smedley (IMPC) Recording 2024/02/09 Use cases for uPheno in the Alliance of Genome Resources and MGI Sue Bello (Alliance of Genome Resources, MGI) Recording"},{"location":"organization/outreach/#possible-topics","title":"Possible topics","text":"Here, we discuss the core concepts of the computational phenotype model underpinning the uPheno effort.
"},{"location":"reference/core-concepts/#table-of-contents","title":"Table of contents","text":"\"Characteristics\" or \"qualities\" are inherent or distinguishing attributes of something or someone. A characteristic is a feature that defines the nature of an object, organism, or entity and can be used to describe, compare, and categorize different things. Characteristics can be either qualitative (such as color, texture, or taste) or quantitative (such as height, weight, or age).
The Phenotype And Trait Ontology (PATO) is the reference ontology for general characteristics in the OBO world.
Some of the most widely used characteristics can be seen in the following table.
| quality | description | example |
| --- | --- | --- |
| Length (PATO:0000122) | A 1-D extent quality which is equal to the distance between two points. | |
| Mass (PATO:0000128) | A physical quality that inheres in a bearer by virtue of the proportion of the bearer's amount of matter. | |
| Amount (PATO:0000070) | The number of entities of a type that are part of the whole organism. | |
| Morphology (PATO:0000051) | A quality of a single physical entity inhering in the bearer by virtue of the bearer's size or shape or structure. | |

Note from the authors: The descriptions above have been taken from PATO, but they are not very user-friendly.
"},{"location":"reference/core-concepts/#biological-traitcharacteristicsattribute","title":"Biological Trait/Characteristics/Attribute","text":"Characteristics such as the ones above can be used to describe a variety of entities such as biological, environmental and social. We are specifically concerned with biological traits, which are characteristics that refer to an inherent characteristic of a biological entity, such as an organ (the heart), a process (cell division), or a chemical entity (lysine) in the blood.
The Ontology of Biological Attributes (OBA) is the reference ontology for biological characteristics in the OBO world. There are a few other ontologies that describe biological traits, such as the Vertebrate Trait Ontology (VT) and the Ascomycete Phenotype Ontology (APO), but these are more species specific, and, more importantly, are not integrated in the wider EQ modelling framework.
| Property | Example term | Definition |
| --- | --- | --- |
| Length | OBA:VT0002544 | The length of a digit. |
| Mass | OBA:VT0001259 | The mass of a multicellular organism. |
| Level | OBA:2020005 | The amount of lysine in blood. |
| Morphology | OBA:VT0005406 | The size of a heart. |
"},{"location":"reference/core-concepts/#bearer-of-biological-characteristics","title":"Bearer of Biological Characteristics","text":"In biological contexts, the term \"bearer\" refers to the entity that possesses or carries a particular characteristic or quality. The bearer can be any biological entity, such as an organism, an organ, a cell, or even a molecular structure, that exhibits a specific trait or feature. Some examples:
In each example, the \"bearer\" is the entity that has, carries, or exhibits a particular biological characteristic. This concept is fundamental in biology and bioinformatics for linking specific traits, qualities, or features to the entities that possess them, thereby enabling a clearer understanding and categorization of biological diversity and functions.
"},{"location":"reference/core-concepts/#phenotypic-change","title":"Phenotypic change","text":"A phenotypic change refers to some deviation from reference morphology, physiology, or behavior. This is the most widely used, and most complicated category of phenotype terms for data specialists to understand.
Conceptually, a phenotypic change comprises:
Biological attributes such as blood lysine amount
(OBA:2020005) have been discussed earlier in this document. The most widely used change modifier in practice is abnormal
(PATO:0000460). This modifier signifies that the phenotypic change term describes a deviation that is abnormal, such as \"Hyperlysinemia\" (HP:0002161), which describes an increased concentration of lysine in the blood. Other modifiers include normal
(PATO:0000461), which describes a change within the normal range (sometimes interpreted as \"no change\"), and directional modifiers like increased
(PATO:0040043) or decreased
(PATO:0040042). In practice, most of our \"characteristic\" terms have specialised directional variants such as decreased amount
(PATO:0001997) which can be used to describe phenotypes.
Comparators are the most confusing aspect of phenotypic change. The first question someone has to ask when they see a concept describing a change like increased blood lysine levels
is \"compared to what?\". Depending on biological context, the assumed comparators vary widely. For example, in clinical phenotyping, it is mostly assumed that a phenotypic feature corresponds to a deviation from the normal range, see HPO docs. However, it is just as easily imaginable that HPO terms are used to describe change compared to a previous state of the same individual (increased tumor size compared to last time we checked). In research settings such as GWAS annotations, HPO terms are used to annotate variants where a statistically significant change was observed compared to the general population. The same is true for many model phenotyping efforts such as MGI, where the situation is further complicated by the fact that the comparator is not \"the general population\", but a control group. In summary, comparators can be:
And the compared characteristics could be
However much we might wish otherwise, concepts describing phenotypic change will be used in many creative ways, and unfortunately, once the data hits your data analysis pipeline, you will likely not know for sure the nature of the comparator. Where you can, you should try to figure it out from the metadata.
This sounds like bad news. However, keep one thing in mind: Phenotype associations (to anything, including genes) are rarely strictly causal. Even if a change is observed \"compared to some non-representative control\" there is likely to be some signal useful for downstream inference - somehow, the \"gene has something to do with the phenotype\".
"},{"location":"reference/core-concepts/#the-chaotic-terminology-around-phenotype","title":"The chaotic terminology around \"phenotype\"","text":"In the clinical domain, many ontologies exist that define concepts that are very strongly related to our notion of \"phenotype\". In SNOMED, for example, \"clinical findings\" are defined as normal/abnormal observations, judgments, or assessments of patients (e.g. Abnormal urinalysis (finding)). For most analytic purposes, we think of SNOMED's (and other medical terminologies') notion of clinical finding as something analogous to our notion of \"phenotype\" (and their \"observable entity\" as a trait/biological attribute). However, if one gets into the weeds, many discrepancies in judgement can be observed, in particular when it comes to the separation from disease.
\"Phenotype\" is typically used in its \"singular\" form to describe the set of all observable characteristics of a subject. However, because we have over time gotten used to talking about \"cardiovascular phenotype\" and \"increased blood glucose level\", we have started using the plural form more, i.e. \"phenotypes\". We now tend to use the term \"phenotypic profile\" to describe the set of phenotypes that an organism exhibits at some point in time.
\"Phenotypic feature\" is a commonly used term that refers to the same idea, but mostly in the context of disease to describe an observable characteristic commonly associated with a disease.
\"Phenotypic abnormality\" is the formal term to describe a concept in the HPO, and is sometimes used to refer to the same idea in HPO-related papers. There is a bit of an assumption here, compared to the more general concepts described in this section, which is that the term should refer to a \"deviation from the normal range\", but, as described in the section of \"implicit comparators\", this assumption does not always hold in practice.
\"Phenotypic change\" is a recent invention by David Osumi-Sutherland in an attempt to subsume the ideas above, in particular to explicitly step back from the concept of \"deviation from normal\" to \"statistically significant deviation\" (which includes the normal range).
"},{"location":"reference/core-concepts/#examples","title":"Examples","text":"The Unified Phenotype Ontology (uPheno) is the reference ontology for biological abnormalities in the OBO world. There are many species-specific ontologies in the OBO world, such as the Mammalian Phenotype Ontology (MP), the Human Phenotype Ontology (HPO) and the Drosophila Phenotype Ontology (DPO), see here.
| Property | Example term | Definition |
| --- | --- | --- |
| Length | UPHENO:0072215 | Increased length of the digit. |
| Mass | UPHENO:0054299 | Decreased multicellular organism mass. |
| Level | UPHENO:0034327 | Decreased level of lysine in blood. |
| Morphology | UPHENO:0001471 | Increased size of the heart. |
"},{"location":"reference/core-concepts/#diseases","title":"Diseases","text":"Diseases are among the most important concepts in the phenotype data space. One big source of confusion in our community is the separation of \"phenotypic features\" or changes from diseases. The HPO docs provide an explanation geared at clinicians to help them distinguish between the two. The quest to develop an operational definition is still ongoing, but for now, we recommend going with the following basic assumptions:
In biological data curation, it\u2019s essential to differentiate between traits (observable characteristics such as \"blood glucose level\") and measurements (a process to observe such characteristics, e.g. \"blood glucose level assay\", \"BMI\"). Just from the term itself this is often difficult. \"Blood glucose level\" can refer to both a measurement and a trait when taken out of context, but the ontologies they appear in should differentiate cleanly between the two. Here are some ways to distinguish them:
- traits
  - are observable characteristics of an organism
  - can be qualitative (\"red eye colour\") or quantitative (\"35 cm tail length\")
- measurements
  - are activities performed by an agent (such as a researcher)
  - involve the quantification or qualification of a specific trait
  - correspond to measurement instruments / techniques (such as assays, BMIs)

In practice, a lot of data records a wild mix of the two. It is the job of (semantic) data modeling specialists to clearly distinguish the two when integrating annotated data from sources with divergent curation practices.
"},{"location":"reference/core-concepts/#putting-it-all-together","title":"Putting it all together","text":"Characteristics (A) and bearers of characteristics (B) are the core constituents of traits/biological attributes (C). Phenotypes comprise trait terms (C) combined with a modifier (D). Species-specific phenotypes (F), including phenotypic abnormalities defined in the Human Phenotype Ontology (HPO), are features of diseases (G). Measurements (H), such as assays, quantify or qualify (measure) traits (C).
"},{"location":"reference/data-integration/","title":"Integrating phenotype data with the uPheno framework","text":""},{"location":"reference/data-integration/#integrating-phenotype-data-using-the-upheno-framework","title":"Integrating phenotype data using the uPheno framework","text":""},{"location":"reference/data-integration/#prerequisites","title":"Prerequisites","text":"Before we get started, let's remind ourselves of the basic structure of phenotype data.
Characteristics (A) and bearers of characteristics (B) are the core constituents of traits/biological attributes (C). Phenotypes comprise trait terms (C) combined with a modifier (D). Species-specific phenotypes (F), including phenotypic abnormalities defined in the Human Phenotype Ontology (HPO), are features of diseases (G). Measurements (H), such as assays, quantify or qualify (measure) traits (C).
Integrating all kinds of phenotype data into the \"uPheno framework\" is a complex process which we will break down in the following.
"},{"location":"reference/data-integration/#level-2-integration-knowledge","title":"Level 2 integration: Knowledge","text":""},{"location":"reference/data-integration/#important-relationships-wrt-to-phenotype-data","title":"Important relationships wrt to phenotype data","text":"Imports directory:
* http://purl.obolibrary.org/obo/upheno/imports/
Currently the imports include:
* imports/chebi_import.owl
* imports/doid_import.owl
* imports/go_import.owl
* imports/mpath_import.owl
* imports/pato_import.owl
* imports/pr_import.owl
* imports/uberon_import.owl
* imports/wbbt_import.owl
To avoid multiple duplicate classes for heart, lung, skin etc. we map all classes to Uberon where this is applicable. For more divergent species such as fly and C. elegans we use the appropriate species-specific ontology.
Currently there are a small number of highly specific classes in FMA that are being used and have no corresponding class in Uberon.
"},{"location":"reference/imports/#methods","title":"Methods","text":"We use the OWLAPI SyntacticLocalityModularityExtractor, via OWLTools. See the http://purl.obolibrary.org/obo/upheno/Makefile for details.
"},{"location":"reference/phenotype-data/","title":"Phenotype data","text":""},{"location":"reference/phenotype-data/#phenotype-data-in-practice","title":"Phenotype Data in practice","text":""},{"location":"reference/phenotype-data/#overview","title":"Overview","text":"The goals of this document are:
Phenotype data comes in many different shapes and forms. In the following, we will describe some of the most common features of such data:
Pre-coordinated phenotype data is popular in the clinical domain, where a lot of observations are taken by a clinician and recorded as \"phenotypic abnormalities\" with the goal of eventual diagnosis.
Phenopackets such as the one below are an emerging standard for capturing and sharing disease and phenotype information. Phenotypic features in particular are captured as so-called \"pre-coordinated phenotype terms\" such as \"Attenuation of retinal blood vessels\" (HP:0007843). \"Pre-coordinated\" in this context means that the various aspects of the phenotype term, such as the bearer (\"retinal blood vessels\"), the characteristic (\"Attenuation\", or \"thinning/narrowing\"), and the modifier (in the case of HPO terms, simply abnormal), are combined (\"coordinated\") into a single term.
Phenopacket{\n\"id\": \"PMID:23559858-Ajmal-2013-BBS1-IV-5/family_A\",\n\"subject\": {\n \"id\": \"IV-5/family A\",\n \"timeAtLastEncounter\": {\n \"age\": {\n \"iso8601duration\": \"P26Y\"\n }\n },\n \"sex\": \"MALE\",\n \"taxonomy\": {\n \"id\": \"NCBITaxon:9606\",\n \"label\": \"Homo sapiens\"\n }\n},\n\"phenotypicFeatures\": [\n {\n \"type\": {\n \"id\": \"HP:0007843\",\n \"label\": \"Attenuation of retinal blood vessels\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n },\n {\n \"type\": {\n \"id\": \"HP:0001513\",\n \"label\": \"Obesity\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n },\n {\n \"type\": {\n \"id\": \"HP:0000608\",\n \"label\": \"Macular degeneration\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n },\n {\n \"type\": {\n \"id\": \"HP:0000486\",\n \"label\": \"Strabismus\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n },\n {\n \"type\": {\n 
\"id\": \"HP:0001328\",\n \"label\": \"Specific learning disability\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n },\n {\n \"type\": {\n \"id\": \"HP:0000510\",\n \"label\": \"Rod-cone dystrophy\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n },\n {\n \"type\": {\n \"id\": \"HP:0001263\",\n \"label\": \"Global developmental delay\"\n },\n \"evidence\": [\n {\n \"evidenceCode\": {\n \"id\": \"ECO:0000033\",\n \"label\": \"author statement supported by traceable reference\"\n },\n \"reference\": {\n \"id\": \"PMID:23559858\",\n \"description\": \"A family was reported in which two affected members had a splicing variant in BBS1, c.47+1G>T.\"\n }\n }\n ]\n }\n],\n\"interpretations\": [\n {\n \"id\": \"PMID:23559858-Ajmal-2013-BBS1-IV-5/family_A\",\n \"progressStatus\": \"SOLVED\",\n \"diagnosis\": {\n \"disease\": {\n \"id\": \"OMIM:209900\",\n \"label\": \"BARDET-BIEDL SYNDROME 1; BBS1\"\n },\n \"genomicInterpretations\": [\n {\n \"subjectOrBiosampleId\": \"IV-5/family A\",\n \"interpretationStatus\": \"CAUSATIVE\",\n \"variantInterpretation\": {\n \"variationDescriptor\": {\n \"id\": \"clinvar:1324292\",\n \"geneContext\": {\n \"valueId\": \"ENSG00000174483\",\n \"symbol\": \"BBS1\",\n \"alternateIds\": [\n \"HGNC:966\",\n \"entrez:582\",\n \"ensembl:ENSG00000174483\",\n \"symbol:BBS1\"\n ]\n },\n \"vcfRecord\": {\n \"genomeAssembly\": \"GRCh37\",\n \"chrom\": \"11\",\n \"pos\": \"66278178\",\n \"ref\": 
\"G\",\n \"alt\": \"T\"\n },\n \"allelicState\": {\n \"id\": \"GENO:0000136\",\n \"label\": \"homozygous\"\n }\n }\n }\n }\n ]\n }\n }\n],\n\"metaData\": {\n \"created\": \"1970-01-01T00:00:00Z\",\n \"submittedBy\": \"HPO:probinson\",\n \"resources\": [\n {\n \"id\": \"hp\",\n \"name\": \"human phenotype ontology\",\n \"url\": \"http://purl.obolibrary.org/obo/hp.owl\",\n \"version\": \"2018-03-08\",\n \"namespacePrefix\": \"HP\",\n \"iriPrefix\": \"http://purl.obolibrary.org/obo/HP_\"\n },\n {\n \"id\": \"pato\",\n \"name\": \"Phenotype And Trait Ontology\",\n \"url\": \"http://purl.obolibrary.org/obo/pato.owl\",\n \"version\": \"2018-03-28\",\n \"namespacePrefix\": \"PATO\",\n \"iriPrefix\": \"http://purl.obolibrary.org/obo/PATO_\"\n },\n {\n \"id\": \"geno\",\n \"name\": \"Genotype Ontology\",\n \"url\": \"http://purl.obolibrary.org/obo/geno.owl\",\n \"version\": \"19-03-2018\",\n \"namespacePrefix\": \"GENO\",\n \"iriPrefix\": \"http://purl.obolibrary.org/obo/GENO_\"\n },\n {\n \"id\": \"ncbitaxon\",\n \"name\": \"NCBI organismal classification\",\n \"url\": \"http://purl.obolibrary.org/obo/ncbitaxon.owl\",\n \"version\": \"2018-03-02\",\n \"namespacePrefix\": \"NCBITaxon\",\n \"iriPrefix\": \"http://purl.obolibrary.org/obo/NCBITaxon_\"\n },\n {\n \"id\": \"eco\",\n \"name\": \"Evidence and Conclusion Ontology\",\n \"url\": \"http://purl.obolibrary.org/obo/eco.owl\",\n \"version\": \"2018-11-10\",\n \"namespacePrefix\": \"ECO\",\n \"iriPrefix\": \"http://purl.obolibrary.org/obo/ECO_\"\n },\n {\n \"id\": \"omim\",\n \"name\": \"Online Mendelian Inheritance in Man\",\n \"url\": \"https://www.omim.org\",\n \"version\": \"2018-03-08\",\n \"namespacePrefix\": \"OMIM\",\n \"iriPrefix\": \"https://omim.org/entry/\"\n },\n {\n \"id\": \"clinvar\",\n \"name\": \"Clinical Variation\",\n \"url\": \"https://www.ncbi.nlm.nih.gov/clinvar/\",\n \"version\": \"2023-04-06\",\n \"namespacePrefix\": \"clinvar\",\n \"iriPrefix\": 
\"https://www.ncbi.nlm.nih.gov/clinvar/variation/\"\n }\n ],\n \"phenopacketSchemaVersion\": \"2.0.0\"\n}\n}\n
Apart from clinical diagnostics, pre-coordinated phenotype terms are used in many other contexts such as model organism research (e.g. IMPC) or the curation of Genome Wide Association Studies.
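As a rough illustration of how pre-coordinated terms are consumed downstream, the following Python sketch pulls the phenotype term IDs and labels out of a phenopacket-like dictionary. The dictionary is a heavily trimmed excerpt of the example above, and `phenotype_terms` is a hypothetical helper written for this illustration, not part of any Phenopackets library.

```python
# Trimmed excerpt of the phenopacket shown above (only two features kept).
phenopacket = {
    'id': 'PMID:23559858-Ajmal-2013-BBS1-IV-5/family_A',
    'phenotypicFeatures': [
        {'type': {'id': 'HP:0007843', 'label': 'Attenuation of retinal blood vessels'}},
        {'type': {'id': 'HP:0001513', 'label': 'Obesity'}},
    ],
}


def phenotype_terms(packet):
    # Return (id, label) pairs for every phenotypic feature in the packet.
    return [
        (feature['type']['id'], feature['type'].get('label', ''))
        for feature in packet.get('phenotypicFeatures', [])
    ]


terms = phenotype_terms(phenopacket)
for term_id, label in terms:
    print(term_id, label)
```

Because each term is pre-coordinated, a single identifier per feature is all that needs to be extracted; bearer, characteristic and modifier are implied by the term itself.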
"},{"location":"reference/phenotype-data/#post-coordinated","title":"Post-coordinated","text":"Post-coordinated phenotype curation simply means that the different constituents of phenotype (characteristic, bearer, modifier etc) are captured individually. This has certain advantages. For example, the phenotype space is enormous, as you can measure variations in many observable characteristics, from chemical entities present in the blood and the microbiome to a host of morphological and developmental abnormalities. Instead of having individual (controlled vocabulary) terms for increased level of X
, decreased level of X
, abnormal level of X
, increased level of X in blood
for thousands of chemical compounds synthesized by the human body, you just have \"increased level\", \"blood\" and all the chemical compounds.
There are at least three flavours of post-coordinated phenotype curation prevalent in the biomedical domain, four if you count quantified phenotypes:
The trait + modifier pattern is used, for example, by databases such as the Saccharomyces Genome Database (SGD). Here are some examples:

| dateAssigned | evidence/publicationId | objectId | phenotypeStatement | phenotypeTermIdentifiers/0/termId | phenotypeTermIdentifiers/1/termId | conditionRelations/0/conditions/0/chemicalOntologyId | conditionRelations/0/conditions/0/conditionClassId |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2010-07-08T00:07:00-00:00 | PMID:1406694 | SGD:S000003901 | abnormal RNA accumulation | APO:0000002 | APO:0000224 | | |
| 2006-05-05T00:05:00-00:00 | PMID:785224 | SGD:S000000854 | decreased resistance to chemicals | APO:0000003 | APO:0000087 | CHEBI:78661 | ZECO:0000111 |
| 2010-07-07T00:07:00-00:00 | PMID:10545447 | SGD:S000000969 | decreased cell size | APO:0000003 | APO:0000052 | | |

- APO:0000002 (abnormal) and APO:0000003 (decreased) are modifiers.
- APO:0000087 (resistance to chemicals), APO:0000224 (RNA accumulation), APO:0000052 (cell size) are biological attributes/traits.
- CHEBI:78661 is recorded as an experimental condition, but should probably be interpreted as part of the bearer expression.

Data was obtained from the Alliance of Genome Resources on the 30.03.2023 and simplified for illustration.
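To make the trait + modifier pattern concrete, here is a small Python sketch that sorts the term IDs of each annotation into modifiers and traits and re-assembles the phenotype statement. The lookup tables only contain the term roles listed above; the names `MODIFIERS`, `TRAITS` and `split_annotation` are hypothetical and exist only for this illustration.

```python
# Term roles as listed above; labels are copied from the example data.
MODIFIERS = {'APO:0000002': 'abnormal', 'APO:0000003': 'decreased'}
TRAITS = {
    'APO:0000087': 'resistance to chemicals',
    'APO:0000224': 'RNA accumulation',
    'APO:0000052': 'cell size',
}


def split_annotation(term_ids):
    # Sort the term IDs of one annotation into modifier and trait labels.
    modifiers = [MODIFIERS[t] for t in term_ids if t in MODIFIERS]
    traits = [TRAITS[t] for t in term_ids if t in TRAITS]
    return modifiers, traits


# (objectId, term IDs) pairs taken from the three example rows.
rows = [
    ('SGD:S000003901', ['APO:0000002', 'APO:0000224']),
    ('SGD:S000000854', ['APO:0000003', 'APO:0000087']),
    ('SGD:S000000969', ['APO:0000003', 'APO:0000052']),
]

statements = []
for object_id, term_ids in rows:
    modifiers, traits = split_annotation(term_ids)
    statements.append(' '.join(modifiers + traits))
    print(object_id, statements[-1])
```

The re-assembled strings match the phenotypeStatement column, which shows that in this pattern the statement is fully determined by one modifier term plus one trait term.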
The bearer-only pattern is used by many databases, such as FlyBase. In the data, we only find references to bearers, such as anatomical entities or biological processes. Instead of being stated explicitly, the phenotypic modifiers (abnormal, morphology, changed) are implicit in the definition of the dataset.

| dateAssigned | evidence/crossReference/id | evidence/publicationId | objectId | phenotypeStatement | phenotypeTermIdentifiers/0/termId |
| --- | --- | --- | --- | --- | --- |
| 2024-01-05T11:54:24-05:00 | FB:FBrf0052655 | PMID:2385293 | FB:FBal0016988 | embryonic telson | FBbt:00000184 |
| 2024-01-05T11:54:24-05:00 | FB:FBrf0058077 | PMID:8223248 | FB:FBal0001571 | larva | FBbt:00001727 |

- FBbt:00000184 (embryonic telson) and FBbt:00001727 (larva) are bearer terms.

Data was obtained from the Alliance of Genome Resources on the 30.03.2023 and simplified for illustration.
The most complex pattern for phenotype descriptions, which essentially decomposes the entire phenotype expression into atomic constituents, can be found, for example, in the Zebrafish Information Network (ZFIN).
Examples:
Fish ID Affected Structure or Process 1 subterm ID Affected Structure or Process 1 subterm Name Post-composed Relationship ID Post-composed Relationship Name Affected Structure or Process 1 superterm ID Affected Structure or Process 1 superterm Name Phenotype Keyword ID Phenotype Keyword Name Phenotype Tag Affected Structure or Process 2 subterm ID Affected Structure or Process 2 subterm name Post-composed Relationship (rel) ID Post-composed Relationship (rel) Name Affected Structure or Process 2 superterm ID Affected Structure or Process 2 superterm name Publication ID ZDB-FISH-150901-29105 ZFA:0009366 hair cell BFO:0000050 part_of ZFA:0000051 otic vesicle PATO:0000374 increased distance abnormal ZFA:0009366 hair cell BFO:0000050 part_of ZFA:0000051 otic vesicle ZDB-PUB-171025-12 ZDB-FISH-150901-29105 ZFA:0009366 hair cell BFO:0000050 part_of ZFA:0000051 otic vesicle PATO:0000374 increased distance abnormal ZFA:0009366 hair cell BFO:0000050 part_of ZFA:0000051 otic vesicle ZDB-PUB-171025-12 ZDB-FISH-150901-11537 ZFA:0000051 otic vesicle PATO:0001905 has normal numbers of parts of type normal ZFA:0009366 hair cell BFO:0000050 part_of ZFA:0000051 otic vesicle ZDB-PUB-150318-1 ZDB-FISH-150901-18770 ZFA:0000119 retinal inner nuclear layer PATO:0002001 has fewer parts of type abnormal ZFA:0009315 horizontal cell BFO:0000050 part_of ZFA:0000119 retinal inner nuclear layer ZDB-PUB-130222-28 ZDB-FISH-190806-7 BSPO:0000084 ventral region BFO:0000050 part_of ZFA:0000101 diencephalon PATO:0002001 has fewer parts of type abnormal ZFA:0009301 dopaminergic neuron BFO:0000050 part_of ZFA:0000101 diencephalon ZDB-PUB-190216-5 ZDB-FISH-190807-7 BSPO:0000084 ventral region BFO:0000050 part_of ZFA:0000101 diencephalon PATO:0001905 has normal numbers of parts of type normal ZFA:0009301 dopaminergic neuron BFO:0000050 part_of ZFA:0000101 diencephalon ZDB-PUB-190216-5 ZDB-FISH-190807-8 BSPO:0000084 ventral region BFO:0000050 part_of ZFA:0000101 diencephalon PATO:0002001 has fewer parts 
of type abnormal ZFA:0009301 dopaminergic neuron BFO:0000050 part_of ZFA:0000101 diencephalon ZDB-PUB-190216-5 ZDB-FISH-150901-29105 ZFA:0000101 diencephalon PATO:0001555 has number of normal ZFA:0009301 dopaminergic neuron BFO:0000050 part_of ZFA:0000101 diencephalon ZDB-PUB-161120-7 ZDB-FISH-210421-9 ZFA:0009290 glutamatergic neuron BFO:0000050 part_of ZFA:0000008 brain PATO:0040043 increased proportionality to abnormal ZFA:0009276 GABAergic neuron BFO:0000050 part_of ZFA:0000008 brain ZDB-PUB-191011-2 ZDB-FISH-210421-9 ZFA:0009290 glutamatergic neuron BFO:0000050 part_of ZFA:0000008 brain PATO:0040043 increased proportionality to abnormal ZFA:0009276 GABAergic neuron BFO:0000050 part_of ZFA:0000008 brain ZDB-PUB-191011-2
Let's break down the second-to-last row:
Data was obtained from ZFIN (Phenotype of Zebrafish Genes) on the 30.03.2023 and simplified for illustration.
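The fully decomposed pattern can be sketched in Python as follows, using the values of the second-to-last ZFIN row (labels only, for readability). The `Entity` class, the `render_phenotype` function and the output format are assumptions made for this illustration; they are not a ZFIN or uPheno API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    # A (possibly post-composed) affected structure: subterm [relation superterm].
    subterm: str
    relation: Optional[str] = None
    superterm: Optional[str] = None

    def render(self):
        if self.relation and self.superterm:
            return f'({self.subterm} {self.relation} {self.superterm})'
        return self.subterm


def render_phenotype(entity1, quality, tag, entity2=None):
    # Combine tag, first entity, quality and optional second entity into one string.
    parts = [tag + ':', entity1.render(), quality]
    if entity2 is not None:
        parts.append(entity2.render())
    return ' '.join(parts)


e1 = Entity('glutamatergic neuron', 'part_of', 'brain')
e2 = Entity('GABAergic neuron', 'part_of', 'brain')
statement = render_phenotype(e1, 'increased proportionality to', 'abnormal', e2)
print(statement)
```

Keeping each constituent in its own field is what allows such records to be mapped onto pre-coordinated terms later on.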
As one can see in the last example, bearers can be anything from simple atomic entities to arbitrarily complex compositions:
- (lysine)
- (lysine part_of blood)
- (lysine part_of cell part_of (muscle part_of heart))
- (lysine part_of (cytoplasm part_of (cell part_of (muscle part_of heart))))

Phenotype data can be standardised to varying degrees. It is not uncommon for data to be completely unstandardised. Unfortunately, only a fraction of the available data is actually annotated using terms from controlled phenotype ontologies. Here are some of the more \"typical\" kinds of data on the standardised/non-standardised spectrum:
Qualitative and quantitative phenotype data represent two fundamental ways of describing characteristics or traits in biology, each providing different types of information:
Qualitative Phenotype Data:
Quantitative Phenotype Data:
Qualitative data is descriptive and categorical, while quantitative data is numerical and measurable. Both types are essential for a comprehensive understanding of phenotypic traits, each offering unique insights into biological variation and complexity.
"},{"location":"reference/phenotype-ontology-alignment/","title":"Aligning species-specific phenotype ontologies","text":"Phenotype ontologies use different reference ontologies for their EQs. Everything in uPheno is integrated towards a common set of reference ontologies, in particular Uberon and CL. In order to integrate species-independent anatomy ontologies we employ the following workflow for phenotype ontologies:
When two classes are merged in uPheno based on a cross-species mapping, we assert the most general common ancestor as parent.
"},{"location":"reference/qc/","title":"uPheno Quality Control","text":""},{"location":"reference/use-cases/","title":"Use Cases","text":""},{"location":"reference/use-cases/#use-cases","title":"Use Cases","text":"Cross-species data in biomedical knowledge graphs (Kids First)
Association prediction
* The Drosophila phenotype ontology. Osumi-Sutherland et al, J Biomed Sem.
The DPO is formally a subset of FBcv, made available from http://purl.obolibrary.org/obo/fbcv/dpo.owl
Phenotypes in FlyBase may either be assigned to FBcv (dpo) classes, or they may have a phenotype_manifest_in relation to FBbt (anatomy).
For integration we generate the following ontologies:
* http://purl.obolibrary.org/obo/upheno/imports/fbbt_phenotype.owl
* http://purl.obolibrary.org/obo/upheno/imports/uberon_phenotype.owl
* http://purl.obolibrary.org/obo/upheno/imports/go_phenotype.owl
* http://purl.obolibrary.org/obo/upheno/imports/cl_phenotype.owl

(see Makefile)
This includes a phenotype class for every anatomy class - the IRI is suffixed with \"PHENOTYPE\". Using these ontologies, Uberon and CL phenotypes make the groupings.
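The suffixing rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's build code (the actual generation is driven by the Makefile). The function name phenotype_iri is hypothetical, the identifier FBbt_00004508 is only a placeholder, and appending the suffix directly with no separator is an assumption based on the description above.

```python
OBO_BASE = 'http://purl.obolibrary.org/obo/'

def phenotype_iri(anatomy_iri):
    '''Return the IRI of the auto-generated phenotype class for an anatomy class.

    Assumption: the PHENOTYPE suffix is appended directly to the anatomy IRI.
    '''
    if not anatomy_iri.startswith(OBO_BASE):
        raise ValueError('expected an OBO PURL: ' + anatomy_iri)
    return anatomy_iri + 'PHENOTYPE'

# Placeholder anatomy identifier, used purely for illustration.
example = phenotype_iri(OBO_BASE + 'FBbt_00004508')
```

Keeping the rule in one small function makes it easy to apply the same convention to FBbt, Uberon, GO and CL classes alike.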
We include:
* http://purl.obolibrary.org/obo/upheno/dpo/dpo-importer.owl, which imports dpo plus auto-generated fbbt phenotypes.

The dpo-importer is included in the [MetazoanImporter].
"},{"location":"reference/components/dpo/#additional-notes","title":"Additional Notes","text":"We create a local copy of fbbt that has \"Drosophila \" prefixed to all labels. This gives us a hierarchy:
* eye phenotype (defined using Uberon)
    * compound eye phenotype (defined using Uberon)
        * drosophila eye phenotype (defined using FBbt)
* http://code.google.com/p/cell-ontology/issues/detail?id=115 ensure all CL to FBbt equiv axioms are present (we have good coverage for Uberon)
* project page - https://sourceforge.net/apps/trac/pombase/wiki/FissionYeastPhenotypeOntology
* FYPO: the fission yeast phenotype ontology. Harris et al, Bioinformatics
Note that the OWL axioms for FYPO are managed directly in the FYPO project repo; we do not duplicate them here.
"},{"location":"reference/components/hp/","title":"Human Phenotype Ontology","text":"* http://www.human-phenotype-ontology.org/
* K\u00f6hler S, Doelken SC, Mungall CJ, Bauer S, Firth HV, Bailleul-Forestier I, Black GC, Brown DL, Brudno M, Campbell J, FitzPatrick DR, Eppig JT, Jackson AP, Freson K, Girdea M, Helbig I, Hurst JA, J\u00e4hn J, Jackson LG, Kelly AM, Ledbetter DH, Mansour S, Martin CL, Moss C, Mumford A, Ouwehand WH, Park SM, Riggs ER, Scott RH, Sisodiya S, Van Vooren S, Wapner RJ, Wilkie AO, Wright CF, Vulto-van Silfhout AT, de Leeuw N, de Vries BB, Washington NL, Smith CL, Westerfield M, Schofield P, Ruef BJ, Gkoutos GV, Haendel M, Smedley D, Lewis SE, Robinson PN. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res. 2014 Jan;42(Database issue):D966-74 [pubmed]
* HPO browser
* HP in OntoBee
* HP in OLSVis
The OWL axioms for HP are in the src/ontology/hp directory on this site.
The structure is analogous to that of the [MP].
"},{"location":"reference/components/hp/#status","title":"Status","text":"The OWL axiomatization is updated frequently to stay in sync with changes in the MP
"},{"location":"reference/components/hp/#editing-the-axioms","title":"Editing the axioms","text":"The edit file is currently:
* http://purl.obolibrary.org/obo/hp/hp-equivalence-axioms-subq-ubr.owl

Edit this in Protege.
"},{"location":"reference/components/mp/","title":"Mammalian Phenotype Ontology","text":"* The Mammalian Phenotype Ontology: enabling robust annotation and comparative analysis. Smith CL, Eppig JT
* MP browser at MGI
* MP in OntoBee
* MP in OLSVis
The OWL axioms for MP are in the src/ontology/mp directory on this site.
* http://purl.obolibrary.org/obo/mp.owl - direct conversion of MGI-supplied obo file
* http://purl.obolibrary.org/obo/mp/mp-importer.owl - imports additional axioms, including the following:
    * http://purl.obolibrary.org/obo/mp.owl
    * http://purl.obolibrary.org/obo/upheno/imports/chebi_import.owl
    * http://purl.obolibrary.org/obo/upheno/imports/uberon_import.owl
    * http://purl.obolibrary.org/obo/upheno/imports/pato_import.owl
    * http://purl.obolibrary.org/obo/upheno/imports/go_import.owl
    * http://purl.obolibrary.org/obo/upheno/imports/mpath_import.owl
    * http://purl.obolibrary.org/obo/mp/mp-equivalence-axioms-subq-ubr.owl

The OWL axiomatization is updated frequently to stay in sync with changes in the MP
"},{"location":"reference/components/mp/#editing-the-axioms","title":"Editing the axioms","text":"The edit file is currently:
* http://purl.obolibrary.org/obo/mp/mp-equivalence-axioms-edit.owl

Edit this in Protege.
The file mp-equivalence-axioms.obo is DEPRECATED!
"},{"location":"reference/components/mp/#termgenie","title":"TermGenie","text":"* http://mp.termgenie.org/
* http://mp.termgenie.org/TermGenieFreeForm
* Schindelman, Gary, et al. Worm Phenotype Ontology: integrating phenotype data within and beyond the C. elegans community. BMC bioinformatics 12.1 (2011): 32.
* WBPhenotype in OntoBee
* WBPhenotype in OLSVis
The OWL axioms for WBPhenotype are in the src/ontology/wbphenotype directory on this site.
* http://purl.obolibrary.org/obo/wbphenotype.owl - direct conversion of WormBase-supplied obo file
* http://purl.obolibrary.org/obo/wbphenotype/wbphenotype-importer.owl - imports additional axioms.
The structure roughly follows that of the [MP]. The worm anatomy is used.
"},{"location":"reference/components/wbphenotype/#editing-the-axioms","title":"Editing the axioms","text":"Currently the source is wbphenotype/wbphenotype-equivalence-axioms.obo; the OWL is generated from it. We are considering switching this around, so that the OWL is edited directly using Protege.
"},{"location":"reference/components/zp/","title":"Introduction","text":"This page describes the generation of the zebrafish phenotype ontology
"},{"location":"reference/components/zp/#details","title":"Details","text":"The ZP differs considerably from [HP], [MP] and others. ZFIN do not annotate with a pre-composed phenotype ontology - all annotations compose phenotypes on-the-fly using a combination of PATO, ZFA, GO and other ontologies.
We use these combinations to construct ZP on the fly, by naming each distinct combination, assigning it an ID, and placing it in the hierarchy.
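The naming step can be pictured with a short Python sketch. It is a simplified illustration under stated assumptions, not the actual ZP pipeline: the function name assign_zp_ids, the tuple layout (entity, quality, tag) and the zero-padded ZP: identifier format are all hypothetical, and placing the new classes in the hierarchy is left out entirely.

```python
def assign_zp_ids(annotations, prefix='ZP:', width=7):
    '''Map each distinct annotation combination to a newly minted ID.

    annotations: iterable of hashable tuples, here (entity, quality, tag).
    Assumption: IDs are a prefix plus a zero-padded running counter.
    '''
    ids = {}
    for combination in annotations:
        # Only the first occurrence of a combination mints a new ID.
        if combination not in ids:
            ids[combination] = prefix + str(len(ids) + 1).zfill(width)
    return ids

# Simplified combinations; the second entry repeats the first on purpose.
annotations = [
    ('ZFA:0009366 part_of ZFA:0000051', 'PATO:0000374', 'abnormal'),
    ('ZFA:0009366 part_of ZFA:0000051', 'PATO:0000374', 'abnormal'),
    ('ZFA:0009290 part_of ZFA:0000008', 'PATO:0040043', 'abnormal'),
]
zp_ids = assign_zp_ids(annotations)
```

Because repeated combinations reuse the ID minted on first sight, many annotations collapse onto a much smaller set of phenotype classes.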
The process is described here:
The OWL formalism for ZFIN annotations is described here:
The java implementation is here:
The OWL axioms for ZP are in zp.owl, which is built on our Hudson server.
"},{"location":"reference/imports/pato/","title":"PATO","text":"PATO is an ontology of phenotypic qualities. We use PATO to compose phenotypic descriptions. See [OWLAxiomatization]
"},{"location":"reference/imports/pato/#details","title":"Details","text":"See https://code.google.com/p/pato/
"},{"location":"reference/modelling/abnormal/","title":"Abnormal phenotypes","text":"The current design patterns are such that the abnormal qualifier is only added when the quality class in the definition is neutral.
However, we still need to be able to infer:
* Hypoplasia of right ventricle SubClassOf Abnormality of right ventricle

Because the latter class definition includes qualifier some abnormal, the SubClassOf axiom will not be entailed unless the qualifier is explicitly stated or inferred.
"},{"location":"reference/modelling/abnormal/#details","title":"Details","text":"We achieve this by including an axiom to PATO such that decreased sizes etc are inferred to be qualifier some abnormal.
We do this with an axiom in imports/extra.owl:
* 'deviation(from normal)' SubClassOf qualifier some abnormal
Anything under 'increased', 'decreased' etc in PATO is pre-reasoned in PATO to be here.
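The effect of the extra axiom can be imitated with a toy Python check. This is only a sketch under stated assumptions and in no way a substitute for an OWL reasoner: the small quality hierarchy, the (quality, bearer, qualifier) triple and all function names are hypothetical, hypoplasia is approximated as decreased size, and bearers are compared by exact match rather than by subsumption.

```python
# Toy is-a links between a few PATO-style quality classes (illustrative only).
QUALITY_PARENT = {
    'decreased size': 'deviation(from normal)',
    'deviation(from normal)': 'quality',
    'size': 'quality',
}

def quality_ancestors(quality):
    '''Return the quality followed by all of its is-a ancestors.'''
    lineage = [quality]
    while lineage[-1] in QUALITY_PARENT:
        lineage.append(QUALITY_PARENT[lineage[-1]])
    return lineage

def has_abnormal_qualifier(quality, stated_qualifier=None):
    '''True if qualifier some abnormal is stated or follows from the extra axiom.'''
    if stated_qualifier == 'abnormal':
        return True
    # Extra axiom: 'deviation(from normal)' SubClassOf qualifier some abnormal
    return 'deviation(from normal)' in quality_ancestors(quality)

def is_abnormality_of(phenotype, bearer):
    '''Check whether a (quality, bearer, qualifier) triple is an abnormality of bearer.'''
    quality, phenotype_bearer, qualifier = phenotype
    return phenotype_bearer == bearer and has_abnormal_qualifier(quality, qualifier)

hypoplasia = ('decreased size', 'right ventricle', None)
neutral_size = ('size', 'right ventricle', None)
abnormal_size = ('size', 'right ventricle', 'abnormal')
```

Here is_abnormality_of(hypoplasia, 'right ventricle') holds without any stated qualifier, whereas the neutral size quality only qualifies once abnormal is stated explicitly, mirroring the design pattern described above.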
See the following explanation:
http://phenotype-ontologies.googlecode.com/svn/trunk/doc/images/has-qualifier-inference.png
"},{"location":"reference/modelling/abnormal/#limitations","title":"Limitations","text":"For this strategy to work it requires the PATO classes themselves to be classified under deviation from normal. This may not always be the case
"},{"location":"reference/modelling/abnormal/#notes","title":"Notes","text":"Do not be distracted by the fact the has-qualifier relation is named has-component at the moment
https://code.google.com/p/phenotype-ontologies/issues/detail?id=45
"},{"location":"reference/modelling/abnormal/#notes_1","title":"Notes","text":""},{"location":"reference/modelling/absence/","title":"Absence modelling","text":"Much has been written on the subject of representing absence. Before diving into the logical issues it is worth examining patterns in existing phenotype ontologies to understand what user expectations may typically be for absence.
"},{"location":"reference/modelling/absence/#background","title":"Background","text":"* Absence_Phenotypes_in_OWL (Phenoscape Wiki)
* (outdated) material on the old PATO wiki.
It is not uncommon to see patterns such as
From a strict logical perspective, this is inverted. \"absent incisors\" surely means \"absence of all incisors\", or put another way \"the animal has no incisors\". Yet it would be possible to have an animal with *absent* lower incisors and *present* upper incisors, yielding what seems a contradiction (because the subClass axiom would say this partial-incisor animal lacked all incisors).
If the ontology were in fact truly modeling \"absence of *all* S\" then it would lead to a curious ontology structure, with the typical tree structure of the anatomy ontology representing S inverted into a polyhierarchical fan in the absent-S ontology.
From this it can be cautiously inferred that the intent of the phenotype ontology curator and user is in fact to model \"absence of *some* S\" rather than \"absence of *all* S\". This is not necessarily a universal rule, and the intent may vary depending on whether we are talking about a serially repeated structure or one that typically occurs in isolation. The intent may also be to communicate that a *significant number* of S is missing.
"},{"location":"reference/modelling/absence/#absence-as-a-type-of-morphology","title":"Absence as a type of morphology","text":"It is also not uncommon to see patterns such as:
Again, from a strict logical perspective this is false. If the spleen is absent then what does the \"morphology\" of the parent refer to?
However, this inference is clearly a desirable one from the point of view of the phenotype ontology editors and users, as it is common in ontologies for a variety of structures. For example:
And:
These patterns can be formally defended on developmental biology grounds. \"absence\" here is _not_ equivalent to logical absence. It refers specifically to developmental absence.
Furthermore, strict logical absence leads to undesirable inferences. It would be odd to include a nematode worm as having the phenotype \"spleen absent\", because worms have not evolved spleens. But the logical description of not having a spleen as part fits a worm.
Similarly, if the strict cardinality interpretation were intended, we would expect to see:
i.e. if you're missing your entire hindlegs, you're *necessarily* missing your femurs. But it must be emphasized that this is *not* how phenotype ontologies are classified. This goes for a wide range of structures and other relationship types. In MP, \"absent limb buds\" are *not* classified under \"absent limbs\", even though it is impossible for a mammal to have limbs without having had limb buds.
"},{"location":"reference/modelling/absence/#absence-as-part-of-a-size-morphology-spectrum","title":"Absence as part of a size-morphology spectrum","text":"The existing treatment of absence can be formally defended morphologically by conceiving of a morphological value space, with \"large\" at one end and \"small\" at the other. As we get continuously smaller, there may come an arbitrary point whereby we say \"surely this is no longer a limb\" (and of course, we are not talking about a pure geometrical size transformation here - as a limb reaches extreme edges of a size range various other morphological changes necessarily happen). But this cutoff is arguably arbitrary, and the resulting discontinuity causes problems. It is simpler to treat absence as being one end of a size scale.
"},{"location":"reference/modelling/absence/#summary","title":"Summary","text":"This is barely touching the subject, and is intended to illustrate that things may be more subtle than naively treating words like \"absent\" as precisely equivalent to cardinality=0. An understanding of the medical, developmental and evolutionary contexts is absolutely required, together with an understanding of the entailments of different logical formulations.
Even though existing phenotype ontologies may not be conceived of formally, it is implicit that they do not model absence as being equivalent to cardinality=0 / not(has_part), because the structure of these ontologies would look radically different.
"},{"location":"reference/modelling/absence/#todo","title":"TODO","text":"Link to Jim Balhoff's PhenoDay paper and discussion
Here's the link: http://phenoday2014.bio-lark.org/pdf/11.pdf
"},{"location":"reference/qc/odk_checks/","title":"ODK: Basic Quality Control","text":""},{"location":"tutorials/analysis/","title":"Using uPheno in Data Analysis","text":""},{"location":"tutorials/analysis/#using-oba-and-upheno-for-data-analysis","title":"Using OBA and uPheno for data analysis","text":"Authors:
Last update: 27.03.2024.
"},{"location":"tutorials/analysis/#training","title":"Training","text":"Authors:
Last update: 27.03.2024.
"},{"location":"tutorials/curation/#training","title":"Training","text":"