UniProtToOWL
Robert Hoehndorf edited this page Jan 12, 2016 · 9 revisions
Given a protein XYZ, we generate the following classes:
- class XYZ (each instance is a single protein)
- class XYZ_all (its only instance is the set of all XYZ proteins in the universe)
- class XYZ_isoform for each isoform of XYZ
- class XYZ_generic (the 'generic' form of the protein, i.e. its orthology group)
We also generate the following axioms:
XYZ SubClassOf: XYZ_generic
XYZ_isoform SubClassOf: XYZ
XYZ_isoform SubClassOf: isoform-of some XYZ
XYZ SubClassOf: member-of some XYZ_all
XYZ_all SubClassOf: { xyz }
[XYZ_all is a singleton class]
XYZ_all SubClassOf: has-member only XYZ
[XYZ_all is homogeneous; not in EL!]
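As a rough illustration of the scheme above, the following sketch emits the listed axioms as Manchester-syntax strings for one protein. The function name `protein_axioms`, the isoform class names passed in, and the convention of lowercasing the protein name for the individual in `{ xyz }` are assumptions made for this example; they are not taken from the actual converter.

```python
from typing import List


def protein_axioms(protein: str, isoforms: List[str]) -> List[str]:
    """Return Manchester-syntax axiom strings for one protein.

    `isoforms` holds the class names of the protein's isoforms
    (hypothetical names, chosen by the caller).
    """
    all_cls = f"{protein}_all"        # class whose instance is the set of all proteins
    generic = f"{protein}_generic"    # 'generic' form / orthology group
    individual = protein.lower()      # assumed naming for the singleton individual

    axioms = [
        f"{protein} SubClassOf: {generic}",
        f"{protein} SubClassOf: member-of some {all_cls}",
        # singleton class: a nominal with exactly one individual
        f"{all_cls} SubClassOf: {{ {individual} }}",
        # homogeneity: uses 'only', so this axiom is outside OWL EL
        f"{all_cls} SubClassOf: has-member only {protein}",
    ]
    for iso in isoforms:
        axioms.append(f"{iso} SubClassOf: {protein}")
        axioms.append(f"{iso} SubClassOf: isoform-of some {protein}")
    return axioms


if __name__ == "__main__":
    for axiom in protein_axioms("XYZ", ["XYZ_isoform"]):
        print(axiom)
```

Keeping the axioms as plain strings avoids committing to a particular OWL library; a real conversion would presumably build them through an OWL API instead.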
Then, we can distinguish the following:
- Every isoform of XYZ has function GO:123:
isoform-of some XYZ SubClassOf: has-function some GO:123
- Some isoform of XYZ has function GO:123:
XYZ SubClassOf: has-isoform some (has-function some GO:123)
- All generic XYZ proteins have function GO:123:
XYZ_generic SubClassOf: has-function some GO:123
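The three readings of a function annotation can be sketched as a small template lookup. The function name `function_axiom` and the scope keys are hypothetical labels introduced for this example, and the relation is spelled `isoform-of` as in the axiom list further up.

```python
# Templates for the three ways a function annotation can be read.
# {p} is the protein class name, {g} the GO class.
FUNCTION_PATTERNS = {
    # every isoform of the protein has the function
    "every_isoform": "isoform-of some {p} SubClassOf: has-function some {g}",
    # at least one isoform of the protein has the function
    "some_isoform": "{p} SubClassOf: has-isoform some (has-function some {g})",
    # all generic proteins (orthology group) have the function
    "generic": "{p}_generic SubClassOf: has-function some {g}",
}


def function_axiom(protein: str, go_term: str, scope: str) -> str:
    """Return the Manchester-syntax axiom for the chosen reading."""
    return FUNCTION_PATTERNS[scope].format(p=protein, g=go_term)


if __name__ == "__main__":
    for scope in FUNCTION_PATTERNS:
        print(function_axiom("XYZ", "GO:123", scope))
```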
- Current/standard approach:
XYZ SubClassOf: has-function some (realized-by only (located-in some GO:321))
or, for processes:
XYZ SubClassOf: has-function some (realized-by only GO:321)
- The problem: this is not in OWL-EL (it uses universal quantification, "only") and will never, ever, scale to the size of UniProt!
- Suggested solution: transform the statement into "at least one member of XYZ_all is located in the membrane" (or participates in the process, etc.):
XYZ_all SubClassOf: has-member some (located-in some GO:321)
or, for processes:
XYZ_all SubClassOf: has-member some (participates-in some GO:321)
- This is in OWL-EL!
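A minimal sketch of the two encodings side by side, assuming axioms are handled as Manchester-syntax strings. The function names are hypothetical, and `uses_universal` is only a crude textual check for the `only` keyword, not a real OWL EL profile checker.

```python
import re


def standard_axiom(protein: str, go_term: str, is_process: bool = False) -> str:
    """Current/standard encoding; relies on 'only' (universal quantification)."""
    filler = go_term if is_process else f"(located-in some {go_term})"
    return f"{protein} SubClassOf: has-function some (realized-by only {filler})"


def set_based_axiom(protein: str, go_term: str, is_process: bool = False) -> str:
    """Suggested encoding: at least one member of the protein set is
    located in (or participates in) the GO class."""
    relation = "participates-in" if is_process else "located-in"
    return f"{protein}_all SubClassOf: has-member some ({relation} some {go_term})"


def uses_universal(axiom: str) -> bool:
    """Crude check: does the axiom text contain the keyword 'only'?"""
    return re.search(r"\bonly\b", axiom) is not None


if __name__ == "__main__":
    for is_process in (False, True):
        old = standard_axiom("XYZ", "GO:321", is_process)
        new = set_based_axiom("XYZ", "GO:321", is_process)
        print(old, "| uses 'only':", uses_universal(old))
        print(new, "| uses 'only':", uses_universal(new))
```

The set-based form uses only existential restrictions, which is why it stays within EL, while the standard form is flagged by the check.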
- Given a DL query Q, how can we decide whether to rewrite the query (to use the set representation of proteins) or to use the class for individual proteins?
- Small example conversion of two protein entries in UniProt is at http://aber-owl.net/aber-owl/diseasephenotypes/drugs/uniprot-test.owl
- GitHub repo [Warning: hacky, buggy]
- complete conversion:
- build a complete data model, able to capture all of UniProt
- decide on annotation properties vs. object properties
- test conversion of SwissProt and measure reasoner performance
- convert all of UniProt and test reasoner performance [maybe in the next 10 years]