HPOSet

An HPOSet instance contains a set of HPO terms. This class is useful for representing a patient’s clinical information.

It provides helper methods for analyzing and narrowing down the clinical information provided.

HPOSet class

class pyhpo.set.HPOSet(items)[source]

child_nodes

HPOSet.child_nodes()[source]

Returns a new HPOSet that contains only the most specific HPO term for each subtree.

That is, it keeps only those HPO terms that have no descendant HPO terms present in the set.

Returns:HPOSet instance that contains only the most specific child nodes of the current HPOSet
Return type:HPOSet
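
The filtering rule can be sketched in plain Python, independent of pyhpo. The `PARENTS` mapping and the helper names `ancestors` and `most_specific` are hypothetical stand-ins for the ontology, not part of the library; this is a minimal illustration of the idea, not the library's implementation.

```python
# Toy ontology: each term maps to its direct parents (hypothetical data).
PARENTS = {
    "root": [],
    "A": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B": ["root"],
}


def ancestors(term):
    """Return every ancestor of `term`, excluding the term itself."""
    found = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def most_specific(terms):
    """Keep only terms that are not an ancestor of another term in the set."""
    terms = set(terms)
    covered = set()
    for term in terms:
        covered |= ancestors(term)
    return terms - covered


# "A" is dropped because its descendant "A1" is also present.
result = most_specific({"A", "A1", "B"})
print(sorted(result))
```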

remove_modifier

HPOSet.remove_modifier()[source]

Removes all modifier terms. By default, this includes

  • Mode of inheritance: 'HP:0000005'
  • Clinical modifier: 'HP:0012823'
  • Frequency: 'HP:0040279'
  • Clinical course: 'HP:0031797'
  • Blood group: 'HP:0032223'
  • Past medical history: 'HP:0032443'
Returns:HPOSet instance that contains only Phenotypic abnormality HPO terms
Return type:HPOSet
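
As a rough sketch of what such a filter involves, the snippet below drops every term that is a modifier root or descends from one. The root IDs are those listed above; the `ANCESTORS` lookup, the `EX:` term IDs, and the function name `without_modifiers` are hypothetical and only serve the illustration.

```python
# Modifier root terms, as listed in the documentation above.
MODIFIER_ROOTS = {
    "HP:0000005",  # Mode of inheritance
    "HP:0012823",  # Clinical modifier
    "HP:0040279",  # Frequency
    "HP:0031797",  # Clinical course
    "HP:0032223",  # Blood group
    "HP:0032443",  # Past medical history
}

# Hypothetical lookup: term ID -> IDs of all its ancestors.
ANCESTORS = {
    "EX:1": {"HP:0000005"},  # made-up term below "Mode of inheritance"
    "EX:2": {"EX:0"},        # made-up term outside every modifier subtree
}


def without_modifiers(term_ids):
    """Drop modifier roots and any term with a modifier root as ancestor."""
    return {
        term_id
        for term_id in term_ids
        if term_id not in MODIFIER_ROOTS
        and not (ANCESTORS.get(term_id, set()) & MODIFIER_ROOTS)
    }


kept = without_modifiers({"EX:1", "EX:2", "HP:0040279"})
print(kept)
```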

replace_obsolete

HPOSet.replace_obsolete(verbose=False)[source]

Replaces each obsolete term with its replacement term

Warning

Not all obsolete terms have a replacement

Parameters:verbose (bool, default: False) – Print warnings if an obsolete term does not have a replacement.
Returns:A new HPOSet
Return type:HPOSet

all_genes

HPOSet.all_genes()[source]

Calculates the union of the genes attached to the HPOTerms in this set

Returns:Set of all genes associated with the HPOTerms in the set
Return type:set of annotations.Gene
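
The union itself is a plain set operation. In the sketch below, the `genes_by_term` mapping and the gene symbols are hypothetical placeholders; real HPOTerm objects carry annotation objects rather than strings.

```python
# Hypothetical gene symbols per term.
genes_by_term = {
    "term1": {"GENE_A", "GENE_B"},
    "term2": {"GENE_B", "GENE_C"},
    "term3": set(),
}

# A gene shared by several terms appears only once in the union.
all_genes = set().union(*genes_by_term.values())
print(sorted(all_genes))
```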

omim_diseases

HPOSet.omim_diseases()[source]

Calculates the union of the OMIM diseases attached to the HPOTerms in this set

Returns:Set of all OMIM diseases associated with the HPOTerms in the set
Return type:set of annotations.Omim

information_content

HPOSet.information_content(kind=None)[source]

Returns basic information content statistics for the HPOTerms in the set

Parameters:kind (str, default: omim) – Which kind of information content should be calculated. Options are [‘omim’, ‘orpha’, ‘decipher’, ‘gene’]
Returns:Dict with the following items
  • mean - float - Mean information content
  • max - float - Maximum information content value
  • total - float - Sum of all information content values
  • all - list of float - List with all information content values
Return type:dict
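
The shape of the returned dict can be reproduced from a plain list of values. The function name `ic_summary` and the handling of an empty input are assumptions made for this sketch, not documented library behavior.

```python
def ic_summary(values):
    """Summarize information content values in the dict layout described above."""
    values = list(values)
    if not values:
        # Assumption: an empty set yields zeros rather than raising an error.
        return {"mean": 0.0, "max": 0.0, "total": 0.0, "all": []}
    total = sum(values)
    return {
        "mean": total / len(values),
        "max": max(values),
        "total": total,
        "all": values,
    }


stats = ic_summary([1.0, 2.0, 3.0])
print(stats["mean"], stats["max"], stats["total"])
```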

variance

HPOSet.variance()[source]

Calculates the distances between all pairs of terms in the set and reports summary statistics over those distances.

Returns:Tuple with the variance metrics
  • int Average distance between pairs
  • int Smallest distance between pairs
  • int Largest distance between pairs
  • list of int List of all distances between pairs
Return type:tuple of (int, int, int, list of int)
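
A minimal sketch of how such a tuple could be assembled, assuming each unordered pair is measured once. The function name `pair_distance_stats`, the integer "terms", and the absolute-difference distance are hypothetical; in the library the distance would come from the ontology graph.

```python
from itertools import combinations


def pair_distance_stats(terms, distance):
    """Return (mean, min, max, all distances) over every unordered pair."""
    distances = [distance(a, b) for a, b in combinations(terms, 2)]
    if not distances:
        # Assumption: fewer than two terms yields zeros and an empty list.
        return (0, 0, 0, [])
    mean = sum(distances) / len(distances)
    return (mean, min(distances), max(distances), distances)


# Integers stand in for terms; abs(a - b) stands in for the path length.
stats = pair_distance_stats([1, 4, 6], lambda a, b: abs(a - b))
print(stats)
```

Note that the mean in this sketch is a float; the documented return type lists an int for the average.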

combinations

HPOSet.combinations()[source]

Helper generator function that yields all possible pairwise combinations of its terms.

This method is direction-dependent: every pair appears twice, once for each direction.

Yields:

Tuple of term.HPOTerm – Tuple containing the following items

  • HPOTerm instance 1 of the pair
  • HPOTerm instance 2 of the pair

Examples

ci = HPOSet([term1, term2, term3])
list(ci.combinations())  # the method is a generator, so collect it into a list

# Output:
[
    (term1, term2),
    (term1, term3),
    (term2, term1),
    (term2, term3),
    (term3, term1),
    (term3, term2)
]
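
The same ordered pairing can be reproduced with the standard library. This is only an analogy, assuming placeholder strings instead of HPOTerm objects; a real HPOSet does not guarantee iteration order.

```python
from itertools import permutations

# Placeholder strings standing in for HPOTerm objects.
terms = ["term1", "term2", "term3"]

# permutations(..., 2) yields every ordered pair of distinct items,
# so each pair shows up once per direction.
ordered_pairs = list(permutations(terms, 2))
print(len(ordered_pairs))
```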

combinations_one_way

HPOSet.combinations_one_way()[source]

Helper generator function that yields all possible pairwise combinations of its terms.

This method reports each pair only once.

Yields:

Tuple of term.HPOTerm – Tuple containing the following items

  • HPOTerm instance 1 of the pair
  • HPOTerm instance 2 of the pair

Example

ci = HPOSet([term1, term2, term3])
list(ci.combinations_one_way())  # the method is a generator, so collect it into a list

# Output:
[
    (term1, term2),
    (term1, term3),
    (term2, term3)
]
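
For comparison, the standard library produces the same direction-independent pairing. The strings below are placeholders for HPOTerm objects, and the ordering is deterministic only because a list is used here.

```python
from itertools import combinations

# Placeholder strings standing in for HPOTerm objects.
terms = ["term1", "term2", "term3"]

# combinations(..., 2) yields each unordered pair exactly once:
# n * (n - 1) / 2 pairs for n terms.
unordered_pairs = list(combinations(terms, 2))
print(unordered_pairs)
```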

similarity

HPOSet.similarity(other, kind='omim', method=None)[source]

Calculates the similarity to another HPOSet, according to Robinson et al, American Journal of Human Genetics, (2008) and Pesquita et al, BMC Bioinformatics, (2008).

Parameters:
  • other (HPOSet) – Another HPOSet to measure the similarity to
  • kind (str, default omim) – Which kind of information content should be calculated. Options are [‘omim’, ‘orpha’, ‘decipher’, ‘gene’]
  • method (string, default resnik) –

    The method to use to calculate the similarity.

    Available options:

    • resnik - Resnik P, Proceedings of the 14th IJCAI, (1995)
    • lin - Lin D, Proceedings of the 15th ICML, (1998)
    • jc - Jiang J, Conrath D, ROCLING X, (1997). Implementation according to R source code
    • jc2 - Jiang J, Conrath D, ROCLING X, (1997). Implementation according to paper from R hposim library, Deng Y, et al., PLoS One, (2015)
    • rel - Relevance measure - Schlicker A, et al., BMC Bioinformatics, (2006)
    • ic - Information coefficient - Li B, et al., arXiv, (2010)
    • dist - Distance between HPO terms
    • equal - Calculates exact matches between both sets
Returns:

The similarity score to the other HPOSet

Return type:

float

toJSON

HPOSet.toJSON(verbose=False)[source]

Creates a JSON-like object of the HPOSet

Parameters:verbose (bool, default False) – Include extra properties of the HPOTerm
Returns:a list of HPOTerm dict objects
Return type:list of dict

serialize

HPOSet.serialize()[source]

Creates a string serialization that can be used to rebuild the same HPOSet via pyhpo.set.HPOSet.from_serialized()

Returns:A string representation of the HPOSet
Return type:str
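
A round trip can be sketched as follows, assuming the serialized form is a `+`-joined list of integer term IDs, as the string `'12+24+66628'` in the from_serialized example suggests. The sorted ordering and the helper names `serialize_ids` and `deserialize_ids` are assumptions for illustration, not the library's API.

```python
def serialize_ids(ids):
    """Join integer term IDs with '+' (sorted for a stable result)."""
    return "+".join(str(term_id) for term_id in sorted(ids))


def deserialize_ids(text):
    """Parse a '+'-joined string back into a set of integer term IDs."""
    return {int(part) for part in text.split("+") if part}


serialized = serialize_ids({66628, 12, 24})
restored = deserialize_ids(serialized)
print(serialized)
```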

BasicHPOSet class

class pyhpo.set.BasicHPOSet(items)[source]

Child of HPOSet that automatically:

  • removes parent terms
  • removes modifier terms
  • replaces obsolete terms

Class methods

from_queries

classmethod HPOSet.from_queries(queries)[source]

Builds an HPO set by specifying a list of queries to run on the pyhpo.ontology.Ontology

Parameters:queries (list of (string or int)) – The queries to run to identify the HPOTerms from the ontology
Returns:A new HPOSet
Return type:pyhpo.set.HPOSet

Examples

ci = HPOSet.from_queries([
    'Scoliosis',
    'HP:0001234',
    12
])

from_serialized

classmethod HPOSet.from_serialized(pickle)[source]

Rebuilds an HPO set from a serialized HPOSet object

Parameters:pickle (str) – The serialized HPOSet object
Returns:A new HPOSet
Return type:pyhpo.set.HPOSet

Examples

ci = HPOSet.from_serialized('12+24+66628')