Annotations

The Annotations section contains various metadata annotations for HPO terms.

GeneSingleton class

class pyhpo.annotations.GeneSingleton(idx: Optional[int], name: str)[source]

This class represents a single gene.

Note

GeneSingleton should never be instantiated directly, but only via GeneDict, to ensure that every gene is created only once.

id

HGNC gene ID

Type

int

name

HGNC gene symbol

Type

str

symbol

HGNC gene symbol (alias of GeneSingleton.name)

Type

str

hpo

all HPOTerms associated with the gene

Type

set of pyhpo.term.HPOTerm

Parameters
  • idx (int) – HGNC gene ID

  • name (str) – HGNC gene symbol

toJSON

GeneSingleton.toJSON(verbose: bool = False) → Dict[source]

JSON (dict) representation of Gene

Parameters

verbose (bool, default: False) – Return all associated HPOTerms

Returns

A dict with the following keys

  • id - The HGNC ID

  • name - The gene symbol

  • symbol - The gene symbol (same as name)

  • hpo - (If verbose == True): set of pyhpo.term.HPOTerm

Return type

dict
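The shape of the returned dict can be illustrated with a small stand-in class. This is a minimal sketch, not the pyhpo implementation: GeneSketch, the gene ID and the symbol are all hypothetical, and the hpo set is left empty instead of holding real HPOTerm objects.

```python
from typing import Any, Dict, Optional, Set


class GeneSketch:
    """Hypothetical stand-in for GeneSingleton (illustration only)."""

    def __init__(self, idx: Optional[int], name: str) -> None:
        self.id = idx
        self.name = name
        # In pyhpo this set would hold HPOTerm objects; empty here.
        self.hpo: Set[Any] = set()

    @property
    def symbol(self) -> str:
        # Alias of ``name``, mirroring the documented attribute.
        return self.name

    def toJSON(self, verbose: bool = False) -> Dict[str, Any]:
        res: Dict[str, Any] = {
            'id': self.id,
            'name': self.name,
            'symbol': self.symbol,
        }
        if verbose:
            # The ``hpo`` key is only present when verbose is True.
            res['hpo'] = self.hpo
        return res


gene = GeneSketch(1234, 'EXAMPLE')  # made-up ID and symbol
print(gene.toJSON())
print(sorted(gene.toJSON(verbose=True)))
```

Note that the verbose result contains a set, so it would need converting (for example to a list) before being passed to a JSON serializer.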

GeneDict class

class pyhpo.annotations.GeneDict[source]

An associative dict of all genes

Ensures that every gene is a single GeneSingleton instance and that no duplicate instances are generated while parsing the Gene-Pheno-HPO associations.

This class is initialized once. Genes are created by calling the GeneDict instance, which ensures that each gene exists only once.

For example

Gene = GeneDict()
gba = Gene(symbol='GBA')
ezh2 = Gene(symbol='EZH2')
gba_2 = Gene(symbol='GBA')

gba is ezh2
>> False
gba is gba_2
>> True

Parameters
  • cols (list, default: None) –

    Only used for backwards compatibility reasons. Should have the following entries

    • None

    • None

    • HGNC-ID

    • Gene symbol

  • hgncid (int) – The HGNC ID

  • symbol (str) – The gene symbol (alternative to name)

Returns

Return type

GeneSingleton
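The "one instance per gene" behaviour described above can be sketched with a callable dict that caches the objects it creates. This is an illustration under assumptions, not pyhpo's code: Gene and GeneCache are hypothetical names, and keying the cache by symbol is a simplification chosen for the example.

```python
from typing import Optional


class Gene:
    """Hypothetical minimal gene record (illustration only)."""

    def __init__(self, idx: Optional[int], name: str) -> None:
        self.id = idx
        self.name = name


class GeneCache(dict):
    """Callable dict that hands out one shared Gene per symbol."""

    def __call__(self, hgncid: Optional[int] = None, symbol: str = '') -> Gene:
        # Create the gene on first request, then always return the
        # cached object so identity checks with ``is`` succeed.
        if symbol not in self:
            self[symbol] = Gene(hgncid, symbol)
        return self[symbol]


genes = GeneCache()
gba = genes(symbol='GBA')
ezh2 = genes(symbol='EZH2')
gba_2 = genes(symbol='GBA')

print(gba is ezh2)   # False
print(gba is gba_2)  # True
```

Because every caller receives the same object, annotations attached to a gene in one place are visible wherever that gene is referenced.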

DiseaseSingleton class

class pyhpo.annotations.DiseaseSingleton(idx: int, name: str)[source]

This class represents a single disease.

Note

DiseaseSingleton should never be instantiated directly, but only via the appropriate disease dictionary, e.g. OmimDict (DiseaseDict), to ensure that every disease is created only once.

id

Disease ID

Type

int

name

disease name

Type

str

hpo

all HPOTerms associated with the disease

Type

set of pyhpo.term.HPOTerm

Parameters
  • idx (int) – Disease ID

  • name (str) – Disease name

toJSON

DiseaseSingleton.toJSON(verbose: bool = False) → Dict[source]

JSON (dict) representation of Disease

Parameters

verbose (bool, default: False) – Return all associated HPOTerms

Returns

A dict with the following keys

  • id - The Disease ID

  • name - The disease name

  • hpo - (If verbose == True): set of pyhpo.term.HPOTerm

Return type

dict

DiseaseDict class

class pyhpo.annotations.DiseaseDict[source]

An associative dict of all Omim Diseases

Ensures that every Omim Disease is a single OmimDisease instance and that no duplicate instances are generated while parsing the Gene-Pheno-HPO associations.

This class is initialized once. Diseases are created by calling the DiseaseDict instance, which ensures that each disease exists only once.

For example

Disease = OmimDict()
gaucher = Disease(diseaseid=1)
fabry = Disease(diseaseid=2)
gaucher_2 = Disease(diseaseid=1)

gaucher is fabry
>> False
gaucher is gaucher_2
>> True

Parameters
  • cols (list, default: None) –

    Only used for backwards compatibility reasons. Should have the following entries

    • None

    • Disease ID

    • Disease Name

  • diseaseid (int) – The Disease ID

  • name (str) – The disease name

Returns

Return type

DiseaseSingleton

Omim

Instance of pyhpo.annotations.DiseaseDict to handle Omim diseases. Ensures that diseases are not duplicated through use of Singletons.

Orpha

Instance of pyhpo.annotations.DiseaseDict to handle Orphanet diseases. Ensures that diseases are not duplicated through use of Singletons.

Decipher

Instance of pyhpo.annotations.DiseaseDict to handle Decipher diseases. Ensures that diseases are not duplicated through use of Singletons.

HPO_Gene class

class pyhpo.annotations.HPO_Gene(filename: Optional[str] = None, path: str = './')[source]

Associative dict to link an HPO term to a Gene

Parameters
  • filename (str) – Filename of HPO-Gene association file. Defaults to filename from HPO

  • path (str) – Path to data files. Defaults to ‘./’

Methods

parse_pheno_file

pyhpo.annotations.parse_pheno_file(filename: Optional[str] = None, path: str = './', delimiter: str = '\t') → Tuple[Any, ][source]

Parses the OMIM-HPO association file and generates a positive and a negative annotation dictionary

Parameters
  • filename (str) – Filename of HPO-Gene association file. Defaults to filename from HPO

  • path (str) – Path to data files. Defaults to ‘./’

Returns

  • omim_dict (dict) – Dictionary containing all HPO-OMIM associations. HPO-ID is the key

  • negative_omim_dict (dict) – Dictionary containing all negative HPO-OMIM associations. HPO-ID is the key
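The split into a positive and a negative dictionary, both keyed by HPO-ID, might look roughly like the sketch below. Everything about the input is an assumption made for illustration: the three-column layout (disease ID, qualifier, HPO-ID), the 'NOT' qualifier and the function name split_annotations are hypothetical and do not describe the real association file or pyhpo's parser.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Annotations = Dict[str, Set[str]]


def split_annotations(
    lines: Iterable[str], delimiter: str = '\t'
) -> Tuple[Annotations, Annotations]:
    """Group disease IDs by HPO-ID, separating negated annotations."""
    positive: Annotations = defaultdict(set)
    negative: Annotations = defaultdict(set)
    # Skip comment rows before handing the lines to the csv reader.
    data = (line for line in lines if not line.startswith('#'))
    for row in csv.reader(data, delimiter=delimiter):
        if len(row) < 3:
            continue  # ignore blank or malformed rows
        # Hypothetical column order: disease ID, qualifier, HPO-ID.
        disease_id, qualifier, hpo_id = row[:3]
        target = negative if qualifier == 'NOT' else positive
        target[hpo_id].add(disease_id)
    return dict(positive), dict(negative)


# Made-up sample data in the assumed layout.
sample = io.StringIO(
    "# hypothetical data\n"
    "1\t\tHP:0000001\n"
    "2\tNOT\tHP:0000001\n"
    "3\t\tHP:0000002\n"
)
omim_dict, negative_omim_dict = split_annotations(sample)
print(omim_dict)
print(negative_omim_dict)
```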

remove_outcommented_rows

pyhpo.annotations.remove_outcommented_rows(fh: Iterator[str], ignorechar: str = '#') → Iterator[str][source]

Removes all rows from a filereader object that start with a comment character

Parameters
  • fh (iterator) – any object which supports the iterator protocol and returns a string each time its __next__() method is called — file objects and list objects are both suitable

  • ignorechar (str, default: '#') – All lines starting with this character will be ignored

Yields

row (str) – One row of the fh iterator
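A generator with the behaviour described above can be sketched in a few lines. This is a standalone illustration rather than the library's source; skip_comment_rows is a hypothetical name and the sample rows are made up.

```python
from typing import Iterator


def skip_comment_rows(fh: Iterator[str], ignorechar: str = '#') -> Iterator[str]:
    """Yield every row of ``fh`` that does not start with ``ignorechar``."""
    for row in fh:
        if row.startswith(ignorechar):
            continue  # drop commented-out rows
        yield row


# Any iterator of strings works: an open file or, as here, a list iterator.
rows = [
    '# header comment',
    'HP:0000001\tAll',
    '#another comment',
    'HP:0000118\tPhenotypic abnormality',
]
kept = list(skip_comment_rows(iter(rows)))
print(kept)
```

Since the function is a generator, rows are filtered lazily, so a large file can be processed without reading it into memory first.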