Statistics

This module provides helper functions for statistical analysis of an HPOSet or of annotations.

HPOEnrichment class

class pyhpo.stats.HPOEnrichment(category: str)

Calculates the enrichment of HPO Terms in an Annotation set.

You can use this class for the following example use cases:

  • You have a list of genes (e.g. from RNA-seq differential gene expression) and want to see whether some HPO terms are enriched in that group.

  • You have a list of OMIM diseases and want to see whether they share underlying HPO symptoms.

Parameters

category (str) –

String declaring whether the enrichment is calculated for genes or for OMIM diseases

Options are:

  • gene

  • omim

enrichment

HPOEnrichment.enrichment(method: str, annotation_sets: List[pyhpo.annotations.Annotation]) → List[dict]

Calculates the enrichment of HPO terms in the provided annotation set

Parameters
  • method (str) –

    The statistical test for enrichment

    • hypergeom Hypergeometric distribution test

  • annotation_sets (list of Annotation) – Every annotation item in the list must have an attribute hpos, which is a list of HPO term indices

Returns

The enrichment of every HPO term in the annotation_sets list, sorted by descending enrichment. Every dict has the following keys:

  • hpo: HPOTerm

  • count: Number of appearances in the sets

  • enrichment: Enrichment score

Return type

list of dict
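The hypergeom method is described here only as a hypergeometric distribution test. As a rough illustration of that idea, and not pyhpo's actual implementation, the following standard-library sketch counts how often each HPO term appears in a study group and asks how likely that count would be when drawing the same number of annotations at random from a background. The names hypergeom_tail and term_enrichment, the plain sets of integer term ids, and the choice of background are all assumptions made for this example.

```python
from collections import Counter
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable X.

    population: number of annotations in the background
    successes:  background annotations that carry the term
    draws:      number of annotations in the study group
    """
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, min(successes, draws) + 1)
    )
    return hits / total


def term_enrichment(annotation_sets, background):
    """Hypothetical helper: score every HPO term seen in the study group.

    Both arguments are lists of sets of HPO term ids. The background is
    assumed to contain the study group.
    """
    study_counts = Counter(t for terms in annotation_sets for t in set(terms))
    background_counts = Counter(t for terms in background for t in set(terms))
    results = [
        {
            "hpo": term,
            "count": count,
            "enrichment": hypergeom_tail(
                count, len(background), background_counts[term], len(annotation_sets)
            ),
        }
        for term, count in study_counts.items()
    ]
    # Smallest tail probability (strongest enrichment) first.
    return sorted(results, key=lambda row: row["enrichment"])


# Toy data: term 1 is rare in the background, term 2 is everywhere.
background = [{1, 2}] * 3 + [{2}] * 7
study_group = background[:3]
ranked_terms = term_enrichment(study_group, background)
```

In this toy data both terms appear three times in the study group, but only term 1 is rare in the background, so it ranks first. The result mirrors the list-of-dict shape with the keys hpo, count and enrichment described above.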

EnrichmentModel class

class pyhpo.stats.EnrichmentModel(category: str)

Calculates the enrichment of annotations in an HPOSet.

You can use this class for the following example use cases:

  • You have a set of HPOTerms and want to find the most likely causative gene

  • You have a set of HPOTerms and want to find the underlying disease

Parameters

category (str) –

String declaring which annotation type the enrichment is calculated for: genes, OMIM diseases, Orpha diseases or Decipher diseases

Options are:

  • gene

  • omim

  • orpha

  • decipher

enrichment

EnrichmentModel.enrichment(method: str, hposet: pyhpo.set.HPOSet) → List[dict]

Calculates the enrichment of annotations in the provided HPOSet

Parameters
  • method (str) –

    The statistical test for enrichment

    • hypergeom Hypergeometric distribution test

  • hposet (HPOSet) – The set of HPO terms in which to look for enriched annotations

Returns

The enrichment of every annotation item sorted by descending enrichment. Every dict has the following keys:

  • item: Gene, OMIM, Orpha or Decipher annotation item

  • count: Number of appearances in the sets

  • enrichment: Enrichment score

Return type

list of dict
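The reverse direction, ranking annotation items for a given set of HPO terms, can be sketched in the same spirit. This is a minimal standard-library illustration under stated assumptions, not the library's code: overlap_pvalue and rank_items are hypothetical names, items are plain strings, terms are integer ids, and the background is assumed to be the total number of HPO terms in a toy ontology.

```python
from math import comb


def overlap_pvalue(overlap, n_terms_total, n_terms_item, n_terms_query):
    """P(X >= overlap): chance of sharing at least this many terms at random."""
    total = comb(n_terms_total, n_terms_query)
    tail = sum(
        comb(n_terms_item, i) * comb(n_terms_total - n_terms_item, n_terms_query - i)
        for i in range(overlap, min(n_terms_item, n_terms_query) + 1)
    )
    return tail / total


def rank_items(query_terms, item_terms, n_terms_total):
    """Hypothetical helper: rank annotation items by overlap with the query set.

    item_terms maps an item name (e.g. a gene symbol) to its set of term ids.
    Items that share no term with the query are skipped.
    """
    query = set(query_terms)
    results = []
    for item, terms in item_terms.items():
        terms = set(terms)
        count = len(query & terms)
        if count == 0:
            continue
        results.append({
            "item": item,
            "count": count,
            "enrichment": overlap_pvalue(count, n_terms_total, len(terms), len(query)),
        })
    # Smallest tail probability (strongest enrichment) first.
    return sorted(results, key=lambda row: row["enrichment"])


# Toy ontology with 20 terms and three made-up genes.
toy_genes = {
    "GENE_A": {1, 2, 3},
    "GENE_B": {3, 4, 5, 6, 7, 8, 9, 10},
    "GENE_C": {11},
}
ranked_items = rank_items({1, 2, 3}, toy_genes, n_terms_total=20)
```

Here GENE_A ranks first because all three query terms are annotated to it and it has no other terms, while GENE_B shares only one term despite having many annotations. GENE_C shares nothing with the query and is left out.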