API for HPO (The Human Phenotype Ontology)
Per their website, "The Human Phenotype Ontology (HPO) provides a standardized vocabulary of phenotypic abnormalities encountered in human disease." The HPO "is a flagship product of the Monarch Initiative, an NIH-supported international consortium dedicated to semantic integration of biomedical and model organism data with the ultimate goal of improving biomedical research."
This service is provided "as is" and free of charge. Please see the Frequently Asked Questions page for more details on terms of service, etc.
API Demo
The following demo shows how this API might be used with an autocompleter we've developed.
For further experimentation with the autocompleter and this API, try the autocompleter demo page.
API Documentation
API Base URL: https://clinicaltables.nlm.nih.gov/api/hpo/v3/search (+ query string parameters)
This data set may also be accessed through the FHIR ValueSet $expand operation.
In addition to the base URL, you will need to specify other parameters. See the query string parameters section below for details.
Query String Parameters and Default Values
At a minimum, when using the above base URL, you will need to specify the "terms" parameter containing a word or partial word to match.
Parameter Name | Default Value | Description |
---|---|---|
terms | | (Required.) The search string (e.g., just a part of a word) for which to find matches in the list. More than one partial word can be present in "terms", in which case there is an implicit AND between them. |
maxList | 7 | Optional, with a default of 7. Specifies the number of results requested, up to the upper limit of 500. If present but the value is empty, 500 will be used. Note that this parameter does not support pagination; see "count" and "offset" below for details on pagination support. |
count | 7 | The number of results to retrieve (page size). The maximum count allowed is 500. See "offset" below for pagination support. |
offset | 0 | The starting result number (0-based) to retrieve. Use offset and count together for pagination. Note that the current limit on the total number of results that can be retrieved (offset + count) is 7,500. We reserve the right to decrease or increase this limit based on system capacity and/or other factors. Please see the FAQ page on how to sign up to our email list to be notified of any changes or new features. |
q | | An optional, additional query string used to further constrain the results returned by the "terms" field. Unlike the terms field, "q" is not automatically wildcarded, but can include wildcards and can specify field names. See the Elasticsearch query string page for documentation of supported syntax. |
df | id,name | A comma-separated list of display fields (from the fields section below) which are intended for the user to see when looking at the results. The parameter "ef" (see below) may also be used to specify the data fields to retrieve. The main difference is that the value of "df" is always a string (for display), while the value for "ef" could be a JSON object when the field value has a complex structure. |
sf | id,name,synonym.term | A comma-separated list of fields to be searched. |
cf | id | A field to regard as the "code" for the returned item data. |
ef | | A comma-separated list of additional fields to be returned for each retrieved list item. (See the Output format section for how the data for fields is returned.) If you wish the keys in the returned data hash to be something other than the field names, you can specify an alias for a field by separating it from the field name with a colon, e.g., "ef=field_name1:alias1,field2,field_name3:alias3". Note that not every field specified in the ef parameter needs to have an alias. The parameter "df" (see above) may also be used to specify the data fields to retrieve. The main difference is that the value of "df" is always a string (for display), while the value for "ef" could be a JSON object when the field value has a complex structure. |
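
As a rough illustration of how the parameters above combine, the following Python sketch builds a search URL and computes page boundaries that respect the documented limits (a "count" of at most 500, and "offset" + "count" of at most 7,500). The helper names `build_search_url` and `page_offsets` and the alias `def` are hypothetical, chosen only for this example; they are not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = "https://clinicaltables.nlm.nih.gov/api/hpo/v3/search"
MAX_COUNT = 500   # documented per-page maximum for "count"
MAX_TOTAL = 7500  # documented current limit on offset + count (may change)


def build_search_url(terms, **params):
    """Return a search URL; "terms" is required, other parameters are optional."""
    if not terms:
        raise ValueError('the "terms" parameter is required')
    query = {"terms": terms}
    # Skip parameters left as None so the server-side defaults apply.
    query.update({k: v for k, v in params.items() if v is not None})
    return BASE_URL + "?" + urlencode(query)


def page_offsets(count=MAX_COUNT, limit=MAX_TOTAL):
    """Yield (offset, count) pairs that keep offset + count within the limit."""
    if not 1 <= count <= MAX_COUNT:
        raise ValueError("count must be between 1 and 500")
    offset = 0
    while offset < limit:
        size = min(count, limit - offset)
        yield offset, size
        offset += size


# Third page of 20 results, with two extra fields; "def" is a hypothetical alias.
url = build_search_url("bloo", count=20, offset=40, ef="definition:def,is_obsolete")
print(url)
```

A caller would normally stop paging as soon as a page comes back with fewer results than requested, rather than always walking to the limit.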
HPO Field Descriptions
Field | Field Description |
---|---|
id | The HPO term id. |
name | The HPO term name. |
definition | The HPO term definition. |
def_xref | A list of xrefs (references to other ids such as the MeSH ids) from the HPO definition field. |
created_by | The "created_by" in the HPO data records, which may be a person's name or some kind of id. |
creation_date | The "creation_date" in the HPO data records. |
comment | The comment field for the record. |
is_obsolete | A boolean flag (true or false) indicating whether the record is obsolete. |
replaced_by | The replaced_by (id) as provided in the HPO data. |
consider | A list of ids (as an array) from the "consider" field of the HPO data. |
alt_id | A list of alternative ids for the record. |
synonym | A list of synonyms, each of which is an object (structure) with the following four fields: term, relation, type, and xref. |
synonym.term | The synonym term. |
synonym.relation | The relationship between the synonym term and the concept term itself, e.g., EXACT, BROAD. |
synonym.type | The type of the synonym, e.g., uk_spelling, layperson. |
synonym.xref | The xref of the synonym as in the HPO data records. |
is_a | A list of super concept ids, each of which is an object (structure) with the following two fields: id and name. |
is_a.id | The id of the super concept. |
is_a.name | The name of the super concept. |
xref | A list of xrefs, each of which is an object (structure) with the following two fields: id and name. |
xref.id | The id of the xref. |
xref.name | The name of the xref. |
property | A list of properties, each of which is an object (structure) with the following four fields: name, value, data_type, and xref. |
property.name | The name of the property. |
property.value | The value of the property. |
property.data_type | The data type of the property. |
property.xref | The xref of the property as provided in the HPO data. |
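
Structured fields such as "synonym" come back as lists of objects when requested through "ef". The sketch below shows one way to filter such a list by relation or type. The synonym values are invented placeholders, and treating a missing field as `None` is an assumption of this example, not documented behavior.

```python
def synonym_terms(synonyms, relation=None, syn_type=None):
    """Return the "term" of each synonym object that matches the optional filters."""
    matches = []
    for syn in synonyms or []:  # assumption: the field may be None for some records
        if relation is not None and syn.get("relation") != relation:
            continue
        if syn_type is not None and syn.get("type") != syn_type:
            continue
        matches.append(syn.get("term"))
    return matches


# Hypothetical data shaped like the field table: term, relation, type, xref.
example_synonyms = [
    {"term": "example exact synonym", "relation": "EXACT", "type": "layperson", "xref": None},
    {"term": "example broad synonym", "relation": "BROAD", "type": None, "xref": None},
]

print(synonym_terms(example_synonyms, relation="EXACT"))
```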
Output format
Output for an API query is an array of the following elements:
- The total number of results on the server, which can be more than the number of results returned. The reported total is capped at 10,000, so it may be significantly less than the actual number of results; this cap may significantly improve the service response time.
- An array of codes for the returned items. (This is the field specified with the cf query parameter above.)
- A hash of the "extra" data requested via the "ef" query parameter above. The keys on the hash are the fields (or their requested aliases) named in the "ef" parameter, and the value for a field is an array of that field's values in the same order as the returned codes.
- An array, with one element for each returned code, where each element is an array of the display strings specified with the "df" query parameter.
- An array, with one element for each returned code, where each element is the "code system" for the returned code. Note that only code-system aware APIs will return this array.
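
The positional layout above can be turned into one record per returned code. This is a minimal sketch, assuming the response has already been decoded from JSON into a Python list; `parse_response` is a hypothetical helper name, and the "is_obsolete" values in the sample payload are invented for illustration.

```python
def parse_response(payload):
    """Split a decoded response into (total, records), one record per code."""
    total, codes, extra, display = payload[:4]
    # The code-system array is only present for code-system aware APIs.
    code_systems = payload[4] if len(payload) > 4 else None
    extra = extra or {}  # the third element is null when "ef" was not used
    records = []
    for i, code in enumerate(codes):
        record = {"code": code, "display": display[i]}
        for key, values in extra.items():
            record[key] = values[i]  # values are in the same order as the codes
        if code_systems is not None:
            record["code_system"] = code_systems[i]
        records.append(record)
    return total, records


# Sample payload: codes and names from the sample query, extra data hypothetical.
sample = [
    342,
    ["HP:0001871", "HP:0001898"],
    {"is_obsolete": [False, False]},
    [["Abnormality of blood and blood-forming tissues"], ["Increased red blood cell mass"]],
]

total, records = parse_response(sample)
print(total, records[0])
```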
Sample API Queries
Query | Result | Description |
---|---|---|
https://clinicaltables.nlm.nih.gov/api/hpo/v3/search?terms=bloo | [342,["HP:0001871","HP:0001898","HP:0002632","HP:0002971","HP:0003111","HP:0003138","HP:0004421"],null,[ [ "Abnormality of blood and blood-forming tissues" ], [ "Increased red blood cell mass" ], [ "Low-to-normal blood pressure" ], [ "Absent microvilli on the surface of peripheral blood lymphocytes" ], [ "Abnormal blood ion concentration" ], [ "Increased blood urea nitrogen" ], [ "Elevated systolic blood pressure" ]]] | Returns a list of 7 (out of 342 total) HPO records that match (or start with) "bloo". |
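
For reference, here is a hedged sketch of issuing the sample query from Python using only the standard library. The function names `http_get_json` and `search_hpo` are hypothetical. The fetcher is passed in as an argument so the example can run against a canned copy of the sample result (trimmed to two rows) without network access; omitting the `fetch` argument would send a live request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://clinicaltables.nlm.nih.gov/api/hpo/v3/search"


def http_get_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def search_hpo(terms, fetch=http_get_json, **params):
    """Return (total, [(code, display_fields), ...]) for one search request."""
    url = SEARCH_URL + "?" + urlencode({"terms": terms, **params})
    total, codes, _extra, display = fetch(url)[:4]
    return total, list(zip(codes, display))


def canned_fetch(url):
    """Stand-in fetcher returning the first two rows of the sample result."""
    return [
        342,
        ["HP:0001871", "HP:0001898"],
        None,
        [["Abnormality of blood and blood-forming tissues"], ["Increased red blood cell mass"]],
    ]


total, hits = search_hpo("bloo", fetch=canned_fetch)
for code, fields in hits:
    print(code, fields[0])
```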