API for COSMIC Structural Genomic Rearrangements data
This API provides information from the COSMIC Structural Genomic Rearrangements data set, which is provided by the Sanger Institute and covers both the GRCh37 and GRCh38 genome builds.
Source files are available from the COSMIC download page (a login account is required for access). Release version:
This service is provided "as is" and free of charge. Please see the Frequently Asked Questions page for more details on the terms of service.
The following demo shows how this API might be used with an autocompleter we've developed. (Example: Try typing 123.)
For further experimentation with the autocompleter and this API, try the autocompleter demo page.
API Base URL: https://clinicaltables.nlm.nih.gov/api/cosmic_struct/v3/search (+ query string parameters)
This data set may also be accessed through the FHIR ValueSet $expand operation.
In addition to the base URL, you will need to specify other parameters. See the query string parameters section below for details.
Query String Parameters and Default Values
At a minimum, when using the above base URL, you will need to specify the "terms" parameter containing a word or partial word to match.
|Parameter Name||Default Value||Description|
|terms||(none)||(Required.) The search string (e.g., just a part of a word) for which to find matches in the list. More than one partial word can be present in "terms", in which case there is an implicit AND between them.|
|maxList||7||Optional, with a default of 7. Specifies the number of results requested, up to the upper limit of 500. If present but the value is empty, 500 will be used.|
|q||(none)||An optional, additional query string used to further constrain the results returned by the "terms" field. Unlike the terms field, "q" is not automatically wildcarded, but can include wildcards and can specify field names. See the Elasticsearch query string page for documentation of supported syntax.|
|grchv||37||Genome Reference Consortium Human build version, either 37 or 38.|
|df||MutationID, Description, MutationType||A comma-separated list of display fields (from the fields section below) which are intended for the user to see when looking at the results.|
|sf||All fields||A comma-separated list of fields to be searched.|
|cf||MutationID||A field to regard as the "code" for the returned item data.|
|ef||(none)||A comma-separated list of additional fields to be returned for each retrieved list item. (See the Output format section for how the data for fields is returned.) If you wish the keys in the returned data hash to be something other than the field names, you can specify an alias for a field by separating the alias from the field name with a colon, e.g., "ef=field_name1:alias1,field2,field_name3:alias3". Note that not every field specified in the ef parameter needs to have an alias.|
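As an illustration of how the parameters above combine into a request, here is a minimal Python sketch that only assembles the URL (it does not contact the server). The helper name `build_search_url` and its argument names are assumptions made for this example and are not part of the API. Parameters left unset are omitted so that the server defaults listed in the table apply.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://clinicaltables.nlm.nih.gov/api/cosmic_struct/v3/search"


def _join_ef(fields):
    """Join "ef" entries; (field, alias) pairs are rendered as "field:alias"."""
    return ",".join(f if isinstance(f, str) else f"{f[0]}:{f[1]}" for f in fields)


def build_search_url(terms, max_list=None, grchv=None, q=None,
                     df=None, sf=None, cf=None, ef=None):
    """Hypothetical helper: build a search URL from the documented parameters.

    Arguments left as None are not sent, so the server defaults apply.
    """
    params = {"terms": terms}
    if max_list is not None:
        params["maxList"] = max_list
    if grchv is not None:
        if grchv not in (37, 38):
            raise ValueError("grchv must be 37 or 38")
        params["grchv"] = grchv
    if q is not None:
        params["q"] = q
    if df:
        params["df"] = ",".join(df)
    if sf:
        params["sf"] = ",".join(sf)
    if cf is not None:
        params["cf"] = cf
    if ef:
        params["ef"] = _join_ef(ef)
    # urlencode percent-encodes commas and colons; they decode back unchanged.
    return f"{BASE_URL}?{urlencode(params)}"


example_url = build_search_url(
    "16457",
    max_list=20,
    grchv=38,
    ef=["ChromFrom", ("ChromTo", "chrom_to")],
)
print(example_url)
# Round-trip check: decode the query string back into its parameters.
print(parse_qs(urlsplit(example_url).query))
```

The resulting URL can be passed to any HTTP client; the response layout is described in the Output format section.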
COSMIC Field Descriptions
|Field||Field Description (Description text taken from the COSMIC website.)|
|BreakPointOrder||For variants involving multiple breakpoints, the predicted order along chromosome(s). Otherwise '0'.|
|ChromFrom||The chromosome where the first variant/breakpoint occurs.|
|ChromTo||The chromosome where the last variant/breakpoint occurs.|
|Description||A description of the change that occurred.|
|ID_STUDY||Lists the unique Ids of studies that have involved this structural mutation.|
|LocationFromMax||The last position in breakpoint range.|
|LocationFromMin||The first position in breakpoint range.|
|LocationToMax||The last position in breakpoint range.|
|LocationToMin||The first position in breakpoint range.|
|MutationID||Unique mutation identifier.|
|MutationType||Intra/Inter (chromosomal), tandem duplication, deletion, inversion, complex substitutions, complex amplicons.|
|NonTemplatedInsSeq||Non Templated Sequence (if any) which is inserted at the breakpoint. The sequence is not encoded.|
|PrimaryHistology||The histological classification of the sample.|
|PrimarySite||The primary tissue/cancer from which the sample originated.|
|Site||PrimarySite and PrimaryHistology.|
|StrandFrom||positive or negative.|
|StrandTo||positive or negative.|
|GRChVer||The Genome Reference Consortium Human build version, can be either 37 or 38.|
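Because the "df", "sf", "cf", and "ef" parameters all take field names from the table above, a client may want to catch misspelled names before sending a request. The following is a small sketch of that idea. The names `COSMIC_STRUCT_FIELDS` and `field_list` are hypothetical, and the set simply mirrors the field table in this document, so it would need updating if the service adds or renames fields.

```python
# Field names copied from the field description table above.
COSMIC_STRUCT_FIELDS = frozenset({
    "BreakPointOrder", "ChromFrom", "ChromTo", "Description", "ID_STUDY",
    "LocationFromMax", "LocationFromMin", "LocationToMax", "LocationToMin",
    "MutationID", "MutationType", "NonTemplatedInsSeq", "PrimaryHistology",
    "PrimarySite", "Site", "StrandFrom", "StrandTo", "GRChVer",
})


def field_list(names):
    """Return a comma-separated field list, rejecting names not in the table."""
    unknown = [name for name in names if name not in COSMIC_STRUCT_FIELDS]
    if unknown:
        raise ValueError(f"Unknown COSMIC field(s): {', '.join(unknown)}")
    return ",".join(names)


# Example: a value suitable for the "df" parameter.
display_fields = field_list(["MutationID", "ChromFrom", "ChromTo", "MutationType"])
print(display_fields)
```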
Output format
Output for an API query is an array of the following elements:
- The total number of results on the server (which can be more than the number returned). For APIs in which there are millions of records, this number might be a lower bound due to early termination if there are more than a hundred thousand results.
- An array of codes for the returned items. (This is the field specified with the cf query parameter above.)
- A hash of the "extra" data requested via the "ef" query parameter above. The keys on the hash are the fields (or their requested aliases) named in the "ef" parameter, and the value for a field is an array of that field's values in the same order as the returned codes.
- An array, with one element for each returned code, where each element is an array of the display strings specified with the "df" query parameter.
- An array, with one element for each returned code, where each element is the "code system" for the returned code. Note that only code-system aware APIs will return this array.
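The positional layout described above can be awkward to work with directly. The sketch below, which assumes the response has already been decoded from JSON, regroups the parallel arrays into one dictionary per returned item. The function name `parse_response` and the dictionary keys `code`, `display`, and `code_system` are choices made for this example. The sample payload is trimmed to two items from the sample query output in this document.

```python
import json


def parse_response(payload):
    """Regroup the positional response array into (total, list of per-item dicts)."""
    total, codes, extra, display = payload[:4]
    # The fifth element is only present for code-system aware APIs.
    code_systems = payload[4] if len(payload) > 4 else None
    extra = extra or {}  # null (None) when "ef" was not requested
    records = []
    for i, code in enumerate(codes):
        record = {"code": code, "display": display[i]}
        # Each "ef" field holds a list of values in the same order as the codes.
        for key, values in extra.items():
            record[key] = values[i]
        if code_systems is not None:
            record["code_system"] = code_systems[i]
        records.append(record)
    return total, records


# Sample payload, shortened to the first two items of the documented example.
sample = json.loads(
    '[12,["16457","164576"],null,'
    '[["16457","chr19:g.12475768_14154446dup","intrachromosomal tandem duplication"],'
    '["164576","chrX:g.(80880718_80880738)_(80881080_80881100)del","intrachromosomal deletion"]]]'
)
total, records = parse_response(sample)
for record in records:
    print(record["code"], record["display"])
```

Note that `total` is the number of matches on the server, which can exceed `len(records)` when "maxList" limits the number of items returned.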
Sample API Queries
|https://clinicaltables.nlm.nih.gov/api/cosmic_struct/v3/search?terms=16457||[12,["16457","164576","164578","164574","164575","164577","164570"],null,[["16457","chr19:g.12475768_14154446dup","intrachromosomal tandem duplication"],["164576","chrX:g.(80880718_80880738)_(80881080_80881100)del","intrachromosomal deletion"],["164578","chrX:g.(49637676_49637696)_(49638184_49638204)inv","intrachromosomal inversion"],["164574","chr7:g.(19266662_19266682)_(19270752_19270772)del","intrachromosomal deletion"],["164575","chr22:g.(35538021_35538041)_(35538288_35538308)del","intrachromosomal deletion"],["164577","chr21:g.(34533877_34533897)_(39182253_39182273)inv","intrachromosomal inversion"],["164570","chr3:g.(178636392_178636412)_(178637611_178637631)del","intrachromosomal deletion"]]]||Finds mutation IDs containing "16457" in the COSMIC Structural Genomic Rearrangements data set. Seven of the twelve matching mutation IDs are returned as codes (seven is the default for "maxList"). No extra data was requested ("ef" was not specified in the URL), so the third element is null. Finally, the three default display fields are returned for each record.|