This page is part of the Genetic Reporting Implementation Guide (v1.1.0: STU 2 Ballot 1) based on FHIR R4. The current version, which supersedes this version, is 2.0.0. For a full list of available versions, see the Directory of published versions.
Use this operation to retrieve variants with precise endpoints from a specified genomic region for a specified patient. If the range in question has been studied, the operation returns a FHIR Parameters resource containing variants overlapping the region. If the patient or the specified region has not been studied, the operation returns a 404 error.
OPERATION: FindSubjectVariants
The official URL for this operation definition is:
http://hl7.org/fhir/uv/genomics-reporting/OperationDefinition/find-subject-variants
Parameters
Use | Name | Cardinality | Type | Binding | Documentation |
---|---|---|---|---|---|
IN | subject | 1..1 | string (reference) | | The subject of interest. |
IN | region | 1..1 | Range | | Region of interest is specified as a 0-based integer interval range. Variants that overlap the range are returned. |
IN | genomicRefSeq | 1..1 | string (token) | | Genomic reference sequence is a valid NCBI chromosome-level ('NC_') build 37 or build 38 identifier, or a valid mitochondrion identifier (NC_012920.1, NC_001807.4). |
OUT | regionStudied | 0..* | canonical | | [Profile: http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/region-studied] Must include 1..* component:ranges-examined; 1..1 component:coordinate-system (valued with '0-based interval counting'); 1..1 component:genomic-ref-seq. |
OUT | variant | 0..* | canonical | | [Profile: http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant] Must include valueCodeableConcept; component:genomic-source-class; component:genomic-ref-seq; component:allelic-state; component:ref-allele; component:alt-allele; component:coordinate-system (valued with '0-based interval counting'); component:exact-start-end. |
OUT | sequencePhaseRelationship | 0..* | canonical | | [Profile: http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/sequence-phase-relationship] Must include valueCodeableConcept; 2..2 derivedFrom:variant. |
Optionality in the Variant profile allows for different implementations to represent variants in different ways. For instance, the following variant representations are synonymous:
While the specific implementation of an Operation is outside the scope of HL7, we do provide guidance on how the above representations might be normalized such that any would be found and returned in a request for variants overlapping, say, the region of the LDLR gene (region=11,089,362-11,133,830; genomicRefSeq=NC_000019.10). There are likely other effective normalization strategies beyond what is described here.
One approach to normalization is to convert all representations to a canonical form, such as the NCBI's Sequence Position Deletion Insertion (SPDI) format. Variant queries then only need to query a single format.
'SPDI' is the NCBI's variation notation for variants with known breakpoints. The notation represents an observed variant sequence using deleted and inserted sequences at a given position in a reference sequence. SPDI notation uses four fields and is written out as four elements delimited by colons S:P:D:I, where S=SequenceId; P=Position, a 0-based coordinate for where the Deleted Sequence starts; D=DeletedSequence, and I=InsertedSequence. Variation Services only support variants where the coordinates of both the upstream and downstream breakpoints are known (e.g. single nucleotide change, deletions at precise coordinates). Such variants can be encoded precisely using the SPDI notation.
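The four-field layout described above can be illustrated with a small parser. This is a minimal sketch, not part of the operation definition: the `Spdi` class and `parse_spdi` function are hypothetical names, and the sketch assumes the deleted sequence is spelled out as bases rather than given as a length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spdi:
    """One SPDI expression split into its four fields (hypothetical helper)."""
    sequence_id: str
    position: int   # 0-based coordinate where the deleted sequence starts
    deleted: str
    inserted: str

    @property
    def end(self) -> int:
        """0-based exclusive end of the deleted sequence."""
        return self.position + len(self.deleted)


def parse_spdi(text: str) -> Spdi:
    """Split 'S:P:D:I' into its fields; raise ValueError on malformed input."""
    parts = text.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-delimited fields, got {len(parts)}")
    seq_id, pos, deleted, inserted = parts
    if not seq_id:
        raise ValueError("sequence id is empty")
    if not pos.isdigit():
        raise ValueError(f"position is not a non-negative integer: {pos!r}")
    # Assumption: D and I are explicit base strings, not lengths.
    return Spdi(seq_id, int(pos), deleted.upper(), inserted.upper())


variant = parse_spdi("NC_000019.10:11089559:G:A")
print(variant.sequence_id, variant.position, variant.end)
```

Keeping the exclusive end alongside the 0-based start gives each variant an interval that can be compared directly against a 0-based query range.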
NCBI Variation Services provide a rich set of APIs that can be used to normalize variants from many formats (e.g. HGVS, VCF) into SPDI, and to normalize variants in SPDI into a canonical SPDI format. The variant above, in canonical SPDI format, resolves to this: NC_000019.10:11089559:G:A, where it can easily be determined that it overlaps the requested region (NC_000019.10:11,089,362-11,133,830).
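Once a variant is in canonical SPDI form, the overlap test reduces to an interval comparison. The sketch below is illustrative only: `overlaps` is a hypothetical helper, it assumes the query region is a half-open 0-based interval, and it assumes a pure insertion (empty deleted sequence) counts as overlapping when its position falls within or on the edge of the region.

```python
def overlaps(spdi: str, ref_seq: str, start: int, end: int) -> bool:
    """Return True if a canonical SPDI variant overlaps [start, end) on ref_seq."""
    seq_id, pos, deleted, _inserted = spdi.split(":")
    if seq_id != ref_seq:
        return False
    v_start = int(pos)
    v_end = v_start + len(deleted)  # exclusive end of the deleted sequence
    if v_start == v_end:
        # Assumption: a zero-length (insertion) variant overlaps when its
        # position lies inside the region or on either boundary.
        return start <= v_start <= end
    # Standard half-open interval overlap.
    return v_start < end and start < v_end


# LDLR region from the text, with the canonical SPDI example.
print(overlaps("NC_000019.10:11089559:G:A", "NC_000019.10", 11089362, 11133830))
```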
LiftOver is a process whereby a genome position is converted from one genome assembly to another genome assembly. It is the process that, among other things, allows us to determine that these two variants are the same:
Several groups have identified edge cases that pose genome assembly conversion challenges (e.g. see PharmGKB's PharmCAT posting; Biostars posting). For example, NC_000001.11:145923295:C:C (build 38 representation) does not convert to a corresponding build 37 representation using NCBI Variation Services. As a result, there is no requirement that servers normalize all variants against a single build.
Rather, where a server is storing variants aligned to multiple builds (and hasn't normalized all variants against a single build), it will be necessary for the server to lift over the query region into corresponding regions in other builds. For example, a query for variants in NC_000001.11:145507556-145513536 (build 38 range) will also need to query for variants in NC_000001.10:145921556-145927537 (build 37 range) in order to gather variants expressed against build 38 and build 37, respectively.
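A server holding variants against more than one build might expand the query as sketched below. This is an assumption-laden illustration rather than a prescribed implementation: `find_across_builds` and `stub_lift_over` are hypothetical, the stub only knows the one range pair quoted above, and the stored variant intervals are invented for the example.

```python
from typing import Callable, List, Optional, Tuple

# (reference sequence id, 0-based start, exclusive end)
Region = Tuple[str, int, int]


def find_across_builds(
    query: Region,
    lift_over: Callable[[Region], Optional[Region]],
    stored: List[Region],
) -> List[Region]:
    """Return stored variant intervals overlapping the query in either build."""
    regions = [query]
    lifted = lift_over(query)  # hypothetical lift over; None means it failed
    if lifted is not None:
        regions.append(lifted)
    hits = []
    for variant in stored:
        for seq, start, end in regions:
            if variant[0] == seq and variant[1] < end and start < variant[2]:
                hits.append(variant)
                break
    return hits


# Stub standing in for a real lift over tool; it knows only the example pair.
KNOWN_LIFTS = {
    ("NC_000001.11", 145507556, 145513536): ("NC_000001.10", 145921556, 145927537),
}


def stub_lift_over(region: Region) -> Optional[Region]:
    return KNOWN_LIFTS.get(region)


# Invented variant intervals, one per build plus one outside both ranges.
stored_variants = [
    ("NC_000001.11", 145507600, 145507601),
    ("NC_000001.10", 145921600, 145921601),
    ("NC_000001.10", 1000, 1001),
]
hits = find_across_builds(
    ("NC_000001.11", 145507556, 145513536), stub_lift_over, stored_variants
)
print(hits)
```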
Many efficient, open source lift over tools exist (e.g. many are listed here). As with variant lift over, translating a region between builds can also fail. For example, attempting to lift over NC_000001.11:145923295-145923296 (build 38 range) into a build 37 range with the UCSC LiftOver tool fails, because the region is partially deleted in build 37. In the (very uncommon) case of a failed lift over, a server should widen the query region as necessary until the lift over succeeds. For example, the widened build 38 range NC_000001.11:145923285-145923306 successfully translates into the build 37 range NC_000001.10:145511787-145511807.
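The widening strategy can be sketched as a retry loop. The names here are hypothetical (`lift_over_with_widening`, `stub_lift_over`), the padding step of 10 bases and the attempt limit are arbitrary assumptions, and the stub merely mimics the failure and success described in the example rather than calling a real lift over tool.

```python
from typing import Callable, Optional, Tuple

# (reference sequence id, 0-based start, exclusive end)
Region = Tuple[str, int, int]


def lift_over_with_widening(
    region: Region,
    lift_over: Callable[[Region], Optional[Region]],
    step: int = 10,          # assumed padding added per retry
    max_attempts: int = 5,   # assumed retry limit
) -> Optional[Tuple[Region, Region]]:
    """Widen the region symmetrically until lift_over succeeds.

    Returns (widened source region, lifted region), or None if every
    attempt fails.
    """
    seq, start, end = region
    for attempt in range(max_attempts + 1):
        pad = attempt * step
        candidate = (seq, max(0, start - pad), end + pad)
        lifted = lift_over(candidate)
        if lifted is not None:
            return candidate, lifted
    return None


def stub_lift_over(region: Region) -> Optional[Region]:
    """Stub: succeeds only once the region spans the widened example range."""
    seq, start, end = region
    if seq == "NC_000001.11" and start <= 145923285 and end >= 145923306:
        return ("NC_000001.10", 145511787, 145511807)
    return None


result = lift_over_with_widening(
    ("NC_000001.11", 145923295, 145923296), stub_lift_over
)
print(result)
```

Because the widened region is larger than the one requested, a server using this approach may return variants outside the original range.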
NOTE: Imprecise implementations are allowed, where results contain some records outside the requested range. This is necessary to support many bioinformatics indexing schemes.
Valid response codes are shown in the following figure and described further in the table. Additional response codes (e.g. 5xx server error) may also be encountered.
Response Code | Description |
---|---|
200 | Successfully executed request (region was studied, variants may or may not have been found) |
400 | ERROR: Invalid query parameters |
404 | ERROR: Data not found (e.g. no data found for patient, no data present for requested region for patient) |
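The mapping from request conditions to the response codes in the table might look like the following. This is a sketch under several assumptions: `response_code` is a hypothetical function, the regular expression only checks the general shape of an 'NC_' accession (not that it belongs to build 37 or 38), and a region counts as studied only when it lies entirely within one studied range.

```python
import re
from typing import Dict, List, Tuple

# Assumption: shape check only, e.g. NC_000001.10 or NC_012920.1.
REFSEQ_PATTERN = re.compile(r"^NC_\d{6}\.\d+$")

# subject id -> list of (reference sequence id, 0-based start, exclusive end)
StudiedIndex = Dict[str, List[Tuple[str, int, int]]]


def response_code(
    subject: str, ref_seq: str, start: int, end: int, studied: StudiedIndex
) -> int:
    """Pick the HTTP status for a FindSubjectVariants request."""
    if not subject or not REFSEQ_PATTERN.match(ref_seq):
        return 400  # invalid query parameters
    if start < 0 or end < start:
        return 400  # malformed 0-based interval
    regions = studied.get(subject)
    if not regions:
        return 404  # no data found for patient
    for seq, s, e in regions:
        # Assumption: "studied" means fully contained in a studied range.
        if seq == ref_seq and s <= start and end <= e:
            return 200  # region studied; variants may or may not exist
    return 404  # no data present for requested region


studied = {"HG00403": [("NC_000001.10", 0, 1000)]}
print(response_code("HG00403", "NC_000001.10", 1, 500, studied))
```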
Scenario: Retrieve all variants for patient HG00403 that overlap NC_000001.10:1-500.
IN parameters
{
  "resourceType": "Parameters",
  "parameter": [
    { "name": "subject", "valueString": "HG00403" },
    { "name": "region", "valueRange": { "low": { "value": 1 }, "high": { "value": 500 } } },
    { "name": "genomicRefSeq", "valueString": "NC_000001.10" }
  ]
}
OUT parameters
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "regionStudied",
      "resource": {
        "resourceType": "Observation",
        "id": "rs-a43751ad52c94",
        "meta": { "profile": [ "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/region-studied" ] },
        "status": "final",
        "category": [ { "coding": [ { "system": "http://terminology.hl7.org/CodeSystem/observation-category", "code": "laboratory" } ] } ],
        "code": { "coding": [ { "system": "http://loinc.org", "code": "53041-0", "display": "DNA region of interest panel" } ] },
        "subject": { "reference": "Patient/HG00403" },
        "component": [
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "51959-5", "display": "Range(s) of DNA sequence examined" } ] },
            "valueRange": { "low": { "value": 1 }, "high": { "value": 500 } }
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "92822-6", "display": "Genomic coord system" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://loinc.org", "code": "LA30100-4", "display": "0-based interval counting" } ] }
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "48013-7", "display": "Genomic reference sequence ID" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://www.ncbi.nlm.nih.gov/nuccore", "code": "NC_000001.10" } ] }
          }
        ]
      }
    },
    {
      "name": "variant",
      "resource": {
        "resourceType": "Observation",
        "id": "dv-5a7f781e83514",
        "meta": { "profile": [ "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant" ] },
        "status": "final",
        "category": [ { "coding": [ { "system": "http://terminology.hl7.org/CodeSystem/observation-category", "code": "laboratory" } ] } ],
        "code": { "coding": [ { "system": "http://loinc.org", "code": "69548-6", "display": "Genetic variant assessment" } ] },
        "subject": { "reference": "Patient/HG00403" },
        "valueCodeableConcept": { "coding": [ { "system": "http://loinc.org", "code": "LA9633-4", "display": "Present" } ] },
        "component": [
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "48002-0", "display": "Genomic source class [Type]" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://loinc.org", "code": "LA6683-2", "display": "Germline" } ] }
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "62374-4", "display": "Human reference sequence assembly version" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://loinc.org", "code": "LA14029-5", "display": "GRCh37" } ] }
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "48013-7", "display": "Genomic reference sequence ID" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://www.ncbi.nlm.nih.gov/nuccore", "code": "NC_000001.10" } ] }
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "53034-5", "display": "Allelic state" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://loinc.org", "code": "LA6706-1", "display": "heterozygous" } ] }
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "69547-8", "display": "Genomic Ref allele [ID]" } ] },
            "valueString": "A"
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "69551-0", "display": "Genomic Alt allele [ID]" } ] },
            "valueString": "G"
          },
          {
            "code": { "coding": [ { "system": "http://loinc.org", "code": "92822-6", "display": "Genomic coord system" } ] },
            "valueCodeableConcept": { "coding": [ { "system": "http://loinc.org", "code": "LA30100-4", "display": "0-based interval counting" } ] }
          },
          {
            "code": { "coding": [ { "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes", "code": "exact-start-end", "display": "Variant exact start and end" } ] },
            "valueRange": { "low": { "value": 300 } }
          }
        ]
      }
    },
    {
      "name": "sequencePhaseRelationship",
      "resource": {
        "resourceType": "Observation",
        "id": "sid-dc2a23f0322c4",
        "meta": { "profile": [ "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/sequence-phase-relationship" ] },
        "status": "final",
        "category": [ { "coding": [ { "system": "http://terminology.hl7.org/CodeSystem/observation-category", "code": "laboratory" } ] } ],
        "code": { "coding": [ { "system": "http://loinc.org", "code": "82120-7", "display": "Allelic phase" } ] },
        "subject": { "reference": "Patient/HG00403" },
        "valueCodeableConcept": { "coding": [ { "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/seq-phase-relationship", "code": "Cis", "display": "Cis" } ] },
        "derivedFrom": [
          { "reference": "#dv-5a7f781e83514" },
          { "reference": "#dv-5a7f781e83514" }
        ]
      }
    }
  ]
}