This page is part of the Genetic Reporting Implementation Guide (v2.0.0: STU 2) based on FHIR R4. This is the current published version in its permanent home (it will always be available at this URL). For a full list of available versions, see the Directory of published versions
This guide introduces Observation profiles to report results of tests that involve sequencing the DNA, RNA or amino acid chains of specimens. This includes direct sequencing, shotgun sequencing, array-based variant testing, and other mechanisms. See the scope and overview sections for more details.
Currently, there is one profile used to model most variant information: Variant. In future versions of this implementation guide, HL7 may subdivide Variant into multiple sub-profiles with more specific purposes.
The purpose of the Variant Observation profile is to capture genomic data elements fulfilling the following three purposes:
It is NOT intended to support other Observations that are clinically relevant on their own and have separate profiles defined in this guide, including:
This Implementation Guide supports two reporting patterns for defining variants:
For each variant reporting pattern, different components MUST be used to properly define the variant where possible. Other components MAY be used to provide additional information for cross referencing external sources or increasing human readability of the instance.
Additional resources that implementers may want to leverage when reporting variant information include NCBI’s ClinVar, a freely accessible public archive of reports of the relationships among human variations and phenotypes, and NCBI’s Variation Services, which rely on a common data model described as Sequence Position Deletion Insertion (SPDI).
This pattern describes the observed nucleotide sequence or configuration using HGVS or ISCN statement strings. Care should be taken to follow nomenclature guidelines and properly distinguish variants with the degree of precision needed for clinical use. Note that synonyms may arise within these nomenclatures, so downstream validation and normalization may be required.
Defining Component | Example Value | Note |
---|---|---|
genomic-hgvs (LOINC 81290-9) OR coding-hgvs (LOINC 48004-6) | { "system" : "http://varnomen.hgvs.org", "code" : "NM_022787.3:c.769G>A" } | Proper usage of HGVS contains the reference sequence identifier followed by ‘:g.’ for genomic or ‘:c.’ for a coding sequence. In HGVS notation, the “=” (equals) is used to indicate a sequence was tested but found unchanged [ref]. |
cytogenomic-nomenclature (LOINC 81291-7) | { "system" : "urn:oid:2.16.840.1.113883.6.299", "code" : "46,XX,t(9;22)(q34;q4)" } | More information on formatting structural variations is given below. |
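As a rough illustration of the HGVS pattern, the sketch below wraps a genomic or coding HGVS string in an Observation component using the LOINC codes from the table above. The function name `hgvs_component` is hypothetical, and the regular expression is only a shallow shape check (reference sequence, then ‘:g.’ or ‘:c.’), not a substitute for real HGVS validation or normalization.

```python
import re

LOINC = "http://loinc.org"
HGVS_SYSTEM = "http://varnomen.hgvs.org"

# LOINC codes for the two defining HGVS components, taken from the table above.
HGVS_COMPONENT_CODES = {
    "g": "81290-9",  # genomic-hgvs
    "c": "48004-6",  # coding-hgvs
}

# Shallow shape check only: "<reference sequence>:<g|c>.<description>".
_HGVS_SHAPE = re.compile(r"(?P<refseq>[^:\s]+):(?P<kind>[gc])\.(?P<change>\S+)")


def hgvs_component(expression):
    """Build an Observation.component dict for a g. or c. HGVS expression."""
    match = _HGVS_SHAPE.fullmatch(expression)
    if match is None:
        raise ValueError(f"not a g. or c. HGVS expression: {expression!r}")
    return {
        "code": {"coding": [{"system": LOINC, "code": HGVS_COMPONENT_CODES[match["kind"]]}]},
        "valueCodeableConcept": {
            "coding": [{"system": HGVS_SYSTEM, "code": expression}]
        },
    }


component = hgvs_component("NM_022787.3:c.769G>A")
```

An expression without a reference sequence identifier, such as `c.769G>A` on its own, is rejected by this sketch, which mirrors the note in the table that the reference sequence is part of proper HGVS usage.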
This representation leverages multiple component slices to communicate an allele within the context of a specific location on a reference sequence. It is the most accurate definitional representation in FHIR, but is limited to variations with known breakpoints, and alleles should be normalized per VCF specifications. Note that VCF representations often specify genome build and chromosome identifiers rather than explicit reference sequences. Build and chromosome identifier may optionally be included for cross reference.
Defining Component | Example Value | Note |
---|---|---|
genomic-ref-seq (LOINC 48013-7) | { "system" : "http://www.ncbi.nlm.nih.gov/nuccore", "code" : "NC_000010.10" } | At least this or transcript-ref-seq must be sent. |
transcript-ref-seq (LOINC 51958-7) | { "system" : "http://www.ncbi.nlm.nih.gov/refseq", "code" : "NM_000044.3" } | At least this or genomic-ref-seq must be sent. |
coordinate-system (LOINC 92822-6) | { "system" : "http://loinc.org", "code" : "LA30102-0", "display" : "1-based character counting" } | Common coordinate systems are described by LOINC. |
exact-start-end (LOINC 81254-5) | { "valueRange" : { "low" : { "value" : 96527334 } } } | Interpretation of this number requires the reference sequence and coordinate system. |
ref-allele (LOINC 69547-8) | "valueString" : "C" | This string should be normalized per the VCF standard. |
alt-allele (LOINC 69551-0) | "valueString" : "A" | If the reference allele is tested and found unchanged, this string should be equal to the REF allele string. |
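The component slices in the table above could be assembled along the following lines. This is a minimal sketch that uses only the codes and example values shown in the table; the helper names `loinc_code` and `allele_components` are hypothetical, and the sketch assumes a genomic reference sequence with 1-based counting rather than covering every combination the profile allows.

```python
LOINC = "http://loinc.org"
NUCCORE = "http://www.ncbi.nlm.nih.gov/nuccore"


def loinc_code(code):
    """Wrap a LOINC code as a CodeableConcept for Observation.component.code."""
    return {"coding": [{"system": LOINC, "code": code}]}


def allele_components(genomic_ref_seq, start, ref, alt):
    """Return the defining components for one allele at one location.

    Assumes a genomic reference sequence and 1-based character counting.
    """
    return [
        {   # genomic-ref-seq
            "code": loinc_code("48013-7"),
            "valueCodeableConcept": {
                "coding": [{"system": NUCCORE, "code": genomic_ref_seq}]
            },
        },
        {   # coordinate-system
            "code": loinc_code("92822-6"),
            "valueCodeableConcept": {
                "coding": [{
                    "system": LOINC,
                    "code": "LA30102-0",
                    "display": "1-based character counting",
                }]
            },
        },
        {   # exact-start-end
            "code": loinc_code("81254-5"),
            "valueRange": {"low": {"value": start}},
        },
        {"code": loinc_code("69547-8"), "valueString": ref},  # ref-allele
        {"code": loinc_code("69551-0"), "valueString": alt},  # alt-allele
    ]


components = allele_components("NC_000010.10", 96527334, "C", "A")
```

Passing the same string for `ref` and `alt` would express the "tested and found unchanged" case described in the alt-allele note.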
The clinical context in which a variant is observed is often important to understanding the variant. Often, multiple tests are done to determine the values for these elements.
Note that many other properties associated with contextualizing the observed variant are stored in other FHIR resources. For example, ethnicity, race, and biological sex are on the Patient resource (referenced by Observation.subject), and the tissue or tumor type is on Observation.specimen.
Contextual Attribute or Component | Example Value | Note |
---|---|---|
Observation.value | { "system" : "http://loinc.org", "code" : "LA9633-4", "display" : "Present" } | If not searching for specific variations and merely reporting what's found, the Observation’s value should be set to "Present". Details about the specific variant in consideration must be populated using one of the above patterns of components. |
Observation.method | { "system" : "http://loinc.org", "code" : "LA26398-0", "display" : "Sequencing" } | Indicates the method of variant analysis. The bound list is currently extensible, allowing for codification of other method types here at varying degrees of granularity. The work group is considering additional approaches to modeling testing methods and is requesting implementer feedback at this time. |
component[allelic-read-depth] (LOINC 82121-5) | { "valueQuantity" : { "value" : 120, "unit" : "reads per base pair", "system" : "http://unitsofmeasure.org", "code" : "{reads}/{base}" } } | Specifies the number of reads that identifies the allele in question, whether it consists of one or more contiguous nucleotides. Different methods and purposes require different numbers of reads to be acceptable. Often >400, sometimes as few as 2-4. |
component[allelic-state] (LOINC 53034-5) | { "system" : "http://loinc.org", "code" : "LA6705-3", "display" : "homozygous" } | |
component[sample-allelic-frequency] (LOINC 81258-6) | 0.44 | |
component[genomic-source-class] (LOINC 48002-0) | { "system" : "http://loinc.org", "code" : "LA6684-0", "display" : "Somatic" } | |
component[variant-inheritance] (TBD-variant-inheritance) | maternal / paternal / unknown | |
component[confidence] | high / intermediate / low | |
This implementation guide defines several components on Variant that can be used to provide helpful cross references and low-level annotations, that is, annotations that are completely determined by the molecular change and can be calculated, for example, by NCBI.
Annotational Component | Example Value | Note |
---|---|---|
cytogenetic-location (LOINC 48001-2) | derivable from refseq | |
variation-code (LOINC 81252-9) | { "system": "http://www.ncbi.nlm.nih.gov/clinvar", "code":"619728", "display" : "NC_000019.8:g.1171707G>A" } | Optional to include. Multiple database code systems are used across specialties and carry different levels of information. |
chromosome-identifier (LOINC 48000-4), reference-sequence-assembly (LOINC 62374-4) | (eg Chr2, b38) | Optional to include. Commonly used in VCF as shorthand but can introduce ambiguity without a fully versioned reference sequence. |
gene-studied (LOINC 48018-6) | { "system": "http://www.genenames.org/geneId", "code" : "HGNC:644", "display" : "AR" } | Derivable from reference sequence. |
molecular-consequence | { "system" : "http://sequenceontology.org", "code" : "SO:0001583", "display" : "missense_variant" } | Available from NCBI, tied to representative transcript. |
coding-change-type (LOINC 48019-4) | { "system" : "http://sequenceontology.org", "code" : "SO:1000002", "display" : "substitution" } | May be required for certain types of structural variants. See below. |
protein-hgvs (LOINC 48005-3) | { "system": "http://varnomen.hgvs.org", "code" : "NP_006209.2:p.(His1047Arg)", "display" : "p.H1047R" } | Proper usage of HGVS contains the reference sequence and usage of parentheses to denote that the amino acid change is calculated rather than directly observed. |
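The protein-hgvs example above pairs a full expression in `code` with a short one-letter form in `display`. The sketch below shows one way such a display string might be derived. It is an illustration only: `short_protein_display` is a hypothetical helper, and it handles just simple single amino acid substitutions, with or without the parentheses that mark a predicted change.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# "<protein refseq>:p.(Xxx123Yyy)" or the same without parentheses.
_SUBSTITUTION = re.compile(
    r"[^:\s]+:p\.\(?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})\)?"
)


def short_protein_display(expression):
    """Turn e.g. 'NP_x:p.(His1047Arg)' into the short form 'p.H1047R'."""
    match = _SUBSTITUTION.fullmatch(expression)
    if match is None:
        raise ValueError(f"unsupported protein expression: {expression!r}")
    ref, position, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown amino acid code in {expression!r}")
    return f"p.{THREE_TO_ONE[ref]}{position}{THREE_TO_ONE[alt]}"


display = short_protein_display("NP_006209.2:p.(His1047Arg)")
```

Note that the short form drops both the reference sequence and the parentheses, so it is suitable for human-readable display only, not as the coded value.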
The Variant profile is capable of representing a subset of structural variants (e.g. copy number variants), whereas more complex structural variants such as translocations will require additional model development. We provide guidance here on how the Variant profile can be used to communicate common structural variants. (The cytogenomic-nomenclature component can also be used to communicate more complex structural variants in ISCN notation.)
While sources vary in how they differentiate simple vs. structural variants, here we differentiate based on end point precision and variant length. All variants with imprecise end points are treated as structural variants. Variant length is a more arbitrary criterion: consider, for example, a precisely located deletion of one base pair, of one thousand base pairs, or of one hundred thousand base pairs. As variant length increases, it can become onerous to include REF and ALT alleles. Therefore, where the end points of a variant are precisely known, users can decide whether to represent the variant as a simple or a structural variant, although a variant length of 50 bases is often used as a cutoff.
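The decision described above can be sketched as a small helper. The function name `suggest_representation` is hypothetical, and treating a length of exactly 50 bases as structural is an assumption of this sketch; the guide only says that 50 bases is often used as a cutoff and leaves the choice to the user when end points are precise.

```python
DEFAULT_CUTOFF = 50  # bases; a common convention, not a rule set by this guide


def suggest_representation(length, endpoints_precise, cutoff=DEFAULT_CUTOFF):
    """Suggest 'simple' or 'structural' for a variant.

    length            -- variant length in bases
    endpoints_precise -- True if both end points are precisely known
    cutoff            -- lengths at or above this are suggested as structural
                         (assumption: the boundary value counts as structural)
    """
    if not endpoints_precise:
        # Imprecise end points are always treated as structural variants.
        return "structural"
    return "structural" if length >= cutoff else "simple"


examples = {
    "1 bp deletion": suggest_representation(1, True),
    "1,000 bp deletion": suggest_representation(1_000, True),
    "100,000 bp deletion": suggest_representation(100_000, True),
    "short, imprecise": suggest_representation(10, False),
}
```

Because the length criterion is a convention, the cutoff is exposed as a parameter rather than hard-coded into the decision.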
Suggested minimal representations of common structural variants are shown in the table. Data originating from NGS or DNA Chip testing will (and data originating from FISH testing will not) generally be amenable to these recommendations. HGVS representations, and any other components in the Variant profile not mentioned, can also be communicated where appropriate.
Structural Variant Component | CNV | DUP | DEL | INV | INS |
---|---|---|---|---|---|
[1..1] component: coding-change-type (LOINC 48019-4) | SO:0001019, copy_number_variation, http://sequenceontology.org | SO:1000035, duplication, http://sequenceontology.org | SO:0000159, deletion, http://sequenceontology.org | SO:1000036, inversion, http://sequenceontology.org | SO:0000667, insertion, http://sequenceontology.org |
[1..1] component: genomic-ref-seq (LOINC 48013-7) | All types: should be populated with chromosome-level reference sequence (e.g. NCBI RefSeq “NC_000001.10”) | | | | |
genomic-source-class (LOINC 48002-0) | All types: should be populated | | | | |
[0..1] component: allelic-state (LOINC 53034-5) | not used | should be populated where genomic-source-class is germline | should be populated where genomic-source-class is germline | should be populated where genomic-source-class is germline | should be populated where genomic-source-class is germline |
[0..1] component: sample-allelic-frequency (LOINC 81258-6) | not used | often not used, optional if genomic-source-class is somatic | often not used, optional if genomic-source-class is somatic | not used | not used |
[0..1] component: copy-number (LOINC 82155-3) | should be populated | optional | optional | not used | not used |
[1..1] component: ref-allele (LOINC 69547-8) | All types: generally not used or useful due to the length of the change. In some cases, a ref allele may be reported for a deletion. | | | | |
[0..1] component: alt-allele (LOINC 69551-0) | All types: generally, structural variants will not report Alt Alleles. In some cases, an Alt Allele may be reported for an Insertion. | | | | |
[1..1] component: coordinate-system (LOINC 92822-6) | All types: should be populated | | | | |
[0..1] component: outer-start-end (LOINC 81301-4) and [0..1] component: inner-start-end (LOINC 81302-2) | All types: outer and/or inner start should be populated, as well as outer and/or inner end. | | | | |
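One way to apply the table above is as a checklist run before a structural variant is sent. The sketch below is a simplification under stated assumptions: `missing_components` is a hypothetical helper, it only looks at which component slices are present (not at their values), and it treats the presence of either start-end slice as satisfying the last row, without checking that both a start and an end are given.

```python
# Components the table says should be populated for every variant type.
ALWAYS_EXPECTED = (
    "coding-change-type",
    "genomic-ref-seq",
    "genomic-source-class",
    "coordinate-system",
)

VARIANT_TYPES = ("CNV", "DUP", "DEL", "INV", "INS")


def missing_components(variant_type, present):
    """List recommended components absent from the set of slice names `present`."""
    if variant_type not in VARIANT_TYPES:
        raise ValueError(f"unknown structural variant type: {variant_type!r}")
    missing = [name for name in ALWAYS_EXPECTED if name not in present]
    # copy-number "should be populated" only in the CNV column.
    if variant_type == "CNV" and "copy-number" not in present:
        missing.append("copy-number")
    # Simplification: accept either range slice as location information.
    if not {"outer-start-end", "inner-start-end"} & set(present):
        missing.append("outer-start-end or inner-start-end")
    return missing


gaps = missing_components(
    "CNV",
    {"coding-change-type", "genomic-ref-seq", "coordinate-system", "outer-start-end"},
)
```

The conditional rows, such as allelic-state for germline variants, are left out here because they depend on component values rather than on presence alone.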
Compound heterozygosity is the presence of two different heterozygous mutations in a gene, one on each chromosome of a pair.
While a single HGVS expression within a single Variant observation can be used to represent a compound heterozygous state (e.g. ‘LRG_199t1:c.[2376G>C];[3103del]’), we recommend that compound heterozygotes be represented using two Variant observations, each showing one of the heterozygous variants.
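To follow the recommendation above when the source data arrives as a single bracketed expression, the expression can be split into one HGVS string per allele, each destined for its own Variant observation. The sketch below is illustrative only: `split_trans_alleles` is a hypothetical helper that handles the `[change];[change]` form shown in the example and deliberately rejects alleles that carry several changes in cis.

```python
import re

# "<reference>:<type>.[change];[change]..." with at least two bracketed alleles.
_TRANS = re.compile(
    r"(?P<prefix>[^:\s]+:[a-z]\.)(?P<alleles>\[[^\[\]]+\](?:;\[[^\[\]]+\])+)"
)


def split_trans_alleles(expression):
    """Split 'REF:c.[A];[B]' into ['REF:c.A', 'REF:c.B']."""
    match = _TRANS.fullmatch(expression)
    if match is None:
        raise ValueError(f"not a bracketed multi-allele expression: {expression!r}")
    changes = re.findall(r"\[([^\[\]]+)\]", match["alleles"])
    if any(";" in change for change in changes):
        # e.g. [76A>C;283G>C]: several changes on one allele, out of scope here.
        raise ValueError("alleles with multiple changes in cis are not handled")
    return [match["prefix"] + change for change in changes]


per_allele = split_trans_alleles("LRG_199t1:c.[2376G>C];[3103del]")
```

Each resulting string can then populate the coding-hgvs component of a separate Variant observation.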
The special case where the two different variants are at the same base pair location can also be represented by two Variant observations. For example:
It is important to note in this example that the Variant observation indicates the presence or absence of a variant with the given ALT allele, and does not indicate presence of the REF allele on the other chromosome of the pair. Even though it is almost always the case for heterozygous variants that one chromosome has the variant and the other chromosome has the reference allele, one should not assume this is the case from a single heterozygous Variant without separate knowledge that no other variants were detected at that position.
If there is a need to be explicit about both alleles at a given location, one can use the Genotype observation profile. Here is a bundle showing a compound heterozygous example with two Variant instances grouped together by one Genotype instance.
Variant confidence status provides a way to indicate the reporting organization’s confidence that the variant is truly positive. Noting the confidence level may be important in the overall interpretation of the variant and related implications. Confidence is noted as High, Intermediate or Low. The following example shows how to express a high variant confidence:
{ "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/variant-confidence-status-cs", "code" : "high", "display" : "High" }
It is important to note that variant confidence status is not a required component. If the reporting organization’s confidence levels are not structured in a way that can be reported using this standard coding system, implementers must determine other ways to ensure that confidence in a variant call is understood. For example, a laboratory might only send structured variants when the confidence is 'high' that they are truly positive.
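The two options described above, coding the confidence or only sending high-confidence calls, might look like the following sketch. The code system URL and the three levels come from this page; the helper names, the shape of the `calls` records, and the confidence values assigned to them are all hypothetical and serve only to illustrate the filtering.

```python
CONFIDENCE_SYSTEM = (
    "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/"
    "variant-confidence-status-cs"
)
CONFIDENCE_LEVELS = ("high", "intermediate", "low")


def confidence_coding(level):
    """Return a coding for the variant confidence status."""
    if level not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence level: {level!r}")
    return {"system": CONFIDENCE_SYSTEM, "code": level, "display": level.capitalize()}


def calls_to_send(calls, allowed=("high",)):
    """Keep only calls whose confidence is in `allowed` (default: high only)."""
    return [call for call in calls if call["confidence"] in allowed]


# Hypothetical laboratory calls; the confidence values are invented for the example.
calls = [
    {"hgvs": "NM_022787.3:c.769G>A", "confidence": "high"},
    {"hgvs": "LRG_199t1:c.3103del", "confidence": "low"},
]
sent = calls_to_send(calls)
```

A laboratory that filters this way would typically document the policy elsewhere in the report, since the absence of a confidence component does not by itself tell the receiver that only high-confidence variants were sent.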