R5 Final QA
This page is part of the FHIR Specification (v5.0.0-draft-final: Final QA Preview for R5 - see ballot notes). The current version which supercedes this version is 5.0.0. For a full list of available versions, see the Directory of published versions . Page versions: R5 R4
Substanceprotein.shex

Biomedical Research and Regulation Work Group
Maturity Level: N/A
Standards Status: Informative
Compartments: Not linked to any defined compartments
Raw ShEx
ShEx statement for substanceprotein
PREFIX fhir: <http://hl7.org/fhir/> 
PREFIX fhirvs: <http://hl7.org/fhir/ValueSet/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

IMPORT <string.shex>
IMPORT <integer.shex>
IMPORT <Attachment.shex>
IMPORT <Identifier.shex>
IMPORT <DomainResource.shex>
IMPORT <CodeableConcept.shex>
IMPORT <BackboneElement.shex>


start=@<SubstanceProtein> AND {fhir:nodeRole [fhir:treeRoot]}

# A SubstanceProtein is defined as a single unit of a linear amino acid sequence, or a combination of subunits that are either covalently linked or have a defined invariant stoichiometric relationship. This includes all synthetic, recombinant and purified SubstanceProteins of defined sequence, whether the use is therapeutic or prophylactic. This set of elements will be used to describe albumins, coagulation factors, cytokines, growth factors, peptide/SubstanceProtein hormones, enzymes, toxins, toxoids, recombinant vaccines, and immunomodulators
<SubstanceProtein> EXTENDS @<DomainResource> CLOSED {   

    a [fhir:SubstanceProtein]?;
    fhir:nodeRole [fhir:treeRoot]?;

    fhir:sequenceType @<CodeableConcept>?;  # The SubstanceProtein descriptive 
                                            # elements will only be used when a 
                                            # complete or partial amino acid 
                                            # sequence is available or derivable 
                                            # from a nucleic acid sequence 
    fhir:numberOfSubunits @<integer>?;      # Number of linear sequences of 
                                            # amino acids linked through peptide 
                                            # bonds. The number of subunits 
                                            # constituting the SubstanceProtein 
                                            # shall be described. It is possible 
                                            # that the number of subunits can be 
                                            # variable 
    fhir:disulfideLinkage @<OneOrMore_string>?;  # The disulphide bond between two 
                                            # cysteine residues either on the 
                                            # same subunit or on two different 
                                            # subunits shall be described. The 
                                            # position of the disulfide bonds in 
                                            # the SubstanceProtein shall be 
                                            # listed in increasing order of 
                                            # subunit number and position within 
                                            # subunit followed by the 
                                            # abbreviation of the amino acids 
                                            # involved. The disulfide linkage 
                                            # positions shall actually contain 
                                            # the amino acid Cysteine at the 
                                            # respective positions 
    fhir:subunit @<OneOrMore_SubstanceProtein.subunit>?;  # This subclause refers to the 
                                            # description of each subunit 
                                            # constituting the SubstanceProtein. 
                                            # A subunit is a linear sequence of 
                                            # amino acids linked through peptide 
                                            # bonds. The Subunit information 
                                            # shall be provided when the 
                                            # finished SubstanceProtein is a 
                                            # complex of multiple sequences; 
                                            # subunits are not used to delineate 
                                            # domains within a single sequence. 
                                            # Subunits are listed in order of 
                                            # decreasing length; sequences of 
                                            # the same length will be ordered by 
                                            # decreasing molecular weight; 
                                            # subunits that have identical 
                                            # sequences will be repeated 
                                            # multiple times 
}  

# This subclause refers to the description of each subunit constituting the SubstanceProtein. A subunit is a linear sequence of amino acids linked through peptide bonds. The Subunit information shall be provided when the finished SubstanceProtein is a complex of multiple sequences; subunits are not used to delineate domains within a single sequence. Subunits are listed in order of decreasing length; sequences of the same length will be ordered by decreasing molecular weight; subunits that have identical sequences will be repeated multiple times
<SubstanceProtein.subunit> EXTENDS @<BackboneElement> CLOSED {   
    fhir:subunit @<integer>?;               # Index of primary sequences of 
                                            # amino acids linked through peptide 
                                            # bonds in order of decreasing 
                                            # length. Sequences of the same 
                                            # length will be ordered by 
                                            # molecular weight. Subunits that 
                                            # have identical sequences will be 
                                            # repeated and have sequential 
                                            # subscripts 
    fhir:sequence @<string>?;               # The sequence information shall be 
                                            # provided enumerating the amino 
                                            # acids from N- to C-terminal end 
                                            # using standard single-letter amino 
                                            # acid codes. Uppercase shall be 
                                            # used for L-amino acids and 
                                            # lowercase for D-amino acids. 
                                            # Transcribed SubstanceProteins will 
                                            # always be described using the 
                                            # translated sequence; for synthetic 
                                            # peptide containing amino acids 
                                            # that are not represented with a 
                                            # single letter code an X should be 
                                            # used within the sequence. The 
                                            # modified amino acids will be 
                                            # distinguished by their position in 
                                            # the sequence 
    fhir:length @<integer>?;                # Length of linear sequences of 
                                            # amino acids contained in the 
                                            # subunit 
    fhir:sequenceAttachment @<Attachment>?;  # The sequence information shall be 
                                            # provided enumerating the amino 
                                            # acids from N- to C-terminal end 
                                            # using standard single-letter amino 
                                            # acid codes. Uppercase shall be 
                                            # used for L-amino acids and 
                                            # lowercase for D-amino acids. 
                                            # Transcribed SubstanceProteins will 
                                            # always be described using the 
                                            # translated sequence; for synthetic 
                                            # peptide containing amino acids 
                                            # that are not represented with a 
                                            # single letter code an X should be 
                                            # used within the sequence. The 
                                            # modified amino acids will be 
                                            # distinguished by their position in 
                                            # the sequence 
    fhir:nTerminalModificationId @<Identifier>?;  # Unique identifier for molecular 
                                            # fragment modification based on the 
                                            # ISO 11238 Substance ID 
    fhir:nTerminalModification @<string>?;  # The name of the fragment modified 
                                            # at the N-terminal of the 
                                            # SubstanceProtein shall be 
                                            # specified 
    fhir:cTerminalModificationId @<Identifier>?;  # Unique identifier for molecular 
                                            # fragment modification based on the 
                                            # ISO 11238 Substance ID 
    fhir:cTerminalModification @<string>?;  # The modification at the C-terminal 
                                            # shall be specified 
}  

#---------------------- Cardinality Types (OneOrMore) -------------------

<OneOrMore_string> CLOSED {
    rdf:first @<string>  ;
    rdf:rest [rdf:nil] OR @<OneOrMore_string> 
}

<OneOrMore_SubstanceProtein.subunit> CLOSED {
    rdf:first @<SubstanceProtein.subunit>  ;
    rdf:rest [rdf:nil] OR @<OneOrMore_SubstanceProtein.subunit> 
}
Usage note: every effort has been made to ensure that the ShEx files are correct and useful, but they are not a normative part of the specification.