FHIR Infrastructure Work Group | Maturity Level: N/A | Standards Status: Informative |
This tutorial introduces the FHIR mapping language.
To start with, we're going to consider a very simple case: mapping between two structures that have the same definition, a single element with the same name and the same primitive type:
Source Structure | Target Structure |
TLeft a : string [0..1] |
TRight a : string [0..1] |
The left instance is transformed to the right instance by copying a to a |
Note that for clarity in this tutorial, all the types are prefixed with T.
The first task is to set up the mapping context on a default group. All mappings are divided up into a set of groups. For now, we just set up a group named "tutorial" - the same as the name of the mapping. For this tutorial, we also declare the source and target models, and specify that an application invokes this with a copy of the left (source) instance, and also an empty copy of the right (target) instance:
/// url = "http://hl7.org/fhir/StructureMap/tutorial"
/// name = "Tutorial"

uses "http://hl7.org/fhir/StructureDefinition/tutorial-left" as source
uses "http://hl7.org/fhir/StructureDefinition/tutorial-right" as target

group tutorial(source src : TLeft, target tgt : TRight) {
  // rules go here
}
Note that the way the input variables are set up is a choice: we choose to provide the underlying type definitions on which both source and target models are based, and we choose to specify that the invoking application must provide both the source and the target instance trees. Other options are possible; these are discussed further below. The rest of the tutorial examples use the same setup for the group.
Having set up the context, we now need to define the relationships between the source and target structures:
src.a as a -> tgt.a = a "rule_a";
This simple statement says that:
"rule_a" is a purely arbitrary name associated with the rule that appears in logs, error messages, trace files, etc. It has no other meaning in the mapping statements. Mostly, in fact, it is simply automatically generated by the engine. It will not be specified anymore in this tutorial.
Note that no types are stated explicitly in this mapping statement, but if the underlying system has types, then the types will have to be correct. If the underlying source and target trees are strongly typed, and the mapping groups have explicit types, then a shorthand form is possible:
src.a -> tgt.a;
How this works is described below.
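To make the effect of the first rule concrete, it can be mimicked in a few lines of Python. This is only an illustrative sketch, assuming the source and target trees are plain dictionaries; the function name apply_rule_a is hypothetical and is not part of any mapping engine.

```python
def apply_rule_a(src: dict, tgt: dict) -> dict:
    """Mimic `src.a as a -> tgt.a = a` on dictionary-based trees."""
    a = src.get("a")      # bind src.a to the variable a
    if a is not None:     # the rule only fires when src.a is present
        tgt["a"] = a      # tgt.a = a
    return tgt


print(apply_rule_a({"a": "hello"}, {}))
```

When src has no a, the rule does not fire and the target is returned unchanged.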
Now consider the case where the elements have different names:
Source Structure | Target Structure |
TLeft a1 : string [0..1] |
TRight a2 : string [0..1] |
The left instance is transformed to the right instance by copying a1 to a2 |
This relationship is a simple variation of the last:
src.a1 as b -> tgt.a2 = b;
Note that the choice of variable name is purely arbitrary. It does not need to be the same as the element name.
Still sticking with very simple mappings, let's consider the case where there is a length restriction on the target model that is shorter than the one on the source model - in this case, 20 characters.
Source Structure | Target Structure |
TLeft a2 : string [0..1] |
TRight a2 : string [0..1] {maxlength = 20} |
The left instance is transformed to the right instance by copying a2 to a2, but tgt.a2 can only be 20 characters long |
There are 3 different ways to express this mapping, depending on what should happen when the length of src.a2 is > 20 characters:
src.a2 as a -> tgt.a2 = truncate(a, 20); // just cut it off at 20 characters
src.a2 as a where a.length() <= 20 -> tgt.a2 = a; // ignore it
src.a2 as a check a.length() <= 20 -> tgt.a2 = a; // error if it's longer than 20 characters
Note that it is implicit here that the transformation engine is not required or expected to validate the output against the underlying structure definitions that may apply to it. An application may - and usually should - validate the outputs after the transforms, but the transform engine itself does not automatically validate the output (e.g. it does not assume that it's the final step in the process).
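The difference between the three rules above can be sketched in Python. This is a rough illustration rather than engine code: the trees are assumed to be dictionaries, and the function names are hypothetical.

```python
MAX_LEN = 20


def map_truncate(src: dict, tgt: dict) -> dict:
    """truncate(a, 20): keep only the first 20 characters."""
    if "a2" in src:
        tgt["a2"] = src["a2"][:MAX_LEN]
    return tgt


def map_where(src: dict, tgt: dict) -> dict:
    """where: silently skip values that are too long."""
    if "a2" in src and len(src["a2"]) <= MAX_LEN:
        tgt["a2"] = src["a2"]
    return tgt


def map_check(src: dict, tgt: dict) -> dict:
    """check: raise an error for values that are too long."""
    if "a2" in src:
        if len(src["a2"]) > MAX_LEN:
            raise ValueError("a2 is longer than 20 characters")
        tgt["a2"] = src["a2"]
    return tgt
```

None of the three functions validates the finished target; as noted above, that is left to the application.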
Now for the case where there is a simple type conversion between the primitive types on the left and right, in this case from a string to an integer.
Source Structure | Target Structure |
TLeft a21 : string [0..1] |
TRight a21 : integer [0..1] |
The left instance is transformed to the right instance by copying a21 to a21, but a21 is converted to an integer |
There are 3 different ways to express this mapping, depending on what should happen when src.a21 is not an integer:
src.a21 as a -> tgt.a21 = cast(a, "integer"); // error if it's not an integer
src.a21 as a where a.convertsToInteger() -> tgt.a21 = cast(a, "integer"); // ignore it
src.a21 as a where a.convertsToInteger().not() -> tgt.a21 = 0; // just assign it 0
More than one of these mapping rules may be present to handle all possible cases - e.g. the second rule combined with the third.
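As an illustration of combining rules, the following Python sketch mirrors the second and third rules applied together: convert when possible, otherwise assign 0. The helper converts_to_integer is a hypothetical, simplified stand-in for the FHIRPath convertsToInteger() function, and the dictionary-based trees are an assumption of the sketch.

```python
import re


def converts_to_integer(value: str) -> bool:
    """Simplified stand-in for FHIRPath convertsToInteger() on a string."""
    return re.fullmatch(r"[+-]?[0-9]+", value) is not None


def map_a21(src: dict, tgt: dict) -> dict:
    """Convert a21 when it is an integer string, otherwise assign 0."""
    a = src.get("a21")
    if a is None:
        return tgt  # neither rule fires when a21 is absent
    tgt["a21"] = int(a) if converts_to_integer(a) else 0
    return tgt
```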
Note that the mapping language does not itself define which primitive types exist. Typically, primitive types are defined by the underlying type system for the source and target trees, and the implementation layer makes these types available to the mapping language using the FHIRPath primitive types. The mapping language uses the FHIRPath syntax for primitive constants.
Back to the simple case where src.a22 is copied to tgt.a22, but in this case, a22 can repeat (in both source and target):
Source Structure | Target Structure |
TLeft a22 : string [0..*] |
TRight a22 : string [0..*] |
The left instance is transformed to the right instance by copying a22 to a22, once for each copy of a22 |
The transform rule simply asserts that a22 maps to a22. The engine will apply the rule once for each instance of a22:
src.a22 as a -> tgt.a22 = a;
This will create one a22 in TRight for each a22 in TLeft.
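In Python terms, the engine behaves roughly like a loop over the repeating element. This sketch assumes dictionary-based trees in which a repeating element is held as a list; map_a22 is a hypothetical name.

```python
def map_a22(src: dict, tgt: dict) -> dict:
    """Apply `src.a22 as a -> tgt.a22 = a` once per source value."""
    for a in src.get("a22", []):             # the rule fires once for each a22
        tgt.setdefault("a22", []).append(a)  # each firing adds one tgt.a22
    return tgt
```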
A more difficult case is where the source allows multiple repeats, but the target doesn't:
Source Structure | Target Structure |
TLeft a23 : string [0..*] |
TRight a23 : integer [0..1] |
The left instance is transformed to the right instance by copying a23 to a23, but there can only be one copy of a23 |
Again, there are multiple different ways to write this, depending on our desired outcome if there is more than one copy of a23:
src.a23 as a -> tgt.a23 = a; // leave it to the transform engine
src.a23 only_one as a -> tgt.a23 = a; // transform engine throws an error if there is more than one
src.a23 first as a -> tgt.a23 = a; // Only use the first one
src.a23 last as a -> tgt.a23 = a; // Only use the last one
Leaving the outcome to the transform engine is not recommended; it might not always know whether a property is confined to a single value, and exactly what happens is unpredictable. However, there are some circumstances where the appropriate action is to defer resolution, so this is allowed.
Most transformations involve nested content. Let's start with a simple case, where element aa contains ab:
Source Structure | Target Structure |
TLeft aa : [0..*] ab : string [1..1] |
TRight aa : [0..*] ab : string [1..1] |
The left instance is transformed to the right instance by copying aa to aa, and within aa, ab to ab |
Note that there is no specified type for the element aa. Some structure definitions (FHIR resources) do leave these elements as anonymously typed, while others explicitly type them. However, since the mapping does not refer to the type, its literal type is not important.
src.aa as s_aa -> tgt.aa as t_aa then { // make aa exist
  s_aa.ab as ab -> t_aa.ab = ab; // copy ab inside aa
};
This situation is handled by a pair of rules: the first rule establishes the relationship between src.aa and tgt.aa, and assigns 2 variable names to them. Then, the rule contains an additional set of rules (though only one in this example) to map within the context of s_aa and t_aa.
An alternate approach is to move the dependent rules to their own group:
src.aa as s_aa -> tgt.aa as t_aa then ab_content(s_aa, t_aa); // make aa exist

group ab_content(source src, target tgt) {
  src.ab as ab -> tgt.ab = ab; // copy ab inside aa
}
Note that variables are divided into source and target; source variables are read-only, and cannot have their properties changed. Variable names may be reused in different contexts - they are only valid within the group or rule that defines them, and any dependent rules or groups.
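The dependent-group form corresponds closely to one function calling another. The following Python sketch assumes dictionary-based trees; the function map_aa is hypothetical, while ab_content mirrors the group of the same name above.

```python
def ab_content(src: dict, tgt: dict) -> None:
    """Dependent group: copy ab inside the current aa."""
    if "ab" in src:
        tgt["ab"] = src["ab"]


def map_aa(src: dict, tgt: dict) -> dict:
    """Create one tgt.aa per src.aa, then run the dependent group on the pair."""
    for s_aa in src.get("aa", []):             # src.aa as s_aa
        t_aa = {}                              # tgt.aa as t_aa (make aa exist)
        tgt.setdefault("aa", []).append(t_aa)
        ab_content(s_aa, t_aa)                 # then ab_content(s_aa, t_aa)
    return tgt
```

The names s_aa and t_aa exist only inside the loop body, which parallels the scoping of variables to the rule that defines them.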
A common translation pattern is to perform a translation, e.g. from one set of codes to another:
Source Structure | Target Structure |
TLeft d : code [0..1] |
TRight d : code [0..1] |
The left instance is transformed to the right instance by translating src.d from one set of codes to another |
The key to this transformation is the ConceptMap resource, which actually specifies the mapping from one set of codes to the other:
src.d as d -> tgt.d = translate(d, 'uri-of-concept-map', 'code');
This asks the mapping engine to use the $translate operation on the terminology server to translate the code using a specified concept map, and then to put the code value of the returned translation in tgt.d.
Another common translation is where the target mapping for one element depends on the value of another element.
Source Structure | Target Structure |
TLeft i : string [0..1] m : integer [1..1] |
TRight j : [0..1] k : [0..1] |
How the left instance is transformed to the right instance depends on the value of m: if m < 2, then i maps to j, else it maps to k |
This is managed using FHIRPath conditions on the mapping statements:
src.i as i where m < 2 -> tgt.j = i;
src.i as i where m >= 2 -> tgt.k = i;
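Because the two conditions are mutually exclusive, the pair of rules behaves like an if/else. A minimal Python sketch, assuming dictionary-based trees and the hypothetical function name map_i:

```python
def map_i(src: dict, tgt: dict) -> dict:
    """Route src.i to tgt.j or tgt.k depending on the value of src.m."""
    i = src.get("i")
    if i is None:
        return tgt       # i is optional, so neither rule may fire
    if src["m"] < 2:     # m is [1..1], so it is expected to be present
        tgt["j"] = i
    else:
        tgt["k"] = i
    return tgt
```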
Many/most trees are fully and strongly typed. In these cases, the mapping language can make use of the typing system to simplify the mapping statements.
Source Structure | Target Structure |
TLeft aa : TLeftInner [0..*] TLeftInner ab : string [1..1] |
TRight aa : TRightInner [0..*] TRightInner ab : string [1..1] |
The left instance is transformed to the right instance by copying aa to aa, and within aa, ab to ab |
This is the same case as Step 7 above, but the mapping statements take advantage of the types:
src.aa -> tgt.aa;

group ab_content(source src : TLeftInner, target tgt : TRightInner) <<types>> {
  src.ab -> tgt.ab;
}
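There are 2 different things happening in this short form:
src.aa -> tgt.aa
- the rule names no variables and no dependent rules, so the engine works out how to map the content from the types of the two elements
<<types>>
- the "types" marker indicates that this group is the default group to apply any time an element of type TLeftInner is mapped to a TRightInner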
In the case of the first rule (src.aa -> tgt.aa), the engine finds a need to map aa to aa and determines that it must map from a TLeftInner to a TRightInner. Since a group is defined for this purpose, it creates a TRightInner in tgt.aa, and then applies the discovered group as a dependent rule. Inside that group is a rule that instructs the mapping engine to make tgt.ab from src.ab. It knows that both are primitive types, and compatible, and can apply this correctly.
This short form is only applicable when there is only one source and target, when the types of both are known, and when no other dependency rules are nominated.
If the target element is polymorphic (can have more than one type), then the correct type of the target can only be inferred from the source type:
group ab_content(source src : TLeftInner, target tgt : TRightInner) <<type+>> {
  src.ab -> tgt.ab;
}
Not only is this group the default for (TLeftInner:TRightInner), if the engine has a TLeftInner with an unknown target type, it should create a TRightInner, and proceed as above.
It is an error if the engine locates more than one group of rules claiming to be the correct group for a type pair or for a single source type.
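One way to picture the default-group lookup is as a registry keyed by type pair, with a second table for groups that also nominate a target type. The Python sketch below is a loose illustration under that assumption; the names register, map_element, DEFAULT_GROUPS and DEFAULT_TARGET are hypothetical and do not describe how any particular engine is built.

```python
from dataclasses import dataclass


@dataclass
class TLeftInner:
    ab: str


@dataclass
class TRightInner:
    ab: str = ""


DEFAULT_GROUPS = {}  # (source type, target type) -> group function
DEFAULT_TARGET = {}  # source type -> target type, for groups marked <<type+>>


def register(src_type, tgt_type, infer_target=False):
    """Register a group as the default for a type pair."""
    def decorator(group):
        if (src_type, tgt_type) in DEFAULT_GROUPS:
            raise ValueError("more than one default group for this type pair")
        if infer_target and src_type in DEFAULT_TARGET:
            raise ValueError("more than one default group for this source type")
        DEFAULT_GROUPS[(src_type, tgt_type)] = group
        if infer_target:
            DEFAULT_TARGET[src_type] = tgt_type
        return group
    return decorator


@register(TLeftInner, TRightInner, infer_target=True)
def ab_content(src, tgt):
    tgt.ab = src.ab  # src.ab -> tgt.ab


def map_element(src_value, tgt_type=None):
    """Create the target and apply the default group for the type pair."""
    src_type = type(src_value)
    if tgt_type is None:                     # polymorphic target: infer it
        tgt_type = DEFAULT_TARGET[src_type]
    group = DEFAULT_GROUPS[(src_type, tgt_type)]
    tgt_value = tgt_type()
    group(src_value, tgt_value)
    return tgt_value
```

Calling map_element(TLeftInner("x")) with no target type corresponds to the polymorphic case, where the target type is taken from the registered group.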
It's now time to start moving away from relatively simple cases to some of the harder ones to manage mappings for. The first mixes list management, and converting from a specific structure to a general structure:
Source Structure | Target Structure |
TLeft e : string [0..*] f : string [1..1] |
TRight e : [0..*] f : string [1..1] g : code [1..1] |
The left instance is transformed to the right instance by adding one instance of tgt.e for each src.e, where the value goes into tgt.e.f, and the value of tgt.e.g is 'g1'. src.f is also transformed into the same structure, but the value of tgt.e.g is 'g2'. As an added complication, the value for src.f must come first |
This leads to some more complex mapping statements:
src.e as s_e -> tgt.e as t_e then {
  s_e -> t_e.f = s_e, t_e.g = 'g1';
};

src.f as s_f -> tgt.e as t_e first then {
  s_f -> t_e.f = s_f, t_e.g = 'g2';
};
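The intended result of these two rules can be sketched in Python, assuming dictionary-based trees. The function name map_e_f is hypothetical, and placing the entry for src.f at the front of the list is the sketch's reading of the "first" list mode.

```python
def map_e_f(src: dict, tgt: dict) -> dict:
    """Fold src.e and src.f into the generalized tgt.e structure."""
    entries = tgt.setdefault("e", [])
    for s_e in src.get("e", []):
        entries.append({"f": s_e, "g": "g1"})           # one tgt.e per src.e
    if "f" in src:
        entries.insert(0, {"f": src["f"], "g": "g2"})   # 'first': src.f leads
    return tgt
```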
The second example of reworking structure moves cardinality around the hierarchy. In this case, the source has an optional structure that contains a repeating element, while the target puts the cardinality at the next level up:
Source Structure | Target Structure |
TLeft az1 :[0..1] az2 : string [1..1] az3 : string [0..*] |
TRight az1 :[0..*] az2 : string [1..1] az3 : string [0..1] |
The left instance is transformed to the right instance by creating one tgt.az1 for every src.az1.az3, and then populating each az1 with the matching value of az3, and copying the value of az2 to each instance |
The key to setting this mapping up is to create a variable context for src.az1, and then carry it down, performing the actual mappings at the next level down:
// setting up a variable for the parent
src.az1 as s_az1 then {
  // one tgt.az1 for each az3
  s_az1.az3 as s_az3 -> tgt.az1 as t_az1 then {
    // value for az2. Note that this refers to a previous context in the source
    s_az1.az2 as az2 -> t_az1.az2 = az2;
    // value for az3
    s_az3 -> t_az1.az3 = s_az3;
  };
};
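The same restructuring can be sketched in Python, where the outer variable simply stays in scope inside the loop. Dictionary-based trees and the function name map_az1 are assumptions of the sketch.

```python
def map_az1(src: dict, tgt: dict) -> dict:
    """Create one tgt.az1 per src.az1.az3, repeating az2 in each."""
    s_az1 = src.get("az1")                 # variable for the parent
    if s_az1 is None:
        return tgt
    for s_az3 in s_az1.get("az3", []):     # one tgt.az1 for each az3
        t_az1 = {
            "az2": s_az1["az2"],           # taken from the outer source context
            "az3": s_az3,
        }
        tgt.setdefault("az1", []).append(t_az1)
    return tgt
```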
Simple mappings, such as we've dealt with so far, where the source and target structure both have the same scope, and there is only one of each, are all well and good, but there are many mappings where this is not the case. There is a set of complications when dealing with multiple instances:
For our first example, we're going to look at creating multiple output structures from a single input structure.
Source Structure | Target Structure |
TLeft f1 : String [0..*]; |
TRight ptr : Resource(TRight2) [0..*] TRight2 f1 : String [1..1]; |
The left instance is transformed to the right instance creating a copy of TRight2 for each f1 in the source, and then putting the value of src.f1 in TRight2.f1 |
The key to setting this mapping up is to create each TRight2 outside the existing target tree, and then to store a reference to it in tgt.ptr:
src.f1 as s_f1 -> create("TRight2") as rr, tgt.ptr = reference(rr) then {
  s_f1 -> rr.f1 = s_f1;
};
This mapping statement invokes create without a target context, which indicates that the created element/object of type "TRight2" doesn't get added to any existing target context. Instead, it will only be available as a context in which to perform further mappings - as the dependent rule does.
The mapping engine passes the create request through to the host application, which is using the mapping. It must create a valid instance of TRight2, and identify it as appropriate for the technical context in which the mapping is being used. The reference transform is also passed back to the host application for it to determine how to represent the reference - but this is usually some kind of URL.
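The division of labour between the engine and the host application can be sketched in Python. The Host class below is entirely hypothetical: the way it assigns ids and the "Type/id" form it uses for references are assumptions chosen for illustration, since the text above leaves both decisions to the host.

```python
import itertools


class Host:
    """Hypothetical host application that owns created instances."""

    def __init__(self):
        self.store = {}
        self._ids = itertools.count(1)

    def create(self, type_name: str) -> dict:
        obj = {"type": type_name, "id": str(next(self._ids))}
        self.store[obj["id"]] = obj
        return obj

    def reference(self, obj: dict) -> str:
        return f'{obj["type"]}/{obj["id"]}'  # assumed URL-like form


def map_f1(src: dict, tgt: dict, host: Host) -> dict:
    """Create one TRight2 per src.f1 and point to it from tgt.ptr."""
    for s_f1 in src.get("f1", []):
        rr = host.create("TRight2")                          # create("TRight2") as rr
        tgt.setdefault("ptr", []).append(host.reference(rr)) # tgt.ptr = reference(rr)
        rr["f1"] = s_f1                                      # dependent rule
    return tgt
```

The created objects never appear inside tgt itself; only their references do.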
For our second example, we're going to look at the reverse: where multiple input structures create a single output structure.
Source Structure | Target Structure |
TLeft ptr : Resource(TLeft2) [0..*] TLeft2 f2 : String [0..*]; |
TRight f2 : String [1..*]; |
The left instance is transformed to the right instance by finding each ptr reference, getting its value for f2, and adding this in tgt.f2 |
The first task of the map is to ask the application host to find the structure identified by src.ptr, and create a variable for it:
src.ptr as t then {
  t.f2 -> tgt.f2;
};
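The reverse direction can be sketched in Python by handing the mapping a lookup function supplied by the host. The parameter name resolve and the dictionary used as a store in the usage lines are assumptions made for illustration; how a host actually locates the referenced structure is up to the host.

```python
from typing import Callable


def map_ptr(src: dict, tgt: dict, resolve: Callable[[str], dict]) -> dict:
    """Collect f2 from every structure referenced by src.ptr into tgt.f2."""
    for ptr in src.get("ptr", []):
        t = resolve(ptr)                 # the host finds the referenced TLeft2
        for f2 in t.get("f2", []):
            tgt.setdefault("f2", []).append(f2)
    return tgt


# Usage with a hypothetical in-memory store of TLeft2 instances.
store = {"TLeft2/1": {"f2": ["a", "b"]}, "TLeft2/2": {"f2": ["c"]}}
merged = map_ptr({"ptr": ["TLeft2/1", "TLeft2/2"]}, {}, store.__getitem__)
```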
This specification includes transforms that map between version R4B and this version (R5). These map files exercise quite a bit of the mapping language grammar, and can be found at Transforms between R4 and R5