XML Mask
The XML mask enables the masking of XML element values and attributes. This mask is available only for Oracle, SQL Server and Azure SQL DB.
The XML mask defines one or more XPath definitions. Each XPath definition can have only one mask; however, XPath provides conditional expressions so that only some XPath targets can be masked while others are preserved. If the XPath conditional expression isn't adequate for your needs or you need to use different masks for the same XPath for different rows or under different conditions then you can specify multiple XML masks for the column and specify a Where condition in the Where tab. Therefore, it is possible to define multiple conditional XML masks for a column just as with other DataVeil masks as described in Mask Execution Order.
Some common and useful XPath expressions are described later in this XML Mask topic although a detailed explanation of XPath is beyond the scope of this documentation. There are many detailed XPath references available online. An introduction can be found at https://www.w3schools.com/xml/xpath_intro.asp . There are also many online XPath expression testers that are useful to validate that your XPath expressions will work as expected. We have noted that many such online testers are provided by enthusiasts and these do not always appear to work correctly so you are cautioned about this possibility. One of the better XPath expression testers can be found at http://xpather.com/
Create a mask for an XPath
An unlimited number of XPath definitions can be added to a single XML mask. Each XPath definition can specify only one mask.
To define an XPath and an associated mask, click on the 'Add...' button as shown in the XML mask form above.
An XPath form shall then appear as shown in the example below.
In the example above, we have specified that all element values (using the recursive operator //) called 'amt' are to be masked.
Note that the Modify combo box above shows Value. If you wanted to mask attributes then you would choose Attribute. For example, to modify the attribute 'bonus' the Modify section would appear as shown below. Note that the attribute name must be preceded with the '@' character.
Required minimum number of matching nodes in each XML document. The default is 1. If DataVeil does not match on at least this many attributes or non-empty values in each XML record for the XPath expression then DataVeil shall report a runtime error. This is a useful verification to help ensure that the XPath expression is working as expected. For example, if it is known that there should be at least 5 values masked for a given XPath expression then this value should be set to 5. If it is possible that some records will not match on any nodes then this value should be set to 0. Please note that empty node values are not counted, refer to the section "Empty Values and Nulls" below for more details.
To define the Mask for this XPath, click on the Create... button. A popup menu showing the masks that are available for use within the XML mask shall be shown. i.e. Note that the Shuffle mask is not available for use within an XML mask and therefore does not appear in the available masks list.
Execution Order
Each of the XPaths and associated masks are executed sequentially in order, from top to bottom, as they appear in the XML mask's 'XML' tab shown above.
The masking performed for each XPath is cumulative. Therefore, the masked XML result from the first XPath mask is passed to the second XPath mask, and the masked XML result from the second XPath mask is passed to the third XPath mask, and so on for the remainder of all XPath masks.
Empty Values and Nulls
Oracle
Empty node attributes and values shall be recognized and processed by masks normally.
Therefore, empty node attributes and values are included in the count of matching paths found that is used in the validation of the 'Required minimum number of matching nodes in each XML document' parameter.
SQL Server
Empty node attributes and values shall be skipped by the XML processor and therefore such values shall always be preserved.
However, empty attributes are counted whereas empty values are not counted for the validation of the 'Required minimum number of matching nodes in each XML document' parameter.
Limitations
Some of the above masks will operate with limited capability due to the nature of these masks being designed to operate on relational data but being invoked from within a hierarchical data structure such as an XML record. Details of the limitations of such masks when used within an XML mask are:
DataSet mask
A simplified version of the DataSet mask is available within an XML mask. In most cases it shall perform exactly the same as a regular DataSet mask. The exception is when using the non-deterministic Sequential option.
Please refer to The DataSet Mask within an XML Mask for details.
User SQL Value mask
A simplified version of the User SQL Value mask is available within an XML mask. This mask is limited to processing literal expressions, calling other functions and SQL queries that return a single value.
Please refer to The User SQL Value Mask within an XML Mask for details.
Unsupported Masks
The following masks are not supported as path masks of an XML mask:
* DateTime (use Randomize DateTime instead)
* Shuffle
Determinism
Each path mask's determinism shall operate normally as described in The Determinism Option. This applies across element values and attributes.
For example, if deterministic mode is used and the original name "John" and "Susan" are masked to "Frank" and "Margaret" respectively, then all matched paths' occurrences of "John" shall be replaced with "Frank" and all matched paths' occurrences of "Susan" shall be replaced with "Margaret".
Namespaces
If namespaces are used then the referenced namespaces must be defined in the Namespaces field in the XML mask's form as shown in the example XML mask at the top of this page. Note that this can be easily done with a copy/paste of the relevant namespace declarations from an XML record to the mask's form.
There are some important difference in namespace declaration requirements on Oracle vs SQL Server/Azure SQL DB.
Oracle
* Do not define any Default Namespace in the XML mask. If it is used then it is expected to be in the root node and DataVeil shall load it automatically.
* If a namespace wildcard is used in the XPath specification then the namespace of every path that satisfies the XPath specification must be defined in the mask's Namespaces field. If a matched path's namespace is not defined then an error message similar to the following shall be reported: ORA-19228: XPST0008 - undeclared identifier: prefix ...
SQL Server and Azure SQL DB
* If the Default Namespace is defined in XML records then it must be defined in the XML mask's Namespace section otherwise the values/attributes in the XPath's Default Namespace will not be found.
* If a namespace wildcard is used in the XPath specification then the namespaces of each path that satisfies the XPath specification do not need to be defined in the mask's Namespaces field, unless a namespace is explcitly referenced by another XPath in the same XML mask.
XPath pattern to find all values that match an element tag name
A detailed explanation of XPath is beyond the scope of this documentation. However, a common requirement is simply to mask all values that match a specific tag name, such as 'FamilyName' or 'amt'.
An easy way to achieve such a requirement is by using the XPath recursive descent operator '//'. Therefore, if you wanted to match on all element tag names called 'FamilyName' that appear anywhere in the XML record then the XPath expression would be '//FamilyName'. If there are namespaces used in the XML record then this would match only on the tags in the default namespace.
If you want to match on the tag name in any namespace then you can include the namespace wildcard '*'. Therefore the XPath expression would be '//*:FamilyName'. You can include the namespace wildcard even if there are no namespaces defined in the XML record.
If namespaces are used in the XML record (anywhere you see the 'xmlns' declaration) then the namespaces of the tags that the XPath expression is expected to find must be declared in the Namespaces section of the XML mask.
Examples without Namespaces
Consider the following XML record without namespaces:
<myrec>
<amt>1</amt>
<main>
<amt bonus="10">2</amt>
<amt>3</amt>
<rows>
<row>
<amt bonuses="999">4</amt>
</row>
<row>
<amt> 5</amt>
</row>
<row>
<misc>
<amt bonus="22">6</amt>
<keep> opt</keep>
</misc>
</row>
<row>
<amt> 7</amt>
</row>
</rows>
</main>
</myrec>
|
XPath to Element mask param |
Attribute mask param |
Finds |
|
|
|
|
|
/myrec/*/amt |
|
2, 3 |
|
//amt |
|
1, 2, 3, 4, 5, 6, 7 |
|
//amt[text() > 5] |
|
6, 7 |
|
//amt[@bonus] |
|
2, 6 |
|
//amt[@bonus = 22] |
|
6 |
|
|
|
|
|
//amt |
@* |
10, 999, 22 |
|
//amt[@bonus < 20] |
@bonus |
10 |
Examples with Namespaces
Consider the following XML record with namespaces:
<myrec xmlns="http://www.xyz.com/dflt" xmlns:ns1="http://www.xyz.com/a" xmlns:ns2="http://www.xyz.com/b">
<amt>1</amt>
<main>
<ns1:amt ns1:bonus="10" ns2:bonus="20">2</ns1:amt>
<ns2:amt>3</ns2:amt>
<rows xmlns:ns3="http://www.xyz.com/c">
<row>
<amt bonuses="999">4</amt>
</row>
<row>
<ns3:amt> 5</ns3:amt>
</row>
<row>
<misc>
<amt bonus="22">6</amt>
<ns1:keep> opt</ns1:keep>
</misc>
</row>
<row>
<ns1:amt> 7</ns1:amt>
</row>
</rows>
</main>
</myrec>
Namespaces mask param:
For Oracle: xmlns:ns1="http://www.xyz.com/a" xmlns:ns2="http://www.xyz.com/b" xmlns:ns3="http://www.xyz.com/c"
For SQL Server/Azure: xmlns="http://www.xyz.com/dflt" xmlns:ns1="http://www.xyz.com/a" xmlns:ns2="http://www.xyz.com/b" xmlns:ns3="http://www.xyz.com/c"
|
XPath to Element mask param |
Attribute mask param |
Finds |
|
|
|
|
|
//*:amt |
|
1, 2, 3, 4, 5, 6, 7 |
|
//ns1:* |
|
2, opt, 7 |
|
//ns2:amt |
|
3 |
|
/myrec/main//*:amt |
|
2, 3, 4, 5, 6, 7 |
|
//ns3:amt |
|
5 |
|
//(ns2:amt|ns3:amt) |
|
3, 5 |
|
//ns1:amt[text() > 5] |
|
7 |
|
//*:amt[@*:bonus > 15] |
|
2, 6 |
|
|
|
|
|
//amt |
@* |
999, 22 |
|
//*:amt |
@bonus |
22 |
|
//*:amt |
@*:bonus |
10, 20, 22 |
|
//ns1:*[@ns1:bonus = 10] |
@ns2:bonus |
20 |
Troubleshooting
A common cause of errors encountered in processing an XML mask are related to the XPath to Element or Attribute parameters.
Depending on where such related exceptions are caught the DataVeil runtime error message may be rather cryptic.
On Oracle, it has been possible to detect such issues and return more meaningful error messages that indicate the nature of the problem and the column for which the XML mask is defined. An example of such an error message is "Column "DVTEST.CUST.XR", problem when processing XML mask for path "//*:amt/*@*:bonus" while scanning for matching paths. Please check that the XPath to element or attribute is correct. ORA-19114: XPST0003 - error during parsing the XQuery expression: LPX-00801: XQuery syntax error at ....".
However, on SQL Server, errors related to incorrect XPath parameters may return rather cryptic messages during the DataVeil 'Setup database' phase. For example, something like "Error code 2205. XQuery [value()]: ")" was expected. Problem when attempting to upload DataVeil Objects to DBMS." In this case, DataVeil is unable to specify precisely in which XML mask and for which column the error is related. Therefore, it is suggested to perform Preview Runs frequently while authoring the masking project, such as after creating each XML mask to verify that the parameters are correct and that the mask is behaving as expected.
You can also verify that the XPath parameters are working as expected by using third-party XPath expression testers such as the one found at http://xpather.com/ Note that if the Attribute parameter is specified in the DataVeil XPath mask, then the full XPath would consist of the Xpath to Element value, followed by '/', followed by the Attribute parameter value, such as //amt/@bonus.
Another frequent source of error is that a namespace referenced by an XML mask's XPath has not been defined in the mask's Namespaces field.