Redact Mask
The Redact mask replaces characters with a redaction character.
This mask operates on text-type fields (e.g. VARCHAR, NVARCHAR, CHAR, NCHAR) and numeric-type fields (e.g. INT, NUMERIC, BIGINT).
You can specify full or partial masking of the target field.
By default, the Redact mask shall preserve Symbols and Spaces and mask everything else as shown in the Redact mask above.
Specifically, the Redact character selection options are:
Alphabetic characters - Upper and lower case English characters ('A'..'Z', 'a'..'z'). Half-width (U+0041 .. U+005A, U+0061 .. U+007A) and full-width (U+FF21 .. U+FF3A, U+FF41 .. U+FF5A) characters are considered.
Numeric characters - Digit characters ('0'..'9'). Half-width (U+0030 .. U+0039) and full-width (U+FF10 .. U+FF19) characters are considered.
Symbols - Characters in the half-width range U+0021 .. U+007E and full-width range U+FF01 .. U+FF5E except for those that are in the Alphabetic and Numeric ranges above.
Spaces - Half-width U+0020 and full-width U+3000.
Other characters - All characters that are not within the ranges described above. These mostly correspond to non-English alphabetic characters, e.g. Chinese, French, German, etc.
If an original character to be redacted is half-width (e.g. an ASCII character) then the character specified in the Replace half-width characters with field shall be used. Similarly, if an original character is full-width (e.g. a Chinese character) then the character specified in the Replace full-width characters with field shall be used.
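The character classes and the half-width/full-width replacement rule described above can be sketched in Python. This is an illustrative approximation, not the product's implementation: the function names (classify, is_full_width, redact) are hypothetical, and the use of unicodedata.east_asian_width to decide whether an "Other" character is full-width is an assumption.

```python
import unicodedata

def classify(ch):
    """Return the character class of ch using the code point ranges listed above."""
    cp = ord(ch)
    if (0x41 <= cp <= 0x5A or 0x61 <= cp <= 0x7A
            or 0xFF21 <= cp <= 0xFF3A or 0xFF41 <= cp <= 0xFF5A):
        return "alphabetic"
    if 0x30 <= cp <= 0x39 or 0xFF10 <= cp <= 0xFF19:
        return "numeric"
    # Checked after alphabetic/numeric, so only the remaining code points match.
    if 0x21 <= cp <= 0x7E or 0xFF01 <= cp <= 0xFF5E:
        return "symbol"
    if cp in (0x20, 0x3000):
        return "space"
    return "other"

def is_full_width(ch):
    # Assumption: East Asian Width "F" (Fullwidth) or "W" (Wide) means full-width.
    return unicodedata.east_asian_width(ch) in ("F", "W")

def redact(value, mask=("alphabetic", "numeric", "other"), half="X", full="事"):
    """Replace every character whose class is in mask; preserve the rest."""
    return "".join(
        (full if is_full_width(c) else half) if classify(c) in mask else c
        for c in value
    )

print(redact("Flat 3, 北京市"))  # XXXX X, 事事事
```

The default value of mask mirrors the default described above: symbols and spaces are preserved and everything else is masked.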
Masking Range
This panel allows you to specify a partial range of an original value to be selected for masking. The default, as shown in the mask above, shall select the entire value.
Select Range
This defines what part of each value shall be selected for masking (subject further to the First and Last parameters below). Everything outside of this range shall be preserved.
Entire Field - The entire value is selected.
Before Substring - Only the part of the value that appears before the specified substring shall be selected for masking. The substring and the remainder of the value shall be preserved.
After Substring - Only the part of the value that appears after the specified substring shall be selected for masking. The substring and the part of the value that appears before it shall be preserved.
Substring
The substring that shall be considered the delimiter of the masking range.
Every character in this field is significant, including quotes and spaces. Therefore, do not include quotes or spaces in this field unless you want them to be part of the substring that is searched for.
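As a rough illustration of the three Select Range options, the following Python sketch splits a value into a preserved prefix, the part selected for masking, and a preserved suffix. The function name split_range and the mode strings are hypothetical. Two behaviours are assumptions that the text above does not specify: the first occurrence of the substring is used, and nothing is selected when the substring is not found.

```python
def split_range(value, mode="entire", substring=""):
    """Return (preserved_prefix, selected_for_masking, preserved_suffix)."""
    if mode == "entire":
        return "", value, ""
    idx = value.find(substring)  # assumption: first occurrence from the left
    if idx < 0:
        return value, "", ""     # assumption: no match means nothing is selected
    if mode == "before":
        # The substring itself and everything after it are preserved.
        return "", value[:idx], value[idx:]
    if mode == "after":
        # Everything up to and including the substring is preserved.
        end = idx + len(substring)
        return value[:end], value[end:], ""
    raise ValueError("unknown mode: " + mode)

print(split_range("jane.doe@example.com", "before", "@"))
print(split_range("jane.doe@example.com", "after", "@"))
```

Because the comparison is an exact match, a substring of "@ " (with a trailing space) would not be found in the sample value, which is the practical consequence of every character in the Substring field being significant.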
Preserve or Mask:
If Preserve is selected from the combo box then the First and Last parameters shall describe how many characters from the original value shall be preserved in the masked value. All other characters shall be masked.
If Mask is selected from the combo box then the First and Last parameters shall describe how many characters in the original value shall be masked. All other characters shall be preserved.
First
This specifies how many characters, counted in the adjacent character unit from the start of the selected range, shall be preserved or masked (as specified by "Preserve or Mask" described above).
The character unit combo box offers the selection:
All - Every character is counted, including non-alphanumerics, i.e. this yields a fixed offset.
Alphabetic - Only English alphabetic characters are counted ('A'..'Z', 'a'..'z')
Numeric - Only numeric characters are counted
AlphaNumeric - Only alphanumeric characters are counted
Last
This specifies how many characters, counted in the adjacent character unit from the end of the selected range, shall be preserved or masked (as specified by "Preserve or Mask" described above).
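The interaction of Preserve or Mask, First, Last and the character units can be sketched as follows. This is a simplified model under stated assumptions, not the product's implementation: the function names are hypothetical, only ASCII letters and digits are counted by the Alphabetic, Numeric and AlphaNumeric units, only the counted characters themselves are treated as "first" or "last", and Python's str.isalnum() stands in for the default rule of preserving symbols and spaces.

```python
def counts(ch, unit):
    """True if ch is counted under the given character unit."""
    is_alpha = "A" <= ch <= "Z" or "a" <= ch <= "z"
    is_digit = "0" <= ch <= "9"
    if unit == "all":
        return True
    if unit == "alphabetic":
        return is_alpha
    if unit == "numeric":
        return is_digit
    if unit == "alphanumeric":
        return is_alpha or is_digit
    raise ValueError("unknown unit: " + unit)

def redact_first_last(value, action, first=0, first_unit="all",
                      last=0, last_unit="all", ch="*"):
    """action is "mask" or "preserve"; first/last are counts in their units."""
    first_idx = [i for i, c in enumerate(value) if counts(c, first_unit)][:first]
    last_pool = [i for i, c in enumerate(value) if counts(c, last_unit)]
    last_idx = last_pool[-last:] if last > 0 else []
    marked = set(first_idx) | set(last_idx)
    out = []
    for i, c in enumerate(value):
        # "mask" redacts the marked characters; "preserve" redacts all the others.
        selected = (i in marked) if action == "mask" else (i not in marked)
        # Simplification: symbols and spaces are always preserved.
        out.append(ch if selected and c.isalnum() else c)
    return "".join(out)

print(redact_first_last("ABCDEFGHIJ", "mask", first=6))  # ******GHIJ
print(redact_first_last("AB-CD12-3456", "preserve", first=2,
                        first_unit="alphabetic", last=3,
                        last_unit="numeric", ch="@"))    # AB-@@@@-@456
```

In the second call the hyphens survive because they are symbols, while "CD12" and the "3" are masked because they are neither among the first 2 alphabetic characters nor among the last 3 digits.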
Numeric Fields
The Redact mask is capable of masking numeric-type fields. In this case, the redaction character must be a digit ('0'..'9').
For example, you can specify that the first few digits of each number in a column should be preserved and the remainder set to zeroes. The sign character shall not be affected, i.e. positive numbers shall remain positive and negative numbers shall remain negative, unless the masking sets the value to zero.
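A minimal Python sketch of this numeric behaviour is shown below. It handles integers only; the function name redact_number and its parameters are hypothetical, and how the product treats decimal points in NUMERIC values is not covered here.

```python
def redact_number(n, keep_first=2, digit="0"):
    """Preserve the first keep_first digits of n and replace the rest with digit."""
    if len(digit) != 1 or not "0" <= digit <= "9":
        raise ValueError("the redaction character must be a digit ('0'..'9')")
    sign = -1 if n < 0 else 1
    digits = str(abs(n))
    masked = digits[:keep_first] + digit * max(len(digits) - keep_first, 0)
    # The sign is reapplied after masking, so it is never redacted itself.
    return sign * int(masked)

print(redact_number(123456))   # 120000
print(redact_number(-123456))  # -120000
print(redact_number(-42, 0))   # 0
```

The last call shows the one case in which a negative number does not remain negative: every digit is masked with zero, so the result is zero.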
Determinism
The Determinism tab shall always show that the Redact mask is Deterministic, i.e. the masked value shall always be the same for a given input value. Therefore, the settings on the Determinism tab cannot be modified.
Examples
Example:
Mask the last 4 digits of phone numbers with zeroes.
Since only the last 4 numeric characters (digits) are replaced, this preserves not only the formatting of the original values but also other general but significant information, such as country codes and area codes. However, you must consider whether this is sufficiently anonymous for your purposes. Note: Generally speaking, the Randomize Mask is better suited for masking telephone numbers because it can generate distinct masked values for distinct original values (rather than generating duplicates, as would likely happen if the last 4 digits of all telephone numbers were set to '0000').
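The effect of this example, including the duplicate values mentioned in the note, can be illustrated with a short Python sketch. The function name zero_last_digits and the sample phone numbers are made up for illustration.

```python
def zero_last_digits(value, count=4):
    """Replace the last count ASCII digits of value with '0', keeping all formatting."""
    chars = list(value)
    remaining = count
    # Walk from the end so that only digits are counted, not symbols or spaces.
    for i in range(len(chars) - 1, -1, -1):
        if remaining == 0:
            break
        if "0" <= chars[i] <= "9":
            chars[i] = "0"
            remaining -= 1
    return "".join(chars)

a = zero_last_digits("+61 (02) 9555-1234")
b = zero_last_digits("+61 (02) 9555-9876")
print(a)       # +61 (02) 9555-0000
print(a == b)  # True
```

The two distinct original numbers produce the same masked value, which is why a mask that generates distinct outputs may be preferable when uniqueness matters.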
Example:
Mask the first 6 characters with '*'.
Example:
Preserve only the first 2 alphabetic characters and the last 3 digits while masking all other alphanumeric characters using the '@' character (preserving non-alphanumeric characters).
Example:
Redact telephone numbers after the first hyphen ('-') from the left; however, within this range preserve only the first digit and last two digits.
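The combination of an After Substring range with Preserve settings in this example can be sketched in Python. The function name redact_after and the redaction character 'X' are assumptions (the example does not name a character), as is leaving the value unchanged when no hyphen is found.

```python
def redact_after(value, substring="-", keep_first=1, keep_last=2, ch="X"):
    """Redact after the first occurrence of substring, preserving the first
    keep_first and last keep_last digits within that range."""
    idx = value.find(substring)
    if idx < 0:
        return value  # assumption: no delimiter means nothing is selected
    start = idx + len(substring)
    digits = [i for i in range(start, len(value)) if "0" <= value[i] <= "9"]
    keep = set(digits[:keep_first])
    if keep_last > 0:
        keep |= set(digits[-keep_last:])
    out = list(value)
    for i in range(start, len(value)):
        # Symbols and spaces are preserved, as in the default settings.
        if i not in keep and out[i].isalnum():
            out[i] = ch
    return "".join(out)

print(redact_after("555-123-4567"))     # 555-1XX-XX67
print(redact_after("+1-555-123-4567"))  # +1-5XX-XXX-XX67
```

Everything up to and including the first hyphen is outside the masking range and is left untouched, so the leading "555" or "+1" survives in full.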
Example:
Redact a Chinese address. Mask all characters except for symbols and spaces. Each half-width character should be replaced with 'X' and each full-width character with '事'.
Size Limitations
The following are the maximum character lengths per value to be masked:
MySQL: 65,535
Oracle: 2,000
SQL Server/Azure: 2GB