Single-Byte and Multi-Byte Character Redaction Options in Redact Mask
DataVeil 4.4.0 introduces two enhancements to the Redact mask.
- Two distinct redaction characters can be specified. One is used when the original character is half-width (single-byte); the other is used for all other characters (full-width or multi-byte).
- The previous redaction scope checkbox option ‘Non-Alphanumeric characters’ has been split into two new options: ‘Symbols’ and ‘Other’.
As a result, the Redact mask’s default selection and redaction character options are now as shown below:
By default, as shown above, Symbols and Spaces are preserved and everything else is redacted. The Other category mostly covers characters from languages other than English, such as Chinese, French, German and Cyrillic-script languages.
Of the characters that are redacted, those that are single-byte (half-width) are replaced with the ‘X’ character and all others with the full-width ‘Ｘ’ character. Note that the hex value is displayed next to each specified redaction character. As shown, the value 0x58 confirms that the half-width redaction character is the single-byte ‘X’, whereas the value 0xFF38 confirms that the full-width ‘Ｘ’ has been specified for the full-width character.
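The hex values quoted above match the Unicode code points of the two characters, so they can be checked outside DataVeil. The following is a minimal sketch; the function name `code_point_hex` is hypothetical and is not part of DataVeil.

```python
def code_point_hex(ch: str) -> str:
    """Return the Unicode code point of a single character as a hex string."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"0x{ord(ch):X}"


print(code_point_hex("X"))       # half-width Latin capital X
print(code_point_hex("\uFF38"))  # full-width Latin capital X
```

The two characters look alike on screen, so comparing code points is a quick way to tell which one has been entered in a field.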
DataVeil does not place any restriction on which characters can be specified in each of the 'Replace half-width/full-width character' fields. For instance, you could specify ‘X’ in one and ‘@’ in the other. You could also specify a full-width character in the ‘Replace half-width’ field, or vice versa. However, it may be best to use a half-width character for redacting half-width characters and a full-width character for redacting full-width characters, because that best preserves the original data length and may therefore produce a more useful form of redacted data.
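The selection rule described above can be approximated with a short sketch. This is not DataVeil's implementation. It assumes that "Symbols" means the Unicode punctuation and symbol categories, that "Spaces" means any whitespace, and that "single-byte" means a character that encodes to one byte in UTF-8. The function names `is_preserved` and `redact` are hypothetical.

```python
import unicodedata


def is_preserved(ch: str) -> bool:
    # Assumption: Symbols = Unicode categories P* and S*; Spaces = whitespace.
    return ch.isspace() or unicodedata.category(ch)[0] in ("P", "S")


def redact(text: str, half: str = "X", other: str = "\uFF38") -> str:
    """Replace each non-preserved character with `half` or `other`."""
    out = []
    for ch in text:
        if is_preserved(ch):
            out.append(ch)        # symbols and spaces are kept as-is
        elif len(ch.encode("utf-8")) == 1:
            out.append(half)      # assumption: single-byte means 1 byte in UTF-8
        else:
            out.append(other)     # full-width or any other multi-byte character
    return "".join(out)


print(redact("Ab 1-2"))                    # XX X-X
print(redact("кв 78.", "@", "\u0416"))     # ЖЖ @@.
```

Whether a replacement preserves the stored data length depends on the column's character encoding, which this sketch does not model.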
For example, to redact the value
‘Half-Width & 123.Ｆｕｌｌ－Ｗｉｄｔｈ ＆ １２３．Велика Васильківська 456, кв 78.’
using ‘@’ for single-byte characters and ‘Ж’ for multi-byte characters, while preserving symbols and spaces, the mask options are configured as shown below:
The before & after masking result is: