Partial Masking

Partial masking is a DataVeil feature that preserves part of an original value and masks the remainder.

One of the benefits is that this can easily preserve the statistical distribution of the original data.

For example, suppose that you have a column of 16- to 20-digit account numbers and it is important to maintain exactly the same distribution of prefixes in the masked data as in the original data. In this case, the prefix is the first six digits of the account number. As a side note, if you are masking credit card numbers or other primary account numbers then you should use the Primary Account Number mask, because it also automatically calculates and includes the correct Luhn check digit.
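To illustrate what the check digit in the side note involves, here is a minimal Python sketch of the standard Luhn calculation. The function name `luhn_check_digit` is made up for this example; it is not part of DataVeil and says nothing about how the Primary Account Number mask is implemented.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of digits (check digit excluded)."""
    total = 0
    # Walk the payload from right to left, doubling every second digit,
    # starting with the rightmost one.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


# Well-known test number 4111 1111 1111 1111: the first 15 digits are the payload.
print(luhn_check_digit("411111111111111"))  # 1
```

If the digits after a preserved prefix are randomized, the final digit has to be recomputed in this way for the masked number to remain a valid account number.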

Using DataVeil, it’s a very simple matter to configure a mask to preserve the first six digits of each account number and mask the remaining digits. The configuration of this option is shown below.

[Image: preserve_6]

This preserves the exact distribution of all prefixes in the masked data, not just an approximation. The configuration is simple, and if you need to mask another database that has a different distribution of data, the same option automatically adapts to preserve the new distribution precisely too.
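The idea can be sketched outside DataVeil in a few lines of Python. This is only an illustration under stated assumptions: the helper `mask_keep_prefix` and the sample account numbers are invented for the example, and the sketch does not reflect DataVeil's internal algorithm.

```python
import random
from collections import Counter


def mask_keep_prefix(value, keep=6, rng=None):
    """Keep the first `keep` characters and replace each later digit with a random digit."""
    rng = rng or random.Random()
    head, tail = value[:keep], value[keep:]
    masked_tail = "".join(
        rng.choice("0123456789") if ch.isdigit() else ch for ch in tail
    )
    return head + masked_tail


rng = random.Random(42)  # fixed seed only so the demonstration is repeatable
accounts = ["4539120000012345", "4539129876543210", "601100123456789012"]  # made-up data
masked_accounts = [mask_keep_prefix(a, rng=rng) for a in accounts]

# The prefix counts are identical before and after masking.
prefixes_before = Counter(a[:6] for a in accounts)
prefixes_after = Counter(m[:6] for m in masked_accounts)
print(prefixes_before == prefixes_after)  # True
```

Because the prefix is copied rather than regenerated, the prefix distribution is preserved exactly for any input data, which is the property described above.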

This was just the simplest example of DataVeil partial masking. Let’s now consider a more powerful feature of the partial masking option – dynamic masking range selection.

This means that DataVeil scans an original value for the occurrence of a character or substring and then masks (fully or partially) only the data that appears before or after it.

For example, suppose that you want to mask email addresses but preserve the domain names. The option shown below accomplishes this: it selects for masking consideration everything that appears before the first ‘@’ character and then masks everything within this selected range (shown as Preserve 0 below). Therefore ‘John.Smith@somedomain.com’ could be masked to something like ‘Haje.Klath@somedomain.com’.

[Image: partial_email]
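As a rough illustration of the same selection logic, the following Python sketch masks everything before the first occurrence of a delimiter. The names `mask_before` and `randomize_char` are hypothetical, and two behaviours are assumptions of this sketch rather than documented DataVeil behaviour: `preserve` is read as the number of leading characters kept inside the selected range, and a value with no delimiter is masked in full.

```python
import random
import string


def randomize_char(ch, rng):
    """Replace a letter or digit with a random one of the same kind."""
    if ch.isdigit():
        return rng.choice(string.digits)
    if ch.isupper():
        return rng.choice(string.ascii_uppercase)
    if ch.islower():
        return rng.choice(string.ascii_lowercase)
    return ch  # punctuation such as '.' stays in place


def mask_before(value, delimiter="@", preserve=0, rng=None):
    """Mask the text before the first `delimiter`, keeping `preserve` leading characters."""
    rng = rng or random.Random()
    # partition() splits at the first occurrence only; if the delimiter is
    # absent, the whole value lands in `head` and is masked (an assumption).
    head, sep, tail = value.partition(delimiter)
    kept, to_mask = head[:preserve], head[preserve:]
    masked = "".join(randomize_char(ch, rng) for ch in to_mask)
    return kept + masked + sep + tail


original_email = "John.Smith@somedomain.com"
masked_email = mask_before(original_email, rng=random.Random(7))
print(masked_email)  # random letters before '@', domain unchanged
```

Unlike the ‘Haje.Klath’ example above, this sketch produces arbitrary letters rather than name-like text; it only demonstrates how the masking range is selected.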

Finally, let’s consider a more complex example. Suppose that the end-user has provided the following requirements to partially mask serial numbers. Each row may contain differently formatted serial numbers (hyphens in different positions). Everything prior to the second hyphen from the left must be preserved, but after this hyphen only the next two alphabetic characters and the last digit are to be preserved. Everything else is to be masked. The exact format of every individual value must be preserved.

Two sample data rows are shown below. The parts to be preserved are in bold and the parts to be masked are underlined.

31-4353-34-A709K998-908901
522-92-4X3W-W0980L2

Although this requirement may seem rather complex, if not nearly impossible, it is actually very simple to achieve in DataVeil. The configuration takes only seconds and is shown below.

[Image: partial_hyphen2]

A ‘before’ and ‘after’ masking result is shown below.

[Image: partial_browser]

Note: The before and after masking result shown side-by-side above is a screen capture from DataVeil’s built-in Data Browser.
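For readers who want to see the serial number rule spelled out, here is a minimal Python sketch of one reading of the requirements: keep everything up to and including the second hyphen, keep the first two alphabetic characters that follow it, keep the final digit of the value, and replace every other letter or digit with a random character of the same kind. The function names are invented for this example, the handling of values with fewer than two hyphens is an assumption, and none of this represents how DataVeil implements the option.

```python
import random
import string


def randomize_char(ch, rng):
    """Replace a letter or digit with a random one of the same kind."""
    if ch.isdigit():
        return rng.choice(string.digits)
    if ch.isupper():
        return rng.choice(string.ascii_uppercase)
    if ch.islower():
        return rng.choice(string.ascii_lowercase)
    return ch  # hyphens and other symbols keep their positions


def mask_serial(value, rng=None):
    rng = rng or random.Random()
    first = value.find("-")
    second = value.find("-", first + 1) if first != -1 else -1
    if second == -1:
        # Assumption of this sketch: reject values with fewer than two hyphens.
        raise ValueError("expected at least two hyphens")
    start = second + 1  # masking is only considered from here on

    positions = range(start, len(value))
    letters = [i for i in positions if value[i].isalpha()]
    digits = [i for i in positions if value[i].isdigit()]
    keep = set(letters[:2])  # next two alphabetic characters
    if digits:
        keep.add(digits[-1])  # last digit of the value

    out = list(value)
    for i in positions:
        if i not in keep:
            out[i] = randomize_char(value[i], rng)
    return "".join(out)


rng = random.Random(2024)  # fixed seed only so the demonstration is repeatable
serial_1 = "31-4353-34-A709K998-908901"
serial_2 = "522-92-4X3W-W0980L2"
masked_1 = mask_serial(serial_1, rng)
masked_2 = mask_serial(serial_2, rng)
print(masked_1)
print(masked_2)
```

Under this reading, the first sample keeps ‘31-4353-’, the letters ‘A’ and ‘K’, and the final ‘1’; the second keeps ‘522-92-’, the letters ‘X’ and ‘W’, and the final ‘2’. Hyphens stay where they are, so the format of each value is unchanged.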

Hopefully, this article has demonstrated the versatility of the DataVeil partial masking option.

If you are interested in exploring this feature further, please refer to the DataVeil Randomize mask as an example of a mask suited to this task. Also consider the Redact and Randomize Hex masks.