Spoofing Realistic Credit Card Data for your Test Systems using DataVeil

As mentioned in a previous post, we are often asked how DataVeil differs from other data masking software. In short, DataVeil has been designed to be simple to use while offering the versatility and smart masking needed to get the most utility out of the masked data.

That all sounds very nice, but let’s see a practical example. Let’s consider how the same masking task can be achieved using DataVeil compared to using Redgate’s Data Masker. We’ll choose a task that Redgate has already described on their blog.

The task is to generate valid credit card numbers (Primary Account Numbers) that preserve, in the masked data, the same distribution of prefixes that is found in the original data. For example, Visa 50%, MasterCard 25.6%, and so on for other card types.
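
To make "valid" concrete: card numbers normally end in a check digit computed with the Luhn algorithm, so a masked number has to pass the same test. The following is a minimal Python sketch of that check. The function names are our own for illustration and are not part of DataVeil or Data Masker.

```python
def luhn_check_digit(partial: str) -> str:
    """Return the digit that makes partial + digit pass the Luhn test."""
    total = 0
    # Walk the digits right to left; the rightmost digit of the partial
    # number is doubled, then every second digit after it.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(pan: str) -> bool:
    """True if pan is all digits and its last digit is the Luhn check digit."""
    return len(pan) > 1 and pan.isdigit() and luhn_check_digit(pan[:-1]) == pan[-1]


print(is_luhn_valid("4111111111111111"))  # True (a widely used test number)
print(is_luhn_valid("4111111111111112"))  # False
```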

Redgate Data Masker

A summary of the steps required is given below. The details of each step can be found in the Redgate Hub article “Spoofing Realistic Credit Card Data for your Test Systems using Data Masker”.

1)   Perform a statistical distribution analysis of the original data or use a similarly relevant report.

  • This could be by running a query on the original sensitive data or using an approximation from third-party sources such as the Nilson Report.

2)   Create a data masking ruleset for every credit card type that you want to include in the masked data.

  • For five credit card types, therefore, you will need to create five different rules, each specifying the percentage of rows to generate.
  • Care needs to be taken to ensure that these rules are specified in the correct order or some rows may not be masked.
  • The approach described also means that many rows shall be masked and overwritten multiple times unnecessarily.
  • The masked credit card numbers generated shall all be of a pre-defined format. If there are any different formats in your database (perhaps causing an error that you would have liked to have preserved for debugging) then they shall be overwritten by the mask’s pre-defined format.

3)   Further refinements to masking rules are made in an attempt to achieve a closer approximation to the statistical distribution of the original data.

4)   If consistent data masking is required then an additional synchronization rule is required (not described in the original article).
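
The distribution analysis in step 1 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the prefix rules below cover just three card types (Visa starting with 4, MasterCard with 51 to 55, American Express with 34 or 37) and lump everything else into "Other", and the sample numbers are well-known public test values rather than real data.

```python
from collections import Counter


def card_type(pan: str) -> str:
    """Classify a card number by its leading digits (simplified rules)."""
    if pan.startswith("4"):
        return "Visa"
    if pan[:2] in {"51", "52", "53", "54", "55"}:
        return "MasterCard"
    if pan[:2] in {"34", "37"}:
        return "American Express"
    return "Other"


def prefix_distribution(pans):
    """Return the percentage of rows per card type."""
    counts = Counter(card_type(p) for p in pans)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


sample = [
    "4111111111111111",
    "4012888888881881",
    "5555555555554444",
    "378282246310005",
]
for name, pct in prefix_distribution(sample).items():
    print(f"{name}: {pct:.1f}%")
```

The resulting percentages are what each per-card-type rule would then need to be configured with.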

DataVeil

Here is how the same task can be achieved using DataVeil:

1)   Select the Primary Account Number mask and click OK.

That’s all. Finished.

In fact, not only is the DataVeil solution much simpler, it also provides a better result because:

  • An exact distribution of data (same percentage of every credit card type) shall be contained in the masked data because the prefixes of every original value have been preserved. The Redgate approach, by contrast, may miss some card types if the user hasn’t explicitly created a rule for each of them; DataVeil automatically includes all of them.
  • Every credit card number will be masked. Guaranteed.
  • Each row gets masked only once. DataVeil is more efficient.
  • By default, DataVeil masks these values consistently (deterministic mode), i.e. identical original card numbers are all masked to the same value.
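
To illustrate why preserving prefixes and masking deterministically gives these properties, here is a minimal Python sketch. It is not DataVeil's actual algorithm, which is not described here. It assumes digit-only card numbers longer than seven digits, keeps the first six digits as the prefix (an assumption), derives the middle digits from a keyed hash (HMAC-SHA256) of the original value, and recomputes the Luhn check digit. The names mask_pan, key and keep are hypothetical.

```python
import hashlib
import hmac


def luhn_check_digit(partial: str) -> str:
    """Return the digit that makes partial + digit pass the Luhn test."""
    total = 0
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:  # rightmost digit of the partial number is doubled
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def mask_pan(pan: str, key: bytes, keep: int = 6) -> str:
    """Mask pan, keeping its first `keep` digits, its length and Luhn validity."""
    # The same (key, pan) pair always yields the same digest, so the same
    # original number always maps to the same masked number.
    digest = hmac.new(key, pan.encode("ascii"), hashlib.sha256).digest()
    n_middle = len(pan) - keep - 1  # digits between prefix and check digit
    # Map digest bytes to decimal digits (slightly biased; fine for a sketch).
    middle = "".join(str(b % 10) for b in digest[:n_middle])
    body = pan[:keep] + middle
    return body + luhn_check_digit(body)


key = b"demo-key"  # hypothetical secret; a real key would be kept private
masked = mask_pan("4111111111111111", key)
print(masked[:6])                                  # 411111
print(masked == mask_pan("4111111111111111", key)) # True
```

Because every row keeps its own prefix, the card-type percentages in the masked data match the original without any per-type rules, and each row is written only once. A production-quality mask would also need to handle collisions and the chance that a masked value equals the original, which this sketch does not.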

DataVeil offers many other features such as sensitive data discovery, preview masking in an integrated data browser and more. Many of these features are even free to use, forever.