Primary Account Number (PAN) Mask
This mask is designed for Primary Account Numbers (PANs), which include credit card numbers.
Specifically, this mask will randomize the numeric digits of a PAN and correctly set the check digit according to the Luhn algorithm.
The length and format of the original PAN are always preserved.
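As an illustration of the check digit step, the following is a minimal sketch of the standard Luhn calculation. The function name luhn_check_digit is hypothetical and is not part of the product; the input is assumed to be a string containing only the digits that precede the check digit.

```python
def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for the digits in `payload`.

    `payload` holds every digit of the PAN except the final check digit.
    (Hypothetical helper for illustration only.)
    """
    total = 0
    # Walk from the rightmost payload digit; double every second digit,
    # starting with the rightmost one.
    for position, ch in enumerate(reversed(payload)):
        digit = int(ch)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return str((10 - total % 10) % 10)


# The first 15 digits of 4685-2421-1836-5024 yield its final digit.
print(luhn_check_digit("468524211836502"))  # prints 4
```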
You can specify how many digits from the left shall be preserved. Only digits are counted, not formatting characters. The default is 6, which typically corresponds to the Issuer Identification Number (IIN), sometimes referred to simply as the "card prefix".
For example, if the preserve digit count is 6 and the original value is '4685-2421-1836-5024' then '4685-24' shall be preserved and the remaining digits shall be randomized.
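The digit counting rule can be sketched as follows. This is only an illustration under the assumption that every character other than 0-9 is a formatting character; preserved_prefix and its parameter names are hypothetical.

```python
def preserved_prefix(value: str, preserve_digits: int = 6) -> str:
    """Return the leading part of `value` that holds the first
    `preserve_digits` digits, formatting characters included.
    (Hypothetical helper for illustration only.)
    """
    if preserve_digits <= 0:
        return ""
    seen = 0
    for index, ch in enumerate(value):
        if "0" <= ch <= "9":
            seen += 1
            if seen == preserve_digits:
                return value[: index + 1]
    # Fewer digits than requested: the whole value is preserved.
    return value


print(preserved_prefix("4685-2421-1836-5024", 6))  # prints 4685-24
```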
Preserving PAN prefixes is very useful because it ensures that the same issuers (e.g. Visa, MasterCard, American Express) and their statistical distributions in your original data are preserved in your masked data. If they are not preserved, the performance characteristics of the application that uses the masked data may change; for example, some issuer gateways may be less efficient than others and some may not be exercised at all.
Preserve invalid PANs
If this option is enabled then any value that is not a valid PAN shall be preserved.
A valid PAN contains 14 or 16 digits and the last digit must be a valid Luhn check digit.
This setting can be very useful for ensuring that invalid PANs are preserved so that they can be detected if they are causing problems in a production environment. Invalid PANs are not sensitive because they cannot be used in the real world.
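The validity rule stated above (digit count plus Luhn check digit) might be checked along these lines. This is a sketch, not the product's implementation: is_valid_pan is a hypothetical name, the allowed lengths are taken from the text, and treating every non-digit character as ignorable formatting is an assumption.

```python
def is_valid_pan(value: str, allowed_lengths=(14, 16)) -> bool:
    """Return True if `value` has an allowed digit count and a correct
    Luhn check digit. (Hypothetical helper for illustration only.)
    """
    # Assumption: anything that is not 0-9 is formatting and is ignored.
    digits = [int(ch) for ch in value if "0" <= ch <= "9"]
    if len(digits) not in allowed_lengths:
        return False
    total = 0
    # Starting from the check digit, double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_pan("4685-2421-1836-5024"))  # prints True
print(is_valid_pan("4685-2421-1836-5025"))  # prints False
```

With the option enabled, a value for which such a check fails would be left unmasked.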
When the PAN mask operates in deterministic mode, every distinct original PAN is masked to a distinct masked PAN. This is repeatable every time the PAN mask is executed, provided that the same configuration and deterministic seed are used.
This means that a specific original PAN value shall be masked consistently across different databases or upon repeated masking executions.
In deterministic mode, the 'Non-Deterministic parameters' section of the mask panel is not applicable and is therefore disabled.
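To illustrate what repeatability under a fixed seed means, here is a minimal sketch that derives the masked digits from a keyed hash (HMAC-SHA-256) of the original digits. This is a stand-in technique chosen for illustration, not the product's algorithm. In particular, a keyed hash does not guarantee that distinct originals map to distinct masked values; that guarantee would need something like a keyed permutation. The names mask_deterministic, seed and preserve are hypothetical.

```python
import hashlib
import hmac


def luhn_check_digit(payload: str) -> str:
    """Luhn check digit for the digits preceding the check digit."""
    total = 0
    for position, ch in enumerate(reversed(payload)):
        digit = int(ch)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def mask_deterministic(digits: str, seed: bytes, preserve: int = 6) -> str:
    """Mask a digits-only PAN repeatably for a given seed.

    Illustrative only: collisions between different originals are
    possible with this approach.
    """
    body_len = len(digits) - preserve - 1
    if body_len < 0:
        raise ValueError("PAN is too short for the preserved prefix")
    prefix = digits[:preserve]
    # Same seed + same original digits -> same pseudo-random number.
    mac = hmac.new(seed, digits.encode("ascii"), hashlib.sha256).digest()
    number = int.from_bytes(mac, "big")
    body = str(number % 10**body_len).zfill(body_len) if body_len else ""
    payload = prefix + body
    return payload + luhn_check_digit(payload)


first = mask_deterministic("4685242118365024", b"example-seed")
second = mask_deterministic("4685242118365024", b"example-seed")
print(first == second)  # prints True
```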
Non-deterministic PANs are generated on the basis of a sequence number.
Specifically, the PAN is assembled from: the preserved prefix (if any, see "Preserve Prefix" topic above), followed by a zero-padded sequence number, followed by a Luhn check digit calculated over all preceding digits.
The first non-deterministic PAN mask shall start from the specified 'Default sequence number'. Any subsequent non-deterministic PAN mask in the same Column shall restart from its own 'Default sequence number' value unless its 'Continue sequence number' checkbox has been selected, in which case it will start from the next number after the last number used by the previous non-deterministic PAN mask.
For example, if there are two consecutive non-deterministic PAN masks (because, say, the first mask specified a subset of rows using its Where condition) and the first mask masked 1,000,000 rows starting from sequence number 500, then the next non-deterministic mask (specifying 'Continue sequence numbers') shall start from 1,000,500. If the original PAN length was 16 and the first 6 digits being preserved were '468524', then the first PAN generated by the second mask would be '4685240010005003' (the last digit '3' is the automatically generated Luhn check digit). The sequence number portion is zero-padded on the left if necessary to preserve the original PAN length.
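The assembly in the example above can be reproduced with a short sketch. The function names are hypothetical and the code only mirrors the steps described in the text (prefix, zero-padded sequence number, Luhn check digit); it is not the product's implementation.

```python
def luhn_check_digit(payload: str) -> str:
    """Luhn check digit for the digits preceding the check digit."""
    total = 0
    for position, ch in enumerate(reversed(payload)):
        digit = int(ch)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def sequence_pan(prefix: str, sequence: int, length: int) -> str:
    """Build a PAN of `length` digits: prefix + padded sequence + check."""
    width = length - len(prefix) - 1  # digits left for the sequence number
    text = str(sequence)
    if len(text) > width:
        raise ValueError("sequence number does not fit in the PAN length")
    payload = prefix + text.zfill(width)
    return payload + luhn_check_digit(payload)


# First mask: 1,000,000 rows starting at 500, so it uses 500..1,000,499.
next_start = 500 + 1_000_000
print(next_start)                              # prints 1000500
print(sequence_pan("468524", next_start, 16))  # prints 4685240010005003
```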
In non-deterministic mode, the masked value for a specific original PAN is unpredictable, even across repeated executions on the same database. If you are masking multiple databases that each contain an identical PAN, it is almost certain that the masked values for that PAN will differ between the masked databases. If you need a specific original PAN to be masked consistently to the same masked value (a one-to-one relationship) then deterministic mode should be used.
The original format of every individual PAN value shall be automatically preserved.
For example, if a PAN value has its digits separated by hyphens then the masked value shall also be separated by hyphens in precisely the same way. Every individual row's format is examined and preserved. Therefore an original PAN "4685-2421-1836-5024" could be masked to "4685-2466-8495-3279" (hyphens preserved).
This preserves the idiosyncrasies of the original data and therefore maximizes the usefulness of the masked data.
For example, if some rows of your production data contain unusual PAN formatting that is causing problems, that formatting shall be preserved in the masked data so that the problems can be reproduced in a development or test environment.
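Per-row format preservation as described above can be sketched by using the original value as a template and substituting only its digit positions. The function name apply_format is hypothetical, and the sketch assumes the masked value is supplied as a digits-only string of the same digit count.

```python
def apply_format(original: str, masked_digits: str) -> str:
    """Place `masked_digits` into the digit positions of `original`,
    leaving every formatting character where it was.
    (Hypothetical helper for illustration only.)
    """
    replacements = iter(masked_digits)
    result = []
    for ch in original:
        if "0" <= ch <= "9":
            result.append(next(replacements))  # next masked digit
        else:
            result.append(ch)                  # keep hyphen, space, etc.
    return "".join(result)


print(apply_format("4685-2421-1836-5024", "4685246684953279"))
# prints 4685-2466-8495-3279
```

Because the template is taken from each individual value, rows with different separators or spacing keep their own layout.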