The Person Full Name Mask

 

This mask shall generate full names that consist of one or more given names and a family name.

The longest given name generated is 11 characters and the longest family name is 13 characters. If a receiving column width is shorter than a generated full name then the name shall be truncated.
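As a rough illustration of the width arithmetic, the following Python sketch computes the longest full name that could be generated for a given number of given names, and truncates a name to a column width. The function names are hypothetical. The sketch assumes single spaces between words, no Separator Character or preserved prefix/suffix, and truncation from the right; DataVeil's exact behavior in these respects is not stated here.

```python
MAX_GIVEN_LEN = 11   # longest generated given name, per the text above
MAX_FAMILY_LEN = 13  # longest generated family name, per the text above


def max_full_name_width(given_count: int = 1) -> int:
    """Worst-case length of given_count given names plus one family name.

    Assumes exactly one space between words (an assumption of this sketch).
    """
    words = given_count + 1
    return given_count * MAX_GIVEN_LEN + MAX_FAMILY_LEN + (words - 1)


def fit_to_column(full_name: str, column_width: int) -> str:
    """Truncate on the right if the name exceeds the column width (assumed)."""
    return full_name[:column_width]


print(max_full_name_width(1))                         # 25
print(max_full_name_width(2))                         # 37
print(fit_to_column("Christopher Higginbotham", 20))  # Christopher Higginbo
```

Under these assumptions, a column of at least 25 characters would hold any masked value that has one given name and a family name without truncation.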

Every given and family name generated consists of only alphabetic characters. In other words, there are no hyphenated names and there are no apostrophes.

DataVeil does not preserve punctuation or special symbols such as the characters ; : , . | / \ in the masked value. The only exception is the Separator Character, which shall be included in the masked Full Name if the 'Expect Separator Character...' option is selected and the separator was found in the original value.

The default Person Full Name mask is shown below.

 

Person full name data mask

 

Family Name appears at:

This specifies the relative position of the Family Name within the field.

First Position should be selected when the Family Name is expected to appear first. Example: "Smith John"

Last Position should be selected when the Family Name is expected to appear last. Example: "John Smith"

Please see "Why Locating Given and Family Names May Be Important To You" below.

 

If field contains a single name then accept it as a:

This deals with the potential situation that a Full Name field contains only a single name. E.g. "Prince"

Given Name specifies that the single name is replaced using a DataVeil-generated Given Name.

Family Name specifies that the single name is replaced using a DataVeil-generated Family Name.

 

Expect Separator Character between Family Name and Given Name(s)

If this option is selected then DataVeil shall scan the field for the first occurrence of the Separator Character (the default is ",").

If the separator character is absent then the handling options are:

Use normal parsing... DataVeil shall attempt to locate the Given Name(s) and Family Name as if this 'Expect Separator Character...' option was not selected.

Process as only Given Name(s) or Family Name... The field value shall be considered as containing either only Given Names (therefore the parameters under the Family Name tab shall be ignored) or only a Family Name (therefore the parameters under the Given Names tab shall be ignored) according to the setting of the 'If field contains a single name...' parameter. This option assumes that every word is a name, and therefore the field shall not be scanned for any prefixes (e.g. "Mr") or suffixes (e.g. "PhD") that may be defined under the Prefixes/Suffixes tab.

 

How DataVeil Parses a Field to Locate the Given Name(s) and Family Name

A hyphenated name is always regarded as a single name. Spaces between names and hyphens are ignored. E.g. "Smith-Williams" shall be processed exactly the same as "Smith  -  Williams".
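The hyphen rule above can be sketched in Python as a normalization step applied before a field is split into words. This is only an illustration of the stated rule; the function name is hypothetical and is not part of DataVeil.

```python
import re


def normalize_hyphens(value: str) -> str:
    """Remove any whitespace around hyphens so a hyphenated name is one word."""
    return re.sub(r"\s*-\s*", "-", value)


print(normalize_hyphens("Smith  -  Williams"))  # Smith-Williams
print(normalize_hyphens("Smith-Williams"))      # Smith-Williams
```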

A field can contain zero or more Given Names.

A field can contain zero or one Family Name.

If a field contains more than two words then DataVeil shall check for prepositions (e.g. "de") and articles (e.g. "le") preceding the Family Name. If prepositions or articles are found then they shall be considered as part of the Family Name. Example: "de la Rosa" shall be considered as a single Family Name. DataVeil recognizes the following prepositions and articles: al, bin, binti, da, de, del, della, des, di, du, el, la, le, van, von.

If a field contains exactly two words then prepositions and articles are not considered. E.g. "Van Chow" would be considered as consisting of a Given Name and a Family Name, whereas "John James van Huesen" would be considered as consisting of two Given Names ("John" and "James") and a Family Name ("van Huesen"), assuming that 'Family Name appears at' was configured for 'Last Position'.
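The word-count and preposition/article rules can be sketched in Python as follows. This is a minimal sketch for the 'Last Position' setting only, without the Separator option or prefixes/suffixes. The function name is hypothetical, and matching the prepositions and articles case-insensitively is an assumption of this sketch, not documented behavior.

```python
import re

# Prepositions and articles listed in the text above.
PARTICLES = {"al", "bin", "binti", "da", "de", "del", "della", "des",
             "di", "du", "el", "la", "le", "van", "von"}


def split_last_position(value):
    """Return (given_names, family_name) with the Family Name expected last."""
    # Hyphenated names count as a single word.
    words = re.sub(r"\s*-\s*", "-", value).split()
    if len(words) < 2:
        # The 'If field contains a single name...' option decides this case;
        # here the word is simply returned as a Given Name (assumption).
        return words, None
    family_start = len(words) - 1
    if len(words) > 2:
        # Only with more than two words: absorb preceding particles.
        while family_start > 0 and words[family_start - 1].lower() in PARTICLES:
            family_start -= 1
    return words[:family_start], " ".join(words[family_start:])


print(split_last_position("Van Chow"))               # (['Van'], 'Chow')
print(split_last_position("John James van Huesen"))  # (['John', 'James'], 'van Huesen')
print(split_last_position("de la Rosa"))             # ([], 'de la Rosa')
```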

If the Separator option is specified then the first occurrence of the separator character shall be assumed to divide the Given Name(s) from the Family Name. If multiple words are found at the position of the Family Name (according to the separator and the 'Family Name appears at' parameter) then all of those words shall be considered as a single Family Name.
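The separator rule can be sketched in Python as a split on the first occurrence of the separator. The function name and the "first"/"last" argument values are hypothetical. Returning None when the separator is absent stands in for the handling options described earlier and is not DataVeil's actual interface.

```python
def split_on_separator(value, separator=",", family_position="first"):
    """Return (given_names, family_name), or None if no separator is found."""
    before, found, after = value.partition(separator)  # first occurrence only
    if not found:
        return None  # caller falls back to the configured handling option
    if family_position == "first":
        family_words, given_words = before.split(), after.split()
    else:
        given_words, family_words = before.split(), after.split()
    # Every word on the family side is treated as one Family Name.
    family = " ".join(family_words) or None
    return given_words, family


print(split_on_separator("WILLIAMS SMITH, Mary"))
# (['Mary'], 'WILLIAMS SMITH')
print(split_on_separator("Mary Jo, Williams Smith", family_position="last"))
# (['Mary', 'Jo'], 'Williams Smith')
```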

Here are some examples to clarify:

Sample 1 - Default parameters

      

"Mary" is consistently masked as "Ariana" except:

* on row 5 because "Mary-Jo" is hyphenated and DataVeil considers the complete hyphenated name as a distinct single name. In this case "Mary-Jo" is masked as "Maribeth".

Also note on row 4 that since "Mary Jo" is not hyphenated it was considered as two distinct Given Names.

Similarly, the Family Name "Williams" is consistently masked as "Araiza" except:

* on row 2 "Williams" appears as a Given Name and is therefore masked as a Given Name ("Ronny").

* on row 3 "Williams-Smith" is hyphenated and is therefore considered as a complete single Family Name distinct from "Williams".

 

Sample 2 - Family Name in First Position, expect Separator ','

     

This mask is configured to expect the Family Name at the beginning of the field ('First Position'). As you can see, "Williams" is masked by "Araiza" which is consistent with the previous examples in Sample 1 where Family Name was expected in 'Last Position'.

This mask, however, uses the Separator option. On row 3 the separator is found which means that everything preceding it ("WILLIAMS SMITH") is considered a single Family Name. In this case it is equivalent to a hyphenated name. Therefore "Williams-Smith" on row 4 is also masked by the same Family Name "Kapp" as on row 3.

Also note that if the Separator Character is found then it shall be included in the masked Full Name as shown in rows 3 and 6. If it is not found in the original value then it shall also be omitted from the masked value.

 

Why Locating Given and Family Names May Be Important To You

You may be wondering 'Does it really matter if Given Names get replaced with Family Names and vice-versa?'

The answer is that it may be inconsequential, but it becomes important if consistency with other person name fields in your database is required.

For example, suppose that you have a Full Name field that contains the value "John Smith" and elsewhere in your database you have a corresponding Family Name field that contains "Smith".

Therefore, if your application is expecting consistency between these fields and if the Full Name is masked with "Peter Jones", then the corresponding Family Name field should also be masked with "Jones".

This consistent masking is only possible if Deterministic mode (the default) is used for the masks of both fields and the Full Name mask correctly identifies the Family Name in the Full Name field. For example, if you incorrectly specified the 'Family Name appears at' parameter as 'First Position' then the Family Name would be considered as "John", which would not synchronize with the other field's value of "Smith" and would result in inconsistent masking.
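The dependence of consistency on correct Family Name location can be illustrated with a small Python sketch. DataVeil's actual deterministic algorithm is not described here, so the keyed-hash lookup, the tiny name pools, the key and all function names below are purely hypothetical stand-ins. Lowercasing the input before hashing is also an assumption.

```python
import hashlib
import hmac

GIVEN_POOL = ["Peter", "William", "Henry", "Frank"]  # illustrative only
FAMILY_POOL = ["Jones", "Baker", "Araiza", "Kapp"]   # illustrative only
KEY = b"demo-key"                                    # hypothetical secret


def _pick(pool, name, kind):
    """Deterministically map a name to a pool entry via a keyed hash."""
    message = f"{kind}:{name.lower()}".encode("utf-8")
    digest = hmac.new(KEY, message, hashlib.sha256).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


def mask_given(name):
    return _pick(GIVEN_POOL, name, "given")


def mask_family(name):
    return _pick(FAMILY_POOL, name, "family")


def mask_full_name(value, family_position="last"):
    """Mask a simple 'Given ... Family' or 'Family Given ...' value."""
    words = value.split()
    if family_position == "last":
        parts = [mask_given(w) for w in words[:-1]] + [mask_family(words[-1])]
    else:
        parts = [mask_family(words[0])] + [mask_given(w) for w in words[1:]]
    return " ".join(parts)


# Correct setting: the family part matches the separate Family Name field.
print(mask_full_name("John Smith").split()[-1] == mask_family("Smith"))
```

With 'first' passed for "John Smith", the sketch would instead mask "John" as the Family Name, so the result would no longer be tied to the masked value of a separate "Smith" field.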

 

The Given Names Tab

This specifies how DataVeil shall generate masked given names.

Its parameters and operation are the same as for the Person Given Name mask.

If Deterministic mode masking (the default) is used then given names masked by the 'Person Given Name' mask and the 'Person Full Name' mask shall be masked consistently. For example "John" will be replaced with the same masked value by both masks.

 

The Family Names Tab

This specifies how DataVeil shall generate a masked Family Name.

Its parameters and operation are the same as for the Person Family Name mask.

If Deterministic mode masking (the default) is used then family names masked by the Person Family Name and the Person Full Name masks shall be masked consistently. For example "Smith" will be replaced with the same masked value by both masks.

 

The Prefixes / Suffixes Tab

 

 

For the purposes of this DataVeil mask, a 'prefix' is a word that can appear only as the first word regardless of the order of given and family names. E.g. "Mr John Smith" and "Mr Smith, John"

A 'suffix' is a word that can appear only as the last word regardless of the order of given and family names. E.g. "John Smith Jr" and "Smith, John Jr".

Please note that a field is scanned for prefixes and suffixes only if at least two words are present. E.g. "Dr Smith" would be scanned whereas a field value of only "Dr" would not.

Prefixes

The screen capture above shows that 'Mr Ms Mrs Miss' are prefixes that shall be preserved in the masked output. E.g. "Mr. John Smith" could get masked to "Mr William Jones".

The screen also shows that 'Dr Prof' are prefixes that shall be stripped prior to masking. E.g. "Prof John Smith" could get masked to "William Jones".

If a prefix is encountered that is not defined in either the Preserve Prefixes or Strip Prefixes panels then the prefix shall be treated as a regular name. E.g. if "Prof" were not defined in either panel, "Prof John Smith" could get masked to "Henry William Jones".
 

Suffixes

The screen capture above shows that 'MD PhD Sr Snr Jr Jnr Esq' are suffixes that shall be stripped from the masked output. E.g. "John Smith Jnr" could get masked to "William Jones".

If a suffix is encountered that is not defined in either the Preserve Suffixes or Strip Suffixes panels then the suffix shall be treated as a regular name. E.g. if "Sr" were not defined in either panel, "John Smith Sr" could get masked to "William Frank Baker". In this example "Sr" is assumed to be the family name, so "Smith" is treated as a given name. That is why "Smith" was masked with "Frank", unlike the preceding examples where it was masked with "Jones".
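The prefix and suffix rules above can be sketched in Python as a step that runs before the names are located. The function name is hypothetical. The default lists follow the values quoted from the screen capture, except that an empty Preserve Suffixes list is an assumption. Requiring at least two remaining words before checking for a suffix is also an assumption, as is comparing with punctuation removed and case ignored.

```python
import string

_PUNCT = str.maketrans("", "", string.punctuation)


def _clean(word):
    """Remove punctuation, e.g. 'M.D.' becomes 'MD'."""
    return word.translate(_PUNCT)


def split_affixes(value,
                  preserve_prefixes=("Mr", "Ms", "Mrs", "Miss"),
                  strip_prefixes=("Dr", "Prof"),
                  preserve_suffixes=(),
                  strip_suffixes=("MD", "PhD", "Sr", "Snr", "Jr", "Jnr", "Esq")):
    """Return (kept_prefix, name_words, kept_suffix)."""
    def keys(items):
        return {_clean(item).lower() for item in items}

    words = value.split()
    prefix = suffix = None
    if len(words) >= 2:                       # a lone word is never scanned
        first = _clean(words[0]).lower()
        if first in keys(preserve_prefixes):
            prefix = _clean(words.pop(0))     # keep original case, no punctuation
        elif first in keys(strip_prefixes):
            words.pop(0)
    if len(words) >= 2:
        last = _clean(words[-1]).lower()
        if last in keys(preserve_suffixes):
            suffix = _clean(words.pop())
        elif last in keys(strip_suffixes):
            words.pop()
    # Anything not matched stays in name_words and is masked as a regular name.
    return prefix, words, suffix


print(split_affixes("Mr. John Smith"))   # ('Mr', ['John', 'Smith'], None)
print(split_affixes("Prof John Smith"))  # (None, ['John', 'Smith'], None)
print(split_affixes("Sir John Smith"))   # (None, ['Sir', 'John', 'Smith'], None)
```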
 

CAUTION on Preserving Prefixes and Suffixes

It is generally recommended that only the most common prefixes (such as Mr and Ms) are specified to be preserved. Preserving common prefixes adds realism to the appearance of masked data. The same is true of suffixes, although it may be safest not to preserve any suffixes because their usage tends to be inherently less common. The presence of a less-common prefix or suffix may offer hints as to the true identity of the original name. E.g. prefixes such as "Sir", "Dr", "Prof" and "Capt" may provide significant hints to the original name's identity. Ultimately, it is a matter for the user/data analyst to decide.

If a prefix or suffix is not defined to be preserved or stripped then it shall be treated as if it was a regular name. i.e. a masked value shall be generated for it.

Defining a prefix or suffix to be stripped has two benefits: it does not add an additional name to the full-name field, and because the prefix/suffix is omitted from the masked value, no hint is provided to the original name's identity.
 

Case and Punctuation

Prefixes and suffixes are compared without regard to case during masking. If a prefix or suffix is preserved then the case of the original text shall also be preserved; however, any punctuation shall be stripped.

E.g. If a mask defines the suffix "md" to be preserved and the original value is "Ted Smith M.D." then the masked result could be "William Jones MD".

Prefixes and suffixes should be entered into these text panels separated by spaces. Do not enter punctuation; any punctuation shall be stripped. Therefore, if you try to define 'Mr. Mr mr Ms. Ms' you will find that DataVeil automatically strips the punctuation and keeps only the first occurrence of each prefix. When you next open the form you will see that it contains only 'Mr Ms'.
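The clean-up of a prefix or suffix panel entry can be sketched in Python as below. The function name is hypothetical; the sketch only mirrors the behavior described above (strip punctuation, ignore case when detecting duplicates, keep the first occurrence).

```python
import string


def normalize_affix_list(text: str) -> str:
    """Strip punctuation and drop case-insensitive duplicates, keeping the first."""
    table = str.maketrans("", "", string.punctuation)
    seen = set()
    result = []
    for token in text.split():
        cleaned = token.translate(table)
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            result.append(cleaned)
    return " ".join(result)


print(normalize_affix_list("Mr. Mr mr Ms. Ms"))  # Mr Ms
```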

 

Null and Empty Value Handling

Deterministic mode: A null or empty string is preserved.

Non-Deterministic mode: A null or empty string is overwritten with a random full name value consisting of one given name and a family name.
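The two behaviors can be sketched in Python as follows. The function names and name pools are hypothetical. Whether a whitespace-only value counts as empty is not stated above, so this sketch treats only None and the zero-length string as null or empty.

```python
import random

GIVEN_POOL = ["Ariana", "Maribeth", "Ronny"]  # illustrative only
FAMILY_POOL = ["Araiza", "Kapp", "Jones"]     # illustrative only


def is_blank(value) -> bool:
    """True for a null (None) or empty string value."""
    return value is None or value == ""


def mask_blank(value, deterministic=True, rng=None):
    """Handle a null or empty field according to the masking mode."""
    if not is_blank(value):
        raise ValueError("mask_blank() is only for null or empty values")
    if deterministic:
        return value  # preserved as-is
    rng = rng or random.Random()
    # One given name and one family name, chosen at random.
    return f"{rng.choice(GIVEN_POOL)} {rng.choice(FAMILY_POOL)}"


print(mask_blank(None))        # None
print(repr(mask_blank("")))    # ''
print(mask_blank("", deterministic=False))
```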