Discovery

When you first open the Discovery tab you are presented with a view as shown below.
  

 

Quick Start
  

The easiest way to configure and run a discovery search is to click on the 'Configure...' button, select a configuration as described in the Query Configuration section below, then click on the Search button.

After the discovery search has completed, the result will be shown in the Result tab.

A PDF report is also automatically generated and can be viewed in the Reports tab.

 

Query Configuration
  

It is generally recommended that you use the configuration feature to configure the discovery query. To do this, click on the 'Configure...' button and the dialog below shall be displayed.
  

 

Choose only the regions that apply to your needs; otherwise you may increase the number of false positives detected.

For example, by selecting 'United States' (as shown above) DataVeil shall load all search patterns to detect data that is typically considered sensitive in the United States. This includes data that is specific to the United States (such as social security numbers, USA towns and states) and it shall also load patterns that may apply to other regions (even globally) but are nonetheless applicable to the United States, such as email addresses, device identifiers, bank account numbers, etc.

However, if multiple regions do apply to you then you should select them. For example, if you do business with Canada and the United States, and you believe that your database may contain social security numbers from the USA and social insurance numbers from Canada then you should select both 'Canada' and 'United States'.

If you are in a European country or region, other than the United Kingdom, choose Europe 'Other'.

If you are in any other country, such as Australia, choose Worldwide 'Other'.

After clicking 'OK', this dialog will replace the current Query (if any patterns are already present) with the relevant built-in DataVeil system patterns.

The Query is saved together with the current DataVeil project.

Please note: The DataVeil built-in system patterns consider only the English language and perform very effectively for the United States and Canada. They are also effective in other English-speaking regions such as the United Kingdom, Australia and New Zealand.

In regions that do not use the English language, DataVeil shall use general patterns that are typically numeric-based, such as telephone numbers, serial numbers and bank account numbers. Patterns to detect internationally accepted identifiers such as email and web addresses shall also be used. In such cases, consider creating Custom regular expression patterns that are specific to your needs. This can be done in the tab Discovery->Search->Custom Patterns using the pop-up menu (right-click).

Manual Query Configuration
  

If you prefer, you can configure the Query manually: select a search pattern in the System Patterns or Custom Patterns tab, then right-click and select "Load into Query".

You can also do this after using the configuration feature (above) to add further patterns to the Query.

 

Search Options
  

Please refer to the Search Options section for details.

 

Search
      

After your Query is configured, click on the "Search" button to start the search process.

A progress panel shall be displayed.

When the search has completed the Result Summary panel shall be displayed.

The Reports tab allows you to configure the content of the PDF reports that are automatically generated, and to configure optional email notifications.

 

Patterns
  

All of the available search patterns are shown in the left panel. It contains two tabs: System Patterns and Custom Patterns.

Within each of these tabs, patterns are organized in two tiers.

The first tier, shown as folders, represents the general classification of patterns. E.g. "Internet".

The second tier represents specific patterns that belong to the classification. E.g. "Email Address" is a pattern that belongs under the classification of "Internet".

 

System Patterns
  

System patterns are DataVeil built-in patterns.

They have been specifically designed to detect their intended data domains. Depending on the data domain being detected, these patterns may employ regular expressions, lookup tables and other validation algorithms. For example, the Luhn check digit of a Primary Account Number is validated. Therefore, if a 16-digit number looks like a credit card number but has an invalid check digit, DataVeil shall not report that number as a Primary Account Number (whereas typical simple regular expression matching would return a false positive).
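To illustrate why check-digit validation reduces false positives, here is a minimal Python sketch of the standard Luhn algorithm next to a plain 16-digit regular expression. It is not DataVeil's implementation; the function name `luhn_valid` and the sample numbers are chosen here for illustration only.

```python
import re

# A naive pattern: any run of exactly 16 digits looks like a card number.
SIXTEEN_DIGITS = re.compile(r"\d{16}")


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check (illustrative helper)."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit and
    # subtract 9 when the doubled value exceeds 9.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


for candidate in ("4111111111111111", "4111111111111112"):
    looks_like_card = SIXTEEN_DIGITS.fullmatch(candidate) is not None
    print(candidate, "regex:", looks_like_card, "luhn:", luhn_valid(candidate))
```

Both sample values match the regular expression, but only the first (a widely used test number) passes the Luhn check, so a regex-only search would report the second as a false positive.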

You can examine the system patterns by expanding the classification folders.

You can right-click on a pattern to add it to the Query, and you can also right-click on a classification folder to add all of its patterns into the Query.
  

 

Custom Patterns
  

Custom patterns are user-defined regular expression patterns.
  

 

DataVeil does not provide any custom patterns, with the exception of one example classification folder containing one example custom pattern.

This shows that custom classifications are stored and represented as ordinary disk folders. Patterns are stored as XML files within their classification folders.

Note: When you add a custom pattern into the Query, its classification icon in the Query will be the same as a system pattern (blue); however, the pattern icon will remain as a yellow circle to visually differentiate from system patterns.

You can use the same classification names as those used by system patterns without any problems. You can also use entirely different classification (folder) names to those used by system patterns.

 

New Custom Classification
  

To create a new classification folder, right-click on the root node and select "New Classification Folder...".
  

 

Here we are creating a new custom classification called "Corporate".
  

 

New Custom Pattern
  

To create a new custom pattern, right-click on the applicable classification folder and select "New Pattern...".

 

 

 

After selecting "New Pattern..." an empty Pattern Definition dialog will appear.

Here you specify the custom pattern name, a regular expression to test column names, a regular expression to test column data and some test cases. The example shows that we have created a very simple custom pattern to detect business names. (This custom pattern is for illustrative purposes only. DataVeil provides a more sophisticated system pattern called "Business Name" under the "Organization" classification.)
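As a rough sketch of the idea, the Python snippet below models a pattern that has one regular expression for column names and another for column data. The class `CustomPattern`, its field names and both regular expressions are hypothetical; they are not taken from the DataVeil dialog or its file format, and how DataVeil combines the two tests is not specified here. Python's regular expression syntax may also differ in detail from the one DataVeil accepts.

```python
import re
from dataclasses import dataclass


@dataclass
class CustomPattern:
    """Hypothetical model of a custom pattern: a name plus two regexes."""
    name: str
    column_name_regex: str
    column_data_regex: str

    def matches_column_name(self, column_name: str) -> bool:
        # The column-name regex only needs to appear somewhere in the name.
        return re.search(self.column_name_regex, column_name, re.IGNORECASE) is not None

    def matches_value(self, value: str) -> bool:
        # The data regex must describe the whole (trimmed) value.
        return re.fullmatch(self.column_data_regex, value.strip(), re.IGNORECASE) is not None


# A deliberately simple business-name pattern (assumed regexes, for illustration).
business_name = CustomPattern(
    name="Business Name (simple)",
    column_name_regex=r"(company|business|org)",
    column_data_regex=r".+\b(inc|ltd|llc|corp|pty)\.?",
)

print(business_name.matches_column_name("COMPANY_NAME"))
print(business_name.matches_value("Acme Widgets Ltd"))
print(business_name.matches_value("John Smith"))
```

Testing the column name separately from the column data lets a search use either signal: a column called COMPANY_NAME is suspicious even when its values are irregular, and values ending in "Ltd" are suspicious even in a column with an uninformative name.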
  

 

You can click on the "Test" button and DataVeil shall execute your regular expressions against any test data that you specify in the panels on the right side, as shown below.
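If you want to try a regular expression outside DataVeil before entering it, the same kind of check can be sketched in a few lines of Python. The helper `run_test_cases` and the "EMP-" employee ID format are assumptions made up for this example; they do not reflect how the "Test" button is implemented.

```python
import re


def run_test_cases(data_regex: str,
                   should_match: list[str],
                   should_not_match: list[str]) -> list[str]:
    """Return a list of failure messages; an empty list means every case passed."""
    compiled = re.compile(data_regex)
    failures = []
    for value in should_match:
        if compiled.fullmatch(value) is None:
            failures.append(f"expected a match: {value!r}")
    for value in should_not_match:
        if compiled.fullmatch(value) is not None:
            failures.append(f"unexpected match: {value!r}")
    return failures


# Hypothetical employee ID format: "EMP-" followed by exactly six digits.
failures = run_test_cases(
    r"EMP-\d{6}",
    should_match=["EMP-000123", "EMP-987654"],
    should_not_match=["EMP-123", "emp-000123", "EMP-0001234"],
)
print(failures or "all test cases passed")
```

Including values that must not match is as important as including values that must: a pattern that is too permissive passes every positive case and still produces false positives during discovery.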

 

 

DataVeil saves each custom pattern definition as an XML file with the same name as the pattern and with a ".ddp" file name extension.

If you are unfamiliar with regular expressions and would like to learn how to create them, many references are available online and in print. To get started quickly, search the internet for "regular expression tutorial" or a similar phrase; one such site is www.regexone.com. Other sites, such as www.regexlib.com, provide sample regular expressions that you can use. Be careful with regular expressions that you find on the internet: the author may not have tested them thoroughly, or may have intended them for a purpose different from yours.

Import
  

You can load the discovery Query from another DataVeil project file or from a discovery query file that was created by using Export (the "Export..." button; see below).

 

Export
  

If you would like to save your discovery Query so that you can reuse it in other DataVeil projects then you can export it to a file. DataVeil saves this as an XML file with a ".ddq" file name extension.

Note: DataVeil saves the Query with the current project anyway, and you can import discovery queries from other DataVeil project files, so unless you have a specific need to create a separate query file this function may be of limited use.