Getting Started
Introduction
FileMasker is a data masking tool specifically designed to mask flat files.
Currently, the supported file types are text-delimited files (CSV) and files containing JSON records. You can find more information in the File Formats section.
The FileMasker GUI is used to create a masking definition project file (.fmp) that can be run directly from the FileMasker GUI or command line to mask files on the local file system.
FileMasker can also be installed as an Amazon AWS Lambda function that will mask files on S3.
Sample and Demo Files
Sample CSV and JSON files have been provided with the delivered software. These can be found in the directory filemasker/demo. Sample command-line batch files have also been included. Please refer to the README.txt in that directory for an explanation of each file.
Start the FileMasker GUI
Let's start this quick tutorial by creating a FileMasker project using the GUI.
To start the FileMasker GUI:
On Windows: For 64-bit Java environments, run bin/filemasker64.exe. For 32-bit Java environments, run bin/filemasker.exe.
On Linux/Unix: Run the script bin/filemasker
The main screen shall appear:

1. Define a Field
The first thing you will need to do is to define a field to be masked.
To create a field definition, right-click in the Masks Summary tab. A popup menu shall appear.
Click on the 'Add Field...' command.
A dialog to create a field definition shall appear, shown below.

There are a few options that you can set here and you can read about those in the Field Definition section.
For now, let's keep this simple and set only the field name path.
Note: In this tutorial we shall use the sample CSV file that is included with the delivered FileMasker software in the folder called demo. The CSV file in this folder is called 'demo_all.csv' and an equivalent file in JSON format is also provided called 'demo_all.json'.
Let's mask the 'first_name' field that contains person given names.
Since this CSV file contains a heading row with the field name 'first_name', we can enter 'first_name' as the field name in the FileMasker field definition form:

Now click OK.
The field shall be listed in the Masks Summary tab:

Note: If the CSV did not have a heading row we would have entered '2' as the field name because the field appears as the second column in the CSV file.
2. Create a Mask
Now we need to create a mask for the defined field.
Right-click on the field name and a popup menu shall appear:

Click on the 'Add Mask...' menu item and a list of available masks shall be displayed:

Note: the free Community license allows you to mask an unlimited amount of data using the free masks. For your convenience, these are shown in the mask list in bold text. Please refer to Licensing for more details.
Select 'Person Given Name' and click OK. The Person Given Name mask configuration dialog shall appear:

Let's just accept the defaults and click OK.
We have just finished configuring a mask for the 'first_name' field and it is listed in the Masks Summary tab:

3. Define the Input and Output Parameters
Open the Execution tab:

Input File Details
Click on the button to browse to the file to be masked.
When masking a CSV file whose first row contains column headings, select the 'First row is header' checkbox and create field names that reference these column names. If the CSV file does not contain a heading row, clear this checkbox and create field names as numbers: 1 for the leftmost column, 2 for the next column, and so on.
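The two ways of naming a field can be illustrated with a short Python sketch. This is not FileMasker code: the function read_field and the sample rows are hypothetical and exist only to show the difference between addressing a column by its heading and by its 1-based position.

```python
import csv
import io


def read_field(csv_text, field_name, first_row_is_header):
    """Return the values of one column, addressed by heading or by 1-based index."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if first_row_is_header:
        header, data = rows[0], rows[1:]
        index = header.index(field_name)   # look the column up by its heading
    else:
        data = rows
        index = int(field_name) - 1        # '1' refers to the leftmost column
    return [row[index] for row in data]


# Invented sample data (not the contents of demo_all.csv).
with_header = "id,first_name,last_name\n1,Alice,Smith\n2,Bob,Jones\n"
without_header = "1,Alice,Smith\n2,Bob,Jones\n"

print(read_field(with_header, "first_name", True))
print(read_field(without_header, "2", False))
```

Both calls select the same column: the first by the heading 'first_name', the second by the position '2'.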
Output File Details
Instead of overwriting the original file, let's write the masked version of the file to another folder called 'out'.
After entering the parameters as discussed, the form now appears as follows:

Further information on these settings can be found in the File Formats section.
4. Preview the Masking Project
It is recommended that you perform a Preview Run as you are creating a masking project to ensure that the masks you are configuring are performing as you would expect. The Preview Run performs the normal masking functions and shows the Before/After results but does not actually overwrite any files.
Click on the 'Preview Run' button or on the corresponding toolbar button.

Project Security
After you click the Preview Run button, FileMasker shall validate whether any security parameters are required. In this case, the Person Given Name mask is configured for deterministic mode masking (the default), so it requires a deterministic seed value, for which there is no default. A deterministic seed is considered sensitive, so a Project Key is also required. This is as complicated as it gets: FileMasker shall automatically prompt you whenever it needs a seed value or Project Key. For a more detailed explanation please refer to the 'Project Key' and 'Determinism' topics in the Project Settings section.
FileMasker shall now prompt you for a Project Key:

You can think of this Project Key as a password that you create. It is used to encrypt sensitive project data such as deterministic seeds. The Project Key itself is not stored anywhere. Only its hash value is stored in the project file, so that FileMasker can later validate whether the correct Project Key has been entered when required, such as when editing or accessing deterministic seeds.
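The idea of storing only a hash and validating a key against it later can be sketched in Python as follows. This is an illustration of the general technique, not FileMasker's implementation: the choice of PBKDF2, the salt, and the function names are all assumptions, since this guide does not state which hash is used.

```python
import hashlib
import hmac
import os


def hash_project_key(project_key, salt):
    # PBKDF2-HMAC-SHA256 is an assumption made for this sketch only.
    return hashlib.pbkdf2_hmac("sha256", project_key.encode("utf-8"), salt, 100_000)


def is_correct_key(candidate, salt, stored_hash):
    # Re-hash the candidate and compare in constant time.
    return hmac.compare_digest(hash_project_key(candidate, salt), stored_hash)


salt = os.urandom(16)
# Only the salt and the hash would be saved; the key itself is discarded.
stored_hash = hash_project_key("my project key", salt)

print(is_correct_key("my project key", salt, stored_hash))
print(is_correct_key("wrong key", salt, stored_hash))
```

The key cannot be read back from the stored hash, but a key entered later can still be checked against it.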
Next, FileMasker shall prompt you for the Default Seed:

This can be any text.
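To see why a seed makes masking deterministic, consider the following minimal Python sketch. It is not the algorithm FileMasker uses: the replacement list, the function mask_given_name, and the use of HMAC-SHA256 are hypothetical choices made only to show that the same input and the same seed always yield the same masked value.

```python
import hashlib
import hmac

# Hypothetical replacement values for illustration only.
REPLACEMENT_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Taylor"]


def mask_given_name(value, seed):
    """Pick a replacement name that depends only on the value and the seed."""
    digest = hmac.new(seed.encode("utf-8"), value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(REPLACEMENT_NAMES)
    return REPLACEMENT_NAMES[index]


first = mask_given_name("Alice", "any text can be a seed")
second = mask_given_name("Alice", "any text can be a seed")
print(first == second)  # the same value and seed give the same result
```

Because the result depends on the seed, anyone who knows the seed can reproduce the mapping, which is why a seed is treated as sensitive.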
After clicking OK, FileMasker shall have its required security parameters and shall proceed to execute the masking project.
The masks are executed and the Before/After Comparison window is displayed:

Notice that the word 'Preview' appears in the After column in the right half of the panel. This confirms that the masked results are only a preview and have not been committed to any file.
5. Produce a Masked File
Once you are satisfied that your configured masks are performing as expected, you can just as easily commit the masked results by writing them to a file as specified for Output File Details in step 3 above.
To perform the actual masking run, just click on the 'Run...' button or on the corresponding toolbar button.

This will perform the masks and show the Before/After Comparison display:

Note that the actual file path where the masked output was written is shown in the After column heading.
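The difference between a Preview Run and an actual run can be summarized with a small Python sketch. It is a simplified model under stated assumptions, not FileMasker's code: the function run_masking, the stand-in mask, and the sample data are hypothetical, and the masked output is returned as text rather than written to a file to keep the example self-contained.

```python
import csv
import io


def run_masking(csv_text, field_name, mask, preview):
    """Mask one field; return the before/after pairs and, unless previewing, the output."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    comparison = []
    for row in reader:
        before = row[field_name]
        row[field_name] = mask(before)
        comparison.append((before, row[field_name]))
        writer.writerow(row)
    # A preview produces the comparison but commits no output.
    return comparison, (None if preview else out.getvalue())


def x_out(value):
    # Stand-in mask: replace every character with 'X'.
    return "X" * len(value)


sample = "id,first_name\n1,Alice\n2,Bob\n"
pairs, output = run_masking(sample, "first_name", x_out, preview=True)
print(pairs, output)
pairs, output = run_masking(sample, "first_name", x_out, preview=False)
print(output)
```

In both modes the same masks run and the same before/after pairs are produced; only the actual run yields masked output.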
Next Steps
This introduction covered how to define fields and configure masks for them.
Additional masks are available and are documented in the Masks Reference section.
You can also use FileMasker as an AWS Lambda function to mask files on S3. For details please refer to the Amazon AWS section.