DataVeil home

Home    Table of Contents

Relationship Obfuscation

 

It is possible to obfuscate relationships by disabling a Dependency.

For example, suppose that your company is considering a joint venture or even considering being bought outright an during the initial stage you need to provide a copy of your database for a limited revenue audit where the sales amounts must be preserved but the specific trading details should remain confidential, such as which particular sales locations are most lucrative as it could leak sensitive strategic business information prematurely.

Consider that you have a table called Stores that lists all of your retail stores and their location information, and you have another table called Order where each row identifies a transaction and the retail store where the transaction originated. There is a Foreign Key defined on Order.StoreID which references the parent Store.StoreID.

 

Therefore, considering that the requirement is that sales amounts are to be preserved, you might think of shuffling the sales amounts (Sale.saleAmt) column and the Store.storeID column. Although that would shuffle sales amounts among stores, it would still leave a statistical hint as to the frequency of transactions produced by individual stores and therefore this may not be sufficient obfuscation.

 

Now, consider disabling the dependency between Store.storeID and Sale.storeID and generate a random distribution of storeID's. In this case we created a simple Random Number mask (not shown) to generate values of 1 to 5 which are valid storeID's in this simple example. You must make sure that if you disable a dependency that you generate only valid values to maintain referential integrity. If your real-word situation requires more than just a random number or that which any simple mask can generate then consider creating a CSV file with valid values and use a DataSet mask instead. 

The outcome is shown below. Notice that the frequency distribution of storeID's in the Sale table has been altered. e.g. Previously, a particular storeID appeared four times, now no storeID appears four times.

Therefore the relationships of individual sales to stores has been completely obfuscated including the statistical properties of how any individual store is performing relative to the others.

 

 

Home    Table of Contents