SQL Server Data Masking

This article discusses SQL Server data masking types and the options available for its implementation.

The two basic types are static and dynamic masking.

Implementation options will depend upon the masking type to be used and whether it is preferred to use a third-party masking tool or a built-in SQL Server solution if available. Another factor is the intended purpose and audience of the masked data.

In addition to providing general masking information this article will give an overview of DataVeil static data masking software features that are optimized for SQL Server.

SQL Server Data Masking with DataVeil

SQL Server Dynamic Data Masking

Dynamic data masking means that the original data in the database is unchanged and that only the data presented to the user or application is changed.

For example, if the database contains an email address ‘john.smith@domain.com’ then the actual value displayed to the user could be ‘xxxx.xxxxx@xxxxxx.com’ but the data in the database remains unchanged. Therefore, the database continues to contain all the original sensitive values.

Since the database will continue to contain the original sensitive data, the database must continue to be protected with the same degree of security as the original live database. This is a point that many users who adopt dynamic data masking for their test environment often fail to appreciate.

For instance, it is not enough to make a copy of a sensitive live database, move it into a non-secure environment, apply dynamic data masking and then provide users in the test environment with access to the dynamically masked database. There are often too many potential holes in a test environment’s security, such as shared access to admin accounts and even defects in the dynamic masking software that could allow users to bypass it and access the original sensitive data. There have been many such cases in earlier versions of Microsoft’s dynamic data masking for SQL Server and it is quite possible that other vulnerabilities may still exist in this or any other vendor’s dynamic data masking software.

Another important consideration that is often overlooked is whether using dynamic masking would violate data security or privacy laws or regulations. The simple presence of a copy of the original sensitive database in a test environment may in itself be a violation of relevant regulations. Therefore, dynamic masking may be best considered for secure live environments where some less-privileged users need access to only some information but have no requirement to see other more sensitive information.

There are many third-party solutions; however, for dynamic data masking in SQL Server it may be worth considering Microsoft’s built-in dynamic data masking capabilities.

SQL Server Static Data Masking

Static data masking means that sensitive data in the original database is overwritten or deleted. Often a desired outcome of static data masking is for the sensitive data to have been replaced with realistic but fictitious data. Therefore, after masking, someone looking at the database contents may not even realize that the data has been masked because the names of people look real, their addresses and phone numbers look real, and so on. In practice, it’s not always possible to mask all data with realistic fictitious data and so some data may be masked using generic values such as simple redaction.

The big advantage of static data masking is that it makes the data much safer to use in non-secure environments because the sensitive data no longer exists in the actual database.

The potential down-side is that the procedure for masking the data can present some challenges. First, a copy of the sensitive data must be moved into a secure staging area where it shall be masked. Next, the execution of the actual masking process could take many hours or overnight to complete. Fortunately, DataVeil provides SQL Server users with a CLR assembly that can greatly speed up the process by approximately 10x. This is described further below.

Considering that the concept of static data masking is rather simple it may seem that implementation of such a tool would be rather straight-forward. The reality is that there are a great many challenges to implement this to produce realistic and usable data while handling constraints, dependencies and other data characteristics.

In fact, Microsoft attempted to produce its own static data masking solution and shared its preview release of Static Data Masking for Azure SQL Database and SQL Server in 2018. However, soon after in 2019, Microsoft decided to withdraw the feature citing “We have decided that our current prototype does not meet our customers’ expectations. We will therefore not carry this capability forward.”

In our opinion, Microsoft quickly realized that its simplistic dynamic approaches for SQL Server data masking would not be sufficiently useful in a static masking context. A primary reason is that simple substitutions will often break applications – either due to business rules not considered in the substitution data or that database constraints themselves are too easily violated – such as duplicate values, consistent masking not maintained, range constraints, foreign keys and others. These are not issues in dynamic masking because none of the values stored in the database are changed.  Therefore, to overcome such critical issues of static data masking a comprehensive approach is required. To make the process simple, intuitive and easy to use is a further challenge.

DataVeil for SQL Server Data Masking

The DataVeil data masking software provides a comprehensive static data masking solution for SQL Server. Great care has been taken to make this solution intuitive and easy to use.

DataVeil has many optimizations specifically to maximize performance on SQL Server plus a CLR assembly option to even further boost performance.

DataVeil software supports static data masking in SQL Server 2008, 2012, 2014, 2016, 2019 and 2022. DataVeil Technologies is also committed to provide compatibility with future versions of SQL Server.

Accelerated SQL Server Data Masking with DataVeil CLR Assembly

The default installation of DataVeil literally takes only a minute. In fact, all it takes is to unzip the delivered software ZIP file.

This default installation will use built-in masking functions written in T-SQL that are automatically managed by DataVeil. Even this default installation puts DataVeil’s masking performance amongst the fastest SQL masking tools available.

Additionally, DataVeil provides a CLR assembly whose optional installation gives a further 10x SQL Server data masking performance boost.

Fast data masking in sql server with CLR

Enhanced JSON & XML Masking Support

If JSON or XML masking is important for your requirements, then care should be taken in choosing a masking tool. This is because some tools have limited capabilities that may not be immediately obvious.

For instance, when using a JSONPath or XPath expression to mask multiple instances of a person family name, some tools are unable to differentiate distinct values within a record and will instead replace them all with only a single value.

As an example, suppose that a record represents a list of employees and contains multiple instances of a field called ‘familyName’ with values “Smith”, “Jones” and “Franklin”. Some tools will choose a single replacement value, such as “Baker”, and use this to replace all of these distinct instances. What’s worse is that if these values also occur separately in normal SQL columns then this means they would be masked differently resulting in inconsistent masking.

DataVeil, however, ensures that each instance of the same field name within a record is handled distinctly thereby ensuring masking consistency across the database and even across multiple databases that could even be masked at different times.

Azure Active Directory

In addition to the regular on-premise connectivity options of Windows Authentication and SQL Server, DataVeil supports Azure Active Directory options of Integrated, Password and Multi-Factor Authentication.

Workflow Integration

Users often have an environment where they have a workflow process for creating a masked copy of their database and moving it into a non-secure environment such as their test environment.

Some users may even have a virtualized database environment to provision SQL Server database clones that have a data masker step as part of the workflow.

Typically, the data masker step is just a command-line execution of some bundled or external masking software.

DataVeil provides a command-line interface that makes it easy to integrate DataVeil as the data masker step in such a workflow.

Advanced Masking Features

In addition to typical masking capabilities such as shuffling and replacing values with realistic fictitious values, such as person names and street addresses, DataVeil offers additional advanced masking features, such as:

  • Format preserving masks
  • Partial masking with dynamic ranges
  • Reusable Components and Macros
  • Discovery with built-in and customizable Patterns and Masks

Download Now - Get Started in Minutes

The DataVeil static data masking software has been crafted and refined since 2010 to deliver an elegant solution that's simple to understand, easy to use and comprehensive in its masking capabilities.

DataVeil is optimized for SQL Server data masking.