Data masking in PL/SQL

Pranav Bakare - Sep 28 - - Dev Community

Certainly! Here’s a detailed explanation of data masking in PL/SQL, along with a simple example that illustrates the concept.

What is Data Masking?

Data masking is a technique used to protect sensitive information in databases by replacing it with fictitious data. The goal is to maintain the usability of the data for development, testing, or reporting purposes while safeguarding sensitive information such as personally identifiable information (PII), financial details, etc.

Why Use Data Masking?

Data Privacy: Protect sensitive information from unauthorized access.

Compliance: Meet regulatory requirements (e.g., GDPR, HIPAA).

Testing & Development: Allow developers and testers to work with realistic data without exposing real sensitive data.

Types of Data Masking

  1. Static Data Masking (SDM): Data is masked in a non-production environment (e.g., development or testing).

  2. Dynamic Data Masking (DDM): Data is masked in real-time based on user roles or access permissions.

Common Masking Techniques

Substitution: Replacing actual data with random or fictitious data.

Shuffling: Mixing up values within a column to retain the same data type.

Nulling Out: Replacing data with NULL values.

Encryption: Using algorithms to protect data.

Example Scenario

Let's assume we have an employees table that contains sensitive information like Social Security Numbers (SSNs) and email addresses. We will create a procedure to mask these details.

  1. Create the Employees Table

CREATE TABLE employees (
employee_id NUMBER PRIMARY KEY,
first_name VARCHAR2(50),
last_name VARCHAR2(50),
ssn VARCHAR2(11), -- Format: XXX-XX-XXXX
email VARCHAR2(100)
);

-- Insert sample data
INSERT INTO employees (employee_id, first_name, last_name, ssn, email) VALUES (1, 'John', 'Doe', '123-45-6789', 'john.doe@example.com');
INSERT INTO employees (employee_id, first_name, last_name, ssn, email) VALUES (2, 'Jane', 'Smith', '987-65-4321', 'jane.smith@example.com');

COMMIT;

  1. Create a Masking Procedure

This procedure will mask the SSNs by replacing the first five digits with 'X' and will change the domain of the email addresses to 'example.com'.

CREATE OR REPLACE PROCEDURE mask_sensitive_data AS
BEGIN
-- Mask Social Security Numbers (SSNs)
UPDATE employees
SET ssn = 'XXX-XX-' || SUBSTR(ssn, 8, 4) -- Mask the first five digits
WHERE ssn IS NOT NULL;

-- Mask Email Addresses
UPDATE employees
SET email = SUBSTR(email, 1, INSTR(email, '@') - 1) || '@example.com'
WHERE email IS NOT NULL;

-- Commit changes
COMMIT;
Enter fullscreen mode Exit fullscreen mode

END mask_sensitive_data;

  1. Execute the Masking Procedure

To apply the masking, you would execute the procedure as follows:

BEGIN
mask_sensitive_data;
END;

  1. Check the Masked Data

After executing the procedure, you can query the employees table to see the results:

SELECT * FROM employees;

Expected Output:

employee_id first_name last_name ssn email
1 John Doe XXX-XX-6789 john.doe@example.com
2 Jane Smith XXX-XX-4321 jane.smith@example.com

Explanation of the Masking Logic

SSN Masking: The SSNs are transformed from 123-45-6789 to XXX-XX-6789, where the first five digits are masked.

Email Masking: The email addresses change from john.doe@example.com to john.doe@example.com, but with the domain replaced to example.com.

Best Practices

Test the Masking Process: Ensure the masking logic does not inadvertently expose sensitive information.

Backup Data: Always back up original data before applying masking.

Compliance: Adhere to relevant data protection regulations and company policies when masking data.

Dynamic Data Masking Example

In scenarios where you need to apply masking based on user roles, you can create a view. Here's how to create a view that dynamically masks data for non-admin users:

CREATE OR REPLACE VIEW masked_employees AS
SELECT
employee_id,
first_name,
last_name,
CASE
WHEN user_role = 'admin' THEN ssn
ELSE 'XXX-XX-' || SUBSTR(ssn, 8, 4) -- Mask for non-admin users
END AS ssn,
CASE
WHEN user_role = 'admin' THEN email
ELSE SUBSTR(email, 1, INSTR(email, '@') - 1) || '@example.com'
END AS email
FROM
employees;

In this view:

If the user has the role 'admin', they can see the actual SSN and email.

For all other users, the sensitive data is masked.

Conclusion

Data masking in PL/SQL helps to protect sensitive information while allowing the use of data for testing and development. The methods shown can be tailored to fit specific needs, and by implementing these techniques, organizations can enhance their data privacy measures effectively.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .