Checking for GDPR Compliance with SQL

Checking for GDPR Compliance with SQL

Ensuring GDPR compliance is crucial for businesses handling personal data in the EU. Compliance is mostly a process driven activity rather than a technical one, however technical solutions such as SQL scripts can be used to identify some some common compliance issues, for example:

  • Unencrypted Personal Data e.g. names and addresses
  • Data Retention Issues e.g. records that should have been deleted
  • Consent Tracking e.g. missing consent records
  • Access Control e.g. wide access to sensitive columns

Below is SQL script that checks for common GDPR related issues in a SQL Server Database.

It’s important to point out that the database structure will be different for each company so this is just an example SQL script that will need to be adapted to your specific use case.

If you would like this script to be customised for your own company or if you are interested in data consulting services, please reach out to me directly.

-- 1. Check for unencrypted personal data in tables with sensitive fields

-- (Assumes sensitive columns like 'Email', 'Phone', 'SSN' should be encrypted)

SELECT TABLE_NAME

        ,COLUMN_NAME

        ,DATA_TYPE

FROM INFORMATION_SCHEMA.COLUMNS

WHERE COLUMN_NAME IN (

               'Email'

               ,'Phone'

               ,'SSN'

               ,'Address'

               ) -- Add more sensitive fields as needed

        AND DATA_TYPE IN (

               'varchar'

               ,'nvarchar'

               ,'text'

               ) -- Unencrypted data types

ORDER BY TABLE_NAME;

 

-- 2. Check for personal data older than the retention period (e.g., 2 years) without deletion

-- (Assumes you have a 'CreationDate' or similar timestamp to check for old records)

SELECT CustomerID

        ,Name

        ,Email

        ,CreationDate

FROM Customers

WHERE DATEDIFF(YEAR, CreationDate, GETDATE()) > 2 -- Records older than 2 years

ORDER BY CreationDate;

 

-- 3. Check for missing consent records for users with personal data

-- (Assumes a 'ConsentTable' where user consent for data processing is recorded)

SELECT c.CustomerID

        ,c.Name

        ,c.Email

FROM Customers c

LEFT JOIN ConsentTable ct ON c.CustomerID = ct.CustomerID

WHERE ct.ConsentDate IS NULL -- Users without consent records

ORDER BY c.CustomerID;

 

-- 4. Check for unmasked or anonymized sensitive data in logs (e.g., error logs or audit trails)

-- (Sensitive data like SSNs or credit card numbers should be masked in logs)

SELECT LogID

        ,LogMessage

        ,LogDate

FROM ApplicationLogs

WHERE LogMessage LIKE '%SSN%' -- Potentially sensitive data found in logs

        OR LogMessage LIKE '%CreditCard%' -- Check for exposed credit card data

ORDER BY LogDate DESC;

 

-- 5. Check for wide access to tables with sensitive personal data

-- (Checks if tables containing sensitive data are accessible to too many roles)

SELECT t.name AS TableName

        ,p.name AS PrincipalName

        ,dp.permission_name

FROM sys.tables t

JOIN sys.database_permissions dp ON t.object_id = dp.major_id

JOIN sys.database_principals p ON dp.grantee_principal_id = p.principal_id

WHERE t.name IN (

               'Customers'

               ,'Orders'

               ,'Payments'

               ) -- Sensitive tables

ORDER BY t.name

        ,dp.permission_name;

 

-- 6. Check for records flagged for deletion (as per GDPR "right to be forgotten") but still in the database

-- (Assumes there's a field marking records for deletion)

SELECT CustomerID

        ,Name

        ,Email

        ,DeletionRequestDate

FROM Customers

WHERE DeletionRequestDate IS NOT NULL -- Records flagged for deletion

        AND DeletionDate IS NULL -- But still not deleted

ORDER BY DeletionRequestDate;

 

-- 7. Check for excessive data collection (GDPR requires data minimization)

-- (Checks if any non-essential columns like marketing preferences or extra user details are collected)

SELECT COLUMN_NAME

        ,TABLE_NAME

FROM INFORMATION_SCHEMA.COLUMNS

WHERE COLUMN_NAME IN (

               'MarketingPreferences'

               ,'SocialMediaHandles'

               ,'ExtraDetails'

               )

ORDER BY TABLE_NAME;

 

-- 8. Check if any data breach logs or security incidents have been recorded (GDPR requires breach notifications)

-- (Assumes there is a SecurityIncidents table where breaches are logged)

SELECT IncidentID

        ,IncidentDescription

        ,IncidentDate

        ,ResolutionStatus

FROM SecurityIncidents

WHERE IncidentDate >= DATEADD(YEAR, - 1, GETDATE()) -- Last 12 months of incidents

ORDER BY IncidentDate DESC;


A few points about the script:

  • If you don’t know the exact column names you are looking for, you could use the LIKE Operator i.e. using %Email% to match variants such as EmailAddress, CustomerEmail, or WorkEmail etc.
  • Encryption: You might need to integrate encryption checks, which would require a more sophisticated method, such as checking for column encryption (if using SQL Server’s Always Encrypted feature).
  • Anonymisation: If anonymisation or pseudonymisation is used, the query would need to adapt to how this is implemented.
  • Data Access Logs: You could also monitor who has accessed personal data by querying audit logs.
  • Audit and Breach Monitoring: GDPR requires breach notifications, so you should ensure data breaches are logged and reported.

This SQL script can serve as a first line of defence in checking GDPR compliance issues in a SQL Server database. It won't cover everything, but it can identify key areas where the database may need improvement to comply with GDPR standards.

Michael Fitzgerald

Director - Data and Development

7mo

nice read Kieran - thanks for sharing

Frank Daly

IT Consultant / Writer at VisiActive

7mo

Excellent article with useful SQL code snippets

To view or add a comment, sign in

More articles by Kieran Daly

Insights from the community

Others also viewed

Explore topics