Public Data, Open Source Data, and GDPR Compliance
A deeper examination of the intricacies of data privacy regulations, particularly within the scope of the General Data Protection Regulation (GDPR), reveals the importance of distinguishing between public data and open-source data, along with their respective implications for both legal professionals and IT specialists. Legal professionals, especially in-house counsel, must often adopt a practical approach to theoretical problems to best assist the organisations they serve. Hence, the necessity of debating "open-source data" in relation to GDPR compliance arises.
Open Source Data
We are all familiar with the concept of open-source data. Open-source data, which does not involve personal data, can be immensely valuable across various fields. Consider the following points regarding open-source data:
Open-source data often refers to datasets that are openly licensed, allowing anyone to freely access, use, modify, and share the data. This is closely related to the philosophy behind open-source software, which aims to promote transparency, innovation, and community collaboration. Further definitions can be found on, inter alia, opensource.org.
Accessibility: Open-source data removes barriers to entry, enabling researchers, developers, and organisations to leverage the data without significant restrictions.
Collaboration: Making data freely available fosters a collaborative environment where individuals and organisations can work together to improve and innovate.
Transparency: Open-source data promotes transparency and trust, as users can scrutinise and verify the data and methodologies used.
Government Data: Many governments provide open data portals where datasets related to demographics, economics, and public services are made freely available. An example is datavejviser.dk.
Scientific Research: Open access to sources of various scientific studies and datasets enables peer review and further research.
Correlation between Personal Data (or PII) and Public Data in a Legal Perspective
Public data, in this context, refers to any information intentionally made available to the public by the data subject or accessible due to its public nature.
Under the GDPR, processing public data, even within special categories of personal data, requires careful consideration. Article 9 of the GDPR explicitly outlines the conditions under which special categories of data may be processed, one of which is when the data has been "manifestly made public" by the data subject (cf. Article 9(2)(e)). This exemption is specific to Article 9; it does not apply to Article 6 of the GDPR.
It is widely accepted that publicly available personal data does not grant carte blanche to processors. The principles in Article 5 still apply, as well as recital 39. Some principles to follow include:
I would argue that the principles of accuracy, and integrity and confidentiality should not have the same level of protection as the above-mentioned when data has manifestly been made public or is part of a public dataset. This is due to the nature of public data and the ethos of open-source. Some examples will be provided later in this post.
Furthermore, the extent to which the rights of the data subject in Articles 12-22 apply is debatable when the data has been made public.
The GDPR has not fully addressed the considerations of data made publicly available and the "free" processing of these types of data. The only mention is in Article 9. This means many organisations struggle with the processing of such data, risk analysis, and related issues.
All of the above applies solely to Article 9. The teleological interpretation favoured by the CJEU still applies, although an expanded interpretation covering Article 6 would be beneficial (but is not likely).
A Debatable Point of View
The debate around public and open-source data often centres on the balance between accessibility and privacy. While public data provides valuable resources for innovation and transparency, it also poses risks if not managed properly. Legal professionals must navigate these complexities, ensuring data is used responsibly and in compliance with regulations.
An argument I would pose is that all data made publicly available should be considered "open-source" data to some extent, although the principles found in Article 5 should still apply.
For example, if my small company, Nakai Consulting, were to process data on a public-facing website operated by a municipality, I would argue that the personally identifiable data the municipality made public on its website would be "free" for me to process without significant risks, assuming we entered into a contract. This could include names of employees, business email addresses, work phone numbers, etc.
I should mention that my company does not engage in such activities; this is merely an illustrative example.
For IT specialists and knowledgeable legal professionals, the challenge lies in implementing robust systems and solutions that can handle open-source data while maintaining stringent data protection standards. This includes leveraging technologies such as encryption, pseudonymisation, and secure data-sharing protocols. As mentioned before, taking a practical approach to solving a theoretical problem is essential. IT specialists may find it particularly confusing that publicly available data on a website carries such weight in privacy law, and that its transfer and associated risks are subject to debate, when open-source data has been around in the IT world for a very long time.
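To make the mention of pseudonymisation a little more concrete, here is a minimal sketch in Python of one common approach: replacing a direct identifier with a keyed hash (HMAC-SHA256). The record, the field names, and the key below are purely hypothetical and chosen for illustration; a real setup would need proper key management and a risk assessment of its own. Note also that pseudonymised data is still personal data under the GDPR as long as it can be attributed to a person with additional information, so this reduces risk rather than removing the data from scope.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256) of a normalised identifier.

    The same input and key always give the same pseudonym, so records
    can still be linked without storing the identifier itself.
    """
    normalised = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


# Hypothetical record, as it might appear on a public-facing website.
record = {
    "name": "Jane Doe",
    "email": "Jane.Doe@example.org",
    "department": "Planning",
}

# Assumption: in practice the key is stored separately and securely,
# never hard-coded next to the data like this.
key = b"replace-with-a-securely-stored-key"

# Keep only what the purpose requires, plus a pseudonym instead of the email.
pseudonymised = {
    "person_id": pseudonymise(record["email"], key),
    "department": record["department"],
}
```

The design choice here is a keyed hash rather than a plain hash: without the key, a third party cannot simply hash a list of known email addresses and match them against the pseudonyms.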
Conclusion
As we move towards a more data-driven world, the interplay between public data, open-source data, and GDPR compliance will remain a critical area for both the legal and IT sectors. The reassessment of the GDPR, expected to occur soon, should address the concept of "open-source data" in terms of data privacy and apply it to, inter alia, Article 6, while still limiting data mining and data scraping on websites. Hopefully, it will adopt some of the ideas from the IT interpretation of open source, making life easier for many people, both in terms of processing and when considering which types of data to make publicly available.
I firmly believe that the processing of data under the GDPR (both personal and non-personal) should adhere to some of the principles in Article 5 and should also be guided by ethical-use considerations. This involves respecting the privacy and intentions of data subjects, even if the data is freely accessible.
I personally have not seen much research into this topic, so feel free to share your thoughts and insights.