AI-driven data innovation and GDPR protection

by Licia Presutti

Artificial intelligence: “The study and design of intelligent agents, where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success”. (Russell and Norvig)

“AI is the new electricity”. (Andrew Ng)

“AI is here to remind us what it is that makes us more human”. (Kai-Fu Lee)

1. Introduction

Artificial intelligence (AI) needs a huge amount of data in order to learn and make decisions; managing data protection is therefore one of the central dilemmas to reflect on.

Data is the mirror of humanity and the data economy is a reality, so it is essential for a company to comply with the General Data Protection Regulation (GDPR).

However, we will face a range of legal and ethical issues in the search for a balance between considerable social advances in the name of AI and fundamental privacy rights.

So, what challenges will AI development raise in the emerging algorithmic economy, specifically with regard to data protection regulation and automated decision-making?

2. What is artificial intelligence?

Firstly, a little bit of context regarding artificial intelligence. John McCarthy, an American computer scientist, coined the term “artificial intelligence” in 1956. Today, this term refers to everything from robotic process automation to actual robotics. Recently, it has gained prominence due to big data. AI can perform tasks such as identifying patterns in data more efficiently than humans, enabling businesses to gain more insight from their data[1].

Artificial intelligence describes a computer-based system that can learn and handle complexity in different situations. Data (mostly personal data) is what enables the system to learn and become efficient.

AI-based systems can become effective, and make better decisions, only if they have enough relevant data to learn from. Artificial intelligence, machine learning, supervised learning, unsupervised learning and deep learning are terms that are often used as synonyms, but they are distinct concepts, so let’s clarify them.

AI: “AI (artificial intelligence) is the simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for using the information), reasoning (using rules to reach approximate or definite conclusions) and self-correction”[2].

Some examples of artificial intelligence are machine vision, natural language processing, robotics, and self-driving cars.

Machine learning: “machine learning is based on algorithms that can learn from data without relying on rules-based programming”[3].

“Machine Learning is the science of getting computers to learn and act like humans do, and improve their learning over time in autonomous fashion, by feeding them data and information in the form of observations and real-world interactions”[4].

Supervised machine learning: “supervised learning is a method used to enable machines to classify objects, problems or situations based on related data fed into the machines”[5].

Unsupervised machine learning: “unsupervised learning is a method used to enable machines to classify both tangible and intangible objects without providing the machines any prior information about the objects”[6].

Deep learning: “deep learning is a collection of algorithms used in machine learning, used to model high-level abstractions in data through the use of model architectures, which are composed of multiple nonlinear transformations. It is part of a broad family of methods used for machine learning that are based on learning representations of data”[7].
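To make the difference between supervised and unsupervised learning more tangible, the following is a minimal Python sketch. The data, the feature names and the two helper algorithms (a nearest-centroid classifier and a naive k-means) are illustrative assumptions, not something taken from the sources quoted above. The supervised function needs human-provided labels; the unsupervised one only ever sees the raw points.

```python
from statistics import mean

# Hypothetical toy data: (monthly_income_k, existing_debt_k) per applicant.
points = [(1.0, 1.2), (1.5, 0.8), (1.2, 1.0), (8.0, 8.5), (8.5, 9.0), (9.0, 8.0)]
# Human-provided labels, used only in the supervised case.
labels = ["low", "low", "low", "high", "high", "high"]


def centroid(group):
    """Average position of a group of 2-D points."""
    return (mean(p[0] for p in group), mean(p[1] for p in group))


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def train_supervised(points, labels):
    """Supervised: learn one centroid per human-provided label."""
    by_label = {}
    for p, lab in zip(points, labels):
        by_label.setdefault(lab, []).append(p)
    return {lab: centroid(group) for lab, group in by_label.items()}


def predict(model, point):
    """Assign the label of the nearest learned centroid."""
    return min(model, key=lambda lab: sq_dist(model[lab], point))


def cluster_unsupervised(points, k=2, iterations=10):
    """Unsupervised: naive k-means groups points without ever seeing a label."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(centroids[i], p))
            groups[nearest].append(p)
        # Keep the old centroid if a group happens to be empty.
        centroids = [centroid(g) if g else centroids[i] for i, g in enumerate(groups)]
    return [min(range(k), key=lambda i: sq_dist(centroids[i], p)) for p in points]


model = train_supervised(points, labels)
print(predict(model, (1.1, 0.9)))    # a named class, because labels were supplied
print(cluster_unsupervised(points))  # anonymous cluster ids only
```

The supervised model answers with a label that a person chose in advance, whereas the clustering returns only group numbers whose meaning nobody has defined. This is one reason why the output of unsupervised methods can be harder to explain to the individual concerned.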

The power of AI lies in identifying complex patterns. This matters in the big data scenario, where the volume, the velocity and the variety of the data are simply too great for a human brain to handle or to find correlations in.

3. AI and Big Data: profiling and automated decision-making under the GDPR

Any discussion of artificial intelligence naturally leads to the General Data Protection Regulation (GDPR). Indeed, the GDPR has had the most impact of any law globally in terms of creating a more regulated data market[8]. It became applicable in May 2018, at a crucial time for the digital economy and for the digital world in general.

Among the challenges faced by data protection regulation in the digital age, the emergence of big data stands out as one of the main issues. Generally speaking, the term refers to the practice of creating and analyzing huge datasets, which include personal data. Consequently, such analysis could pose risks for the individuals whose data is analyzed and stored[9].

The big data age represents a new way to collect and analyze data. Moreover, the big data scenario involves advanced forms of data analysis, usually machine-driven and powered by data mining tools. Thus, the availability of these tools calls for more attention to preventing data protection issues[10]. On the other hand, stringent data protection laws could impede the free flow of data and the benefits that could be derived from data processing[11].

At this point, the main question is: which GDPR provisions are relevant to the development of artificial intelligence in the big data scenario?

With specific regard to automated decision-making, including profiling, the GDPR gives individuals the right not to be subject to a decision based solely on automated processing. Accordingly, the GDPR applies when artificial intelligence is developed with the help of personal data and when it is used to analyze or reach decisions about individuals[12].

In this regard the relevant GDPR provisions are the following:

Art. 4 Definitions.

“Personal data: means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person […]”.

“Profiling: any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person’s performance at work, economic situation, health, personal preferences, interests, reliability, behavior, location or movements […]”.

The relation between artificial intelligence and big data is bi-directional: on the one hand artificial intelligence, through machine learning, needs a huge volume of data (including personal data) to learn from and to execute a better decision-making process. On the other hand, data analysts use artificial intelligence techniques to synthesize information from huge datasets[13].

More specifically, the GDPR affects AI development above all through Article 22.

Article 22 concerns automated decision-making: the ability to make decisions by technological means without human involvement, based on any type of data. Generally speaking, this article prescribes that AI cannot be used as the sole decision-maker in choices that have legal or similarly significant effects on users[14].

Art. 22 Automated individual decision-making, including profiling.

1. The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.

2. Paragraph 1 shall not apply if the decision:
a. is necessary for entering into, or performance of, a contract between the data subject and a data controller;
b. is authorised by Union or Member State law to which the controller is subject and which also lays down suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests; or
c. is based on the data subject’s explicit consent.

3. In the cases referred to in points (a) and (c) of paragraph 2, the data controller shall implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and to contest the decision.

4. Decisions referred to in paragraph 2 shall not be based on special categories of personal data referred to in Article 9(1), unless point (a) or (g) of Article 9(2) applies and suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests are in place.
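The structure of Article 22 can be read as a short decision procedure. The Python sketch below is only an illustration of that structure under simplifying assumptions: the names `Decision`, `Basis`, `SAFEGUARDS` and `assess` are hypothetical, each legal condition is reduced to a yes/no flag that a real assessment would have to establish case by case, and it is not a complete legal test.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Basis(Enum):
    """The three exceptions listed in Art. 22(2)."""
    CONTRACT = "22(2)(a)"
    LAW = "22(2)(b)"
    EXPLICIT_CONSENT = "22(2)(c)"


@dataclass
class Decision:
    solely_automated: bool          # no meaningful human involvement
    significant_effect: bool        # legal or similarly significant effect
    basis: Optional[Basis] = None   # which Art. 22(2) exception is relied on, if any
    uses_special_categories: bool = False  # Art. 9(1) data is used
    art9_exception: bool = False    # Art. 9(2)(a) or (g) applies, with safeguards


# Minimum safeguards named in Art. 22(3).
SAFEGUARDS = ["human intervention", "express point of view", "contest the decision"]


def assess(d: Decision) -> Tuple[bool, List[str]]:
    """Return (permitted, required_safeguards) under a simplified reading of Art. 22."""
    # Paragraph 1 only covers solely automated decisions with significant effects.
    if not (d.solely_automated and d.significant_effect):
        return True, []
    # Paragraph 2: one of the three exceptions must apply.
    if d.basis is None:
        return False, []
    # Paragraph 4: special categories need an Art. 9(2)(a) or (g) exception.
    if d.uses_special_categories and not d.art9_exception:
        return False, []
    # Paragraph 3: safeguards are mandatory for points (a) and (c).
    if d.basis in (Basis.CONTRACT, Basis.EXPLICIT_CONSENT):
        return True, list(SAFEGUARDS)
    # Point (b): suitable measures are laid down by the authorising law itself.
    return True, []


# Hypothetical example: an automated loan decision relied on for a contract.
loan = Decision(solely_automated=True, significant_effect=True, basis=Basis.CONTRACT)
print(assess(loan))
```

In this reading, a fully automated decision with significant effects is allowed only through one of the three exceptions, and for contract and consent the controller must still offer the data subject human intervention and a way to contest the outcome.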

So, Article 22 of the GDPR sets forth a specific legal rule governing decision-making processes that are both fully automated and substantially impact individuals, such as credit applications or recruiting. Generally speaking, the regulation gives the individual the right not to be subjected to these processes without human intervention[15].

In this way, Article 22 directly affects big data systems by prohibiting fully automated analysis of many datasets. From this point of view, requiring human intervention could encumber the automated process and slow down innovative technologies[16].

As Zarsky argues, Article 22 could reflect a deep distrust of the automated processes behind artificial intelligence. Thus, firms may be required to change their technology architectures and even business models, opting for less efficient practices that comply with this rule[17].

The main issue to reflect on is paragraph 3, under which the data controller shall implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and to contest the decision. Indeed, the most direct restriction in the GDPR that specifically targets the use of AI is the requirement of human intervention in certain algorithmic decisions. Article 22’s right to human intervention, together with the expectation that the logic behind a decision can be explained, requires AI decisions to be explainable.

When people talk about computers learning to “teach themselves”, rather than us having to teach them (one of the principles of machine learning), they are often alluding to unsupervised learning processes[18]. While a supervised model of learning uses labeled sets of data to develop algorithms, supplemented by human oversight, unsupervised models allow AI to evolve on its own. With unsupervised models, it may not be possible to trace the AI’s learning processes or to explain its decisions, due to a lack of data labels and relationships. Moreover, even supervised models may be too hard to explain, which would impair one of AI’s most useful purposes: automated decisions and forecasts. As a result, the GDPR’s extensive protection of data privacy rights could, in some ways, restrain the use of AI’s most useful features: autonomy and automation[19].
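What an explainable decision with human intervention might look like in practice can be sketched in a few lines of Python. Everything in the example is an assumption made for illustration: the weights, the threshold, the feature names and the policy of sending every refusal to a human reviewer are hypothetical, and a transparent linear score is deliberately chosen because its logic is easy to state, which is not true of many real models.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical weights of a transparent, linear credit-scoring model.
WEIGHTS = {"income_k": 0.5, "years_employed": 0.25, "missed_payments": -1.5}
THRESHOLD = 2.0  # hypothetical approval threshold


@dataclass
class Outcome:
    approved: bool
    score: float
    explanation: List[Tuple[str, float]]  # per-feature contribution, largest impact first
    needs_human_review: bool


def decide(applicant: Dict[str, float]) -> Outcome:
    """Score an applicant and record why the score came out as it did."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = sum(contributions.values())
    approved = score >= THRESHOLD
    # The explanation is simply each feature's share of the score.
    explanation = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    # Assumed policy: a refusal is not final until a human has reviewed it.
    return Outcome(approved, score, explanation, needs_human_review=not approved)


outcome = decide({"income_k": 3.0, "years_employed": 2.0, "missed_payments": 2.0})
print(outcome.approved, outcome.needs_human_review)
for feature, contribution in outcome.explanation:
    print(feature, contribution)
```

With a model of this kind the controller can tell the applicant which factors weighed most heavily, and the reviewer has something concrete to check. The difficulty described above arises when the model is too complex for its decisions to be decomposed in this way.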

4. Conclusion

Effective privacy regulation is key to allowing technologies like artificial intelligence to help solve the world’s greatest challenges. The combination of advances in computing power, memory and analytics creates the possibility that technology can drive innovation in precision medicine, disease detection, driving assistance, increased productivity, workplace safety, education and more. At the same time, it is crucial to recognize the need for a legal system that prevents harmful uses of technology and safeguards personal information, so that new AI-driven technologies can be embraced[20]. Privacy is a fundamental human right, and effective privacy protection is crucial to allow individuals to trust technology and participate in society. Indeed, it is essential to find a balance between the ability to engage in big data analysis to its full extent (and consequently to pursue ethical AI development) and the protection of privacy interests and rights[21].

In conclusion, it is fundamental to be aware that a failure in the field of artificial intelligence development could imply a second-tier status in the emerging algorithmic economy and in the next wave of social evolution. In addition, due to data protection enforcement, many firms could be discouraged from offering their AI-driven services in the EU, thus leaving European consumers and businesses unable to access beneficial services that are available to their counterparts and competitors elsewhere and making the EU market for AI less competitive and innovative[22].

[1] “AI (artificial intelligence)”, available at https://searchenterpriseai.techtarget.com/definition/AI-Artificial-Intelligence;

[2] “AI (artificial intelligence)”, available at https://searchenterpriseai.techtarget.com/definition/AI-Artificial-Intelligence;

[3] “An executive’s guide to machine learning”, (McKinsey & Co), available at https://www.mckinsey.com/industries/high-tech/our-insights/an-executives-guide-to-machine-learning;

[4] “What is machine learning?”, available at https://www.techemergence.com/what-is-machine-learning/;

[5] “Supervised Learning”, available at https://www.techopedia.com/definition/30389/supervised-learning;

[6] “Unsupervised Learning”, available at https://www.techopedia.com/definition/30390/unsupervised-learning;

[7] “Deep learning”, available at https://www.techopedia.com/definition/30325/deep-learning;

[8] “GDPR and AI: Friends, foes or something in between?”, available at https://www.sas.com/en_us/insights/articles/data-management/gdpr-and-ai–friends–foes-or-something-in-between-.html;

[9] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[10] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[11] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[12] “AI (artificial intelligence)”, available at https://searchenterpriseai.techtarget.com/definition/AI-Artificial-Intelligence;

[13] “Artificial Intelligence, robotics, privacy and data protection”, available at https://edps.europa.eu/sites/edp/files/publication/16-10-19_marrakesh_ai_paper_en.pdf;

[14] “GDPR panic may spur data and AI innovation”, available at https://techcrunch.com/2018/06/07/gdpr-panic-may-spur-data-and-ai-innovation/;

[15] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[16] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[17] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[18] “Supervised V Unsupervised Machine Learning — What’s The Difference?”, available at https://www.forbes.com/sites/bernardmarr/2017/03/16/supervised-v-unsupervised-machine-learning-whats-the-difference/#747839485d85;

[19] “Supervised V Unsupervised Machine Learning — What’s The Difference?”, available at https://www.forbes.com/sites/bernardmarr/2017/03/16/supervised-v-unsupervised-machine-learning-whats-the-difference/#747839485d85;

[20] “Intel Privacy Proposal Aims at ‘Ethical’ Data Use, AI Development”, available at https://www.meritalk.com/articles/intel-privacy-proposal-aims-at-ethical-data-use-ai-development/;

[21] “Incompatible: The GDPR in the Age of Big Data”, (Tal Z. Zarsky), available at https://scholarship.shu.edu/cgi/viewcontent.cgi?referer=http://scholar.google.it/&httpsredir=1&article=1606&context=shlr;

[22] “The Impact of the EU’s New Data Protection Regulation on AI”, available at http://www2.datainnovation.org/2018-impact-gdpr-ai.pdf.
