How to Handle People’s Data Ethically

Human Resources

Despite legislative transformation, the issue of data handling is far from resolved. Since the GDPR (General Data Protection Regulation) came into effect in 2018, the EU has collected over €3 billion in fines from companies that have broken its rules. And with the AI industry ramping up, the question of ethical data handling is more pressing than ever, say Dominique Rouziès, Professor of Marketing, and Michael Segalla, Professor of Management, both at HEC Paris.

[Cover image: © peshkov]

A double-edged sword

Since word got out that ‘data is the new oil’, companies have been enthusiastically collecting as much of it as possible. Data helps businesses make predictions, assess risk, understand their customers, and evolve their products and services. It’s precious and powerful, but collecting it by the barrel comes with drawbacks, not just for firms but for us all.

When handling data provided by humans, firms must be aware not only of data security and the risk of breaches, but also of the harms that can result from how the data is used.

AI – raising the stakes

In the era of ChatGPT, GPT-4, Google’s Bard and other AI systems, how we treat sensitive information takes on a new significance. AI-enabled risk management has already led to claims of discrimination, where algorithms trained on biased material negatively skew outcomes for minority groups and women.

AI technologies can build complex psychological profiles from a few scraps of personal data; even social media ‘likes’ can reliably predict personality traits and political persuasions. The algorithms developed by AI tools can also compromise privacy through what is overlooked: the omitted-variable bias. An AI system works according to what it is trained on, and if that material is missing a vital element, mistakes are easily made, as in the well-known case of Target’s pregnancy-prediction algorithm, which neglected to exclude minors and sent marketing materials that revealed a teenager’s pregnancy to her parents.
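The omitted-variable problem is easy to demonstrate with a toy regression. In this sketch (the numbers and variable names are invented for illustration, not drawn from the study), an outcome driven entirely by a hidden variable appears to be strongly caused by a mere proxy:

```python
from statistics import mean

# Toy illustration of omitted-variable bias: y is driven entirely by a
# hidden variable z, but because x moves together with z, a naive
# regression of y on x reports a strong "effect" of x.
z = [1, 2, 3, 4, 5]            # the omitted variable (e.g. household income)
x = [1.1, 2.0, 2.9, 4.2, 5.0]  # a proxy correlated with z (e.g. ad clicks)
y = [2 * zi for zi in z]       # outcome caused by z alone

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var = sum((a - mx) ** 2 for a in xs)
    return cov / var

biased = slope(x, y)  # close to 2, even though x has no causal effect on y
```

Leaving z out of the analysis makes x look causal; the same mechanism lets a model trained on incomplete material draw confident but wrong conclusions.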

 


AIs are designed to benefit society, and they do so in all kinds of ways. But whatever their intended purpose, these technologies use human-created data as their fuel, and we need to manage which kinds of information they run on. We can do this in part by making sure sensitive data and other personal information is collected, processed, and stored responsibly.

The five ‘Ps’ of Ethical Data Handling

We developed the five Ps model to help guide companies collecting human data or making use of existing databases. 

•    Provenance
Firms that collect human data must pay special attention to where the data comes from, who provided and collected it, and whether it was obtained with consent, free of coercion or subterfuge. 

This applies not only to new data collection, but retroactively: many firms have stores of ‘dark data’ collected from customers in the past. It is typically unstructured data, such as visitor logs, social media comments, and uploaded media, that went unused at the time but is now being exploited.

•    Purpose

It goes without saying that the reasons for collecting data must be ethically sound. It is also important that the people who consented to data collection know how, and for what, their data is used. Having collected personal data, businesses often reuse it for purposes different from those originally intended. It is important to consider whether the person who provided their data would agree to its use for another project.

In the case of customer data, repurposing of personal data has become its own industry. Many companies routinely sell their first-party data as a product, some of them using it as a primary revenue source. However, this practice is becoming less acceptable, and has led to fines and sanctions for some organizations. 

If the purpose of data collection changes, or the company finds a new use for existing data, they must consider whether consent should be obtained again. 

•    Protection

Personal information can be exposed by data breaches; even the largest and most sophisticated organizations experience hacks and leaks. In the USA, organizations reported some 2,000 data breaches in 2021; breaches are not uncommon even in more strictly regulated EU markets. 

For many businesses, data security is outsourced to specialist firms. But this is no watertight guarantee, particularly if the provider is linked to commercial or political entities that may internally gain access to the data. 

It’s important for firms to determine how they will protect the personal data they collect, who can access it, how it will be anonymized, and when it will be destroyed. 
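One common technique for the “who can access it, how it will be anonymized” question is pseudonymization with a keyed hash. This is a general-purpose sketch, not a method prescribed by the authors: analysts can link records belonging to the same person without seeing direct identifiers, and destroying the key later effectively anonymizes the data.

```python
import hashlib
import hmac
import os

# Sketch of pseudonymization with a keyed hash. The key must be stored
# separately from the data (e.g. in a secrets vault); whoever holds it
# can re-link tokens to people, so access to it should be tightly
# controlled and the key destroyed when the data's retention period ends.
SECRET_KEY = os.urandom(32)

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier (e.g. an email) with a stable token."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"email": "jane@example.com", "purchase": "book"}
safe_record = {"user_id": pseudonymize(record["email"]),
               "purchase": record["purchase"]}
```

Because the same identifier always maps to the same token, analysts can still count repeat purchases per user; deleting the key severs the link back to real identities.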

•    Privacy

Companies must strike a balance between holding data securely and ensuring it’s still usable for their purposes. For example, anonymization helps protect data providers, but if over-applied, it can make the data useless for marketing. 

Data can be aggregated, approximated, subtly altered or pseudonymized. When cross-referenced with other information, though, individuals can still be identified from just a few details, as various high-profile examples have shown. Netflix published an anonymized data set with 100 million records of customers’ movie ratings, challenging data scientists to use it to create a new recommendation algorithm. Researchers were able to identify 84% of the individuals in Netflix’s ‘anonymized’ dataset by cross-referencing it with another one from the movie-ranking site IMDb.
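The cross-referencing risk can be sketched in a few lines. The records below are invented, but the join on shared quasi-identifiers mirrors, in miniature, what the Netflix/IMDb researchers did at scale:

```python
# Invented example records: a 'de-identified' dataset still carries
# quasi-identifiers (here zip code and birth year) that also appear,
# with names attached, in a separate public dataset.
anonymized_ratings = [
    {"user": "u1", "zip": "75008", "birth_year": 1990, "rating": 5},
    {"user": "u2", "zip": "69002", "birth_year": 1985, "rating": 3},
]
public_profiles = [
    {"name": "Alice Martin", "zip": "75008", "birth_year": 1990},
]

def reidentify(anon, public):
    """Join the two datasets on (zip, birth_year) to recover names."""
    index = {(p["zip"], p["birth_year"]): p["name"] for p in public}
    return {r["user"]: index[key]
            for r in anon
            if (key := (r["zip"], r["birth_year"])) in index}

matches = reidentify(anonymized_ratings, public_profiles)
# matches == {"u1": "Alice Martin"}
```

Stripping names was not enough: any column combination that is rare in the population acts as a fingerprint once a second dataset supplies the link.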

Another privacy pitfall is geolocation, used to provide location-based recommendations and map services, among other applications. Geolocation can tie an individual’s IP address to their physical address, making their homes easy to find. It can also mistakenly link them to a nearby building or organization that has nothing to do with them. This could have unintended consequences for the individual, who has little recourse for correcting the errors.  

•    Preparation

Data is often imperfect, inconsistent, and incomplete. It might appear in multiple languages, contain typos, or vary in format. Data cleaning is a necessary step in making data useful to analysts, but when it comes in large volumes, analysts must rely on software rather than humans to do the job. This opens the door to programming errors that can introduce wildly inaccurate figures, skewing the results unless researchers catch and correct the problem.
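A minimal defensive pattern (our sketch, with made-up values): parse automatically, but flag anything unparseable or outside a plausible range for human review, rather than letting a stray value skew the results silently.

```python
# Hypothetical messy input: mixed whitespace, a spelled-out number,
# and a likely data-entry error ("271" for an age).
raw_ages = ["34", " 27 ", "thirty", "271", "45"]

def clean_ages(values, lo=0, hi=120):
    """Parse age strings; separate clean values from ones needing review."""
    cleaned, flagged = [], []
    for v in values:
        try:
            age = int(v.strip())
        except ValueError:
            flagged.append(v)   # unparseable, needs human review
            continue
        if lo <= age <= hi:
            cleaned.append(age)
        else:
            flagged.append(v)   # parsed, but implausible (e.g. "271")
    return cleaned, flagged

cleaned, flagged = clean_ages(raw_ages)
# cleaned == [34, 27, 45]; flagged == ["thirty", "271"]
```

The plausibility bounds are themselves an assumption the team must document; the point is that rejected values surface for review instead of entering the analysis.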
 
 

Applications

For businesses, responsible data handling is a priority not just for ethical reasons, but to limit the impact of fines, brand damage, and future legislative complications. Acting responsibly may not generate revenue, but it is a critical part of long-term business sustainability. To comply with GDPR, firms must appoint a data controller, but this is typically a mid- or senior-level compliance manager who lacks the necessary expertise in AI and ethics. Instead, we advise businesses to take a cue from academia, where an Institutional Review Board (IRB) versed in ethics supervises the collection of human data. Some large companies have already set up IRBs, often including a specialist in digital ethics.

Methodology

The reflections in this article are based on a number of studies, workshops, webinars and projects conducted by professors of HEC Paris, and on an interview with Professors Rouziès and Segalla about their article, ‘The Ethics of Managing People’s Data’, published in the Harvard Business Review, July–August 2023, pp. 86-94.
