Ethical & Legal AI Data Collection in 2024: Examples & Policies
Ethics is a crucial part of everyday life, and the same ethical considerations should apply in the tech world.
Disruptive technologies such as AI, ML, the Internet of Things (IoT), and computer vision require all sorts of data to operate. This data often includes biometric data, such as facial images and voice recordings. Collecting and managing such data carries ethical and legal obligations which, if disregarded, can lead to expensive lawsuits.
In this article, we explore data collection ethics and legal practices that business leaders can consider while sourcing/gathering data to develop and deploy data-hungry AI/ML solutions.
How to achieve data collection ethics? (Best practices)
Extensive research has been done on data collection ethics and how to achieve it; however, there is no single solution that guarantees ethical data collection. Ethics is more of a process and a culture that needs to be adopted by all contributors (data collectors, developers, decision-makers, sales, marketing, executives, etc.) in developing and implementing an AI/ML solution.
Specifically for data collectors, this section highlights some best practices they can follow:
Ethics training
Providing sufficient training on data collection ethics helps staff adopt an ethical culture. A best practice for making sure the instructions are followed is an ethics checklist that staff tick off whenever they collect data.
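As an illustration, the checklist idea above could be sketched in code roughly as follows. This is a minimal sketch, not a prescribed tool: the item names in `CHECKLIST_ITEMS` and the `EthicsChecklist` class are hypothetical, chosen only to mirror the practices discussed in this article.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical checklist items, loosely based on the practices in this article.
CHECKLIST_ITEMS = (
    "consent_obtained",
    "request_clearly_worded",
    "purpose_disclosed",
    "opt_out_available",
    "risks_communicated",
)


@dataclass
class EthicsChecklist:
    """Tracks which checklist items a data collector has ticked off."""

    ticked: Set[str] = field(default_factory=set)

    def tick(self, item: str) -> None:
        # Reject typos so that an unknown item is never silently accepted.
        if item not in CHECKLIST_ITEMS:
            raise ValueError(f"Unknown checklist item: {item}")
        self.ticked.add(item)

    def missing(self) -> List[str]:
        # Keep the original checklist order so reports are easy to read.
        return [item for item in CHECKLIST_ITEMS if item not in self.ticked]

    def is_complete(self) -> bool:
        return not self.missing()
```

A collection job could then refuse to start until `is_complete()` returns `True`, and log the result of `missing()` otherwise.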
Consent
Obtaining consent is one of the most important parts of data collection ethics. Consent is part of the agreement between the data owner and the collector, and it should be obtained before the data is collected. For instance, if a smart home device gathers voice data from its user, a notification during app setup should give the user the option to provide consent.
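The "consent before collection" rule can be enforced in software by refusing to store anything unless a consent record already exists. The sketch below assumes a simple in-memory registry; `ConsentRegistry`, `collect_voice_sample`, and the `"voice_recording"` purpose label are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    purpose: str          # hypothetical label, e.g. "voice_recording"
    granted_at: datetime  # when the user gave consent (UTC)


class ConsentRegistry:
    """In-memory store of consent records, keyed by (user, purpose)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ConsentRecord] = {}

    def grant(self, user_id: str, purpose: str) -> ConsentRecord:
        record = ConsentRecord(user_id, purpose, datetime.now(timezone.utc))
        self._records[(user_id, purpose)] = record
        return record

    def has_consent(self, user_id: str, purpose: str) -> bool:
        return (user_id, purpose) in self._records


def collect_voice_sample(
    registry: ConsentRegistry, user_id: str, audio: bytes
) -> Optional[bytes]:
    """Return the audio for storage only if consent was recorded beforehand."""
    if not registry.has_consent(user_id, "voice_recording"):
        return None  # no consent on file: discard the sample
    return audio
```

In a real product the registry would be persistent and auditable; the point here is only that the consent check sits in front of the collection step rather than after it.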
Clarity and understanding
This means that when collectors request user consent, the request should be stated clearly, in easily understandable words. Data collectors should ensure that users fully understand what they are consenting to.
Trust and consistency
This means that ethical and security practices during data collection should be applied consistently to build trust with data providers. For instance, if there are 500 data providers, all 500 of them should be subject to the same ethical considerations.
Awareness and Transparency
The data collection process should be transparent. The data provider should know what data is being collected, who will have access to that data, and how that data will be used.
Additionally, data providers should have control over how their data is used. For instance, if a data provider wants to stop the usage and sharing of their data in the future, they should be able to opt out easily.
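The two ideas above, disclosing what is collected and honoring opt-outs, might look like this in code. It is a rough sketch under stated assumptions: `DataUsageNotice`, `ProviderPreferences`, `usable_records`, and the `provider_id` record key are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class DataUsageNotice:
    """What the provider is told: what is collected, who sees it, and why."""

    data_collected: List[str]
    accessible_to: List[str]
    used_for: List[str]

    def as_text(self) -> str:
        return (
            f"We collect: {', '.join(self.data_collected)}. "
            f"Who has access: {', '.join(self.accessible_to)}. "
            f"Used for: {', '.join(self.used_for)}."
        )


@dataclass
class ProviderPreferences:
    provider_id: str
    opted_out: bool = False

    def opt_out(self) -> None:
        # A single call is enough, so opting out stays easy for the provider.
        self.opted_out = True


def usable_records(
    records: Iterable[Dict[str, str]],
    preferences: Iterable[ProviderPreferences],
) -> List[Dict[str, str]]:
    """Drop every record that belongs to a provider who has opted out."""
    opted_out = {p.provider_id for p in preferences if p.opted_out}
    return [r for r in records if r["provider_id"] not in opted_out]
```

Running the filter at the point of use, for example right before a training job reads the data, means an opt-out takes effect even for data that was collected earlier.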
Risk consideration
Another important point is that the risk of problems occurring in the future can never be completely eliminated. Therefore, the data collector must consider the risk of unforeseen events and prepare a mitigation plan. Additionally, the data collector should communicate this risk to the data provider.
History of Major Data Collection Lawsuits (Case studies)
This section highlights some cases of unethical data collection done in the past:
Unethical facial recognition data collection
In 2019, The Washington Post reported that U.S. Immigration and Customs Enforcement (ICE) had unethically collected facial image data to track the activities of immigrants.
Voice data collection by smart home devices
Similarly, brands that offer smart home devices have come under scrutiny for unethically collecting their users' voice data, which is a form of biometric data.
For instance, Amazon's Alexa is currently (at the time of writing) the subject of a lawsuit for collecting user voice data without consent. The practice was identified in a collaborative study by researchers from the University of Washington and three other institutions, which led to the lawsuit.
Latest Regulations of Data Collection and Protection
This section covers some data collection policies and rules around the world.
- Europe’s General Data Protection Regulation (GDPR) gives people the right to have their personal data erased from the systems it was uploaded to.
- The Children’s Online Privacy Protection Act (COPPA) is in place in the U.S. to protect children’s data. It includes the dos and don’ts of gathering and using children’s data, such as when to take consent from the guardian, where not to use the data, etc.
- The Genetic Information Nondiscrimination Act (GINA) in the U.S. protects people’s genetic data from being used by insurance companies, hospitals, and other organizations that might exploit it.
- The Federal Trade Commission Act (FTC Act) in the U.S. also protects consumer data by prohibiting unfair or deceptive business practices.
- The Data Protection Act 2018 is the UK’s implementation of the GDPR.
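- Data governance in China is regulated by three main laws: the Cybersecurity Law (CSL), the Data Security Law (DSL), and the Personal Information Protection Law (PIPL).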