A report from the Center for Democracy and Technology (CDT) titled “Navigating Demographic Measurement for Fairness and Equity” examines the growing expectation that developers and users of AI systems proactively identify and address potential bias or discrimination. The report emphasizes the importance of demographic data in measuring fairness and bias within these systems, offering methodologies, guidance, and case studies for those undertaking fairness and equity assessments.
The report guides practitioners through the initial phases of a demographic measurement effort: determining the relevant lens of analysis, selecting which demographic characteristics to consider, and deciding how to focus on relevant sub-communities. It then delves into several approaches to uncovering demographic patterns.
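In practice, the chosen lens determines how results are disaggregated. As a minimal sketch of what that means (not a method prescribed by the report), the function below computes a per-group selection rate over toy records; the field names (`group`, `hired`) and the data are hypothetical.

```python
from collections import defaultdict

def selection_rates(records, group_key, outcome_key):
    """Disaggregate a binary outcome along a chosen demographic lens."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[outcome_key]:
            positives[group] += 1
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical toy data: which applicants were selected, by group.
applicants = [
    {"group": "a", "hired": True},
    {"group": "a", "hired": False},
    {"group": "b", "hired": True},
    {"group": "b", "hired": True},
]
rates = selection_rates(applicants, "group", "hired")  # {"a": 0.5, "b": 1.0}
```

Swapping `group_key` for a different characteristic, or intersecting several, is the code-level analogue of choosing a different lens of analysis.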
The report highlights a variety of methods for measuring demographic characteristics, including direct collection from individuals, observation and inference, proxies and surrogate characteristics, auxiliary datasets, cohort discovery, keywords and terms, observation and labeling of content, synthetic data, exploratory analysis, and qualitative research.
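A widely known instance of the proxy approach is Bayesian Improved Surname Geocoding (BISG), which combines surname-based and geography-based signals via Bayes' rule. The sketch below shows only the combination step under the standard conditional-independence assumption; the probability tables are invented for illustration, not real census data.

```python
def combine_proxies(p_group_given_surname, p_geo_given_group):
    """BISG-style update: the posterior over groups is proportional to
    P(group | surname) * P(geography | group), assuming the two signals
    are conditionally independent given group."""
    unnormalized = {
        g: p_group_given_surname[g] * p_geo_given_group[g]
        for g in p_group_given_surname
    }
    total = sum(unnormalized.values())
    return {g: weight / total for g, weight in unnormalized.items()}

# Invented numbers: the surname suggests group_a, the neighborhood group_b.
surname_signal = {"group_a": 0.7, "group_b": 0.3}
geo_signal = {"group_a": 0.2, "group_b": 0.8}
posterior = combine_proxies(surname_signal, geo_signal)
```

Such estimates are probabilistic and individually unreliable, which is one reason the report pairs measurement methods with the responsible-handling practices discussed below.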
Recognizing the history of demographic data being misused against vulnerable communities, the report stresses that handling such data responsibly is as crucial as the measurement methods themselves. It suggests mixing and matching the described methods to strengthen protections against potential harms while still enabling critical work.
The report also outlines approaches for responsibly handling demographic data during fairness measurement, including pseudonymization, infrastructure controls, encryption, retention limits and ephemerality, aggregation, differential privacy, secure multi-party computation, user controls, organizational oversight, separate teams, and privacy impact assessments.
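To make one of these concrete: differential privacy for a simple counting query can be achieved with the standard Laplace mechanism, which adds noise scaled to the query's sensitivity. The sketch below is a minimal illustration (a count has L1 sensitivity 1); the data and epsilon value are hypothetical, and production use would rely on a vetted DP library rather than hand-rolled sampling.

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) via the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon):
    """Release a count satisfying epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1
    (L1 sensitivity 1), so Laplace noise with scale 1/epsilon
    suffices under the Laplace mechanism.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical aggregate: a noisy count of flagged records.
data = [{"flagged": i % 3 == 0} for i in range(300)]
noisy = dp_count(data, lambda r: r["flagged"], epsilon=1.0)
```

Smaller epsilon values add more noise and give stronger privacy; the right trade-off depends on the oversight and aggregation choices the report describes alongside it.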
In its recommendations section, the report urges practitioners to establish ongoing relationships with affected communities by co-designing data collection strategies, discussing potential risks and benefits, and defining fairness goals collaboratively. It also encourages government agencies to recognize that companies have workable approaches for identifying disparities even without comprehensive demographic data collection.
While there is no one-size-fits-all solution for assessing AI systems for fairness, and nothing here justifies indiscriminate data collection, the report makes clear that a lack of direct access to raw demographic data should not be treated as an insurmountable barrier. It emphasizes the need for practitioners to engage early and often with impacted communities, clearly document and communicate their practices, and embed strong technical and institutional safeguards to prevent misuse.