Lina M. Khan, Chair of the Federal Trade Commission | Official website

FTC warns companies: Hashing does not ensure anonymity

The Federal Trade Commission (FTC) routinely evaluates the privacy representations companies make against their data handling practices. When discrepancies arise between claims and reality, incorrect assertions about data identification are often to blame. Companies frequently claim that data lacking clearly identifying information is anonymous, but true anonymity means data can never be associated back to a person. If data can uniquely identify or target a user, it can still cause harm.

One method companies use to obscure personal data is "hashing." Hashing converts a piece of data, such as an email address, phone number, or user ID, into a fixed-length value (called a hash) in a deterministic way: the same input always produces the same hash. For example, hashing the fictional phone number "123-456-7890" transforms it into the hash "2813448ce6316cb70b38fa29c8c64130." This hexadecimal number appears random, but it is the result every time that phone number is hashed.
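The consistency described above can be sketched in a few lines of Python. The article does not name the hash function behind its example; MD5 is assumed here only because it produces 32 hexadecimal characters, the same length as the hash quoted above. The helper name `hash_identifier` is hypothetical, and the sketch does not claim to reproduce the article's exact value.

```python
import hashlib


def hash_identifier(value: str) -> str:
    """Return the hex digest of an identifier (MD5 is an assumption here)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


first = hash_identifier("123-456-7890")
second = hash_identifier("123-456-7890")
other = hash_identifier("123-456-7891")

print(first == second)  # the same input always yields the same hash
print(first == other)   # a one-digit change yields a different hash
print(len(first))       # length of the hex digest
```

Because the output depends only on the input, anyone who hashes the same phone number with the same function gets the same value.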

Hashing has potential benefits: a hash by itself cannot easily reveal the original data. Consequently, companies often use hashing when they prefer not to store or share directly identifying information but still want to retain the ability to match data later. Since the hash “2813448ce6316cb70b38fa29c8c64130” seems meaningless and cannot easily be traced back to the original phone number, companies often claim that hashing preserves user privacy.
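The "match data later" property can be illustrated with a short sketch. Everything in it is hypothetical: the two record sets, the `hash_email` helper, the choice of SHA-256, and the lowercase normalization step are assumptions for illustration, not a description of any company's system.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an email address and return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical retailer data, keyed by hashed email instead of the address.
retailer_records = {
    hash_email("alice@example.com"): "viewed running shoes",
    hash_email("bob@example.com"): "viewed headphones",
}

# Hypothetical ad platform data, hashed independently with the same function.
ad_platform_records = {
    hash_email("Alice@Example.com "): "profile 1001",
    hash_email("carol@example.com"): "profile 1002",
}

# Joining on the hash links the two records for the same person,
# even though neither data set contains the email address itself.
matches = {
    h: (retailer_records[h], ad_platform_records[h])
    for h in retailer_records.keys() & ad_platform_records.keys()
}
print(matches)
```

The join succeeds precisely because the hash is still unique to the person, which is the property that makes it an identifier.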

However, this logic is flawed—hashes are not "anonymous" and can still identify users, leading to potential harm. Companies should not claim that hashing personal information renders it anonymized. FTC staff remain vigilant in ensuring compliance with laws and taking action against deceptive privacy claims.

In 2012, then-FTC Chief Technologist Ed Felten wrote a blog post titled “Does Hashing Make Data ‘Anonymous’?” He concluded that it does not. While hashing might obscure how a user identifier appears, it still creates a unique signature capable of tracking an individual over time. The warning was clear: do not rely on hashing to reduce data sensitivity.

Some companies have failed to heed this warning. In 2015, the FTC brought a case against Nomi for allegedly surveilling consumers within stores using their MAC addresses—a device identifier for network connections. The complaint stated, “Nomi cryptographically hashes the MAC addresses it observes prior to storing them on its servers. Hashing obfuscates the MAC address, but the result is still a persistent unique identifier.”
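A minimal sketch shows why a hashed MAC address remains a persistent unique identifier. The sightings, store names, and the `hash_mac` helper are invented for illustration, and SHA-256 is an assumption; the complaint quoted above does not specify the function Nomi used.

```python
import hashlib
from collections import defaultdict


def hash_mac(mac: str) -> str:
    """Return the SHA-256 hex digest of a lowercased MAC address."""
    return hashlib.sha256(mac.lower().encode("utf-8")).hexdigest()


# Hypothetical sensor observations: (MAC address, location, date).
sightings = [
    ("aa:bb:cc:dd:ee:01", "store A", "2015-03-01"),
    ("aa:bb:cc:dd:ee:02", "store A", "2015-03-01"),
    ("aa:bb:cc:dd:ee:01", "store B", "2015-03-04"),
    ("AA:BB:CC:DD:EE:01", "store A", "2015-03-09"),
]

# Only the hash is stored; the raw MAC address is discarded.
stored = [(hash_mac(mac), place, day) for mac, place, day in sightings]

# Grouping by hash still reconstructs each device's movements over time.
history = defaultdict(list)
for device_hash, place, day in stored:
    history[device_hash].append((place, day))

for device_hash, visits in history.items():
    print(device_hash[:12], visits)
```

Discarding the raw address changes what the identifier looks like, not what it can be used for.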

Nomi was not alone in incorrectly relying on hashing for reducing data sensitivity. In 2022, the FTC filed a case against online counseling service BetterHelp for allegedly sharing consumers' sensitive health data—including hashed email addresses—with Facebook. The complaint detailed that BetterHelp knew Facebook would "undo the hashing and reveal the email addresses of those Visitors and Users." Despite sending hashes instead of email addresses, Facebook allegedly identified individuals seeking mental health counseling and used this information for targeted ads.
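One way a recipient that already holds plaintext email addresses could "undo" hashing is to hash its own list and look up the values it receives. The sketch below illustrates that general idea only; it is not a description of how any particular company processes data, and the function names, sample addresses, and use of SHA-256 are assumptions.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an email address and return its SHA-256 hex digest."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def build_lookup(known_emails):
    """Map each known address's hash back to the address."""
    return {hash_email(email): email for email in known_emails}


def reverse_hashes(received_hashes, lookup):
    """Return the original address for every received hash found in the lookup."""
    return {h: lookup[h] for h in received_hashes if h in lookup}


# Hypothetical list of addresses the recipient already knows.
known_emails = ["alice@example.com", "bob@example.com", "carol@example.com"]

# Hypothetical hashes sent by another party in place of email addresses.
received = [hash_email("bob@example.com"), hash_email("dave@example.com")]

recovered = reverse_hashes(received, build_lookup(known_emails))
print(recovered)
```

No weakness in the hash function is needed: the recipient only has to know, or be able to guess, the possible inputs. Addresses outside its list stay unresolved, but the larger the list, the fewer of those remain.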

The privacy harms in both cases stemmed from the fact that users were identified, not from the particular method used to identify them. Hashing is just one tool for persistent user identification; other mechanisms exist as well.

In 2023, the FTC filed a complaint against Premom for allegedly collecting and sharing users' unique advertising and device identifiers with third parties despite claiming only "non-identifiable data" would be shared. The complaint explained how Premom's actions enabled third parties to track individuals and associate fertility app usage with specific users through these identifiers.

Similarly, in January 2024, the FTC announced a complaint against InMarket alleging that it unlawfully collected data linked to unique mobile device identifiers without consumers' informed consent.

The FTC continues working to safeguard Americans' privacy by closely monitoring how companies use online user identifiers such as email addresses, phone numbers, MAC addresses, hashed email addresses, device identifiers, and advertising identifiers. Regardless of how they look or how complex they are, all user identifiers can be used to track people over time, so an identifier's opacity does not excuse its improper use or disclosure.
