You are identifiable by combining multiple factors, even if no individual factor is enough to identify you on its own.
If I understand it correctly, the EU 'personal data' (PD) concept is much wider than the US 'Personally Identifiable Information' (PII) concept. You are touching one of the differences here.
For GDPR purposes, data is PII if it can be used in combination with any other data to identify an individual. Doesn’t matter if the individual data points are not themselves identifying.
One thing I’ve been curious about is whether AI and algorithms can take a huge amount of anonymous data and “identify” a user, not explicitly, but in the sense that the output of the AI was only possible by correlating individuals at a fine enough granularity. I’m almost certain the answer is yes. I’m not clear on whether GDPR addresses that issue or not.
> For GDPR purposes, data is PII if it can be used in combination with any other data to identify an individual.
By that definition, all data is PII. There is no information available on this planet that has not been influenced by people.
I'm not trying to be obtuse. I worry about this problem a lot. Obviously we need to keep companies from doing stupid stuff like storing the first digit of a Social Security number (can't identify someone by that!) and then the second digit (also not uniquely identifying!), etc.
On the other hand, what if I have web log files that only store URL, timestamp, and status code? Is that OK? If I get hits for two specific pages within a couple of minutes of each other, and there's only one person on the planet who would know about both those pages, I know they were visiting my site at that time.
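To make the log-file scenario concrete, here is a rough sketch of that kind of correlation. The log entries, URLs, and the `co_visits` helper are all made up for illustration, and the five-minute window is an arbitrary assumption.

```python
from datetime import datetime, timedelta

# Hypothetical log entries: (timestamp, URL, status code).
# Note there is no IP address, cookie, or user agent stored.
log = [
    (datetime(2024, 1, 1, 12, 0), "/public/index", 200),
    (datetime(2024, 1, 1, 12, 1), "/drafts/secret-a", 200),
    (datetime(2024, 1, 1, 12, 2), "/drafts/secret-b", 200),
    (datetime(2024, 1, 1, 18, 0), "/drafts/secret-a", 200),
]


def co_visits(entries, url_a, url_b, window=timedelta(minutes=5)):
    """Return (time_a, time_b) pairs where both URLs were hit within `window`."""
    hits_a = [t for t, url, _ in entries if url == url_a]
    hits_b = [t for t, url, _ in entries if url == url_b]
    return [(a, b) for a in hits_a for b in hits_b if abs(a - b) <= window]


# If only one person knows about both pages, each pair is effectively
# a record of when that person was on the site.
pairs = co_visits(log, "/drafts/secret-a", "/drafts/secret-b")
```

Nothing in the log is identifying by itself; the identification comes from joining it with outside knowledge about who knows both URLs.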
People influence the world around them and it feels like privacy laws are trying to prevent companies from understanding that influence. At the same time every other incentive is pushing those companies to understand more.
> By that definition, all data is PII. There is no information available on this planet that has not been influenced by people.
I think that is a step too far. For example, it seems quite clear that a dataset of daily average temperatures from the top of Everest is not personally identifying information.
Black hair = PII, address = PII, drives black BMW = PII. Any of this information, together with other information, could be used to identify an individual, and that is exactly the issue. It is like saying that one brick is a house just because multiple bricks can make a house. If you gather enough data you can potentially point to a specific individual. Just like unique PC fingerprinting: gather enough data points so that the fingerprint is unique.
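A toy sketch of how that narrowing works, in case it helps. The population, attribute names, and the `matches` helper are all hypothetical; real fingerprinting uses many more data points, but the idea is the same.

```python
# Hypothetical toy population; every name and attribute is invented.
people = [
    {"name": "A", "hair": "black", "street": "Elm St", "car": "black BMW"},
    {"name": "B", "hair": "black", "street": "Elm St", "car": "red Fiat"},
    {"name": "C", "hair": "black", "street": "Oak Ave", "car": "black BMW"},
    {"name": "D", "hair": "blond", "street": "Elm St", "car": "black BMW"},
]


def matches(records, **known):
    """Names of all records consistent with every known attribute value."""
    return [r["name"] for r in records
            if all(r[key] == value for key, value in known.items())]


one_fact = matches(people, hair="black")                      # three candidates
two_facts = matches(people, hair="black", street="Elm St")    # two candidates
three_facts = matches(people, hair="black", street="Elm St",
                      car="black BMW")                        # one candidate
```

Each fact alone leaves several candidates; only the combination leaves exactly one.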
AFAIK according to the GDPR, knowing each individual fact is fine. Only the combination is PD.
Hence, installing a camera that counts black-haired people, another that counts people entering some location, and a third that counts people who have a BMW is perfectly fine. Merging the 3 recorded tapes to identify a person is not. Giving the 3 tapes to someone else is only OK if you somehow guarantee they won't do the merge.
Privacy laws are mainly aimed at allowing those whose data is being used to be aware of this, understand what is used for which purpose, and to elect to control this should they object.
See first definition of https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...