-
Notifications
You must be signed in to change notification settings - Fork 89
Systems of equations attack to recover visited websites? #112
Description
I'm building on this issue by @johnwilander.
In https://github.com/WICG/floc/issues/99, it is stated that "FLoC is not useful for tracking." I don't think that's accurate.
As far as I know, the user's cohort will not be partitioned per first party site so multiple sites can observe the cohort ID in sync as it changes week after week. A hash of the cohorts seen so far will likely get more and more unique as the weeks go by.
Websites or tracker scripts on websites can expose arrays of the cohorts they've seen to help all trackers identify the user, like this:
let cohortCollectionForWebsiteA = [ "week01_2022" : "0666", "week03_2022" : "A566", "week04_2022" : "2111", "week05_2022" : "1171", "week07_2022" : "749B", ] let cohortCollectionForWebsiteB = [ "week01_2022" : "0666", "week02_2022" : "0030", "week05_2022" : "1171", "week06_2022" : "7311", "week07_2022" : "749B", ]Trackers send these to a server for matching across websites, in the example above, resulting in the intersection
[ "week01_2022", "week05_2022", "week07_2022" ].
Just a slight thought.
What if we consider a slightly different threat/exploitation scenario (unless it's simply a flavour of tge remarks quoted n the above, which is why I retain them here), which are linked to the risks I already pointed to. Specifically reversing the cohort ID to obtain the actually visited websites?.
So the idea to hypothetically improve such reversal could follow a reasoning where we know that the sets of visited sites in week_i (i = given week of the year) corresponds to a specific ID_i. The computation of the FloC is based on input (website addresses). So I wonder if it would be possible to mathematically construct a system of equations of the form:
FloC(sites_1) = ID_1,
...,
FloC(sites_i) = ID_i
And then use the properties of SimHash to obtain the visited websites or at least improve the inference of the visited websites. Note, I did not focus on the analytical solution so I do not know the circumstances when such a system of equations would be solvable. It would be interesting to consider it in the threat model, though, so I leave a proof exercise to the proponents. Thank you.