Skip to content

Conversation

@JeremySkinner
Copy link

@JeremySkinner JeremySkinner commented Mar 13, 2018

Continues the work that @yanniboi began in #2

This PR implements the "Remove" task that will go through and remove/anonymize fields. When editing a field definition, and "Right to be forgotten" is set to "Anonymise", then a new Sanitizer dropdown appears. This uses the same sanitizer plugins available from the original gdpr_dump module.

image

For non-editable fields (such as base fields on User), these settings can be attached for now using hook_entity_base_field_info_alter:

function gdpr_tasks_entity_base_field_info_alter(&$fields, \Drupal\Core\Entity\EntityTypeInterface $entity_type) {
if ($entity_type->id() == 'user') {
$fields['mail']['third_party_settings']['gdpr_fields']['gdpr_fields_enabled'] = TRUE;
$fields['mail']['third_party_settings']['gdpr_fields']['gdpr_fields_rta'] = 'inc';
$fields['mail']['third_party_settings']['gdpr_fields']['gdpr_fields_rtf'] = 'anonymise';
$fields['mail']['third_party_settings']['gdpr_fields']['gdpr_fields_sanitizer'] = 'gdpr_email_sanitizer';
$fields['name']['third_party_settings']['gdpr_fields']['gdpr_fields_enabled'] = TRUE;
$fields['name']['third_party_settings']['gdpr_fields']['gdpr_fields_rta'] = 'inc';
$fields['name']['third_party_settings']['gdpr_fields']['gdpr_fields_rtf'] = 'anonymise';
$fields['name']['third_party_settings']['gdpr_fields']['gdpr_fields_sanitizer'] = 'gdpr_random_text_sanitizer';
$fields['roles']['third_party_settings']['gdpr_fields']['gdpr_fields_enabled'] = TRUE;
$fields['roles']['third_party_settings']['gdpr_fields']['gdpr_fields_rta'] = 'no';
$fields['roles']['third_party_settings']['gdpr_fields']['gdpr_fields_rtf'] = 'remove';
}

(Aside: We should be consistent with use of UK or US English for anonymise/anonymize, sanitise/sanitize etc)

When viewing the task, fields are listed with their corresponding action:

image

Once run, the "Processed by" field will be populated and a log will be created indicating which fields were affected and which sanitizer was used on them (this data should be backed up separately from the database backup and used as the basis of re-playing remove requests after a database restore). This will need to be tidied up at some point and probably moved to a separate tab.

image

If a field is set to anonymise, but no sanitizer is defined then it'll attempt to find the default for the datatype. At the moment, TextSanitizer is used for string fields and DateSanitizer is used for datetime fields. Others can be added by implementing the gdpr_type_sanitizers_alter hook:

// No sanitizer defined directly on the field. Instead try and get one for the datatype.
$sanitizers = [
'string' => 'gdpr_text_sanitizer',
'datetime' => 'gdpr_date_sanitizer',
];
$this->moduleHandler->alter('gdpr_type_sanitizers', $sanitizers);
$sanitizer = $sanitizers[$type];

A few things to note:

  • I changed the interface declaration for GdprSanitizerInterface::sanitize to also receive an instance of the field which can be used for metadata purposes (eg calculating max length of the field).
  • The IDs for the sanitizers had a typo (gpdr_* instead of gdpr_*), which I've fixed but is technically a breaking change.
  • Introduced RandomTextSanitizer as an alternative to TextSanitizer and UsernameSanitizer. This generates a random string of a specified length prefixed by anon_. This is useful for sanitizing usernames as it shows it's been anonymized at a glance (while the original UsernameSanitizer generates usernames that don't 'look' like they've been sanitized).

@JeremySkinner JeremySkinner requested a review from yanniboi March 13, 2018 15:17
@andrewbelcher
Copy link
Contributor

GDPR is a European thing, so I would be inclined to break convention and go with UK English!

Description and screenshot all look great to me, let me know if you need a code review from me and I can try and get to that tomorrow.

@JeremySkinner
Copy link
Author

@andrewbelcher yes please, that'd be great. I've assigned Yan too so he can take a look.

@yanniboi yanniboi changed the title Feature/gdpr tasks remove Add ability to remove/anonymise user data Mar 14, 2018
else {
$max_length = NULL;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You dont need to explicitly set $max_length to NULL as you are checking for isset later anyway. Just remove the else.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - that's my C# background coming through!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
return $randomString;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this you can use \Drupal\Component\Utility\Random::string() or \Drupal\Component\Utility\Random::name() for alpha numeric strings.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

continue;
}

if ($type == 'typed_data_entity_relationship') {
Copy link
Contributor

@yanniboi yanniboi Mar 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, can you stick a @todo in here to remove the gdpr_task specific continue once we have solved the concept of 'ignore/excluded' relationships and entity types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

private $currentUser;

private $sanitizerFactory;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document the class properties? At least lets stick in the classes so that IDE's can tell what it is we are working with.

Copy link
Contributor

@yanniboi yanniboi Mar 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly the rest of the Anonymizer methods. Can we add @param and @doc definitions to the docblocks?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yanniboi Yes, I can add those. What's the convention for where to apply doc blocks usually? I assumed it was for public/protected members, but not private.

Regarding the IDE, it can already tell without the docblocks on the fields as it picks it up form the type hints on the constructor parameters used to initialize the fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeremySkinner the convention in Drupal is to document everything. If nothing else, it makes it easier for other developers in the team to read and understand what's going on.

Also, generally I would use protected rather than private unless there is a specific reason why we'd need this to be the final implementation/use of it. Generally the convention is to assume that people could extend these classes if they required unless there are specific reasons not to (e.g. security).

@JeremySkinner JeremySkinner requested a review from yanniboi March 14, 2018 14:25
@JeremySkinner
Copy link
Author

@yanniboi @andrewbelcher I've added the export on completion functionality.

yanniboi added a commit that referenced this pull request Aug 8, 2018
* By yanniboi: Added GDPR tasks module.

* By yanniboi: PHPCS review.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants