EPC Duplicate Certificates #5

@anetobradley

When using the EPC datasets we need to be careful about duplicate EPCs for the same property. This is not an enormous issue, as an EPC is valid for up to 10 years unless the property is renovated or retrofitted. Even so, a property can have multiple records, especially a rental property that has been improved to meet recent regulations.

We should be able to spot this by looking for records that share a UPRN (Unique Property Reference Number). I would suggest keeping the most recent record for each UPRN and discarding the others. I will add this feature to the R code for the energy intensity sampler.
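To make the proposed rule concrete, here is a minimal sketch of "keep the most recent record per UPRN". It is written in plain Python for illustration only and is not the R implementation mentioned above. The column names `UPRN` and `LODGEMENT_DATE`, the ISO `YYYY-MM-DD` date format, and the function name `keep_latest_per_uprn` are all assumptions. Records with no UPRN cannot be matched, so this sketch keeps them unchanged.

```python
from datetime import date


def keep_latest_per_uprn(records, uprn_key="UPRN", date_key="LODGEMENT_DATE"):
    """Return one record per UPRN, choosing the latest lodgement date.

    `records` is an iterable of dicts (e.g. rows from csv.DictReader).
    The key names are assumptions about the EPC column headers.
    """
    latest = {}   # uprn -> (lodgement date, record)
    no_uprn = []  # records that cannot be matched to a property

    for rec in records:
        uprn = str(rec.get(uprn_key) or "").strip()
        if not uprn:
            no_uprn.append(rec)
            continue
        # Assumes ISO dates such as "2021-03-15".
        lodged = date.fromisoformat(rec[date_key])
        if uprn not in latest or lodged > latest[uprn][0]:
            latest[uprn] = (lodged, rec)

    return [rec for _, rec in latest.values()] + no_uprn
```

If two certificates for the same UPRN share a lodgement date, this sketch keeps whichever appears first in the input; a real implementation may want an explicit tie-breaker.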

I'm not sure this will have a big impact when taking a recent sample of 5000 certificates from the API, but it could be a problem when using the full CSV: my colleague has pointed out that some properties in that dataset have four or five duplicates!
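One way to check how much this matters for a given extract is to count how many records share each UPRN. The sketch below is a rough illustration in plain Python, not part of the existing R code. The `UPRN` column name and both function names are assumptions, and rows with an empty UPRN are ignored.

```python
import csv
from collections import Counter


def uprn_duplicate_counts(rows, uprn_key="UPRN"):
    """Map each UPRN that appears more than once to its record count."""
    counts = Counter()
    for row in rows:
        uprn = str(row.get(uprn_key) or "").strip()
        if uprn:  # skip rows with no UPRN
            counts[uprn] += 1
    return {uprn: n for uprn, n in counts.items() if n > 1}


def uprn_duplicate_counts_from_csv(path, uprn_key="UPRN"):
    """Same check, streaming rows from a CSV file at `path`."""
    with open(path, newline="", encoding="utf-8") as f:
        return uprn_duplicate_counts(csv.DictReader(f), uprn_key)
```

Running this on both the API sample and the full CSV would show whether the sample is really less affected.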
