This repository was archived by the owner on Jul 7, 2025. It is now read-only.

PermissionError when using pandas-dedupe #59

@oliverzeman9

Hi, I wanted to use pandas-dedupe, but I keep getting this error shortly after indexing finishes and "Clustering..." is displayed. I've searched online but haven't been able to fix it.

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Oliver\\AppData\\Local\\Temp\\tmpt0j7qcrt'

Thank you so much for any help!

The whole error message:


PermissionError Traceback (most recent call last)

~\Anaconda3\lib\site-packages\pandas_dedupe\dedupe_dataframe.py in dedupe_dataframe(df, field_properties, canonicalize, config_name, update_model, threshold, sample_size, n_cores)
250
251 # Cluster the records
--> 252 clustered_df = _cluster(deduper, data_d, threshold, canonicalize)
253 results = df.join(clustered_df, how='left')
254 results.drop(['dictionary'], axis=1, inplace=True)

~\Anaconda3\lib\site-packages\pandas_dedupe\dedupe_dataframe.py in _cluster(deduper, data, threshold, canonicalize)
144 # ## Clustering
145 print('Clustering...')
--> 146 clustered_dupes = deduper.partition(data, threshold)
147
148 print('# duplicate sets', len(clustered_dupes))

~\Anaconda3\lib\site-packages\dedupe\api.py in partition(self, data, threshold)
176 clusters = self._add_singletons(data.keys(), clusters)
177 clusters = list(clusters)
--> 178 _cleanup_scores(pair_scores)
179 return clusters
180

~\Anaconda3\lib\site-packages\dedupe\api.py in _cleanup_scores(failed resolving arguments)
1482 del arr
1483 if mmap_file:
-> 1484 os.remove(mmap_file)

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\Oliver\AppData\Local\Temp\tmpt0j7qcrt'
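
The traceback ends in `os.remove(mmap_file)` right after `del arr`, which suggests the temporary score file may still be held open when the delete is attempted; Windows refuses to remove a file that is in use. Below is a minimal sketch of a retrying delete under that assumption. `remove_with_retry` and its parameters are hypothetical names for illustration only; they are not part of dedupe or pandas-dedupe, and this has not been confirmed as a fix for this issue.

```python
import gc
import os
import tempfile
import time


def remove_with_retry(path, attempts=5, delay=0.2):
    """Try to delete `path`, retrying if the OS reports it is still in use.

    Hypothetical helper for illustration; not part of dedupe or pandas-dedupe.
    Returns True if the file is gone, False if every attempt failed.
    """
    for _ in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            # Nothing left to delete.
            return True
        except PermissionError:
            # Collect unreachable objects that might still hold the file open,
            # then wait briefly before trying again.
            gc.collect()
            time.sleep(delay)
    return False


# Usage: create a throwaway temporary file and delete it.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
removed = remove_with_retry(tmp_path)
```

If something else (for example another process, or a live reference to the mapped array) keeps the file open, the retries will run out and the function returns `False` instead of raising.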
