Hi, I wanted to use pandas-dedupe, but I keep getting this error shortly after indexing finishes and "Clustering..." is displayed. I've been searching online but can't seem to fix it.
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Oliver\\AppData\\Local\\Temp\\tmpt0j7qcrt'
Thank you so much for any help!
The whole error message:
PermissionError Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas_dedupe\dedupe_dataframe.py in dedupe_dataframe(df, field_properties, canonicalize, config_name, update_model, threshold, sample_size, n_cores)
250
251 # Cluster the records
--> 252 clustered_df = _cluster(deduper, data_d, threshold, canonicalize)
253 results = df.join(clustered_df, how='left')
254 results.drop(['dictionary'], axis=1, inplace=True)
~\Anaconda3\lib\site-packages\pandas_dedupe\dedupe_dataframe.py in _cluster(deduper, data, threshold, canonicalize)
144 # ## Clustering
145 print('Clustering...')
--> 146 clustered_dupes = deduper.partition(data, threshold)
147
148 print('# duplicate sets', len(clustered_dupes))
~\Anaconda3\lib\site-packages\dedupe\api.py in partition(self, data, threshold)
176 clusters = self._add_singletons(data.keys(), clusters)
177 clusters = list(clusters)
--> 178 _cleanup_scores(pair_scores)
179 return clusters
180
~\Anaconda3\lib\site-packages\dedupe\api.py in _cleanup_scores(failed resolving arguments)
1482 del arr
1483 if mmap_file:
-> 1484 os.remove(mmap_file)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\Oliver\AppData\Local\Temp\tmpt0j7qcrt'
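For reference, here is a minimal sketch of the pattern the traceback ends in, outside of dedupe. It assumes the temp file is a memory-mapped file that is still open when `os.remove` runs. That is only a guess based on the `mmap_file` name in `_cleanup_scores`, not something I have confirmed in the dedupe source. `remove_mapped_file` is a hypothetical helper, and the sketch uses only the standard library.

```python
import mmap
import os
import tempfile


def remove_mapped_file(size=1024):
    """Create a temp file, memory-map it, then delete it (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    try:
        # Give the file real content so the mapping has something to cover.
        os.write(fd, b"\0" * size)
        mapping = mmap.mmap(fd, size)
        mapping[:4] = b"test"
        # On Windows, os.remove raises WinError 32 while a mapping or a
        # file handle is still open, so close the mapping explicitly
        # instead of relying on `del` and garbage collection.
        mapping.close()
    finally:
        # Close the file descriptor as well before deleting.
        os.close(fd)
    os.remove(path)
    return path


path = remove_mapped_file()
print(os.path.exists(path))
```

If the mapping is not closed before `os.remove`, Windows refuses to delete the file, while Linux and macOS allow it, which might explain why this only shows up on Windows.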