NZDedupeCheck

Checks the quality of matching between records in the CF NZ Dedupe Report.

Inputs:
•NZ Dedupe Report in .csv format, sorted by Identifier
NOTE: Requires columns for Title, Publication Date, Language Of Cataloging, Author, ISBN (Normalized), Edition, and Publisher, along with the standard columns. Use the "KB - NZ Dedupe Report with Comparison Fields" template available in the NZ Analytics instance.
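
The input requirements above can be checked before running the comparison. The following is a minimal sketch, not part of NZDedupeCheck.py: the function name `check_report` is hypothetical, and it uses only the standard library `csv` module rather than pandas. It verifies that the required columns are present and that rows sharing an Identifier are adjacent, which is what sorting by Identifier is meant to guarantee.

```python
import csv
import io

# Column names taken from the NOTE above; "Identifier" is the match column.
COMPARISON_FIELDS = [
    "Title", "Publication Date", "Language Of Cataloging", "Author",
    "ISBN (Normalized)", "Edition", "Publisher",
]
REQUIRED_COLUMNS = ["Identifier"] + COMPARISON_FIELDS


def check_report(csv_text):
    """Return a list of problems found in the report text (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if "Identifier" not in header:
        return problems

    # Rows with the same Identifier must form one contiguous run.
    seen = set()
    previous = None
    for row in reader:
        identifier = row["Identifier"]
        if identifier != previous and identifier in seen:
            problems.append(f"identifier {identifier!r} appears in non-adjacent rows")
        seen.add(identifier)
        previous = identifier
    return problems
```

An empty list means the file looks usable; any other result lists what to fix in the export before running the check.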

Outputs:
•Report with a Similarity (confidence) score next to all records with matching values in the Identifier column

Process:
•Prompts for file using tkinter filedialog
•Compares adjacent rows on the value in the Identifier column (the file must be sorted by Identifier so that matches are adjacent)
•If a match is found, compares key fields (Title, Publication Date, Language Of Cataloging, Author, ISBN (Normalized), Edition, and Publisher) using fuzz.WRatio
•Adds a Similarity column, populated with the average score across all comparison fields, or 0 if no matching record is found
•Prompts the user to select a directory for the output file
•Saves the output as a file with a unique name based on the current date and time
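
The comparison and naming steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in NZDedupeCheck.py: it works on a list of dicts instead of a pandas DataFrame, and it swaps in `difflib.SequenceMatcher` from the standard library as a stand-in for `fuzz.WRatio` so that it runs without third-party packages (the two scorers will not give identical numbers). The names `ratio`, `add_similarity`, and `output_name`, and the file name pattern, are hypothetical. How the script treats three or more rows with the same Identifier is not stated; here each row keeps the score from the last adjacent pair it took part in.

```python
from datetime import datetime
from difflib import SequenceMatcher

COMPARISON_FIELDS = [
    "Title", "Publication Date", "Language Of Cataloging", "Author",
    "ISBN (Normalized)", "Edition", "Publisher",
]


def ratio(a, b):
    """Stand-in scorer on a 0-100 scale (the tool itself uses fuzz.WRatio)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def add_similarity(rows, scorer=ratio):
    """Set row["Similarity"] for every row, comparing adjacent rows that
    share an Identifier. Rows with no matching neighbour get 0."""
    for row in rows:
        row["Similarity"] = 0
    for prev, curr in zip(rows, rows[1:]):
        if prev["Identifier"] == curr["Identifier"]:
            scores = [scorer(str(prev[f]), str(curr[f])) for f in COMPARISON_FIELDS]
            average = round(sum(scores) / len(scores), 1)
            prev["Similarity"] = curr["Similarity"] = average
    return rows


def output_name(now=None):
    """Build a file name that is unique per second (pattern is an assumption)."""
    now = now or datetime.now()
    return f"NZDedupeCheck_{now:%Y%m%d_%H%M%S}.csv"
```

Passing the scorer as a parameter keeps the sketch easy to adapt: with FuzzyWuzzy installed, `add_similarity(rows, scorer=fuzz.WRatio)` would use the scorer named in the process description.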

Dependencies:
•pandas
•fuzzywuzzy
•NumPy
•datetime (standard library)
•tkinter (standard library)
•time (standard library)
