aleksAperans/claude-cleaner

Claude Data Cleaner

A Flask web application for cleaning and standardizing company names and addresses using Claude AI.

Features

  • Upload CSV files containing company names and addresses
  • Clean and standardize data using Claude AI
  • Support for multiple Claude models (Haiku and Sonnet)
  • Preserve reference columns
  • Download cleaned results and error reports
  • Real-time processing status and logs

Requirements

  • Python 3.12+
  • Anthropic API key
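
Both requirements can be checked before the app starts. The sketch below is illustrative and not part of the project: `check_requirements` is a hypothetical helper, and `ANTHROPIC_API_KEY` is assumed to be the environment variable holding the key (it is the name the Anthropic Python SDK reads by default; how this app is actually configured is not stated here).

```python
import os
import sys

MIN_PYTHON = (3, 12)  # from the Requirements list above
# Assumption: the key is supplied through this environment variable.
API_KEY_VAR = "ANTHROPIC_API_KEY"


def check_requirements(version=None, env=None):
    """Return a list of unmet requirements; an empty list means all are met."""
    # Default to the running interpreter and the real environment,
    # but allow both to be passed in so the check is easy to test.
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    env = os.environ if env is None else env

    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    if not env.get(API_KEY_VAR, "").strip():
        problems.append(f"{API_KEY_VAR} is not set")
    return problems
```

Calling `check_requirements()` with no arguments inspects the current interpreter and environment; passing explicit values makes the check testable.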

Usage

  1. Upload a CSV file containing at minimum:
    • name column: Entity names
    • address column: Full addresses
  2. Optional columns:
    • country: Country names/codes
    • Any additional reference columns
  3. Select Claude model:
    • Haiku (faster)
    • Sonnet (more thorough)
  4. Monitor processing progress
  5. Download results when complete

File Format

Input CSV Format

name,address,country,id
Acme Corp,123 Main St,US,001
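
A file in this shape can be checked before upload. The following is a minimal sketch using only the standard library; `read_input` is a hypothetical helper, and matching header names exactly (case-sensitive) is an assumption, since the README does not say how the app matches them. Any column other than name, address, and country is treated as a reference column.

```python
import csv
import io

REQUIRED_COLUMNS = ("name", "address")  # required by the README
OPTIONAL_COLUMNS = ("country",)         # optional per the README


def read_input(text):
    """Parse CSV text and return (rows, reference_columns).

    Raises ValueError if a required column is missing from the header.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))

    # Everything that is not a known input column is a reference column.
    known = set(REQUIRED_COLUMNS) | set(OPTIONAL_COLUMNS)
    reference = [col for col in header if col not in known]
    return list(reader), reference


# The sample file shown above.
sample = "name,address,country,id\nAcme Corp,123 Main St,US,001\n"
rows, reference = read_input(sample)
```

For the sample above, `id` is the only reference column.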

Output CSV Format

The tool generates a CSV with the following columns:

  • All original columns
  • cleaned_name_1: Primary standardized name
  • cleaned_name_2: Alternative name (if DBA/AKA present)
  • cleaned_address: Standardized address
  • cleaned_country: Standardized country
  • has_dba: Indicates whether multiple names (DBA/AKA) are present
  • country_match: Indicates whether the country matches
  • name_inferred: Indicates whether a missing (null) name was inferred
  • status: Processing status
  • error: Error message (if any)
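
After downloading the results, the `error` column can be used to separate rows that need review. This is a sketch under assumptions: `split_by_error` is a hypothetical helper, it treats any non-empty `error` cell as a failure (the README does not list the possible `status` values), and the sample rows are made up for illustration and show only a few of the output columns.

```python
import csv
import io


def split_by_error(text):
    """Split output CSV text into (ok_rows, failed_rows)."""
    ok, failed = [], []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: a non-empty `error` cell marks a row that failed.
        if (row.get("error") or "").strip():
            failed.append(row)
        else:
            ok.append(row)
    return ok, failed


# Illustrative data only, not real output from the tool.
sample_output = (
    "name,cleaned_name_1,error\n"
    "acme corp.,Acme Corp,\n"
    "???,,could not parse name\n"
)
ok_rows, failed_rows = split_by_error(sample_output)
```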

Limitations

  • Maximum file size: 16MB
  • Supported file format: CSV only
  • Required columns: name, address
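
The size and format limits can be checked on the client side before uploading. The sketch below is illustrative: `check_upload` is a hypothetical helper, and reading "16MB" as 16 × 1024 × 1024 bytes is an assumption. In Flask applications a limit like this is commonly enforced through the `MAX_CONTENT_LENGTH` configuration value; whether this project does so is not stated here.

```python
from pathlib import Path

# Assumption: "16MB" is interpreted as 16 * 1024 * 1024 bytes.
MAX_BYTES = 16 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return a list of reasons the file would be rejected (empty if none)."""
    problems = []
    # Compare the extension case-insensitively, so DATA.CSV is accepted.
    if Path(filename).suffix.lower() != ".csv":
        problems.append("only CSV files are supported")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 16MB limit")
    return problems
```

For a file on disk, the size can be obtained with `Path(path).stat().st_size`.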
