Bortlesboat/receipt-parser

Receipt Parser & Spending Analyzer

A Python tool for parsing grocery receipt data and generating spending analytics.

Overview

This project transforms unstructured receipt data into actionable spending insights:

  • Parse receipt transactions into structured data
  • Categorize items automatically
  • Generate multi-dimensional spending analysis
  • Export to Excel with formatted tables and dashboards

Features

Data Parsing (parse_receipts.py)

  • Extracts transaction metadata (date, time, store, totals)
  • Parses line items with prices and tax flags
  • Computes derived metrics:
    • Day of week, time of day patterns
    • Pre-discount totals and savings percentages
    • Effective tax rates
    • Self-checkout detection
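
As a rough illustration of the derived metrics listed above, here is a minimal sketch. The helper name `derive_metrics`, the time-of-day cutoffs, and the formulas (savings added back to the subtotal, tax divided by subtotal) are assumptions for illustration and may not match what parse_receipts.py actually does. Self-checkout detection is left out because the README does not say how it is determined.

```python
from datetime import datetime


def derive_metrics(txn):
    """Return derived fields for one transaction dict (hypothetical helper)."""
    stamp = datetime.strptime(f"{txn['date']} {txn['time']}", "%m/%d/%Y %H:%M")

    # Assumed buckets; the project's real cutoffs may differ.
    if stamp.hour < 12:
        time_of_day = "Morning"
    elif stamp.hour < 17:
        time_of_day = "Afternoon"
    else:
        time_of_day = "Evening"

    # Assumption: 'subtotal' is already net of discounts, so savings are added back.
    pre_discount = txn["subtotal"] + txn["savings"]
    savings_pct = 100 * txn["savings"] / pre_discount if pre_discount else 0.0

    # Assumption: effective tax rate = sales tax as a percentage of the subtotal.
    tax_rate = 100 * txn["sales_tax"] / txn["subtotal"] if txn["subtotal"] else 0.0

    return {
        "day_of_week": stamp.strftime("%A"),
        "time_of_day": time_of_day,
        "pre_discount_total": round(pre_discount, 2),
        "savings_pct": round(savings_pct, 2),
        "effective_tax_rate": round(tax_rate, 2),
    }
```

Both percentages guard against a zero denominator, so a receipt with an empty subtotal yields 0.0 rather than raising an error.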

Item Categorization

Automatically categorizes grocery items into:

  • Dairy, Meat/Protein, Fruit, Vegetables
  • Cereal, Snacks, Beverages, Bakery
  • Frozen, Candy/Sweets, Eggs
  • And more...
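
One simple way to categorize items like this is keyword matching on the item name. The sketch below is an assumption about the approach, not the project's actual code: the `CATEGORY_KEYWORDS` table, its keywords, and the `categorize` function are hypothetical, and the "Other" fallback is invented for the example.

```python
# Hypothetical keyword table; the real mapping in parse_receipts.py may differ.
CATEGORY_KEYWORDS = {
    "Dairy": ("MILK", "CHEESE", "YOGURT"),
    "Eggs": ("EGG",),
    "Fruit": ("BANANA", "APPLE", "GRAPE"),
    "Bakery": ("BREAD", "BAGEL"),
}


def categorize(item_name, default="Other"):
    """Return the first category whose keyword appears in the item name."""
    name = item_name.upper()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in name for word in keywords):
            return category
    return default
```

Substring matching is easy to extend but can misfire (for instance, "EGG" also matches "EGGPLANT"), so the order of the table and the choice of keywords matter.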

Analytics Generated

Analysis            Description
By Store            Spending and visits per location
By Month            Monthly spending trends
By Day of Week      Which days you shop most
By Time of Day      Morning/Afternoon/Evening patterns
By Category         What you spend the most on
Top Items           Most frequently purchased items
Savings Tracking    Discount utilization
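
Each of these analyses is a group-and-sum over the transaction list. As a sketch of the "By Store" case, assuming the transaction fields shown under Data Structure, `collections.defaultdict` keeps the aggregation short. The function name `spending_by_store` is hypothetical.

```python
from collections import defaultdict


def spending_by_store(transactions):
    """Count visits and sum grand totals per store (hypothetical helper)."""
    totals = defaultdict(lambda: {"visits": 0, "spent": 0.0})
    for txn in transactions:
        entry = totals[txn["store_name"]]
        entry["visits"] += 1
        entry["spent"] += txn["grand_total"]
    # Round once at the end to avoid accumulating rounding error.
    return {
        store: {"visits": v["visits"], "spent": round(v["spent"], 2)}
        for store, v in totals.items()
    }
```

Swapping the grouping key (for example, a month string or a day-of-week name) gives the other breakdowns with the same pattern.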

Excel Export (create_excel.py)

Creates a formatted workbook with:

  • Transactions sheet - All receipt summaries with formatting
  • Items sheet - Line-item detail with categories
  • Dashboard - Summary metrics and instructions

Data Structure

Receipt transactions follow this structure:

{
    'receipt_id': 'XXXX XXX XXX XXX',
    'date': '01/15/2026',
    'time': '14:30',
    'store_name': 'Store Name',
    'store_address': '123 Main St',
    'city': 'City',
    'state': 'FL',
    'zip': '12345',
    'subtotal': 45.00,
    'sales_tax': 2.50,
    'grand_total': 47.50,
    'savings': 10.00,
    'payment_method': 'Credit Card',
    'items': [
        ('ITEM NAME', 4.99, 'F'),   # F = Food stamp eligible
        ('TAXABLE ITEM', 3.99, 'TF'), # T = Taxable, F = Food
        ('Promotion', -2.00, 'F'),   # Negative = discount
    ]
}
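
To show how the item tuples above can be read, here is a small sketch that separates purchases from discounts and picks out taxable items. It relies only on the conventions in the comments (negative price = discount, a 'T' in the flag = taxable). The function name `summarize_items` is hypothetical.

```python
def summarize_items(items):
    """Split (name, price, flags) tuples into purchases and discounts."""
    purchases = [(name, price, flags) for name, price, flags in items if price >= 0]
    discounts = [(name, price, flags) for name, price, flags in items if price < 0]
    return {
        # Sum of positive line prices, before promotions are applied.
        "gross": round(sum(price for _, price, _ in purchases), 2),
        # Discounts are stored as negative prices; report them as a positive amount.
        "discounts": round(-sum(price for _, price, _ in discounts), 2),
        # Names of purchased items whose flag string contains 'T'.
        "taxable": [name for name, _, flags in purchases if "T" in flags],
    }
```

Applied to the three example items above, this gives a gross of 8.98, discounts of 2.00, and one taxable item.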

Setup

Requirements

openpyxl

Installation

pip install openpyxl

Configuration

  1. Copy config_example.py to config.py
  2. Update paths to match your data directory

Usage

# Parse receipts and generate CSVs
python parse_receipts.py

# Create Excel workbook
python create_excel.py

Output Files

File                      Description
transactions.csv          Summary of each receipt
items.csv                 Line items with categories
summary.csv               Aggregated statistics
spending_database.xlsx    Formatted Excel workbook
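
For orientation, a file like transactions.csv can be produced with the standard library's `csv.DictWriter`. This is a sketch only: the column list `SUMMARY_FIELDS` and the function `write_transactions_csv` are assumptions, and the real file may contain more or different columns.

```python
import csv

# Assumed subset of columns; the actual transactions.csv may differ.
SUMMARY_FIELDS = ["receipt_id", "date", "store_name", "grand_total"]


def write_transactions_csv(transactions, out_file):
    """Write one summary row per transaction to an open text file."""
    # extrasaction="ignore" skips dict keys (such as 'items') that are not columns.
    writer = csv.DictWriter(out_file, fieldnames=SUMMARY_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for txn in transactions:
        writer.writerow(txn)
```

A typical call would open the target with `open("transactions.csv", "w", newline="")`, since the csv module handles line endings itself.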

Project Structure

receipt-parser/
├── README.md
├── config_example.py      # Template configuration
├── parse_receipts.py      # Receipt parsing and analytics
├── create_excel.py        # Excel workbook generator
├── requirements.txt       # Dependencies
└── .gitignore

Tech Stack

  • Python 3.x
  • openpyxl - Excel file generation with formatting
  • csv - Standard library for CSV handling
  • datetime - Date parsing and manipulation
  • collections.defaultdict - Aggregation helpers

Sample Output

The analytics reveal patterns like:

  • Which stores you frequent most
  • What time of day you typically shop
  • Your most purchased items
  • How much you save with promotions
  • Spending by category breakdown

Extending

To add your own receipt data:

  1. Structure each transaction in the format shown under Data Structure
  2. Add it to the transactions list in parse_receipts.py
  3. Run parse_receipts.py, then create_excel.py, to regenerate the analytics

License

Personal project for spending analysis.
