PostgreSQL to Excel Export with Node.js

This project demonstrates two approaches to export large datasets from PostgreSQL to Excel using Node.js:

  1. Streaming Approach (index.js): Uses pg-query-stream and ExcelJS streaming API for memory-efficient processing of large datasets.
  2. Batch Processing Approach (index2.js): Uses traditional batch processing with ExcelJS for a simpler implementation.

📋 Database Schema

The project works with a locations table containing 1 million rows with the following structure:

| Column | Type | Description |
| --- | --- | --- |
| id | BIGSERIAL | Primary key |
| name | VARCHAR(100) | Location name |
| latitude | DECIMAL(10,8) | Geographic latitude coordinate |
| longitude | DECIMAL(11,8) | Geographic longitude coordinate |
| population | INTEGER | Population count |
| created_at | TIMESTAMPTZ | Record creation timestamp |
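
For reference, when these rows are written with ExcelJS, the worksheet column definitions can mirror the schema one-to-one. A minimal sketch (headers and widths are illustrative choices, not taken from the repository):

```js
// Map the locations schema onto ExcelJS worksheet columns.
// The key values match the column names returned by node-postgres,
// so result rows can be passed to addRow() without any reshaping.
const columns = [
  { header: 'ID', key: 'id', width: 12 },
  { header: 'Name', key: 'name', width: 30 },
  { header: 'Latitude', key: 'latitude', width: 14 },
  { header: 'Longitude', key: 'longitude', width: 14 },
  { header: 'Population', key: 'population', width: 14 },
  { header: 'Created At', key: 'created_at', width: 24 },
];
```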

🚀 Getting Started

Prerequisites

  • Docker and Docker Compose
  • Node.js 16+
  • npm or yarn

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/node-streams.git
    cd node-streams
  2. Install dependencies:

    npm install
  3. Start the PostgreSQL container with sample data:

    docker-compose up -d

    This will:

    • Start PostgreSQL 15
    • Create the locations table
    • Seed it with 1 million sample records
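
A quick way to verify the seed completed is a row count against the running container, as in the sketch below (connection values are assumptions; use whatever docker-compose.yml defines):

```js
// check-seed.js: sanity check that the sample data was loaded.
// Connection settings are placeholders; match them to docker-compose.yml.
const { Client } = require('pg');

async function main() {
  const client = new Client({
    host: 'localhost',
    port: 5432,
    user: 'postgres',
    password: 'postgres',
    database: 'postgres',
  });
  await client.connect();
  const { rows } = await client.query('SELECT COUNT(*) AS total FROM locations');
  console.log(`locations rows: ${rows[0].total}`);
  await client.end();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```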

🛠 Usage

Option 1: Streaming Export (Recommended for large datasets)

node index.js
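
A minimal sketch of how this streaming pipeline can be wired together with pg-query-stream and the ExcelJS streaming writer is shown below. The connection settings, batch size, sheet name, and file naming are assumptions for illustration; index.js in the repository is the authoritative version.

```js
// streaming-export-sketch.js: memory-efficient export using Node.js streams.
const { Client } = require('pg');
const QueryStream = require('pg-query-stream');
const ExcelJS = require('exceljs');

async function exportStreaming() {
  const client = new Client({ host: 'localhost', user: 'postgres', password: 'postgres', database: 'postgres' });
  await client.connect();

  // The streaming WorkbookWriter flushes committed rows to disk as it goes,
  // so the full worksheet never has to be held in memory.
  const filename = `locations-${Date.now()}.xlsx`;
  const workbook = new ExcelJS.stream.xlsx.WorkbookWriter({ filename });
  const sheet = workbook.addWorksheet('Locations');
  sheet.columns = [
    { header: 'ID', key: 'id' },
    { header: 'Name', key: 'name' },
    { header: 'Latitude', key: 'latitude' },
    { header: 'Longitude', key: 'longitude' },
    { header: 'Population', key: 'population' },
    { header: 'Created At', key: 'created_at' },
  ];

  // pg-query-stream reads from a server-side cursor in chunks of 10,000 rows.
  const query = new QueryStream('SELECT * FROM locations ORDER BY id', [], { batchSize: 10000 });
  const stream = client.query(query);

  let count = 0;
  for await (const row of stream) {
    sheet.addRow(row).commit();
    count += 1;
    if (count % 10000 === 0) console.log(`Exported ${count} rows`);
  }

  await workbook.commit();
  await client.end();
  console.log(`Done: ${count} rows written to ${filename}`);
}

exportStreaming().catch((err) => {
  console.error(err);
  process.exit(1);
});
```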

Option 2: Batch Processing Export

node index2.js
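
The batch variant can be sketched with LIMIT/OFFSET paging into a regular in-memory ExcelJS workbook. Again, the batch size, connection settings, and file naming below are assumptions; index2.js is the authoritative implementation.

```js
// batch-export-sketch.js: simpler approach that pages through the table in
// batches and builds the whole workbook in memory before writing it once.
const { Client } = require('pg');
const ExcelJS = require('exceljs');

async function exportBatched(batchSize = 50000) {
  const client = new Client({ host: 'localhost', user: 'postgres', password: 'postgres', database: 'postgres' });
  await client.connect();

  const workbook = new ExcelJS.Workbook();
  const sheet = workbook.addWorksheet('Locations');
  sheet.columns = [
    { header: 'ID', key: 'id' },
    { header: 'Name', key: 'name' },
    { header: 'Latitude', key: 'latitude' },
    { header: 'Longitude', key: 'longitude' },
    { header: 'Population', key: 'population' },
    { header: 'Created At', key: 'created_at' },
  ];

  let offset = 0;
  for (;;) {
    // Fetch one batch at a time; all rows stay in memory until the file is written.
    const { rows } = await client.query(
      'SELECT * FROM locations ORDER BY id LIMIT $1 OFFSET $2',
      [batchSize, offset]
    );
    if (rows.length === 0) break;
    rows.forEach((row) => sheet.addRow(row));
    offset += rows.length;
    console.log(`Loaded ${offset} rows`);
  }

  const filename = `locations-batch-${Date.now()}.xlsx`;
  await workbook.xlsx.writeFile(filename);
  await client.end();
  console.log(`Done: ${offset} rows written to ${filename}`);
}

exportBatched().catch((err) => {
  console.error(err);
  process.exit(1);
});
```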

Both scripts will:

  1. Connect to the PostgreSQL database
  2. Export all records to an Excel/CSV file
  3. Show progress in the console
  4. Save the output file in the project root with a timestamp

⚙️ Configuration

Edit the following files as needed:

  • docker-compose.yml: Database connection settings
  • index.js/index2.js: Export parameters and query customization
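
The dbConfig object in the scripts presumably follows the standard node-postgres connection shape; the values below are placeholders and must be kept in sync with the environment defined in docker-compose.yml:

```js
// Example dbConfig shape consumed by node-postgres (pg).
// Values are placeholders; keep them in sync with docker-compose.yml.
const dbConfig = {
  host: 'localhost',
  port: 5432,
  user: 'postgres',
  password: 'postgres',
  database: 'postgres',
};

module.exports = dbConfig;
```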

🚀 Performance

  • Streaming Approach: Processes data in chunks (10,000 rows at a time) using Node.js streams
  • Batch Processing: Loads data in batches (50,000 rows at a time) into memory
  • Memory Usage: Streaming approach uses significantly less memory for large datasets

🔧 Script Settings

You can adjust the following settings in index.js:

  • batchSize: Number of records to process in each batch (default: 100,000)
  • Database connection settings in dbConfig

Performance Tips

  • Increase batchSize for faster processing (if you have enough RAM)
  • Decrease batchSize if you encounter memory issues
  • The script includes progress logging every 10,000 records

🧹 Cleanup

To stop and remove all containers and volumes:

docker-compose down -v

💡 Tip: For very large exports, consider running the script with an increased Node.js memory limit:

node --max-old-space-size=4096 index.js
