forked from Cypherpunk-Labs/PinShare
-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
enhancementNew feature or requestNew feature or request
Description
Summary
PinShare currently uses an in-memory data store for file metadata, which means:
- All data is lost when the service restarts
- No data persistence across container rebuilds
- Limited by available RAM
- No ACID guarantees for concurrent operations
- No advanced query capabilities
This issue proposes migrating to PostgreSQL for persistent, scalable, and reliable data storage.
Current Implementation
In-memory store in internal/store/store.go:
type BaseMetadata struct {
FileSHA256 string
IpfsCID string
FileName string
FileType string
ModerationVotes int
Tags map[string]int
BanSet int
}Data structure:
- Global map:
map[string]*BaseMetadata - Key: FileSHA256 (SHA256 hash of file)
- No persistence, no transactions, no relationships
Proposed Changes
PostgreSQL Schema
CREATE TABLE files (
file_sha256 VARCHAR(64) PRIMARY KEY,
ipfs_cid VARCHAR(100) NOT NULL,
file_name TEXT,
file_type VARCHAR(100),
moderation_votes INTEGER DEFAULT 0,
ban_set INTEGER DEFAULT 0,
added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT chk_ban_set CHECK (ban_set IN (0, 3, 6, 9))
);
CREATE TABLE file_tags (
id SERIAL PRIMARY KEY,
file_sha256 VARCHAR(64) REFERENCES files(file_sha256) ON DELETE CASCADE,
tag_name VARCHAR(100) NOT NULL,
tag_count INTEGER DEFAULT 1,
UNIQUE(file_sha256, tag_name)
);
CREATE INDEX idx_ipfs_cid ON files(ipfs_cid);
CREATE INDEX idx_file_type ON files(file_type);
CREATE INDEX idx_added_at ON files(added_at);
CREATE INDEX idx_tags_file ON file_tags(file_sha256);
CREATE INDEX idx_tags_name ON file_tags(tag_name);
-- Full-text search support (for future search endpoint)
CREATE INDEX idx_file_name_fts ON files USING gin(to_tsvector('english', file_name));Benefits
- Data Persistence: Metadata survives service restarts
- Scalability: Not limited by RAM, can store millions of records
- Reliability: ACID transactions, data integrity guarantees
- Advanced Queries:
- Full-text search on file names
- Complex filtering and aggregations
- Efficient joins for tags and relationships
- Backup & Recovery: Standard PostgreSQL backup tools
- Monitoring: Rich ecosystem of PostgreSQL monitoring tools
- Multi-user: Proper concurrent access handling
Implementation Steps
-
Add PostgreSQL dependency
- Update
go.modwith PostgreSQL driver (e.g.,pgxorlib/pq) - Add PostgreSQL to docker-compose.yml
- Update
-
Create database layer
- Implement
storeinterface abstraction - Create PostgreSQL implementation
- Add connection pooling
- Implement migrations system (e.g.,
golang-migrate)
- Implement
-
Update API handlers
- Replace in-memory store calls with DB queries
- Add proper error handling for DB operations
- Implement transaction support where needed
-
Migration path
- Support both stores during transition (feature flag)
- Provide export/import tools for data migration
- Update documentation
-
Configuration
- Add database connection config (env vars)
- Support connection string or individual params
- Default to in-memory for development if DB not available
Docker Compose Example
services:
postgres:
image: postgres:16-alpine
environment:
POSTGRES_DB: pinshare
POSTGRES_USER: pinshare
POSTGRES_PASSWORD: ${DB_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
- ./migrations:/docker-entrypoint-initdb.d
ports:
- "5432:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U pinshare"]
interval: 10s
timeout: 5s
retries: 5
volumes:
postgres_data:Recommended Go Libraries
- Database driver:
github.com/jackc/pgx/v5(high-performance, PostgreSQL-specific) - Query builder:
github.com/Masterminds/squirrel(optional, for complex queries) - Migrations:
github.com/golang-migrate/migrate/v4 - Connection pooling: Built into pgx
Backward Compatibility
Maintain the in-memory store as an option:
- Feature flag or env var:
STORAGE_TYPE=postgres|memory - Development defaults to memory for simplicity
- Production uses PostgreSQL for reliability
Testing
- Add integration tests with testcontainers
- Unit tests for store implementation
- Migration tests
- Performance benchmarks
Related Issues
- Implement server-side search endpoint for scalable file searching #2: Server-side search endpoint (will leverage PostgreSQL full-text search)
Implementation Priority
High Priority - Data persistence is critical for production deployments. Current in-memory approach is only suitable for testing/development.
Additional Considerations
Alternative: SQLite
For simpler deployments, SQLite could be an intermediate step:
- File-based persistence (simpler than PostgreSQL)
- Still supports SQL queries and FTS
- No separate database service required
- Good for single-node deployments
However, PostgreSQL is recommended for:
- Multi-node deployments
- Better concurrency handling
- More mature full-text search
- Production-grade monitoring and tooling
Migrated from bryanchriswhite/PinShare#3
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request