Skip to content

bchou9/LstoreStorageEngine

Repository files navigation

L-Store — Python implementation

What this implements

This repository contains a L-Store style columnar storage engine in Python. It shows core L-Store ideas (base vs tail pages, indirection, schema encoding, merge/compaction, page directory, and a bufferpool). The code is organized as a small DB engine plus demo/test harnesses to exercise typical operations (insert/select/update/delete/aggregates/range queries).

  • Data is stored per column (each column uses fixed-size pages).
  • Each logical record has a base record (stable copy) and subsequent tail records for updates.
  • A page directory maps RIDs to the corresponding page range, base/tail page index, row offset and status flags.
  • Indirection column points from a base record to the latest tail record (if any).
  • Schema encoding uses a bitmask to describe which columns are present in a tail record.
  • A merge (compaction) process periodically consolidates tail records back into base pages.
  • A bufferpool manages in-memory page frames and writes pages to/from disk.

This makes it easy to model update-heavy workloads without rewriting base pages on every update and enables column-wise storage/access for analytical queries.

Core concepts explained

  • Base page: Stores the original or consolidated version of tuples for a contiguous RID range. Base pages are relatively stable.
  • Tail page: Stores updates (deltas) to columns for certain tuples. Multiple updates can be chained in tail pages.
  • Indirection column: For each base RID, holds a pointer (RID) to the latest tail record (if updated). If no update, it points to itself.
  • Schema encoding: Integer bitmask telling which columns were updated in a tail record (so non-updated columns are omitted there).
  • Page directory: Maps a logical RID to physical location: page range index, base/tail page index, offset, and whether it's base or tail.
  • Merge / compaction: Reads tail pages, applies latest updates to base records and writes back consolidated base pages, resetting relevant metadata.
  • Bufferpool: In-memory cache of loaded page frames (pages for each column), with dirty-bit and basic eviction/commit behavior.

Repository layout

Lstore_database-main/
├─ README.md                # original brief README
├─ __main__.py              # demo/main runner (benchmarks)
├─ demo.py                  # interactive test/demo harness
├─ m1_* / m2_* / m3_*       # milestone/tester scripts
└─ lstore/
   ├─ __init__.py
   ├─ config.py             # page/sizing constants and special values
   ├─ page.py               # Page model + disk read/write helpers
   ├─ bufferpool.py         # Bufferpool and Frame classes
   ├─ db.py                 # Database wrapper (open/create/close)
   ├─ table.py              # Table, page range, allocation and merge logic
   ├─ record.py             # Record representation
   ├─ query.py              # User-facing Query API (insert/select/update/delete/agg)
   ├─ index.py              # Index wrapper (uses BTrees.IOBTree)
   ├─ transaction.py        # Transaction skeleton
   ├─ transaction_worker.py # (worker thread example)
   └─ btreetest.py          # B-tree basic tests/usage

Key components & responsibilities

lstore/config.py

Defines engine-wide constants:

  • PAGE_SIZE (4096 bytes)
  • PAGE_RECORD_SIZE (8 bytes)
  • BASE_PAGE_COUNT (# of base pages per page range)
  • META_COLUMN_COUNT (number of metadata columns: indirection, RID, base_rid, timestamp, schema)
  • ENTRIES_PER_PAGE computed
  • SPECIAL_NULL_VALUE sentinel

These constants determine page layout and serialization rules.

lstore/page.py

A small page abstraction that:

  • Maintains a bytearray of size PAGE_SIZE.
  • Supports write(value, row) and read(row) storing 8-byte values per ROW (fixed-size).
  • read_from_disk / write_to_disk helpers that read/write pages for all columns for a page frame.

Note: pages store 8-byte values per record slot and this project uses fixed 8-byte cells (integer/encoded values).

lstore/bufferpool.py

  • Frame: stores in-memory pages for all columns of that page (a "page frame").
  • Bufferpool: loads page frames from disk on demand, keeps a mapping of frames, supports dirty-bit, pin (Lock), and commits frames back to disk.
  • Tracks frames with frame_dir mapping (table_name, page_range, page_index, is_base) -> frame_index.

This organizes I/O and caching; eviction / replacement is simple and oriented towards educational clarity.

lstore/table.py

  • Manages a table's on-disk layout under table_path/ directories.
  • Maintains page_directory, page_ranges, allocating base/tail pages and tail page growth.
  • Implements merge(rid) which runs consolidation: reads base & tail records, applies latest column values and writes consolidated base pages; resets metadata appropriately.
  • write_new_rec for writing a brand new base record (on insert), and helpers to read a record by RID.

lstore/index.py

  • Thin wrapper around an on-disk/ in-memory B-tree (it uses BTrees.IOBTree.IOBTree).
  • Stores a B-Tree per column (created on demand).
  • Provides locate(column, value) and range locate helpers.

lstore/record.py

  • Record object that holds metadata (meta_data = [INDIRECTION, RID, base_rid, timestamp, schema_encoding]) and the data columns.

lstore/query.py

  • Public API used by clients and demo scripts: insert(), select(), update(), delete(), sum(), count(), average(), range_query(), increment() etc.
  • Coordinates with Table and Index for record-level operations and uses schema encoding/indirection for updates.

lstore/db.py

  • Database wrapper that allows open(path), create_table(name, num_columns, key), get_table(name), and close() which persists metadata (page directory, indices) and commits bufferpool frames.

transaction.py / transaction_worker.py

  • Basic transaction skeleton that aggregates queries and attempts to run them with lightweight locking semantics. The transaction concurrency model is intentionally simple (teaching-focused).

Quick start — run demo & tests

Prereqs

  • Python 3.x
  • BTrees module (part of ZODB packages)
    pip install BTrees
    (or: python3 -m pip install BTrees)

Run the interactive demo:

From repo root:

python3 demo.py

Run the benchmark/main demo:

python3 __main__.py

Run milestone/test scripts:

There are many tester scripts included. Example:

python3 m1_tester.py
python3 m2_tester_part1.py
python3 m3_tester_part_1.py

These scripts exercise insert/select/update/delete and aggregated queries which are useful for regression checks.

Usage examples

# minimal_example.py
from lstore.db import Database
from lstore.query import Query

db = Database()
db.open('./main_db')                 # bufferpool created and DB root initialized
grades = db.create_table('Grades', 5, 0)  # num_columns=5, key column index=0
q = Query(grades)

# Insert: (key, col0, col1, col2, col3, col4)
q.insert(1001, 10, 20, 30, 40, 50)

# Select by key
rows = q.select(1001, 0, [1,1,1,1,1])  # returns Record objects
print(rows[0].columns)

# Update
q.update(1001, None, 21, None, None, None)

# Aggregate
sum_ = q.sum(1000, 2000, 1)  # sum column 1 for key range [1000..2000]

db.close()

Notes about API shapes:

  • Query.insert(key, col0, col1, ...) — first argument is key (the column index specified when creating the table).
  • Query.select(key, column_index, query_columns_bitmask_list)query_columns is a list of booleans or 0/1 to specify which columns you want returned.
  • Query.update(key, *columns) — pass None for columns you don't want to change.

Design notes & implementation details

Page layout

  • Pages are fixed-size byte arrays. Each record slot stores a fixed 8-byte integer value.
  • Each column has a separate .bin file per page (columnar layout). When a page frame is loaded, it includes num_columns + META_COLUMN_COUNT pages (metadata columns are stored alongside data columns).

Page ranges and allocation

  • Tables are split into page ranges, each containing BASE_PAGE_COUNT base pages and an extendable tail page folder.
  • On allocation, the engine creates base pages and allocates tail pages as required.

Update flow

  • update() writes a new tail record containing only changed column values and a schema-encoding bitmask describing which columns are present.
  • The base record’s indirection column is updated to point to the tail RID.
  • Read logic checks indirection and schema encoding to reconstruct latest values.

Merge/compaction

  • table.merge(rid) consolidates updates from tail pages for a page-range into base pages. Merge copies the latest values into base pages and resets indirection/schema accordingly.
  • A simple merge bufferpool is used to isolate merge I/O.

Indexing

  • Index is a B-Tree per column (created lazily). The BTree implementation comes from BTrees.IOBTree.IOBTree.

Concurrency & transactions

  • The project includes a simple Transaction class that groups queries into a single logical unit and a Query lock for table operations.
  • Bufferpool.Frame includes a pin (threading.Lock) to avoid conflicting concurrent modifications while a frame is in use.
  • Locking is coarse-grained / teaching-focused — not a production-grade MVCC implementation. The transaction worker skeleton (transaction_worker.py) demonstrates how transactions might be scheduled.

Tests & milestone scripts

  • m1_tester.py, m1_extended_main.py, m1_extended_tester.py: milestone 1 (insert/select basics).
  • m2_tester_part1.py, m2_tester_part2.py, m2_extended_*.py: updates and merges.
  • m3_tester*: more complex workloads.

You can run these scripts to validate engine behavior.

Contributing

  1. Fork the repo.
  2. Open a branch for your feature/fix.
  3. Add tests and update README or docs.
  4. Open a PR describing the change.

Authors

  • Natheethorn Teacharuangchit
  • Michael Shaw
  • Stuart Feng
  • Henry Chou
  • Eric Wang

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5

Languages