This is a simulated relational database built for efficient key-value storage and retrieval. The database manages user-saved places using a custom-built B+ Tree storage engine and secondary indexing. It also integrates with Google Maps via Python scripts to import real-world "Saved Places" data into the database.
- BPlusNode: Represents a node (internal or leaf) in a B+ Tree. Handles splitting, merging, and traversal. Internal nodes store keys; key-value pairs (keys and record pointers) are stored in leaf nodes, which are linked together as a doubly linked list.
- BPlusTree: Manages the self-balancing B+ Tree structure (order 3). Provides insert, delete, search, range search, and prefix search operations. When a node overflows, it splits automatically; when it underflows, it borrows from or merges with a sibling node.
- BTreeIndex: A wrapper around the B+ Tree, abstracting the interface for database use.
- LeaderDB: Main database engine. Inherits publicly from the DBInstance abstract class. Manages tables (with indexes) and executes CRUD operations from the CLI.
- SecondaryIndex: Maintains secondary indexes on non-primary-key attributes, mapping each attribute value to a list of primary keys for faster attribute-based search (e.g., find by email, find by title).
- CsvParser / FileUtils / DataInserter: Utilities for loading CSV files into tables and inserting structured place data.
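The attribute-value to list-of-PKs mapping described for SecondaryIndex can be sketched as follows. This is a minimal illustration, not the project's code: the class name `SecondaryIndexSketch`, its method names, and the use of `std::map` and `int` primary keys are all assumptions.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the project's SecondaryIndex: maps one attribute
// value (e.g., an email or a title) to the primary keys of matching records.
class SecondaryIndexSketch {
public:
    // Record that the row with this primary key holds this attribute value.
    void add(const std::string& value, int primaryKey) {
        index_[value].push_back(primaryKey);
    }

    // Forget one (value, primaryKey) pair, e.g., after a delete or update.
    void remove(const std::string& value, int primaryKey) {
        auto it = index_.find(value);
        if (it == index_.end()) return;
        auto& pks = it->second;
        pks.erase(std::remove(pks.begin(), pks.end(), primaryKey), pks.end());
        if (pks.empty()) index_.erase(it);
    }

    // Primary keys of all rows with this value; empty if there are none.
    std::vector<int> find(const std::string& value) const {
        auto it = index_.find(value);
        return it == index_.end() ? std::vector<int>{} : it->second;
    }

private:
    std::map<std::string, std::vector<int>> index_;
};
```

A lookup such as `find("Favorites")` returns the candidate primary keys, which would then be resolved to full records through the primary B+ Tree index.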
The Write-Ahead Log (WAL) implementation is currently incomplete; please disregard it for now. We are keeping it in the repository so we can continue working on it later.
m = order = 3 for this implementation, meaning each internal node other than the root has at least ⌈m/2⌉ = 2 children. With n total keys, the height of the tree is h = O(log n). At each node, a binary search over the local key array takes O(log m) time, but since m is fixed, this is O(1). Each search, insert, or delete visits O(h) nodes, so these operations run in O(log n) time.
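The two ingredients of this argument can be illustrated with a short sketch: a binary search inside one node, and a bound on the number of levels. Both function names are hypothetical, and the separator convention (child i holds keys smaller than keys[i], the last child holds the rest) is an assumption that may differ from the project's BPlusNode.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the child to descend into when searching for `key`.
// Assumes `keys` is sorted and that a key equal to a separator lives in the
// child to the right of that separator.
std::size_t childIndex(const std::vector<int>& keys, int key) {
    auto it = std::upper_bound(keys.begin(), keys.end(), key);
    return static_cast<std::size_t>(it - keys.begin());
}

// Smallest number of levels L such that minFanout^L >= n, i.e. the ceiling of
// log base minFanout of n. With order m = 3, minFanout is ceil(3/2) = 2.
// Returns -1 if minFanout < 2, because the bound is undefined in that case.
int levelsFor(std::uint64_t n, std::uint64_t minFanout) {
    if (minFanout < 2) return -1;
    int levels = 0;
    while (n > 1) {
        n = (n + minFanout - 1) / minFanout;  // ceiling division
        ++levels;
    }
    return levels;
}
```

Under these assumptions, a tree with 1000 keys and a minimum fanout of 2 needs at most 10 levels of descent, and each level costs one `childIndex` call over at most m - 1 keys.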
- InternalNode and LeafNode inherit publicly from base class BPlusNode
- LeaderDB inherits publicly from the abstract base class DBInstance. Other DB node types could derive from DBInstance in the future to support multiple database instances.
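The two inheritance relationships above might look roughly like the sketch below. Every type name here carries a `Sketch` suffix because the members and method signatures are assumptions made for illustration; the project's actual declarations are not shown in this README. A `std::map` stands in for the B+ Tree.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical abstract interface in the role of DBInstance.
class DBInstanceSketch {
public:
    virtual ~DBInstanceSketch() = default;
    virtual void put(int key, const std::string& record) = 0;
    virtual bool get(int key, std::string& out) const = 0;
};

// Hypothetical concrete engine in the role of LeaderDB.
class LeaderDBSketch : public DBInstanceSketch {
public:
    void put(int key, const std::string& record) override { rows_[key] = record; }
    bool get(int key, std::string& out) const override {
        auto it = rows_.find(key);
        if (it == rows_.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<int, std::string> rows_;  // stand-in for the B+ Tree index
};

// Hypothetical node hierarchy in the role of BPlusNode and its subclasses.
struct NodeSketch {
    virtual ~NodeSketch() = default;
    virtual bool isLeaf() const = 0;
    std::vector<int> keys;
};

struct InternalNodeSketch : public NodeSketch {
    bool isLeaf() const override { return false; }
    std::vector<NodeSketch*> children;  // non-owning in this sketch
};

struct LeafNodeSketch : public NodeSketch {
    bool isLeaf() const override { return true; }
    std::vector<std::string> values;    // one value per key
    LeafNodeSketch* prev = nullptr;     // doubly linked leaf chain
    LeafNodeSketch* next = nullptr;
};
```

Code that holds a `DBInstanceSketch&` can call `put` and `get` without knowing the concrete engine, which is what would let additional instance types be added later.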
The database schema supports users importing their saved places lists from Google Maps:
- Users
  - email: Email address (used as key).
  - user_id (PK): Random unique generated integer ID.
  - created_at: Creation timestamp.
- Places
  - hashedId (PK): The Google place_id is hashed to an integer and used as the primary key.
  - place_id: Original place id, kept as a string attribute.
  - name: Name of the place (e.g., "Central Park").
  - address: Formatted address.
  - latitude: Latitude coordinate.
  - longitude: Longitude coordinate.
  - description: Description of the place.
- Lists
  - list_id (PK): Random unique list ID.
  - user_id (FK): Belongs to a user.
  - title: Title of the list (e.g., "Favorites").
  - createdAt: Timestamp.
- List-place links
  - Combined primary key of list_id and place_id: int IDs.
  - list_id: Mapped to list_id.
  - place_id: Links places to lists (many-to-many relationship).
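Hashing a string place_id to an integer key, as the schema describes, can be sketched as below. The README does not say which hash function the project uses, so the choice of 32-bit FNV-1a and the function name `hashPlaceId` are assumptions made only for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: 32-bit FNV-1a hash of a place_id string.
// The project's real hash function is not documented here and may differ.
std::uint32_t hashPlaceId(const std::string& placeId) {
    std::uint32_t hash = 2166136261u;       // FNV offset basis
    for (unsigned char c : placeId) {
        hash ^= c;                          // mix in one byte
        hash *= 16777619u;                  // FNV prime, wraps modulo 2^32
    }
    return hash;
}
```

Any hash from strings to fixed-width integers can collide, so keeping the original place_id as a string attribute makes it possible to compare the stored string against the incoming one when two records land on the same key.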
`get <key>`: Retrieve one record (all attributes) from the current table using its primary key.

    get <key>

`create`: Create a new record or a new table. You will be prompted to specify if you want to create a table or a record.

    create
    // Program will ask if you want to create table or instance (record in a table)
    (table/instance)
    Create table or instance? table
    Enter table name: Pets
    Enter headers (comma separated): name,age
    Table pets created with headers: name, age

`update`: Update an existing record in the current table by specifying a key and providing new attributes.

    update
    Key: (Enter key)
    Attributes (comma separated): (Enter comma separated attributes)
    Record updated.

`delete <key>`: Delete a specific record from the current table using its primary key.

    delete:
    Enter key to delete: 5
    Key removed.

`drop (table)`: Delete an entire table from the database (except the default table).

    drop
    Please enter table name to drop: places
    Table dropped.

`use (table)`: Switch to a different table for subsequent operations.

    use (tablename)

`tables`: List all existing tables in the database.

`load <filepath>`: Load records from a CSV file into the current table. The first line of the CSV should contain column headers.

    load
    Enter CSV file path: (enter path)

`save`: Save all tables into CSV files. Each table will be saved to its own CSV under the output/ directory.

    save
    Saved default to ./output/default.csv

`view`: View up to 10 records from the current table, formatted nicely in columns.

    view

`createindex <col>`: Build a secondary index on a specified column to speed up select queries.

    createindex 2
    Enter column name: (column name)
    Secondary index built on <column name>

`select <cols>|* where <col>=<val>`: Query the current table by applying a filter condition (where) and optionally projecting specific columns.

    select * where name=(name)

`createuser`: Create a new user account and optionally upload Google Maps Saved Places data into your database.

    createuser
    Enter user email: (email)
    Would you like to upload your Saved Places? (yes/no):

Paste the relative path; the project uses this relative path:

    saved_places_dir/yian261_at_gmail_dot_com

`join`: Perform a join between two tables based on matching column values, with optional projection of columns.

    join A.1 B.2

`help`: Display the full list of available commands and their descriptions.

    help

`exit`: Exit the program.

    exit

Make sure you have a C++17 compatible compiler installed.
- Clone the repository.
- In the project root, compile the program:

      make clean
      make

After compiling, run

    ./leaderdb

Users may use the provided csv files in the filepath saved_places_dir/yian261_at_gmail_dot_com
The dataset_project files can be used for other table command testing

    GOOGLE_API_KEY=<your api key>

    python -m venv myenv
    source myenv/bin/activate # windows `myenv\Scripts\activate`
    pip install -r requirements.txt

Run

    python fetch_places.py takeout/Saved <user email>

Zirui Wen, Yian Chen
Google takeout data from Yian Chen
From Patorjk