This repository contains the code and scripts for generating embeddings for books, storing them in a Supabase database, and providing search and graph functionalities using Supabase Edge Functions.
    .
    ├── supabase
    │   └── functions
    │       ├── search
    │       │   └── index.ts
    │       └── graph
    │           └── index.ts
    ├── scripts
    │   ├── data
    │   │   └── pg_catalog.csv
    │   └── populate_database.ipynb
    └── README.md
The Supabase Edge Functions are located in the supabase/functions directory.
- search/index.ts: Handles the search functionality by accepting embeddings and returning the top 10 books.
- graph/index.ts: Handles the graph functionality by accepting a book ID and building a similarity graph.
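As a rough illustration of what the two functions compute, here is a minimal Python sketch of top-N ranking by cosine similarity and of a breadth-first similarity graph. It is not the code in `index.ts`: the function names, the choice of cosine similarity, and the `neighbors` and `depth` parameters are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(query, books, n=10):
    """Return up to n (book_id, score) pairs, best match first.

    `books` maps a book ID to its embedding (hypothetical in-memory layout).
    """
    scored = [(book_id, cosine_similarity(query, emb)) for book_id, emb in books.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def build_graph(root_id, books, neighbors=3, depth=2):
    """Expand outward from root_id, linking each visited book to its nearest neighbors."""
    nodes = [root_id]
    seen = {root_id}
    edges = []
    frontier = [root_id]
    for _ in range(depth):
        next_frontier = []
        for current in frontier:
            # Exclude the book itself so it is not its own nearest neighbor.
            others = {k: v for k, v in books.items() if k != current}
            for other_id, score in top_n(books[current], others, neighbors):
                edges.append((current, other_id, score))
                if other_id not in seen:
                    seen.add(other_id)
                    nodes.append(other_id)
                    next_frontier.append(other_id)
        frontier = next_frontier
    return {"nodes": nodes, "edges": edges}


# Toy two-dimensional embeddings, for illustration only.
books = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
ranked = top_n([1.0, 0.0], books, n=2)
graph = build_graph("a", books, neighbors=1, depth=1)
```

With a real corpus, the ranking would more likely run inside the database than over an in-memory dictionary; the sketch only shows the shape of the computation.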
The script for generating embeddings and populating the Supabase database is located in the scripts directory.
- populate_database.ipynb: A Jupyter notebook that contains the code to process the books, generate embeddings, and populate the Supabase database. This notebook is intended to run on Google Colab.
The pg_catalog.csv file, located in the scripts/data directory, contains the list of books and their metadata used for book processing.
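A minimal sketch of loading the catalog with Python's standard `csv` module is shown below. The column names in the sample (`Text#`, `Title`, `Language`) are assumptions for illustration; check the header of the actual `pg_catalog.csv` before relying on them.

```python
import csv
import io


def load_catalog(csv_file):
    """Read catalog rows as dicts keyed by whatever header the CSV provides."""
    reader = csv.DictReader(csv_file)
    return [dict(row) for row in reader]


# Illustrative two-row sample; the real file's columns may differ.
SAMPLE = (
    "Text#,Title,Language\n"
    "11,Alice's Adventures in Wonderland,en\n"
    "1342,Pride and Prejudice,en\n"
)

rows = load_catalog(io.StringIO(SAMPLE))
# Example filter: keep only rows whose (assumed) Language column is "en".
english = [r for r in rows if r.get("Language") == "en"]
```

To read the real file, pass an open file handle instead of the in-memory sample, for example `open("scripts/data/pg_catalog.csv", newline="", encoding="utf-8")`.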
- Supabase CLI
- Python 3.7 or higher
- Jupyter Notebook (for running populate_database.ipynb)
- Google Colab account
- Sign up for Supabase and create a new project.
- Note your Supabase URL and API Key.
- Set up your database schema to store the documents and embeddings. You can use the following schema as a starting point:
  CREATE TABLE documents (
    id uuid PRIMARY KEY,
    content text,
    embeddings float8[],
    metadata jsonb
  );
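To make the schema concrete, the sketch below builds one row whose keys mirror the four columns of the `documents` table. The helper name `make_document_row` and the sample values are hypothetical, and the actual insert into Supabase is deliberately left out because the notebook's approach is not described here.

```python
import json
import uuid


def make_document_row(content, embedding, metadata):
    """Build a dict shaped like one row of the documents table."""
    if not embedding:
        raise ValueError("embedding must contain at least one value")
    return {
        "id": str(uuid.uuid4()),                       # uuid PRIMARY KEY
        "content": content,                            # text
        "embeddings": [float(x) for x in embedding],   # float8[]
        "metadata": dict(metadata),                    # jsonb
    }


# Illustrative values only; real embeddings have many more dimensions.
row = make_document_row("Call me Ishmael.", [0.1, 0.2, 0.3], {"title": "Moby Dick"})
payload = json.dumps(row)  # JSON-serializable, ready to send to an API
```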
- Install the Supabase CLI:
  npm install -g supabase
- Log in to Supabase:
  supabase login
- Initialize the Supabase project:
  supabase init
- Navigate to the root directory of your project.
- Deploy the search function:
  supabase functions deploy search
- Deploy the graph function:
  supabase functions deploy graph
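Once deployed, each function is served under the project's `/functions/v1/` path. The sketch below shows how a Python client might build requests for both functions using only the standard library. The project URL `https://example.supabase.co` is a placeholder, the `Authorization` header assumes the functions are deployed with Supabase's default JWT verification, and the request shapes follow this README's curl examples rather than the functions' source.

```python
import json
import urllib.parse
import urllib.request


def functions_url(project_url, name):
    """Build the endpoint URL for a deployed Edge Function."""
    return f"{project_url.rstrip('/')}/functions/v1/{name}"


def search_request(project_url, api_key, embeddings, top_n=10):
    """Prepare (but do not send) a POST request for the search function."""
    body = json.dumps({"embeddings": embeddings, "topN": top_n}).encode("utf-8")
    return urllib.request.Request(
        functions_url(project_url, "search"),
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: JWT verification is on
        },
    )


def graph_request(project_url, api_key, book_id):
    """Prepare (but do not send) a GET request for the graph function."""
    query = urllib.parse.urlencode({"book_id": book_id})
    return urllib.request.Request(
        f"{functions_url(project_url, 'graph')}?{query}",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Placeholder project URL and key; substitute your own values.
search_req = search_request("https://example.supabase.co", "YOUR_API_KEY", [[0.1, 0.2]])
graph_req = graph_request("https://example.supabase.co", "YOUR_API_KEY", 1234)
```

Passing either request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without network access.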
- Upload the pg_catalog.csv file located in scripts/data to your Google Drive.
- Open populate_database.ipynb in Google Colab.
- Follow the instructions in the notebook to process the books, generate embeddings, and populate your Supabase database.
To search for the top 10 books based on a question, make a POST request to the search function endpoint with the embeddings:
  curl -X POST https://rgjkrflnxopeixwpsjae.supabase.co/functions/v1/search -H "Content-Type: application/json" -d '{"embeddings": [[0.1, 0.2, ...]], "topN": 10}'
To get a graph of similar books based on a book ID, make a GET request to the graph function endpoint:
  curl "https://rgjkrflnxopeixwpsjae.supabase.co/functions/v1/graph?book_id=1234"
This project is licensed under the GNU General Public License