pgai-timescale/llms.txt at main · fullstackinfo/pgai-timescale · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
# pgai

> Supercharge your PostgreSQL database with AI capabilities. Supports automatic creation and synchronization of vector embeddings, seamless vector and semantic search, Retrieval Augmented Generation (RAG) directly in SQL, and ability to call leading LLMs like OpenAI, Ollama, Cohere, and more via SQL.

pgai is a PostgreSQL extension that provides AI capabilities within your database. It consists of two main components:

1. The core PostgreSQL extension which allows you to call various LLM models directly from SQL for tasks like generating content, vector search, and RAG applications.

2. The pgai Vectorizer system which automatically creates and synchronizes vector embeddings for your data using a worker process that runs alongside your database.

The project is maintained by Timescale, a PostgreSQL database company.

## Installation

- [Install pgai with Docker](projects/extension/docs/install/docker.md): Detailed instructions for installing the pgai extension using Docker containers
- [Install the pgai extension from source](projects/extension/docs/install/source.md): Instructions for installing the pgai extension from source on macOS and Linux

## Core pgai Extension

- [Text to SQL](docs/structured_retrieval/text_to_sql.md): Describes how to use the Text to SQL feature of the pgai PostgreSQL extension, which uses LLMs to generate SQL queries.
- [Chunk text with SQL functions](docs/utils/chunking.md): Describes SQL functions to split text into smaller chunks using character-based and recursive text splitting.
- [Use pgai with Anthropic](projects/extension/docs/model_calling/anthropic.md): Describes how to use the pgai extension with the Anthropic AI provider to list models and generate responses.
- [Use pgai with Cohere](projects/extension/docs/model_calling/cohere.md): Describes how to use the Cohere AI provider with the pgai extension for PostgreSQL
- [Delayed embed](projects/extension/docs/model_calling/delayed_embed.md): Describes how to use the pgai extension to run background actions for delayed vector embedding processing, with security considerations.
- [Moderate](projects/extension/docs/model_calling/moderate.md): Describes how to moderate comments using OpenAI in a PostgreSQL extension called pgai
- [Use pgai with Ollama](projects/extension/docs/model_calling/ollama.md): Describes how to use the pgai extension with the Ollama AI platform to generate text, embeddings, and structured outputs.
- [Use pgai with OpenAI](projects/extension/docs/model_calling/openai.md): Describes how to use the pgai extension to interact with the OpenAI API for text processing tasks like embedding, moderation, and summarization.
- [Use pgai with Voyage AI](projects/extension/docs/model_calling/voyageai.md): Describes how to use the pgai extension with the Voyage AI API to generate embeddings in your PostgreSQL database.
- [Handling API keys](projects/extension/docs/security/handling-api-keys.md): Describes how to configure API keys for the pgai extension, which integrates with OpenAI APIs
- [Privileges](projects/extension/docs/security/privileges.md): Describes the ai.grant_ai_usage function for managing permissions to use the pgai extension's AI features
- [Load dataset from Hugging Face](projects/extension/docs/utils/load_dataset_from_huggingface.md): Describes how to load datasets from Hugging Face's datasets library directly into a PostgreSQL database using the pgai extension.

## pgai Vectorizer

The pgai Vectorizer is a system that automatically creates and synchronizes vector embeddings for your data. It requires a worker process that runs alongside your database.

- [Vectorizer quick start](docs/vectorizer-quick-start.md): Quickstart guide for creating an Ollama-based vectorizer in a self-hosted Postgres instance with pgai
- [Adding a Vectorizer embedding integration](docs/vectorizer/adding-embedding-integration.md): Describes how to add a new vectorizer embedding integration to the pgai extension
- [pgai Vectorizer API reference](docs/vectorizer/api-reference.md): Detailed API reference for the pgai Vectorizer, including configuration options for embedding, indexing, and other features.
- [Document embeddings in pgai](docs/vectorizer/document-embeddings.md): Comprehensive guide to document embedding generation and management in the pgai extension, including support for OpenAI and Ollama embeddings.
- [Automate AI embedding with pgai Vectorizer](docs/vectorizer/overview.md): Automate AI embedding with the pgai Vectorizer, supporting Ollama, Voyage AI, and OpenAI as embedding providers.
- [Creating vectorizers from python](docs/vectorizer/python-integration.md): Documentation on creating vectorizers from Python and integrating pgai's Vectorizer with SQLAlchemy
- [Vectorizer quick start with Ollama](docs/vectorizer/quick-start-ollama.md): Quick start guide for using the pgai Vectorizer feature with the Ollama AI provider
- [Vectorizer quick start with OpenAI](docs/vectorizer/quick-start-openai.md): Quickly create a vectorizer in Postgres using the pgai extension and OpenAI for embedding generation.
- [Vectorizer quick start with VoyageAI](docs/vectorizer/quick-start-voyage.md): Quickly set up a local development environment and create a Voyage AI-powered vectorizer for semantic search on Postgres
- [Vectorizer quick start](docs/vectorizer/quick-start.md): Quickly set up a local development environment and create a pgai Vectorizer to automatically embed data in Postgres.
- [S3 integration guide](docs/vectorizer/s3-documents.md): Guide for integrating pgai with AWS S3 for self-hosted and Timescale Cloud installations
- [Running on Timescale Cloud](docs/vectorizer/worker.md): Documentation for running the pgai Vectorizer worker on Timescale Cloud or self-hosted Postgres
- [Use pgai with LiteLLM](projects/extension/docs/model_calling/litellm.md): Describes how to use the pgai extension with the LiteLLM AI model provider

## Optional

- [Anonymize Entities](examples/anonymize_entities.sql): The `anonymize_entities.sql` example demonstrates how to anonymize extracted entities from text using the pgai PostgreSQL extension.
- [Bg Worker Moderate](examples/bg_worker_moderate.sql): This example demonstrates how to use the OpenAI moderation API to check and moderate comments in a PostgreSQL database.
- [Delayed Embed](examples/delayed_embed.sql): This example demonstrates how to use the pgai extension to generate text embeddings and store them in a PostgreSQL table.
- [Discord Bot Documentation](examples/discord_bot/README.md): Documentation for the discord bot example.
- [Embeddings From Documents Documentation](examples/embeddings_from_documents/README.md): Documentation for the embeddings from documents example.
- [Litellm Vectorizer Documentation](examples/evaluations/litellm_vectorizer/README.md): Documentation for the litellm vectorizer example.
- [Ollama Vectorizer Documentation](examples/evaluations/ollama_vectorizer/README.md): Documentation for the ollama vectorizer example.
- [Voyage Vectorizer Documentation](examples/evaluations/voyage_vectorizer/README.md): Documentation for the voyage vectorizer example.
- [Extract Entities](examples/extract_entities.sql): This example demonstrates how to extract and replace named entities in text with their corresponding entity types using the pgai PostgreSQL extension.
- [Best Embedding Model Rag App](examples/finding_best_open_source_embedding_model/best_embedding_model_rag_app.ipynb): This example demonstrates how to evaluate and select the best open-source embedding model for a Retrieval-Augmented Generation (RAG) application using Ollama and the pgai PostgreSQL extension.
- [Rag Report Demo Why Some Customers Donot Like Pizza](examples/rag_report_demo_why_some_customers_donot_like_pizza.sql): This example demonstrates how to use the pgai PostgreSQL extension to generate a report analyzing why some customers do not like pizza, based on customer comments.
- [Simple Fastapi App Documentation](examples/simple_fastapi_app/README.md): Documentation for the simple fastapi app example.
- [Summarize Article](examples/summarize_article.sql): This example demonstrates how to call the Anthropic API to generate a summary of a given article text and return the extracted summary.
- [Text To Sql Documentation](examples/text_to_sql/README.md): Documentation for the text to sql example.
- [Gen Desc](examples/text_to_sql/gen_desc.sql): The `gen_desc.sql` file demonstrates how to generate textual descriptions for tables in the `postgres_air` schema using the pgai extension.
- [Gen Embed](examples/text_to_sql/gen_embed.sql): This example demonstrates how to generate text embeddings for object and SQL descriptions using the pgai PostgreSQL extension.
- [Trigger Moderate](examples/trigger_moderate.sql): This example demonstrates a simple SQL trigger for moderating content in a PostgreSQL database.