1 change: 1 addition & 0 deletions demos/README.md
@@ -26,4 +26,5 @@ Once the introduction notebook is complete, you can explore the other notebooks:
- [SF_TPCH_q1.ipynb](notebooks/SF_TPCH_q1.ipynb) demonstrates how to connect a Snowflake database with PyDough.
- [MySQL_TPCH.ipynb](notebooks/MySQL_TPCH.ipynb) demonstrates how to connect a MySQL database with PyDough.
- [PG_TPCH.ipynb](notebooks/PG_TPCH.ipynb) demonstrates how to connect a Postgres database with PyDough.
- [Oracle_TPCH.ipynb](notebooks/Oracle_TPCH.ipynb) demonstrates how to connect an Oracle database with PyDough.

317 changes: 317 additions & 0 deletions demos/notebooks/Oracle_TCPH.ipynb
@@ -0,0 +1,317 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d1cd6a33",
"metadata": {},
"source": [
"# Oracle PyDough Database Connector"
]
},
{
"cell_type": "markdown",
"id": "b190b0ef",
"metadata": {},
"source": [
"> ## 🚀 Initial Setup\n",
">\n",
"> ---\n",
">\n",
"> ### 1️⃣ Oracle Database\n",
">\n",
"> You can connect to your **own Oracle database** using your credentials — for example, if you have **Oracle Database Software** or another local server running.\n",
">\n",
"> ---\n",
">\n",
"> ### 2️⃣ Docker Image (TPC-H Database)\n",
">\n",
"> You can also test with our **pre-built Oracle TPC-H database** available on **Docker Hub**.\n",
">\n",
"> #### 📋 Requirements\n",
"> - Make sure you have **Docker Desktop** installed and running.\n",
">\n",
"> #### 📦 Pull and Run the Container\n",
"> ```bash\n",
"> docker run -d --name [CONTAINER_NAME] \\\n",
"> --platform linux/amd64 \\\n",
"> -e ORACLE_PWD=[PASSWORD] \\\n",
"> -p 1521:1521 \\\n",
"> bodoai1/pydough-oracle-tpch:latest\n",
"> ```\n",
"> - Replace `[CONTAINER_NAME]` with your preferred container name. \n",
"> - Replace `[PASSWORD]` with your preferred password.\n",
">\n",
"> *(Make sure the `1521` port is available and not being used by another container.)* \n",
"> \n",
"> ---\n",
">\n",
"> #### 🔑 Environment Variables\n",
"> To connect to this database, use:\n",
"> ```env\n",
"> ORACLE_USERNAME=tpch\n",
"> ORACLE_PASSWORD=[PASSWORD]\n",
"> ```\n",
"> *(Make sure `[PASSWORD]` matches the one you used when running the container.)*\n",
">\n",
"> ---\n",
">\n",
"> 💡 **Tip:** \n",
"> Store these credentials in a `.env` file in your project directory for easy access and security.\n",
">\n",
"> Example `.env` file:\n",
"> ```env\n",
"> ORACLE_USERNAME=tpch\n",
"> ORACLE_PASSWORD=mysecretpassword\n",
"> ```\n",
">\n",
">\n",
"> #### Deleting the container and image\n",
"> Once you have finished testing, stop the container and remove it along with its image using the following Docker commands:\n",
"> ```bash\n",
"> docker stop [CONTAINER_NAME]\n",
"> docker rm [CONTAINER_NAME]\n",
"> docker rmi bodoai1/pydough-oracle-tpch\n",
"> ```"
]
},
{
"cell_type": "markdown",
"id": "097cba60",
"metadata": {},
"source": [
"> ## 🔌 Installing Oracle Connector\n",
">\n",
"> Make sure to have the **`python-oracledb`** package installed:\n",
">\n",
"> ---\n",
">\n",
"> - **If you're working inside the repo**:\n",
"> ```bash\n",
"> pip install -e \".[oracle]\"\n",
"> ```\n",
">\n",
"> - **Or install the connector directly**:\n",
"> ```bash\n",
"> python -m pip install oracledb --upgrade\n",
"> ```"
]
},
{
"cell_type": "markdown",
"id": "1f39b2af",
"metadata": {},
"source": [
"> ## Importing Required Libraries\n",
">\n",
"> ---"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b473d180",
"metadata": {},
"outputs": [],
"source": [
"import pydough\n",
"import datetime\n",
"import os"
]
},
{
"cell_type": "markdown",
"id": "6c595441",
"metadata": {},
"source": [
"> ## 🔑 Loading Credentials and Connecting to Oracle with PyDough\n",
">\n",
"> ---\n",
">\n",
"> ### 1️⃣ Load Credentials from a Local `.env` File\n",
"> - The `.env` file contains your Oracle login details, for example:\n",
"> ```env\n",
"> ORACLE_PASSWORD=mypassword\n",
"> ```\n",
"> - These are read in Python using:\n",
"> ```python\n",
"> import os\n",
"> password = os.getenv(\"ORACLE_PASSWORD\")\n",
"> ```\n",
">\n",
"> ---\n",
">\n",
"> ### 2️⃣ Oracle-PyDough `connect_database()` Parameters\n",
"> - **`user`** *(required)*: Username for Oracle connection. \n",
"> - **`password`** *(required)*: Password for the Oracle connection. \n",
"> - **`service_name`** *(required)*: Oracle database service name. \n",
"> - **`host`** *(optional)*: Hostname or IP address of the Oracle server. Default: `\"localhost\"` or `\"127.0.0.1\"`. \n",
"> - **`port`** *(optional)*: Port number of the Oracle server. Default: `1521`. \n",
"> - **`tcp_connect_timeout`** *(optional)*: Timeout in seconds for Oracle connection. Default: `3`. \n",
"> - **`attempts`** *(optional)*: Number of times the connection is attempted. Default: `1`. \n",
"> - **`delay`** *(optional)*: Seconds to wait before another connection attempt. Default: `2`. \n",
">\n",
"> ---\n",
">\n",
"> ### 3️⃣ Connect to Oracle Using PyDough\n",
"> - `pydough.active_session.load_metadata_graph(...)` \n",
"> Loads a metadata graph mapping your Oracle schema (used for query planning/optimizations). \n",
"> - `connect_database(...)` \n",
"> Uses the loaded credentials to establish a live connection to your Oracle database.\n",
">\n",
"> ---\n",
">\n",
"> **⚠️ Notes:** \n",
"> - Ensure the `.env` exists and contains **all required keys**. \n",
"> - The metadata graph path must point to a **valid JSON file** representing your schema."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7487b588",
"metadata": {},
"outputs": [],
"source": [
"oracle_user = \"tpch\"\n",
"oracle_password = os.getenv(\"ORACLE_PASSWORD\")\n",
"oracle_service_name = \"FREEPDB1\"\n",
"oracle_host = \"127.0.0.1\"\n",
"oracle_port = 1521\n",
"connection_timeout = 2\n",
"attempts = 2 \n",
"delay = 5.0 \n",
"\n",
"pydough.active_session.load_metadata_graph(\"../../tests/test_metadata/sample_graphs.json\", \"TPCH\")\n",
"pydough.active_session.connect_database(\"oracle\", \n",
" user=oracle_user,\n",
" password=oracle_password,\n",
" service_name=oracle_service_name,\n",
" host=oracle_host,\n",
" port=oracle_port,\n",
" tcp_connect_timeout=connection_timeout,\n",
" attempts=attempts,\n",
" delay=delay\n",
")"
]
},
{
"cell_type": "markdown",
"id": "305e11ec",
"metadata": {},
"source": [
"> ## ✨ Enabling PyDough's Jupyter Magic Commands\n",
">\n",
"> ---\n",
">\n",
"> This step loads the **`pydough.jupyter_extensions`** module, which adds custom magic commands (like `%%pydough`) to your notebook.\n",
">\n",
"> ---\n",
">\n",
"> ### 📌 What These Magic Commands Do\n",
"> - **Write PyDough directly** in notebook cells using:\n",
"> ```python\n",
"> %%pydough\n",
"> ```\n",
"> - **Automatically render** query results inside the notebook.\n",
">\n",
"> ---\n",
">\n",
"> ### 💻 How It Works\n",
"> This is a **Jupyter-specific feature** — the `%load_ext` command dynamically loads these extensions into your **current notebook session**:\n",
"> ```python\n",
"> %load_ext pydough.jupyter_extensions\n",
"> ```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "93dde776",
"metadata": {},
"outputs": [],
"source": [
"%load_ext pydough.jupyter_extensions"
]
},
{
"cell_type": "markdown",
"id": "d9b9d04a",
"metadata": {},
"source": [
"> ## 📊 Running TPC-H Query 1 with PyDough in Oracle\n",
">\n",
"> ---\n",
">\n",
"> This cell runs **TPC-H Query 1** using **PyDough**.\n",
">\n",
"> ---\n",
">\n",
"> ### 📝 What the Query Does\n",
"> - **Computes summary statistics**: sums, averages, and counts over line items. \n",
"> - **Groups by**: `return_flag` and `status`. \n",
"> - **Filters by**: a shipping date cutoff. \n",
">\n",
"> ---\n",
">\n",
"> ### 📤 Output\n",
"> - `pydough.to_df(output)` converts the result to a **Pandas DataFrame**. \n",
"> - This makes it easy to inspect and analyze results directly in Python. \n",
">\n",
"> ---\n",
">"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "86b45425",
"metadata": {},
"outputs": [],
"source": [
"%%pydough\n",
"# TPCH Q1\n",
"output = (lines.WHERE((ship_date <= datetime.date(1998, 12, 1)))\n",
" .PARTITION(name=\"groups\", by=(return_flag, status))\n",
" .CALCULATE(\n",
" L_RETURNFLAG=return_flag,\n",
" L_LINESTATUS=status,\n",
" SUM_QTY=SUM(lines.quantity),\n",
" SUM_BASE_PRICE=SUM(lines.extended_price),\n",
" SUM_DISC_PRICE=SUM(lines.extended_price * (1 - lines.discount)),\n",
" SUM_CHARGE=SUM(\n",
" lines.extended_price * (1 - lines.discount) * (1 + lines.tax)\n",
" ),\n",
" AVG_QTY=AVG(lines.quantity),\n",
" AVG_PRICE=AVG(lines.extended_price),\n",
" AVG_DISC=AVG(lines.discount),\n",
" COUNT_ORDER=COUNT(lines),\n",
" )\n",
" .ORDER_BY(L_RETURNFLAG.ASC(), L_LINESTATUS.ASC())\n",
")\n",
"\n",
"pydough.to_df(output)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "PyDough",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
33 changes: 32 additions & 1 deletion documentation/usage.md
@@ -345,7 +345,9 @@ Below is a list of all supported values for the database name:

- `snowflake`: uses a Snowflake database. [See here](https://docs.snowflake.com/en/user-guide/python-connector.html#connecting-to-snowflake) for details on the connection API and what keyword arguments can be passed in.

- `postgres` or `postgres`: uses a Postgres database. [See here](https://www.psycopg.org/docs/) for details on the connection API and what keyword arguments can be passed in.
- `postgres`: uses a Postgres database. [See here](https://www.psycopg.org/docs/) for details on the connection API and what keyword arguments can be passed in.

- `oracle`: uses an Oracle database. [See here](https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html) for details on the connection API and what keyword arguments can be passed in.

> Note: If you installed PyDough via pip, you can install optional connectors using pip extras:
>
@@ -364,6 +366,7 @@ Here’s a quick reference table showing which connector is needed for each dial
| `mysql` | `mysql-connector-python` |
| `snowflake` | `snowflake-connector-python[pandas]` |
| `postgres` | `psycopg2-binary` |
| `oracle` | `python-oracledb` |
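
Before calling `connect_database`, it can help to confirm that the driver for a dialect is importable. The following is a minimal sketch using only the standard library. The import names (`mysql.connector`, `snowflake.connector`, `psycopg2`, `oracledb`) are the modules these packages are generally known to install, not something stated in this table, and `connector_available` is a hypothetical helper that is not part of PyDough.

```python
import importlib.util

# Assumed import names for each dialect's driver package (not taken from PyDough).
DRIVER_MODULES = {
    "mysql": "mysql.connector",
    "snowflake": "snowflake.connector",
    "postgres": "psycopg2",
    "oracle": "oracledb",
}


def connector_available(dialect: str) -> bool:
    """Return True if the driver module for `dialect` can be located."""
    module_name = DRIVER_MODULES[dialect]
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing.
        return False


for name in DRIVER_MODULES:
    status = "installed" if connector_available(name) else "missing"
    print(f"{name}: {status}")
```

A dialect reported as missing would need its connector installed before a connection can be opened.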

Below are examples of how to access the context and switch it out for a newly created one, either by manually setting it or by using `session.load_database`. These examples assume that there are two different sqlite database files located at `db_files/education.db` and `db_files/shakespeare.db`.

@@ -439,6 +442,34 @@ You can find a full example of using MySQL database with PyDough in [this usage
```
You can find a full example of using Postgres database with PyDough in [this usage guide](./../demos/notebooks/PG_TPCH.ipynb).

- Oracle: You can connect to an Oracle database using the `load_metadata_graph` and `connect_database` APIs. For example:
```py
pydough.active_session.load_metadata_graph("../../tests/test_metadata/sample_graphs.json", "TPCH")
pydough.active_session.connect_database("oracle",
    user=oracle_user,
    password=oracle_password,
    host=oracle_host,
    port=oracle_port,
    service_name=oracle_service_name,
)
```
You can also pass `dsn` instead of `host`, `port`, and `service_name`.

Example with a connection object:
```py
import oracledb

pydough.active_session.load_metadata_graph("../../tests/test_metadata/sample_graphs.json", "TPCH")
# Open the connection with python-oracledb; the database is identified by `service_name`.
oracle_conn: oracledb.Connection = oracledb.connect(
    user=oracle_user,
    password=oracle_password,
    host=oracle_host,
    port=oracle_port,
    service_name=oracle_service_name,
)
pydough.active_session.connect_database("oracle", connection=oracle_conn)
```
You can find a full example of using an Oracle database with PyDough in [this usage guide](./../demos/notebooks/Oracle_TPCH.ipynb).
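
As a rough illustration of the `dsn` alternative, the sketch below reads connection settings from environment variables and assembles an Easy Connect style string (`host:port/service_name`). The helper `build_easy_connect_dsn` and the variable names `ORACLE_HOST`, `ORACLE_PORT`, and `ORACLE_SERVICE_NAME` are assumptions made for illustration. Whether PyDough forwards `dsn` unchanged to the driver is not confirmed here.

```python
import os


def build_easy_connect_dsn(host: str, port: int, service_name: str) -> str:
    """Assemble an Easy Connect string of the form host:port/service_name."""
    return f"{host}:{port}/{service_name}"


# Hypothetical environment variable names; the defaults mirror the demo notebook.
oracle_host = os.getenv("ORACLE_HOST", "127.0.0.1")
oracle_port = int(os.getenv("ORACLE_PORT", "1521"))
oracle_service_name = os.getenv("ORACLE_SERVICE_NAME", "FREEPDB1")

dsn = build_easy_connect_dsn(oracle_host, oracle_port, oracle_service_name)
print(dsn)
```

The resulting string could then be passed as `dsn=dsn` alongside `user` and `password`.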

<!-- TOC --><a name="evaluation-apis"></a>
## Evaluation APIs

2 changes: 2 additions & 0 deletions pydough/database_connectors/README.md
@@ -19,6 +19,7 @@ The database connectors module provides functionality to manage database connect
- `SNOWFLAKE`: Represents the Snowflake SQL dialect.
- `MYSQL`: Represents the MySQL dialect.
- `POSTGRES`: Represents the Postgres dialect.
- `ORACLE`: Represents the Oracle dialect.
- `DatabaseContext`: Dataclass that manages the database connection and the corresponding dialect.
- Fields:
- `connection`: The `DatabaseConnection` object.
@@ -35,6 +36,7 @@ The database connectors module provides functionality to manage database connect
- `load_snowflake_connection`: Loads a Snowflake connection.
- `load_mysql_connection`: Loads a MySQL database connection.
- `load_postgres_connection`: Loads a Postgres database connection.
- `load_oracle_connection`: Loads an Oracle database connection.
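
The demo notebook in this change describes `attempts` and `delay` parameters for the Oracle connection. How `load_oracle_connection` applies them is not shown in this diff, so the following is only a sketch of one plausible retry loop. `connect_with_retries` and `flaky_connect` are hypothetical names, not PyDough functions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T], attempts: int = 1, delay: float = 2.0
) -> T:
    """Call `connect` up to `attempts` times, sleeping `delay` seconds between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception as exc:
            last_error = exc
            # Only wait if another attempt will follow.
            if attempt < attempts:
                time.sleep(delay)
    raise ConnectionError(f"failed after {attempts} attempt(s)") from last_error


# Stand-in for a real driver call: fails once, then succeeds.
calls = {"count": 0}


def flaky_connect() -> str:
    calls["count"] += 1
    if calls["count"] < 2:
        raise OSError("listener not ready")
    return "connected"


print(connect_with_retries(flaky_connect, attempts=2, delay=0.0))
```

Passing the connection step as a callable keeps the retry policy separate from any particular driver.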

## Usage
