diff --git a/.DS_Store b/.DS_Store new file mode 100644 index 000000000..4c5882b5b Binary files /dev/null and b/.DS_Store differ diff --git a/.readthedocs.yaml b/.readthedocs.yaml new file mode 100644 index 000000000..77b6f73c2 --- /dev/null +++ b/.readthedocs.yaml @@ -0,0 +1,28 @@ +# .readthedocs.yaml +# Read the Docs configuration file +# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details + +# Required +version: 2 + +# Set the version of Python and other tools you might need +build: + os: ubuntu-22.04 + tools: + python: "3.11" + # You can also specify other tool versions: + # nodejs: "19" + # rust: "1.64" + # golang: "1.19" + +# Build documentation in the docs/ directory with Sphinx +sphinx: + configuration: conf.py + +# If using Sphinx, optionally build your docs in additional formats such as PDF +formats: [htmlzip,pdf] + +# Optionally declare the Python requirements required to build your docs +python: + install: + - requirements: requirements.txt \ No newline at end of file diff --git a/404.rst b/404.rst index aa18f4738..44f92acfd 100644 --- a/404.rst +++ b/404.rst @@ -1,18 +1,14 @@ :orphan: -********************************* - Couldn't find the page - 404 -********************************* +************************** +Page Cannot Be Found - 404 +************************** -Unfortunately we could not find this page. - -Use the **search bar**, or use the navigation sidebar to find what you're looking for. - -.. rubric:: Looking for the old documentation? - -If you're looking for an older version of the documentation, versions 1.10 through 2019.2.1 are available at http://previous.sqream.com . +Use the **Search docs** bar, or use the navigation sidebar to find what you're looking for. .. rubric:: Need help? -If you couldn't find what you're looking for, we're always happy to help. Visit `SQream's support portal `_ for additional support. +If you couldn't find what you're looking for, we're always happy to help. + +Visit the `SQreamDB support portal `_ for additional help. 
diff --git a/_static/css/custom.css b/_static/css/custom.css index 7823005ee..399293607 100644 --- a/_static/css/custom.css +++ b/_static/css/custom.css @@ -57,3 +57,4 @@ div.rst-versions > div.rst-other-versions > div.injected > dl:nth-child(4) { display: none; } +} diff --git a/_static/images/New_Dark_Gray.png b/_static/images/New_Dark_Gray.png new file mode 100644 index 000000000..34ac016eb Binary files /dev/null and b/_static/images/New_Dark_Gray.png differ diff --git a/_static/images/SAP_BO.png b/_static/images/SAP_BO.png new file mode 100644 index 000000000..413ce7d0e Binary files /dev/null and b/_static/images/SAP_BO.png differ diff --git a/_static/images/SAP_BO_2.png b/_static/images/SAP_BO_2.png new file mode 100644 index 000000000..91bc53d1a Binary files /dev/null and b/_static/images/SAP_BO_2.png differ diff --git a/_static/images/SQDBArchitecture.png b/_static/images/SQDBArchitecture.png new file mode 100644 index 000000000..961cdd45d Binary files /dev/null and b/_static/images/SQDBArchitecture.png differ diff --git a/_static/images/SQream_logo_without background-15.png b/_static/images/SQream_logo_without background-15.png new file mode 100644 index 000000000..1a4460581 Binary files /dev/null and b/_static/images/SQream_logo_without background-15.png differ diff --git a/_static/images/aws_sqreamdb_architecture.png b/_static/images/aws_sqreamdb_architecture.png new file mode 100644 index 000000000..5d464318a Binary files /dev/null and b/_static/images/aws_sqreamdb_architecture.png differ diff --git a/_static/images/chunks_and_extents.png b/_static/images/chunks_and_extents.png index bb092fab7..972b624e3 100644 Binary files a/_static/images/chunks_and_extents.png and b/_static/images/chunks_and_extents.png differ diff --git a/_static/images/color_table.png b/_static/images/color_table.png new file mode 100644 index 000000000..b815f9616 Binary files /dev/null and b/_static/images/color_table.png differ diff --git a/_static/images/favicon-dark.png b/_static/images/favicon-dark.png new file mode 100644 index 000000000..241e68580 Binary files /dev/null and b/_static/images/favicon-dark.png differ diff --git a/_static/images/favicon-light.png b/_static/images/favicon-light.png new file mode 100644 index 000000000..dc73ebff4 Binary files /dev/null and b/_static/images/favicon-light.png differ diff --git a/_static/images/kafka_flow.png b/_static/images/kafka_flow.png new file mode 100644 index 000000000..c5bfc0ff2 Binary files /dev/null and b/_static/images/kafka_flow.png differ diff --git a/_static/images/monitor_service_example.png b/_static/images/monitor_service_example.png new file mode 100644 index 000000000..dd8623ab1 Binary files /dev/null and b/_static/images/monitor_service_example.png differ diff --git a/_static/images/new.png b/_static/images/new.png new file mode 100644 index 000000000..a0df8ff0f Binary files /dev/null and b/_static/images/new.png differ diff --git a/_static/images/new_2022.1.1.png b/_static/images/new_2022.1.1.png new file mode 100644 index 000000000..2ffb80039 Binary files /dev/null and b/_static/images/new_2022.1.1.png differ diff --git a/_static/images/new_2022.1.png b/_static/images/new_2022.1.png new file mode 100644 index 000000000..27b2d285a Binary files /dev/null and b/_static/images/new_2022.1.png differ diff --git a/_static/images/new_dark_gray_2022.1.1.png b/_static/images/new_dark_gray_2022.1.1.png new file mode 100644 index 000000000..6d290734a Binary files /dev/null and b/_static/images/new_dark_gray_2022.1.1.png differ diff --git 
a/_static/images/new_gray_2022.1.1.png b/_static/images/new_gray_2022.1.1.png new file mode 100644 index 000000000..7c6cd28db Binary files /dev/null and b/_static/images/new_gray_2022.1.1.png differ diff --git a/_static/images/sqream_db_internals.png b/_static/images/sqream_db_internals.png index 6b7b2b36b..d41ea5478 100644 Binary files a/_static/images/sqream_db_internals.png and b/_static/images/sqream_db_internals.png differ diff --git a/_static/images/sqream_db_table_crop.png b/_static/images/sqream_db_table_crop.png new file mode 100644 index 000000000..dcfe3bf46 Binary files /dev/null and b/_static/images/sqream_db_table_crop.png differ diff --git a/_static/images/storage_organization.png b/_static/images/storage_organization.png index 8cde6d70e..d6a06d763 100644 Binary files a/_static/images/storage_organization.png and b/_static/images/storage_organization.png differ diff --git a/_static/images/studio_icon_execution_details_view.png b/_static/images/studio_icon_execution_details_view.png new file mode 100644 index 000000000..b6665946d Binary files /dev/null and b/_static/images/studio_icon_execution_details_view.png differ diff --git a/_static/images/table_columns_storage.png b/_static/images/table_columns_storage.png index 322538dac..071e140ea 100644 Binary files a/_static/images/table_columns_storage.png and b/_static/images/table_columns_storage.png differ diff --git a/_static/samples/input.json b/_static/samples/input.json new file mode 100644 index 000000000..bfd11d94e --- /dev/null +++ b/_static/samples/input.json @@ -0,0 +1,57 @@ +{ + "totalNumberOfFragmentedChunks":{ + "from":0, + "to":100 + }, + "percentageStorageCapacity":{ + "from":0, + "to":0.9 + }, + "daysForLicenseExpire":{ + "from":60 + }, + "stuckSnapshots":{ + "from":0, + "to":2 + }, + "queriesInQueue":{ + "from":0, + "to":100 + }, + "availableWorkers":{ + "from":0, + "to":5 + }, + "nodeHeartbeatMsgMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "checkLocksMsgMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "keysAndValuesNMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "keysWithPrefixMsgMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "nodeHeartbeatMsgVariance":{ + "from":0, + "to":1000 + }, + "checkLocksMsgVariance":{ + "from":0, + "to":1000 + }, + "keysAndValuesNVariance":{ + "from":0, + "to":1000 + }, + "keysWithPrefixMsgVariance":{ + "from":0, + "to":1000 + } +} \ No newline at end of file diff --git a/architecture/aws_architecture.rst b/architecture/aws_architecture.rst new file mode 100644 index 000000000..378be596e --- /dev/null +++ b/architecture/aws_architecture.rst @@ -0,0 +1,16 @@ +:orphan: + +.. _aws_architecture: + +**************** +AWS Architecture +**************** + +Deploying SQreamDB on an AWS private cloud enables full management of sensitive data and transactions within a dedicated cloud environment. + +The following diagram describes how SQreamDB is deployed on AWS infrastructure. + +.. figure:: /_static/images/aws_sqreamdb_architecture.png + :scale: 60 % diff --git a/architecture/concurrency_and_scaling_in_sqream.rst b/architecture/concurrency_and_scaling_in_sqream.rst new file mode 100644 index 000000000..0b9025062 --- /dev/null +++ b/architecture/concurrency_and_scaling_in_sqream.rst @@ -0,0 +1,144 @@ +.. _concurrency_and_scaling_in_sqream: + +****** +Sizing +****** + +Concurrency and Scaling in SQreamDB +=================================== + +A SQreamDB cluster can execute one statement per worker process while also supporting the concurrent operation of multiple workers. +Utility functions with minimal resource requirements, such as :ref:`show_server_status`, :ref:`show_locks`, and :ref:`show_node_info`, will be executed regardless of the workload.
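Such utility functions can be issued at any time, even while every worker is busy executing statements. A minimal sketch, assuming a connected SQL client (output formats vary by version):

.. code-block:: sql

   -- Lightweight status checks run regardless of the current workload
   SELECT show_server_status();

   -- List the locks currently held across the cluster
   SELECT show_locks();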
+ +Minimum Resources Required Per Worker: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Component + - CPU Cores + - RAM (GB) + - Local Storage (GB) + * - Worker + - 8 + - 128 + - 10 + * - Metadata Server + - 16 cores per 100 Workers + - 20 GB RAM for every 1 trillion rows + - 10 + * - SQreamDB Acceleration Studio + - 16 + - 16 + - 50 + * - Server Picker + - 1 + - 2 + - + + +Lightweight operations, such as :ref:`copy_to` and :ref:`Clean-Up`, require 64 GB of RAM. + +Maximum Workers Per GPU: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - GPU + - Workers + * - NVIDIA Turing A10 (16GB) + - 1 + * - NVIDIA Volta V100 (32GB) + - 2 + * - NVIDIA Ampere A100 (40GB) + - 3 + * - NVIDIA Ampere A100 (80GB) + - 6 + * - NVIDIA Hopper H100 (80GB) + - 6 + * - NVIDIA Ada Lovelace L40S (48GB) + - 4 + + +.. tip:: Is your GPU not on the list? Visit `SQreamDB Support `_ for additional information. + + +Scaling When Data Sizes Grow +---------------------------- + +For many statements, SQreamDB scales linearly when adding more storage and querying on large data sets. It uses optimized 'brute force' algorithms and implementations, which don't suffer from sudden performance cliffs at larger data sizes. + +Scaling When Queries Are Queuing +-------------------------------- + +SQreamDB scales well by adding more workers, GPUs, and nodes to support more concurrent statements. + +What To Do When Queries Are Slow +-------------------------------- + +Adding more workers or GPUs does not boost the performance of a single statement or query. + +To boost the performance of a single statement, start by examining the :ref:`best practices` and ensure the guidelines are followed. + +Adding more RAM to nodes, using more GPU memory, and faster CPUs or storage can also sometimes help. + +.. _spooling: + +Spooling Configuration +====================== + +:math:`limitQueryMemoryGB=\frac{\text{Total RAM - Internal Operations - Metadata Server - Server Picker}}{\text{Number of Workers}}` + +:math:`spoolMemoryGB=limitQueryMemoryGB - 50GB` + +The ``limitQueryMemoryGB`` flag defines how much total system memory each worker may use for processing queries. Note that ``spoolMemoryGB`` must be set to less than ``limitQueryMemoryGB``. + +Example +------- + +Setting Spool Memory +~~~~~~~~~~~~~~~~~~~~ + +The provided examples assume a configuration with 2 TB of RAM, 8 workers running on 2 A100 (80GB) GPUs, and 200 GB allocated for Internal Operations, Metadata Server, Server Picker, and UI. With these numbers, ``limitQueryMemoryGB`` = (2000 - 200) / 8 = 225 GB, and ``spoolMemoryGB`` = 225 - 50 = 175 GB. + +Configuring the ``limitQueryMemoryGB`` using the Worker configuration file: + +.. code-block:: json + + { + "cluster": "/home/test_user/sqream_testing_temp/sqreamdb", + "gpu": 0, + "licensePath": "home/test_user/SQream/tests/license.enc", + "machineIP": "127.0.0.1", + "metadataServerIp": "127.0.0.1", + "metadataServerPort": 3105, + "port": 5000, + "useConfigIP": true, + "limitQueryMemoryGB": 225 + } + +Configuring the ``spoolMemoryGB`` using the legacy configuration file:
+ +.. code-block:: json + + { + "diskSpaceMinFreePercent": 10, + "enableLogDebug": false, + "insertCompressors": 8, + "insertParsers": 8, + "isUnavailableNode": false, + "logBlackList": "webui", + "logDebugLevel": 6, + "nodeInfoLoggingSec": 60, + "useClientLog": true, + "useMetadataServer": true, + "spoolMemoryGB": 175, + "waitForClientSeconds": 18000, + "enablePythonUdfs": true + }
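The same value can also be overridden for the current session only, without editing either configuration file. A minimal sketch using the session-level ``SET`` syntax described in the configuration guides (the value is illustrative and reverts to its previous setting when the session ends):

.. code-block:: sql

   -- Applies only to this session and is not persistent
   SET spoolMemoryGB=175;

   -- Verify the value now in effect
   SHOW spoolMemoryGB;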
+ +.. rubric:: Need help? + +Visit `SQreamDB Support `_ for additional information. diff --git a/architecture/filesystem_and_filesystem_usage.rst b/architecture/filesystem_and_filesystem_usage.rst index d1838d4e8..90b1c0a40 100644 --- a/architecture/filesystem_and_filesystem_usage.rst +++ b/architecture/filesystem_and_filesystem_usage.rst @@ -1,33 +1,33 @@ .. _filesystem_and_filesystem_usage: -******************************* -Filesystem and usage -******************************* +******************** +Filesystem and Usage +******************** -SQream DB writes and reads data from disk. +SQreamDB writes and reads data from disk. -The SQream DB storage directory, sometimes refered to as a **storage cluster** is a collection of database objects, metadata database, and logs. +The SQreamDB storage directory, sometimes referred to as a **storage cluster**, is a collection of database objects, metadata database, and logs. -Each SQream DB worker and the metadata server must have access to the storage cluster in order to function properly. +Each SQreamDB worker and the metadata server must have access to the storage cluster in order to function properly. .. _storage_cluster: Directory organization -============================ +====================== .. figure:: /_static/images/storage_organization.png -The **cluster root** is the directory in which all data for SQream DB is stored. +The **cluster root** is the directory in which all data for SQreamDB is stored. -.. contents:: SQream DB storage cluster directories +.. contents:: SQreamDB storage cluster directories :local: ``databases`` ----------------- +------------- The databases directory houses all of the actual data in tables and columns. -Each database is stored as it's own directory. Each table is stored under it's respective database, and columns are stored in their respective table. +Each database is stored as its own directory. Each table is stored under its respective database, and columns are stored in their respective table. .. figure:: /_static/images/table_columns_storage.png @@ -63,27 +63,27 @@ Each column directory will contain extents, which are collections of chunks. .. figure:: /_static/images/chunks_and_extents.png -``metadata`` or ``leveldb`` ---------------------------- +``metadata`` or ``rocksdb`` +--------------------------- -SQream DB's metadata is an embedded key-value store, based on LevelDB. LevelDB helps SQream DB ensure efficient storage for keys, handle atomic writes, snapshots, durability, and automatic recovery. +SQreamDB's metadata is an embedded key-value store, based on RocksDB. RocksDB helps SQreamDB ensure efficient storage for keys, handle atomic writes, snapshots, durability, and automatic recovery. The metadata is where all database objects are stored, including roles, permissions, database and table structures, chunk mappings, and more. ``temp`` ----------------- +-------- -The ``temp`` directory is where SQream DB writes temporary data. +The ``temp`` directory is where SQreamDB writes temporary data. -The directory to which SQream DB writes temporary data can be changed to any other directory on the filesystem. SQream recommends remapping this directory to a fast local storage to get better performance when executing intensive larger-than-RAM operations like sorting. SQream recommends an SSD or NVMe drive, in mirrored RAID 1 configuration. +The directory to which SQreamDB writes temporary data can be changed to any other directory on the filesystem. SQreamDB recommends remapping this directory to fast local storage to get better performance when executing intensive larger-than-RAM operations like sorting. SQreamDB recommends an SSD or NVMe drive, in mirrored RAID 1 configuration. -If desired, the ``temp`` folder can be redirected to a local disk for improved performance, by setting the ``tempPath`` setting in the :ref:`configuration` file. +If desired, the ``temp`` folder can be redirected to a local disk for improved performance by setting the ``tempPath`` setting in the :ref:`legacy configuration` file. ``logs`` ----------------- +-------- -The logs directory contains logs produced by SQream DB. +The logs directory contains logs produced by SQreamDB. See more about the logs in the :ref:`logging` guide. diff --git a/architecture/index.rst b/architecture/index.rst index 5e4d6c867..213700b35 100644 --- a/architecture/index.rst +++ b/architecture/index.rst @@ -1,21 +1,18 @@ .. _architecture: -*********************** -System Architecture -*********************** +************ +Architecture +************ -This topic includes guides that walk an end-user, database administrator, or system architect through the main ideas behind SQream DB. +The :ref:`internals_architecture`, :ref:`concurrency_and_scaling_in_sqream`, and :ref:`filesystem_and_filesystem_usage` guides are walk-throughs for end-users, database administrators, and system architects who wish to become familiar with the SQreamDB system and its unique capabilities. -While SQream DB has many similarities to other database management systems, it has some unique and additional capabilities. - -Explore the guides below for information about SQream DB's architecture. +.. figure:: /_static/images/sqream_db_table_crop.png + :scale: 60 % .. toctree:: - :maxdepth: 2 - :caption: In this section: - :glob: - :titlesonly: + :hidden: internals_architecture - xxprocesses_and_network_architecture filesystem_and_filesystem_usage + concurrency_and_scaling_in_sqream + diff --git a/architecture/internals_architecture.rst b/architecture/internals_architecture.rst index f25dfeb22..af4e21f73 100644 --- a/architecture/internals_architecture.rst +++ b/architecture/internals_architecture.rst @@ -1,95 +1,67 @@ .. _internals_architecture: -*************************** -Internals and architecture -*************************** +************************** +Internals and Architecture +************************** -SQream DB internals -============================== +Get to know SQreamDB's key functions and system architecture components, best practices, customization possibilities, and optimizations. -Here is a high level architecture diagram of SQream DB's internals. +SQreamDB leverages GPU acceleration as an essential component of its core database operations, significantly enhancing columnar data processing. This integral GPU utilization isn't an optional feature but is fundamental to a wide range of data tasks such as ``GROUP BY``, scalar functions, ``JOIN``, ``ORDER BY``, and more. This approach harnesses the inherent parallelism of GPUs, effectively employing a single instruction to process multiple values, akin to the Single-Instruction, Multiple Data (SIMD) concept, tailored for high-throughput operations.
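For instance, a single statement can exercise several of these GPU-accelerated operations at once. A minimal sketch with hypothetical table and column names:

.. code-block:: sql

   -- JOIN, GROUP BY with aggregates, and ORDER BY are among the
   -- operations SQreamDB accelerates on the GPU
   SELECT d.region,
          SUM(f.amount) AS total
   FROM fact_sales AS f
   JOIN dim_store AS d ON f.store_id = d.store_id
   GROUP BY d.region
   ORDER BY total DESC;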
.. figure:: /_static/images/sqream_db_internals.png - :alt: SQream DB internals + :align: left + :width: 75% + :alt: SQreamDB internals -Statement compiler ------------------------ -The statement compiler is written in Haskell. This takes SQL text and produces an optimised statement plan. +Concurrency and Admission Control +================================== -Concurrency and concurrency control ---------------------------------------- +The SQreamDB execution engine employs thread workers and message passing for its foundation. This threading approach enables the concurrent execution of diverse operations, seamlessly integrating IO and GPU tasks with CPU operations while boosting the performance of CPU-intensive tasks. -The execution engine in SQream DB is built around thread workers with message passing. It uses threads to overlap different kinds of operations (including IO and GPU operations with CPU operations), and to accelerate CPU intensive operations. +Learn more about :ref:`concurrency_and_scaling_in_sqream`. -Transactions -------------------- - -SQream DB has serializable transactions, with these features: - -* Serializable, with any kind of statement - -* Run multiple :ref:`SELECT queries` concurrently with anything +Statement Compiler +================== + +The statement compiler takes SQL text and produces an optimized statement plan. + +Transactions +============ + +SQreamDB supports serializable transactions: + +* Run multiple inserts to the same table at the same time -.. some of this might be better in another document, if you're reading to -.. understand how sqream performs, this is not the internal architecture -.. but something more directly important to a customer/user +* Cannot run multiple statements in a single transaction +* Other operations such as :ref:`delete`, :ref:`truncate`, and DDL use :ref:`coarse-grained exclusive locking`. diff --git a/architecture/processes_and_network_architecture.rst b/architecture/processes_and_network_architecture.rst deleted file mode 100644 index 8a458f8d5..000000000 --- a/architecture/processes_and_network_architecture.rst +++ /dev/null @@ -1,51 +0,0 @@ -.. _processes_and_network_architecture: - -************************************* -Processes and network architecture -************************************* - -A SQream DB installation contains several components: - -* SQream DB workers (``sqreamd``) -* Metadata daemon (``metadata_server``) -* Load balancer (``server_picker``) - - -.. - processes in sqream: - - metadatad - server picker - sqreamd - - monit system - - pacemaker system - - vip - - ui? - - dashboard? - - mention the command line utils here? - - network - - clients connecting, the wlm redirect - - structure of a embedded metadata sqream - simple (do we need to - mention this in the docs, or is it only for troubleshooting in - production) - - single node with separate sqreamds - connections between the - components, server picker, metadata - - multiple nodes - - basic network connections/routes needed - - what's also needed for the pacemaker component - + how the vip works - - diff --git a/conf.py b/conf.py index 95a622cd6..3c71b0373 100644 --- a/conf.py +++ b/conf.py @@ -10,25 +10,29 @@ # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here.
# - -# import os -# import sys -# sys.path.insert(0, os.path.abspath('.')) +import os -import sphinx_rtd_theme +# Define the canonical URL if you are using a custom domain on Read the Docs +html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "") +# Tell Jinja2 templates the build is running on Read the Docs +if os.environ.get("READTHEDOCS", "") == "True": + if "html_context" not in globals(): + html_context = {} + html_context["READTHEDOCS"] = True +import sphinx_rtd_theme +html_theme_path = [sphinx_rtd_theme.get_html_theme_path()] # -- Project information ----------------------------------------------------- -project = 'SQream DB' -copyright = '2022 SQream' -author = 'SQream Documentation' +project = 'SQreamDB' +copyright = '2025 SQreamDB' +author = 'SQreamDB Documentation' # The full version, including alpha/beta/rc tags -release = '2021.2' - - +release = '4.12' # -- General configuration --------------------------------------------------- @@ -36,8 +40,9 @@ # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = [ - 'sphinx_rtd_theme' - ,'notfound.extension' # 404 handling + "sphinx_rtd_theme", + "notfound.extension", # 404 handling + "sphinx_favicon" ] # Mark 'index' as the main page @@ -59,6 +64,8 @@ # html_theme = 'sphinx_rtd_theme' + + # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". @@ -68,7 +75,7 @@ 'css/custom.css', # Relative to the _static path ] -html_logo = '_static/images/sqream_logo.png' +html_logo = '_static/images/SQream_logo_without background-15.png' # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. @@ -90,10 +97,11 @@ 'logo_only': True # Hide "SQream DB" title and only show logo , 'display_version': True # Display version at the top , 'style_external_links': True # Show little icon next to external links - , 'style_nav_header_background': '#0f9790' # SQream teal + , 'style_nav_header_background': '#133148' # SQreamDB dark blue + , 'navigation_depth': -1 , 'collapse_navigation': False , 'titles_only': True + , 'flyout_display': 'attached' } diff --git a/configuration_guides/admin_cluster_flags.rst b/configuration_guides/admin_cluster_flags.rst deleted file mode 100644 index 3c74819d6..000000000 --- a/configuration_guides/admin_cluster_flags.rst +++ /dev/null @@ -1,9 +0,0 @@ -.. _admin_cluster_flags: - -************************* -Cluster Administration Flags -************************* - -The **Cluster Administration Flags** page describes **Cluster** modification type flags, which can be modified by administrators on a session and cluster basis using the ``ALTER SYSTEM SET`` command: - -* `Persisting Your Cache Directory `_ \ No newline at end of file diff --git a/configuration_guides/admin_flags.rst b/configuration_guides/admin_flags.rst deleted file mode 100644 index e71b9f761..000000000 --- a/configuration_guides/admin_flags.rst +++ /dev/null @@ -1,18 +0,0 @@ -.. _admin_flags: - -************************* -Administration Flags -************************* - -The **Administration Flags** page describes the following flag types, which can be modified by administrators on a session and cluster basis using the ``ALTER SYSTEM SET`` command: - -..
toctree:: - :maxdepth: 1 - :glob: - - admin_regular_flags - admin_cluster_flags - admin_worker_flags - - diff --git a/configuration_guides/admin_regular_flags.rst b/configuration_guides/admin_regular_flags.rst deleted file mode 100644 index 5300310b6..000000000 --- a/configuration_guides/admin_regular_flags.rst +++ /dev/null @@ -1,31 +0,0 @@ -.. _admin_regular_flags: - -************************* -Regular Administration Flags -************************* -The **Regular Administration Flags** page describes **Regular** modification type flags, which can be modified by administrators on a session and cluster basis using the ``ALTER SYSTEM SET`` command: - -* `Setting Bin Size `_ -* `Setting CUDA Memory `_ -* `Limiting Runtime to Utility Functions `_ -* `Enabling High Bin Control Granularity `_ -* `Reducing CPU Hashtable Sizes `_ -* `Setting Chunk Size for Copying from CPU to GPU `_ -* `Indicating GPU Synchronicity `_ -* `Enabling Modification of R&D Flags `_ -* `Checking for Post-Production CUDA Errors `_ -* `Enabling Modification of clientLogger_debug File `_ -* `Activating the NVidia Profiler Markers `_ -* `Appending String at End of Log Lines `_ -* `Monitoring and Printing Pinned Allocation Reports `_ -* `Increasing Chunk Size to Reduce Query Speed `_ -* `Adding Rechunker before Expensing Chunk Producer `_ -* `Setting the Buffer Size `_ -* `Setting Memory Used to Abort Server `_ -* `Splitting Large Reads for Concurrent Execution `_ -* `Setting Worker Amount to Handle Concurrent Reads `_ -* `Setting Implicit Casts in ORC Files `_ -* `Setting Timeout Limit for Locking Objects before Executing Statements `_ -* `Interpreting Decimal Literals as Double Instead of Numeric `_ -* `Interpreting VARCHAR as TEXT `_ -* `VARCHAR Identifiers `_ diff --git a/configuration_guides/admin_worker_flags.rst b/configuration_guides/admin_worker_flags.rst deleted file mode 100644 index 2be570695..000000000 --- a/configuration_guides/admin_worker_flags.rst +++ /dev/null @@ -1,11 +0,0 @@ -.. _admin_worker_flags: - -************************* -Worker Administration Flags -************************* -The **Worker Administration Flags** page describes **Worker** modification type flags, which can be modified by administrators on a session and cluster basis using the ``ALTER SYSTEM SET`` command: - -* `Setting Total Device Memory Usage in SQream Instance `_ -* `Enabling Manually Setting Reported IP `_ -* `Setting Port Used for Metadata Server Connection `_ -* `Assigning Local Network IP `_ \ No newline at end of file diff --git a/configuration_guides/bin_sizes.rst b/configuration_guides/bin_sizes.rst index 9cdd8e0e8..94db3fffb 100644 --- a/configuration_guides/bin_sizes.rst +++ b/configuration_guides/bin_sizes.rst @@ -1,8 +1,11 @@ +:orphan: + .. _bin_sizes: -************************* +**************** Setting Bin Size -************************* +**************** + The ``binSizes`` flag sets the custom bin size in the cache to enable high granularity over bin control. The following describes the ``binSizes`` flag: diff --git a/configuration_guides/block_new_varchar_objects.rst b/configuration_guides/block_new_varchar_objects.rst new file mode 100644 index 000000000..acae66096 --- /dev/null +++ b/configuration_guides/block_new_varchar_objects.rst @@ -0,0 +1,15 @@ +:orphan: + +.. 
_block_new_varchar_objects: + +**************************** +Blocking New VARCHAR Objects +**************************** + +The ``blockNewVarcharObjects`` flag disables the creation of new tables, views, and external tables containing Varchar columns, as well as the creation of user-defined functions with Varchar arguments or a Varchar return value. + +The following describes the ``blockNewVarcharObjects`` flag: + +* **Data type** - boolean +* **Default value** - ``false`` +* **Allowed values** - ``true``, ``false`` \ No newline at end of file diff --git a/configuration_guides/cache_disk_dir.rst b/configuration_guides/cache_disk_dir.rst index 012955bc3..5b8ff0fb4 100644 --- a/configuration_guides/cache_disk_dir.rst +++ b/configuration_guides/cache_disk_dir.rst @@ -1,11 +1,12 @@ +:orphan: + .. _cache_disk_dir: -************************* +************************************************** Setting Spool Saved File Directory Location -************************* -The ``cacheDiskDir`` flag sets the ondisk directory location for the spool to save files on. +************************************************** -The following describes the ``cacheDiskDir`` flag: +The ``cacheDiskDir`` flag sets the on-disk directory location for the spool to save files on. * **Data type** - size_t * **Default value** - ``128`` diff --git a/configuration_guides/cache_disk_gb.rst b/configuration_guides/cache_disk_gb.rst index eb3d530cd..fc02e3065 100644 --- a/configuration_guides/cache_disk_gb.rst +++ b/configuration_guides/cache_disk_gb.rst @@ -1,3 +1,5 @@ +:orphan: + .. _cache_disk_gb: ************************* diff --git a/configuration_guides/cache_eviction_milliseconds.rst b/configuration_guides/cache_eviction_milliseconds.rst index 129d6281b..8876959af 100644 --- a/configuration_guides/cache_eviction_milliseconds.rst +++ b/configuration_guides/cache_eviction_milliseconds.rst @@ -1,3 +1,5 @@ +:orphan: + .. _cache_eviction_milliseconds: ************************* diff --git a/configuration_guides/cache_partitions.rst b/configuration_guides/cache_partitions.rst index 0637e347c..4e2182c34 100644 --- a/configuration_guides/cache_partitions.rst +++ b/configuration_guides/cache_partitions.rst @@ -1,3 +1,5 @@ +:orphan: + .. _cache_partitions: ************************* diff --git a/configuration_guides/cache_persistent_dir.rst b/configuration_guides/cache_persistent_dir.rst index c3e298189..f434cb173 100644 --- a/configuration_guides/cache_persistent_dir.rst +++ b/configuration_guides/cache_persistent_dir.rst @@ -1,8 +1,11 @@ +:orphan: + .. _cache_persistent_dir: -************************* +****************************************************** Setting Spool Persistent Saved File Directory Location -************************* +****************************************************** + The ``cachePersistentDir`` flag sets the persistent directory location for the spool to save files on. The following describes the ``cachePersistentDir`` flag: diff --git a/configuration_guides/cache_persistent_gb.rst b/configuration_guides/cache_persistent_gb.rst index 418364e5c..35f0df896 100644 --- a/configuration_guides/cache_persistent_gb.rst +++ b/configuration_guides/cache_persistent_gb.rst @@ -1,8 +1,11 @@ +:orphan: + .. _cache_persistent_gb: -************************* +***************************************** Setting Data Stored Persistently on Cache -************************* +***************************************** + The ``cachePersistentGB`` flag sets the amount of data (GB) for the cache to store persistently .
The following describes the ``cachePersistentGB`` flag: diff --git a/configuration_guides/cache_ram_gb.rst b/configuration_guides/cache_ram_gb.rst index 31d56613b..d68f7896f 100644 --- a/configuration_guides/cache_ram_gb.rst +++ b/configuration_guides/cache_ram_gb.rst @@ -1,8 +1,11 @@ +:orphan: + .. _cache_ram_gb: -************************* +***************************** Setting InMemory Spool Memory -************************* +***************************** + The ``cacheRamGB`` flag sets the amount of memory (GB) to be used by Spool InMemory. The following describes the ``cacheRamGB`` flag: diff --git a/configuration_guides/check_cuda_memory.rst b/configuration_guides/check_cuda_memory.rst index 84eec3f07..e826bdeef 100644 --- a/configuration_guides/check_cuda_memory.rst +++ b/configuration_guides/check_cuda_memory.rst @@ -1,3 +1,5 @@ +:orphan: + .. _check_cuda_memory: ************************* diff --git a/configuration_guides/compiler_gets_only_ufs.rst b/configuration_guides/compiler_gets_only_ufs.rst index 1190adc3e..2a48b0cc8 100644 --- a/configuration_guides/compiler_gets_only_ufs.rst +++ b/configuration_guides/compiler_gets_only_ufs.rst @@ -1,8 +1,11 @@ +:orphan: + .. _compiler_gets_only_ufs: -************************* +************************************* Limiting Runtime to Utility Functions -************************* +************************************* + The ``compilerGetsOnlyUFs`` flag sets the runtime to pass only utility functions names to the compiler. The following describes the ``compilerGetsOnlyUFs`` flag: diff --git a/configuration_guides/configuration_flags.rst b/configuration_guides/configuration_flags.rst deleted file mode 100644 index 3a25dc9bd..000000000 --- a/configuration_guides/configuration_flags.rst +++ /dev/null @@ -1,20 +0,0 @@ -.. _configuration_flags: - -************************* -Configuration Flags -************************* -SQream provides two methods for configuration your instance of SQream. The current configuration method is based on cluster and session-based configuration, described in more detail below. Users can also use the previous configuration method done using a configuration file. - -The **Configuration Methods** page describes the following configurations methods: - -.. toctree:: - :maxdepth: 1 - :glob: - :titlesonly: - - admin_flags - generic_flags - - - - diff --git a/configuration_guides/configuration_methods.rst b/configuration_guides/configuration_methods.rst deleted file mode 100644 index 00a70ab5c..000000000 --- a/configuration_guides/configuration_methods.rst +++ /dev/null @@ -1,19 +0,0 @@ -.. _configuration_methods: - -************************* -Configuration Methods -************************* -SQream provides two methods for configuration your instance of SQream. The current configuration method is based on cluster and session-based configuration, described in more detail below. Users can also use the previous configuration method done using a configuration file. - -The **Configuration Methods** page describes the following configurations methods: - -.. toctree:: - :maxdepth: 1 - :glob: - :titlesonly: - - current_configuration_method - previous_configuration_method - - - diff --git a/configuration_guides/configuring_sqream.rst b/configuration_guides/configuring_sqream.rst new file mode 100644 index 000000000..9c74d2a02 --- /dev/null +++ b/configuration_guides/configuring_sqream.rst @@ -0,0 +1,22 @@ +:orphan: + +.. 
_configuring_sqream: + +************************* +Configuring SQream +************************* + +The **Configuring SQream** page describes the following configuration topics: + +.. toctree:: + :maxdepth: 1 + :glob: + :titlesonly: + + current_method_configuration_levels + current_method_flag_types + current_method_modification_methods + current_method_configuring_your_parameter_values + current_method_showing_all_flags_in_the_catalog_table + + diff --git a/configuration_guides/copy_to_restrict_utf8.rst b/configuration_guides/copy_to_restrict_utf8.rst index 5d5990243..62cbd9b4f 100644 --- a/configuration_guides/copy_to_restrict_utf8.rst +++ b/configuration_guides/copy_to_restrict_utf8.rst @@ -1,12 +1,15 @@ +:orphan: + .. _copy_to_restrict_utf8: -************************* +************************************* Enabling High Bin Control Granularity -************************* -The ``copyToRestrictUtf8`` flag sets sets the custom bin size in the cache to enable high bin control granularity. +************************************* + +The ``copyToRestrictUtf8`` flag sets the custom bin size in the cache to enable high bin control granularity. The following describes the ``copyToRestrictUtf8`` flag: * **Data type** - boolean * **Default value** - ``false`` -* **Allowed values** - ``true``, ``false`` \ No newline at end of file +* **Allowed values** - ``true``, ``false`` diff --git a/configuration_guides/cpu_reduce_hashtable_size.rst b/configuration_guides/cpu_reduce_hashtable_size.rst deleted file mode 100644 index c2b01604c..000000000 --- a/configuration_guides/cpu_reduce_hashtable_size.rst +++ /dev/null @@ -1,11 +0,0 @@ -.. _cpu_reduce_hashtable_size: - -************************* -Enabling High Bin Control Granularity -************************* -The ``copyToRestrictUtf8`` flag sets the custom bin size in the cache to enable high bin control granularity. - -The following describes the ``checkCudaMemory`` flag: - -* **Data type** - boolean -* **Default value** - ``FALSE`` \ No newline at end of file diff --git a/configuration_guides/csv_limit_row_length.rst b/configuration_guides/csv_limit_row_length.rst index 03f31c697..318318df2 100644 --- a/configuration_guides/csv_limit_row_length.rst +++ b/configuration_guides/csv_limit_row_length.rst @@ -1,8 +1,11 @@ +:orphan: + .. _csv_limit_row_length: -************************* +****************************** Setting Maximum CSV Row Length -************************* +****************************** + The ``csvLimitRowLength`` flag sets the maximum supported CSV row length. The following describes the ``csvLimitRowLength`` flag: diff --git a/configuration_guides/cuda_mem_cpy_max_size_bytes.rst b/configuration_guides/cuda_mem_cpy_max_size_bytes.rst index 371c9bda4..dccc32e3b 100644 --- a/configuration_guides/cuda_mem_cpy_max_size_bytes.rst +++ b/configuration_guides/cuda_mem_cpy_max_size_bytes.rst @@ -1,8 +1,11 @@ +:orphan: + .. _cuda_mem_cpy_max_size_bytes: -************************* +********************************************** Setting Chunk Size for Copying from CPU to GPU -************************* +********************************************** + The ``cudaMemcpyMaxSizeBytes`` flag sets the chunk size for copying from CPU to GPU. If this value is set to ``0``, do not divide. 
The following describes the ``cudaMemcpyMaxSizeBytes`` flag: diff --git a/configuration_guides/cuda_mem_cpy_synchronous.rst b/configuration_guides/cuda_mem_cpy_synchronous.rst index 81e762071..fcba5f734 100644 --- a/configuration_guides/cuda_mem_cpy_synchronous.rst +++ b/configuration_guides/cuda_mem_cpy_synchronous.rst @@ -1,8 +1,11 @@ +:orphan: + .. _cuda_mem_cpy_synchronous: -************************* +**************************** Indicating GPU Synchronicity -************************* +**************************** + The ``CudaMemcpySynchronous`` flag indicates if copying from/to GPU is synchronous. The following describes the ``CudaMemcpySynchronous`` flag: diff --git a/configuration_guides/cuda_mem_quota.rst b/configuration_guides/cuda_mem_quota.rst index 43f9d4943..a738e3477 100644 --- a/configuration_guides/cuda_mem_quota.rst +++ b/configuration_guides/cuda_mem_quota.rst @@ -1,8 +1,12 @@ +:orphan: + + .. _cuda_mem_quota: -************************* +**************************************************** Setting Total Device Memory Usage in SQream Instance -************************* +**************************************************** + The ``cudaMemQuota`` flag sets the percentage of total device memory used by your instance of SQream. The following describes the ``cudaMemQuota`` flag: diff --git a/configuration_guides/current_configuration_method.rst b/configuration_guides/current_configuration_method.rst deleted file mode 100644 index e7ca5c0d3..000000000 --- a/configuration_guides/current_configuration_method.rst +++ /dev/null @@ -1,729 +0,0 @@ -.. _current_configuration_method: - -************************** -Configuring SQream -************************** -The **Configuring SQream** page describes SQream’s method for configuring your instance of SQream and includes the following topics: - -.. contents:: - :local: - :depth: 1 - -Overview ------ -Modifications that you make to your configurations are persistent based on whether they are made at the session or cluster level. Persistent configurations are modifications made to attributes that are retained after shutting down your system. - -Modifying Your Configuration ----- -The **Modifying Your Configuration** section describes the following: - -.. contents:: - :local: - :depth: 1 - -Modifying Your Configuration Using the Worker Configuration File -~~~~~~~~~~~ -You can modify your configuration using the **worker configuration file (config.json)**. Changes that you make to worker configuration files are persistent. Note that you can only set the attributes in your worker configuration file **before** initializing your SQream worker, and while your worker is active these attributes are read-only. - -The following is an example of a worker configuration file: - -.. code-block:: postgres - - { - “cluster”: “/home/test_user/sqream_testing_temp/sqreamdb”, - “gpu”: 0, - “licensePath”: “home/test_user/SQream/tests/license.enc”, - “machineIP”: “127.0.0.1”, - “metadataServerIp”: “127.0.0.1”, - “metadataServerPort”: “3105, - “port”: 5000, - “useConfigIP”” true, - “legacyConfigFilePath”: “home/SQream_develop/SqrmRT/utils/json/legacy_congif.json” - } - -You can access the legacy configuration file from the ``legacyConfigFilePath`` parameter shown above. If all (or most) of your workers require the same flag settings, you can set the ``legacyConfigFilePath`` attribute to the same legacy file. - -Modifying Your Configuration Using a Legacy Configuration File -~~~~~~~~~~~ -You can modify your configuration using a legacy configuration file. 
- -The Legacy configuration file provides access to the read/write flags used in SQream’s previous configuration method. A link to this file is provided in the **legacyConfigFilePath** parameter in the worker configuration file. - -The following is an example of the legacy configuration file: - -.. code-block:: postgres - - { - “developerMode”: true, - “reextentUse”: false, - “useClientLog”: true, - “useMetadataServer”” false - } - -Session vs Cluster Based Configuration -============================== -.. contents:: - :local: - :depth: 1 - -Cluster-Based Configuration --------------- -SQream uses cluster-based configuration, enabling you to centralize configurations for all workers on the cluster. Only flags set to the regular or cluster flag type have access to cluster-based configuration. Configurations made on the cluster level are persistent and stored at the metadata level. The parameter settings in this file are applied globally to all workers connected to it. - -For more information, see the following: - -* `Using SQream SQL `_ - modifying flag attributes from the CLI. -* `SQream Acceleration Studio `_ - modifying flag attributes from Studio. - -For more information on flag-based access to cluster-based configuration, see **Configuration Flag Types** below. - -Session-Based Configuration ----------------- -Session-based configurations are not persistent and are deleted when your session ends. This method enables you to modify all required configurations while avoiding conflicts between flag attributes modified on different devices at different points in time. - -The **SET flag_name** command is used to modify flag attributes. Any modifications you make with the **SET flag_name** command apply only to your open session, and are not saved when it ends - -For example, when the query below has completed executing, the values configured will be restored to its previous setting: - -.. code-block:: console - - set spoolMemoryGB=700; - select * from table a where date='2021-11-11' - -For more information, see the following: - -* `Using SQream SQL `_ - modifying flag attributes from the CLI. -* `SQream Acceleration Studio `_ - modifying flag attributes from Studio. - -Configuration Flag Types -========== -The flag type attribute can be set for each flag and determines its write access as follows: - -* **Administration:** session-based read/write flags that can be stored in the metadata file. -* **Cluster:** global cluster-based read/write flags that can be stored in the metadata file. -* **Worker:** single worker-based read-only flags that can be stored in the worker configuration file. - -The flag type determines which files can be accessed and which commands or commands sets users can run. - -The following table describes the file or command modification rights for each flag type: - -.. list-table:: - :widths: 20 20 20 20 - :header-rows: 1 - - * - **Flag Type** - - **Legacy Configuration File** - - **ALTER SYSTEM SET** - - **Worker Configuration File** - * - :ref:`Regular` - - Can modify - - Can modify - - Cannot modify - * - :ref:`Cluster` - - Cannot modify - - Can modify - - Cannot modify - * - :ref:`Worker` - - Cannot modify - - Cannot modify - - Can modify - -.. _regular_flag_types: - -Regular Flag Types ---------------------- -The following is an example of the correct syntax for running a **Regular** flag type command: - -.. code-block:: console - - SET spoolMemoryGB= 11; - executed - -The following table describes the Regular flag types: - -.. 
list-table:: - :widths: 2 5 10 - :header-rows: 1 - - * - **Command** - - **Description** - - **Example** - * - ``SET `` - - Used for modifying flag attributes. - - ``SET developerMode=true`` - * - ``SHOW / ALL`` - - Used to preset either a specific flag value or all flag values. - - ``SHOW `` - * - ``SHOW ALL LIKE`` - - Used as a wildcard character for flag names. - - ``SHOW `` - * - ``show_conf_UF`` - - Used to print all flags with the following attributes: - - * Flag name - * Default value - * Is developer mode (Boolean) - * Flag category - * Flag type - - ``rechunkThreshold,90,true,RND,regular`` - * - ``show_conf_extended UF`` - - Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. - - ``compilerGetsOnlyUFs,false,generic,regular,Makes runtime pass to compiler only`` - ``utility functions names,boolean,true,false`` - * - ``show_md_flag UF`` - - Used to show a specific flag/all flags stored in the metadata file. - - - * Example 1: ``* master=> ALTER SYSTEM SET heartbeatTimeout=111;`` - * Example 2: ``* master=> select show_md_flag(‘all’); heartbeatTimeout,111`` - * Example 3: ``* master=> select show_md_flag(‘heartbeatTimeout’); heartbeatTimeout,111`` - -.. _cluster_flag_types: - -Cluster Flag Types ---------------------- -The following is an example of the correct syntax for running a **Cluster** flag type command: - -.. code-block:: console - - ALTER SYSTEM RESET useMetadataServer; - executed - -The following table describes the Cluster flag types: - -.. list-table:: - :widths: 1 5 10 - :header-rows: 1 - - * - **Command** - - **Description** - - **Example** - * - ``ALTER SYSTEM SET `` - - Used to storing or modifying flag attributes in the metadata file. - - ``ALTER SYSTEM SET `` - * - ``ALTER SYSTEM RESET `` - - Used to remove a flag or all flag attributes from the metadata file. - - ``ALTER SYSTEM RESET `` - * - ``SHOW / ALL`` - - Used to print the value of a specified value or all flag values. - - ``SHOW `` - * - ``SHOW ALL LIKE`` - - Used as a wildcard character for flag names. - - ``SHOW `` - * - ``show_conf_UF`` - - Used to print all flags with the following attributes: - - * Flag name - * Default value - * Is developer mode (Boolean) - * Flag category - * Flag type - - ``rechunkThreshold,90,true,RND,regular`` - * - ``show_conf_extended UF`` - - Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. - - ``compilerGetsOnlyUFs,false,generic,regular,Makes runtime pass to compiler only`` - ``utility functions names,boolean,true,false`` - * - ``show_md_flag UF`` - - Used to show a specific flag/all flags stored in the metadata file. - - - * Example 1: ``* master=> ALTER SYSTEM SET heartbeatTimeout=111;`` - * Example 2: ``* master=> select show_md_flag(‘all’); heartbeatTimeout,111`` - * Example 3: ``* master=> select show_md_flag(‘heartbeatTimeout’); heartbeatTimeout,111`` - -.. _worker_flag_types: - -Worker Flag Types ---------------------- -The following is an example of the correct syntax for running a **Worker** flag type command: - -.. code-block:: console - - SHOW spoolMemoryGB; - -The following table describes the Worker flag types: - -.. list-table:: - :widths: 1 5 10 - :header-rows: 1 - - * - **Command** - - **Description** - - **Example** - * - ``ALTER SYSTEM SET `` - - Used to storing or modifying flag attributes in the metadata file. 
- - ``ALTER SYSTEM SET `` - * - ``ALTER SYSTEM RESET `` - - Used to remove a flag or all flag attributes from the metadata file. - - ``ALTER SYSTEM RESET `` - * - ``SHOW / ALL`` - - Used to print the value of a specified value or all flag values. - - ``SHOW `` - * - ``SHOW ALL LIKE`` - - Used as a wildcard character for flag names. - - ``SHOW `` - * - ``show_conf_UF`` - - Used to print all flags with the following attributes: - - * Flag name - * Default value - * Is developer mode (Boolean) - * Flag category - * Flag type - - ``rechunkThreshold,90,true,RND,regular`` - * - ``show_conf_extended UF`` - - Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. - - - ``compilerGetsOnlyUFs,false,generic,regular,Makes runtime pass to compiler only`` - ``utility functions names,boolean,true,false`` - * - ``show_md_flag UF`` - - Used to show a specific flag/all flags stored in the metadata file. - - - * Example 1: ``* master=> ALTER SYSTEM SET heartbeatTimeout=111;`` - * Example 2: ``* master=> select show_md_flag(‘all’); heartbeatTimeout,111`` - * Example 3: ``* master=> select show_md_flag(‘heartbeatTimeout’); heartbeatTimeout,111`` - -All Configurations ---------------------- -The following table describes the **Generic** and **Administration** configuration flags: - -.. list-table:: - :header-rows: 1 - :widths: 1 2 1 15 1 20 - :class: my-class - :name: my-name - - * - Flag Name - - Access Control - - Modification Type - - Description - - Data Type - - Default Value - - * - ``binSizes`` - - Administration - - Regular - - Sets the custom bin size in the cache to enable high granularity bin control. - - string - - - ``16,32,64,128,256,512,1024,2048,4096,8192,16384,32768,65536,`` - ``131072,262144,524288,1048576,2097152,4194304,8388608,16777216,`` - ``33554432,67108864,134217728,268435456,536870912,786432000,107374,`` - ``1824,1342177280,1610612736,1879048192,2147483648,2415919104,`` - ``2684354560,2952790016,3221225472`` - - * - ``checkCudaMemory`` - - Administration - - Regular - - Sets the pad device memory allocations with safety buffers to catch out-of-bounds writes. - - boolean - - ``FALSE`` - - * - ``compilerGetsOnlyUFs`` - - Administration - - Regular - - Sets the runtime to pass only utility functions names to the compiler. - - boolean - - ``FALSE`` - - * - ``copyToRestrictUtf8`` - - Administration - - Regular - - Sets the custom bin size in the cache to enable high granularity bin control. - - boolean - - ``FALSE`` - - * - ``cpuReduceHashtableSize`` - - Administration - - Regular - - Sets the hash table size of the CpuReduce. - - uint - - ``10000`` - - * - ``csvLimitRowLength`` - - Administration - - Cluster - - Sets the maximum supported CSV row length. - - uint - - ``100000`` - - * - ``cudaMemcpyMaxSizeBytes`` - - Administration - - Regular - - Sets the chunk size for copying from CPU to GPU. If set to 0, do not divide. - - uint - - ``0`` - - * - ``CudaMemcpySynchronous`` - - Administration - - Regular - - Indicates if copying from/to GPU is synchronous. - - boolean - - ``FALSE`` - - * - ``cudaMemQuota`` - - Administration - - Worker - - Sets the percentage of total device memory to be used by the instance. - - uint - - ``90`` - - * - ``developerMode`` - - Administration - - Regular - - Enables modifying R&D flags. - - boolean - - ``FALSE`` - - * - ``enableDeviceDebugMessages`` - - Administration - - Regular - - Activates the Nvidia profiler (nvprof) markers. 
- - boolean - - ``FALSE`` - - * - ``enableLogDebug`` - - Administration - - Regular - - Enables creating and logging in the clientLogger_debug file. - - boolean - - ``TRUE`` - - * - ``enableNvprofMarkers`` - - Administration - - Regular - - Activates the Nvidia profiler (nvprof) markers. - - boolean - - ``FALSE`` - - * - ``endLogMessage`` - - Administration - - Regular - - Appends a string at the end of every log line. - - string - - ``EOM`` - - - - * - ``varcharIdentifiers`` - - Administration - - Regular - - Activates using varchar as an identifier. - - boolean - - ``true`` - - - - * - ``extentStorageFileSizeMB`` - - Administration - - Cluster - - Sets the minimum size in mebibytes of extents for table bulk data. - - uint - - ``20`` - - * - ``gatherMemStat`` - - Administration - - Regular - - Monitors all pinned allocations and all **memcopies** to/from device, and prints a report of pinned allocations that were not memcopied to/from the device using the **dump_pinned_misses** utility function. - - boolean - - ``FALSE`` - - * - ``increaseChunkSizeBeforeReduce`` - - Administration - - Regular - - Increases the chunk size to reduce query speed. - - boolean - - ``FALSE`` - - * - ``increaseMemFactors`` - - Administration - - Regular - - Adds rechunker before expensive chunk producer. - - boolean - - ``TRUE`` - - * - ``leveldbWriteBufferSize`` - - Administration - - Regular - - Sets the buffer size. - - uint - - ``524288`` - - * - ``machineIP`` - - Administration - - Worker - - Manual setting of reported IP. - - string - - ``127.0.0.1`` - - - - - * - ``memoryResetTriggerMB`` - - Administration - - Regular - - Sets the size of memory used during a query to trigger aborting the server. - - uint - - ``0`` - - * - ``metadataServerPort`` - - Administration - - Worker - - Sets the port used to connect to the metadata server. SQream recommends using port ranges above 1024† because ports below 1024 are usually reserved, although there are no strict limitations. Any positive number (1 - 65535) can be used. - - uint - - ``3105`` - - * - ``mtRead`` - - Administration - - Regular - - Splits large reads to multiple smaller ones and executes them concurrently. - - boolean - - ``FALSE`` - - * - ``mtReadWorkers`` - - Administration - - Regular - - Sets the number of workers to handle smaller concurrent reads. - - uint - - ``30`` - - * - ``orcImplicitCasts`` - - Administration - - Regular - - Sets the implicit cast in orc files, such as **int** to **tinyint** and vice versa. - - boolean - - ``TRUE`` - - * - ``statementLockTimeout`` - - Administration - - Regular - - Sets the timeout (seconds) for acquiring object locks before executing statements. - - uint - - ``3`` - - * - ``useConfigIP`` - - Administration - - Worker - - Activates the machineIP (true). Setting to false ignores the machineIP and automatically assigns a local network IP. This cannot be activated in a cloud scenario (on-premises only). - - boolean - - ``FALSE`` - - * - ``useLegacyDecimalLiterals`` - - Administration - - Regular - - Interprets decimal literals as **Double** instead of **Numeric**. Used to preserve legacy behavior in existing customers. - - boolean - - ``FALSE`` - - * - ``useLegacyStringLiterals`` - - Administration - - Regular - - Interprets ASCII-only strings as **VARCHAR** instead of **TEXT**. Used to preserve legacy behavior in existing customers. - - boolean - - ``FALSE`` - - * - ``flipJoinOrder`` - - Generic - - Regular - - Reorders join to force equijoins and/or equijoins sorted by table size. 
- - boolean - - ``FALSE`` - - * - ``limitQueryMemoryGB`` - - Generic - - Worker - - Prevents a query from processing more memory than the flag’s value. - - uint - - ``100000`` - - * - ``cacheEvictionMilliseconds`` - - Generic - - Regular - - Sets how long the cache stores contents before being flushed. - - size_t - - ``2000`` - - - * - ``cacheDiskDir`` - - Generic - - Regular - - Sets the ondisk directory location for the spool to save files on. - - size_t - - Any legal string - - - * - ``cacheDiskGB`` - - Generic - - Regular - - Sets the amount of memory (GB) to be used by Spool on the disk. - - size_t - - ``128`` - - * - ``cachePartitions`` - - Generic - - Regular - - Sets the number of partitions that the cache is split into. - - size_t - - ``4`` - - - * - ``cachePersistentDir`` - - Generic - - Regular - - Sets the persistent directory location for the spool to save files on. - - string - - Any legal string - - - * - ``cachePersistentGB`` - - Generic - - Regular - - Sets the amount of data (GB) for the cache to store persistently. - - size_t - - ``128`` - - - * - ``cacheRamGB`` - - Generic - - Regular - - Sets the amount of memory (GB) to be used by Spool InMemory. - - size_t - - ``16`` - - - - - - - - * - ``logSysLevel`` - - Generic - - Regular - - - Determines the client log level: - 0 - L_SYSTEM, - 1 - L_FATAL, - 2 - L_ERROR, - 3 - L_WARN, - 4 - L_INFO, - 5 - L_DEBUG, - 6 - L_TRACE - - uint - - ``100000`` - - * - ``maxAvgBlobSizeToCompressOnGpu`` - - Generic - - Regular - - Sets the CPU to compress columns with size above (flag’s value) * (row count). - - uint - - ``120`` - - - * - ``sessionTag`` - - Generic - - Regular - - Sets the name of the session tag. - - string - - Any legal string - - - - * - ``spoolMemoryGB`` - - Generic - - Regular - - Sets the amount of memory (GB) to be used by the server for spooling. - - uint - - ``8`` - -Configuration Commands -========== -The configuration commands are associated with particular flag types based on permissions. - -The following table describes the commands or command sets that can be run based on their flag type. Note that the flag names described in the following table are described in the :ref:`Configuration Roles` section below. - -.. list-table:: - :header-rows: 1 - :widths: 1 2 10 17 - :class: my-class - :name: my-name - - * - Flag Type - - Command - - Description - - Example - * - Regular - - ``SET `` - - Used for modifying flag attributes. - - ``SET developerMode=true`` - * - Cluster - - ``ALTER SYSTEM SET `` - - Used to storing or modifying flag attributes in the metadata file. - - ``ALTER SYSTEM SET `` - * - Cluster - - ``ALTER SYSTEM RESET `` - - Used to remove a flag or all flag attributes from the metadata file. - - ``ALTER SYSTEM RESET `` - * - Regular, Cluster, Worker - - ``SHOW / ALL`` - - Used to print the value of a specified value or all flag values. - - ``SHOW `` - * - Regular, Cluster, Worker - - ``SHOW ALL LIKE`` - - Used as a wildcard character for flag names. - - ``SHOW `` - * - Regular, Cluster, Worker - - ``show_conf_UF`` - - Used to print all flags with the following attributes: - - * Flag name - * Default value - * Is developer mode (Boolean) - * Flag category - * Flag type - - - - - ``rechunkThreshold,90,true,RND,regular`` - * - Regular, Cluster, Worker - - ``show_conf_extended UF`` - - Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. 
- - ``spoolMemoryGB,15,false,generic,regular,Amount of memory (GB)`` ``the server can use for spooling,”Statement that perform “”group by””,`` ``“”order by”” or “”join”” operation(s) on large set of data will run`` ``much faster if given enough spool memory, otherwise disk spooling will`` ``be used resulting in performance hit.”,uint,,0-5000`` - * - Regular, Cluster, Worker - - ``show_md_flag UF`` - - Used to show a specific flag/all flags stored in the metadata file. - - - * Example 1: ``* master=> ALTER SYSTEM SET heartbeatTimeout=111;`` - * Example 2: ``* master=> select show_md_flag(‘all’); heartbeatTimeout,111`` - * Example 3: ``* master=> select show_md_flag(‘heartbeatTimeout’); heartbeatTimeout,111`` - -.. _configuration_roles: - -Configuration Roles -=========== -SQream divides flags into the following roles, each with their own set of permissions: - -* **`Administration flags `_**: can be modified by administrators on a session and cluster basis using the ``ALTER SYSTEM SET`` command. -* **`Generic flags `_**: can be modified by standard users on a session basis. - -Showing All Flags in the Catalog Table -======= -SQream uses the **sqream_catalog.parameters** catalog table for showing all flags, providing the scope (default, cluster and session), description, default value and actual value. - -The following is the correct syntax for a catalog table query: - -.. code-block:: console - - SELECT * FROM sqream_catalog.settings - -The following is an example of a catalog table query: - -.. code-block:: console - - externalTableBlobEstimate, 100, 100, default, - varcharEncoding, ascii, ascii, default, Changes the expected encoding for Varchar columns - useCrcForTextJoinKeys, true, true, default, - hiveStyleImplicitStringCasts, false, false, default, - -This guide covers the configuration files and the ``SET`` statement. \ No newline at end of file diff --git a/configuration_guides/current_method_configuration_levels.rst b/configuration_guides/current_method_configuration_levels.rst new file mode 100644 index 000000000..d91cb476a --- /dev/null +++ b/configuration_guides/current_method_configuration_levels.rst @@ -0,0 +1,303 @@ +.. _current_method_configuration_levels: + +******************* +Cluster and Session +******************* + +When configuring your SQreamDB environment, you have the option to use flags that apply to either the entire cluster or a specific session. Cluster configuration involves metadata and is persistent. Persistent modifications refer to changes made to a system or component that are saved and retained even after the system is restarted or shut down, allowing the modifications to persist over time. Session flags only apply to a specific session and are not persistent. Changes made using session flags are not visible to other users, and once the session ends, the flags return to their default values. + +Setting the Flags +================= + +Syntax +------ + +You may set both cluster and session flags using the following syntax on SQreamDB Acceleration Studio and Console: + +Cluster flag syntax: + +.. code-block:: sql + + ALTER SYSTEM SET + +Session flag syntax: + +.. code-block:: sql + + SET + +Configuration file +------------------ + +You may set session flags within your :ref:`Legacy Configuration File`. + +Flag List +========= + +.. 
list-table:: + :header-rows: 1 + :widths: auto + :name: my-name + + * - Flag Name + - Who May Configure + - Cluster / Session + - Description + - Data Type + - Default Value and Value Range + * - ``binSizes`` + - SUPERUSER + - Session + - Sets the custom bin size in the cache to enable high granularity bin control. + - string + - + ``16,32,64,128,256,512,1024,2048,4096,8192,16384,32768,65536,`` + ``131072,262144,524288,1048576,2097152,4194304,8388608,16777216,`` + ``33554432,67108864,134217728,268435456,536870912,786432000,107374,`` + ``1824,1342177280,1610612736,1879048192,2147483648,2415919104,`` + ``2684354560,2952790016,3221225472`` + * - ``blockNewVarcharObjects`` + - SUPERUSER + - Session + - Disables the creation of new tables, views, external tables containing Varchar columns, and the creation of user-defined functions with Varchar arguments or a Varchar return value. + - boolean + - ``FALSE`` + * - ``cacheDiskDir`` + - Anyone + - Session + - Sets the on-disk directory location for the spool to save files on. Allowed values: Any legal string. + - string + - Any legal string + * - ``cacheDiskGB`` + - Anyone + - Session + - Sets the amount of memory (GB) to be used by Spool on the disk. Allowed values: 0-4000000000. + - bigint + - ``128`` + * - ``cacheEvictionMilliseconds`` + - Anyone + - Session + - Sets how long the cache stores contents before being flushed. Allowed values: 1-4000000000. + - bigint + - ``2000`` + * - ``cachePartitions`` + - Anyone + - Session + - Sets the number of partitions that the cache is split into. Allowed values: 1-4000000000. + - bigint + - ``4`` + * - ``cachePersistentDir`` + - Anyone + - Session + - Sets the persistent directory location for the spool to save files on. Allowed values: Any legal string. + - string + - ``/tmp`` + * - ``cachePersistentGB`` + - Anyone + - Session + - Sets the amount of data (GB) for the cache to store persistently. Allowed values: 0-4000000000. + - bigint + - ``128`` + * - ``cacheRamGB`` + - Anyone + - Session + - Sets the amount of memory (GB) to be used by Spool InMemory. Allowed values: 0-4000000000. + - bigint + - ``16`` + * - ``checkCudaMemory`` + - SUPERUSER + - Session + - Pads device memory allocations with safety buffers to catch out-of-bounds writes. + - boolean + - ``FALSE`` + * - ``compilerGetsOnlyUFs`` + - SUPERUSER + - Session + - Sets the runtime to pass only utility function names to the compiler. + - boolean + - ``FALSE`` + * - ``clientReconnectionTimeout`` + - Anyone + - Cluster + - Reconnection timeout for the system, in seconds. + - Integer + - ``30`` + * - ``copyToRestrictUtf8`` + - SUPERUSER + - Session + - Sets the custom bin size in the cache to enable high granularity bin control. + - boolean + - ``FALSE`` + * - ``csvLimitRowLength`` + - SUPERUSER + - Cluster + - Sets the maximum supported CSV row length. Allowed values: 1-4000000000 + - uint + - ``100000`` + * - ``cudaMemcpyMaxSizeBytes`` + - SUPERUSER + - Session + - Sets the chunk size for copying from CPU to GPU. If set to 0, the copy is not divided. + - uint + - ``0`` + * - ``CudaMemcpySynchronous`` + - SUPERUSER + - Session + - Indicates if copying from/to GPU is synchronous. + - boolean + - ``FALSE`` + * - ``defaultGracefulShutdownTimeoutMinutes`` + - SUPERUSER + - Cluster + - Used for setting the amount of time to pass before SQream performs a graceful server shutdown. Allowed values: 1-4000000000. 
Related flags: ``isHealerOn`` and ``healerMaxInactivityHours`` + - bigint + - ``5`` + * - ``developerMode`` + - SUPERUSER + - Session + - Enables modifying R&D flags. + - boolean + - ``FALSE`` + * - ``enableDeviceDebugMessages`` + - SUPERUSER + - Session + - Checks for CUDA errors after producing each chunk. + - boolean + - ``FALSE`` + * - ``enableNvprofMarkers`` + - SUPERUSER + - Session + - Activates the Nvidia profiler (nvprof) markers. + - boolean + - ``FALSE`` + * - ``extentStorageFileSizeMB`` + - SUPERUSER + - Cluster + - Sets the minimum size in mebibytes of extents for table bulk data. + - uint + - ``20`` + * - ``flipJoinOrder`` + - Anyone + - Session + - Reorders join to force equijoins and/or equijoins sorted by table size. + - boolean + - ``FALSE`` + * - ``gatherMemStat`` + - SUPERUSER + - Session + - Monitors all pinned allocations and all **memcopies** to/from device, and prints a report of pinned allocations that were not memcopied to/from the device using the ``dump_pinned_misses`` utility function. + - boolean + - ``FALSE`` + * - ``increaseChunkSizeBeforeReduce`` + - SUPERUSER + - Session + - Increases the chunk size to reduce query speed. + - boolean + - ``FALSE`` + * - ``increaseMemFactors`` + - SUPERUSER + - Session + - Adds rechunker before expensive chunk producer. + - boolean + - ``TRUE`` + * - ``leveldbWriteBufferSize`` + - SUPERUSER + - Session + - Sets the buffer size. + - uint + - ``524288`` + * - ``maxAvgBlobSizeToCompressOnGpu`` + - Anyone + - Session + - Sets the CPU to compress columns with size above (flag's value) * (row count). + - uint + - ``120`` + * - ``memoryResetTriggerMB`` + - SUPERUSER + - Session + - Sets the size of memory used during a query to trigger aborting the server. + - uint + - ``0`` + * - ``mtRead`` + - SUPERUSER + - Session + - Splits large reads to multiple smaller ones and executes them concurrently. + - boolean + - ``FALSE`` + * - ``mtReadWorkers`` + - SUPERUSER + - Session + - Sets the number of workers to handle smaller concurrent reads. + - uint + - ``30`` + * - ``orcImplicitCasts`` + - SUPERUSER + - Session + - Sets the implicit cast in orc files, such as **int** to **tinyint** and vice versa. + - boolean + - ``TRUE`` + * - ``QueryTimeoutMinutes`` + - Anyone + - Session + - Terminates queries that have exceeded a predefined execution time limit, ranging from ``1`` to ``4,320`` minutes (72 hours). + - integer + - ``0`` (no query timeout) + * - ``sessionTag`` + - Anyone + - Session + - Sets the name of the session tag. Allowed values: Any legal string. + - string + - Any legal string + * - ``spoolMemoryGB`` + - Anyone + - Session + - Sets the amount of memory (GB) to be used by the server for spooling. + - uint + - ``8`` + * - ``statementLockTimeout`` + - SUPERUSER + - Session + - Sets the timeout (seconds) for acquiring object locks before executing statements. + - uint + - ``3`` + * - ``useLegacyDecimalLiterals`` + - SUPERUSER + - Session + - Interprets decimal literals as **Double** instead of **Numeric**. Used to preserve legacy behavior in existing customers. + - boolean + - ``FALSE`` + + + + + * - ``cpuReduceHashtableSize`` + - SUPERUSER + - Session + - Sets the hash table size of the CpuReduce. + - uint + - ``10000`` + * - ``maxPinnedPercentageOfTotalRAM`` + - SUPERUSER + - Session + - Sets the maximum percentage of CPU RAM that pinned memory can use. + - uint + - ``70`` + * - ``memMergeBlobOffsetsCount`` + - SUPERUSER + - Session + - Sets the size of memory used during a query to trigger aborting the server. 
+ - uint + - ``0`` + * - ``queueTimeoutMinutes`` + - Anyone + - Session + - Terminates queries that have exceeded a predefined time limit in the queue. + - integer + - Default value: 0. Minimum value: 1 minute. Maximum value: 4320 minutes (72 hours) + * - ``timezone`` + - Anyone + - Session + - The ``timezone`` flag dictates the timezone context used when SQreamDB encounters ``DATETIME2`` values during data ingestion. This includes overriding any existing timezone information within those values. + - Text + - Default value: ``null``. ``local`` - the local system timezone of the SQreamDB server. ``+hh:mm`` or ``-hh:mm`` - explicitly defines the timezone for all incoming ``DATETIME2`` values, overriding any existing timezone information. + diff --git a/configuration_guides/current_method_configuring_your_parameter_values.rst b/configuration_guides/current_method_configuring_your_parameter_values.rst new file mode 100644 index 000000000..91c79e6d4 --- /dev/null +++ b/configuration_guides/current_method_configuring_your_parameter_values.rst @@ -0,0 +1,33 @@ +.. _current_method_configuring_your_parameter_values: + +**************** +Parameter Values +**************** + ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| **Command** | **Description** | **Example** | ++=====================================================+===========================================================================================================================================+===============================================================================================================+ +| ``SET`` | Used for modifying flag attributes. | ``SET enableLogDebug=false`` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``SHOW / ALL`` | Used to print either a specific flag value or all flag values. | ``SHOW `` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``SHOW ALL LIKE`` | Used as a wildcard character for flag names. 
| ``SHOW `` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``SELECT show_conf();`` | Used to print all flags with the following attributes: | ``rechunkThreshold,90,true,RND,regular`` | +| | | | +| | * Flag name | | +| | * Default value | | +| | * Is Developer Mode (Boolean) | | +| | * Flag category | | +| | * Flag type | | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``SELECT show_conf_extended();`` | Used to print all information output by the show_conf UF command, in addition to description, usage, data type, default value and range. | ``rechunkThreshold,90,true,RND,regular`` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``show_md_flag UF`` | Used to show a specific flag/all flags stored in the metadata. |* Example 1: ``* master=> ALTER SYSTEM SET heartbeatTimeout=111;`` | +| | |* Example 2: ``* master=> select show_md_flag('all'); heartbeatTimeout,111`` | +| | |* Example 3: ``* master=> select show_md_flag('heartbeatTimeout'); heartbeatTimeout,111`` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``ALTER SYSTEM SET `` | Used for storing or modifying flag attributes in the metadata. | ``ALTER SYSTEM SET `` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ +| ``ALTER SYSTEM RESET `` | Used to remove a flag or all flag attributes from the metadata. | ``ALTER SYSTEM RESET `` | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------+ \ No newline at end of file diff --git a/configuration_guides/current_method_flag_types.rst b/configuration_guides/current_method_flag_types.rst new file mode 100644 index 000000000..6135268c2 --- /dev/null +++ b/configuration_guides/current_method_flag_types.rst @@ -0,0 +1,95 @@ +.. _current_method_flag_types: + +******* +Workers +******* + +Workers can be individually configured using the :ref:`worker configuration file`, which allows for persistent modifications to be made. 
Persistent modification refers to changes made to a system or component that are saved and retained even after the system is restarted or shut down, allowing the modifications to persist over time. + +The worker configuration file is not subject to frequent changes, providing stability to the system's configuration. + + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Flag Name + - Who May Configure + - Description + - Data Type + - Default Value + * - ``cudaMemQuota`` + - SUPERUSER + - Sets the percentage of total device memory used by your instance of SQream. + - uint + - ``90`` + * - ``healerDetectionFrequencySeconds`` + - SUPERUSER + - Determines the default frequency for the healer to check that its conditions are met. + - + - ``86400`` (seconds) + * - ``isHealerOn`` + - SUPERUSER + - Enables the Query Healer, which periodically examines the progress of running statements and logs statements exceeding the ``healerMaxInactivityHours`` flag setting. + - boolean + - ``TRUE`` + * - ``logFormat`` + - SUPERUSER + - Determines the file format of the log files. Format may be ``csv``, ``json``, or both (all logs will be saved as both ``csv`` and ``json`` files) + - string + - ``csv`` + * - ``loginMaxRetries`` + - SUPERUSER + - Sets the permitted log-in attempts. + - bigint + - ``5`` + * - ``machineIP`` + - SUPERUSER + - Enables you to manually set the reported IP. + - string + - ``127.0.0.1`` + * - ``maxConnectionInactivitySeconds`` + - SUPERUSER + - Determines the maximum period of session idleness, after which the connection is terminated. + - bigint + - ``86400`` (seconds) + * - ``maxConnections`` + - SUPERUSER + - Defines the maximum allowed connections per Worker. + - bigint + - ``1000`` + * - ``metadataServerPort`` + - SUPERUSER + - Sets the port used to connect to the metadata server. SQream recommends using port ranges above 1024 because ports below 1024 are usually reserved, although there are no strict limitations. You can use any positive number (1 - 65535) while setting this flag. + - uint + - ``3105`` + * - ``useConfigIP`` + - SUPERUSER + - Activates the machineIP (``TRUE``). Setting this flag to ``FALSE`` ignores the machineIP and automatically assigns a local network IP. This cannot be activated in a cloud scenario (on-premises only). + - boolean + - ``FALSE`` + + + * - ``healerMaxInactivityHours`` + - SUPERUSER + - Used for defining the threshold for creating a log recording a slow statement. The log includes information about memory, CPU, and GPU. + - bigint + - ``5`` + * - ``limitQueryMemoryGB`` + - Anyone + - Prevents a query from processing more memory than the defined value. + - uint + - ``100000`` + + + + + + + + + + + + + diff --git a/configuration_guides/current_method_modification_methods.rst b/configuration_guides/current_method_modification_methods.rst new file mode 100644 index 000000000..ab07c1a6c --- /dev/null +++ b/configuration_guides/current_method_modification_methods.rst @@ -0,0 +1,123 @@ +.. _current_method_modification_methods: + +************************** +Modification Methods +************************** + +.. _modifying_your_configuration_using_the_worker_configuration_file: + +Worker Configuration File +-------------------------- + +You can modify your configuration using the **worker configuration file (config.json)**. Changes that you make to worker configuration files are persistent. 
Note that you can only set the attributes in your worker configuration file **before** initializing your SQream worker, and while your worker is active these attributes are read-only. + +The following is an example of the default worker configuration file: + +.. code-block:: json + + { + "cluster": "/home/sqream/sqream_storage", + "cudaMemQuota": 96, + "gpu": 0, + "cpu": -1, + "legacyConfigFilePath": "sqream_config_legacy.json", + "licensePath": "/etc/sqream/license.enc", + "limitQueryMemoryGB": 30, + "parquetReaderThreads": 8, + "machineIP": "127.0.0.1", + "metadataServerIp": "127.0.0.1", + "metadataServerPort": 3105, + "port": 5000, + "instanceId": "sqream_0_1", + "portSsl": 5100, + "initialSubscribedServices": "sqream", + "useConfigIP": true, + "logFormat": "csv,json" + } + +You can access the legacy configuration file from the ``legacyConfigFilePath`` parameter shown above. If all (or most) of your workers require the same flag settings, you can set the ``legacyConfigFilePath`` attribute to the same legacy file. + +.. _modifying_your_configuration_using_a_legacy_configuration_file: + +Cluster and Session Configuration File +-------------------------------------- + +You can modify your configuration using a legacy configuration file. + +The legacy configuration file provides access to the read/write flags. A link to this file is provided in the **legacyConfigFilePath** parameter in the worker configuration file. + +The following is an example of the default cluster and session configuration file: + +.. code-block:: json + + { + "diskSpaceMinFreePercent": 1, + "DefaultPathToLogs": "/home/sqream/sqream_storage/tmp_logs/", + "enableLogDebug": false, + "insertCompressors": 8, + "insertParsers": 8, + "isUnavailableNode": false, + "logBlackList": "webui", + "logDebugLevel": 6, + "nodeInfoLoggingSec": 60, + "useClientLog": true, + "useMetadataServer": true, + "spoolMemoryGB": 28, + "logMaxFileSizeMB": 20, + "logFileRotateTimeFrequency": "daily", + "waitForClientSeconds": 18000 + } + + +Metadata Configuration File +--------------------------- + +When attempting to free up disk space in Oracle Object Store by executing ``DELETE``, ``cleanup``, ``TRUNCATE``, or ``DROP``, ensure that the following four flags are consistently configured in both the :ref:`Worker` and Metadata configuration files. + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Flag Name + - Who May Configure + - Description + - Data Type + - Default Value + * - ``OciBaseRegion`` + - SUPERUSER + - Sets your Oracle Cloud Infrastructure (OCI) region + - String + - NA + * - ``OciVerifySsl`` + - SUPERUSER + - Controls whether SSL certificates are verified. By default, verification is enabled. To disable it, set the variable to ``FALSE`` + - boolean + - ``TRUE`` + * - ``OciAccessKey`` + - SUPERUSER + - Sets your Oracle Cloud Infrastructure (OCI) access key + - String + - NA + * - ``OciAccessSecret`` + - SUPERUSER + - Sets your Oracle Cloud Infrastructure (OCI) access secret + - String + - NA + +The following is an example of the metadata configuration file: + +.. 
code-block:: json + + { + "OciBaseRegion": "us-ashburn-1", + "OciVerifySsl": false, + "OciAccessKey": "587f59dxxxxxxxxxxxxxxxxxxxxxxxxx", + "OciAccessSecret": "LrSEb+RZgxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" + } + + + + + + + diff --git a/configuration_guides/current_method_showing_all_flags_in_the_catalog_table.rst b/configuration_guides/current_method_showing_all_flags_in_the_catalog_table.rst new file mode 100644 index 000000000..f47432e64 --- /dev/null +++ b/configuration_guides/current_method_showing_all_flags_in_the_catalog_table.rst @@ -0,0 +1,24 @@ +:orphan: + +.. _current_method_showing_all_flags_in_the_catalog_table: + +************************************** +Showing All Flags in the Catalog Table +************************************** + +SQream uses the **sqream_catalog.parameters** catalog table for showing all flags, providing the scope (default, cluster and session), description, default value and actual value. + +The following is the correct syntax for a catalog table query: + +.. code-block:: sql + + SELECT * FROM sqream_catalog.parameters; + +The following is an example of a catalog table query: + +.. code-block:: console + + externalTableBlobEstimate, 100, 100, default, + varcharEncoding, ascii, ascii, default, Changes the expected encoding for Varchar columns + useCrcForTextJoinKeys, true, true, default, + hiveStyleImplicitStringCasts, false, false, default, diff --git a/configuration_guides/default_graceful_shutdown_timeout_minutes.rst b/configuration_guides/default_graceful_shutdown_timeout_minutes.rst new file mode 100644 index 000000000..20e27302c --- /dev/null +++ b/configuration_guides/default_graceful_shutdown_timeout_minutes.rst @@ -0,0 +1,13 @@ +:orphan: + +.. _default_graceful_shutdown_timeout_minutes: + +***************************************** +DEFAULT GRACEFUL SHUTDOWN TIMEOUT MINUTES +***************************************** + +The ``defaultGracefulShutdownTimeoutMinutes`` flag determines the duration of the default grace period for shutting down the system before forcefully terminating it. + +* **Data type** - size_t +* **Default value** - ``5`` +* **Allowed values** - 1-4000000000 \ No newline at end of file diff --git a/configuration_guides/developer_mode.rst b/configuration_guides/developer_mode.rst index fbb6c0cec..ebd9873f0 100644 --- a/configuration_guides/developer_mode.rst +++ b/configuration_guides/developer_mode.rst @@ -1,8 +1,11 @@ +:orphan: + .. _developer_mode: -************************* +********************************** Enabling Modification of R&D Flags -************************* +********************************** + The ``developerMode`` flag enables modifying R&D flags. The following describes the ``developerMode`` flag: diff --git a/configuration_guides/enable_device_debug_messages.rst b/configuration_guides/enable_device_debug_messages.rst index c7d340b65..9f83211e7 100644 --- a/configuration_guides/enable_device_debug_messages.rst +++ b/configuration_guides/enable_device_debug_messages.rst @@ -1,8 +1,11 @@ +:orphan: + .. _enable_device_debug_messages: -************************* +**************************************** Checking for Post-Production CUDA Errors -************************* +**************************************** + The ``enableDeviceDebugMessages`` flag checks for CUDA errors after producing each chunk. 
The following describes the ``enableDeviceDebugMessages`` flag: diff --git a/configuration_guides/enable_log_debug.rst b/configuration_guides/enable_log_debug.rst index 1566c4882..c8e922ffc 100644 --- a/configuration_guides/enable_log_debug.rst +++ b/configuration_guides/enable_log_debug.rst @@ -1,8 +1,11 @@ +:orphan: + .. _enable_log_debug: -************************* -Enabling Modification of clientLogger_debug File -************************* +************************************************ +Enabling Modification of ClientLogger Debug File +************************************************ + The ``enableLogDebug`` flag enables creating and logging in the **clientLogger_debug** file. The following describes the ``enableLogDebug`` flag: diff --git a/configuration_guides/enable_nv_prof_markers.rst b/configuration_guides/enable_nv_prof_markers.rst index 9edbf28e3..8689b8bd1 100644 --- a/configuration_guides/enable_nv_prof_markers.rst +++ b/configuration_guides/enable_nv_prof_markers.rst @@ -1,8 +1,11 @@ +:orphan: + .. _enable_nv_prof_markers: -************************* +************************************** Activating the NVidia Profiler Markers -************************* +************************************** + The ``enableNvprofMarkers`` flag activates the NVidia Profiler (nvprof) markers. The following describes the ``enableNvprofMarkers`` flag: diff --git a/configuration_guides/end_log_message.rst b/configuration_guides/end_log_message.rst index 46a1c71ae..088cc1e30 100644 --- a/configuration_guides/end_log_message.rst +++ b/configuration_guides/end_log_message.rst @@ -1,8 +1,11 @@ +:orphan: + .. _end_log_message: -************************* +************************************ Appending String at End of Log Lines -************************* +************************************ + The ``endLogMessage`` flag appends a string at the end of each log line. The following describes the ``endLogMessage`` flag: diff --git a/configuration_guides/extent_storage_file_size_mb.rst b/configuration_guides/extent_storage_file_size_mb.rst index dd4cddec7..84d762563 100644 --- a/configuration_guides/extent_storage_file_size_mb.rst +++ b/configuration_guides/extent_storage_file_size_mb.rst @@ -1,8 +1,11 @@ +:orphan: + .. _extent_storage_file_size_mb: -************************* +*********************************************** Setting Minimum Extent Size for Bulk Table Data -************************* +*********************************************** + The ``extentStorageFileSizeMB`` flag sets the minimum size in mebibytes of extents for bulk table data. The following describes the ``extentStorageFileSizeMB`` flag: diff --git a/configuration_guides/flip_join_order.rst b/configuration_guides/flip_join_order.rst index 341f12ada..72aa365a8 100644 --- a/configuration_guides/flip_join_order.rst +++ b/configuration_guides/flip_join_order.rst @@ -1,8 +1,11 @@ +:orphan: + .. _flip_join_order: -************************* +************************************** Flipping Join Order to Force Equijoins -************************* +************************************** + The ``flipJoinOrder`` flag reorders join to force equijoins and/or equijoins sorted by table size. The following describes the ``flipJoinOrder`` flag: diff --git a/configuration_guides/gather_mem_stat.rst b/configuration_guides/gather_mem_stat.rst index 802e12b1f..53fe275b3 100644 --- a/configuration_guides/gather_mem_stat.rst +++ b/configuration_guides/gather_mem_stat.rst @@ -1,8 +1,11 @@ +:orphan: + .. 
_gather_mem_stat: -************************* +************************************************* Monitoring and Printing Pinned Allocation Reports -************************* +************************************************* + The ``gatherMemStat`` flag monitors all pinned allocations and all **memcopies** to and from a device, and prints a report of pinned allocations that were not **memcopied** to and from the device using the **dump_pinned_misses** utility function. The following describes the ``gatherMemStat`` flag: diff --git a/configuration_guides/generic_flags.rst b/configuration_guides/generic_flags.rst deleted file mode 100644 index 2f17e8202..000000000 --- a/configuration_guides/generic_flags.rst +++ /dev/null @@ -1,17 +0,0 @@ -.. _generic_flags: - -************************* -Generic Flags -************************* - -The **Generic Flags** page describes the following flag types, which can be modified by standard users on a session basis: - -.. toctree:: - :maxdepth: 1 - :glob: - - generic_regular_flags - generic_cluster_flags - generic_worker_flags - - diff --git a/configuration_guides/generic_regular_flags.rst b/configuration_guides/generic_regular_flags.rst deleted file mode 100644 index f8235ae07..000000000 --- a/configuration_guides/generic_regular_flags.rst +++ /dev/null @@ -1,21 +0,0 @@ -.. _generic_regular_flags: - -************************* -Regular Generic Flags -************************* - -The **Regular Generic Flags** page describes **Regular** modification type flags, which can be modified by standard users on a session basis: - -* `Flipping Join Order to Force Equijoins `_ -* `Determining Client Level `_ -* `Setting CPU to Compress Defined Columns `_ -* `Setting Query Memory Processing Limit `_ -* `Setting the Spool Memory `_ -* `Setting Cache Partitions `_ -* `Setting Cache Flushing `_ -* `Setting InMemory Spool Memory `_ -* `Setting Disk Spool Memory `_ -* `Setting Spool Saved File Directory Location `_ -* `Setting Data Stored Persistently on Cache `_ -* `Setting Persistent Spool Saved File Directory Location `_ -* `Setting Session Tag Name `_ \ No newline at end of file diff --git a/configuration_guides/generic_worker_flags.rst b/configuration_guides/generic_worker_flags.rst deleted file mode 100644 index 97cee4ecf..000000000 --- a/configuration_guides/generic_worker_flags.rst +++ /dev/null @@ -1,8 +0,0 @@ -.. _generic_worker_flags: - -************************* -Worker Generic Flags -************************* -The Worker Generic Flags** page describes **Worker** modification type flags, which can be modified by standard users on a session basis: - - * `Persisting Your Cache Directory `_ \ No newline at end of file diff --git a/configuration_guides/graceful_shutdown.rst b/configuration_guides/graceful_shutdown.rst new file mode 100644 index 000000000..839170ce5 --- /dev/null +++ b/configuration_guides/graceful_shutdown.rst @@ -0,0 +1,23 @@ +:orphan: + +.. _graceful_shutdown: + +************************************ +Setting the Graceful Server Shutdown +************************************ + +The ``defaultGracefulShutdownTimeoutMinutes`` flag is used for setting the amount of time to pass before SQream performs a graceful server shutdown. + +The following describes the ``defaultGracefulShutdownTimeoutMinutes`` flag: + +* **Data type** - size_t +* **Default value** - ``5`` +* **Allowed values** - 1-4000000000 + +For more information, see :ref:`shutdown_server_command`. 
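+
+As an illustration only, a cluster-level setting of this flag might look as follows (the value ``10`` is not a recommendation, merely a hypothetical example):
+
+.. code-block:: postgres
+
+   -- Allow up to 10 minutes for running statements to complete
+   -- before the server shuts down
+   ALTER SYSTEM SET defaultGracefulShutdownTimeoutMinutes = 10;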
+ +For related flags, see the following: + +* :ref:`is_healer_on` + +* :ref:`current_method_flag_types` \ No newline at end of file diff --git a/configuration_guides/healer_action_cleanup_connection.rst b/configuration_guides/healer_action_cleanup_connection.rst new file mode 100644 index 000000000..2d0c9f7e1 --- /dev/null +++ b/configuration_guides/healer_action_cleanup_connection.rst @@ -0,0 +1,13 @@ +:orphan: + +.. _healer_action_cleanup_connection: + +******************************** +HEALER ACTION CLEANUP CONNECTION +******************************** + +The ``healerActionCleanupConnection`` flag enables the automatic cleanup of connections during the execution of healer actions, ensuring the proper management and termination of connections in the system. + +* **Data type** - Boolean +* **Default value** - ``true`` +* **Allowed values** - ``true`` or ``false`` \ No newline at end of file diff --git a/configuration_guides/healer_action_graceful_shutdown.rst b/configuration_guides/healer_action_graceful_shutdown.rst new file mode 100644 index 000000000..2de9bb81c --- /dev/null +++ b/configuration_guides/healer_action_graceful_shutdown.rst @@ -0,0 +1,13 @@ +:orphan: + +.. _healer_action_graceful_shutdown: + +******************************* +HEALER ACTION GRACEFUL SHUTDOWN +******************************* + +The ``healerActionGracefulShutdown`` flag sets the option for performing a graceful shutdown during the execution of healer actions, allowing ongoing operations to complete before terminating the process or system. + +* **Data type** - Boolean +* **Default value** - ``false`` +* **Allowed values** - ``true`` or ``false`` \ No newline at end of file diff --git a/configuration_guides/healer_detection_frequency_seconds.rst b/configuration_guides/healer_detection_frequency_seconds.rst new file mode 100644 index 000000000..22283272e --- /dev/null +++ b/configuration_guides/healer_detection_frequency_seconds.rst @@ -0,0 +1,15 @@ +:orphan: + +.. _healer_detection_frequency_seconds: + +********************************** +Healer Detection Frequency Seconds +********************************** + +The ``healerDetectionFrequencySeconds`` flag determines the default frequency for the healer to check that its conditions are met. + +* **Data type** - size_t +* **Default value** - ``60*60*24`` +* **Allowed values** - 1-4000000000 + +For related flags, see :ref:`is_healer_on`. \ No newline at end of file diff --git a/configuration_guides/healer_run_action_automatically.rst b/configuration_guides/healer_run_action_automatically.rst new file mode 100644 index 000000000..b1a69fed5 --- /dev/null +++ b/configuration_guides/healer_run_action_automatically.rst @@ -0,0 +1,13 @@ +:orphan: + +.. _healer_run_action_automatically: + +******************************* +HEALER RUN ACTION AUTOMATICALLY +******************************* + +The ``healerRunActionAutomatically`` flag determines whether the healer component automatically executes actions to resolve issues or requires manual intervention for each action. + +* **Data type** - Boolean +* **Default value** - ``true`` +* **Allowed values** - ``true`` or ``false`` \ No newline at end of file diff --git a/configuration_guides/increase_chunk_size_before_reduce.rst b/configuration_guides/increase_chunk_size_before_reduce.rst index 982d3db35..8768f124e 100644 --- a/configuration_guides/increase_chunk_size_before_reduce.rst +++ b/configuration_guides/increase_chunk_size_before_reduce.rst @@ -1,8 +1,11 @@ +:orphan: + .. 
_increase_chunk_size_before_reduce: -************************* +******************************************* Increasing Chunk Size to Reduce Query Speed -************************* +******************************************* + The ``increaseChunkSizeBeforeReduce`` flag increases the chunk size to reduce query speed. The following describes the ``increaseChunkSizeBeforeReduce`` flag: diff --git a/configuration_guides/increase_mem_factors.rst b/configuration_guides/increase_mem_factors.rst index 166a57e14..ac58ae393 100644 --- a/configuration_guides/increase_mem_factors.rst +++ b/configuration_guides/increase_mem_factors.rst @@ -1,8 +1,11 @@ +:orphan: + .. _increase_mem_factors: -************************* -Adding Rechunker before Expensing Chunk Producer -************************* +************************************************ +Adding Rechunker Before Expensive Chunk Producer +************************************************ + The ``increaseMemFactors`` flag adds a rechunker before expensive chunk producer. The following describes the ``increaseMemFactors`` flag: diff --git a/configuration_guides/index.rst b/configuration_guides/index.rst index 6a65853f0..cf89a57ce 100644 --- a/configuration_guides/index.rst +++ b/configuration_guides/index.rst @@ -10,9 +10,6 @@ The **Configuration Guides** page describes the following configuration informat :maxdepth: 1 :glob: - spooling - configuration_methods - configuration_flags - - - + configuring_sqream + ldap + sso \ No newline at end of file diff --git a/configuration_guides/is_healer_on.rst b/configuration_guides/is_healer_on.rst new file mode 100644 index 000000000..633b14971 --- /dev/null +++ b/configuration_guides/is_healer_on.rst @@ -0,0 +1,15 @@ +:orphan: + +.. _is_healer_on: + +************ +Is Healer On +************ + +The ``isHealerOn`` flag enables the Query Healer, which periodically examines the progress of running statements and logs statements exceeding the ``maxStatementInactivitySeconds`` flag setting. + +* **Data type** - boolean +* **Default value** - ``true`` +* **Allowed values** - ``true``, ``false`` + +For related flags, see :ref:`max_statement_inactivity_seconds`. \ No newline at end of file diff --git a/configuration_guides/ldap.rst b/configuration_guides/ldap.rst new file mode 100644 index 000000000..6e5b5f3e7 --- /dev/null +++ b/configuration_guides/ldap.rst @@ -0,0 +1,331 @@ +.. _ldap: + +**** +LDAP +**** + +Lightweight Directory Access Protocol (LDAP) is an authentication management service used with Microsoft Active Directory and other directory services. + +Once LDAP has been configured as an authentication service for SQreamDB, authentication for all existing and newly added roles is handled by an LDAP server. The exception to this rule is the out-of-the-box administrative ``sqream`` role, which will always use the conventional SQreamDB authentication instead of LDAP authentication. + +.. contents:: + :local: + :depth: 1 + +Before You Begin +================ + +* If SQreamDB is being installed within an environment where LDAP is already configured, it is best practice to ensure that the newly created SQreamDB role names are consistent with existing LDAP user names. + +* When setting up LDAP for an existing SQreamDB installation, it's recommended to ensure that newly created LDAP usernames match existing SQreamDB role names. If SQreamDB roles were not configured in LDAP or have different names, they'll be recreated in SQreamDB as roles without login capabilities, permissions, or default schemas. A sketch of preparing a matching role follows this list. 
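+
+For example, assuming an LDAP user named ``jsmith`` (a hypothetical name used here for illustration), a matching SQreamDB role could be prepared as follows; the same commands appear in the Advanced Method section below:
+
+.. code-block:: postgres
+
+   -- Create a role whose name matches the LDAP user name
+   CREATE ROLE jsmith;
+
+   -- Allow the role to log in and connect to the master database
+   GRANT LOGIN TO jsmith;
+   GRANT CONNECT ON DATABASE master TO jsmith;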
+ +Setting LDAP Authentication Management +====================================== + +To set LDAP authentication for SQreamDB, choose one of the following configuration methods: + +.. contents:: + :local: + :depth: 1 + +Basic Method +------------ + +A traditional approach to authentication in which the user provides a username and password combination to authenticate with the LDAP server. In this approach, all users are given access to SQreamDB. + +Flags +^^^^^ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Flag + - Description + * - ``authenticationMethod`` + - Configure an authentication method: ``sqream`` or ``ldap``. To configure LDAP authentication, choose ``ldap`` + * - ``ldapIpAddress`` + - Configure the IP address or the Fully Qualified Domain Name (FQDN) of your LDAP server and select a protocol: ``ldap`` or ``ldaps``. SQreamDB recommends using the encrypted ``ldaps`` protocol + * - ``ldapConnTimeoutSec`` + - Configure the LDAP connection timeout threshold (seconds). Default = 30 seconds + * - ``ldapPort`` + - LDAP server port number. + * - ``ldapAdvancedMode`` + - Configure either basic or advanced authentication method. Default = ``false`` + * - ``ldapPrefix`` + - String to prefix to the user name when forming the DN to bind as, when doing simple bind authentication + * - ``ldapSuffix`` + - String to append to the user name when forming the DN to bind as, when doing simple bind authentication + + +Basic Method Configuration +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Only roles with admin privileges or higher may enable LDAP Authentication. + +1. Set the ``authenticationMethod`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET authenticationMethod = 'ldap'; + +2. Set the ``ldapIpAddress`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapIpAddress = ''; + +3. Set the ``ldapPrefix`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapPrefix = '='; + +4. Set the ``ldapSuffix`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapSuffix = ''; + +5. To set the ``ldapPort`` flag (optional), run: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapPort = + +6. To set the ``ldapConnTimeoutSec`` flag (optional), run: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapConnTimeoutSec = <15>; + +7. Restart all sqreamd servers. + +Example +^^^^^^^ + +After completing the setup above, we can bind to a user by a distinguished name. For example, if the DN of the user is: + +.. code-block:: postgres + + CN=ElonMusk,OU=Sqream Users,DC=sqream,DC=loc + +We could set the ``ldapPrefix`` and ``ldapSuffix`` to: + +.. code-block:: postgres + + ALTER SYSTEM SET ldapPrefix = 'CN='; + + ALTER SYSTEM SET ldapSuffix = ',OU=Sqream Users,DC=sqream,DC=loc'; + +Logging in will then be possible with the username ElonMusk, using the sqream client: + +.. code-block:: postgres + + ./sqream sql --username=ElonMusk --password=sqream123 --databasename=master --port=5000 + +Advanced Method +--------------- + +This method lets users be grouped into categories. Each category can then be given or denied access to SQreamDB, giving administrators control over access. + +Flags +^^^^^ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Flag + - Description + * - ``authenticationMethod`` + - Configure an authentication method: ``sqream`` or ``ldap``. To configure LDAP authentication, choose ``ldap`` + * - ``ldapIpAddress`` + - Configure the IP address or the Fully Qualified Domain Name (FQDN) of your LDAP server and select a protocol: ``ldap`` or ``ldaps``. 
SQreamDB recommends using the encrypted ``ldaps`` protocol + * - ``ldapConnTimeoutSec`` + - Configure the LDAP connection timeout threshold (seconds). Default = 30 seconds + * - ``ldapPort`` + - LDAP server port number + * - ``ldapAdvancedMode`` + - Set ``ldapAdvancedMode`` = ``true`` + * - ``ldapBaseDn`` + - Root DN to begin the search for the user in, when doing advanced authentication + * - ``ldapBindDn`` + - DN of user with which to bind to the directory to perform the search when doing search + bind authentication + * - ``ldapSearchAttribute`` + - Attribute to match against the user name in the search when doing search + bind authentication. If no attribute is specified, the ``uid`` attribute will be used + * - ``ldapSearchFilter`` + - Filters ``ldapAdvancedMode`` authentication. ``ALTER SYSTEM SET ldapSearchFilter = '(=)(=)(…)';`` + * - ``ldapGetAttributeList`` + - Enables you to include LDAP user attributes, as they appear in LDAP, in your SQreamDB metadata. After having set this flag, you may execute the :ref:`ldap_get_attr` utility function which will show you the attribute values associated with each SQreamDB role. + + +Preparing LDAP Users +^^^^^^^^^^^^^^^^^^^^ + +If installing SQreamDB in an environment with LDAP already set up, it's best to ensure the new SQreamDB role names match the existing LDAP user names. + +It is also recommended to: + +* Group Active Directory users so that they may be filtered during setup, using the ``ldapSearchFilter`` flag. + +* Provide a unique attribute to each user name, such as an employee ID, to be easily searched for when using the ``ldapSearchAttribute`` flag. + +Preparing SQreamDB Roles +^^^^^^^^^^^^^^^^^^^^^^^^ + +For a SQreamDB admin to be able to manage role permissions, every Active Directory user connecting to SQreamDB must have an existing SQreamDB role whose name is consistent with the LDAP user name. + +You may either :ref:`rename SQream roles` or create new roles, such as in the following example: + +1. Create a new role: + + .. code-block:: postgres + + CREATE ROLE role12345; + +2. Grant the new role login permission: + + .. code-block:: postgres + + GRANT LOGIN TO role12345; + +3. Grant the new role ``CONNECT`` permission: + + .. code-block:: postgres + + GRANT CONNECT ON DATABASE master TO role12345; + +Advanced Method Configuration +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Only roles with admin privileges or higher may enable LDAP Authentication. + +1. Configure your LDAP server bind password to be stored in SQreamDB metadata: + + .. code-block:: postgres + + GRANT PASSWORD <'binding_user_password'> TO ldap_bind_dn_admin_password; + + This action emulates the execution of a ``GRANT`` command, but it is needed solely for configuring the password. Note that ``ldap_bind_dn_admin_password`` is not an actual SQreamDB role. This password is encrypted within your SQreamDB metadata. + +2. Set the ``authenticationMethod`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET authenticationMethod = 'ldap'; + +3. Set the ``ldapAdvancedMode`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapAdvancedMode = true; + +4. Set the ``ldapIpAddress`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapIpAddress = ''; + +5. Set the ``ldapBindDn`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapBindDn = ; + +6. Set the ``ldapBaseDn`` flag: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapBaseDn = ''; + +7. Set the ``ldapSearchAttribute`` flag: + + .. 
code-block:: postgres + + ALTER SYSTEM SET ldapSearchAttribute = ''; + +8. To set the ``ldapSearchFilter`` flag (optional), run: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapSearchFilter = '(=)(=)[...]'; + +9. To set the ``ldapPort`` flag (optional), run: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapPort = + +10. To set the ``ldapConnTimeoutSec`` flag (optional), run: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapConnTimeoutSec = <15>; + +11. To set the ``ldapGetAttributeList`` flag (optional), run: + + .. code-block:: postgres + + ALTER SYSTEM SET ldapGetAttributeList = <'ldap_attribute1'>,<'ldap_attribute2'>,<'ldap_attribute3'>,[,...]; + + a. To see the LDAP user attributes associated with SQreamDB roles in your metadata, execute the :ref:`ldap_get_attr` utility function. + +12. Restart all sqreamd servers. + +Example +^^^^^^^ + +After completing the setup above, we can try to bind to a user by locating it through one of its unique attributes. + +User DN = + +.. code-block:: postgres + + CN=ElonMusk,OU=Sqream Users,DC=sqream,DC=loc + +The user has the value ``elonm`` for the ``sAMAccountName`` attribute. + + +.. code-block:: postgres + + GRANT PASSWORD 'LdapPassword12#4%' TO ldap_bind_dn_admin_password; + + ALTER SYSTEM SET authenticationMethod = 'ldap'; + + ALTER SYSTEM SET ldapAdvancedMode = true; + + ALTER SYSTEM SET ldapIpAddress = 'ldaps://192.168.10.20'; + + ALTER SYSTEM SET ldapPort = 5000; + + ALTER SYSTEM SET ldapBindDn = 'CN=LDAP admin,OU=network admin,DC=sqream,DC=loc'; + + ALTER SYSTEM SET ldapBaseDn = 'OU=Sqream Users,DC=sqream,DC=loc'; + + ALTER SYSTEM SET ldapSearchAttribute = 'sAMAccountName'; + + ALTER SYSTEM SET ldapConnTimeoutSec = 30; + + ALTER SYSTEM SET ldapSearchFilter = "(memberOf=CN=SqreamGroup,CN=Builtin,DC=sqream,DC=loc)(memberOf=CN=Admins,CN=Builtin,DC=sqream,DC=loc)"; + + +Logging in will then be possible with the username elonm, using the sqream client: + +.. code-block:: postgres + + ./sqream sql --username=elonm --password= --databasename=master --port=5000 + + +Disabling LDAP Authentication +============================= + +To disable LDAP authentication and configure sqream authentication: + +1. Execute the following syntax: + + .. code-block:: postgres + + ALTER SYSTEM SET authenticationMethod = 'sqream'; + +2. Restart all sqreamd servers. diff --git a/configuration_guides/level_db_write_buffer_size.rst b/configuration_guides/level_db_write_buffer_size.rst index c3cd60516..53e4cad57 100644 --- a/configuration_guides/level_db_write_buffer_size.rst +++ b/configuration_guides/level_db_write_buffer_size.rst @@ -1,3 +1,5 @@ +:orphan: + .. _level_db_write_buffer_size: ************************* diff --git a/configuration_guides/limit_query_memory_gb.rst b/configuration_guides/limit_query_memory_gb.rst index 7099674f2..2208954bd 100644 --- a/configuration_guides/limit_query_memory_gb.rst +++ b/configuration_guides/limit_query_memory_gb.rst @@ -1,8 +1,11 @@ +:orphan: + .. _limit_query_memory_gb: -************************* +************************************* Setting Query Memory Processing Limit -************************* +************************************* + The ``limitQueryMemoryGB`` flag prevents a query from processing more memory than the defined value. The following describes the ``limitQueryMemoryGB`` flag: diff --git a/configuration_guides/log_sys_level.rst b/configuration_guides/log_sys_level.rst index 07e1e5800..23b22da72 100644 --- a/configuration_guides/log_sys_level.rst +++ b/configuration_guides/log_sys_level.rst @@ -1,3 +1,5 @@ +:orphan: + .. 
_log_sys_level: ************************* diff --git a/configuration_guides/login_max_retries.rst b/configuration_guides/login_max_retries.rst new file mode 100644 index 000000000..2d7161a83 --- /dev/null +++ b/configuration_guides/login_max_retries.rst @@ -0,0 +1,14 @@ +:orphan: + +.. _login_max_retries: + +*********************************** +Adjusting Permitted Log-in Attempts +*********************************** + +The ``loginMaxRetries`` flag sets the permitted log-in attempts. + +The following describes the ``loginMaxRetries`` flag: + +* **Data type** - size_t +* **Default value** - ``5`` \ No newline at end of file diff --git a/configuration_guides/machine_ip.rst b/configuration_guides/machine_ip.rst index 66a8a9a10..8aaeb79bc 100644 --- a/configuration_guides/machine_ip.rst +++ b/configuration_guides/machine_ip.rst @@ -1,8 +1,11 @@ +:orphan: + .. _machine_ip: -************************* +************************************* Enabling Manually Setting Reported IP -************************* +************************************* + The ``machineIP`` flag enables you to manually set the reported IP. The following describes the ``machineIP`` flag: diff --git a/configuration_guides/max_avg_blob_size_to_compress_on_gpu.rst b/configuration_guides/max_avg_blob_size_to_compress_on_gpu.rst index 1081e0225..dd57d341f 100644 --- a/configuration_guides/max_avg_blob_size_to_compress_on_gpu.rst +++ b/configuration_guides/max_avg_blob_size_to_compress_on_gpu.rst @@ -1,8 +1,11 @@ +:orphan: + .. _max_avg_blob_size_to_compress_on_gpu: -************************* +*************************************** Setting CPU to Compress Defined Columns -************************* +*************************************** + The ``maxAvgBlobSizeToCompressOnGpu`` flag sets the CPU to compress columns with size above * . The following describes the ``maxAvgBlobSizeToCompressOnGpu`` flag: diff --git a/configuration_guides/max_connection_inactivity_seconds.rst b/configuration_guides/max_connection_inactivity_seconds.rst new file mode 100644 index 000000000..fab0e5951 --- /dev/null +++ b/configuration_guides/max_connection_inactivity_seconds.rst @@ -0,0 +1,14 @@ +:orphan: + +.. _max_connection_inactivity_seconds: + +********************************* +MAX CONNECTION INACTIVITY SECONDS +********************************* + +The ``maxConnectionInactivitySeconds`` flag determines the maximum period of session idleness, after which the connection is terminated. + +* **Data type** - size_t +* **Default value** - ``86400`` (60*60*24 seconds) +* **Allowed values** - ``1-4000000000`` + diff --git a/configuration_guides/max_connections.rst b/configuration_guides/max_connections.rst new file mode 100644 index 000000000..0f4ea10b2 --- /dev/null +++ b/configuration_guides/max_connections.rst @@ -0,0 +1,14 @@ +:orphan: + +.. _max_connections: + +******************************** +MAX CONNECTIONS +******************************** + +The ``maxConnections`` parameter is optional and sets the maximum number of connections allowed per Worker. Once a Worker reaches its maximum capacity, any new connections are directed to other available Workers. + +* **Data type** - size_t +* **Default value** - ``1000`` +* **Allowed values** - ``1-∞`` + diff --git a/configuration_guides/max_statement_inactivity_seconds.rst b/configuration_guides/max_statement_inactivity_seconds.rst new file mode 100644 index 000000000..7c4d148a2 --- /dev/null +++ b/configuration_guides/max_statement_inactivity_seconds.rst @@ -0,0 +1,14 @@ +:orphan: + +.. 
_max_statement_inactivity_seconds: + +******************************** +MAX STATEMENT INACTIVITY SECONDS +******************************** + +The ``maxStatementInactivitySeconds`` parameter is optional and determines the maximum duration of statement inactivity before terminating the Worker. Its behavior is contingent on the configuration of ``healerActionGracefulShutdown``. If set to ``true``, it triggers a graceful Worker shutdown; if set to ``false``, the Worker continues without shutting down, accompanied by relevant log information regarding the stuck query. + +* **Data type** - size_t +* **Default value** - ``5*60*60 seconds (18000)`` +* **Allowed values** - ``1-4000000000`` + diff --git a/configuration_guides/memory_reset_trigger_mb.rst b/configuration_guides/memory_reset_trigger_mb.rst index bb8a383bb..b9952670b 100644 --- a/configuration_guides/memory_reset_trigger_mb.rst +++ b/configuration_guides/memory_reset_trigger_mb.rst @@ -1,8 +1,11 @@ +:orphan: + .. _memory_reset_trigger_mb: -************************* +*********************************** Setting Memory Used to Abort Server -************************* +*********************************** + The ``memoryResetTriggerMB`` flag sets the size of memory used during a query to trigger aborting the server. The following describes the ``memoryResetTriggerMB`` flag: diff --git a/configuration_guides/metadata_server_port.rst b/configuration_guides/metadata_server_port.rst index 20f9a2db8..0600a3a0e 100644 --- a/configuration_guides/metadata_server_port.rst +++ b/configuration_guides/metadata_server_port.rst @@ -1,8 +1,11 @@ +:orphan: + .. _metadata_server_port: -************************* +************************************************** Setting Port Used for Metadata Server Connection -************************* +************************************************** + The ``metadataServerPort`` flag sets the port used to connect to the metadata server. SQream recommends using port ranges above 1024 because ports below 1024 are usually reserved, although there are no strict limitations. You can use any positive number (1 - 65535) while setting this flag. The following describes the ``metadataServerPort`` flag: diff --git a/configuration_guides/mt_read.rst b/configuration_guides/mt_read.rst index 4aca17185..0c54ec07d 100644 --- a/configuration_guides/mt_read.rst +++ b/configuration_guides/mt_read.rst @@ -1,8 +1,11 @@ +:orphan: + .. _mt_read: -************************* +********************************************** Splitting Large Reads for Concurrent Execution -************************* +********************************************** + The ``mtRead`` flag splits large reads into multiple smaller ones and executes them concurrently. The following describes the ``mtRead`` flag: diff --git a/configuration_guides/mt_read_workers.rst b/configuration_guides/mt_read_workers.rst index 5f18fd4b3..033bf759e 100644 --- a/configuration_guides/mt_read_workers.rst +++ b/configuration_guides/mt_read_workers.rst @@ -1,8 +1,11 @@ +:orphan: + .. _mt_read_workers: -************************* +************************************************ Setting Worker Amount to Handle Concurrent Reads -************************* +************************************************ + The ``mtReadWorkers`` flag sets the number of workers to handle smaller concurrent reads. 
The following describes the ``mtReadWorkers`` flag: diff --git a/configuration_guides/orc_implicit_casts.rst b/configuration_guides/orc_implicit_casts.rst index 04cc903e9..1c09f7a97 100644 --- a/configuration_guides/orc_implicit_casts.rst +++ b/configuration_guides/orc_implicit_casts.rst @@ -1,8 +1,11 @@ +:orphan: + .. _orc_implicit_casts: -************************* +*********************************** Setting Implicit Casts in ORC Files -************************* +*********************************** + The ``orcImplicitCasts`` flag sets the implicit cast in orc files, such as **int** to **tinyint** and vice versa. The following describes the ``orcImplicitCasts`` flag: diff --git a/configuration_guides/previous_configuration_method.rst b/configuration_guides/previous_configuration_method.rst deleted file mode 100644 index ff48e7cc1..000000000 --- a/configuration_guides/previous_configuration_method.rst +++ /dev/null @@ -1,268 +0,0 @@ -.. _previous_configuration_method: - -************************** -Configuring SQream Using the Previous Configuration Method -************************** -The **Configuring SQream Using the Previous Configuration Method** page describes SQream’s previous method for configuring your instance of SQream, and includes the following topics: - -.. contents:: - :local: - :depth: 1 - -By default, configuration files are stored in ``/etc/sqream``. - -A very minimal configuration file looks like this: - -.. code-block:: json - - { - "compileFlags": { - }, - "runtimeFlags": { - }, - "runtimeGlobalFlags": { - }, - "server": { - "gpu": 0, - "port": 5000, - "cluster": "/home/sqream/sqream_storage", - "licensePath": "/etc/sqream/license.enc" - } - } - -* Each SQream DB worker (``sqreamd``) has a dedicated configuration file. - -* The configuration file contains four distinct sections, ``compileFlags``, ``runtimeFlags``, ``runtimeGlobalFlags``, and ``server``. - -In the example above, the worker will start on port 5000, and will use GPU #0. - -Frequently Set Parameters ------- -.. todo - list-table:: Compiler flags - :widths: auto - :header-rows: 1 - - * - Name - - Section - - Description - - Default - - Value range - - Example - * - - - - - - - - - - - - -.. list-table:: Server flags - :widths: auto - :header-rows: 1 - - * - Name - - Section - - Description - - Default - - Value range - - Example - * - ``gpu`` - - ``server`` - - Controls the GPU ordinal to use - - ✗ - - 0 to (number of GPUs in the machine -1). Check with ``nvidia-smi -L`` - - ``"gpu": 0`` - * - ``port`` - - ``server`` - - Controls the TCP port to listen on - - ✗ - - 1024 to 65535 - - ``"port" : 5000`` - * - ``ssl_port`` - - ``server`` - - Controls the SSL TCP port to listen on. Must be different from ``port`` - - ✗ - - 1024 to 65535 - - ``"ssl_port" : 5100`` - * - ``cluster`` - - ``server`` - - Specifies the cluster path root - - ✗ - - Valid local system path - - ``"cluster" : "/home/sqream/sqream_storage"`` - * - ``license_path`` - - ``server`` - - Specifies the license file for this worker - - ✗ - - Valid local system path to license file - - ``"license_path" : "/etc/sqream/license.enc"`` - -.. list-table:: Runtime global flags - :widths: auto - :header-rows: 1 - - * - Name - - Section - - Description - - Default - - Value range - - Example - * - ``spoolMemoryGb`` - - ``runtimeGlobalFlags`` - - Modifies RAM allocated for the worker for intermediate results. Statements that use more memory than this setting will spool to disk, which could degrade performance. 
We recommend not to exceed the amount of RAM in the machine. This setting must be set lower than the ``limitQueryMemoryGB`` setting. - - ``128`` - - 1 to maximum available RAM in gigabytes. - - ``"spoolMemoryGb": 250`` - * - ``limitQueryMemoryGB`` - - ``runtimeGlobalFlags`` - - Modifies the maximum amount of RAM allocated for a query. The recommended value for this is ``total host memory`` / ``sqreamd workers on host``. For example, for a machine with 512GB of RAM and 4 workers, the recommended setting is ``512/4 → 128``. - - ``10000`` - - ``1`` to ``10000`` - - ``"limitQueryMemoryGB" : 128`` - * - ``cudaMemQuota`` - - ``runtimeGlobalFlags`` - - Modifies the maximum amount of GPU RAM allocated for a worker. The recommended value is 99% for a GPU with a single worker, or 49% for a GPU with two workers. - - ``90`` % - - ``1`` to ``99`` - - ``"cudaMemQuota" : 99`` - * - ``showFullExceptionInfo`` - - ``runtimeGlobalFlags`` - - Shows complete error message with debug information. Use this for debugging. - - ``false`` - - ``true`` or ``false`` - - ``"showFullExceptionInfo" : true`` - * - ``initialSubscribedServices`` - - ``runtimeGlobalFlags`` - - Comma separated list of :ref:`service queues` that the worker is subscribed to - - ``"sqream"`` - - Comma separated list of service names, with no spaces. Services that don't exist will be created. - - ``"initialSubscribedServices": "sqream,etl,management"`` - * - ``logClientLevel`` - - ``runtimeGlobalFlags`` - - Used to control which log level should appear in the logs - - ``4`` (``INFO``) - - ``0`` SYSTEM (lowest) - ``4`` INFO (highest). See :ref:`information level table` for explanation about these log levels. - - ``"logClientLevel" : 3`` - * - ``nodeInfoLoggingSec`` - - ``runtimeGlobalFlags`` - - Sets an interval for automatically logging long-running statements' :ref:`show_node_info` output. Output is written as a message type ``200``. - - ``60`` (every minute) - - Positive whole number >=1. - - ``"nodeInfoLoggingSec" : 5`` - * - ``useLogMaxFileSize`` - - ``runtimeGlobalFlags`` - - Defines whether SQream logs should be cycled when they reach ``logMaxFileSizeMB`` size. When ``true``, set the ``logMaxFileSizeMB`` accordingly. - - ``false`` - - ``false`` or ``true`` - - ``"useLogMaxFileSize" : true`` - * - ``logMaxFileSizeMB`` - - ``runtimeGlobalFlags`` - - Sets the size threshold in megabytes after which a new log file will be opened. - - ``20`` - - ``1`` to ``1024`` (1MB to 1GB) - - ``"logMaxFileSizeMB" : 250`` - * - ``logFileRotateTimeFrequency`` - - ``runtimeGlobalFlags`` - - Control frequency of log rotation - - ``never`` - - ``daily``, ``weekly``, ``monthly``, ``never`` - - ``"logClientLevel" : 3`` - * - ``useMetadataServer`` - - ``runtimeGlobalFlags`` - - Specifies if this worker connects to a cluster (``true``) or is standalone (``false``). If set to ``true``, also set ``metadataServerIp`` - - ``true`` - - ``false`` or ``true`` - - ``"useMetadataServer" : true`` - * - ``metadataServerIp`` - - ``runtimeGlobalFlags`` - - Specifies the hostname or IP of the metadata server, when ``useMetadataServer`` is set to ``true``. - - ``127.0.0.1`` - - A valid IP or hostname - - ``"metadataServerIp": "127.0.0.1"`` - * - ``useConfigIP`` - - ``runtimeGlobalFlags`` - - Specifies if the metadata should use a pre-determined hostname or IP to refer to this worker. If set to ``true``, set the ``machineIp`` configuration accordingly. 
- - ``false`` - automatically derived by the TCP socket - - ``false`` or ``true`` - - ``"useConfigIP" : true`` - * - ``machineIP`` - - ``runtimeGlobalFlags`` - - Specifies the worker's external IP or hostname, when used from a remote network. - - No default - - A valid IP or hostname - - ``"machineIP": "10.0.1.4"`` - * - ``tempPath`` - - ``runtimeGlobalFlags`` - - Specifies an override for the temporary file path on the local machine. Set this to a local path to improve performance for spooling. - - Defaults to the central storage's built-in temporary folder - - A valid path to a folder on the local machine - - ``"tempPath": "/mnt/nvme0/temp"`` - - - -.. list-table:: Runtime flags - :widths: auto - :header-rows: 1 - - * - Name - - Section - - Description - - Default - - Value range - - Example - * - ``insertParsers`` - - ``runtimeFlags`` - - Sets the number of CSV parsing threads launched during bulk load - - 4 - - 1 to 32 - - ``"insertParsers" : 8`` - * - ``insertCompressors`` - - ``runtimeFlags`` - - Sets the number of compressor threads launched during bulk load - - 4 - - 1 to 32 - - ``"insertCompressors" : 8`` - * - ``statementLockTimeout`` - - ``runtimeGlobalFlags`` - - Sets the delay in seconds before SQream DB will stop waiting for a lock and return an error - - 3 - - >=1 - - ``"statementLockTimeout" : 10`` - - -.. list the main configuration options and how they are used - -.. point to the best practices as well - -.. warning:: JSON files can't contain any comments. - -Recommended Configuration File ------ - -.. code-block:: json - - { - "compileFlags":{ - }, - "runtimeFlags":{ - "insertParsers": 16, - "insertCompressors": 8 - }, - "runtimeGlobalFlags":{ - "limitQueryMemoryGB" : 121, - "spoolMemoryGB" : 108, - "cudaMemQuota": 90, - "initialSubscribedServices" : "sqream", - "useMetadataServer": true, - "metadataServerIp": "127.0.0.1", - "useConfigIP": true, - "machineIP": "127.0.0.1" - }, - "server":{ - "gpu":0, - "port":5000, - "ssl_port": 5100, - "cluster":"/home/sqream/sqream_storage", - "licensePath":"/etc/sqream/license.enc" - } - } \ No newline at end of file diff --git a/configuration_guides/query_timeout_minutes.rst b/configuration_guides/query_timeout_minutes.rst new file mode 100644 index 000000000..114ee5720 --- /dev/null +++ b/configuration_guides/query_timeout_minutes.rst @@ -0,0 +1,13 @@ +:orphan: + +.. _query_timeout_minutes: + +************************************************** +Query Timeout Minutes +************************************************** + +The ``QueryTimeoutMinutes`` session flag is designed to identify queries that have exceeded a specified time limit. Once the flag value is reached, the query automatically stops. + +* **Data type** - integer +* **Default value** - ``0`` (no query timeout) +* **Allowed values** - 1—4320 (1 minute up to 72 hours) \ No newline at end of file diff --git a/configuration_guides/session_tag.rst b/configuration_guides/session_tag.rst index 12a98f01c..534e73f09 100644 --- a/configuration_guides/session_tag.rst +++ b/configuration_guides/session_tag.rst @@ -1,3 +1,5 @@ +:orphan: + .. _session_tag: ************************* diff --git a/configuration_guides/spool_memory_gb.rst b/configuration_guides/spool_memory_gb.rst index 9aa651a74..269dfbd5f 100644 --- a/configuration_guides/spool_memory_gb.rst +++ b/configuration_guides/spool_memory_gb.rst @@ -1,3 +1,6 @@ +:orphan: + + .. 
_spool_memory_gb: ************************* @@ -5,7 +8,6 @@ Setting the Spool Memory ************************* The ``spoolMemoryGB`` flag sets the amount of memory (GB) available to the server for spooling. -The following describes the ``spoolMemoryGB`` flag: * **Data type** - uint * **Default value** - ``8`` \ No newline at end of file diff --git a/configuration_guides/spooling.rst b/configuration_guides/spooling.rst deleted file mode 100644 index 45c88d3ee..000000000 --- a/configuration_guides/spooling.rst +++ /dev/null @@ -1,69 +0,0 @@ -.. _spooling: - -************************** -Configuring the Spooling Feature -************************** -The **Configuring the Spooling Feature** page includes the following topics: - -.. contents:: - :local: - :depth: 1 - - -Overview ---------- -From the SQream Acceleration Studio you can allocate the amount of memory (GB) available to the server for spooling using the ``spoolMemoryGB`` flag. SQream recommends setting the ``spoolMemoryGB`` flag to 90% of the ``limitQueryMemoryGB`` flag. The ``limitQueryMemoryGB`` flag is the total memory you’ve allocated for processing queries. - -In addition, the ``limitQueryMemoryGB`` defines how much total system memory is used by each worker. SQream recommends setting ``limitQueryMemoryGB`` to 5% less than the total host memory divided by the amount of ``sqreamd`` workers on host. - -Note that ``spoolMemoryGB`` must bet set to less than the ``limitQueryMemoryGB``. - -Example Configurations ---------- -The **Example Configurations** section shows the following example configurations: - -.. contents:: - :local: - :depth: 1 - -Example 1 - Recommended Settings -~~~~~~~~~~~ -The following is an example of the recommended settings for a machine with 512GB of RAM and 4 workers: - -.. code-block:: console - - limitQueryMemoryGB - ⌊(512 * 0.95 / 4)⌋ → ~ 486 / 4 → 121 - spoolMemoryGB - ⌊( 0.9 * limitQueryMemoryGB )⌋ → ⌊( 0.9 * 121 )⌋ → 108 - -Example 2 - Setting Spool Memory -~~~~~~~~~~~ -The following is an example of setting ``spoolMemoryGB`` value in the current configuration method per-worker for 512GB of RAM and 4 workers: - -.. code-block:: console - - { - “cluster”: “/home/test_user/sqream_testing_temp/sqreamdb”, - “gpu”: 0, - “licensePath”: “home/test_user/SQream/tests/license.enc”, - “machineIP”: “127.0.0.1”, - “metadataServerIp”: “127.0.0.1”, - “metadataServerPort”: “3105, - “port”: 5000, - “useConfigIP”” true, - “limitQueryMemoryGB" : 121, - “spoolMemoryGB" : 108 - “legacyConfigFilePath”: “home/SQream_develop/SqrmRT/utils/json/legacy_congif.json” - } - -The following is an example of setting ``spoolMemoryGB`` value in the previous configuration method per-worker for 512GB of RAM and 4 workers: - -.. code-block:: console - - “runtimeFlags”: { - “limitQueryMemoryGB” : 121, - “spoolMemoryGB” : 108 - -For more information about configuring the ``spoolMemoryGB`` flag, see the following: - -* `Current configuration method `_ -* `Previous configuration method `_ \ No newline at end of file diff --git a/configuration_guides/sso.rst b/configuration_guides/sso.rst new file mode 100644 index 000000000..bf1fe288c --- /dev/null +++ b/configuration_guides/sso.rst @@ -0,0 +1,50 @@ +.. _sso: + +************** +Single Sign-On +************** + +Here you can learn how to configure an SSO login for SQreamDB Acceleration Studio by integrating with an identity provider (IdP). SSO authentication allows users to authenticate once and then seamlessly access SQreamDB as one of multiple services. + +.. 
contents:: + :local: + :depth: 1 + +Before You Begin +================ + +It is essential you have the following: + +* SQreamDB Acceleration Studio v5.9.0 installed +* An NGINX (or similar) service installed on your Acceleration Studio machine to serve as a reverse proxy, accepting HTTPS traffic from external sources and communicating with Studio internally via HTTP +* :ref:`ldap` set as your authentication management service + +Setting Up SQreamDB Acceleration Studio +======================================= + +#. In your ``sqream_legacy.json`` file, add the ``ssoValidateUrl`` flag with your IdP URL. + + Example: + + .. code-block:: json + + "ssoValidateUrl": "https://auth.pingone.eu/9db5d1c6-6dd6-4e40-b939-e0e4209e0ac5/as/userinfo" + +#. Set Acceleration Studio to use SSO by adding the ``mfaRedirectUrl`` flag, with your redirect URL, to your ``sqream_admin_config.json`` file. + + Example: + + .. code-block:: json + + "mfaRedirectUrl": "https://auth.pingone.eu/9db5d1c6-6dd6-4e40-b939-e0e4209e0ac5/as/authorize?client_id=e5636823-fb99-4d38-bbd1-6a46175eddab&redirect_uri=https://ivans.sq.l/login&response_type=token&scope=openid profile p1:read:user", + + If Acceleration Studio is not yet installed, you can set both URLs during its installation process. + +#. Restart SQreamDB. + +#. Restart SQreamDB Acceleration Studio. + diff --git a/configuration_guides/statement_lock_timeout.rst b/configuration_guides/statement_lock_timeout.rst index 639f5d02d..5697cebdf 100644 --- a/configuration_guides/statement_lock_timeout.rst +++ b/configuration_guides/statement_lock_timeout.rst @@ -1,8 +1,12 @@ +:orphan: + + .. _statement_lock_timeout: -************************* +********************************************************************* Setting Timeout Limit for Locking Objects before Executing Statements -************************* +********************************************************************* + The ``statementLockTimeout`` flag sets the timeout (seconds) for acquiring object locks before executing statements. The following describes the ``statementLockTimeout`` flag: diff --git a/configuration_guides/use_config_ip.rst b/configuration_guides/use_config_ip.rst index 8779899b1..43edccd98 100644 --- a/configuration_guides/use_config_ip.rst +++ b/configuration_guides/use_config_ip.rst @@ -1,8 +1,12 @@ +:orphan: + + .. _use_config_ip: -************************* +************************** Assigning Local Network IP -************************* +************************** + The ``useConfigIP`` flag activates the machineIP (``true``). Setting this flag to ``false`` ignores the machineIP and automatically assigns a local network IP. This cannot be activated in a cloud scenario (on-premises only). The following describes the ``useConfigIP`` flag: diff --git a/configuration_guides/use_legacy_decimal_literals.rst b/configuration_guides/use_legacy_decimal_literals.rst index 63650a95b..a52b9fb87 100644 --- a/configuration_guides/use_legacy_decimal_literals.rst +++ b/configuration_guides/use_legacy_decimal_literals.rst @@ -1,8 +1,12 @@ +:orphan: + + .. _use_legacy_decimal_literals: -************************* +********************************************************** Interpreting Decimal Literals as Double Instead of Numeric -************************* +********************************************************** + The ``useLegacyDecimalLiterals`` flag interprets decimal literals as **Double** instead of **Numeric**. 
This flag is used to preserve legacy behavior in existing customers. The following describes the ``useLegacyDecimalLiterals`` flag: diff --git a/configuration_guides/use_legacy_string_literals.rst b/configuration_guides/use_legacy_string_literals.rst index 164801084..165c02667 100644 --- a/configuration_guides/use_legacy_string_literals.rst +++ b/configuration_guides/use_legacy_string_literals.rst @@ -1,8 +1,12 @@ +:orphan: + + .. _use_legacy_string_literals: -************************* +**************************** Using Legacy String Literals -************************* +**************************** + The ``useLegacyStringLiterals`` flag interprets ASCII-only strings as **VARCHAR** instead of **TEXT**. This flag is used to preserve legacy behavior in existing customers. The following describes the ``useLegacyStringLiterals`` flag: diff --git a/configuration_guides/varchar_identifiers.rst b/configuration_guides/varchar_identifiers.rst deleted file mode 100644 index 889e5c16e..000000000 --- a/configuration_guides/varchar_identifiers.rst +++ /dev/null @@ -1,12 +0,0 @@ -.. _varchar_identifiers: - -************************* -Interpreting VARCHAR as TEXT -************************* -The ``varcharIdentifiers`` flag activates using **varchar** as an identifier. - -The following describes the ``varcharIdentifiers`` flag: - -* **Data type** - boolean -* **Default value** - ``true`` -* **Allowed values** - ``true``, ``false`` \ No newline at end of file diff --git a/connecting_to_sqream/client_drivers/dataiku/index.rst b/connecting_to_sqream/client_drivers/dataiku/index.rst new file mode 100644 index 000000000..a04ef9d31 --- /dev/null +++ b/connecting_to_sqream/client_drivers/dataiku/index.rst @@ -0,0 +1,56 @@ +.. _dataiku: + +******* +Dataiku +******* + +This Plugin accelerates data transfer from Amazon S3 to SQreamDB within Dataiku DSS. It enables direct loading of data from S3 to SQreamDB, ensuring rapid transfers without external steps. + +The Plugin includes a code environment that automatically installs the SQreamDB Python Connector (pysqream) alongside it. + +The following file formats are supported: + +* Avro +* JSON +* CSV (requires manual data type mapping, as the default for all columns is ``TEXT``; see the sketch at the end of this page) + +Before You Begin +================= + +It is essential you have the following: + +* A SQreamDB :ref:`java_jdbc` connection set up in DSS + +* An Amazon S3 connection set up in DSS + +* Python 3.9 + +Establishing a Dataiku Connection +================================= + +In your Dataiku web interface: + +#. Upload the plugin from the following SQreamDB Git repository: + + .. code-block:: console + + -- Repository URL: + git@github.com:SQream/dataiku_plugin.git + + -- Path in repository: + s3_bulk_load + +#. Define a DSS S3 dataset. + +#. Add the plugin to your flow. + +#. Set the S3 Dataset as Input of the Plugin (mandatory). + +#. Assign a name for the output dataset stored in your SQreamDB connection. + +#. Provide AWS Access Key and Secret Key by either: + + a. Filling in the values in the Plugin form + + b. Setting the Project Variables, or the Global Variables when DSS Variables are used + 
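+Regarding the CSV caveat above: columns loaded without manual type mapping arrive as ``TEXT`` and may need casting downstream in SQreamDB. The following is an illustrative sketch only; the table and column names (``orders_raw``, ``orders_typed``, and their columns) are hypothetical and not part of the Plugin: + +.. code-block:: postgres + + CREATE TABLE orders_typed AS + SELECT + order_id :: BIGINT AS order_id, + order_ts :: DATETIME AS order_ts, + amount :: DOUBLE AS amount + FROM orders_raw; +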
diff --git a/connecting_to_sqream/client_drivers/dotnet/index.rst b/connecting_to_sqream/client_drivers/dotnet/index.rst new file mode 100644 index 000000000..54b5b050a --- /dev/null +++ b/connecting_to_sqream/client_drivers/dotnet/index.rst @@ -0,0 +1,119 @@ +.. _net: + +********** +SQreamNET +********** + +The SQreamNET ADO.NET Data Provider lets you connect to SQream through your .NET environment. + +.. contents:: + :local: + :depth: 1 + +Before You Begin +================= + +* The SQreamNET provider requires .NET version 6 or newer + +* Download the SQreamNET driver from the :ref:`client drivers page <client_drivers>` + +Integrating SQreamNET +====================== + +#. After downloading the .NET driver, save the archived file to a known location. +#. In your IDE, add a SQreamNET.dll reference to your project. +#. If you wish to upgrade SQreamNET within an existing project, replace your existing .dll file with an updated one or change the project's reference location to a new one. + +Connecting to SQream For the First Time +======================================== + +An initial connection to SQream must be established by creating a **SqreamConnection** object using a connection string. + +Connection String Syntax +------------------------- + +.. code-block:: console + + Data Source=<hostname or IP>,<port>;User=<username>;Password=<password>;Initial Catalog=<database name>;Integrated Security=true; + +Connection Parameters +----------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Item + - State + - Default + - Description + * - ``<Data Source>`` + - Mandatory + - None + - Hostname/IP/FQDN and port of the SQream DB worker. For example, ``127.0.0.1:5000``, ``sqream.mynetwork.co:3108`` + * - ``<Initial Catalog>`` + - Mandatory + - None + - Database name to connect to. For example, ``master`` + * - ``<User>`` + - Mandatory + - None + - Username of a role to use for connection. For example, ``username=rhendricks`` + * - ``<Password>`` + - Mandatory + - None + - Specifies the password of the selected role. For example, ``password=Tr0ub4dor&3`` + * - ``<Service>`` + - Optional + - ``sqream`` + - Specifies service queue to use. For example, ``service=etl`` + * - ``<Ssl>`` + - Optional + - ``false`` + - Specifies SSL for this connection. For example, ``ssl=true`` + * - ``<Cluster>`` + - Optional + - ``true`` + - Connect via load balancer (use only if one exists, and check the port). + +Connection String Examples +--------------------------- + +The following is an example of a SQream cluster with load balancer and no service queues (with SSL): + +.. code-block:: console + + Data Source=sqream.mynetwork.co,3108;User=rhendricks;Password=Tr0ub4dor&3;Initial Catalog=master;Integrated Security=true;ssl=true;cluster=true; + + +The following is a minimal example for a local standalone SQream database: + +.. code-block:: console + + Data Source=127.0.0.1,5000;User=rhendricks;Password=Tr0ub4dor&3;Initial Catalog=master; + +The following is an example of a SQream cluster with load balancer and a specific service queue named ``etl``, to the database named ``raviga``: + +.. code-block:: console + + Data Source=sqream.mynetwork.co,3108;User=rhendricks;Password=Tr0ub4dor&3;Initial Catalog=raviga;Integrated Security=true;service=etl;cluster=true; + +Sample C# Program +----------------- + +You can download the :download:`.NET Application Sample File <sample.cs>` below by right-clicking and saving it to your computer. + +.. literalinclude:: sample.cs + :language: C# + :caption: .NET Application Sample + :linenos: + +Limitations +=============== + +* Unicode characters are not supported when using ``INSERT INTO AS SELECT`` (see the sketch after this list) + +* To avoid possible casting issues, use ``getDouble`` when using ``FLOAT`` + +* The ``ARRAY`` data type is not supported. If your database schema includes ``ARRAY`` columns, you may encounter compatibility issues when using SQreamNET to connect to the database. 
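+As a purely illustrative sketch of the first limitation above (``target_table`` and ``source_table`` are hypothetical names), the affected statement shape is an insert that selects from another table; text values containing Unicode characters may not be transferred correctly along this path: + +.. code-block:: postgres + + INSERT INTO target_table SELECT * FROM source_table; +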
\ No newline at end of file diff --git a/connecting_to_sqream/client_drivers/dotnet/sample.cs b/connecting_to_sqream/client_drivers/dotnet/sample.cs new file mode 100644 index 000000000..54a19e0da --- /dev/null +++ b/connecting_to_sqream/client_drivers/dotnet/sample.cs @@ -0,0 +1,93 @@ + public void Test() + { + var connection = OpenConnection("192.168.4.62", 5000, "sqream", "sqream", "master"); + + ExecuteSQLCommand(connection, "create or replace table tbl_example as select 1 as x , 'a' as y;"); + + var tableData = ReadExampleData(connection, "select * from tbl_example;"); + } + + /// <summary> + /// Builds a connection string to sqream server and opens a connection + /// </summary> + /// <param name="ipAddress">host to connect</param> + /// <param name="port">port sqreamd is running on</param> + /// <param name="username">role username</param> + /// <param name="password">role password</param> + /// <param name="databaseName">database name</param> + /// <param name="isCluster">optional - set to true when the ip,port endpoint is a server picker process</param> + /// <returns>SQream connection object</returns> + /// <exception>Throws SqreamException if fails to open a connection</exception> + public SqreamConnection OpenConnection(string ipAddress, int port, string username, string password, string databaseName, bool isCluster = false) + { + // create the connection string according to the format + var connectionString = string.Format( + "Data Source={0},{1};User={2};Password={3};Initial Catalog={4};Cluster={5}", + ipAddress, + port, + username, + password, + databaseName, + isCluster + ); + + // create a sqream connection object + var connection = new SqreamConnection(connectionString); + + // open a connection + connection.Open(); + + // returns the connection object + return connection; + } + + /// <summary> + /// Executes a SQL command to sqream server + /// </summary> + /// <param name="connection">connection to sqream server</param> + /// <param name="sql">sql command</param> + /// <exception cref="InvalidOperationException">thrown when the connection is not open</exception> + public void ExecuteSQLCommand(SqreamConnection connection, string sql) + { + // validates the connection is open and throws exception if not + if (connection.State != System.Data.ConnectionState.Open) + throw new InvalidOperationException(string.Format("connection to sqream is not open. connection.State: {0}", connection.State)); + + // creates a new command object utilizing the sql and the connection + var command = new SqreamCommand(sql, connection); + + // executes the command + command.ExecuteNonQuery(); + } + + /// <summary> + /// Executes a SQL command to sqream server, and reads the result set using DataReader + /// </summary> + /// <param name="connection">connection to sqream server</param> + /// <param name="sql">sql command</param> + /// <exception cref="InvalidOperationException">thrown when the connection is not open</exception> + public List<Tuple<int, string>> ReadExampleData(SqreamConnection connection, string sql) + { + // validates the connection is open and throws exception if not + if (connection.State != System.Data.ConnectionState.Open) + throw new InvalidOperationException(string.Format("connection to sqream is not open. 
connection.State: {0}", connection.State)); + + // creates a new command object utilizing the sql and the connection + var command = new SqreamCommand(sql, connection); + + // creates a reader object to iterate over the result set + var reader = (SqreamDataReader)command.ExecuteReader(); + + // list of results + var result = new List<Tuple<int, string>>(); + + // iterate the reader and read the table int,string values into a result tuple object + while (reader.Read()) + result.Add(new Tuple<int, string>(reader.GetInt32(0), reader.GetString(1))); + + // return the result set + return result; + } + diff --git a/connecting_to_sqream/client_drivers/index.rst b/connecting_to_sqream/client_drivers/index.rst new file mode 100644 index 000000000..cc9db574b --- /dev/null +++ b/connecting_to_sqream/client_drivers/index.rst @@ -0,0 +1,105 @@ +.. _client_drivers: + +************** +Client Drivers +************** + +The guides on this page describe how to use the SQreamDB client drivers and client applications. + +Client Driver Downloads +============================= + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Driver + - Download + - Docs + - Notes + - Operating System + * - **SQream DB Java CLI** + - `SQream DB Java Command Line Interface `_ + - :ref:`sqream_sql_cli_reference` + - Replaces the Deprecated Haskell Command Line Tool + - All + * - **Apache Spark** + - `Apache Spark Connector `_ + - :ref:`spark` + - + - All + * - **Dataiku** + - Plugin Git repository: ``git@github.com:SQream/dataiku_plugin.git`` + - :ref:`dataiku` + - + - All + * - **JDBC** + - `sqream-jdbc `_ + - :ref:`java_jdbc` + - Recommended installation via ``mvn`` + - All + * - **Node.JS** + - `sqream-v4.2.4 `_ + - :ref:`nodejs` + - Recommended installation via ``npm`` + - All + * - **ODBC** + - `Windows ODBC Installer `_ , `Linux ODBC `_ + - :ref:`Windows`, :ref:`Linux` + - + - Windows, Linux + * - **Power BI** + - `Power BI Power Query Connector `_ + - :ref:`power_bi` + - + - All + * - **Python** + - `pysqream `_ + - :ref:`pysqream` + - Recommended installation via ``pip`` + - All + * - **Python-SQLAlchemy** + - `pysqream-sqlalchemy `_ + - :ref:`pysqream` + - Recommended installation via ``pip`` + - All + * - **SQreamNet** + - `.NET .dll file `_ + - :ref:`net` + - + - All + * - **Tableau** + - `Tableau Connector `_ + - :ref:`tableau` + - + - All + * - **Trino** + - `Trino Connector `_ + - :ref:`trino` + - + - All + + + + + + +.. toctree:: + :maxdepth: 4 + :titlesonly: + :hidden: + + dotnet/index + dataiku/index + jdbc/index + nodejs/index + odbc/index + python/index + spark/index + trino/index + + + +.. rubric:: Need help? + +If you couldn't find what you're looking for, contact `SQream Support `_ diff --git a/connecting_to_sqream/client_drivers/jdbc/index.rst b/connecting_to_sqream/client_drivers/jdbc/index.rst new file mode 100644 index 000000000..c1198bd1d --- /dev/null +++ b/connecting_to_sqream/client_drivers/jdbc/index.rst @@ -0,0 +1,192 @@ +.. _java_jdbc: + +**** +JDBC +**** + +The SQream JDBC driver lets you connect to SQream using many Java applications and tools. This page describes how to write a Java application using the JDBC interface. The JDBC driver requires Java 17 or newer. + +.. contents:: + :local: + :depth: 1 + +Installing the JDBC Driver +========================== + +The **Installing the JDBC Driver** section describes the following: + +.. 
contents:: + :local: + :depth: 1 + +Prerequisites +------------- + +The SQream JDBC driver requires Java 17 or newer, and SQream recommends using Oracle Java or OpenJDK. + + +Getting the JAR file +-------------------- + +The SQream JDBC driver is available for download from the :ref:`client drivers download page <client_drivers>`. This JAR file can be integrated into your Java-based applications or projects. + + +Setting Up the Class Path +------------------------- + +To use the driver, you must include the JAR named ``sqream-jdbc-<version>.jar`` in the class path, either by inserting it in the ``CLASSPATH`` environment variable, or by using flags on the relevant Java command line. + +For example, if the JDBC driver has been unzipped to ``/home/sqream/sqream-jdbc-5.2.0.jar``, the following command is used to run the application: + +.. code-block:: console + + $ export CLASSPATH=/home/sqream/sqream-jdbc-5.2.0.jar:$CLASSPATH + $ java my_java_app + +Alternatively, you can pass ``-classpath`` to the Java executable file: + +.. code-block:: console + + $ java -classpath .:/home/sqream/sqream-jdbc-5.2.0.jar my_java_app + +Connecting to SQream Using a JDBC Application +============================================= + +You can connect to SQream using one of the following JDBC applications: + +.. contents:: + :local: + :depth: 1 + +Driver Class +------------ + +Use ``com.sqream.jdbc.SQDriver`` as the driver class in the JDBC application. + +Connection String +----------------- + +JDBC drivers rely on a connection string. + +The following is the syntax for SQream: + +.. code-block:: text + + jdbc:Sqream://<host>:<port>/<database name>;user=<username>;password=<password>;[<parameter>=<value>; ...] + +Connection Parameters +^^^^^^^^^^^^^^^^^^^^^ + +The following table shows the connection string parameters: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Item + - State + - Default + - Description + * - ``<host>:<port>`` + - Mandatory + - None + - Hostname and port of the SQream DB worker. For example, ``127.0.0.1:5000``, ``sqream.mynetwork.co:3108`` + * - ``<database name>`` + - Mandatory + - None + - Database name to connect to. For example, ``master`` + * - ``username=<username>`` + - Optional + - None + - Username of a role to use for connection. For example, ``username=SqreamRole`` + * - ``password=<password>`` + - Optional + - None + - Specifies the password of the selected role. For example, ``password=SqreamRolePassword2023`` + * - ``service=<service>`` + - Optional + - ``sqream`` + - Specifies service queue to use. For example, ``service=etl`` + * - ``ssl=<true/false>`` + - Optional + - ``false`` + - Specifies SSL for this connection. For example, ``ssl=true`` + * - ``cluster=<true/false>`` + - Optional + - ``true`` + - Connect via load balancer (use only if one exists, and check the port). + * - ``fetchSize=<size>`` + - Optional + - ``true`` + - Enables on-demand loading, and defines double buffer size for the result. The ``fetchSize`` parameter is rounded according to chunk size. For example, ``fetchSize=1`` loads one row and is rounded to one chunk. If the ``fetchSize`` is 100,600, a chunk size of 100,000 loads, and is rounded to, two chunks. + * - ``insertBuffer=<size>`` + - Optional + - ``true`` + - Defines the bytes size for inserting a buffer before flushing data to the server. Clients running a parameterized insert (network insert) can define the amount of data to collect before flushing the buffer. + * - ``loggerLevel=<level>`` + - Optional + - ``true`` + - Defines the logger level as either ``debug`` or ``trace``. + * - ``logFile=<file name>`` + - Optional + - ``true`` + - Enables the file appender and defines the file name. The file name can be set as either the file name or the file path. + * - ```` + - Optional + - 0 + - Sets the duration, in seconds, for which a database connection can remain idle before it is terminated. If the parameter is set to its default value, idle connections will not be terminated. The idle connection timer begins counting after the completion of query execution. + +Connection String Examples +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following is an example of a SQream cluster with a load balancer and no service queues (with SSL): + +.. code-block:: text + + jdbc:Sqream://sqream.mynetwork.co:3108/master;user=rhendricks;password=Tr0ub4dor&3;ssl=true;cluster=true + +The following is a minimal example of a local standalone SQream database: + +.. code-block:: text + + jdbc:Sqream://127.0.0.1:5000/master;user=rhendricks;password=Tr0ub4dor&3 + +The following is an example of a SQream cluster with a load balancer and a specific service queue named ``etl``, to the database named ``raviga``: + +.. code-block:: text + + jdbc:Sqream://sqream.mynetwork.co:3108/raviga;user=rhendricks;password=Tr0ub4dor&3;cluster=true;service=etl + +Java Program Sample +-------------------- + +You can download the :download:`JDBC Application Sample File <sample.java>` below by right-clicking and saving it to your computer. + +.. literalinclude:: sample.java + :language: java + :caption: JDBC Application Sample + :linenos: + +Prepared Statements +==================== + +Prepared statements, also known as parameterized queries, are a safer and more efficient way to execute SQL statements. They prevent SQL injection attacks by separating SQL code from data, and they can improve performance by reusing prepared statements. +In SQream, we use ``?`` as a placeholder for the relevant value in parameterized queries. +Prepared statements support the ``INSERT``, ``SELECT``, ``UPDATE``, and ``DELETE`` statement types, as sketched below. + 
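+The following is an illustrative sketch of the placeholder style (``employees`` and its columns are hypothetical names used only for this example): + +.. code-block:: postgres + + INSERT INTO employees VALUES (?, ?); + SELECT * FROM employees WHERE id = ?; + UPDATE employees SET name = ? WHERE id = ?; + DELETE FROM employees WHERE id = ?; +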
+Prepared Statement Sample +--------------------------- + +The following is a Java code snippet employing a JDBC prepared statement object to ingest a batch of one million records into SQreamDB. + +You may download the :download:`Prepared statement <samplepreparedstatement.java>` by right-clicking and saving it to your computer. + +.. literalinclude:: samplepreparedstatement.java + :language: java + :caption: Prepared Statement Sample + :linenos: + +Prepared Statement Limitations +------------------------------ +* Prepared statements do not support the use of :ref:`keywords_and_identifiers` as input parameters. +* ``SELECT``, ``UPDATE`` and ``DELETE`` statements require the use of ``add_batch`` prior to each execution. 
\ No newline at end of file diff --git a/third_party_tools/client_drivers/jdbc/sample.java b/connecting_to_sqream/client_drivers/jdbc/sample.java similarity index 97% rename from third_party_tools/client_drivers/jdbc/sample.java rename to connecting_to_sqream/client_drivers/jdbc/sample.java index 1b7af5804..9f43188b3 100644 --- a/third_party_tools/client_drivers/jdbc/sample.java +++ b/connecting_to_sqream/client_drivers/jdbc/sample.java @@ -1,67 +1,68 @@ -import java.sql.Connection; -import java.sql.DatabaseMetaData; -import java.sql.DriverManager; -import java.sql.Statement; -import java.sql.ResultSet; - -import java.io.IOException; -import java.security.KeyManagementException; -import java.security.NoSuchAlgorithmException; -import java.sql.SQLException; - - - -public class SampleTest { - - // Replace with your connection string - static final String url = "jdbc:Sqream://sqream.mynetwork.co:3108/master;user=rhendricks;password=Tr0ub4dor&3;ssl=true;cluster=true"; - - // Allocate objects for result set and metadata - Connection conn = null; - Statement stmt = null; - ResultSet rs = null; - DatabaseMetaData dbmeta = null; - - int res = 0; - - public void testJDBC() throws SQLException, IOException { - - // Create a connection - conn = DriverManager.getConnection(url,"rhendricks","Tr0ub4dor&3"); - - // Create a table with a single integer column - String sql = "CREATE TABLE test (x INT)"; - stmt = conn.createStatement(); // Prepare the statement - stmt.execute(sql); // Execute the statement - stmt.close(); // Close the statement handle - - // Insert some values into the newly created table - sql = "INSERT INTO test VALUES (5),(6)"; - stmt = conn.createStatement(); - stmt.execute(sql); - stmt.close(); - - // Get values from the table - sql = "SELECT * FROM test"; - stmt = conn.createStatement(); - rs = stmt.executeQuery(sql); - // Fetch all results one-by-one - while(rs.next()) { - res = rs.getInt(1); - System.out.println(res); // Print results to screen - } - rs.close(); // Close the result set - stmt.close(); // Close the statement handle - } - - - public static void main(String[] args) throws SQLException, KeyManagementException, NoSuchAlgorithmException, IOException, ClassNotFoundException{ - - // Load SQream DB JDBC driver - Class.forName("com.sqream.jdbc.SQDriver"); - - // Create test object and run - SampleTest test = new SampleTest(); - test.testJDBC(); - } +import java.sql.Connection; +import java.sql.DatabaseMetaData; +import java.sql.DriverManager; +import java.sql.Statement; +import java.sql.ResultSet; + +import java.io.IOException; +import java.security.KeyManagementException; +import java.security.NoSuchAlgorithmException; +import java.sql.SQLException; + + + +public class SampleTest { + + // Replace with your connection string + static final String url = "jdbc:Sqream://sqream.mynetwork.co:3108/master;user=rhendricks;password=Tr0ub4dor&3;ssl=true;cluster=true"; + + // Allocate objects for result set and metadata + Connection conn = null; + Statement stmt = null; + ResultSet rs = null; + DatabaseMetaData dbmeta = null; + + int res = 0; + + public void testJDBC() throws SQLException, IOException { + + // Create a connection + conn = DriverManager.getConnection(url,"rhendricks","Tr0ub4dor&3"); + + // Create a table with a single integer column + String sql = "CREATE TABLE test (x INT)"; + stmt = conn.createStatement(); // Prepare the statement + stmt.execute(sql); // Execute the statement + stmt.close(); // Close the statement handle + + // Insert some values into the newly created 
table + sql = "INSERT INTO test VALUES (5),(6)"; + stmt = conn.createStatement(); + stmt.execute(sql); + stmt.close(); + + // Get values from the table + sql = "SELECT * FROM test"; + stmt = conn.createStatement(); + rs = stmt.executeQuery(sql); + // Fetch all results one-by-one + while(rs.next()) { + res = rs.getInt(1); + System.out.println(res); // Print results to screen + } + rs.close(); // Close the result set + stmt.close(); // Close the statement handle + conn.close(); + } + + + public static void main(String[] args) throws SQLException, KeyManagementException, NoSuchAlgorithmException, IOException, ClassNotFoundException{ + + // Load SQream DB JDBC driver + Class.forName("com.sqream.jdbc.SQDriver"); + + // Create test object and run + SampleTest test = new SampleTest(); + test.testJDBC(); + } } \ No newline at end of file diff --git a/connecting_to_sqream/client_drivers/jdbc/samplepreparedstatement.java b/connecting_to_sqream/client_drivers/jdbc/samplepreparedstatement.java new file mode 100644 index 000000000..0b96ab9c2 --- /dev/null +++ b/connecting_to_sqream/client_drivers/jdbc/samplepreparedstatement.java @@ -0,0 +1,48 @@ +try (BufferedReader br = new BufferedReader(new FileReader(csv_file))) { + + int i = 0; + String csv_line; + int batchSize = 1000000; + String url = "jdbc:Sqream://" + host + ":" + port + + "/riru;cluster=false;user=sqream;password=sqream"; + LocalDateTime start_time = LocalDateTime.now(); + + Class.forName("com.sqream.jdbc.SQDriver"); + + Connection con = + DriverManager.getConnection(url, "sqream", "sqream"); + + String inStmt = + "INSERT INTO message values(?,?,?,?,?,?,?,?,?,?,?,?)"; + + PreparedStatement ps = con.prepareStatement(inStmt); + + while((csv_line = br.readLine()) != null){ + String[] message = csv_line.split(","); + + ps.setString(1, message[0].trim()); + ps.setInt(2, Integer.parseInt(message[1].trim())); + ps.setInt(3, Integer.parseInt(message[2].trim())); + ps.setInt(4, Integer.parseInt(message[3].trim())); + ps.setInt(5, Integer.parseInt(message[4].trim())); + ps.setString(6, message[5].trim()); + ps.setString(7, message[6].trim()); + ps.setString(8, message[7].trim()); + ps.setString(9, message[8].trim()); + ps.setString(10, message[9].trim()); + ps.setInt(11, Integer.parseInt(message[10].trim())); + ps.setInt(12, Integer.parseInt(message[11].trim())); + + ps.addBatch(); + + if(++i == batchSize) { + i = 0; + ps.executeBatch(); + ps.close(); + System.out.println(LocalDateTime.now().toString() + + ": Inserted " + batchSize + " records."); + ps = con.prepareStatement(inStmt); + } + } + + // Flush any remaining rows that did not fill a complete batch + if (i > 0) { + ps.executeBatch(); + } + + // Release the statement and connection + ps.close(); + con.close(); +} \ No newline at end of file diff --git a/third_party_tools/client_drivers/nodejs/index.rst b/connecting_to_sqream/client_drivers/nodejs/index.rst similarity index 90% rename from third_party_tools/client_drivers/nodejs/index.rst rename to connecting_to_sqream/client_drivers/nodejs/index.rst index cb7db193b..ad90bc4f7 100644 --- a/third_party_tools/client_drivers/nodejs/index.rst +++ b/connecting_to_sqream/client_drivers/nodejs/index.rst @@ -1,382 +1,382 @@ -.. _nodejs: - -************************* -Node.JS -************************* - -The SQream DB Node.JS driver allows Javascript applications and tools connect to SQream DB. -This tutorial shows you how to write a Node application using the Node.JS interface. - -The driver requires Node 10 or newer. - -.. contents:: In this topic: - :local: - -Installing the Node.JS driver -================================== - -Prerequisites ----------------- - -* Node.JS 10 or newer. Follow instructions at `nodejs.org `_ . 
- -Install with NPM -------------------- - -Installing with npm is the easiest and most reliable method. -If you need to install the driver in an offline system, see the offline method below. - -.. code-block:: console - - $ npm install @sqream/sqreamdb - -Install from an offline package -------------------------------------- - -The Node driver is provided as a tarball for download from the `SQream Drivers page `_ . - -After downloading the tarball, use ``npm`` to install the offline package. - -.. code-block:: console - - $ sudo npm install sqreamdb-4.0.0.tgz - - -Connect to SQream DB with a Node.JS application -==================================================== - -Create a simple test ------------------------------------------- - -Replace the connection parameters with real parameters for a SQream DB installation. - -.. code-block:: javascript - :caption: sqreamdb-test.js - - const Connection = require('@sqream/sqreamdb'); - - const config = { - host: 'localhost', - port: 3109, - username: 'rhendricks', - password: 'super_secret_password', - connectDatabase: 'raviga', - cluster: true, - is_ssl: true, - service: 'sqream' - }; - - const query1 = 'SELECT 1 AS test, 2*6 AS "dozen"'; - - const sqream = new Connection(config); - sqream.execute(query1).then((data) => { - console.log(data); - }, (err) => { - console.error(err); - }); - - -Run the test ----------------- - -A successful run should look like this: - -.. code-block:: console - - $ node sqreamdb-test.js - [ { test: 1, dozen: 12 } ] - - -API reference -==================== - -Connection parameters ---------------------------- - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Item - - Optional - - Default - - Description - * - ``host`` - - ✗ - - None - - Hostname for SQream DB worker. For example, ``127.0.0.1``, ``sqream.mynetwork.co`` - * - ``port`` - - ✗ - - None - - Port for SQream DB end-point. For example, ``3108`` for the load balancer, ``5000`` for a worker. - * - ``username`` - - ✗ - - None - - Username of a role to use for connection. For example, ``rhendricks`` - * - ``password`` - - ✗ - - None - - Specifies the password of the selected role. For example, ``Tr0ub4dor&3`` - * - ``connectDatabase`` - - ✗ - - None - - Database name to connect to. For example, ``master`` - * - ``service`` - - ✓ - - ``sqream`` - - Specifices service queue to use. For example, ``etl`` - * - ``is_ssl`` - - ✓ - - ``false`` - - Specifies SSL for this connection. For example, ``true`` - * - ``cluster`` - - ✓ - - ``false`` - - Connect via load balancer (use only if exists, and check port). For example, ``true`` - -Events -------------- - -The connector handles event returns with an event emitter - -getConnectionId - The ``getConnectionId`` event returns the executing connection ID. - -getStatementId - The ``getStatementId`` event returns the executing statement ID. - -getTypes - The ``getTypes`` event returns the results columns types. - -Example -^^^^^^^^^^^^^^^^^ - -.. code-block:: javascript - - const myConnection = new Connection(config); - - myConnection.runQuery(query1, function (err, data){ - myConnection.events.on('getConnectionId', function(data){ - console.log('getConnectionId', data); - }); - - myConnection.events.on('getStatementId', function(data){ - console.log('getStatementId', data); - }); - - myConnection.events.on('getTypes', function(data){ - console.log('getTypes', data); - }); - }); - -Input placeholders -------------------------- - -The Node.JS driver can replace parameters in a statement. 
- -Input placeholders allow values like user input to be passed as parameters into queries, with proper escaping. - -The valid placeholder formats are provided in the table below. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Placeholder - - Type - * - ``%i`` - - Identifier (e.g. table name, column name) - * - ``%s`` - - A text string - * - ``%d`` - - A number value - * - ``%b`` - - A boolean value - -See the :ref:`input placeholders example` below. - -Examples -=============== - -Setting configuration flags ------------------------------------ - -SQream DB configuration flags can be set per statement, as a parameter to ``runQuery``. - -For example: - -.. code-block:: javascript - - const setFlag = 'SET showfullexceptioninfo = true;'; - - const query_string = 'SELECT 1'; - - const myConnection = new Connection(config); - myConnection.runQuery(query_string, function (err, data){ - console.log(err, data); - }, setFlag); - - -Lazyloading ------------------------------------ - -To process rows without keeping them in memory, you can lazyload the rows with an async: - -.. code-block:: javascript - - - const Connection = require('@sqream/sqreamdb'); - - const config = { - host: 'localhost', - port: 3109, - username: 'rhendricks', - password: 'super_secret_password', - connectDatabase: 'raviga', - cluster: true, - is_ssl: true, - service: 'sqream' - }; - - const sqream = new Connection(config); - - const query = "SELECT * FROM public.a_very_large_table"; - - (async () => { - const cursor = await sqream.executeCursor(query); - let count = 0; - for await (let rows of cursor.fetchIterator(100)) { - // fetch rows in chunks of 100 - count += rows.length; - } - await cursor.close(); - return count; - })().then((total) => { - console.log('Total rows', total); - }, (err) => { - console.error(err); - }); - - -Reusing a connection ------------------------------------ - -It is possible to execeute multiple queries with the same connection (although only one query can be executed at a time). - -.. code-block:: javascript - - const Connection = require('@sqream/sqreamdb'); - - const config = { - host: 'localhost', - port: 3109, - username: 'rhendricks', - password: 'super_secret_password', - connectDatabase: 'raviga', - cluster: true, - is_ssl: true, - service: 'sqream' - }; - - const sqream = new Connection(config); - - (async () => { - - const conn = await sqream.connect(); - try { - const res1 = await conn.execute("SELECT 1"); - const res2 = await conn.execute("SELECT 2"); - const res3 = await conn.execute("SELECT 3"); - conn.disconnect(); - return {res1, res2, res3}; - } catch (err) { - conn.disconnect(); - throw err; - } - - })().then((res) => { - console.log('Results', res) - }, (err) => { - console.error(err); - }); - - -.. _input_placeholders_example: - -Using placeholders in queries ------------------------------------ - -Input placeholders allow values like user input to be passed as parameters into queries, with proper escaping. - -.. 
code-block:: javascript - - const Connection = require('@sqream/sqreamdb'); - - const config = { - host: 'localhost', - port: 3109, - username: 'rhendricks', - password: 'super_secret_password', - connectDatabase: 'raviga', - cluster: true, - is_ssl: true, - service: 'sqream' - }; - - const sqream = new Connection(config); - - const sql = "SELECT %i FROM public.%i WHERE name = %s AND num > %d AND active = %b"; - - sqream.execute(sql, "col1", "table2", "john's", 50, true); - - -The query that will run is ``SELECT col1 FROM public.table2 WHERE name = 'john''s' AND num > 50 AND active = true`` - - -Troubleshooting and recommended configuration -================================================ - - -Preventing ``heap out of memory`` errors -------------------------------------------- - -Some workloads may cause Node.JS to fail with the error: - -.. code-block:: none - - FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory - -To prevent this error, modify the heap size configuration by setting the ``--max-old-space-size`` run flag. - -For example, set the space size to 2GB: - -.. code-block:: console - - $ node --max-old-space-size=2048 my-application.js - -BIGINT support ------------------------- - -The Node.JS connector supports fetching ``BIGINT`` values from SQream DB. However, some applications may encounter an error when trying to serialize those values. - -The error that appears is: -.. code-block:: none - - TypeError: Do not know how to serialize a BigInt - -This is because JSON specification do not support BIGINT values, even when supported by Javascript engines. - -To resolve this issue, objects with BIGINT values should be converted to string before serializing, and converted back after deserializing. - -For example: - -.. code-block:: javascript - - const rows = [{test: 1n}] - const json = JSON.stringify(rows, , (key, value) => - typeof value === 'bigint' - ? value.toString() - : value // return everything else unchanged - )); - console.log(json); // [{"test": "1"}] - +.. _nodejs: + +******* +Node.JS +******* + +The SQream DB Node.JS driver allows Javascript applications and tools to connect to SQream DB. +This tutorial shows you how to write a Node application using the Node.JS interface. + +The driver requires Node 10 or newer. + +.. contents:: In this topic: + :local: + +Installing the Node.JS driver +============================= + +Prerequisites +------------- + +* Node.JS 10 or newer. Follow instructions at `nodejs.org `_ . + +Install with NPM +---------------- + +Installing with npm is the easiest and most reliable method. +If you need to install the driver in an offline system, see the offline method below. + +.. code-block:: console + + $ npm install @sqream/sqreamdb + +Install from an offline package +------------------------------- + +The Node driver is provided as a tarball for download from the `SQream Drivers page `_ . + +After downloading the tarball, use ``npm`` to install the offline package. + +.. code-block:: console + + $ sudo npm install sqreamdb-4.0.0.tgz + + +Connect to SQream DB with a Node.JS application +=============================================== + +Create a simple test +-------------------- + +Replace the connection parameters with real parameters for a SQream DB installation. + +.. 
code-block:: javascript + :caption: sqreamdb-test.js + + const Connection = require('@sqream/sqreamdb'); + + const config = { + host: 'localhost', + port: 3109, + username: 'rhendricks', + password: 'super_secret_password', + connectDatabase: 'raviga', + cluster: true, + is_ssl: true, + service: 'sqream' + }; + + const query1 = 'SELECT 1 AS test, 2*6 AS "dozen"'; + + const sqream = new Connection(config); + sqream.execute(query1).then((data) => { + console.log(data); + }, (err) => { + console.error(err); + }); + + +Run the test +------------ + +A successful run should look like this: + +.. code-block:: console + + $ node sqreamdb-test.js + [ { test: 1, dozen: 12 } ] + + +API reference +============= + +Connection parameters +--------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Item + - Optional + - Default + - Description + * - ``host`` + - ✗ + - None + - Hostname for SQream DB worker. For example, ``127.0.0.1``, ``sqream.mynetwork.co`` + * - ``port`` + - ✗ + - None + - Port for SQream DB end-point. For example, ``3108`` for the load balancer, ``5000`` for a worker. + * - ``username`` + - ✗ + - None + - Username of a role to use for connection. For example, ``rhendricks`` + * - ``password`` + - ✗ + - None + - Specifies the password of the selected role. For example, ``Tr0ub4dor&3`` + * - ``connectDatabase`` + - ✗ + - None + - Database name to connect to. For example, ``master`` + * - ``service`` + - ✓ + - ``sqream`` + - Specifies service queue to use. For example, ``etl`` + * - ``is_ssl`` + - ✓ + - ``false`` + - Specifies SSL for this connection. For example, ``true`` + * - ``cluster`` + - ✓ + - ``false`` + - Connect via load balancer (use only if one exists, and check the port). For example, ``true`` + +Events +------ + +The connector handles event returns with an event emitter. + +getConnectionId + The ``getConnectionId`` event returns the executing connection ID. + +getStatementId + The ``getStatementId`` event returns the executing statement ID. + +getTypes + The ``getTypes`` event returns the results columns types. + +Example +^^^^^^^ + +.. code-block:: javascript + + const myConnection = new Connection(config); + + myConnection.runQuery(query1, function (err, data){ + myConnection.events.on('getConnectionId', function(data){ + console.log('getConnectionId', data); + }); + + myConnection.events.on('getStatementId', function(data){ + console.log('getStatementId', data); + }); + + myConnection.events.on('getTypes', function(data){ + console.log('getTypes', data); + }); + }); + +Input placeholders +------------------ + +The Node.JS driver can replace parameters in a statement. + +Input placeholders allow values like user input to be passed as parameters into queries, with proper escaping. + +The valid placeholder formats are provided in the table below. + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Placeholder + - Type + * - ``%i`` + - Identifier (e.g. table name, column name) + * - ``%s`` + - A text string + * - ``%d`` + - A number value + * - ``%b`` + - A boolean value + +See the :ref:`input placeholders example <input_placeholders_example>` below. + +Examples +======== + +Setting configuration flags +--------------------------- + +SQream DB configuration flags can be set per statement, as a parameter to ``runQuery``. + +For example: + +.. 
+
+   const setFlag = 'SET showfullexceptioninfo = true;';
+
+   const query_string = 'SELECT 1';
+
+   const myConnection = new Connection(config);
+   myConnection.runQuery(query_string, function (err, data){
+      console.log(err, data);
+   }, setFlag);
+
+
+Lazy loading
+------------
+
+To process rows without keeping them all in memory, you can lazy-load them with an async iterator:
+
+.. code-block:: javascript
+
+   const Connection = require('@sqream/sqreamdb');
+
+   const config = {
+      host: 'localhost',
+      port: 3109,
+      username: 'rhendricks',
+      password: 'super_secret_password',
+      connectDatabase: 'raviga',
+      cluster: true,
+      is_ssl: true,
+      service: 'sqream'
+   };
+
+   const sqream = new Connection(config);
+
+   const query = "SELECT * FROM public.a_very_large_table";
+
+   (async () => {
+      const cursor = await sqream.executeCursor(query);
+      let count = 0;
+      for await (let rows of cursor.fetchIterator(100)) {
+         // fetch rows in chunks of 100
+         count += rows.length;
+      }
+      await cursor.close();
+      return count;
+   })().then((total) => {
+      console.log('Total rows', total);
+   }, (err) => {
+      console.error(err);
+   });
+
+
+Reusing a connection
+--------------------
+
+It is possible to execute multiple queries with the same connection (although only one query can be executed at a time).
+
+.. code-block:: javascript
+
+   const Connection = require('@sqream/sqreamdb');
+
+   const config = {
+      host: 'localhost',
+      port: 3109,
+      username: 'rhendricks',
+      password: 'super_secret_password',
+      connectDatabase: 'raviga',
+      cluster: true,
+      is_ssl: true,
+      service: 'sqream'
+   };
+
+   const sqream = new Connection(config);
+
+   (async () => {
+
+      const conn = await sqream.connect();
+      try {
+         const res1 = await conn.execute("SELECT 1");
+         const res2 = await conn.execute("SELECT 2");
+         const res3 = await conn.execute("SELECT 3");
+         conn.disconnect();
+         return {res1, res2, res3};
+      } catch (err) {
+         conn.disconnect();
+         throw err;
+      }
+
+   })().then((res) => {
+      console.log('Results', res);
+   }, (err) => {
+      console.error(err);
+   });
+
+
+.. _input_placeholders_example:
+
+Using placeholders in queries
+-----------------------------
+
+Input placeholders allow values like user input to be passed as parameters into queries, with proper escaping.
+
+.. code-block:: javascript
+
+   const Connection = require('@sqream/sqreamdb');
+
+   const config = {
+      host: 'localhost',
+      port: 3109,
+      username: 'rhendricks',
+      password: 'super_secret_password',
+      connectDatabase: 'raviga',
+      cluster: true,
+      is_ssl: true,
+      service: 'sqream'
+   };
+
+   const sqream = new Connection(config);
+
+   const sql = "SELECT %i FROM public.%i WHERE name = %s AND num > %d AND active = %b";
+
+   sqream.execute(sql, "col1", "table2", "john's", 50, true);
+
+
+The query that will run is ``SELECT col1 FROM public.table2 WHERE name = 'john''s' AND num > 50 AND active = true``.
+
+
+Troubleshooting and recommended configuration
+=============================================
+
+
+Preventing ``heap out of memory`` errors
+----------------------------------------
+
+Some workloads may cause Node.JS to fail with the error:
+
+.. code-block:: none
+
+   FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
+
+To prevent this error, modify the heap size configuration by setting the ``--max-old-space-size`` run flag.
+
+For example, set the space size to 2GB:
+
+.. code-block:: console
+
+   $ node --max-old-space-size=2048 my-application.js
+
+BIGINT support
+--------------
+
+The Node.JS connector supports fetching ``BIGINT`` values from SQream DB. However, some applications may encounter an error when trying to serialize those values.
+
+The error that appears is:
+
+.. code-block:: none
+
+   TypeError: Do not know how to serialize a BigInt
+
+This is because the JSON specification does not support ``BIGINT`` values, even when they are supported by JavaScript engines.
+
+To resolve this issue, objects with BIGINT values should be converted to string before serializing, and converted back after deserializing.
+
+For example:
+
+.. code-block:: javascript
+
+   const rows = [{test: 1n}];
+   const json = JSON.stringify(rows, (key, value) =>
+     typeof value === 'bigint'
+       ? value.toString() // convert BigInt values to strings
+       : value            // return everything else unchanged
+   );
+   console.log(json); // [{"test":"1"}]
+
+   // Converting back: one example policy is to revive all-digit strings as BigInt
+   // (adjust the test to match your data)
+   const parsed = JSON.parse(json, (key, value) =>
+     typeof value === 'string' && /^\d+$/.test(value) ? BigInt(value) : value
+   );
+
diff --git a/third_party_tools/client_drivers/nodejs/sample.js b/connecting_to_sqream/client_drivers/nodejs/sample.js
similarity index 95%
rename from third_party_tools/client_drivers/nodejs/sample.js
rename to connecting_to_sqream/client_drivers/nodejs/sample.js
index a8ec3db66..cf3e19099 100644
--- a/third_party_tools/client_drivers/nodejs/sample.js
+++ b/connecting_to_sqream/client_drivers/nodejs/sample.js
@@ -1,21 +1,21 @@
-const Connection = require('@sqream/sqreamdb');
-
-const config = {
-  host: 'localhost',
-  port: 3109,
-  username: 'rhendricks',
-  password: 'super_secret_password',
-  connectDatabase: 'raviga',
-  cluster: true,
-  is_ssl: true,
-  service: 'sqream'
-  };
-
-const query1 = 'SELECT 1 AS test, 2*6 AS "dozen"';
-
-const sqream = new Connection(config);
-sqream.execute(query1).then((data) => {
-  console.log(data);
-}, (err) => {
-  console.error(err);
+const Connection = require('@sqream/sqreamdb');
+
+const config = {
+  host: 'localhost',
+  port: 3109,
+  username: 'rhendricks',
+  password: 'super_secret_password',
+  connectDatabase: 'raviga',
+  cluster: true,
+  is_ssl: true,
+  service: 'sqream'
+  };
+
+const query1 = 'SELECT 1 AS test, 2*6 AS "dozen"';
+
+const sqream = new Connection(config);
+sqream.execute(query1).then((data) => {
+  console.log(data);
+}, (err) => {
+  console.error(err);
 });
\ No newline at end of file
diff --git a/third_party_tools/client_drivers/odbc/index.rst b/connecting_to_sqream/client_drivers/odbc/index.rst
similarity index 66%
rename from third_party_tools/client_drivers/odbc/index.rst
rename to connecting_to_sqream/client_drivers/odbc/index.rst
index 7623b4e99..d7ef130b1 100644
--- a/third_party_tools/client_drivers/odbc/index.rst
+++ b/connecting_to_sqream/client_drivers/odbc/index.rst
@@ -1,58 +1,53 @@
-.. _odbc:
-
-*************************
-ODBC
-*************************
-
-.. toctree::
-   :maxdepth: 1
-   :titlesonly:
-   :hidden:
-
-   install_configure_odbc_windows
-   install_configure_odbc_linux
-
-SQream has an ODBC driver to connect to SQream DB. This tutorial shows how to install the ODBC driver for Linux or Windows for use with applications like Tableau, PHP, and others that use ODBC.
-
-.. list-table::
-   :widths: auto
-   :header-rows: 1
-
-   * - Platform
-     - Versions supported
-
-   * - Windows
-     - * Windows 7 (64 bit)
-       * Windows 8 (64 bit)
-       * Windows 10 (64 bit)
-       * Windows Server 2008 R2 (64 bit)
-       * Windows Server 2012
-       * Windows Server 2016
-       * Windows Server 2019
-
-   * - Linux
-     - * Red Hat Enterprise Linux (RHEL) 7
-       * CentOS 7
-       * Ubuntu 16.04
-       * Ubuntu 18.04
-
-Other distributions may also work, but are not officially supported by SQream.
- -.. contents:: In this topic: - :local: - -Downloading the ODBC driver -================================== - -The SQream DB ODBC driver is distributed by your SQream account manager. Before contacting your account manager, verify which platform the ODBC driver will be used on. Go to `SQream Support `_ or contact your SQream account manager to get the driver. - -The driver is provided as an executable installer for Windows, or a compressed tarball for Linux platforms. -After downloading the driver, follow the relevant instructions to install and configure the driver for your platform: - -Install and configure the ODBC driver -======================================= - -Continue based on your platform: - -* :ref:`install_odbc_windows` +.. _odbc: + +**** +ODBC +**** + +.. toctree:: + :maxdepth: 1 + :titlesonly: + :hidden: + + install_configure_odbc_windows + install_configure_odbc_linux + +SQream has an ODBC driver to connect to SQream DB. This tutorial shows how to install the ODBC driver for Linux or Windows for use with applications like Tableau, PHP, and others that use ODBC. + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Platform + - Versions supported + + * - Windows + - * Windows 7 (64 bit) + * Windows 8 (64 bit) + * Windows 10 (64 bit) + * Windows Server 2008 R2 (64 bit) + * Windows Server 2012 + * Windows Server 2016 + * Windows Server 2019 + + * - Linux + - * Red Hat Enterprise Linux (RHEL) 8.x + +Other distributions may also work, but are not officially supported by SQream. + + +Getting the ODBC driver +======================= + +Download the relevant driver (Windows or Linux) from the :ref:`client drivers download page`. + +The driver is provided as an executable installer for Windows, or a compressed tarball for Linux platforms. +After downloading the driver, follow the relevant instructions to install and configure the driver for your platform: + +Install and configure the ODBC driver +======================================= + +Continue based on your platform: + +* :ref:`install_odbc_windows` * :ref:`install_odbc_linux` \ No newline at end of file diff --git a/third_party_tools/client_drivers/odbc/install_configure_odbc_linux.rst b/connecting_to_sqream/client_drivers/odbc/install_configure_odbc_linux.rst similarity index 87% rename from third_party_tools/client_drivers/odbc/install_configure_odbc_linux.rst rename to connecting_to_sqream/client_drivers/odbc/install_configure_odbc_linux.rst index 737768756..2a10908bb 100644 --- a/third_party_tools/client_drivers/odbc/install_configure_odbc_linux.rst +++ b/connecting_to_sqream/client_drivers/odbc/install_configure_odbc_linux.rst @@ -1,253 +1,249 @@ -.. _install_odbc_linux: - -**************************************** -Install and configure ODBC on Linux -**************************************** - -.. toctree:: - :maxdepth: 1 - :titlesonly: - :hidden: - - -The ODBC driver for Windows is provided as a shared library. - -This tutorial shows how to install and configure ODBC on Linux. - -.. contents:: In this topic: - :local: - :depth: 2 - -Prerequisites -============== - -.. _unixODBC: - -unixODBC ------------- - -The ODBC driver requires a driver manager to manage the DSNs. SQream DB's driver is built for unixODBC. - -Verify unixODBC is installed by running: - -.. 
code-block:: console - - $ odbcinst -j - unixODBC 2.3.4 - DRIVERS............: /etc/odbcinst.ini - SYSTEM DATA SOURCES: /etc/odbc.ini - FILE DATA SOURCES..: /etc/ODBCDataSources - USER DATA SOURCES..: /home/rhendricks/.odbc.ini - SQLULEN Size.......: 8 - SQLLEN Size........: 8 - SQLSETPOSIROW Size.: 8 - -Take note of the location of ``.odbc.ini`` and ``.odbcinst.ini``. In this case, ``/etc``. If ``odbcinst`` is not installed, follow the instructions for your platform below: - -.. contents:: Install unixODBC on: - :local: - :depth: 1 - -Install unixODBC on RHEL 7 / CentOS 7 -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: console - - $ yum install -y unixODBC unixODBC-devel - -Install unixODBC on Ubuntu -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: console - - $ sudo apt-get install unixodbc unixodbc-dev - - -Install the ODBC driver with a script -======================================= - -Use this method if you have never used ODBC on your machine before. If you have existing DSNs, see the manual install process below. - -#. Unpack the tarball - Copy the downloaded file to any directory, and untar it to a new directory: - - .. code-block:: console - - $ mkdir -p sqream_odbc64 - $ tar xf sqream_2019.2.1_odbc_3.0.0_x86_64_linux.tar.gz -C sqream_odbc64 - -#. Run the first-time installer. The installer will create an editable DSN. - - .. code-block:: console - - $ cd sqream_odbc64 - ./odbc_install.sh --install - - -#. Edit the DSN created by editing ``/etc/.odbc.ini``. See the parameter explanation in the section :ref:`ODBC DSN Parameters `. - - -Install the ODBC driver manually -======================================= - -Use this method when you have existing ODBC DSNs on your machine. - -#. Unpack the tarball - Copy the file you downloaded to the directory where you want to install it, and untar it: - - .. code-block:: console - - $ tar xf sqream_2019.2.1_odbc_3.0.0_x86_64_linux.tar.gz -C sqream_odbc64 - - Take note of the directory where the driver was unpacked. For example, ``/home/rhendricks/sqream_odbc64`` - -#. Locate the ``.odbc.ini`` and ``.odbcinst.ini`` files, using ``odbcinst -j``. - - #. In ``.odbcinst.ini``, add the following lines to register the driver (change the highlighted paths to match your specific driver): - - .. code-block:: ini - :emphasize-lines: 6,7 - - [ODBC Drivers] - SqreamODBCDriver=Installed - - [SqreamODBCDriver] - Description=Driver DSII SqreamODBC 64bit - Driver=/home/rhendricks/sqream_odbc64/sqream_odbc64.so - Setup=/home/rhendricks/sqream_odbc64/sqream_odbc64.so - APILevel=1 - ConnectFunctions=YYY - DriverODBCVer=03.80 - SQLLevel=1 - IconvEncoding=UCS-4LE - - #. In ``.odbc.ini``, add the following lines to configure the DSN (change the highlighted parameters to match your installation): - - .. code-block:: ini - :emphasize-lines: 6,7,8,9,10,11,12,13,14 - - [ODBC Data Sources] - MyTest=SqreamODBCDriver - - [MyTest] - Description=64-bit Sqream ODBC - Driver=/home/rhendricks/sqream_odbc64/sqream_odbc64.so - Server="127.0.0.1" - Port="5000" - Database="raviga" - Service="" - User="rhendricks" - Password="Tr0ub4dor&3" - Cluster=false - Ssl=false - - Parameters are in the form of ``parameter = value``. For details about the parameters that can be set for each DSN, see the section :ref:`ODBC DSN Parameters `. - - - #. Create a file called ``.sqream_odbc.ini`` for managing the driver settings and logging. 
-      This file should be created alongside the other files, and add the following lines (change the highlighted parameters to match your installation):
-
-      .. code-block:: ini
-         :emphasize-lines: 5,7
-
-         # Note that this default DriverManagerEncoding of UTF-32 is for iODBC. unixODBC uses UTF-16 by default.
-         # If unixODBC was compiled with -DSQL_WCHART_CONVERT, then UTF-32 is the correct value.
-         # Execute 'odbc_config --cflags' to determine if you need UTF-32 or UTF-16 on unixODBC
-         [Driver]
-         DriverManagerEncoding=UTF-16
-         DriverLocale=en-US
-         ErrorMessagesPath=/home/rhendricks/sqream_odbc64/ErrorMessages
-         LogLevel=0
-         LogNamespace=
-         LogPath=/tmp/
-         ODBCInstLib=libodbcinst.so
-
-
-Install the driver dependencies
-==================================
-
-Add the ODBC driver path to ``LD_LIBRARY_PATH``:
-
-.. code-block:: console
-
-   $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/rhendricks/sqream_odbc64/lib
-
-You can also add this previous command line to your ``~/.bashrc`` file in order to keep this installation working between reboots without re-entering the command manually
-
-Testing the connection
-========================
-
-Test the driver using ``isql``.
-
-If the DSN created is called ``MyTest`` as the example, run isql in this format:
-
-.. code-block:: console
-
-   $ isql MyTest
-
-
-.. _dsn_params:
-
-ODBC DSN Parameters
-=======================
-
-.. list-table::
-   :widths: auto
-   :header-rows: 1
-
-   * - Item
-     - Default
-     - Description
-   * - Data Source Name
-     - None
-     - An easily recognizable name that you'll use to reference this DSN.
-   * - Description
-     - None
-     - A description of this DSN for your convenience. This field can be left blank
-   * - User
-     - None
-     - Username of a role to use for connection. For example, ``User="rhendricks"``
-   * - Password
-     - None
-     - Specifies the password of the selected role. For example, ``User="Tr0ub4dor&3"``
-   * - Database
-     - None
-     - Specifies the database name to connect to. For example, ``Database="master"``
-   * - Service
-     - ``sqream``
-     - Specifices :ref:`service queue` to use. For example, ``Service="etl"``. Leave blank (``Service=""``) for default service ``sqream``.
-   * - Server
-     - None
-     - Hostname of the SQream DB worker. For example, ``Server="127.0.0.1"`` or ``Server="sqream.mynetwork.co"``
-   * - Port
-     - None
-     - TCP port of the SQream DB worker. For example, ``Port="5000"`` or ``Port="3108"`` for the load balancer
-   * - Cluster
-     - ``false``
-     - Connect via load balancer (use only if exists, and check port). For example, ``Cluster=true``
-   * - Ssl
-     - ``false``
-     - Specifies SSL for this connection. For example, ``Ssl=true``
-   * - DriverManagerEncoding
-     - ``UTF-16``
-     - Depending on how unixODBC is installed, you may need to change this to ``UTF-32``.
-   * - ErrorMessagesPath
-     - None
-     - Location where the driver was installed. For example, ``ErrorMessagePath=/home/rhendricks/sqream_odbc64/ErrorMessages``.
-   * - LogLevel
-     - 0
-     - Set to 0-6 for logging. Use this setting when instructed to by SQream Support. For example, ``LogLevel=1``
-
-       .. hlist::
-          :columns: 3
-
-          * 0 = Disable tracing
-          * 1 = Fatal only error tracing
-          * 2 = Error tracing
-          * 3 = Warning tracing
-          * 4 = Info tracing
-          * 5 = Debug tracing
-          * 6 = Detailed tracing
-
-
-
+.. _install_odbc_linux:
+
+***********************************
+Install and configure ODBC on Linux
+***********************************
+
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
+   :hidden:
+
+
+The ODBC driver for Linux is provided as a shared library.
+ +This tutorial shows how to install and configure ODBC on Linux. + +.. contents:: + :local: + :depth: 1 + +Prerequisites +============== + +.. _unixODBC: + +unixODBC +-------- + +The ODBC driver requires a driver manager to manage the DSNs. SQreamDB's driver is built for unixODBC. + +Verify unixODBC is installed by running: + +.. code-block:: console + + $ odbcinst -j + unixODBC 2.3.4 + DRIVERS............: /etc/odbcinst.ini + SYSTEM DATA SOURCES: /etc/odbc.ini + FILE DATA SOURCES..: /etc/ODBCDataSources + USER DATA SOURCES..: /home/rhendricks/.odbc.ini + SQLULEN Size.......: 8 + SQLLEN Size........: 8 + SQLSETPOSIROW Size.: 8 + +Take note of the location of ``.odbc.ini`` and ``.odbcinst.ini``. In this case, ``/etc``. If ``odbcinst`` is not installed, follow the instructions for your platform below: + +.. contents:: Install unixODBC on: + :local: + :depth: 1 + +Install unixODBC on RHEL +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: console + + $ yum install -y unixODBC unixODBC-devel + +Install the ODBC driver with a script +======================================= + +Use this method if you have never used ODBC on your machine before. If you have existing DSNs, see the manual install process below. + +#. Unpack the tarball + Copy the downloaded file to any directory, and untar it to a new directory: + + .. code-block:: console + + $ mkdir -p sqream_odbc64 + $ tar -xf sqream_odbc_vX.Y_x86_64_linux.tar.gz --strip-components=1 -C sqream_odbc64/ + +#. Run the first-time installer. The installer will create an editable DSN. + + .. code-block:: console + + $ cd sqream_odbc64 + ./odbc_install.sh --install + + +#. Edit the DSN created by editing ``/etc/.odbc.ini``. See the parameter explanation in the section :ref:`ODBC DSN Parameters `. + + +Install the ODBC driver manually +======================================= + +Use this method when you have existing ODBC DSNs on your machine. + +#. Unpack the tarball + Copy the file you downloaded to the directory where you want to install it, and untar it: + + .. code-block:: console + + $ tar xf sqream_2019.2.1_odbc_3.0.0_x86_64_linux.tar.gz -C sqream_odbc64 + + Take note of the directory where the driver was unpacked. For example, ``/home/rhendricks/sqream_odbc64`` + +#. Locate the ``.odbc.ini`` and ``.odbcinst.ini`` files, using ``odbcinst -j``. + + #. In ``.odbcinst.ini``, add the following lines to register the driver (change the highlighted paths to match your specific driver): + + .. code-block:: ini + :emphasize-lines: 6,7 + + [ODBC Drivers] + SqreamODBCDriver=Installed + + [SqreamODBCDriver] + Description=Driver DSII SqreamODBC 64bit + Driver=/home/rhendricks/sqream_odbc64/sqream_odbc64.so + Setup=/home/rhendricks/sqream_odbc64/sqream_odbc64.so + APILevel=1 + ConnectFunctions=YYY + DriverODBCVer=03.80 + SQLLevel=1 + IconvEncoding=UCS-4LE + + #. In ``.odbc.ini``, add the following lines to configure the DSN (change the highlighted parameters to match your installation): + + .. code-block:: ini + :emphasize-lines: 6,7,8,9,10,11,12,13,14 + + [ODBC Data Sources] + MyTest=SqreamODBCDriver + + [MyTest] + Description=64-bit Sqream ODBC + Driver=/home/rhendricks/sqream_odbc64/sqream_odbc64.so + Server="127.0.0.1" + Port="5000" + Database="raviga" + Service="" + User="rhendricks" + Password="Tr0ub4dor&3" + Cluster=false + Ssl=false + + Parameters are in the form of ``parameter = value``. For details about the parameters that can be set for each DSN, see the section :ref:`ODBC DSN Parameters `. + + + #. 
Create a file called ``.sqream_odbc.ini`` for managing the driver settings and logging.
+      The file should be created alongside the other driver files. Add the following lines to it (change the highlighted parameters to match your installation):
+
+      .. code-block:: ini
+         :emphasize-lines: 5,7
+
+         # Note that this default DriverManagerEncoding of UTF-32 is for iODBC. unixODBC uses UTF-16 by default.
+         # If unixODBC was compiled with -DSQL_WCHART_CONVERT, then UTF-32 is the correct value.
+         # Execute 'odbc_config --cflags' to determine if you need UTF-32 or UTF-16 on unixODBC
+         [Driver]
+         DriverManagerEncoding=UTF-16
+         DriverLocale=en-US
+         ErrorMessagesPath=/home/rhendricks/sqream_odbc64/ErrorMessages
+         LogLevel=0
+         LogNamespace=
+         LogPath=/tmp/
+         ODBCInstLib=libodbcinst.so
+
+
+Install the driver dependencies
+===============================
+
+Add the ODBC driver path to ``LD_LIBRARY_PATH``:
+
+.. code-block:: console
+
+   $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/rhendricks/sqream_odbc64/lib
+
+You can also add the previous command to your ``~/.bashrc`` file to keep this installation working between reboots, without having to re-enter the command manually.
+
+Testing the connection
+======================
+
+Test the driver using ``isql``.
+
+If the DSN you created is called ``MyTest``, as in the example above, run ``isql`` as follows:
+
+.. code-block:: console
+
+   $ isql MyTest
+
+
+.. _dsn_params:
+
+ODBC DSN Parameters
+===================
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Item
+     - Default
+     - Description
+   * - Data Source Name
+     - None
+     - An easily recognizable name that you'll use to reference this DSN.
+   * - Description
+     - None
+     - A description of this DSN for your convenience. This field can be left blank
+   * - User
+     - None
+     - Username of a role to use for connection. For example, ``User="rhendricks"``
+   * - Password
+     - None
+     - Specifies the password of the selected role. For example, ``Password="Tr0ub4dor&3"``
+   * - Database
+     - None
+     - Specifies the database name to connect to. For example, ``Database="master"``
+   * - Service
+     - ``sqream``
+     - Specifies the :ref:`service queue` to use. For example, ``Service="etl"``. Leave blank (``Service=""``) for the default service ``sqream``.
+   * - Server
+     - None
+     - Hostname of the SQreamDB worker. For example, ``Server="127.0.0.1"`` or ``Server="sqream.mynetwork.co"``
+   * - Port
+     - None
+     - TCP port of the SQreamDB worker. For example, ``Port="5000"`` or ``Port="3108"`` for the load balancer
+   * - Cluster
+     - ``false``
+     - Connect via load balancer (use only if exists, and check port). For example, ``Cluster=true``
+   * - Ssl
+     - ``false``
+     - Specifies SSL for this connection. For example, ``Ssl=true``
+   * - DriverManagerEncoding
+     - ``UTF-16``
+     - Depending on how unixODBC is installed, you may need to change this to ``UTF-32``.
+   * - ErrorMessagesPath
+     - None
+     - Location where the driver was installed. For example, ``ErrorMessagesPath=/home/rhendricks/sqream_odbc64/ErrorMessages``.
+   * - LogLevel
+     - 0
+     - Set to 0-6 for logging. Use this setting when instructed to by SQreamDB Support. For example, ``LogLevel=1``
+
+       .. hlist::
+          :columns: 3
+
+          * 0 = Disable tracing
+          * 1 = Fatal only error tracing
+          * 2 = Error tracing
+          * 3 = Warning tracing
+          * 4 = Info tracing
+          * 5 = Debug tracing
+          * 6 = Detailed tracing
+
+Limitations
+===========
+
+Please note that the SQreamDB ODBC connector does not support the use of ARRAY data types.
If your database schema includes ARRAY columns, you may encounter compatibility issues when using ODBC to connect to the database. + + diff --git a/third_party_tools/client_drivers/odbc/install_configure_odbc_windows.rst b/connecting_to_sqream/client_drivers/odbc/install_configure_odbc_windows.rst similarity index 90% rename from third_party_tools/client_drivers/odbc/install_configure_odbc_windows.rst rename to connecting_to_sqream/client_drivers/odbc/install_configure_odbc_windows.rst index 7749b44ab..9dd0c3f22 100644 --- a/third_party_tools/client_drivers/odbc/install_configure_odbc_windows.rst +++ b/connecting_to_sqream/client_drivers/odbc/install_configure_odbc_windows.rst @@ -1,134 +1,139 @@ -.. _install_odbc_windows: - -**************************************** -Install and Configure ODBC on Windows -**************************************** - -The ODBC driver for Windows is provided as a self-contained installer. - -This tutorial shows you how to install and configure ODBC on Windows. - -.. contents:: In this topic: - :local: - :depth: 2 - -Installing the ODBC Driver -================================== - -Prerequisites ----------------- - -.. _vcredist: - -Visual Studio 2015 Redistributables -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To install the ODBC driver you must first install Microsoft's **Visual C++ Redistributable for Visual Studio 2015**. To install Visual C++ Redistributable for Visual Studio 2015, see the `Install Instructions `_. - -Administrator Privileges -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The SQream DB ODBC driver requires administrator privileges on your computer to add the DSNs (data source names). - - -1. Run the Windows installer ------------------------------- - -Install the driver by following the on-screen instructions in the easy-to-follow installer. - -.. image:: /_static/images/odbc_windows_installer_screen1.png - -.. note:: The installer will install the driver in ``C:\Program Files\SQream Technologies\ODBC Driver`` by default. This path is changable during the installation. - -2. Selecting Components -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The installer includes additional components, like JDBC and Tableau customizations. - -.. image:: /_static/images/odbc_windows_installer_screen2.png - -You can deselect items you don't want to install, but the items named **ODBC Driver DLL** and **ODBC Driver Registry Keys** must remain selected for a complete installation of the ODBC driver. - -Once the installer finishes, you will be ready to configure the DSN for connection. - -.. _create_windows_odbc_dsn: - -3. Configuring the ODBC Driver DSN -====================================== - -ODBC driver configurations are done via DSNs. Each DSN represents one SQream DB database. - -#. Open up the Windows menu by clicking the Windows button on your keyboard (:kbd:`⊞ Win`) or pressing the Windows button with your mouse. - -#. Type **ODBC** and select **ODBC Data Sources (64-bit)**. Click the item to open up the setup window. - - .. image:: /_static/images/odbc_windows_startmenu.png - -#. The installer has created a sample User DSN named **SQreamDB** - - You can modify this DSN, or create a new one (:menuselection:`Add --> SQream ODBC Driver --> Next`) - - .. image:: /_static/images/odbc_windows_dsns.png - -#. Enter your connection parameters. See the reference below for a description of the parameters. - - .. image:: /_static/images/odbc_windows_dsn_config.png - -#. When completed, save the DSN by selecting :menuselection:`OK` - -.. 
tip:: Test the connection by clicking :menuselection:`Test` before saving. A successful test looks like this: - - .. image:: /_static/images/odbc_windows_dsn_test.png - -#. You can now use this DSN in ODBC applications like :ref:`Tableau `. - - - -Connection Parameters ------------------------ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Item - - Description - * - Data Source Name - - An easily recognizable name that you'll use to reference this DSN. Once you set this, it can not be changed. - * - Description - - A description of this DSN for your convenience. You can leave this blank. - * - User - - Username of a role to use for connection. For example, ``rhendricks`` - * - Password - - Specifies the password of the selected role. For example, ``Tr0ub4dor&3`` - * - Database - - Specifies the database name to connect to. For example, ``master`` - * - Service - - Specifices :ref:`service queue` to use. For example, ``etl``. Leave blank for default service ``sqream``. - * - Server - - Hostname of the SQream DB worker. For example, ``127.0.0.1`` or ``sqream.mynetwork.co`` - * - Port - - TCP port of the SQream DB worker. For example, ``5000`` or ``3108`` - * - User server picker - - Connect via load balancer (use only if exists, and check port) - * - SSL - - Specifies SSL for this connection - * - Logging options - - Use this screen to alter logging options when tracing the ODBC connection for possible connection issues. - - -Troubleshooting -================== - -Solving "Code 126" ODBC errors ---------------------------------- - -After installing the ODBC driver, you may experience the following error: - -.. code-block:: none - - The setup routines for the SQreamDriver64 ODBC driver could not be loaded due to system error - code 126: The specified module could not be found. - (c:\Program Files\SQream Technologies\ODBC Driver\sqreamOdbc64.dll) - -This is an issue with the Visual Studio Redistributable packages. Verify you've correctly installed them, as described in the :ref:`Visual Studio 2015 Redistributables ` section above. +.. _install_odbc_windows: + +**************************************** +Install and Configure ODBC on Windows +**************************************** + +The ODBC driver for Windows is provided as a self-contained installer. + +This tutorial shows you how to install and configure ODBC on Windows. + +.. contents:: + :local: + :depth: 1 + +Installing the ODBC Driver +================================== + +Prerequisites +---------------- + +.. _vcredist: + +Visual Studio 2015 Redistributables +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To install the ODBC driver you must first install Microsoft's **Visual C++ Redistributable for Visual Studio 2015**. To install Visual C++ Redistributable for Visual Studio 2015, see the `Install Instructions `_. + +Administrator Privileges +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The SQream DB ODBC driver requires administrator privileges on your computer to add the DSNs (data source names). + + +Running the Windows Installer +------------------------------ + +Install the driver by following the on-screen instructions in the easy-to-follow installer. + +.. image:: /_static/images/odbc_windows_installer_screen1.png + +.. note:: The installer will install the driver in ``C:\Program Files\SQream Technologies\ODBC Driver`` by default. This path is changable during the installation. + +Selecting Components +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The installer includes additional components, like JDBC and Tableau customizations. + +.. 
image:: /_static/images/odbc_windows_installer_screen2.png
+
+You can deselect items you don't want to install, but the items named **ODBC Driver DLL** and **ODBC Driver Registry Keys** must remain selected for a complete installation of the ODBC driver.
+
+Once the installer finishes, you will be ready to configure the DSN for connection.
+
+.. _create_windows_odbc_dsn:
+
+Configuring the ODBC Driver DSN
+===============================
+
+ODBC driver configurations are done via DSNs. Each DSN represents one SQream DB database.
+
+#. Open up the Windows menu by clicking the Windows button on your keyboard (:kbd:`⊞ Win`) or pressing the Windows button with your mouse.
+
+#. Type **ODBC** and select **ODBC Data Sources (64-bit)**. Click the item to open up the setup window.
+
+   .. image:: /_static/images/odbc_windows_startmenu.png
+
+#. The installer has created a sample User DSN named **SQreamDB**.
+
+   You can modify this DSN, or create a new one (:menuselection:`Add --> SQream ODBC Driver --> Next`)
+
+   .. image:: /_static/images/odbc_windows_dsns.png
+
+#. Enter your connection parameters. See the reference below for a description of the parameters.
+
+   .. image:: /_static/images/odbc_windows_dsn_config.png
+
+#. When completed, save the DSN by selecting :menuselection:`OK`.
+
+.. tip:: Test the connection by clicking :menuselection:`Test` before saving. A successful test looks like this:
+
+   .. image:: /_static/images/odbc_windows_dsn_test.png
+
+
+Connection Parameters
+---------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Item
+     - Description
+   * - Data Source Name
+     - An easily recognizable name that you'll use to reference this DSN. Once you set this, it cannot be changed.
+   * - Description
+     - A description of this DSN for your convenience. You can leave this blank.
+   * - User
+     - Username of a role to use for connection. For example, ``rhendricks``
+   * - Password
+     - Specifies the password of the selected role. For example, ``Tr0ub4dor&3``
+   * - Database
+     - Specifies the database name to connect to. For example, ``master``
+   * - Service
+     - Specifies the :ref:`service queue` to use. For example, ``etl``. Leave blank for the default service ``sqream``.
+   * - Server
+     - Hostname of the SQream DB worker. For example, ``127.0.0.1`` or ``sqream.mynetwork.co``
+   * - Port
+     - TCP port of the SQream DB worker. For example, ``5000`` or ``3108``
+   * - User server picker
+     - Connect via load balancer (use only if exists, and check port)
+   * - SSL
+     - Specifies SSL for this connection
+   * - Logging options
+     - Use this screen to alter logging options when tracing the ODBC connection for possible connection issues.
+
+
+Troubleshooting
+===============
+
+Solving "Code 126" ODBC errors
+------------------------------
+
+After installing the ODBC driver, you may experience the following error:
+
+.. code-block:: none
+
+   The setup routines for the SQreamDriver64 ODBC driver could not be loaded due to system error
+   code 126: The specified module could not be found.
+   (c:\Program Files\SQream Technologies\ODBC Driver\sqreamOdbc64.dll)
+
+This is an issue with the Visual Studio Redistributable packages. Verify you've correctly installed them, as described in the :ref:`Visual Studio 2015 Redistributables <vcredist>` section above.
+
+Limitations
+===========
+
+Please note that the SQreamDB ODBC connector does not support the use of ARRAY data types. If your database schema includes ARRAY columns, you may encounter compatibility issues when using ODBC to connect to the database.
\ No newline at end of file
diff --git a/connecting_to_sqream/client_drivers/python/index.rst b/connecting_to_sqream/client_drivers/python/index.rst
new file mode 100644
index 000000000..d18acdcba
--- /dev/null
+++ b/connecting_to_sqream/client_drivers/python/index.rst
@@ -0,0 +1,558 @@
+.. _pysqream:
+
+*****************
+Python (pysqream)
+*****************
+
+The current Pysqream connector supports Python version 3.9 and newer. It includes a set of packages that allows Python programs to connect to SQreamDB. The base ``pysqream`` package conforms to Python DB-API specifications `PEP-249 `_.
+
+``pysqream`` is a pure Python connector that can be installed with ``pip`` on any operating system, including Linux, Windows, and macOS. ``pysqream-sqlalchemy`` is a SQLAlchemy dialect for ``pysqream``.
+
+
+.. contents::
+   :local:
+   :depth: 1
+
+Installing the Python Connector
+===============================
+
+Prerequisites
+-------------
+
+It is essential that you have the following installed:
+
+.. contents::
+   :local:
+   :depth: 1
+
+Python
+~~~~~~
+
+The connector requires Python version 3.9 or newer.
+
+To see your current Python version, run the following command:
+
+.. code-block:: console
+
+   $ python --version
+
+
+PIP
+~~~
+
+The Python connector is installed via ``pip``, the standard package manager for Python, which is used to install, upgrade, and manage Python packages (libraries) and their dependencies.
+
+We recommend upgrading to the latest version of ``pip`` before installing.
+
+To upgrade to the latest version, run the following command:
+
+.. code-block:: console
+
+   $ python3 -m pip install --upgrade pip
+   Collecting pip
+     Downloading https://files.pythonhosted.org/packages/00/b6/9cfa56b4081ad13874b0c6f96af8ce16cfbc1cb06bedf8e9164ce5551ec1/pip-19.3.1-py2.py3-none-any.whl (1.4MB)
+        |████████████████████████████████| 1.4MB 1.6MB/s
+   Installing collected packages: pip
+     Found existing installation: pip 19.1.1
+       Uninstalling pip-19.1.1:
+         Successfully uninstalled pip-19.1.1
+   Successfully installed pip-19.3.1
+
+
+.. note::
+   * On macOS, you may want to use virtualenv to install Python and the connector, to ensure compatibility with the built-in Python environment
+   * If you encounter an error including ``SSLError`` or ``WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.`` - please be sure to reinstall Python with SSL enabled, or use virtualenv or Anaconda.
+
+OpenSSL for Linux
+~~~~~~~~~~~~~~~~~
+
+The Python connector relies on OpenSSL for secure connections to SQreamDB. Some distributions of Python do not include OpenSSL.
+
+To install OpenSSL on RHEL, run the following command:
+
+.. code-block:: console
+
+   $ sudo yum install -y libffi-devel openssl-devel
+
+Installing via PIP with an internet connection
+----------------------------------------------
+
+The Python connector is available via `PyPi `_.
+
+To install the connector using ``pip``, it is advisable to use the ``-U`` or ``--user`` flags instead of ``sudo``, as this ensures packages are installed per user. Note, however, that the connector can then only be accessed by that same user.
+
+To install ``pysqream`` and ``pysqream-sqlalchemy`` with the ``--user`` flag, run the following command:
+
+.. code-block:: console
+
+   $ pip3.9 install pysqream pysqream-sqlalchemy --user
+
+``pip3`` will automatically install all necessary libraries and modules.
+
+Installing via PIP without an internet connection
+-------------------------------------------------
+
+#. To get the ``.whl`` package file, contact your SQreamDB support representative.
+
+#. Run the following commands:
+
+.. code-block:: console
+
+   $ tar -xf pysqream_connector_5.2.0.tar.gz
+   $ cd pysqream_connector_5.2.0
+   # Install all packages with --no-index --find-links .
+   $ python3 -m pip install *.whl -U --no-index --find-links .
+   $ python3.9 -m pip install pysqream-5.2.0.zip -U --no-index --find-links .
+   $ python3.9 -m pip install pysqream-sqlalchemy-1.3.zip -U --no-index --find-links .
+
+Upgrading an Existing Installation
+----------------------------------
+
+The Python drivers are updated periodically. To upgrade an existing pysqream installation, use pip's ``-U`` flag:
+
+.. code-block:: console
+
+   $ pip3.9 install pysqream pysqream-sqlalchemy -U
+
+.. _sqlalchemy:
+
+SQLAlchemy
+==========
+
+SQLAlchemy is an Object-Relational Mapper (ORM) for Python. When you install the SQream dialect (``pysqream-sqlalchemy``) you can use frameworks such as Pandas, TensorFlow, and Alembic to query SQream directly.
+
+Before You Begin
+----------------
+
+Download `pysqream-sqlalchemy `_
+
+Limitation
+----------
+
+* Does not support the ``ARRAY`` data type
+
+
+Creating a Standard Connection
+------------------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Parameter
+     - Description
+   * - ``username``
+     - Username of a role to use for connection
+   * - ``password``
+     - Specifies the password of the selected role
+   * - ``host``
+     - Specifies the hostname
+   * - ``port``
+     - Specifies the port number
+   * - ``port_ssl``
+     - Optional. Specifies the port to use for SSL connections
+   * - ``database``
+     - Specifies the database name
+   * - ``clustered``
+     - Establishing a multi-clustered connection. Input values: ``True``, ``False``. Default is ``False``
+   * - ``service``
+     - Specifies the service queue to use
+
+
+.. code-block:: python
+
+   import sqlalchemy as sa
+   from sqlalchemy.engine.url import URL
+
+
+   engine_url = sa.engine.url.URL('sqream',
+                                  username='<username>',
+                                  password='<password>',
+                                  host='<host>',
+                                  port=<port>,
+                                  port_ssl=<ssl_port>,
+                                  database='<database>')
+
+   engine = sa.create_engine(engine_url, connect_args={"clustered": False, "service": "<service>"})
+
+
+
+Pulling a Table into Pandas
+---------------------------
+
+The following example shows how to pull a table into Pandas. This example uses the URL method to create the connection string:
+
+.. code-block:: python
+
+   import sqlalchemy as sa
+   import pandas as pd
+   from sqlalchemy.engine.url import URL
+
+
+   engine_url = sa.engine.url.URL('sqream',
+                                  username='sqream',
+                                  password='12345',
+                                  host='127.0.0.1',
+                                  port=3108,
+                                  database='master')
+
+   engine = sa.create_engine(engine_url, connect_args={"clustered": True, "service": "admin"})
+   table_df = pd.read_sql("select * from nba", con=engine)
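+
+Going the other direction, a DataFrame can also be written back through the same engine with the standard pandas ``to_sql`` method. A minimal sketch, reusing the ``engine`` and ``table_df`` objects from the example above (and assuming the connecting role is allowed to create tables; the table name ``nba_copy`` is hypothetical):
+
+.. code-block:: python
+
+   # Write the DataFrame to a table called nba_copy;
+   # if_exists="replace" recreates the table on every run
+   table_df.to_sql("nba_copy", con=engine, if_exists="replace", index=False)
+
+API
+===
+
+.. contents::
+   :local:
+   :depth: 1
+
+Using the Cursor
+----------------
+
+The DB-API specification includes several methods for fetching results from the cursor. This section shows an example using the ``nba`` table, which looks as follows:
+
+.. csv-table:: nba
+   :file: nba-t10.csv
+   :widths: auto
+   :header-rows: 1
+
+As before, you must import the library and create a :py:meth:`~Connection`, followed by :py:meth:`~Connection.execute` on a simple ``SELECT *`` query:
+
+.. code-block:: python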
+
+   import pysqream
+
+
+   con = pysqream.connect(host='127.0.0.1',
+                          port=3108,
+                          database='master',
+                          username='rhendricks',
+                          password='Tr0ub4dor&3',
+                          clustered=True)
+
+   cur = con.cursor()  # Create a new cursor
+   # The select statement:
+   statement = 'SELECT * FROM nba'
+   cur.execute(statement)
+
+When the statement has finished executing, the cursor object holds the result. A cursor is iterable, meaning that it advances to the next row when fetched.
+
+You can use :py:meth:`~Connection.fetchone` to fetch one record at a time:
+
+.. code-block:: python
+
+   first_row = cur.fetchone()  # Fetch one row at a time (first row)
+
+   second_row = cur.fetchone()  # Fetch one row at a time (second row)
+
+To fetch several rows at a time, use :py:meth:`~Connection.fetchmany`:
+
+.. code-block:: python
+
+   # Executing `fetchone` twice is equivalent to this form:
+   third_and_fourth_rows = cur.fetchmany(2)
+
+To fetch all rows at once, use :py:meth:`~Connection.fetchall`:
+
+.. code-block:: python
+
+   # To get all rows at once, use `fetchall`
+   remaining_rows = cur.fetchall()
+
+   cur.close()
+
+
+   # Close the connection when done
+   con.close()
+
+The following is an example of the contents of the row variables used in our examples:
+
+.. code-block:: pycon
+
+   >>> print(first_row)
+   ('Avery Bradley', 'Boston Celtics', 0, 'PG', 25, '6-2', 180, 'Texas', 7730337)
+   >>> print(second_row)
+   ('Jae Crowder', 'Boston Celtics', 99, 'SF', 25, '6-6', 235, 'Marquette', 6796117)
+   >>> print(third_and_fourth_rows)
+   [('John Holland', 'Boston Celtics', 30, 'SG', 27, '6-5', 205, 'Boston University', None), ('R.J. Hunter', 'Boston Celtics', 28, 'SG', 22, '6-5', 185, 'Georgia State', 1148640)]
+   >>> print(remaining_rows)
+   [('Jonas Jerebko', 'Boston Celtics', 8, 'PF', 29, '6-10', 231, None, 5000000), ('Amir Johnson', 'Boston Celtics', 90, 'PF', 29, '6-9', 240, None, 12000000), ('Jordan Mickey', 'Boston Celtics', 55, 'PF', 21, '6-8', 235, 'LSU', 1170960), ('Kelly Olynyk', 'Boston Celtics', 41, 'C', 25, '7-0', 238, 'Gonzaga', 2165160),
+   [...]
+
+.. note:: Calling a fetch command after all rows have been fetched will return an empty list (``[]``).
+
+Reading Result Metadata
+-----------------------
+
+When you execute a statement, the connection object also contains metadata about the result set, such as **column names** and **types**.
+
+The metadata is stored in the :py:attr:`Connection.description` object of the cursor:
+
+.. code-block:: python
+
+   import pysqream
+
+
+   con = pysqream.connect(host='127.0.0.1',
+                          port=3108,
+                          database='master',
+                          username='rhendricks',
+                          password='Tr0ub4dor&3',
+                          clustered=True)
+   cur = con.cursor()
+   statement = 'SELECT * FROM nba'
+   cur.execute(statement)
+   print(cur.description)
+   # [('Name', 'STRING', 24, 24, None, None, True), ('Team', 'STRING', 22, 22, None, None, True), ('Number', 'NUMBER', 1, 1, None, None, True), ('Position', 'STRING', 2, 2, None, None, True), ('Age (as of 2018)', 'NUMBER', 1, 1, None, None, True), ('Height', 'STRING', 4, 4, None, None, True), ('Weight', 'NUMBER', 2, 2, None, None, True), ('College', 'STRING', 21, 21, None, None, True), ('Salary', 'NUMBER', 4, 4, None, None, True)]
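+
+Each entry in ``description`` is a tuple whose first two items are the column name and its type string. For example, a simple name-to-type mapping can be built from it (a short sketch, based on the output shown above):
+
+.. code-block:: python
+
+   # Map each column name to the type string reported by the cursor
+   column_types = {col[0]: col[1] for col in cur.description}
+   print(column_types['Name'])  # STRING
+
+You can fetch a list of column names by iterating over the ``description`` list:
+
+.. code-block:: pycon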
+
+   >>> [ i[0] for i in cur.description ]
+   ['Name', 'Team', 'Number', 'Position', 'Age (as of 2018)', 'Height', 'Weight', 'College', 'Salary']
+
+Loading Data into a Table
+-------------------------
+
+This example shows how to load 10,000 rows of dummy data into an instance of SQreamDB.
+
+**To load 10,000 rows of dummy data into an instance of SQreamDB:**
+
+1. Connect and create a cursor:
+
+   .. code-block:: python
+
+      import pysqream
+      from datetime import date, datetime
+      from time import time
+
+
+      con = pysqream.connect(host='127.0.0.1',
+                             port=3108,
+                             database='master',
+                             username='rhendricks',
+                             password='Tr0ub4dor&3',
+                             clustered=True)
+
+      cur = con.cursor()
+
+2. Create a table for loading:
+
+   .. code-block:: python
+
+      create = 'create or replace table perf (b bool, t tinyint, sm smallint, i int, bi bigint, f real, d double, s text(12), ss text, dt date, dtt datetime)'
+      cur.execute(create)
+
+3. Create dummy data matching the table you created:
+
+   .. code-block:: python
+
+      data = (False, 2, 12, 145, 84124234, 3.141, -4.3, "Marty McFly" , u"キウイは楽しい鳥です" , date(2019, 12, 17), datetime(1955, 11, 4, 1, 23, 0, 0))
+
+      row_count = 10**4
+
+4. Load the data into the table using the ``INSERT`` command:
+
+   .. code-block:: python
+
+      insert = 'insert into perf values (?,?,?,?,?,?,?,?,?,?,?)'
+      start = time()
+      cur.executemany(insert, [data] * row_count)
+      print(f"Total insert time for {row_count} rows: {time() - start} seconds")
+
+5. Close this cursor:
+
+   .. code-block:: python
+
+      cur.close()
+
+6. Verify that the data was inserted correctly:
+
+   .. code-block:: python
+
+      cur = con.cursor()
+      cur.execute('select count(*) from perf')
+      result = cur.fetchall()  # `fetchall` collects the entire data set
+      print(f"Count of inserted rows: {result[0][0]}")
+
+7. Close the cursor:
+
+   .. code-block:: python
+
+      cur.close()
+
+8. Close the connection:
+
+   .. code-block:: python
+
+      con.close()
+
+
+
+Using SQLAlchemy ORM to Create and Populate Tables
+--------------------------------------------------
+
+This section shows how to use the ORM to create and populate tables from Python objects.
+
+**To use SQLAlchemy ORM to create and populate tables:**
+
+1. Create an engine:
+
+   .. code-block:: python
+
+      import sqlalchemy as sa
+      import pandas as pd
+
+      engine_url = "sqream://rhendricks:secret_password@localhost:5000/raviga"
+
+      engine = sa.create_engine(engine_url)
+
+2. Build a metadata object and bind it:
+
+   .. code-block:: python
+
+      metadata = sa.MetaData()
+      metadata.bind = engine
+
+3. Create a table in the local metadata:
+
+   .. code-block:: python
+
+      employees = sa.Table(
+         'employees',
+         metadata,
+         sa.Column('id', sa.Integer),
+         sa.Column('name', sa.TEXT(32)),
+         sa.Column('lastname', sa.TEXT(32)),
+         sa.Column('salary', sa.Float)
+      )
+
+4. Create all the defined table objects. The ``create_all()`` function uses the SQreamDB engine object:
+
+   .. code-block:: python
+
+      metadata.create_all(engine)
+
+5. Populate your table. First, build the data rows:
+
+   .. code-block:: python
+
+      insert_data = [ {'id': 1, 'name': 'Richard','lastname': 'Hendricks', 'salary': 12000.75},
+                      {'id': 3, 'name': 'Bertram', 'lastname': 'Gilfoyle', 'salary': 8400.0},
+                      {'id': 8, 'name': 'Donald', 'lastname': 'Dunn', 'salary': 6500.40}]
+
+6. Build the ``INSERT`` command:
+
+   .. code-block:: python
+
+      ins = employees.insert(insert_data)
+
+7. Execute the command:
+
+   .. code-block:: python
+
+      with engine.connect() as conn:
+          result = conn.execute(ins)
+
+Prepared Statements
+===================
+
+Prepared statements, also known as parameterized queries, are a safer and more efficient way to execute SQL statements. They prevent SQL injection attacks by separating SQL code from data, and they can improve performance by reusing prepared statements.
+In SQreamDB, ``?`` is used as a placeholder for the relevant value in parameterized queries.
+Prepared statements are supported for ``INSERT``, ``SELECT``, ``UPDATE``, and ``DELETE``.
+
+Prepared Statement Limitations
+------------------------------
+
+* Prepared statements do not support the use of :ref:`keywords_and_identifiers` as input parameters.
+
+Prepared Statements code example
+--------------------------------
+
+.. code-block:: python
+
+   import pysqream
+   from datetime import date, datetime
+   from time import time
+
+   # SQreamDB connection settings
+   con = pysqream.connect(host='<host>', port=3108, database='master'
+                          , username='<username>', password='<password>'
+                          , clustered=True)
+   cur = con.cursor()
+
+   # CREATE
+   create = 'create or replace table perf (b bool, t tinyint, sm smallint, i int, bi bigint, f real, d double, s text(12), ss text, dt date, dtt datetime)'
+   cur.execute(create)
+
+   # DATA
+   data = (False, 2, 12, 145, 84124234, 3.141, -4.3, "STRING1" , "STRING2", date(2024, 11, 11), datetime(2024, 11, 11, 11, 11, 11, 11))
+   row_count = 10**2
+
+   # INSERT
+   insert = 'insert into perf values (?,?,?,?,?,?,?,?,?,?,?)'
+   start = time()
+
+   # Prepared statement
+   cur.executemany(insert, [data] * row_count)
+   print(f"Total insert time for {row_count} rows: {time() - start} seconds")
+
+
+   # Results (table count)
+   cur = con.cursor()
+   cur.execute('select count(*) from perf')
+   result = cur.fetchall()  # `fetchall` collects the entire data set
+   print(f"Count of inserted rows: {result[0][0]}")
+
+
+   # SELECT
+   query = "SELECT * FROM perf WHERE s = ?"
+   params = [("STRING1",)]
+
+   # Prepared statement
+   cur.execute(query, params)
+
+
+   # Result
+   rows = cur.fetchall()
+   print(rows)
+
+   for row in rows:
+       print(row)
+
+
+   # UPDATE
+   query = "UPDATE perf SET s = ? WHERE s = ?"
+   params = [("STRING3", "STRING2")]
+
+   # Prepared statement
+   cur.execute(query, params)
+
+   print("Update completed.")
+
+
+   # DELETE
+   query = "DELETE FROM perf WHERE s = ?"
+   params = [("STRING1",)]
+
+   # Prepared statement
+   cur.execute(query, params)
+
+   print("Delete completed.")
+
+   # Close the connection
+   cur.close()
+   con.close()
+
+
diff --git a/third_party_tools/client_drivers/python/nba-t10.csv b/connecting_to_sqream/client_drivers/python/nba-t10.csv
similarity index 98%
rename from third_party_tools/client_drivers/python/nba-t10.csv
rename to connecting_to_sqream/client_drivers/python/nba-t10.csv
index fe9ced442..024530355 100644
--- a/third_party_tools/client_drivers/python/nba-t10.csv
+++ b/connecting_to_sqream/client_drivers/python/nba-t10.csv
@@ -1,10 +1,10 @@
-Name,Team,Number,Position,Age,Height,Weight,College,Salary
-Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0
-Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0
-John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University,
-R.J. 
Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0 -Jonas Jerebko,Boston Celtics,8.0,PF,29.0,6-10,231.0,,5000000.0 -Amir Johnson,Boston Celtics,90.0,PF,29.0,6-9,240.0,,12000000.0 -Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0 -Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0 -Terry Rozier,Boston Celtics,12.0,PG,22.0,6-2,190.0,Louisville,1824360.0 +Name,Team,Number,Position,Age,Height,Weight,College,Salary +Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0 +Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0 +John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University, +R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0 +Jonas Jerebko,Boston Celtics,8.0,PF,29.0,6-10,231.0,,5000000.0 +Amir Johnson,Boston Celtics,90.0,PF,29.0,6-9,240.0,,12000000.0 +Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0 +Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0 +Terry Rozier,Boston Celtics,12.0,PG,22.0,6-2,190.0,Louisville,1824360.0 diff --git a/third_party_tools/client_drivers/python/test.py b/connecting_to_sqream/client_drivers/python/test.py similarity index 95% rename from third_party_tools/client_drivers/python/test.py rename to connecting_to_sqream/client_drivers/python/test.py index 51d0b4a92..d7de6305a 100644 --- a/third_party_tools/client_drivers/python/test.py +++ b/connecting_to_sqream/client_drivers/python/test.py @@ -1,37 +1,37 @@ -#!/usr/bin/env python - -import pysqream - -""" -Connection parameters include: -* IP/Hostname -* Port -* database name -* username -* password -* Connect through load balancer, or direct to worker (Default: false - direct to worker) -* use SSL connection (default: false) -* Optional service queue (default: 'sqream') -""" - -# Create a connection object - -con = pysqream.connect(host='127.0.0.1', port=5000, database='master' - , username='sqream', password='sqream' - , clustered=False) - -# Create a new cursor -cur = con.cursor() - -# Prepare and execute a query -cur.execute('select show_version()') - -result = cur.fetchall() # `fetchall` gets the entire data set - -print (f"Version: {result[0][0]}") - -# This should print the SQream DB version. For example ``Version: v2020.1``. - -# Finally, close the connection - +#!/usr/bin/env python + +import pysqream + +""" +Connection parameters include: +* IP/Hostname +* Port +* database name +* username +* password +* Connect through load balancer, or direct to worker (Default: false - direct to worker) +* use SSL connection (default: false) +* Optional service queue (default: 'sqream') +""" + +# Create a connection object + +con = pysqream.connect(host='127.0.0.1', port=5000, database='master' + , username='sqream', password='sqream' + , clustered=False) + +# Create a new cursor +cur = con.cursor() + +# Prepare and execute a query +cur.execute('select show_version()') + +result = cur.fetchall() # `fetchall` gets the entire data set + +print (f"Version: {result[0][0]}") + +# This should print the SQream DB version. For example ``Version: v2020.1``. + +# Finally, close the connection + con.close() \ No newline at end of file diff --git a/connecting_to_sqream/client_drivers/spark/index.rst b/connecting_to_sqream/client_drivers/spark/index.rst new file mode 100644 index 000000000..03ef309c3 --- /dev/null +++ b/connecting_to_sqream/client_drivers/spark/index.rst @@ -0,0 +1,274 @@ +.. 
_spark:
+
+*****
+Spark
+*****
+
+The Spark connector enables reading and writing data to and from SQreamDB and may be used for large-scale data processing.
+
+.. contents::
+   :local:
+   :depth: 1
+
+Before You Begin
+================
+
+To use Spark with SQreamDB, it is essential that you have the following installed:
+
+* SQreamDB version 2022.1.8 or later
+* Spark version 3.3.1 or later
+* `SQreamDB Spark Connector 5.0.0 `_
+* :ref:`JDBC` version 4.5.6 or later
+
+Configuration
+=============
+
+The Spark JDBC connection properties empower you to customize your Spark connection. These properties facilitate various aspects, including database access, query execution, and result retrieval. Additionally, they provide options for authentication, encryption, and connection pooling.
+
+The following Spark connection properties are supported by SQreamDB:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+
+   * - Parameter
+     - Default
+     - Description
+   * - ``url``
+     -
+     - The JDBC URL to connect to the database.
+   * - ``dbtable``
+     -
+     - The name of the database table to read from or write to.
+   * - ``query``
+     -
+     - The SQL query to be executed when using the JDBC data source, instead of specifying a table or view name with the ``dbtable`` property.
+   * - ``driver``
+     -
+     - The fully qualified class name of the JDBC driver to use when connecting to a relational database.
+   * - ``numPartitions``
+     -
+     - The number of partitions to use when reading data from a data source.
+   * - ``queryTimeout``
+     - 0
+     - The maximum time in seconds for a JDBC query to execute before timing out.
+   * - ``fetchsize``
+     - 0
+     - The number of rows to fetch in a single JDBC fetch operation.
+   * - ``batchsize``
+     - 1000000
+     - The number of rows to write in a single JDBC batch operation when writing to a database.
+   * - ``sessionInitStatement``
+     -
+     - A SQL statement to be executed once at the beginning of a JDBC session, such as to set session-level properties.
+   * - ``truncate``
+     - ``false``
+     - A boolean value indicating whether to truncate an existing table before writing data to it.
+   * - ``cascadeTruncate``
+     - The default cascading truncate behaviour of the JDBC database in question, as specified by ``isCascadeTruncate`` in each JDBCDialect
+     - A boolean value indicating whether to recursively truncate child tables when truncating a table.
+   * - ``createTableOptions``
+     -
+     - Additional options to include when creating a new table in a relational database.
+   * - ``createTableColumnTypes``
+     -
+     - A map of column names to column data types to use when creating a new table in a relational database.
+   * - ``customSchema``
+     -
+     - A custom schema to use when reading data from a file format that does not support schema inference, such as CSV or JSON.
+   * - ``pushDownPredicate``
+     - ``true``
+     - A boolean value indicating whether to push down filters to the data source.
+   * - ``pushDownAggregate``
+     - ``false``
+     - A boolean value indicating whether to push down aggregations to the data source.
+   * - ``pushDownLimit``
+     - ``false``
+     - A boolean value indicating whether to push down limits to the data source.
+   * - ``pushDownTableSample``
+     - ``false``
+     - Used to optimize the performance of SQL queries on large tables by pushing down the sampling operation closer to the data source, reducing the amount of data that needs to be processed.
+   * - ``connectionProvider``
+     -
+     - A fully qualified class name of a custom connection provider to use when connecting to a data source.
+   * - ``c``
+     - ``false``
+     - A shorthand for specifying connection properties in the JDBC data source.
+
+Connecting Spark to SQreamDB
+----------------------------
+
+Spark DataFrames are the objects used to transfer data between sources. The SQreamDB Spark Connector lets you insert DataFrames into SQreamDB tables, and export SQreamDB tables or query results as DataFrames for further processing in Spark.
+
+1. To open the Spark Shell, run the following command under the ``Spark/bin`` directory:
+
+.. code-block:: console
+
+   ./spark-shell --driver-class-path {driver path} --jars {Spark-Sqream-Connector.jar path}
+
+   //Example:
+
+   ./spark-shell --driver-class-path /home/sqream/sqream-jdbc-4.5.6.jar --jars Spark-Sqream-Connector-1.0.jar
+
+2. To create a SQreamDB session, run the following commands in the Spark Shell:
+
+.. code-block:: console
+
+   import scala.collection.JavaConverters.mapAsJavaMapConverter
+   val config = Map("spark.master" -> "local").asJava
+   import com.sqream.driver.SqreamSession
+   val sqreamSession = SqreamSession.getSession(config)
+
+Transferring Data
+=================
+
+Transferring Data From SQreamDB to Spark
+----------------------------------------
+
+1. Create a mapping of Spark options:
+
+.. code-block:: console
+
+   val options = Map("query" -> "select * from <table_name>", "url" -> "jdbc:Sqream://<host>:<port>/master;user=<username>;password=<password>;cluster=false").asJava
+
+2. Create a Spark DataFrame:
+
+.. code-block:: console
+
+   val df = sqreamSession.read(options)
+
+Transferring Data From Spark to SQreamDB
+----------------------------------------
+
+1. Create a mapping of Spark options, using the ``dbtable`` Spark option (``query`` is not allowed for writing):
+
+.. code-block:: console
+
+   val options = Map("dbtable" -> "<table_name>", "url" -> "jdbc:Sqream://<host>:<port>/master;user=<username>;password=<password>;cluster=false").asJava
+
+2. Write the DataFrame to SQreamDB:
+
+.. code-block:: console
+
+   import org.apache.spark.sql.SaveMode
+   sqreamSession.write(df, options, SaveMode.Overwrite)
+
+Data Types and Mapping
+======================
+
+SQreamDB data types mapped to Spark:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - SQreamDB
+     - Spark
+   * - ``BIGINT``
+     - ``LongType``
+   * - ``BOOL``
+     - ``BooleanType``
+   * - ``DATE``
+     - ``DateType``
+   * - ``DOUBLE``
+     - ``DoubleType``
+   * - ``REAL``
+     - ``FloatType``
+   * - ``DECIMAL``
+     - ``DecimalType``
+   * - ``INT``
+     - ``IntegerType``
+   * - ``SMALLINT``
+     - ``ShortType``
+   * - ``TINYINT``
+     - ``ShortType``
+   * - ``DATETIME``
+     - ``TimestampType``
+
+Spark data types mapped to SQreamDB:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Spark
+     - SQreamDB
+   * - ``BooleanType``
+     - ``BOOL``
+   * - ``ByteType``
+     - ``SMALLINT``
+   * - ``DateType``
+     - ``DATE``
+   * - ``DecimalType``
+     - ``DECIMAL``
+   * - ``DoubleType``
+     - ``DOUBLE``
+   * - ``FloatType``
+     - ``REAL``
+   * - ``IntegerType``
+     - ``INT``
+   * - ``LongType``
+     - ``BIGINT``
+   * - ``ShortType``
+     - ``SMALLINT``
+   * - ``StringType``
+     - ``TEXT``
+   * - ``TimestampType``
+     - ``DATETIME``
+
+Example
+=======
+
+JAVA
+
+.. code-block:: java
+
+   import com.sqream.driver.SqreamSession;
+   import org.apache.spark.sql.Dataset;
+   import org.apache.spark.sql.Row;
+   import org.apache.spark.sql.SaveMode;
+
+   import java.util.HashMap;
+
+   public class Main {
+       public static void main(String[] args) {
+           // Spark configuration
+           // Optional configuration: https://spark.apache.org/docs/latest/configuration.html
+           HashMap<String, String> config = new HashMap<>();
+           config.put("spark.master", "spark://localhost:7077");
+           config.put("spark.dynamicAllocation.enabled", "false");
+
+           config.put("spark.driver.port", "7077");
+           config.put("spark.driver.host", "192.168.0.157");
+           config.put("spark.driver.bindAddress", "192.168.0.157");
+
+           SqreamSession sqreamSession = SqreamSession.getSession(config);
+
+           // Spark properties
+           // Optional properties: https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html
+           HashMap<String, String> props = new HashMap<>();
+
+           props.put("url", "jdbc:Sqream://192.168.0.157:3108/master;user=sqream;password=1234;cluster=true;");
+
+           // Spark partitioning options
+           props.put("dbtable", "public.test_table");
+           props.put("partitionColumn", "sr_date_sk");
+           props.put("numPartitions", "2");
+           props.put("lowerBound", "2450820");
+           props.put("upperBound", "2452822");
+
+           /* Read from a SQreamDB table */
+           Dataset<Row> dataFrame = sqreamSession.read(props);
+           /* By default, show() displays only the first 20 rows of the DataFrame.
+              This can be insufficient when working with large datasets. You can customize
+              the number of rows displayed by passing an argument to show(n). */
+           dataFrame.show();
+
+           /* Append to a SQreamDB table */
+           sqreamSession.write(dataFrame, props, SaveMode.Append);
+       }
+   }
\ No newline at end of file
diff --git a/connecting_to_sqream/client_drivers/trino/index.rst b/connecting_to_sqream/client_drivers/trino/index.rst
new file mode 100644
index 000000000..cd424a57f
--- /dev/null
+++ b/connecting_to_sqream/client_drivers/trino/index.rst
@@ -0,0 +1,112 @@
+.. _trino:
+
+*****
+Trino
+*****
+
+If you are using Trino for distributed SQL query processing and wish to use it to connect to SQreamDB, follow these instructions.
+
+.. contents::
+   :local:
+   :depth: 1
+
+Before You Begin
+================
+
+It is essential that you have the following installed:
+
+* SQreamDB version 4.1 or later
+* Trino version 403 or later
+* `Trino Connector `_
+* :ref:`JDBC` version 4.5.6 or later
+
+Installation
+============
+
+The Trino Connector must be installed on each cluster node dedicated to Trino.
+
+1. Create a dedicated directory for the Trino Connector.
+
+2. Download the Trino Connector and extract the content of the ZIP file to the dedicated directory.
+
+Connecting to SQreamDB
+======================
+
+Trino uses catalogs for referencing stored objects such as tables, databases, and functions. Each Trino catalog may be configured with access to a single SQreamDB database. If you wish Trino to have access to more than one SQreamDB database or server, you must create additional catalogs.
+
+Catalogs may be created using ``properties`` files. Start by creating a ``sqream.properties`` file and placing it under ``trino-server/etc/catalog``.
+
+The following is an example of a properties file:
+
+.. code-block:: console
+
+   connector.name=sqream
+   connection-url=jdbc:Sqream://<host>:<port>/<database>;[<parameter>=<value>;...]
+   connection-user=<user name>
+   connection-password=<password>
+
+Supported Data Types and Mapping
+================================
+
+Use the appropriate Trino data type for executing queries. Upon execution, incompatible data types will be converted by Trino to SQreamDB data types.
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Trino type
+     - SQreamDB type
+   * - ``BOOLEAN``
+     - ``BOOL``
+   * - ``TINYINT``
+     - ``TINYINT``
+   * - ``SMALLINT``
+     - ``SMALLINT``
+   * - ``INT``
+     - ``INT``
+   * - ``BIGINT``
+     - ``BIGINT``
+   * - ``REAL``
+     - ``REAL``
+   * - ``DOUBLE``
+     - ``DOUBLE``
+   * - ``DATE``
+     - ``DATE``
+   * - ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``VARCHAR``
+     - ``TEXT``
+   * - ``DECIMAL(P,S)``
+     - ``NUMERIC(P,S)``
+
+Examples
+========
+
+The following is an example of the ``SHOW SCHEMAS FROM`` statement:
+
+.. code-block:: postgres
+
+   SHOW SCHEMAS FROM sqream;
+
+The following is an example of the ``SHOW TABLES FROM`` statement:
+
+.. code-block:: postgres
+
+   SHOW TABLES FROM sqream.public;
+
+The following is an example of the ``DESCRIBE`` statement:
+
+.. code-block:: postgres
+
+   DESCRIBE sqream.public.t;
+
+Limitations
+===========
+
+The Trino Connector does not support the following SQL statements:
+
+* ``GRANT``
+* ``REVOKE``
+* ``SHOW GRANTS``
+* ``SHOW ROLES``
+* ``SHOW ROLE GRANTS``
diff --git a/third_party_tools/client_platforms/connect2.sas b/connecting_to_sqream/client_platforms/connect.sas
similarity index 100%
rename from third_party_tools/client_platforms/connect2.sas
rename to connecting_to_sqream/client_platforms/connect.sas
diff --git a/third_party_tools/client_platforms/connect.sas b/connecting_to_sqream/client_platforms/connect2.sas
similarity index 95%
rename from third_party_tools/client_platforms/connect.sas
rename to connecting_to_sqream/client_platforms/connect2.sas
index 78c670762..10fcdb0a2 100644
--- a/third_party_tools/client_platforms/connect.sas
+++ b/connecting_to_sqream/client_platforms/connect2.sas
@@ -1,27 +1,27 @@
-options sastrace='d,d,d,d'
-sastraceloc=saslog
-nostsuffix
-msglevel=i
-sql_ip_trace=(note,source)
-DEBUG=DBMS_SELECT;
-
-options validvarname=any;
-
-libname sqlib jdbc driver="com.sqream.jdbc.SQDriver"
-   classpath="/opt/sqream/sqream-jdbc-4.0.0.jar"
-   URL="jdbc:Sqream://sqream-cluster.piedpiper.com:3108/raviga;cluster=true"
-   user="rhendricks"
-   password="Tr0ub4dor3"
-   schema="public"
-   PRESERVE_TAB_NAMES=YES
-   PRESERVE_COL_NAMES=YES;
-
-proc sql;
-   title 'Customers table';
-   select *
-   from sqlib.customers;
-quit;
-
-data sqlib.customers;
-   set sqlib.customers;
+options sastrace='d,d,d,d'
+sastraceloc=saslog
+nostsuffix
+msglevel=i
+sql_ip_trace=(note,source)
+DEBUG=DBMS_SELECT;
+
+options validvarname=any;
+
+libname sqlib jdbc driver="com.sqream.jdbc.SQDriver"
+   classpath="/opt/sqream/sqream-jdbc-4.0.0.jar"
+   URL="jdbc:Sqream://sqream-cluster.piedpiper.com:3108/raviga;cluster=true"
+   user="rhendricks"
+   password="Tr0ub4dor3"
+   schema="public"
+   PRESERVE_TAB_NAMES=YES
+   PRESERVE_COL_NAMES=YES;
+
+proc sql;
+   title 'Customers table';
+   select *
+   from sqlib.customers;
+quit;
+
+data sqlib.customers;
+   set sqlib.customers;
 run;
\ No newline at end of file
diff --git a/third_party_tools/client_platforms/connect3.sas b/connecting_to_sqream/client_platforms/connect3.sas
similarity index 100%
rename from third_party_tools/client_platforms/connect3.sas
rename to connecting_to_sqream/client_platforms/connect3.sas
diff --git a/connecting_to_sqream/client_platforms/denodo.rst b/connecting_to_sqream/client_platforms/denodo.rst
new file mode 100644
index 000000000..35eb33301
--- /dev/null
+++ b/connecting_to_sqream/client_platforms/denodo.rst
@@ -0,0 +1,82 @@
+.. 
_denodo: + +*************** +Denodo Platform +*************** + +Denodo Platform is a data virtualization solution that enables integration, access, and real-time data delivery from disparate on-premises and cloud-based sources. + +Before You Begin +================ + +It is essential that you have the following installed: + +* Denodo 8.0 +* Java 17 + +Setting Up a Connection to SQreamDB +=================================== + +#. Under ``Denodo\DenodoPlatform8.0\lib\extensions\jdbc-drivers``, create a directory named ``sqream``. + +#. Download the SQreamDB JDBC Connector :ref:`.jar file ` and save it under the newly created ``sqream`` directory. + +#. In the Denodo Platform menu, go to **File** > **New** > **Data Source** > **JDBC**. + + A connection dialog box is displayed. + +#. Under the **Configuration** tab, select the **Connection** tab and fill in the data source information: + + .. list-table:: + :widths: auto + :header-rows: 1 + + * - Field name + - Description + - Value + - Example + * - Name + - The name of the data source + - ``sqream`` + - + * - Database adapter + - The database adapter allows Denodo Platform to communicate and interact with SQreamDB + - ``Generic`` + - + * - Driver class path + - The path to the location of the JDBC driver required for the connection to the data source + - + - ``path/to/jdbcdriver/sqream-jdbc-x.x.x`` + * - Driver class + - The class name of the JDBC driver used to connect to the data source + - ``com.sqream.jdbc.SQDriver`` + - + * - Database URI + - The URI that specifies the location and details of the database or data source to be connected + - ``jdbc:Sqream:///;[; ...]`` + - + * - Transaction isolation + - The level of isolation used to manage concurrent transactions in the database connection, ensuring data consistency and integrity + - ``Database default`` + - + * - Authentication + - Authentication method + - ``Use login and password`` + - + * - Login + - The SQreamDB role + - + - ``SqreamRole`` + * - Password + - The SQreamDB role password + - + - ``SqreamRolePassword2023`` + +5. To verify your newly created connection, select the **Test connection** button. + +.. note:: When adding the JDBC driver in Denodo, it's important to note that a restart of Denodo may be required. Additionally, in some cases, the SQream driver may not immediately appear in the list of available JDBC drivers. If you encounter this issue, a simple solution is to reboot the machine and attempt the process again. + +Limitation +========== + +When working with table joins involving columns with identical names and exporting a view as a REST service, the query transformation process can introduce ambiguity due to the indistinguishable column identifiers. This ambiguity may result in unresolved column references during query execution, necessitating thoughtful aliasing or disambiguation strategies to ensure accurate results. diff --git a/connecting_to_sqream/client_platforms/index.rst b/connecting_to_sqream/client_platforms/index.rst new file mode 100644 index 000000000..5c36ff281 --- /dev/null +++ b/connecting_to_sqream/client_platforms/index.rst @@ -0,0 +1,78 @@ +.. _client_platforms: + +************************************ +Client Platforms +************************************ + +SQreamDB is designed to work with the most common database tools and interfaces, allowing you direct access through a variety of drivers, connectors, visualization tools, and utilities. + +.. 
figure:: /_static/images/SQDBArchitecture.png + :align: right + :width: 800 + +Data Integration Tools +---------------------- + +:ref:`Informatica Cloud Services` + +:ref:`Pentaho Data Integration and Analytics` + +:ref:`Talend` + +:ref:`Semarchy` + +:ref:`SQL Workbench` + +Business Intelligence (BI) Tools +-------------------------------- + +:ref:`Denodo` + +:ref:`MicroStrategy` + +:ref:`Power BI Desktop` + +:ref:`SAP BusinessObjects` + +:ref:`SAS Viya` + +:ref:`Tableau` + +:ref:`TIBCO Spotfire` + +Data Analysis and Programming Languages +--------------------------------------- + +:ref:`PHP` + +:ref:`R` + + +.. toctree:: + :maxdepth: 4 + :titlesonly: + :hidden: + + denodo + informatica + microstrategy + pentaho + php + power_bi + r + sap_businessobjects + sas_viya + semarchy + sql_workbench + tableau + talend + tibco_spotfire + + + + + + + + + diff --git a/third_party_tools/client_platforms/informatica.rst b/connecting_to_sqream/client_platforms/informatica.rst similarity index 87% rename from third_party_tools/client_platforms/informatica.rst rename to connecting_to_sqream/client_platforms/informatica.rst index 6bc50b22a..b8b5e9cc2 100644 --- a/third_party_tools/client_platforms/informatica.rst +++ b/connecting_to_sqream/client_platforms/informatica.rst @@ -1,11 +1,12 @@ .. _informatica: -************************* -Connect to SQream Using Informatica Cloud Services -************************* +************************** +Informatica Cloud Services +************************** Overview -========= +======== + The **Connecting to SQream Using Informatica Cloud Services** page is quick start guide for connecting to SQream using Informatica cloud services. It describes the following: @@ -14,20 +15,17 @@ It describes the following: :local: Establishing a Connection between SQream and Informatica ------------------ +-------------------------------------------------------- + The **Establishing a Connection between SQream and Informatica** page describes how to establish a connection between SQream and the Informatica data integration Cloud. **To establish a connection between SQream and the Informatica data integration Cloud:** 1. Go to the `Informatica Cloud homepage `_. - :: - 2. Do one of the following: 1. Log in using your credentials. - - :: 2. Log in using your SAML Identity Provider. @@ -36,26 +34,16 @@ The **Establishing a Connection between SQream and Informatica** page describes The SQream dashboard is displayed. - - :: - - 4. In the menu on the left, click **Runtime Environments**. The **Runtime Environments** panel is displayed. - :: - 5. Click **Download Secure Agent**. - :: - 6. When the **Download the Secure Agent** panel is displayed, do the following: 1. Select a platform (Windows 64 or Linux 64). - - :: 2. Click **Copy** and save the token on your local hard drive. @@ -66,34 +54,25 @@ The **Establishing a Connection between SQream and Informatica** page describes 7. Click **Download**. The installation begins. - - :: + 8. When the **Informatica Cloud Secure Agent Setup** panel is displayed, click **Next**. - :: - - 9. Provide your **User Name** and **Install Token** and click **Register**. - :: - - 10. From the Runtime Environments panel, click **New Runtime Environment**. The **New Secure Agent Group** window is displayed. - - :: 11. On the New Secure Agent Group window, click **OK** to connect your Runtime Environment with the running agent. .. 
note:: If you do not download Secure Agent, you will not be able to connect your Runtime Environment with the running agent and continue establishing a connection between SQream and the Informatica data integration Cloud. Establishing a Connection In Your Environment ------------------ +--------------------------------------------- The **Establishing a Connection In Your Environment** describes the following: @@ -101,35 +80,29 @@ The **Establishing a Connection In Your Environment** describes the following: :local: Establishing an ODBC DSN Connection In Your Environment -~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + After establishing a connection between SQream and Informatica you can establish an ODBC DSN connection in your environment. **To establish an ODBC connection in your environment:** 1. Click **Add**. - - :: 2. Click **Configure**. .. note:: Verify that **Use Server Picker** is selected. 3. Click **Test**. - - :: 4. Verify that the connection has tested successfully. - - :: 5. Click **Save**. - - :: 6. Click **Actions** > **Publish**. Establishing a JDBC Connection In Your Environment -~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + After establishing a connection between SQream and Informatica you can establish a JDBC connection in your environment. **To establish a JDBC connection in your environment:** @@ -137,37 +110,26 @@ After establishing a connection between SQream and Informatica you can establish 1. Create a new DB connection by clicking **Connections** > **New Connection**. The **New Connection** window is displayed. - - :: 2. In the **JDBC_IC Connection Properties** section, in the **JDBC Connection URL** field, establish a JDBC connection by providing the correct connection string. - For connection string examples, see `Connection Strings `_. - - :: + For connection string examples, see :ref:`Connection Strings`. + 3. Click **Test**. - - :: 4. Verify that the connection has tested successfully. - - :: 5. Click **Save**. - - :: 6. Click **Actions** > **Publish**. Supported SQream Driver Versions ---------------- +-------------------------------- SQream supports the following SQream driver versions: * **JDBC** - Version 4.3.4 and above. - :: - * **ODBC** - Version 4.0.0 and above. diff --git a/third_party_tools/client_platforms/microstrategy.rst b/connecting_to_sqream/client_platforms/microstrategy.rst similarity index 84% rename from third_party_tools/client_platforms/microstrategy.rst rename to connecting_to_sqream/client_platforms/microstrategy.rst index 6d2be281f..87f688a7b 100644 --- a/third_party_tools/client_platforms/microstrategy.rst +++ b/connecting_to_sqream/client_platforms/microstrategy.rst @@ -1,185 +1,146 @@ -.. _microstrategy: - - -************************* -Connect to SQream Using MicroStrategy -************************* - -.. _ms_top: - -Overview ---------------- -This document is a Quick Start Guide that describes how to install MicroStrategy and connect a datasource to the MicroStrategy dasbhoard for analysis. - - - -The **Connecting to SQream Using MicroStrategy** page describes the following: - - -.. contents:: - :local: - - - - - - -What is MicroStrategy? -================ -MicroStrategy is a Business Intelligence software offering a wide variety of data analytics capabilities. SQream uses the MicroStrategy connector for reading and loading data into SQream. 
- -MicroStrategy provides the following: - -* Data discovery -* Advanced analytics -* Data visualization -* Embedded BI -* Banded reports and statements - - -For more information about Microstrategy, see `MicroStrategy `_. - - - -:ref:`Back to Overview ` - - - - - -Connecting a Data Source -======================= - -1. Activate the **MicroStrategy Desktop** app. The app displays the Dossiers panel to the right. - - :: - -2. Download the most current version of the `SQream JDBC driver `_. - - :: - -3. Click **Dossiers** and **New Dossier**. The **Untitled Dossier** panel is displayed. - - :: - -4. Click **New Data**. - - :: - -5. From the **Data Sources** panel, select **Databases** to access data from tables. The **Select Import Options** panel is displayed. - - :: - -6. Select one of the following: - - * Build a Query - * Type a Query - * Select Tables - - :: - -7. Click **Next**. - - :: - -8. In the Data Source panel, do the following: - - 1. From the **Database** dropdown menu, select **Generic**. The **Host Name**, **Port Number**, and **Database Name** fields are removed from the panel. - - :: - - 2. In the **Version** dropdown menu, verify that **Generic DBMS** is selected. - - :: - - 3. Click **Show Connection String**. - - :: - - 4. Select the **Edit connection string** checkbox. - - :: - - 5. From the **Driver** dropdown menu, select a driver for one of the following connectors: - - * **JDBC** - The SQream driver is not integrated with MicroStrategy and does not appear in the dropdown menu. However, to proceed, you must select an item, and in the next step you must specify the path to the SQream driver that you installed on your machine. - * **ODBC** - SQreamDB ODBC - - :: - - 6. In the **Connection String** text box, type the relevant connection string and path to the JDBC jar file using the following syntax: - - .. code-block:: console - - $ jdbc:Sqream:///;user=;password=sqream;[; ...] - - The following example shows the correct syntax for the JDBC connector: - - .. code-block:: console - - jdbc;MSTR_JDBC_JAR_FOLDER=C:\path\to\jdbc\folder;DRIVER=;URL={jdbc:Sqream:///;user=;password=;[; ...];} - - The following example shows the correct syntax for the ODBC connector: - - .. code-block:: console - - odbc:Driver={SqreamODBCDriver};DSN={SQreamDB ODBC};Server=;Port=;Database=;User=;Password=;Cluster=; - - For more information about the available **connection parameters** and other examples, see `Connection Parameters `_. - - 7. In the **User** and **Password** fields, fill out your user name and password. - - :: - - 8. In the **Data Source Name** field, type **SQreamDB**. - - :: - - 9. Click **Save**. The SQreamDB that you picked in the Data Source panel is displayed. - - -9. In the **Namespace** menu, select a namespace. The tables files are displayed. - - :: - -10. Drag and drop the tables into the panel on the right in your required order. - - :: - -11. **Recommended** - Click **Prepare Data** to customize your data for analysis. - - :: - -12. Click **Finish**. - - :: - -13. From the **Data Access Mode** dialog box, select one of the following: - - - * Connect Live - * Import as an In-memory Dataset - -Your populated dashboard is displayed and is ready for data discovery and analytics. - - - - - - -.. _supported_sqream_drivers: - -:ref:`Back to Overview ` - -Supported SQream Drivers -================ - -The following list shows the supported SQream drivers and versions: - -* **JDBC** - Version 4.3.3 and higher. -* **ODBC** - Version 4.0.0. - - -.. 
_supported_tools_and_operating_systems: - -:ref:`Back to Overview ` +.. _micro_strategy: + +************* +MicroStrategy +************* + +.. _ms_top: + +Overview +-------- + +This document is a Quick Start Guide that describes how to install MicroStrategy and connect a datasource to the MicroStrategy dasbhoard for analysis. + + + +The **Connecting to SQream Using MicroStrategy** page describes the following: + + +.. contents:: + :local: + + +What is MicroStrategy? +====================== + +MicroStrategy is a Business Intelligence software offering a wide variety of data analytics capabilities. SQream uses the MicroStrategy connector for reading and loading data into SQream. + +MicroStrategy provides the following: + +* Data discovery +* Advanced analytics +* Data visualization +* Embedded BI +* Banded reports and statements + + +For more information about Microstrategy, see `MicroStrategy `_. + + + +:ref:`Back to Overview ` + + + + + +Connecting a Data Source +======================== + +1. Activate the **MicroStrategy Desktop** app. The app displays the Dossiers panel to the right. + +2. Download the most current version of the `SQream JDBC driver `_. + +3. Click **Dossiers** and **New Dossier**. The **Untitled Dossier** panel is displayed. + +4. Click **New Data**. + +5. From the **Data Sources** panel, select **Databases** to access data from tables. The **Select Import Options** panel is displayed. + +6. Select one of the following: + + * Build a Query + * Type a Query + * Select Tables + +7. Click **Next**. + +8. In the Data Source panel, do the following: + + 1. From the **Database** dropdown menu, select **Generic**. The **Host Name**, **Port Number**, and **Database Name** fields are removed from the panel. + + 2. In the **Version** dropdown menu, verify that **Generic DBMS** is selected. + + 3. Click **Show Connection String**. + + 4. Select the **Edit connection string** checkbox. + + 5. From the **Driver** dropdown menu, select a driver for one of the following connectors: + + * **JDBC** - The SQream driver is not integrated with MicroStrategy and does not appear in the dropdown menu. However, to proceed, you must select an item, and in the next step you must specify the path to the SQream driver that you installed on your machine. + * **ODBC** - SQreamDB ODBC + + 6. In the **Connection String** text box, type the relevant connection string and path to the JDBC jar file using the following syntax: + + .. code-block:: console + + $ jdbc:Sqream:///;user=;password=sqream;[; ...] + + The following example shows the correct syntax for the JDBC connector: + + .. code-block:: console + + jdbc;MSTR_JDBC_JAR_FOLDER=C:\path\to\jdbc\folder;DRIVER=;URL={jdbc:Sqream:///;user=;password=;[; ...];} + + The following example shows the correct syntax for the ODBC connector: + + .. code-block:: console + + odbc:Driver={SqreamODBCDriver};DSN={SQreamDB ODBC};Server=;Port=;Database=;User=;Password=;Cluster=; + + For more information about the available **connection parameters** and other examples, see :ref:`Connection Parameters `. + + 7. In the **User** and **Password** fields, fill out your user name and password. + + 8. In the **Data Source Name** field, type **SQreamDB**. + + 9. Click **Save**. The SQreamDB that you picked in the Data Source panel is displayed. + + +9. In the **Namespace** menu, select a namespace. The tables files are displayed. + +10. Drag and drop the tables into the panel on the right in your required order. + +11. 
**Recommended** - Click **Prepare Data** to customize your data for analysis. + +12. Click **Finish**. + +13. From the **Data Access Mode** dialog box, select one of the following: + + + * Connect Live + * Import as an In-memory Dataset + +Your populated dashboard is displayed and is ready for data discovery and analytics. + + + + + + +.. _supported_sqream_drivers: + +:ref:`Back to Overview ` + +Supported SQream Drivers +======================== + +The following list shows the supported SQream drivers and versions: + +* **JDBC** - Version 4.3.3 and higher. +* **ODBC** - Version 4.0.0. + + +.. _supported_tools_and_operating_systems: + +:ref:`Back to Overview ` diff --git a/third_party_tools/client_platforms/odbc-sqream.tdc b/connecting_to_sqream/client_platforms/odbc-sqream.tdc similarity index 98% rename from third_party_tools/client_platforms/odbc-sqream.tdc rename to connecting_to_sqream/client_platforms/odbc-sqream.tdc index f1bbe279d..36cd55e33 100644 --- a/third_party_tools/client_platforms/odbc-sqream.tdc +++ b/connecting_to_sqream/client_platforms/odbc-sqream.tdc @@ -1,25 +1,25 @@ - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/third_party_tools/client_platforms/pentaho.rst b/connecting_to_sqream/client_platforms/pentaho.rst similarity index 62% rename from third_party_tools/client_platforms/pentaho.rst rename to connecting_to_sqream/client_platforms/pentaho.rst index 1cd95866f..89ae1e759 100644 --- a/third_party_tools/client_platforms/pentaho.rst +++ b/connecting_to_sqream/client_platforms/pentaho.rst @@ -1,249 +1,206 @@ -.. _pentaho_data_integration: - -************************* -Connect to SQream Using Pentaho Data Integration -************************* -.. _pentaho_top: - -Overview -========= -This document is a Quick Start Guide that describes how to install Pentaho, create a transformation, and define your output. - -The Connecting to SQream Using Pentaho page describes the following: - -* :ref:`Installing Pentaho ` -* :ref:`Installing and setting up the JDBC driver ` -* :ref:`Creating a transformation ` -* :ref:`Defining your output ` -* :ref:`Importing your data ` - -.. _install_pentaho: - -Installing Pentaho -~~~~~~~~~~~~~~~~~ -To install PDI, see the `Pentaho Community Edition (CE) Installation Guide `_. - -The **Pentaho Community Edition (CE) Installation Guide** describes how to do the following: - -* Downloading the PDI software. -* Installing the **JRE (Java Runtime Environment)** and **JDK (Java Development Kit)**. -* Setting up the JRE and JDK environment variables for PDI. - -:ref:`Back to Overview ` - -.. _install_set_up_jdbc_driver: - -Installing and Setting Up the JDBC Driver -~~~~~~~~~~~~~~~~~ -After installing Pentaho you must install and set up the JDBC driver. This section explains how to set up the JDBC driver using Pentaho. These instructions use Spoon, the graphical transformation and job designer associated with the PDI suite. - -You can install the driver by copying and pasting the SQream JDBC .jar file into your **/design-tools/data-integration/lib** directory. - -**NOTE:** Contact your SQream license account manager for the JDBC .jar file. - -:ref:`Back to Overview ` - -.. _create_transformation: - -Creating a Transformation -~~~~~~~~~~~~~~~~~~ -After installing Pentaho you can create a transformation. - -**To create a transformation:** - -1. Use the CLI to open the PDI client for your operating system (Windows): - - .. code-block:: console - - $ spoon.bat - -2. 
Open the spoon.bat file from its folder location. - -:: - -3. In the **View** tab, right-click **Transformations** and click **New**. - -A new transformation tab is created. - -4. In the **Design** tab, click **Input** to show its file contents. - -:: - -5. Drag and drop the **CSV file input** item to the new transformation tab that you created. - -:: - -6. Double-click **CSV file input**. The **CSV file input** panel is displayed. - -:: - -7. In the **Step name** field, type a name. - -:: - -8. To the right of the **Filename** field, click **Browse**. - -:: - -9. Select the file that you want to read from and click **OK**. - -:: - -10. In the CSV file input window, click **Get Fields**. - -:: - -11. In the **Sample data** window, enter the number of lines you want to sample and click **OK**. The default setting is **100**. - -The tool reads the file and suggests the field name and type. - -12. In the CSV file input window, click **Preview**. - -:: - -13. In the **Preview size** window, enter the number of rows you want to preview and click **OK**. The default setting is **1000**. - -:: - -14. Verify that the preview data is correct and click **Close**. - -:: - -15. Click **OK** in the **CSV file input** window. - -:ref:`Back to Overview ` - -.. _define_output: - -Defining Your Output ------------------ -After creating your transformation you must define your output. - -**To define your output:** - -1. In the **Design** tab, click **Output**. - - The Output folder is opened. - -2. Drag and drop **Table output** item to the Transformation window. - -:: - -3. Double-click **Table output** to open the **Table output** dialog box. - -:: - -4. From the **Table output** dialog box, type a **Step name** and click **New** to create a new connection. Your **steps** are the building blocks of a transformation, such as file input or a table output. - -The **Database Connection** window is displayed with the **General** tab selected by default. - -5. Enter or select the following information in the Database Connection window and click **Test**. - -The following table shows and describes the information that you need to fill out in the Database Connection window: - -.. list-table:: - :widths: 6 31 73 - :header-rows: 1 - - * - No. - - Element Name - - Description - * - 1 - - Connection name - - Enter a name that uniquely describes your connection, such as **sampledata**. - * - 2 - - Connection type - - Select **Generic database**. - * - 3 - - Access - - Select **Native (JDBC)**. - * - 4 - - Custom connection URL - - Insert **jdbc:Sqream:///;user=;password=;[; ...];**. The IP is a node in your SQream cluster and is the name or schema of the database you want to connect to. Verify that you have not used any leading or trailing spaces. - * - 5 - - Custom driver class name - - Insert **com.sqream.jdbc.SQDriver**. Verify that you have not used any leading or trailing spaces. - * - 6 - - Username - - Your SQreamdb username. If you leave this blank, you will be prompted to provide it when you connect. - * - 7 - - Password - - Your password. If you leave this blank, you will be prompted to provide it when you connect. - -The following message is displayed: - -.. image:: /_static/images/third_party_connectors/pentaho/connection_tested_successfully_2.png - -6. Click **OK** in the window above, in the Database Connection window, and Table Output window. - -:ref:`Back to Overview ` - -.. _import_data: - -Importing Data ------------------ -After defining your output you can begin importing your data. 
- -For more information about backing up users, permissions, or schedules, see `Backup and Restore Pentaho Repositories `_ - -**To import data:** - -1. Double-click the **Table output** connection that you just created. - -:: - -2. To the right of the **Target schema** field, click **Browse** and select a schema name. - -:: - -3. Click **OK**. The selected schema name is displayed in the **Target schema** field. - -:: - -4. Create a new hop connection between the **CSV file input** and **Table output** steps: - - 1. On the CSV file input step item, click the **new hop connection** icon. - - .. image:: /_static/images/third_party_connectors/pentaho/csv_file_input_options.png - - 2. Drag an arrow from the **CSV file input** step item to the **Table output** step item. - - .. image:: /_static/images/third_party_connectors/pentaho/csv_file_input_options_2.png - - 3. Release the mouse button. The following options are displayed. - - 4. Select **Main output of step**. - - .. image:: /_static/images/third_party_connectors/pentaho/main_output_of_step.png - -:: - -5. Double-click **Table output** to open the **Table output** dialog box. - -:: - -6. In the **Target table** field, define a target table name. - -:: - -7. Click **SQL** to open the **Simple SQL editor.** - -:: - -8. In the **Simple SQL editor**, click **Execute**. - - The system processes and displays the results of the SQL statements. - -9. Close all open dialog boxes. - -:: - -10. Click the play button to execute the transformation. - - .. image:: /_static/images/third_party_connectors/pentaho/execute_transformation.png - - The **Run Options** dialog box is displayed. - -11. Click **Run**. The **Execution Results** are displayed. - -:ref:`Back to Overview ` +.. _pentaho_data_integration: + +************************ +Pentaho Data Integration +************************ +.. _pentaho_top: + +Overview +======== + +This document is a Quick Start Guide that describes how to install Pentaho, create a transformation, and define your output. + +The Connecting to SQream Using Pentaho page describes the following: + +* :ref:`Installing Pentaho ` +* :ref:`Installing and setting up the JDBC driver ` +* :ref:`Creating a transformation ` +* :ref:`Defining your output ` +* :ref:`Importing your data ` + +.. _install_pentaho: + +Installing Pentaho +~~~~~~~~~~~~~~~~~~ + +To install PDI, see the `Pentaho Community Edition (CE) Installation Guide `_. + +The **Pentaho Community Edition (CE) Installation Guide** describes how to do the following: + +* Downloading the PDI software. +* Installing the **JRE (Java Runtime Environment)** and **JDK (Java Development Kit)**. +* Setting up the JRE and JDK environment variables for PDI. + +:ref:`Back to Overview ` + +.. _install_set_up_jdbc_driver: + +Installing and Setting Up the JDBC Driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After installing Pentaho you must install and set up the JDBC driver. This section explains how to set up the JDBC driver using Pentaho. These instructions use Spoon, the graphical transformation and job designer associated with the PDI suite. + +You can install the driver by copying and pasting the SQream JDBC .jar file into your **/design-tools/data-integration/lib** directory. + +:ref:`Back to Overview ` + +.. _create_transformation: + +Creating a Transformation +~~~~~~~~~~~~~~~~~~~~~~~~~ + +After installing Pentaho you can create a transformation. + +**To create a transformation:** + +1. Use the CLI to open the PDI client for your operating system (Windows): + + .. 
code-block:: console + + $ spoon.bat + +2. Open the spoon.bat file from its folder location. + +3. In the **View** tab, right-click **Transformations** and click **New**. + + A new transformation tab is created. + +4. In the **Design** tab, click **Input** to show its file contents. + +5. Drag and drop the **CSV file input** item to the new transformation tab that you created. + +6. Double-click **CSV file input**. The **CSV file input** panel is displayed. + +7. In the **Step name** field, type a name. + +8. To the right of the **Filename** field, click **Browse**. + +9. Select the file that you want to read from and click **OK**. + +10. In the CSV file input window, click **Get Fields**. + +11. In the **Sample data** window, enter the number of lines you want to sample and click **OK**. The default setting is **100**. + + The tool reads the file and suggests the field name and type. + +12. In the CSV file input window, click **Preview**. + +13. In the **Preview size** window, enter the number of rows you want to preview and click **OK**. The default setting is **1000**. + +14. Verify that the preview data is correct and click **Close**. + +15. Click **OK** in the **CSV file input** window. + +:ref:`Back to Overview ` + +.. _define_output: + +Defining Your Output +-------------------- + +After creating your transformation you must define your output. + +**To define your output:** + +1. In the **Design** tab, click **Output**. + + The Output folder is opened. + +2. Drag and drop **Table output** item to the Transformation window. + +3. Double-click **Table output** to open the **Table output** dialog box. + +4. From the **Table output** dialog box, type a **Step name** and click **New** to create a new connection. Your **steps** are the building blocks of a transformation, such as file input or a table output. + + The **Database Connection** window is displayed with the **General** tab selected by default. + +5. Enter or select the following information in the Database Connection window and click **Test**. + + The following table shows and describes the information that you need to fill out in the Database Connection window: + + .. list-table:: + :widths: 6 31 73 + :header-rows: 1 + + * - No. + - Element Name + - Description + * - 1 + - Connection name + - Enter a name that uniquely describes your connection, such as **sampledata**. + * - 2 + - Connection type + - Select **Generic database**. + * - 3 + - Access + - Select **Native (JDBC)**. + * - 4 + - Custom connection URL + - Insert **jdbc:Sqream:///;user=;password=;[; ...];**. The IP is a node in your SQream cluster and is the name or schema of the database you want to connect to. Verify that you have not used any leading or trailing spaces. + * - 5 + - Custom driver class name + - Insert **com.sqream.jdbc.SQDriver**. Verify that you have not used any leading or trailing spaces. + * - 6 + - Username + - Your SQreamdb username. If you leave this blank, you will be prompted to provide it when you connect. + * - 7 + - Password + - Your password. If you leave this blank, you will be prompted to provide it when you connect. + + +6. Click **OK** in the window above, in the Database Connection window, and Table Output window. + +:ref:`Back to Overview ` + +.. _import_data: + +Importing Data +-------------- + +After defining your output you can begin importing your data. + +For more information about backing up users, permissions, or schedules, see `Backup and Restore Pentaho Repositories `_ + +**To import data:** + +1. 
Double-click the **Table output** connection that you just created. + +2. To the right of the **Target schema** field, click **Browse** and select a schema name. + +3. Click **OK**. The selected schema name is displayed in the **Target schema** field. + +4. Create a new hop connection between the **CSV file input** and **Table output** steps: + + a. On the CSV file input step item, click the **new hop connection** icon. + + + b. Drag an arrow from the **CSV file input** step item to the **Table output** step item. + + + c. Release the mouse button. The following options are displayed. + + + d. Select **Main output of step**. + +5. Double-click **Table output** to open the **Table output** dialog box. + +6. In the **Target table** field, define a target table name. + +7. Click **SQL** to open the **Simple SQL editor.** + +8. In the **Simple SQL editor**, click **Execute**. + + The system processes and displays the results of the SQL statements. + +9. Close all open dialog boxes. + +10. Click the play button to execute the transformation. + + + The **Run Options** dialog box is displayed. + +11. Click **Run**. + + The **Execution Results** are displayed. + +:ref:`Back to Overview ` diff --git a/connecting_to_sqream/client_platforms/php.rst b/connecting_to_sqream/client_platforms/php.rst new file mode 100644 index 000000000..f9f90ce85 --- /dev/null +++ b/connecting_to_sqream/client_platforms/php.rst @@ -0,0 +1,70 @@ +.. _php: + +*** +PHP +*** + +Overview +======== + +PHP is an open source scripting language that executes scripts on servers. The **Connect to PHP** page explains how to connect to a SQream cluster, and describes the following: + +.. contents:: + :local: + :depth: 1 + +Installing PHP +-------------- + +**To install PHP:** + +1. Download the JDBC driver installer from the `SQream Drivers page `_. + +2. Create a DSN. + +3. Install the **uODBC** extension for your PHP installation. + + For more information, navigate to `PHP Documentation `_ and see the topic menu on the right side of the page. + +Configuring PHP +--------------- + +You can configure PHP in one of the following ways: + +* When compiling, configure PHP to enable uODBC using ``./configure --with-pdo-odbc=unixODBC,/usr/local``. + +* Install ``php-odbc`` and ``php-pdo`` along with PHP using your distribution package manager. SQream recommends a minimum of version 7.1 for the best results. + +.. note:: PHP's string size limitations truncates fetched text, which you can override by doing one of the following: + + * Increasing the **php.ini** default setting, such as the *odbc.defaultlrl* to **10000**. + + * Setting the size limitation in your code before making your connection using **ini_set("odbc.defaultlrl", "10000");**. + + * Setting the size limitation in your code before fetchng your result using **odbc_longreadlen($result, "10000");**. + +Operating PHP +------------- + +After configuring PHP, you can test your connection. + +**To test your connection:** + +1. Create a test connection file using the correct parameters for your SQream installation, as shown below: + + .. literalinclude:: test.php + :language: php + :emphasize-lines: 4 + :linenos: + + For more information, download the sample :download:`PHP example connection file ` shown above. + + The following is an example of a valid DSN line: + + .. code-block:: php + + $dsn = "odbc:Driver={SqreamODBCDriver};Server=192.168.0.5;Port=5000;Database=master;User=rhendricks;Password=super_secret;Service=sqream"; + +2. 
Run the PHP file either directly with PHP (``php test.php``) or through a browser. + + For more information about supported DSN parameters, see :ref:`dsn_params`. \ No newline at end of file diff --git a/third_party_tools/client_platforms/power_bi.rst b/connecting_to_sqream/client_platforms/power_bi.rst similarity index 55% rename from third_party_tools/client_platforms/power_bi.rst rename to connecting_to_sqream/client_platforms/power_bi.rst index 3b9f662bd..d29aca30c 100644 --- a/third_party_tools/client_platforms/power_bi.rst +++ b/connecting_to_sqream/client_platforms/power_bi.rst @@ -1,11 +1,9 @@ .. _power_bi: -************************* -Connect to SQream Using Power BI Desktop -************************* +********** +BI Desktop +********** -Overview -========= **Power BI Desktop** lets you connect to SQream and use underlying data as with other data sources in Power BI Desktop. SQream integrates with Power BI Desktop to do the following: @@ -22,7 +20,7 @@ SQream integrates with Power BI Desktop to do the following: SQream uses Power BI for extracting data sets using the following methods: -* **Direct query** - Direct queries lets you connect easily with no errors, and refreshes Power BI artifacts, such as graphs and reports, in a considerable amount of time in relation to the time taken for queries to run using the `SQream SQL CLI Reference guide `_. +* **Direct query** - Direct queries let you connect easily with no errors, and refresh Power BI artifacts, such as graphs and reports, in a considerable amount of time in relation to the time taken for queries to run using the :ref:`SQream SQL CLI Reference guide `. :: @@ -35,7 +33,8 @@ The **Connect to SQream Using Power BI** page describes the following: :depth: 1 Prerequisites -------------------- +------------- + To connect to SQream, the following must be installed: * **ODBC data source administrator** - 32 or 64, depending on your operating system. For Windows users, the ODBC data source administrator is embedded within the operating system. @@ -43,49 +42,47 @@ To connect to SQream, the following must be installed: * **SQream driver** - The SQream application required for interacting with the ODBC according to the configuration specified in the ODBC administrator tool. Installing Power BI Desktop -------------------- +--------------------------- + **To install Power BI Desktop:** -1. Download `Power BI Desktop 64x `_. +#. Download `Power BI Desktop 64x `_. :: -2. Download and configure your ODBC driver. +#. Download and configure your ODBC driver. - For more information about configuring your ODBC driver, see `ODBC `_. + For information about downloading and configuring your ODBC driver, see :ref:`ODBC ` or contact `SQream Support `_. -3. Navigate to **Windows** > **Documents** and create a folder called **Power BI Desktop Custom Connectors**. +#. Navigate to **Windows** > **Documents** and create a folder named **Power BI Desktop** with a subfolder named **Custom Connectors**. :: - -4. In the **Power BI Desktop** folder, create a folder called **Custom Connectors**. - -5. From the Client Drivers page, download the **PowerQuery.mez** file. +#. From the Client Drivers page, :ref:`download` the **PowerQuery.mez** file. :: -5. Save the PowerQuery.mez file in the **Custom Connectors** folder you created in Step 3. +#. Save the PowerQuery.mez file in the **Custom Connectors** folder you created in Step 3. :: -6. Open the Power BI application. +#. Open the Power BI application. :: -7. 
Navigate to **File** > **Options and Settings** > **Option** > **Security** > **Data Extensions**, and select **(Not Recommended) Allow any extension to load without validation or warning**. +#. Navigate to **File** > **Options and Settings** > **Option** > **Security** > **Data Extensions**, and select **(Not Recommended) Allow any extension to load without validation or warning**. :: -8. Restart the Power BI Desktop application. +#. Restart the Power BI Desktop application. :: -9. From the **Get Data** menu, select **SQream**. +#. From the **Get Data** menu, select **SQream**. :: -10. Click **Connect** and provide the information shown in the following table: +#. Click **Connect** and provide the information shown in the following table: .. list-table:: :widths: 6 31 @@ -104,18 +101,19 @@ Installing Power BI Desktop * - Passwords - Provide a password for your user. -11. Under **Data Connectivity mode**, select **DirectQuery mode**. +#. Under **Data Connectivity mode**, select **DirectQuery mode**. :: -12. Click **Connect**. +#. Click **Connect**. :: -13. Provide your user name and password and click **Connect**. +#. Provide your user name and password and click **Connect**. Best Practices for Power BI ---------------- +--------------------------- + SQream recommends using Power BI in the following ways for acquiring the best performance metrics: * Creating bar, pie, line, or plot charts when illustrating one or more columns. @@ -128,16 +126,4 @@ SQream recommends using Power BI in the following ways for acquiring the best pe * Creating a unified view using **PowerQuery** to connect different data sources into a single dashboard. -Supported SQream Driver Versions ---------------- -SQream supports the following SQream driver versions: - -* The **PowerQuery Connector** is an additional layer on top of the ODBC. - - :: - -* SQream Driver Installation (ODBC v4.1.1) - Contact your administrator for the link to download ODBC v4.1.1. -Related Information -------------------- -For more information, see the `Glossary `_. \ No newline at end of file diff --git a/third_party_tools/client_platforms/r.rst b/connecting_to_sqream/client_platforms/r.rst similarity index 90% rename from third_party_tools/client_platforms/r.rst rename to connecting_to_sqream/client_platforms/r.rst index 6abe27031..074baf5ff 100644 --- a/third_party_tools/client_platforms/r.rst +++ b/connecting_to_sqream/client_platforms/r.rst @@ -1,151 +1,151 @@ -.. _r: - -***************************** -Connect to SQream Using R -***************************** - -You can use R to interact with a SQream DB cluster. - -This tutorial is a guide that will show you how to connect R to SQream DB. - -.. contents:: In this topic: - :local: - -JDBC -========= - - -#. Get the :ref:`SQream DB JDBC driver`. - -#. - In R, install RJDBC - - .. code-block:: rconsole - - > install.packages("RJDBC") - Installing package into 'C:/Users/r/...' - (as 'lib' is unspecified) - - package 'RJDBC' successfully unpacked and MD5 sums checked - -#. - Import the RJDBC library - - .. code-block:: rconsole - - > library(RJDBC) - -#. - Set the classpath and initialize the JDBC driver which was previously installed. For example, on Windows: - - .. code-block:: rconsole - - > cp = c("C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar") - > .jinit(classpath=cp) - > drv <- JDBC("com.sqream.jdbc.SQDriver","C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar") -#. 
-   Open a connection with a :ref:`JDBC connection string` and run your first statement
-
-   .. code-block:: rconsole
-
-      > con <- dbConnect(drv,"jdbc:Sqream://127.0.0.1:3108/master;user=rhendricks;password=Tr0ub4dor&3;cluster=true")
-
-      > dbGetQuery(con,"select top 5 * from t")
-        xint xtinyint xsmallint xbigint
-      1    1       82      5067       1
-      2    2       14      1756       2
-      3    3       91     22356       3
-      4    4       84     17232       4
-      5    5       13     14315       5
-
-#.
-   Close the connection
-
-   .. code-block:: rconsole
-
-      > close(con)
-
-A full example
------------------
-
-.. code-block:: rconsole
-
-   > library(RJDBC)
-   > cp = c("C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar")
-   > .jinit(classpath=cp)
-   > drv <- JDBC("com.sqream.jdbc.SQDriver","C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar")
-   > con <- dbConnect(drv,"jdbc:Sqream://127.0.0.1:3108/master;user=rhendricks;password=Tr0ub4dor&3;cluster=true")
-   > dbGetQuery(con,"select top 5 * from t")
-     xint xtinyint xsmallint xbigint
-   1    1       82      5067       1
-   2    2       14      1756       2
-   3    3       91     22356       3
-   4    4       84     17232       4
-   5    5       13     14315       5
-   > close(con)
-
-ODBC
-=========
-
-#. Install the :ref:`SQream DB ODBC driver` for your operating system, and create a DSN.
-
-#.
-   In R, install RODBC
-
-   .. code-block:: rconsole
-
-      > install.packages("RODBC")
-      Installing package into 'C:/Users/r/...'
-      (as 'lib' is unspecified)
-
-      package 'RODBC' successfully unpacked and MD5 sums checked
-
-#.
-   Import the RODBC library
-
-   .. code-block:: rconsole
-
-      > library(RODBC)
-
-#.
-   Open a connection handle to an existing DSN (``my_cool_dsn`` in this example)
-
-   .. code-block:: rconsole
-
-      > ch <- odbcConnect("my_cool_dsn",believeNRows=F)
-
-#.
-   Run your first statement
-
-   .. code-block:: rconsole
-
-      > sqlQuery(ch,"select top 5 * from t")
-        xint xtinyint xsmallint xbigint
-      1    1       82      5067       1
-      2    2       14      1756       2
-      3    3       91     22356       3
-      4    4       84     17232       4
-      5    5       13     14315       5
-
-#.
-   Close the connection
-
-   .. code-block:: rconsole
-
-      > close(ch)
-
-A full example
------------------
-
-.. code-block:: rconsole
-
-   > library(RODBC)
-   > ch <- odbcConnect("my_cool_dsn",believeNRows=F)
-   > sqlQuery(ch,"select top 5 * from t")
-     xint xtinyint xsmallint xbigint
-   1    1       82      5067       1
-   2    2       14      1756       2
-   3    3       91     22356       3
-   4    4       84     17232       4
-   5    5       13     14315       5
-   > close(ch)
+.. _r:
+
+**
+R
+**
+
+You can use R to interact with a SQream DB cluster.
+
+This tutorial shows you how to connect R to SQream DB.
+
+.. contents:: In this topic:
+   :local:
+
+JDBC
+====
+
+
+#. Get the :ref:`SQream DB JDBC driver`.
+
+#.
+   In R, install RJDBC
+
+   .. code-block:: rconsole
+
+      > install.packages("RJDBC")
+      Installing package into 'C:/Users/r/...'
+      (as 'lib' is unspecified)
+
+      package 'RJDBC' successfully unpacked and MD5 sums checked
+
+#.
+   Import the RJDBC library
+
+   .. code-block:: rconsole
+
+      > library(RJDBC)
+
+#.
+   Set the classpath and initialize the JDBC driver that you previously installed. For example, on Windows:
+
+   .. code-block:: rconsole
+
+      > cp = c("C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar")
+      > .jinit(classpath=cp)
+      > drv <- JDBC("com.sqream.jdbc.SQDriver","C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar")
+
+#.
+   Open a connection with a :ref:`JDBC connection string` and run your first statement
+
+   .. code-block:: rconsole
+
+      > con <- dbConnect(drv,"jdbc:Sqream://127.0.0.1:3108/master;user=rhendricks;password=Tr0ub4dor&3;cluster=true")
+
+      > dbGetQuery(con,"select top 5 * from t")
+        xint xtinyint xsmallint xbigint
+      1    1       82      5067       1
+      2    2       14      1756       2
+      3    3       91     22356       3
+      4    4       84     17232       4
+      5    5       13     14315       5
+
+#.
+   Close the connection
+
+   .. code-block:: rconsole
+
+      > dbDisconnect(con)
+
+A full example
+--------------
+
+.. code-block:: rconsole
+
+   > library(RJDBC)
+   > cp = c("C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar")
+   > .jinit(classpath=cp)
+   > drv <- JDBC("com.sqream.jdbc.SQDriver","C:\\Program Files\\SQream Technologies\\JDBC Driver\\2020.1-3.2.0\\sqream-jdbc-3.2.jar")
+   > con <- dbConnect(drv,"jdbc:Sqream://127.0.0.1:3108/master;user=rhendricks;password=Tr0ub4dor&3;cluster=true")
+   > dbGetQuery(con,"select top 5 * from t")
+     xint xtinyint xsmallint xbigint
+   1    1       82      5067       1
+   2    2       14      1756       2
+   3    3       91     22356       3
+   4    4       84     17232       4
+   5    5       13     14315       5
+   > dbDisconnect(con)
+
+ODBC
+====
+
+#. Install the :ref:`SQream DB ODBC driver` for your operating system, and create a DSN.
+
+#.
+   In R, install RODBC
+
+   .. code-block:: rconsole
+
+      > install.packages("RODBC")
+      Installing package into 'C:/Users/r/...'
+      (as 'lib' is unspecified)
+
+      package 'RODBC' successfully unpacked and MD5 sums checked
+
+#.
+   Import the RODBC library
+
+   .. code-block:: rconsole
+
+      > library(RODBC)
+
+#.
+   Open a connection handle to an existing DSN (``my_cool_dsn`` in this example)
+
+   .. code-block:: rconsole
+
+      > ch <- odbcConnect("my_cool_dsn",believeNRows=F)
+
+#.
+   Run your first statement
+
+   .. code-block:: rconsole
+
+      > sqlQuery(ch,"select top 5 * from t")
+        xint xtinyint xsmallint xbigint
+      1    1       82      5067       1
+      2    2       14      1756       2
+      3    3       91     22356       3
+      4    4       84     17232       4
+      5    5       13     14315       5
+
+#.
+   Close the connection
+
+   .. code-block:: rconsole
+
+      > close(ch)
+
+A full example
+--------------
+
+.. code-block:: rconsole
+
+   > library(RODBC)
+   > ch <- odbcConnect("my_cool_dsn",believeNRows=F)
+   > sqlQuery(ch,"select top 5 * from t")
+     xint xtinyint xsmallint xbigint
+   1    1       82      5067       1
+   2    2       14      1756       2
+   3    3       91     22356       3
+   4    4       84     17232       4
+   5    5       13     14315       5
+   > close(ch)
diff --git a/connecting_to_sqream/client_platforms/sap_businessobjects.rst b/connecting_to_sqream/client_platforms/sap_businessobjects.rst
new file mode 100644
index 000000000..260ce5963
--- /dev/null
+++ b/connecting_to_sqream/client_platforms/sap_businessobjects.rst
@@ -0,0 +1,63 @@
+.. _sap_businessobjects:
+
+*******************
+SAP BusinessObjects
+*******************
+
+The **Connecting to SQream Using SAP BusinessObjects** guide includes the following sections:
+
+.. contents::
+   :local:
+   :depth: 1
+
+Overview
+========
+
+The **Connecting to SQream Using SAP BusinessObjects** guide describes the best practices for configuring a connection between SQream and the SAP BusinessObjects BI platform. SAP BO's multi-tier architecture includes both client and server components, and this guide describes integrating SQream with SAP BO client tools using a generic JDBC connector. The instructions in this guide are relevant to both the **Universe Design Tool (UDT)** and the **Information Design Tool (IDT)**. This document only covers how to establish a connection using the generic out-of-the-box JDBC connectors, and does not cover related BusinessObjects products, such as the **Business Objects Data Integrator**.
+
+The **Define a new connection** window below shows the generic JDBC driver, which you can use to establish a new connection to a database.
+
+.. image:: /_static/images/SAP_BO_2.png
+
+SAP BO also lets you customize the interface to include a SQream data source.
+
+Establishing a New Connection Using a Generic JDBC Connector
+============================================================
+
+This section shows an example of using a generic JDBC connector to establish a new connection.
+
+**To establish a new connection using a generic JDBC connector:**
+
+1. In the fields, provide a user name, password, database URL, and JDBC class.
+
+   The following is the correct format for the database URL:
+
+   .. code-block:: console
+
+      jdbc:Sqream://<host>:3108/<database>
+
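+   For example, assuming a hypothetical worker host of ``192.168.0.10`` and the ``master`` database:
+
+   .. code-block:: console
+
+      jdbc:Sqream://192.168.0.10:3108/master
+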
+   SQream recommends quickly testing your connection to SQream by selecting the Generic JDBC data source in the **Define a new connection** window. When you connect using a generic JDBC data source, you do not need to modify your configuration files, but you are limited to the out-of-the-box settings defined in the default **jdbc.prm** file.
+   
+   .. note:: Modifying the jdbc.prm file for the generic driver impacts all other databases using the same driver.
+
+For more information, see `Connection String Examples `_.
+
+2. (Optional) If you are using the generic JDBC driver specific to SQream, modify the jdbc.sbo file to include the SQream JDBC driver location by adding the following lines under the Database section of the file:
+
+   .. code-block:: console
+
+      <Database Active="Yes" Name="SQream JDBC data source">
+         <JDBCDriver>
+            <ClassPath>
+               <Path>C:\Program Files\SQream Technologies\JDBC Driver\2021.2.0-4.5.3\sqream-jdbc-4.5.3.jar</Path>
+            </ClassPath>
+            <Parameter Name="JDBC Class">com.sqream.jdbc.SQDriver</Parameter>
+         </JDBCDriver>
+      </Database>
+
+3. Restart the BusinessObjects server.
+
+   When the connection is established, **SQream** is listed as a driver selection.
\ No newline at end of file
diff --git a/connecting_to_sqream/client_platforms/sas_viya.rst b/connecting_to_sqream/client_platforms/sas_viya.rst
new file mode 100644
index 000000000..7fc3cd912
--- /dev/null
+++ b/connecting_to_sqream/client_platforms/sas_viya.rst
@@ -0,0 +1,153 @@
+.. _connect_to_sas_viya:
+
+********
+SAS Viya
+********
+
+SAS Viya is a cloud-enabled, in-memory analytics engine used for producing fast and accurate insights.
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+Installing SAS Viya
+===================
+
+The **Installing SAS Viya** section covers downloading SAS Viya and installing the SQreamDB JDBC driver.
+
+Downloading SAS Viya
+--------------------
+
+Integrating with SQreamDB has been tested with SAS Viya v.03.05 and newer.
+
+To download SAS Viya, see `SAS Viya `_.
+
+Installing the JDBC Driver
+--------------------------
+
+The SQreamDB JDBC driver is required for establishing a connection between SAS Viya and SQreamDB.
+
+**To install the JDBC driver:**
+
+#. Download the :ref:`JDBC driver`.
+
+#. Unzip the JDBC driver into a location on the SAS Viya server.
+   
+   SQreamDB recommends creating the directory ``/opt/sqream`` on the SAS Viya server.
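+
+   For example (a minimal sketch; the driver zip file name varies by version):
+
+   .. code-block:: console
+
+      $ mkdir -p /opt/sqream
+      $ unzip sqream-jdbc-4.5.3.zip -d /opt/sqream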
+   
+Configuring SAS Viya
+====================
+
+After installing the JDBC driver, you must configure it from SAS Studio so that it can be used with SQreamDB.
+
+**To configure the JDBC driver from the SAS Studio:**
+
+#. Sign in to the SAS Studio.
+
+#. From the **New** menu, click **SAS Program**.
+	
+#. Configure the SQreamDB JDBC connector by adding the following rows:
+
+   .. literalinclude:: connect3.sas
+      :language: sas
+
+Operating SAS Viya
+==================
+ 
+The **Operating SAS Viya** section describes how to use SAS Viya Visual Analytics with SQreamDB.
+   
+Using SAS Viya Visual Analytics
+-------------------------------
+
+This section describes how to use SAS Viya Visual Analytics.
+
+**To use SAS Viya Visual Analytics:**
+
+#. Log in to SAS Viya Visual Analytics using your credentials.
+
+2. Click **New Report**.
+
+3. Click **Data**.
+
+4. Click **Data Sources**.
+
+5. Click the **Connect** icon.
+
+6. From the **Type** menu, select **Database**.
+
+7. Provide the required information and select **Persist this connection beyond the current session**.
+
+8. Click **Advanced** and provide the required information.
+
+9. Add the following additional parameters by clicking **Add Parameters**:
+
+.. list-table::
+   :widths: 10 90
+   :header-rows: 1   
+   
+   * - Name
+     - Value
+   * - class
+     - ``com.sqream.jdbc.SQDriver``
+   * - classPath
+     - ``<path to the SQreamDB JDBC driver .jar file>``
+   * - url
+     - ``jdbc:Sqream://<host>:<port>/<database>;cluster=true``
+   * - username
+     - ``<username>``
+   * - password
+     - ``<password>``
+   
+10. Click **Test Connection**.
+
+11. If the connection is successful, click **Save**.
+
+
+.. _troubleshooting_sas_viya:
+
+Troubleshooting SAS Viya
+========================
+
+The **Troubleshooting SAS Viya** section describes best practices and troubleshooting procedures for connecting to SQreamDB using SAS Viya:
+
+Inserting Only Required Data
+----------------------------
+
+When using SAS Viya, SQreamDB recommends using only data that you need, as described below:
+
+* Insert only the data sources you need into SAS Viya, excluding tables that don’t require analysis.
+
+* To increase query performance, add filters before analyzing. Every modification you make while analyzing data queries the SQreamDB database, sometimes several times. Adding filters to the datasource before exploring limits the amount of data analyzed and increases query performance.
+
+Creating a Separate Service for SAS Viya
+----------------------------------------
+
+SQreamDB recommends creating a separate service for SAS Viya with the DWLM. This reduces the impact that SAS Viya has on other applications and processes, such as ETL. In addition, this works in conjunction with the load balancer to ensure good performance.
+
+Locating the SQreamDB JDBC Driver
+---------------------------------
+
+In some cases, SAS Viya cannot locate the SQreamDB JDBC driver, generating the following error message:
+
+.. code-block:: text
+
+   java.lang.ClassNotFoundException: com.sqream.jdbc.SQDriver
+
+**To locate the SQreamDB JDBC driver:**
+
+1. Verify that you have placed the JDBC driver in a directory that SAS Viya can access.
+
+2. Verify that the classpath in your SAS program is correct, and that SAS Viya can access the file that it references.
+
+3. Restart SAS Viya.
+
+For more troubleshooting assistance, see the `SQreamDB Support Portal `_.
+
+Supporting TEXT
+---------------
+
+In SAS Viya versions earlier than 4.0, casting ``TEXT`` to ``CHAR`` changes the size to 1,024, such as when creating a table that includes a ``TEXT`` column. This is resolved by casting ``TEXT`` into ``CHAR`` when using the JDBC driver.
\ No newline at end of file
diff --git a/connecting_to_sqream/client_platforms/semarchy.rst b/connecting_to_sqream/client_platforms/semarchy.rst
new file mode 100644
index 000000000..05aee35cd
--- /dev/null
+++ b/connecting_to_sqream/client_platforms/semarchy.rst
@@ -0,0 +1,89 @@
+.. _semarchy:
+
+********
+Semarchy
+********
+
+Semarchy's Intelligent Data eXchange (IDX) facilitates seamless data integration and interoperability across systems. IDX ensures reliable data exchange between different applications, enhancing overall data quality, governance, and adaptability for critical business operations.
+
+Before You Begin
+================
+
+It is essential that you use Semarchy version 2023.01 or later.
+
+Setting Up a Connection to SQreamDB
+===================================
+
+#. Install the Semarchy SQreamDB component as described in `Semarchy documentation `_.
+
+#. Install SQreamDB :ref:`java_jdbc`.
+
+JDBC Connection String
+======================
+
+The following is a SQreamDB JDBC connection string template:
+
+.. code-block:: text
+
+   jdbc:Sqream://<host>:<port>/<database>;user=<username>;password=<password>;[<option>=<value>; ...]
+
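+For example, a connection string for a hypothetical worker at ``127.0.0.1:5000``, using the example values shown in the table below:
+
+.. code-block:: text
+
+   jdbc:Sqream://127.0.0.1:5000/master;user=SqreamRole;password=SqreamRolePassword2023;service=etl
+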
+Connection Parameters
+^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Item
+     - State
+     - Default
+     - Description
+   * - ``<host>:<port>``
+     - Mandatory
+     - None
+     - Hostname and port of the SQream DB worker. For example, ``127.0.0.1:5000``, ``sqream.mynetwork.co:3108``
+   * - ``<database>``
+     - Mandatory
+     - None
+     - Database name to connect to. For example, ``master``
+   * - ``username=<username>``
+     - Optional
+     - None
+     - Username of a role to use for connection. For example, ``username=SqreamRole`` 
+   * - ``password=<password>``
+     - Optional
+     - None
+     - Specifies the password of the selected role. For example, ``password=SqreamRolePassword2023``
+   * - ``service=<service>``
+     - Optional
+     - ``sqream``
+     - Specifies the service queue to use. For example, ``service=etl``
+   * - ``ssl=<true/false>``
+     - Optional
+     - ``false``
+     - Specifies SSL for this connection. For example, ``ssl=true``
+   * - ``cluster=<true/false>``
+     - Optional
+     - ``true``
+     - Connects via the load balancer (use only if one exists, and verify the port).
+   * - ``fetchSize=<rows>``
+     - Optional
+     - ``true``
+     - Enables on-demand loading and defines the double-buffer size for the result. The ``fetchSize`` value is rounded up to whole chunks. For example, ``fetchSize=1`` loads one row and is rounded up to one chunk, while a ``fetchSize`` of 100,600 with a chunk size of 100,000 is rounded up to two chunks.
+   * - ``insertBuffer=<bytes>``
+     - Optional
+     - ``true``
+     -  Defines the buffer size, in bytes, to fill before flushing data to the server. Clients running a parameterized insert (network insert) can define the amount of data to collect before flushing the buffer.
+   * - ``loggerLevel=<debug/trace>``
+     - Optional
+     - ``true``
+     -  Defines the logger level as either ``debug`` or ``trace``.
+   * - ``logFile=<file name or path>``
+     - Optional
+     - ``true``
+     -  Enables the file appender and defines the file name. The file name can be set as either the file name or the file path.
+   * - ``idleConnectionTimeout=<seconds>``
+     - Optional
+     - 0
+     - Sets the duration, in seconds, for which a database connection can remain idle before it is terminated. If the parameter is set to its default value, idle connections will not be terminated. The idle connection timer begins counting after the completion of query execution.
+
diff --git a/third_party_tools/client_platforms/sql_workbench.rst b/connecting_to_sqream/client_platforms/sql_workbench.rst
similarity index 73%
rename from third_party_tools/client_platforms/sql_workbench.rst
rename to connecting_to_sqream/client_platforms/sql_workbench.rst
index d46d45ae6..b5006a7b0 100644
--- a/third_party_tools/client_platforms/sql_workbench.rst
+++ b/connecting_to_sqream/client_platforms/sql_workbench.rst
@@ -1,135 +1,130 @@
-.. _connect_to_sql_workbench:
-
-*****************************
-Connect to SQream Using SQL Workbench
-*****************************
-
-You can use SQL Workbench to interact with a SQream DB cluster. SQL Workbench/J is a free SQL query tool, and is designed to run on any JRE-enabled environment. 
-
-This tutorial is a guide that will show you how to connect SQL Workbench to SQream DB.
-
-.. contents:: In this topic:
-   :local:
-
-Installing SQL Workbench with the SQream DB installer (Windows only)
-=====================================================================
-
-SQream DB's driver installer for Windows can install the Java prerequisites and SQL Workbench for you.
-
-#. Get the JDBC driver installer available for download from the `SQream Drivers page `_. The Windows installer takes care of the Java prerequisites and subsequent configuration.
-
-#. Install the driver by following the on-screen instructions in the easy-to-follow installer.
-   By default, the installer does not install SQL Workbench. Make sure to select the item!
-   
-   .. image:: /_static/images/jdbc_windows_installer_screen.png
-
-.. note:: The installer will install SQL Workbench in ``C:\Program Files\SQream Technologies\SQLWorkbench`` by default. You can change this path during the installation.
-
-#. Once finished, SQL Workbench is installed and contains the necessary configuration for connecting to SQream DB clusters.
-
-#. Start SQL Workbench from the Windows start menu. Be sure to select **SQL Workbench (64)** if you're on 64-bit Windows.
-   
-   .. image:: /_static/images/sql_workbench_launch.png
-
-You are now ready to create a profile for your cluster. Continue to :ref:`Creating a new connection profile `.
-
-Installing SQL Workbench manually (Linux, MacOS)
-===================================================
-
-Install Java Runtime 
-------------------------
-
-Both SQL Workbench and the SQream DB JDBC driver require Java 1.8 or newer. You can install either Oracle Java or OpenJDK.
-
-**Oracle Java**
-
-Download and install Java 8 from Oracle for your platform - https://www.java.com/en/download/manual.jsp
-
-**OpenJDK**
-
-For Linux and BSD, see https://openjdk.java.net/install/
-
-For Windows, SQream recommends Zulu 8 https://www.azul.com/downloads/zulu-community/?&version=java-8-lts&architecture=x86-64-bit&package=jdk
-
-Get the SQream DB JDBC driver
--------------------------------
-
-SQream DB's JDBC driver is provided as a zipped JAR file, available for download from the `SQream Drivers page `_. 
-
-Download and extract the JAR file from the zip archive.
-
-Install SQL Workbench
------------------------
-
-#. Download the latest stable release from https://www.sql-workbench.eu/downloads.html . The **Generic package for all systems** is recommended.
-
-#. Extract the downloaded ZIP archive into a directory of your choice.
-
-#. Start SQL workbench. If you are using 64 bit windows, run ``SQLWorkbench64.exe`` instead of ``SQLWOrkbench.exe``.
-
-Setting up the SQream DB JDBC driver profile
----------------------------------------------
-
-#. Define a connection profile - :menuselection:`&File --> &Connect window (Alt+C)`
-   
-   .. image:: /_static/images/sql_workbench_connect_window1.png
-
-#. Open the drivers management window - :menuselection:`&Manage Drivers`
-   
-   .. image:: /_static/images/sql_workbench_manage_drivers.png
-   
-   
-   
-#. Create the SQream DB driver profile
-   
-   .. image:: /_static/images/sql_workbench_create_driver.png
-   
-   #. Click on the Add new driver button ("New" icon)
-   
-   #. Name the driver as you see fit. We recommend calling it SQream DB , where  is the version you have installed.
-   
-   #. 
-      Add the JDBC drivers from the location where you extracted the SQream DB JDBC JAR.
-      
-      If you used the SQream installer, the file will be in ``C:\Program Files\SQream Technologies\JDBC Driver\``
-   
-   #. Click the magnifying glass button to detect the classname automatically. Other details are purely optional
-   
-   #. Click OK to save and return to "new connection screen"
-
-
-.. _new_connection_profile:
-
-Create a new connection profile for your cluster
-=====================================================
-
-   .. image:: /_static/images/sql_workbench_connection_profile.png
-
-#. Create new connection by clicking the New icon (top left)
-
-#. Give your connection a descriptive name
-
-#. Select the SQream Driver that was created in the previous screen
-
-#. Type in your connection string. To find out more about your connection string (URL), see the :ref:`Connection string documentation `.
-
-#. Text the connection details
-
-#. Click OK to save the connection profile and connect to SQream DB
-
-Suggested optional configuration
-==================================
-
-If you installed SQL Workbench manually, you can set a customization to help SQL Workbench show information correctly in the DB Explorer panel.
-
-#. Locate your workbench.settings file
-   On Windows, typically: ``C:\Users\\.sqlworkbench\workbench.settings``
-   On Linux, ``$HOME/.sqlworkbench``
-   
-#. Add the following line at the end of the file:
-   
-   .. code-block:: text
-      
-      workbench.db.sqreamdb.schema.retrieve.change.catalog=true
-
-#. Save the file and restart SQL Workbench
+.. _connect_to_sql_workbench:
+
+*************
+SQL Workbench
+*************
+
+You can use SQL Workbench to interact with a SQream DB cluster. SQL Workbench/J is a free SQL query tool, and is designed to run on any JRE-enabled environment. 
+
+This tutorial is a guide that will show you how to connect SQL Workbench to SQream DB.
+
+.. contents:: In this topic:
+   :local:
+
+Installing SQL Workbench with the SQream Installer
+==================================================
+
+This section applies to Windows only.
+
+SQream DB's driver installer for Windows can install the Java prerequisites and SQL Workbench for you.
+
+#. Get the JDBC driver installer available for download from the `SQream Drivers page `_. The Windows installer takes care of the Java prerequisites and subsequent configuration.
+
+#. Install the driver by following the on-screen instructions in the easy-to-follow installer.
+   By default, the installer does not install SQL Workbench. Make sure to select the item!
+   
+   .. image:: /_static/images/jdbc_windows_installer_screen.png
+
+.. note:: The installer will install SQL Workbench in ``C:\Program Files\SQream Technologies\SQLWorkbench`` by default. You can change this path during the installation.
+
+#. Once finished, SQL Workbench is installed and contains the necessary configuration for connecting to SQream DB clusters.
+
+#. Start SQL Workbench from the Windows start menu. Be sure to select **SQL Workbench (64)** if you're on 64-bit Windows.
+   
+   .. image:: /_static/images/sql_workbench_launch.png
+
+You are now ready to create a profile for your cluster. Continue to :ref:`Creating a new connection profile `.
+
+Installing SQL Workbench Manually
+=================================
+
+This section applies to Linux and MacOS only.
+
+Install Java Runtime 
+--------------------
+
+Both SQL Workbench and the SQream DB JDBC driver require Java 17 or newer. You can install either Oracle Java or OpenJDK.
+
+
+Get the SQream DB JDBC Driver
+-----------------------------
+
+SQream DB's JDBC driver is provided as a zipped JAR file, available for download from the `SQream Drivers page `_. 
+
+Download and extract the JAR file from the zip archive.
+
+Install SQL Workbench
+---------------------
+
+#. Download the latest stable release from https://www.sql-workbench.eu/downloads.html . The **Generic package for all systems** is recommended.
+
+#. Extract the downloaded ZIP archive into a directory of your choice.
+
+#. Start SQL Workbench. If you are using 64-bit Windows, run ``SQLWorkbench64.exe`` instead of ``SQLWorkbench.exe``.
+
+Setting up the SQream DB JDBC Driver Profile
+--------------------------------------------
+
+#. Define a connection profile - :menuselection:`&File --> &Connect window (Alt+C)`
+   
+   .. image:: /_static/images/sql_workbench_connect_window1.png
+
+#. Open the drivers management window - :menuselection:`&Manage Drivers`
+   
+   .. image:: /_static/images/sql_workbench_manage_drivers.png
+   
+   
+   
+#. Create the SQream DB driver profile
+   
+   .. image:: /_static/images/sql_workbench_create_driver.png
+   
+   #. Click on the Add new driver button ("New" icon)
+   
+   #. Name the driver as you see fit. We recommend calling it ``SQream DB <version>``, where ``<version>`` is the version you have installed.
+   
+   #. 
+      Add the JDBC drivers from the location where you extracted the SQream DB JDBC JAR.
+      
+      If you used the SQream installer, the file will be in ``C:\Program Files\SQream Technologies\JDBC Driver\``
+   
+   #. Click the magnifying glass button to detect the classname automatically. Other details are purely optional
+   
+   #. Click OK to save and return to "new connection screen"
+
+
+.. _new_connection_profile:
+
+Create a New Connection Profile for Your Cluster
+================================================
+
+   .. image:: /_static/images/sql_workbench_connection_profile.png
+
+#. Create a new connection by clicking the New icon (top left)
+
+#. Give your connection a descriptive name
+
+#. Select the SQream Driver that was created in the previous screen
+
+#. Type in your connection string.
+
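+   For example (hypothetical host and credentials, assuming a load balancer on port 3108):
+
+   .. code-block:: text
+
+      jdbc:Sqream://127.0.0.1:3108/master;user=rhendricks;password=Tr0ub4dor&3;cluster=true
+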
+#. Test the connection details
+
+#. Click OK to save the connection profile and connect to SQream DB
+
+Suggested Optional Configuration
+================================
+
+If you installed SQL Workbench manually, you can set a customization to help SQL Workbench show information correctly in the DB Explorer panel.
+
+#. Locate your workbench.settings file
+   On Windows, typically: ``C:\Users\<username>\.sqlworkbench\workbench.settings``
+   On Linux, ``$HOME/.sqlworkbench``
+   
+#. Add the following line at the end of the file:
+   
+   .. code-block:: text
+      
+      workbench.db.sqreamdb.schema.retrieve.change.catalog=true
+
+#. Save the file and restart SQL Workbench
diff --git a/connecting_to_sqream/client_platforms/tableau.rst b/connecting_to_sqream/client_platforms/tableau.rst
new file mode 100644
index 000000000..7b1efbb39
--- /dev/null
+++ b/connecting_to_sqream/client_platforms/tableau.rst
@@ -0,0 +1,118 @@
+.. _tableau:
+
+*******
+Tableau
+*******
+
+SQream's Tableau connector, based on standard JDBC, enables storing and quickly querying large volumes of data. This connector is useful for users who want to integrate and analyze data from various sources within the Tableau platform. With the Tableau connector, users can easily connect to databases and cloud applications and run high-speed queries on large datasets. Additionally, the connector integrates seamlessly with Tableau, enabling users to visualize their data.
+
+SQream supports both Tableau Desktop and Tableau Server on Windows, MacOS, and Linux distributions.
+
+For more information on SQream's integration with Tableau, see `Tableau Connectors `_.
+
+.. contents::
+   :local:
+   :depth: 1
+
+Prerequisites
+-------------
+
+It is essential that you have the following installed:
+
+* Tableau version 9.2 or newer 
+
+Setting Up JDBC
+----------------
+
+#. Download the SQream JDBC Connector :ref:`.jar file `.
+#. Place the JDBC ``.jar`` file in the Tableau driver directory.
+
+   Based on your operating system, you may find the Tableau driver directory in one of the following locations:
+   
+   * Tableau Desktop on MacOS: ``~/Library/Tableau/Drivers``
+   * Tableau Desktop on Windows: ``C:\Program Files\Tableau\Drivers``
+   * Tableau on Linux: ``/opt/tableau/tableau_driver/jdbc``
+
+Installing the Tableau Connector
+--------------------------------
+
+#. Download the :ref:`Tableau Connector ` ``SQreamDB.taco`` file.
+   
+#. Based on the installation method that you used for installing Tableau, place the Tableau Connector ``SQreamDB.taco`` file in the Tableau connector directory:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Product / Platform
+     - Path
+   * - Tableau Desktop for Windows
+     - ``C:\Users\[user]\Documents\My Tableau Repository\Connectors``
+   * - Tableau Desktop for Mac
+     - ``/Users/[user]/Documents/My Tableau Repository/Connectors``
+   * - Tableau Prep for Windows
+     - ``C:\Users\[user]\Documents\My Tableau Prep Repository\Connectors``
+   * - Tableau Prep for Mac
+     - ``/Users/[user]/Documents/My Tableau Prep Repository/Connectors``
+   * - Flow web authoring on Tableau Server
+     - ``/data/tabsvc/flowqueryservice/Connectors``
+   * - Tableau Prep Conductor on Tableau Server
+     - ``/data/tabsvc/flowprocessor/Connectors``
+   * - Tableau Server
+     - ``C:\\Tableau\Tableau Server\\Connectors``
+
+3. Restart Tableau Desktop or Tableau Server.
+
+Connecting to SQream
+--------------------
+
+
+#. Start Tableau Desktop.
+	
+#. In the **Connect** menu, under the **To a Server** option, click **More**.
+
+   Additional connection options are displayed.
+	
+#. Select **SQream DB by SQream Technologies**.
+
+   A connection dialog box is displayed.
+	
+#. In the connection dialog box, fill in the following fields:
+
+  .. list-table:: 
+     :widths: 15 38 38
+     :header-rows: 1
+   
+     * - Field name
+       - Description
+       - Example
+     * - Server
+       - Defines the SQreamDB worker machine IP.
+
+         Avoid using the loopback address (127.0.0.1) or ``localhost`` as the server address, since it typically refers to the local machine where Tableau is installed and may create issues and limitations.
+       - ``192.162.4.182`` or ``sqream.mynetwork.com``
+     * - Port
+       - Defines the TCP port of the SQream worker
+       - ``3108`` when using a load balancer, or ``5100`` when connecting directly to a worker with SSL
+     * - Database
+       - Defines the database to establish a connection with
+       - ``master``
+     * - Cluster
+       - Enables (``true``) or disables (``false``) the load balancer. After enabling or disabling the load balancer, verify the connection
+       - 
+     * - Username
+       - Specifies the username of a role to use when connecting
+       - ``rhendricks``	 
+     * - Password
+       - Specifies the password of the selected role
+       - ``Tr0ub4dor&3``
+     * - Require SSL 
+       - Sets SSL as a requirement for establishing this connection
+       - 
+
+5. Click **Sign In**.
+
+   The connection is established, and the data source page is displayed.
+
+   
+
diff --git a/connecting_to_sqream/client_platforms/talend.rst b/connecting_to_sqream/client_platforms/talend.rst
new file mode 100644
index 000000000..1b66c0086
--- /dev/null
+++ b/connecting_to_sqream/client_platforms/talend.rst
@@ -0,0 +1,100 @@
+.. _talend:
+
+******
+Talend
+******
+
+Overview
+========
+ 
+This page describes how to use Talend to interact with a SQream cluster. The Talend connector is used for reading data from a SQream cluster and loading data into SQream. In addition, this page provides a viability report on Talend's compatibility with SQream for stakeholders.
+
+The **Connecting to SQream Using Talend** guide describes the following:
+
+.. contents::
+   :local:
+   :depth: 1
+
+Creating a New Metadata JDBC DB Connection
+------------------------------------------
+
+**To create a new metadata JDBC DB connection:**
+
+1. In the **Repository** panel, navigate to **Metadata** and right-click **Db connections**.
+	
+2. Select **Create connection**.
+	
+3. In the **Name** field, type a name.
+
+   Note that the name cannot contain spaces.
+
+4. In the **Purpose** field, type a purpose and click **Next**.
+
+   Note that you cannot continue to the next step until you define both a Name and a Purpose.
+
+5. In the **DB Type** field, select **JDBC**.
+
+6. In the **JDBC URL** field, type the relevant connection string.
+
+   For connection string examples, see `Connection Strings `_.
+   
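+   For example (hypothetical host and credentials):
+
+   .. code-block:: text
+
+      jdbc:Sqream://127.0.0.1:3108/master;user=rhendricks;password=Tr0ub4dor&3;cluster=true
+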
+7. In the **Drivers** field, click the **Add** button.
+
+   The **"newLine"** entry is added.
+
+8. On the **"newLine"** entry, click the ellipsis.
+
+   The **Module** window is displayed.
+
+9. From the Module window, select **Artifact repository (local m2/nexus)** and then select **Install a new module**.
+
+10. Click the ellipsis.
+
+    Your hard drive is displayed.	
+
+11. Navigate to a **JDBC jar file** (such as **sqream-jdbc-4.5.3.jar**) and click **Open**.
+
+12. Click **Detect the module install status**.
+
+13. Click **OK**.
+
+    The JDBC that you selected is displayed in the **Driver** field.
+
+14. Click **Select class name**.
+
+15. Click **Test connection**.
+
+    If a driver class is not found (for example, you did not select a JDBC jar file), an error message is displayed.
+
+    After creating a new metadata JDBC DB connection, you can do the following:
+
+    * Use your new metadata connection.
+	   
+    * Drag it to the **job** screen.
+	   
+    * Build Talend components.
+ 
+    For more information on loading data from JSON files to the Talend Open Studio, see `How to Load Data from JSON Files in Talend `_.
+
+Supported SQream Drivers
+------------------------
+
+The following list shows the supported SQream drivers and versions:
+
+* **JDBC** - Version 4.3.3 and higher.
+   
+* **ODBC** - Version 4.0.0. This version requires a Bridge to connect. For more information on the required Bridge, see `Connecting Talend on Windows to an ODBC Database `_.
+
+Supported Data Sources
+----------------------
+
+Talend Cloud connectors let you create reusable connections with a wide variety of systems and environments, such as those shown below, letting you access and read a wide range of diverse data.
+
+* **Connections:** Connections are environments or systems for storing datasets, including databases, file systems, distributed systems and platforms. Because these systems are reusable, you only need to establish connectivity with them once.
+
+* **Datasets:** Datasets include database tables, file names, topics (Kafka), queues (JMS) and file paths (HDFS). For more information on the complete list of connectors and datasets that Talend supports, see `Introducing Talend Connectors `_.
+
+Known Issues
+------------
+
+As of 6/1/2021, schemas were not displayed for tables with identical names.
\ No newline at end of file
diff --git a/third_party_tools/client_platforms/test.php b/connecting_to_sqream/client_platforms/test.php
similarity index 96%
rename from third_party_tools/client_platforms/test.php
rename to connecting_to_sqream/client_platforms/test.php
index fef04e699..88ec88338 100644
--- a/third_party_tools/client_platforms/test.php
+++ b/connecting_to_sqream/client_platforms/test.php
@@ -1,16 +1,16 @@
- 
+ 
diff --git a/third_party_tools/client_platforms/tibco_spotfire.rst b/connecting_to_sqream/client_platforms/tibco_spotfire.rst
similarity index 87%
rename from third_party_tools/client_platforms/tibco_spotfire.rst
rename to connecting_to_sqream/client_platforms/tibco_spotfire.rst
index b0a707c51..e4d1eec3a 100644
--- a/third_party_tools/client_platforms/tibco_spotfire.rst
+++ b/connecting_to_sqream/client_platforms/tibco_spotfire.rst
@@ -1,11 +1,13 @@
 .. _tibco_spotfire:
 
 
-*************************
-Connecting to SQream Using TIBCO Spotfire
-*************************
+**************
+TIBCO Spotfire
+**************
+
 Overview
-=========
+========
+
 The **TIBCO Spotfire** software is an analytics solution that enables visualizing and exploring data through dashboards and advanced analytics.
 
 This document is a Quick Start Guide that describes the following:
@@ -15,10 +17,11 @@ This document is a Quick Start Guide that describes the following:
    :depth: 1
    
 Establishing a Connection between TIBCO Spotfire and SQream
------------------
+-----------------------------------------------------------
+
 TIBCO Spotfire supports the following versions:
 
-* **JDBC driver** - Version 4.5.3
+* **JDBC driver** - Version 4.5.2 
 * **ODBC driver** - Version 4.1.1
 
 SQream supports TIBCO Spotfire version 7.12.0.
@@ -30,13 +33,15 @@ The **Establishing a JDBC Connection between TIBCO Spotfire and SQream** section
    :depth: 1   
    
 Creating a JDBC Connection
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 For TIBCO Spotfire to recognize SQream, you must add the correct JDBC jar file to Spotfire's loaded binary folder. The following is an example of a path to the Spotfire loaded binaries folder: ``C:\tibco\tss\7.12.0\tomcat\bin``.
 
 For the complete TIBCO Spotfire documentation, see `TIBCO Spotfire® JDBC Data Access Connectivity Details `_. 
 
 Creating an ODBC Connection
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 **To create an ODBC connection**
 
 1. Install and configure ODBC on Windows.
@@ -45,8 +50,6 @@ Creating an ODBC Connection
    
 #. Launch the TIBCO Spotfire application.
 
-    ::
-
 #. From the **File** menu click **Add Data Tables**.
 
    The **Add Database Tables** window is displayed.
@@ -61,20 +64,14 @@ Creating an ODBC Connection
    
 #. Select **System or user data source** and from the drop-down menu select the DSN of your data source (SQreamDB).
 
-    ::
-
 #. Provide your database username and password and click **OK**.
 
-    ::
-
 #. In the **Open Database** window, click **OK**.
 
    The **Specify Tables and Columns** window is displayed.
 
 #. In the **Specify Tables and Columns** window, select the checkboxes corresponding to the tables and columns that you want to include in your SQL statement.
 
-    ::
-
 #. In the **Data source name** field, set your data source name and click **OK**.
 
    Your data source is displayed in the **Data tables** area.
@@ -84,7 +81,8 @@ Creating an ODBC Connection
 .. note:: Verify that you have checked the SQL statement. 
 
 Creating the SQream Data Source Template
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 After creating a connection, you can create your SQream data source template.
 
 **To create your SQream data source template:**
@@ -102,13 +100,9 @@ After creating a connection, you can create your SQream data source template.
   * Override an existing template:
    
     1. In the template text field, select an existing template.
-	
-	    ::
 		
     2. Copy and paste your data source template text.
 	 
-	     ::
-	 
   * Create a new template:
    
     1. Click **New**.
@@ -118,8 +112,6 @@ After creating a connection, you can create your SQream data source template.
        .. _creating_sqream_data_source_template:
 		
     2. In the **Name** field, define your template name.
-	
-	    ::
 		
     3. In the **Data Source Template** text field, copy and paste your data source template text.
 	
@@ -128,104 +120,90 @@ After creating a connection, you can create your SQream data source template.
        .. code-block:: console
 	
           
-            SQream   
-            com.sqream.jdbc.SQDriver   
-            jdbc:Sqream://<host>:<port>/database;user=sqream;password=sqream;cluster=true   
-            true   
-            true   
-            false   
-            TABLE,EXTERNAL_TABLE   
+            SQream
+            com.sqream.jdbc.SQDriver
+            jdbc:Sqream://<host>:<port>/database;user=sqream;password=sqream;cluster=true
+            true
+            true
+            false
+            TABLE,EXTERNAL_TABLE
             
              
-                Bool   
-                Integer   
+                Bool
+                Integer
               
               
-                VARCHAR(2048)   
-                String   
+                TEXT(2048)
+                String
               
               
-                INT   
-                Integer   
+                INT
+                Integer
               
               
-                BIGINT   
-                LongInteger   
+                BIGINT
+                LongInteger
               
               
-                Real   
-                Real   
+                Real
+                Real
               
 	           
-                Decimal   
-                Float   
+                Decimal
+                Float
               
                
-                Numeric   
-                Float   
+                Numeric
+                Float
               
               
-                Date   
-                DATE   
+                Date
+                DATE
               
               
-                DateTime   
-                DateTime   
+                DateTime
+                DateTime
               
              
-               
+            
           			
 	
 4. Click **Save configuration**.
-
-    ::
 	
 5. Close and restart your Spotfire server.
 
 Creating a Data Source
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~
+
 After creating the SQream data source template, you can create a data source.
 
 **To create a data source:**
 
 1. Launch the TIBCO Spotfire application.
 
-    ::
-
 #. From the **Tools** menu, select **Information Designer**.
 
    The **Information Designer** window is displayed.
-
-    ::
 	
 #. From the **New** menu, click **Data Source**.
 
    The **Data Source** tab is displayed.
-
-    ::
 	
 #. Provide the following information:
 
    * **Name** - define a unique name.
-   
-      ::
 	  
    * **Type** - use the same type template name you used while configuring your template. See **Step 3** in :ref:`Creating the SQream Data Source Template`.
-   
-      ::
 	  
    * **Connection URL** - use the standard JDBC connection string, ``<host>:<port>/database``.
-   
-      ::
 	  
    * **No. of connections** - define a number between **1** and **100**. SQream recommends setting your number of connections to **100**.
-   
-      ::
 	  
    * **Username and Password** - define your SQream username and password.   
 
 Creating an Information Link
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 After creating a data source, you can create an information link.
 
 **To create an information link**:
@@ -234,8 +212,6 @@ After creating a data source, you can create an information link.
 
    The **Information Designer** window is displayed.
 
-    ::
-
 #. From the **New** menu, click **Information Link**.
 
    The **Information link** tab is displayed.
@@ -247,8 +223,6 @@ After creating a data source, you can create an information link.
    Note the following:
    
    * You can select procedures from the Elements region.
-   
-      ::
 	  
    * You can remove an element by selecting an element and clicking **Remove**.   
 
@@ -258,13 +232,9 @@ After creating a data source, you can create an information link.
 
 5. *Optional* - In the **Description** region, type the description of the information link.
 
-    ::
-
 #. *Optional* - To filter your data, expand the **Filters** section and do the following:
 
     1. From the **Information Link** region, select the element you added in Step 3 above.
-	
-	    ::
 		
     2. Click **Add**.
 	
@@ -275,8 +245,6 @@ After creating a data source, you can create an information link.
        The selected column is added to the Filters list.
 	   
     4. Repeat steps 2 and 3 to add filters to additional columns.
-	
-	    ::
 		
     5. For each column, from the **Filter Type** drop-down list, select **range** or **values**.
 	
@@ -301,8 +269,6 @@ After creating a data source, you can create an information link.
        The selected column is added to the Prompts list.
 	   
     #. Repeat **Step 1** to add prompts to additional columns.
-	
-	    ::
 		
     #. Do the following for each column:
 	
@@ -322,29 +288,18 @@ After creating a data source, you can create an information link.
    
 9. *Optional* - Expand the **Parameters** section and define your parameters.
 
-     ::
-
 10. *Optional* - Expand the **Properties** section and define your properties.
 
-     ::
-
 11. *Optional* - Expand the **Caching** section and enable or disable whether your information link can be cached.
 
-     ::
-
 12. Click **Save**.
 
     The **Save As** window is displayed.
 
 13. In the tree, select where you want to save the information link.
 
-     ::
-
 14. In the **Name** field, type a name and description for the information link.
 
-     ::
-
-
 15. Click **Save**.
 
     The new information link is added to the library and can be accessed by other users.
@@ -354,7 +309,8 @@ After creating a data source, you can create an information link.
 For more information on the Information Link attributes, see `Information Link Tab `_.
 
 Troubleshooting
--------------
+---------------
+
 The **Troubleshooting** section describes the following scenarios:
 
 .. contents::
@@ -362,7 +318,8 @@ The **Troubleshooting** section describes the following scenarios:
    :depth: 1 
 
 The JDBC Driver does not Support Boolean, Decimal, or Numeric Types
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 When attempting to load data, the Boolean, Decimal, or Numeric column types are not supported and generate the following error:
 
 .. code-block:: console
@@ -381,7 +338,8 @@ For more information, see the following:
 * **Supported data types** - :ref:`Data Types`.
 
 Information Services do not Support Live Queries
-~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 TIBCO Spotfire data connectors support live queries, but no APIs currently exist for creating custom data connectors. This is resolved by creating a customized SQream adapter using TIBCO's **Data Virtualization (TDV)** or the **Spotfire Advanced Services (ADS)**. These can be used from the built-in TDV connector to enable live queries.
 
 This resolution applies to JDBC and ODBC drivers.
diff --git a/third_party_tools/connectivity_ecosystem.jpg b/connecting_to_sqream/connectivity_ecosystem.jpg
similarity index 100%
rename from third_party_tools/connectivity_ecosystem.jpg
rename to connecting_to_sqream/connectivity_ecosystem.jpg
diff --git a/connecting_to_sqream/index.rst b/connecting_to_sqream/index.rst
new file mode 100644
index 000000000..5fe149e5f
--- /dev/null
+++ b/connecting_to_sqream/index.rst
@@ -0,0 +1,22 @@
+.. _connecting_to_sqream:
+
+
+**********************
+Connecting to SQreamDB
+**********************
+ 
+SQream supports the most common database tools and interfaces, giving you direct access through a variety of drivers, connectors, and visualization tools and utilities. 
+
+
+.. toctree::
+   :maxdepth: 2
+   :glob:
+   :titlesonly:
+   
+   client_platforms/index
+   client_drivers/index
+
+
+.. rubric:: Need help?
+
+If you need a tool that SQream does not support, contact `SQream Support `_ or your SQreamDB account manager for more information.
\ No newline at end of file
diff --git a/data_ingestion/avro.rst b/data_ingestion/avro.rst
new file mode 100644
index 000000000..64aac34a1
--- /dev/null
+++ b/data_ingestion/avro.rst
@@ -0,0 +1,436 @@
+.. _avro:
+
+****
+Avro
+****
+
+**Avro** is a well-known data serialization system that relies on schemas. Due to its flexibility as an efficient data storage method, SQream supports the Avro binary data format as an alternative to JSON. Avro files are represented using the **Object Container File** format, in which the Avro schema is encoded alongside the binary data. Multiple files loaded in the same transaction must be serialized using the same schema; otherwise, an error message is displayed. SQream uses the **.avro** extension for ingested Avro files.
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+Foreign Data Wrapper Prerequisites
+==================================
+
+Before proceeding, ensure that the following Foreign Data Wrapper (FDW) prerequisites are met:
+
+* **File Existence:** Verify that the file you are ingesting data from exists at the specified path.
+
+* **Path Accuracy:** Confirm that all path elements are present and correctly spelled. Any inaccuracies may lead to data retrieval issues.
+
+* **Bucket Access Permissions:** Ensure that you have the necessary access permissions to the bucket from which you are ingesting data. Lack of permissions can hinder the data retrieval process.
+
+* **Wildcard Accuracy:** If using wildcards, double-check their spelling and configuration. Misconfigured wildcards may result in unintended data ingestion.
+
+Making Avro Files Accessible to Workers
+=======================================
+
+To give workers access to files, every node must have the same view of the storage being used.
+
+The following apply for Avro files to be accessible to workers:
+
+* For files hosted on NFS, ensure that the mount is accessible from all servers.
+
+* For HDFS, ensure that SQream servers have access to the HDFS name node with the correct **user-id**. For more information, see :ref:`hdfs`.
+
+* For S3, ensure network access to the S3 endpoint. For more information, see :ref:`s3`.
+
+For more information about restricted worker access, see :ref:`workload_manager`.
+
+Preparing Your Table
+====================
+
+You can build your table structure on both local and foreign tables:
+
+.. contents:: 
+   :local:
+   :depth: 1
+   
+Creating a Table
+----------------
+   
+Before loading data, you must build a ``CREATE TABLE`` statement that corresponds to the structure of the inserted file.
+
+The example in this section is based on the source ``nba.avro`` table shown below:
+
+.. csv-table:: nba.avro
+   :file: nba-t10.csv
+   :widths: auto
+   :header-rows: 1 
+
+The following example shows a ``CREATE TABLE`` statement whose structure matches the **nba.avro** file:
+
+.. code-block:: postgres
+   
+	CREATE TABLE nba (
+	  name TEXT(40),
+	  team TEXT(40),
+	  number BIGINT,
+	  position TEXT(2),
+	  age BIGINT,
+	  height TEXT(4),
+	  weight BIGINT,
+	  college TEXT(40),
+	  salary FLOAT
+	)
+	WRAPPER
+	  avro_fdw
+	OPTIONS
+	  (LOCATION = 's3://sqream-docs/nba.avro');
+
+.. tip:: 
+
+   An exact match must exist between the SQream and Avro types. For unsupported column types, you can set the type to any type and exclude it from subsequent queries.
+
+.. note:: The **nba.avro** file is stored on S3 at ``s3://sqream-docs/nba.avro``.
+
+Creating a Foreign Table
+------------------------
+
+Before loading data, you must build a ``CREATE FOREIGN TABLE`` statement that corresponds to the structure of the inserted file.
+
+The example in this section is based on the source ``nba.avro`` table shown below:
+
+.. csv-table:: nba.avro
+   :file: nba-t10.csv
+   :widths: auto
+   :header-rows: 1 
+
+The following example shows a ``CREATE FOREIGN TABLE`` statement whose structure matches the **nba.avro** file:
+
+.. code-block:: postgres
+   
+	CREATE FOREIGN TABLE nba (
+	  name TEXT(40),
+	  team TEXT(40),
+	  number BIGINT,
+	  position TEXT(2),
+	  age BIGINT,
+	  height TEXT(4),
+	  weight BIGINT,
+	  college TEXT(40),
+	  salary FLOAT
+	)
+	WRAPPER
+	  avro_fdw
+	OPTIONS
+	  (LOCATION = 's3://sqream-docs/nba.avro');
+
+.. tip:: 
+
+   An exact match must exist between the SQream and Avro types. For unsupported column types, you can set the type to any type and exclude it from subsequent queries.
+
+.. note:: The **nba.avro** file is stored on S3 at ``s3://sqream-docs/nba.avro``.
+
+.. note:: The examples in the sections above are identical except for the syntax used to create the tables.
+
+Mapping Between SQream and Avro Data Types
+==========================================
+
+Mapping between SQream and Avro data types depends on the Avro data type:
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+Primitive Data Types
+--------------------
+
+The following table shows the supported **Primitive** data types:
+
++-------------+------------------------------------------------------+
+| Avro Type   | SQream Type                                          |
+|             +-----------+---------------+-----------+--------------+
+|             | Number    | Date/Datetime | String    | Boolean      |
++=============+===========+===============+===========+==============+
+| ``null``    | Supported | Supported     | Supported | Supported    |
++-------------+-----------+---------------+-----------+--------------+
+| ``boolean`` |           |               | Supported | Supported    |
++-------------+-----------+---------------+-----------+--------------+
+| ``int``     | Supported |               | Supported |              |
++-------------+-----------+---------------+-----------+--------------+
+| ``long``    | Supported |               | Supported |              |
++-------------+-----------+---------------+-----------+--------------+
+| ``float``   | Supported |               | Supported |              |
++-------------+-----------+---------------+-----------+--------------+
+| ``double``  | Supported |               | Supported |              |
++-------------+-----------+---------------+-----------+--------------+
+| ``bytes``   |           |               |           |              |
++-------------+-----------+---------------+-----------+--------------+
+| ``string``  |           | Supported     | Supported |              |
++-------------+-----------+---------------+-----------+--------------+
+
+Complex Data Types
+------------------
+
+The following table shows the supported **Complex** data types:
+
++------------+-------------------------------------------------------+
+|            | SQream Type                                           |
+|            +------------+----------------+-------------+-----------+
+|Avro Type   | Number     |  Date/Datetime |   String    | Boolean   |
++============+============+================+=============+===========+
+| ``record`` |            |                |             |           |
++------------+------------+----------------+-------------+-----------+
+| ``enum``   |            |                | Supported   |           |
++------------+------------+----------------+-------------+-----------+
+| ``array``  |            |                |             |           |
++------------+------------+----------------+-------------+-----------+
+| ``map``    |            |                |             |           |
++------------+------------+----------------+-------------+-----------+
+| ``union``  |  Supported | Supported      | Supported   | Supported |
++------------+------------+----------------+-------------+-----------+
+| ``fixed``  |            |                |             |           |
++------------+------------+----------------+-------------+-----------+
+
+Logical Data Types
+------------------
+
+The following table shows the supported **Logical** data types:
+
++----------------------------+-------------------------------------------------+
+| Avro Type                  | SQream Type                                     |
+|                            +-----------+---------------+-----------+---------+
+|                            | Number    | Date/Datetime | String    | Boolean |
++============================+===========+===============+===========+=========+
+| ``decimal``                | Supported |               | Supported |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``uuid``                   |           |               | Supported |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``date``                   |           | Supported     | Supported |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``time-millis``            |           |               |           |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``time-micros``            |           |               |           |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``timestamp-millis``       |           | Supported     | Supported |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``timestamp-micros``       |           | Supported     | Supported |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``local-timestamp-millis`` |           |               |           |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``local-timestamp-micros`` |           |               |           |         |
++----------------------------+-----------+---------------+-----------+---------+
+| ``duration``               |           |               |           |         |
++----------------------------+-----------+---------------+-----------+---------+
+
+.. note:: Number types include **tinyint**, **smallint**, **int**, **bigint**, **real**, **float**, and **numeric**. String types include **text**.
+
+Mapping Objects to Rows
+=======================
+
+When mapping objects to rows, each Avro object or message must contain one ``record`` type object corresponding to a single row in SQream. The ``record`` fields are associated by name to their target table columns. Additional unmapped fields will be ignored. Note that using the JSONPath option overrides this.
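+
+The following is a minimal sketch (the field list is illustrative) of an Avro record schema whose fields map by name to the ``nba`` table columns used throughout this page:
+
+.. code-block:: json
+
+   {
+     "type": "record",
+     "name": "nba",
+     "fields": [
+       {"name": "name", "type": "string"},
+       {"name": "team", "type": "string"},
+       {"name": "number", "type": "long"},
+       {"name": "age", "type": "long"},
+       {"name": "salary", "type": ["null", "double"]}
+     ]
+   }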
+
+Ingesting Data into SQream
+==========================
+
+.. contents:: 
+   :local:
+   :depth: 1
+   
+Syntax
+------
+
+You can ingest data from an Avro file into SQream using the following syntax:
+
+.. code-block:: postgres
+
+   COPY
+     [schema_name.]table_name
+   FROM
+   WRAPPER
+     fdw_name;
+
+For Avro files, the foreign data wrapper is:
+
+.. code-block:: postgres
+
+   avro_fdw
+   
+Example
+-------
+
+The following shows the general ``COPY`` syntax, including options:
+
+.. code-block:: postgres
+   
+   COPY
+     <table_name>
+   FROM
+   WRAPPER
+     fdw_name
+   OPTIONS
+     ([<option> [, ...]]);
+
+The following is an example of loading data from an Avro file into SQream:
+
+.. code-block:: postgres
+
+    COPY nba
+    FROM WRAPPER avro_fdw
+    OPTIONS
+    (
+      LOCATION =  's3://sqream-docs/nba.avro'
+    );
+	  
+For more examples, see :ref:`additional_examples`.
+
+Parameters
+==========
+
+The following table shows the Avro parameter:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - Description
+   * - ``schema_name``
+     - The schema name for the table. Defaults to ``public`` if not specified.
+
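+For example, a hypothetical load into an ``nba`` table in a ``staging`` schema (given as a schema prefix, per the syntax above):
+
+.. code-block:: postgres
+
+   COPY staging.nba
+   FROM WRAPPER avro_fdw
+   OPTIONS (LOCATION = 's3://sqream-docs/nba.avro');
+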
+Best Practices
+==============
+
+Because foreign tables do not automatically verify the file integrity or structure, SQream recommends manually verifying your table output when ingesting Avro files into SQream. This lets you determine if your table output is identical to your originally inserted table.
+
+The following is an example of the output based on the **nba.avro** table:
+
+.. code-block:: psql
+   
+	SELECT * FROM ext_nba LIMIT 10;
+	
+	Name          | Team           | Number | Position | Age | Height | Weight | College           | Salary  
+	--------------+----------------+--------+----------+-----+--------+--------+-------------------+---------
+	Avery Bradley | Boston Celtics |      0 | PG       |  25 | 6-2    |    180 | Texas             |  7730337
+	Jae Crowder   | Boston Celtics |     99 | SF       |  25 | 6-6    |    235 | Marquette         |  6796117
+	John Holland  | Boston Celtics |     30 | SG       |  27 | 6-5    |    205 | Boston University |         
+	R.J. Hunter   | Boston Celtics |     28 | SG       |  22 | 6-5    |    185 | Georgia State     |  1148640
+	Jonas Jerebko | Boston Celtics |      8 | PF       |  29 | 6-10   |    231 |                   |  5000000
+	Amir Johnson  | Boston Celtics |     90 | PF       |  29 | 6-9    |    240 |                   | 12000000
+	Jordan Mickey | Boston Celtics |     55 | PF       |  21 | 6-8    |    235 | LSU               |  1170960
+	Kelly Olynyk  | Boston Celtics |     41 | C        |  25 | 7-0    |    238 | Gonzaga           |  2165160
+	Terry Rozier  | Boston Celtics |     12 | PG       |  22 | 6-2    |    190 | Louisville        |  1824360
+	Marcus Smart  | Boston Celtics |     36 | PG       |  22 | 6-4    |    220 | Oklahoma State    |  3431040
+
+.. note:: If your table output has errors, verify that the structure of the Avro files correctly corresponds to the foreign table structure that you created.
+
+.. _additional_examples:
+
+Additional Examples
+===================
+
+This section includes the following additional examples of loading data into SQream:
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+Omitting Unsupported Column Types
+---------------------------------
+
+When loading data, you can omit columns using the ``NULL as`` argument. Use it to exclude unsupported columns from queries that access foreign tables, so that these columns are never read and do not generate a "type mismatch" error.
+
+In the example below, the ``Position`` column is not supported due to its type.
+
+.. code-block:: postgres
+   
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  Name,
+	  Team,
+	  Number,
+	  NULL as Position,
+	  Age,
+	  Height,
+	  Weight,
+	  College,
+	  Salary
+	FROM
+	  ext_nba;
+
+Modifying Data Before Loading
+-----------------------------
+
+One of the main reasons for staging data using the ``FOREIGN TABLE`` argument is to examine and modify table contents before loading it into SQream.
+
+For example, we can replace pounds with kilograms using the :ref:`create_table_as` statement.
+
+In the example below, the ``Position`` column is set to the default ``NULL``.
+
+.. code-block:: postgres
+   
+   CREATE TABLE nba AS
+   SELECT
+     name,
+     team,
+     number,
+     NULL as Position,
+     age,
+     height,
+     (weight / 2.205) as weight,
+     college,
+     salary
+   FROM
+     ext_nba
+   ORDER BY
+     weight;
+
+Loading a Table from a Directory of Avro Files on HDFS
+------------------------------------------------------
+
+The following is an example of loading a table from a directory of Avro files on HDFS:
+
+.. code-block:: postgres
+
+	CREATE FOREIGN TABLE ext_users (
+	  id INT NOT NULL,
+	  name TEXT(30) NOT NULL,
+	  email TEXT(50) NOT NULL
+	)
+	WRAPPER
+	  avro_fdw
+	OPTIONS
+	  (
+	    LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.avro'
+	  );
+   
+	CREATE TABLE
+	  users AS 
+	SELECT
+	  * 
+	FROM
+	  ext_users;
+
+For more configuration option examples, navigate to the :ref:`create_foreign_table` page and see the **Parameters** table.
+
+Loading a Table from a Directory of Avro Files on S3
+----------------------------------------------------
+
+The following is an example of loading a table from a directory of Avro files on S3:
+
+.. code-block:: postgres
+
+	CREATE FOREIGN TABLE ext_users (
+	  id INT NOT NULL,
+	  name TEXT(30) NOT NULL,
+	  email TEXT(50) NOT NULL
+	)
+	WRAPPER
+	  avro_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/users/*.avro',
+	    AWS_ID = 'our_aws_id',
+	    AWS_SECRET = 'our_aws_secret'
+	  );
+   
+   
+	CREATE TABLE
+	  users AS
+	SELECT
+	  *
+	FROM
+	  ext_users;
+   
+   
diff --git a/data_ingestion/connection_string.ini b/data_ingestion/connection_string.ini
new file mode 100644
index 000000000..21e35c679
--- /dev/null
+++ b/data_ingestion/connection_string.ini
@@ -0,0 +1,72 @@
+# Postgresql, Oracle, Teradata, SAP HANA, Microsoft SQL Server, Sybase and SQreamDB Connection Strings
+# (only one source connection string should be specified)
+
+# postgres (and also Greenplum)
+connectionStringSource=jdbc:postgresql://<host>:<port>/<database_name>?user=<username>&password=<password>&ssl=<false/true>
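+# Example with hypothetical values: connectionStringSource=jdbc:postgresql://pg.example.com:5432/sales?user=loader&password=secret&ssl=false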
+
+# oracle
+connectionStringSource=jdbc:oracle:thin:@//<host>:<port>/<service_name>?user=<username>&password=<password>&ssl=<false/true>
+
+# Oracle Autonomous Database
+
+connectionStringSource=jdbc:oracle:thin:@<tns_alias>?tns_admin=<wallet_directory>&user=<username>&password=<password>
+
+# teradata
+connectionStringSource=jdbc:teradata://<host>/DATABASE=<database_name>,DBS_PORT=<port>,user=<username>,password=<password>
+
+# sap hana
+connectionStringSource=jdbc:sap://<host>:<port>/?user=<username>&password=<password>
+
+# microsoft sql server
+connectionStringSource=jdbc:sqlserver://<host>:<port>;databaseName=<database_name>;user=<username>;password=<password>;encrypt=<false/true>;trustServerCertificate=<false/true>
+
+# sybase
+connectionStringSource=jdbc:sybase:Tds:<host>:<port>/<database_name>?user=<username>&password=<password>
+
+# sqream
+connectionStringSqream=jdbc:Sqream://<host>:<port>/<database_name>;cluster=<false/true>;user=<username>;password=<password>
+
+
+
+# Catalog Database Parameters
+
+# Connection string (only one catalog connection string should be specified)
+# Catalog database connection string on Oracle:
+connectionStringCatalog=jdbc:oracle:thin:@//<host>:<port>/<service_name>?user=<username>&password=<password>
+
+# Catalog database connection string on SQreamDB:
+connectionStringCatalog=jdbc:Sqream://<host>:<port>/<database_name>;cluster=<false/true>;user=<username>&password=<password>
+
+
+
+# CDC and Incremental Parameters
+cdcCatalogTable=public.CDC_TABLES
+cdcTrackingTable=public.CDC_TRACKING
+cdcPrimaryKeyTable=public.CDC_TABLE_PRIMARY_KEYS
+
+# Summary table
+loadSummaryTable=public.SQLOAD_SUMMARY
+
+
+
+# OPTIONAL - Data transfer options
+filter=1=1
+count=true
+limit=2000
+threadCount=1
+rowid=false
+batchSize=500
+fetchSize=100000
+chunkSize=0
+caseSensitive=false
+truncate=true
+drop=true
+loadTypeName=full
+cdcDelete=true
+usePartitions=false
+lockCheck=false
+lockTable=true
+loadDttm=false
+useDbmsLob=false
+
+# more flags
\ No newline at end of file
diff --git a/data_ingestion/csv.rst b/data_ingestion/csv.rst
index f44c3c9e9..5ea3733fa 100644
--- a/data_ingestion/csv.rst
+++ b/data_ingestion/csv.rst
@@ -1,17 +1,30 @@
 .. _csv:
 
-**********************
-Inserting Data from a CSV File
-**********************
+***
+CSV
+***
 
-This guide covers inserting data from CSV files into SQream DB using the :ref:`copy_from` method. 
+This guide covers ingesting data from CSV files into SQream DB using the :ref:`copy_from` method. 
 
 
-.. contents:: In this topic:
+.. contents:: 
    :local:
+   :depth: 1
 
-1. Prepare CSVs
-=====================
+Foreign Data Wrapper Prerequisites
+===================================
+
+Before proceeding, ensure that the following Foreign Data Wrapper (FDW) prerequisites are met:
+
+* **File Existence:** Verify that the file you are ingesting data from exists at the specified path.
+
+* **Path Accuracy:** Confirm that all path elements are present and correctly spelled. Any inaccuracies may lead to data retrieval issues.
+
+* **Bucket Access Permissions:** Ensure that you have the necessary access permissions to the bucket from which you are ingesting data. Lack of permissions can hinder the data retrieval process.
+
+* **Wildcard Accuracy:** If using wildcards, double-check their spelling and configuration. Misconfigured wildcards may result in unintended data ingestion.
+
+Prepare CSVs
+============
 
 Prepare the source CSVs, with the following requirements:
 
@@ -44,8 +57,8 @@ Prepare the source CSVs, with the following requirements:
    .. note:: If a text field is quoted but contains no content (``""``) it is considered an empty text field. It is not considered ``NULL``.
 
 
-2. Place CSVs where SQream DB workers can access
-=======================================================
+Place CSVs where SQream DB workers can access
+=============================================
 
 During data load, the :ref:`copy_from` command can run on any worker (unless explicitly specified with the :ref:`workload_manager`).
 It is important that every node has the same view of the storage being used - meaning, every SQream DB worker should have access to the files.
@@ -56,8 +69,8 @@ It is important that every node has the same view of the storage being used - me
 
 * For S3, ensure network access to the S3 endpoint. See our :ref:`s3` guide for more information.
 
-3. Figure out the table structure
-===============================================
+Figure out the table structure
+==============================
 
 Prior to loading data, you will need to write out the table structure, so that it matches the file structure.
 
@@ -81,22 +94,22 @@ We will make note of the file structure to create a matching ``CREATE TABLE`` st
 
 .. code-block:: postgres
    
-   CREATE TABLE nba
-   (
-      Name varchar(40),
-      Team varchar(40),
-      Number tinyint,
-      Position varchar(2),
-      Age tinyint,
-      Height varchar(4),
-      Weight real,
-      College varchar(40),
-      Salary float
-    );
-
-
-4. Bulk load the data with COPY FROM
-====================================
+	CREATE TABLE
+	  nba (
+		Name text(40),
+		Team text(40),
+		Number tinyint,
+		Position text(2),
+		Age tinyint,
+		Height text(4),
+		Weight real,
+		College text(40),
+		Salary float
+	  );
+
+
+Bulk load the data with COPY FROM
+=================================
 
 The CSV is a standard CSV, but with two differences from SQream DB defaults:
 
@@ -105,64 +118,106 @@ The CSV is a standard CSV, but with two differences from SQream DB defaults:
 * The first row of the file is a header containing column names, which we'll want to skip.
 
 .. code-block:: postgres
-   
-   COPY nba
-      FROM 's3://sqream-demo-data/nba.csv'
-      WITH RECORD DELIMITER '\r\n'
-           OFFSET 2;
 
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    RECORD_DELIMITER = '\r\n',
+	    OFFSET = 2
+	  );
 
-Repeat steps 3 and 4 for every CSV file you want to import.
-
-
-Loading different types of CSV files
-=======================================
 
-:ref:`copy_from` contains several configuration options. See more in :ref:`the COPY FROM elements section`.
+Repeat the table structure and ``COPY FROM`` steps above for every CSV file you want to import.
 
 
-Loading a standard CSV file from a local filesystem
----------------------------------------------------------
+Loading a standard CSV File From a Local Filesystem
+---------------------------------------------------
 
 .. code-block:: postgres
    
-   COPY table_name FROM '/home/rhendricks/file.csv';
+	COPY
+	  table_name 
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS (LOCATION = '/home/rhendricks/file.csv');
 
 
 Loading a PSV (pipe separated value) file
--------------------------------------------
+-----------------------------------------
 
 .. code-block:: postgres
    
-   COPY table_name FROM '/home/rhendricks/file.psv' WITH DELIMITER '|';
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.psv',
+	    DELIMITER = '|'
+	);
 
 Loading a TSV (tab separated value) file
--------------------------------------------
+----------------------------------------
 
 .. code-block:: postgres
    
-   COPY table_name FROM '/home/rhendricks/file.tsv' WITH DELIMITER '\t';
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.tsv',
+	    DELIMITER = '\t'
+	);
 
 Loading a text file with non-printable delimiter
------------------------------------------------------
+------------------------------------------------
 
 In the file below, the separator is ``DC1``, which is represented by ASCII 17 decimal or 021 octal.
 
 .. code-block:: postgres
    
-   COPY table_name FROM 'file.txt' WITH DELIMITER E'\021';
-
-Loading a text file with multi-character delimiters
------------------------------------------------------
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    DELIMITER = E'\021'
+	);
+
+Loading a Text File With Multi-Character Delimiters
+---------------------------------------------------
 
 In the file below, the separator is ``'|``.
 
 .. code-block:: postgres
    
-   COPY table_name FROM 'file.txt' WITH DELIMITER '''|';
-
-Loading files with a header row
------------------------------------
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    DELIMITER = '''|'
+	);
+
+Loading Files With a Header Row
+-------------------------------
 
 Use ``OFFSET`` to skip rows.
 
@@ -170,89 +225,174 @@ Use ``OFFSET`` to skip rows.
 
 .. code-block:: postgres
 
-   COPY  table_name FROM 'filename.psv' WITH DELIMITER '|' OFFSET  2;
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    DELIMITER = '|',
+	    OFFSET = 2
+	);
 
 .. _changing_record_delimiter:
 
-Loading files formatted for Windows (``\r\n``)
----------------------------------------------------
+Loading Files Formatted for Windows (``\r\n``)
+----------------------------------------------
 
 .. code-block:: postgres
 
-   COPY table_name FROM 'filename.psv' WITH DELIMITER '|' RECORD DELIMITER '\r\n';
-
-Loading a file from a public S3 bucket
-------------------------------------------
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    RECORD_DELIMITER = '\r\n',
+	    DELIMITER = '|'
+	);
+
+Loading a File From a Public S3 Bucket
+--------------------------------------
 
 .. note:: The bucket must be publicly available and objects can be listed
 
 .. code-block:: postgres
 
-   COPY nba FROM 's3://sqream-demo-data/nba.csv' WITH OFFSET 2 RECORD DELIMITER '\r\n';
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    OFFSET = 2
+	);
 
 Loading files from an authenticated S3 bucket
----------------------------------------------------
+---------------------------------------------
 
 .. code-block:: postgres
 
-   COPY nba FROM 's3://secret-bucket/*.csv' WITH OFFSET 2 RECORD DELIMITER '\r\n' AWS_ID '12345678' AWS_SECRET 'super_secretive_secret';
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/nba.csv',
+	    OFFSET = 2,
+	    AWS_ID = '12345678', 
+	    AWS_SECRET = 'super_secretive_secret'
+	);
 
 .. _hdfs_copy_from_example:
 
 Loading files from an HDFS storage
----------------------------------------------------
+----------------------------------
 
 .. code-block:: postgres
 
-   COPY nba FROM 'hdfs://hadoop-nn.piedpiper.com/rhendricks/*.csv' WITH OFFSET 2 RECORD DELIMITER '\r\n';
-
+	COPY
+	  nba
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/*.csv',
+	    RECORD_DELIMITER = '\r\n',
+	    OFFSET = 2
+	);
 
 Saving rejected rows to a file
-----------------------------------
+------------------------------
 
 See :ref:`capturing_rejected_rows` for more information about the error handling capabilities of ``COPY FROM``.
 
 .. code-block:: postgres
 
-   COPY  table_name FROM 'filename.psv'  WITH DELIMITER '|'
-                                         ERROR_LOG  '/temp/load_error.log' -- Save error log
-                                         ERROR_VERBOSITY 0; -- Only save rejected rows
-
+	COPY
+	  t
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = '/tmp/file.psv',
+	    DELIMITER = '|',
+	    CONTINUE_ON_ERROR = True,
+	    ERROR_LOG = '/temp/load_error.log', -- Save error log
+	    REJECTED_DATA = '/temp/load_rejected.log' -- Save only rejected rows
+	  );
 
 Stopping the load if a certain number of rows were rejected
-------------------------------------------------------------------
+-----------------------------------------------------------
 
 .. code-block:: postgres
 
-   COPY  table_name  FROM  'filename.csv'   WITH  delimiter  '|'  
-                                            ERROR_LOG  '/temp/load_err.log' -- Save error log
-                                            OFFSET 2 -- skip header row
-                                            LIMIT  100 -- Only load 100 rows
-                                            STOP AFTER 5 ERRORS; -- Stop the load if 5 errors reached
+	COPY
+	  table_name
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = 'filename.csv',
+	    DELIMITER = '|',
+	    ERROR_LOG = '/temp/load_err.log', -- Save error log
+	    OFFSET = 2, -- Skip header row
+	    LIMIT = 100, -- Only load 100 rows
+	    CONTINUE_ON_ERROR = true,
+	    ERROR_COUNT = 5 -- Stop the load after 5 errors
+	  );
 
 Load CSV files from a set of directories
-------------------------------------------
+----------------------------------------
 
 Use glob patterns (wildcards) to load multiple files to one table.
 
 .. code-block:: postgres
 
-   COPY table_name  from  '/path/to/files/2019_08_*/*.csv';
+	COPY
+	  table_name
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = '/path/to/files/2019_08_*/*.csv'
+	  );
 
 
 Rearrange destination columns
----------------------------------
+-----------------------------
 
 When the source of the files does not match the table structure, tell the ``COPY`` command what the order of columns should be
 
 .. code-block:: postgres
 
-   COPY table_name (fifth, first, third) FROM '/path/to/files/*.csv';
+	COPY
+	  table_name (fifth, first, third)
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = '/path/to/files/*.csv'
+	  );
 
 .. note:: Any column not specified will revert to its default value or ``NULL`` value if nullable
 
 Loading non-standard dates
-----------------------------------
+--------------------------
 
 If files contain dates not formatted as ``ISO8601``, tell ``COPY`` how to parse the column. After parsing, the date will appear as ``ISO8601`` inside SQream DB.
 
@@ -260,6 +400,15 @@ In this example, ``date_col1`` and ``date_col2`` in the table are non-standard.
 
 .. code-block:: postgres
 
-   COPY table_name FROM '/path/to/files/*.csv' WITH PARSERS 'date_col1=YMD,date_col2=MDY,date_col3=default';
+	COPY
+	  table_name
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (
+	    LOCATION = '/path/to/files/*.csv',
+	    DATETIME_FORMAT = 'date_col1=YMD,date_col2=MDY,date_col3=default'
+	  );
 
 .. tip:: The full list of supported date formats can be found under the :ref:`Supported date formats section` of the :ref:`copy_from` reference.
diff --git a/data_ingestion/index.rst b/data_ingestion/index.rst
index 83aca40ea..b80f9a8c6 100644
--- a/data_ingestion/index.rst
+++ b/data_ingestion/index.rst
@@ -1,18 +1,30 @@
 .. _data_ingestion:
 
-*************************
+**********************
 Data Ingestion Sources
-*************************
-The **Data Ingestion Sources** provides information about the following:
+**********************
+
+The **Data Ingestion Sources** page provides information about the following:
+
+* :ref:`Overview`
+* :ref:`avro`
+* :ref:`csv`
+* :ref:`parquet`
+* :ref:`orc`
+* :ref:`json`
+* :ref:`sqloader`
 
 .. toctree::
    :maxdepth: 1
    :glob:
+   :hidden:
    
-   inserting_data
+   ingesting_data
+   avro
    csv
    parquet
    orc
-   oracle
+   json
+   sqloader
+  
 
-For information about database tools and interfaces that SQream supports, see `Third Party Tools `_.
diff --git a/data_ingestion/ingesting_data.rst b/data_ingestion/ingesting_data.rst
new file mode 100644
index 000000000..68d114e13
--- /dev/null
+++ b/data_ingestion/ingesting_data.rst
@@ -0,0 +1,500 @@
+.. _ingesting_data:
+
+********
+Overview
+********
+
+The **Ingesting Data Overview** page provides basic information useful when ingesting data into SQream from a variety of sources and locations, and describes the following:
+
+.. contents::
+   :local:
+   :depth: 1
+   
+Getting Started
+===============
+
+SQream supports ingesting data using the following methods:
+
+* Executing the ``INSERT`` statement using a client driver.
+
+* Executing the ``COPY FROM`` statement or ingesting data from foreign tables:
+
+  * Local filesystem and locally mounted network filesystems
+  * Ingesting Data using the Amazon S3 object storage service
+  * Ingesting Data using an HDFS data storage system
+
+SQream supports loading files from the following formats:
+
+* Text - CSV, TSV, and PSV
+* Parquet
+* ORC
+* Avro
+* JSON
+
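+For instance, a minimal sketch of both statement-based ingestion paths might look like the following (the table, column, and file names are illustrative):
+
+.. code-block:: postgres
+
+	-- Row-oriented ingestion, typically issued through a client driver
+	INSERT INTO users (id, name) VALUES (1, 'Avery');
+
+	-- Bulk ingestion from a CSV file with COPY FROM
+	COPY
+	  users
+	FROM
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (LOCATION = '/home/rhendricks/users.csv');
+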
+For more information, see the following:
+
+* Using the ``INSERT`` statement - :ref:`insert`
+
+* Using client drivers - :ref:`Client drivers`
+
+* Using the ``COPY FROM`` statement - :ref:`copy_from`
+
+* Using the Amazon S3 object storage service - :ref:`s3`
+
+* Using the HDFS data storage system - :ref:`hdfs`
+
+* Loading data from foreign tables - :ref:`foreign_tables`
+
+Data Loading Considerations
+===========================
+
+The **Data Loading Considerations** section describes the following:
+
+.. contents:: 
+   :local:
+   :depth: 1
+   
+Verifying Data and Performance after Loading
+--------------------------------------------
+
+Like many RDBMSs, SQream has its own set of recommended best practices for table design and query optimization. After loading data, verify the following:
+
+* That your data is structured as you expect (row counts, data types, formatting, content), as shown in the sample checks after this list.
+
+* That your query performance is adequate.
+
+* That you followed the table design best practices (:ref:`Optimization and Best Practices`).
+
+* That you've tested and verified that your applications work.
+
+* That your data types have not been over-provisioned.
+
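+As a quick sanity check after a load, you might run something like the following (the table name is illustrative):
+
+.. code-block:: postgres
+
+	-- Confirm the expected row count
+	SELECT COUNT(*) FROM nba;
+
+	-- Inspect a small sample for formatting and content
+	SELECT * FROM nba LIMIT 5;
+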
+File Source Location when Loading
+---------------------------------
+
+While you are loading data, the ``COPY FROM`` statement can run on any worker. If you are running multiple nodes, verify that all nodes have the same view of the source storage. Loading data from a local file that exists on only one node and not on shared storage may fail. If required, you can also control which node a statement runs on using the Workload Manager.
+
+For more information, see the following:
+
+* :ref:`copy_from`
+
+* :ref:`workload_manager`
+
+Supported Load Methods
+----------------------
+
+You can use the ``COPY FROM`` syntax to load CSV files.
+
+.. note:: The ``COPY FROM`` statement cannot be used for loading data from Parquet and ORC files.
+
+You can use foreign tables to load text files, Parquet, and ORC files, and to transform your data before generating a full table, as described in the following table:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   :stub-columns: 1
+   
+   * - Method/File Type
+     - Text (CSV)
+     - Parquet
+     - ORC
+     - Streaming Data
+   * - COPY FROM
+     - Supported
+     - Not supported
+     - Not supported
+     - Not supported
+   * - Foreign tables
+     - Supported
+     - Supported
+     - Supported
+     - Not supported
+   * - INSERT
+     - Not supported
+     - Not supported
+     - Not supported
+     - Supported (Python, JDBC, Node.JS)
+	 
+For more information, see the following:
+
+* :ref:`COPY FROM`
+
+* :ref:`Foreign tables`
+
+* :ref:`INSERT`
+
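+Because ``COPY FROM`` does not handle Parquet or ORC, those formats go through a foreign table. A minimal sketch, assuming illustrative table, column, and file names:
+
+.. code-block:: postgres
+
+	CREATE FOREIGN TABLE ext_events (
+	  id INT NOT NULL,
+	  name TEXT(30)
+	)
+	WRAPPER
+	  parquet_fdw
+	OPTIONS
+	  (LOCATION = '/data/events/*.parquet');
+
+	-- Materialize the staged data as a regular table
+	CREATE TABLE events AS SELECT * FROM ext_events;
+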
+Unsupported Data Types
+----------------------
+
+SQream does not support certain features that are supported by other databases, such as ``ARRAY``, ``BLOB``, ``ENUM``, and ``SET``. You must convert these data types before loading them. For example, you can store ``ENUM`` as ``TEXT``.
+
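+For example, a minimal sketch of storing an ``ENUM``-like value as ``TEXT`` (the table and column names are illustrative):
+
+.. code-block:: postgres
+
+	CREATE TABLE orders (
+	  id INT NOT NULL,
+	  status TEXT(10) NOT NULL -- an ENUM-like value such as 'new', 'paid', or 'shipped'
+	);
+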
+Handling Extended Errors
+------------------------
+
+While you can use foreign tables to load CSVs, the ``COPY FROM`` statement provides more fine-grained error handling options and extended support for non-standard CSVs with multi-character delimiters, alternate timestamp formats, and more.
+
+For more information, see :ref:`foreign tables`.
+  
+Foreign Data Wrapper Best Practice
+==================================
+
+A recommended approach when working with :ref:`foreign_tables` and Foreign Data Wrappers (FDWs) is to store files belonging to distinct file families and files with similar schemas in separate folders.
+
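+For example (all paths and names here are illustrative), each file family gets its own folder and its own foreign table:
+
+.. code-block:: postgres
+
+	CREATE FOREIGN TABLE ext_sales (id INT NOT NULL, total DOUBLE)
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (LOCATION = '/data/sales/*.csv');
+
+	CREATE FOREIGN TABLE ext_users (id INT NOT NULL, name TEXT(30))
+	WRAPPER
+	  csv_fdw
+	OPTIONS
+	  (LOCATION = '/data/users/*.csv');
+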
+Best Practices for CSV
+----------------------
+
+Text files, such as CSV, rarely conform to `RFC 4180 `_ , so you may need to make the following modifications:
+
+* Use ``OFFSET 2`` for files containing header rows.
+
+* You can capture failed rows in a log file for later analysis, or skip them. See :ref:`capturing_rejected_rows` for information on skipping rejected rows.
+
+* You can modify record delimiters (new lines) using the :ref:`RECORD DELIMITER` syntax.
+
+* If the date formats deviate from ISO 8601, refer to the :ref:`copy_date_parsers` section for overriding the default parsing.
+
+* *(Optional)* You can quote fields in a CSV using double-quotes (``"``).
+
+.. note:: You must quote any field containing a new line or another double-quote character.
+
+* If a field is quoted, any double quote within it must itself be doubled, following the string literals quoting rules. For example, to encode ``What are "birds"?``, the field should appear as ``"What are ""birds""?"``. For more information, see :ref:`string literals quoting rules`.
+
+* Field delimiters do not have to be a displayable ASCII character. For all supported field delimiters, see :ref:`field_delimiters`.
+
+Best Practices for Parquet
+--------------------------
+
+The following list shows the best practices when ingesting data from Parquet files:
+
+* You must load Parquet files through :ref:`foreign_tables`. Note that the destination table must have the same number of columns as the source files.
+
+* Parquet files support **predicate pushdown**. When a query is issued over Parquet files, SQream uses row-group metadata to determine which row-groups in a file must be read for a particular query, and the row indexes can narrow the search to a particular set of rows, as illustrated below.
+
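+For example, assuming a Parquet-backed foreign table such as ``ext_nba``, a selective predicate lets SQream skip any row-group whose metadata rules out matching rows:
+
+.. code-block:: postgres
+
+	SELECT Name, Salary FROM ext_nba WHERE Salary > 10000000;
+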
+Supported Types and Behavior Notes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Unlike the ORC format, the SQream column types must exactly match the Parquet source types, as shown in the table below:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   :stub-columns: 1
+   
+   * -   SQream DB type →
+   
+         Parquet source
+     - ``BOOL``
+     - ``TINYINT``
+     - ``SMALLINT``
+     - ``INT``
+     - ``BIGINT``
+     - ``REAL``
+     - ``DOUBLE``
+     - Text [#f0]_
+     - ``DATE``
+     - ``DATETIME``
+   * - ``BOOLEAN``
+     - Supported 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``INT16``
+     - 
+     - 
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``INT32``
+     - 
+     - 
+     - 
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``INT64``
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``FLOAT``
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - 
+     - 
+     - 
+     - 
+   * - ``DOUBLE``
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - 
+     - 
+     - 
+   * - ``BYTE_ARRAY`` [#f2]_
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - 
+     - 
+   * - ``INT96`` [#f3]_
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported [#f4]_
+
+If a Parquet file has an unsupported type, such as ``enum``, ``uuid``, ``time``, ``json``, ``bson``, ``lists``, or ``maps``, but the table does not reference this data (i.e., the data does not appear in the :ref:`SELECT` query), the statement will succeed. If the table **does** reference such a column, an error is displayed explaining that the type is not supported; the column may instead be omitted from the query.
+
+Best Practices for ORC
+----------------------
+
+The following list shows the best practices when ingesting data from ORC files:
+
+* You must load ORC files through :ref:`foreign_tables`. Note that the destination table must have the same number of columns as the source files.
+
+* ORC files support **predicate pushdown**. When a query is issued over ORC files, SQream uses ORC metadata to determine which stripes in a file need to be read for a particular query and the row indexes can narrow the search to a particular set of 10,000 rows.
+
+Type Support and Behavior Notes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You must load ORC files through a foreign table. Note that the destination table must have the same number of columns as the source files.
+
+For more information, see :ref:`foreign_tables`.
+
+The types should match to some extent within the same "class", as shown in the following table:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   :stub-columns: 1
+   
+   * -   SQream DB Type →
+   
+         ORC Source
+     - ``BOOL``
+     - ``TINYINT``
+     - ``SMALLINT``
+     - ``INT``
+     - ``BIGINT``
+     - ``REAL``
+     - ``DOUBLE``
+     - ``TEXT``
+     - ``DATE``
+     - ``DATETIME``
+   * - ``boolean``
+     - Supported 
+     - Supported [#f5]_
+     - Supported [#f5]_
+     - Supported [#f5]_
+     - Supported [#f5]_
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``tinyint``
+     - ○ [#f6]_
+     - Supported
+     - Supported
+     - Supported
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``smallint``
+     - ○ [#f6]_
+     - ○ [#f7]_
+     - Supported
+     - Supported
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``int``
+     - ○ [#f6]_
+     - ○ [#f7]_
+     - ○ [#f7]_
+     - Supported
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``bigint``
+     - ○ [#f6]_
+     - ○ [#f7]_
+     - ○ [#f7]_
+     - ○ [#f7]_
+     - Supported
+     - 
+     - 
+     - 
+     - 
+     - 
+   * - ``float``
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - Supported
+     - 
+     - 
+     - 
+   * - ``double``
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - Supported
+     - 
+     - 
+     - 
+   * - ``string`` / ``char`` / ``varchar``
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - 
+     - 
+   * - ``date``
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+     - Supported
+   * - ``timestamp``, ``timestamp`` with timezone
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - 
+     - Supported
+
+* If an ORC file has an unsupported type like ``binary``, ``list``, ``map``, and ``union``, but the data is not referenced in the table (it does not appear in the :ref:`SELECT` query), the statement will succeed. If the column is referenced, an error will be thrown to the user, explaining that the type is not supported, but the column may be omitted.
+
+
+
+Further Reading and Migration Guides
+====================================
+
+For more information, see the following:
+
+* :ref:`copy_from`
+* :ref:`insert`
+* :ref:`foreign_tables`
+
+.. rubric:: Footnotes
+
+.. [#f0] Text values include ``TEXT``, ``VARCHAR``, and ``NVARCHAR``
+
+.. [#f2] With UTF8 annotation
+
+.. [#f3] With ``TIMESTAMP_NANOS`` or ``TIMESTAMP_MILLIS`` annotation
+
+.. [#f4] Any microseconds will be rounded down to milliseconds.
+
+.. [#f5] Boolean values are cast to 0, 1
+
+.. [#f6] Will succeed if all values are 0, 1
+
+.. [#f7] Will succeed if all values fit the destination type
diff --git a/data_ingestion/inserting_data.rst b/data_ingestion/inserting_data.rst
deleted file mode 100644
index 660cd61bd..000000000
--- a/data_ingestion/inserting_data.rst
+++ /dev/null
@@ -1,474 +0,0 @@
-.. _inserting_data:
-
-***************************
-Inserting Data Overview
-***************************
-
-The **Inserting Data Overview** page describes how to insert data into SQream, specifically how to insert data from a variety of sources and locations. 
-
-.. contents:: In this topic:
-   :local:
-
-
-Getting Started
-================================
-
-SQream supports importing data from the following sources:
-
-* Using :ref:`insert` with :ref:`a client driver`
-* Using :ref:`copy_from`:
-
-   - Local filesystem and locally mounted network filesystems
-   - :ref:`s3`
-   - :ref:`hdfs`
-
-* Using :ref:`external_tables`:
-
-   - Local filesystem and locally mounted network filesystems
-   - :ref:`s3`
-   - :ref:`hdfs`
-
-
-SQream DB supports loading files in the following formats:
-
-* Text - CSV, TSV, PSV
-* Parquet
-* ORC
-
-Data Loading Considerations
-================================
-
-Verifying Data and Performance after Loading
------------------------------------------
-
-Like other RDBMSs, SQream DB has its own set of best practcies for table design and query optimization.
-
-SQream therefore recommends:
-
-* Verify that the data is as you expect it (e.g. row counts, data types, formatting, content)
-
-* The performance of your queries is adequate
-
-* :ref:`Best practices` were followed for table design
-
-* Applications such as :ref:`Tableau` and others have been tested, and work
-
-* Data types were not over-provisioned (e.g. don't use VARCHAR(2000) to store a short string)
-
-File Soure Location when Loading
---------------------------------
-
-During loading using :ref:`copy_from`, the statement can run on any worker. If you are running multiple nodes, make sure that all nodes can see the source the same. If you load from a local file which is only on 1 node and not on shared storage, it will fail some of the time. (If you need to, you can also control which node a statement runs on using the :ref:`workload_manager`).
-
-Supported load methods
--------------------------------
-
-SQream DB's :ref:`COPY FROM` syntax can be used to load CSV files, but can't be used for Parquet and ORC.
-
-:ref:`FOREIGN TABLE` can be used to load text files, Parquet, and ORC files, and can also transform the data prior to materialization as a full table.
-
-.. list-table:: 
-   :widths: auto
-   :header-rows: 1
-   :stub-columns: 1
-   
-   * - Method / File type
-     - Text (CSV)
-     - Parquet
-     - ORC
-     - Streaming data
-   * - :ref:`copy_from`
-     - ✓
-     - ✗
-     - ✗
-     - ✗
-   * - :ref:`external_tables`
-     - ✓
-     - ✓
-     - ✓
-     - ✗
-   * - :ref:`insert`
-     - ✗
-     - ✗
-     - ✗
-     - ✓ (Python, JDBC, Node.JS)
-
-Unsupported Data Types
------------------------------
-
-SQream DB doesn't support the entire set of features that some other database systems may have, such as ``ARRAY``, ``BLOB``, ``ENUM``, ``SET``, etc.
-
-These data types will have to be converted before load. For example, ``ENUM`` can often be stored as a ``VARCHAR``.
-
-Handing Extended Errors
-----------------------------
-
-While :ref:`external tables` can be used to load CSVs, the ``COPY FROM`` statement provides more fine-grained error handling options, as well as extended support for non-standard CSVs with multi-character delimiters, alternate timestamp formats, and more.
-
-Best Practices for CSV
-------------------------------
-
-Text files like CSV rarely conform to `RFC 4180 `_ , so alterations may be required:
-
-* Use ``OFFSET 2`` for files containing header rows
-
-* Failed rows can be captured in a log file for later analysis, or just to skip them. See :ref:`capturing_rejected_rows` for information on skipping rejected rows.
-
-* Record delimiters (new lines) can be modified with the :ref:`RECORD DELIMITER` syntax.
-
-* If the date formats differ from ISO 8601, refer to the :ref:`copy_date_parsers` section to see how to override default parsing.
-
-* 
-   Fields in a CSV can be optionally quoted with double-quotes (``"``). However, any field containing a newline or another double-quote character must be quoted.
-
-   If a field is quoted, any double quote that appears must be double-quoted (similar to the :ref:`string literals quoting rules`. For example, to encode ``What are "birds"?``, the field should appear as ``"What are ""birds""?"``.
-
-* Field delimiters don't have a to be a displayable ASCII character. See :ref:`field_delimiters` for all options.
-
-
-Best Practices for Parquet
---------------------------------
-
-* Parquet files are loaded through :ref:`external_tables`. The destination table structure has to match in number of columns between the source files.
-
-* Parquet files support predicate pushdown. When a query is issued over Parquet files, SQream DB uses row-group metadata to determine which row-groups in a file need to be read for a particular query and the row indexes can narrow the search to a particular set of rows.
-
-Type Support and Behavior Notes
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-* Unlike ORC, the column types should match the data types exactly (see table below).
-
-.. list-table:: 
-   :widths: auto
-   :header-rows: 1
-   :stub-columns: 1
-   
-   * -   SQream DB type →
-   
-         Parquet source
-     - ``BOOL``
-     - ``TINYINT``
-     - ``SMALLINT``
-     - ``INT``
-     - ``BIGINT``
-     - ``REAL``
-     - ``DOUBLE``
-     - Text [#f0]_
-     - ``DATE``
-     - ``DATETIME``
-   * - ``BOOLEAN``
-     - ✓ 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``INT16``
-     - 
-     - 
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``INT32``
-     - 
-     - 
-     - 
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``INT64``
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``FLOAT``
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - 
-     - 
-     - 
-     - 
-   * - ``DOUBLE``
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - 
-     - 
-     - 
-   * - ``BYTE_ARRAY`` [#f2]_
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - 
-     - 
-   * - ``INT96`` [#f3]_
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓ [#f4]_
-
-* If a Parquet file has an unsupported type like ``enum``, ``uuid``, ``time``, ``json``, ``bson``, ``lists``, ``maps``, but the data is not referenced in the table (it does not appear in the :ref:`SELECT` query), the statement will succeed. If the column is referenced, an error will be thrown to the user, explaining that the type is not supported, but the column may be ommited.
-
-Best Practices for ORC
---------------------------------
-
-* ORC files are loaded through :ref:`external_tables`. The destination table structure has to match in number of columns between the source files.
-
-* ORC files support predicate pushdown. When a query is issued over ORC files, SQream DB uses ORC metadata to determine which stripes in a file need to be read for a particular query and the row indexes can narrow the search to a particular set of 10,000 rows.
-
-Type Support and Behavior Notes
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-* ORC files are loaded through :ref:`external_tables`. The destination table structure has to match in number of columns between the source files.
-
-* The types should match to some extent within the same "class" (see table below).
-
-.. list-table:: 
-   :widths: auto
-   :header-rows: 1
-   :stub-columns: 1
-   
-   * -   SQream DB type →
-   
-         ORC source
-     - ``BOOL``
-     - ``TINYINT``
-     - ``SMALLINT``
-     - ``INT``
-     - ``BIGINT``
-     - ``REAL``
-     - ``DOUBLE``
-     - Text [#f0]_
-     - ``DATE``
-     - ``DATETIME``
-   * - ``boolean``
-     - ✓ 
-     - ✓ [#f5]_
-     - ✓ [#f5]_
-     - ✓ [#f5]_
-     - ✓ [#f5]_
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``tinyint``
-     - ○ [#f6]_
-     - ✓
-     - ✓
-     - ✓
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``smallint``
-     - ○ [#f6]_
-     - ○ [#f7]_
-     - ✓
-     - ✓
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``int``
-     - ○ [#f6]_
-     - ○ [#f7]_
-     - ○ [#f7]_
-     - ✓
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``bigint``
-     - ○ [#f6]_
-     - ○ [#f7]_
-     - ○ [#f7]_
-     - ○ [#f7]_
-     - ✓
-     - 
-     - 
-     - 
-     - 
-     - 
-   * - ``float``
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - ✓
-     - 
-     - 
-     - 
-   * - ``double``
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - ✓
-     - 
-     - 
-     - 
-   * - ``string`` / ``char`` / ``varchar``
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - 
-     - 
-   * - ``date``
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-     - ✓
-   * - ``timestamp``, ``timestamp`` with timezone
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - 
-     - ✓
-
-* If an ORC file has an unsupported type like ``binary``, ``list``, ``map``, and ``union``, but the data is not referenced in the table (it does not appear in the :ref:`SELECT` query), the statement will succeed. If the column is referenced, an error will be thrown to the user, explaining that the type is not supported, but the column may be ommited.
-
-
-
-..
-   insert
-
-   example
-
-   are there some variations to highlight?:
-
-   create table as
-
-   sequences, default values
-
-   insert select
-
-   make distinction between an insert command, and a parameterized/bulk
-   insert "over the network"
-
-
-   copy
-
-
-   best practices for insert
-
-   chunks and extents, and storage reorganisation
-
-   copy:
-
-   give an example
-
-   supports csv and parquet
-
-   what else do we have right now? any other formats? have the s3 and
-   hdfs url support also
-
-   error handling
-
-   best practices
-
-   try to combine sensibly with the external table stuff
-
-Further Reading and Migration Guides
-=======================================
-
-.. toctree::
-   :caption: Data loading guides
-   :titlesonly:
-   
-   migration/csv
-   migration/parquet
-   migration/orc
-
-.. toctree::
-   :caption: Migration guides
-   :titlesonly:
-   
-   migration/oracle
-
-
-.. rubric:: See also:
-
-* :ref:`copy_from`
-* :ref:`insert`
-* :ref:`external_tables`
-
-.. rubric:: Footnotes
-
-.. [#f0] Text values include ``TEXT``, ``VARCHAR``, and ``NVARCHAR``
-
-.. [#f2] With UTF8 annotation
-
-.. [#f3] With ``TIMESTAMP_NANOS`` or ``TIMESTAMP_MILLIS`` annotation
-
-.. [#f4] Any microseconds will be rounded down to milliseconds.
-
-.. [#f5] Boolean values are cast to 0, 1
-
-.. [#f6] Will succeed if all values are 0, 1
-
-.. [#f7] Will succeed if all values fit the destination type
diff --git a/data_ingestion/json.rst b/data_ingestion/json.rst
new file mode 100644
index 000000000..cb41322eb
--- /dev/null
+++ b/data_ingestion/json.rst
@@ -0,0 +1 @@
+.. _json:

****
JSON
****

JSON (JavaScript Object Notation) is used both as a file format and as a serialization method. The JSON file format is flexible and is commonly used for dynamic, nested, and semi-structured data representations.

The SQreamDB JSON parser supports the `RFC 8259 `_ data interchange format and supports both JSON objects and JSON object arrays.

Only the `JSON Lines `_ data format is supported by SQreamDB.

.. contents:: 
   :local:
   :depth: 1

Foreign Data Wrapper Prerequisites
===================================

Before proceeding, ensure that the following Foreign Data Wrapper (FDW) prerequisites are met:

* **File Existence:** Verify that the file you are ingesting data from exists at the specified path.

* **Path Accuracy:** Confirm that all path elements are present and correctly spelled. Any inaccuracies may lead to data retrieval issues.

* **Bucket Access Permissions:** Ensure that you have the necessary access permissions to the bucket from which you are ingesting data. Lack of permissions can hinder the data retrieval process.

* **Wildcard Accuracy:** If using wildcards, double-check their spelling and configuration. Misconfigured wildcards may result in unintended data ingestion.

Making JSON Files Accessible to Workers
=======================================

To give workers access to files, every node in your system must have access to the storage being used.

The following are required for JSON files to be accessible to workers:

* For files hosted on NFS, ensure that the mount is accessible from all servers.

* For HDFS, ensure that SQream servers have access to the HDFS NameNode with the correct **user-id**. For more information, see :ref:`hdfs`.

* For S3, ensure network access to the S3 endpoint. For more information, see :ref:`s3`.

For more information about configuring worker access, see :ref:`workload_manager`.


Mapping between JSON and SQream
===============================

A JSON field consists of a key name and a value.

Key names, which are case sensitive, are mapped to SQream columns. Key names which do not have corresponding SQream table columns are treated as errors by default, unless the ``IGNORE_EXTRA_FIELDS`` parameter is set to ``true``, in which case these key names will be ignored during the mapping process.

SQream table columns which do not have corresponding JSON fields are automatically set to ``null``.

Values may be one of the following reserved words (lower-case): ``false``, ``true``, or ``null``, or any of the following data types:

.. list-table:: 
   :widths: auto
   :header-rows: 1
   
   * - JSON Data Type
     - Representation in SQream
   * - Number
     - ``TINYINT``, ``SMALLINT``, ``INT``, ``BIGINT``, ``FLOAT``, ``DOUBLE``, ``NUMERIC``
   * - String
     - ``TEXT``
   * - JSON Literal
     - ``NULL``, ``TRUE``, ``FALSE``
   * - JSON Array
     - ``TEXT``
   * - JSON Object
     - ``TEXT``
 


Character Escaping
------------------

The ASCII 10 character (LF) marks the end of a JSON object. Use ``\\n`` to escape the ``\n`` character when you do not mean it to be a new line.
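
For example, in a JSON Lines file each object ends at a raw line feed, so a line break that belongs inside a value must be written as an escape sequence rather than a literal new line (the values below are illustrative):

.. code-block:: json

	{"name":"Avery","bio":"MVP 2016\nAll-Star 2017"}
	{"name":"Jae","bio":"a single-line value"}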



Ingesting JSON Data into SQream
===============================

.. contents:: In this topic:
   :local:

Syntax
-------

To access JSON files, use the ``json_fdw`` with a ``COPY FROM``, ``COPY TO``, or ``CREATE FOREIGN TABLE`` statement.

The Foreign Data Wrapper (FDW) syntax is:

.. code-block:: 

	json_fdw [OPTIONS(option=value[,...])]


Parameters
----------

The following parameters are supported by ``json_fdw``:

.. list-table:: 
   :widths: auto
   :header-rows: 1
   
   * - Parameter
     - Description
   * - ``DATETIME_FORMAT``
     - Default format is ``yyyy-mm-dd``. Other supported date formats are: ``iso8601``, ``iso8601c``, ``dmy``, ``ymd``, ``mdy``, ``yyyymmdd``, ``yyyy-m-d``, ``yyyy-mm-dd``, ``yyyy/m/d``, ``yyyy/mm/dd``, ``d/m/yyyy``, ``dd/mm/yyyy``, ``mm/dd/yyyy``, ``dd-mon-yyyy``, ``yyyy-mon-dd``.
   * - ``IGNORE_EXTRA_FIELDS``
     - Default value is ``false``. When value is ``true``, key names which do not have corresponding SQream table columns will be ignored. Parameter may be used with the ``COPY FROM`` and ``CREATE FOREIGN TABLE`` statements.
   * - ``COMPRESSION``
     - Supported values are ``auto``, ``gzip``, and ``none``. ``auto`` means that the compression type is automatically detected upon import. Parameter is not supported for exporting. ``gzip`` means that a ``gzip`` compression is applied. ``none`` means that no compression or an attempt to decompress will take place. 
   * - ``LOCATION``
     - A path on the local filesystem, an S3 URI, or an HDFS URI. The local path must be an absolute path that SQream DB can access.
   * - ``LIMIT``
     - When specified, tells SQream DB to stop ingesting after the specified number of rows. Unlimited if unset.
   * - ``OFFSET``
     - The row number from which to start ingesting.
   * - ``ERROR_LOG``
     - When using the ``COPY`` command, if copying a row fails, error information is written to the error log file specified by ``ERROR_LOG``.

         * If an existing file path is specified, the file will be overwritten.
         
         * Specifying the same file for ``ERROR_LOG`` and ``REJECTED_DATA`` is not allowed and will result in an error.
         
         * Specifying an error log when creating a foreign table will write a new error log for every query on the foreign table.
   * - ``CONTINUE_ON_ERROR``
     - Specifies if errors should be ignored or skipped. When set to true, the transaction will continue despite rejected data. This parameter should be set together with ``ERROR_COUNT``. When reading multiple files, if an entire file cannot be opened, it will be skipped.
   * - ``ERROR_COUNT``
     - Specifies the maximum number of faulty records that will be ignored. This setting must be used in conjunction with ``CONTINUE_ON_ERROR``.
   * - ``MAX_FILE_SIZE``
     - Sets the maximum file size (bytes).
   * - ``ENFORCE_SINGLE_FILE``
     - Permitted values are ``true`` or ``false``. When set to ``true``, a single file of unlimited size is created. This single file is not limited by the ``MAX_FILE_SIZE`` parameter. ``false`` permits creating several files together limited by the ``MAX_FILE_SIZE`` parameter. Default value: ``false``.

   * - ``AWS_ID``, ``AWS_SECRET``
     - Specifies the authentication details for secured S3 buckets.
 

Automatic Schema Inference
--------------------------

SQreamDB can read the file metadata, enabling the automatic inference of column structure and data types.  

.. code-block:: postgres
   
	CREATE FOREIGN TABLE nba
	WRAPPER
	  json_fdw
	OPTIONS
	  (LOCATION = 's3://sqream-docs/nba.json');

For more information, see the :ref:`CREATE FOREIGN TABLE` page.

Examples
--------

A JSON object array:

.. code-block:: json

	[
	{ "name":"Avery Bradley", "age":25, "position":"PG" },
	{ "name":"Jae Crowder", "age":25, "position":"SF" },
	{ "name":"John Holland", "age":27, "position":"SG" }
	]

Using the ``COPY FROM`` statement:

.. code-block:: postgres
   
	COPY
	  nba
	FROM
	WRAPPER
	  json_fdw
	OPTIONS
	  (LOCATION = 's3://sqream-docs/nba.json');

Note that JSON files generated using the ``COPY TO`` statement will store objects, and not object arrays.

.. code-block:: postgres
   
	COPY 
	  nba
	TO 
	WRAPPER 
	  json_fdw
	OPTIONS
	  (LOCATION = 's3://sqream-docs/nba.json');


When using the ``CREATE FOREIGN TABLE`` statement, make sure that the table schema corresponds with the JSON file structure.

.. code-block:: postgres
   
	CREATE FOREIGN TABLE t (id int not null)
	WRAPPER
	  json_fdw
	OPTIONS
	  (LOCATION = 'sqream-docs.json');

The following is an example of loading data from a JSON file into SQream:

.. code-block:: postgres

	COPY
	  t
	FROM
	WRAPPER
	  json_fdw
	OPTIONS
	  (LOCATION = 'sqream-docs.json');


.. tip:: 

   An exact match must exist between the SQream and JSON types. For unsupported column types, you can set the type to any type and exclude it from subsequent queries.



\ No newline at end of file
diff --git a/data_ingestion/nba-t10.csv b/data_ingestion/nba-t10.csv
index 024530355..e57ad3131 100644
--- a/data_ingestion/nba-t10.csv
+++ b/data_ingestion/nba-t10.csv
@@ -1,10 +1,10 @@
 Name,Team,Number,Position,Age,Height,Weight,College,Salary
-Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0
-Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0
-John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University,
-R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0
-Jonas Jerebko,Boston Celtics,8.0,PF,29.0,6-10,231.0,,5000000.0
-Amir Johnson,Boston Celtics,90.0,PF,29.0,6-9,240.0,,12000000.0
-Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0
-Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0
-Terry Rozier,Boston Celtics,12.0,PG,22.0,6-2,190.0,Louisville,1824360.0
+Avery Bradley,Boston Celtics,0,PG,25,6-2,180,Texas,7730337
+Jae Crowder,Boston Celtics,99,SF,25,6-6,235,Marquette,6796117
+John Holland,Boston Celtics,30,SG,27,6-5,205,Boston University,
+R.J. Hunter,Boston Celtics,28,SG,22,6-5,185,Georgia State,1148640
+Jonas Jerebko,Boston Celtics,8,PF,29,6-10,231,,5000000
+Amir Johnson,Boston Celtics,90,PF,29,6-9,240,,12000000
+Jordan Mickey,Boston Celtics,55,PF,21,6-8,235,LSU,1170960
+Kelly Olynyk,Boston Celtics,41,C,25,7-0,238,Gonzaga,2165160
+Terry Rozier,Boston Celtics,12,PG,22,6-2,190,Louisville,1824360
diff --git a/data_ingestion/nba.json b/data_ingestion/nba.json
new file mode 100644
index 000000000..e4df53204
--- /dev/null
+++ b/data_ingestion/nba.json
@@ -0,0 +1,9 @@
+{"name":"Avery Bradley","team":"Boston Celtics","number":0,"position":"PG","age":25,"height":"6-2","weight":180.0,"college":"Texas","salary":7730337.0}
+{"name":"Jae Crowder","team":"Boston Celtics","number":99,"position":"SF","age":25,"height":"6-6","weight":235.0,"college":"Marquette","salary":6796117.0}
+{"name":"John Holland","team":"Boston Celtics","number":30,"position":"SG","age":27,"height":"6-5","weight":205.0,"college":"Boston University","salary":null}
+{"name":"R.J. Hunter","team":"Boston Celtics","number":28,"position":"SG","age":22,"height":"6-5","weight":185.0,"college":"Georgia State","salary":1148640.0}
+{"name":"Jonas Jerebko","team":"Boston Celtics","number":8,"position":"PF","age":29,"height":"6-10","weight":231.0,"college":null,"salary":5000000.0}
+{"name":"Amir Johnson","team":"Boston Celtics","number":90,"position":"PF","age":29,"height":"6-9","weight":240.0,"college":null,"salary":12000000.0}
+{"name":"Jordan Mickey","team":"Boston Celtics","number":55,"position":"PF","age":21,"height":"6-8","weight":235.0,"college":"LSU","salary":1170960.0}
+{"name":"Kelly Olynyk","team":"Boston Celtics","number":41,"position":"C","age":25,"height":"7-0","weight":238.0,"college":"Gonzaga","salary":2165160.0}
+{"name":"Terry Rozier","team":"Boston Celtics","number":12,"position":"PG","age":22,"height":"6-2","weight":190.0,"college":"Louisville","salary":1824360.0}
\ No newline at end of file
diff --git a/data_ingestion/oracle.rst b/data_ingestion/oracle.rst
deleted file mode 100644
index 0b0e6d5c8..000000000
--- a/data_ingestion/oracle.rst
+++ /dev/null
@@ -1,353 +0,0 @@
-.. _oracle:
-
-**********************
-Migrating Data from Oracle
-**********************
-
-This guide covers actions required for migrating from Oracle to SQream DB with CSV files. 
-
-.. contents:: In this topic:
-   :local:
-
-
-1. Preparing the tools and login information
-====================================================
-
-* Migrating data from Oracle requires a username and password for your Oracle system.
-
-* In this guide, we'll use the `Oracle Data Pump `_ , specifically the `Data Pump Export utility `_ .
-
-
-2. Export the desired schema
-===================================
-
-Use the Data Pump Export utility to export the database schema.
-
-The format for using the Export utility is
-
-   ``expdp / DIRECTORY= DUMPFILE= CONTENT=metadata_only NOLOGFILE``
-
-The resulting Oracle-only schema is stored in a dump file.
-
-
-Examples
-------------
-
-Dump all tables
-^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: console
-
-   $ expdp rhendricks/secretpassword DIRECTORY=dpumpdir DUMPFILE=tables.dmp CONTENT=metadata_only NOLOGFILE
-
-
-Dump only specific tables
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-In this example, we specify two tables for dumping.
-
-.. code-block:: console
-
-   $ expdp rhendricks/secretpassword DIRECTORY=dpumpdir DUMPFILE=tables.dmp CONTENT=metadata_only TABLES=employees,jobs NOLOGFILE
-
-3. Convert the Oracle dump to standard SQL
-=======================================================
-
-Oracle's Data Pump Import utility will help us convert the dump from the previous step to standard SQL.
-
-The format for using the Import utility is
-
-   ``impdp / DIRECTORY= DUMPFILE= SQLFILE= TRANSFORM=SEGMENT_ATTRIBUTES:N:table PARTITION_OPTIONS=MERGE``
-
-* ``TRANSFORM=SEGMENT_ATTRIBUTES:N:table`` excludes segment attributes (both STORAGE and TABLESPACE) from the tables
-
-* ``PARTITON_OPTIONS=MERGE`` combines all partitions and subpartitions into one table.
-
-Example
-----------
-
-.. code-block:: console
-   
-   $ impdp rhendricks/secretpassword DIRECTORY=dpumpdir DUMPFILE=tables.dmp SQLFILE=sql_export.sql TRANSFORM=SEGMENT_ATTRIBUTES:N:table PARTITION_OPTIONS=MERGE
-
-4. Figure out the database structures
-===============================================
-
-Using the SQL file created in the previous step, write CREATE TABLE statements to match the schemas of the tables.
-
-Remove unsupported attributes
------------------------------------
-
-Trim unsupported primary keys, indexes, constraints, and other unsupported Oracle attributes.
-
-Match data types
----------------------
-
-Refer to the table below to match the Oracle source data type to a new SQream DB type:
-
-.. list-table:: Data types
-   :widths: auto
-   :header-rows: 1
-   
-   * - Oracle Data type
-     - Precision
-     - SQream DB data type
-   * - ``CHAR(n)``, ``CHARACTER(n)``
-     - Any ``n``
-     - ``VARCHAR(n)``
-   * - ``BLOB``, ``CLOB``, ``NCLOB``, ``LONG``
-     - 
-     - ``TEXT``
-   * - ``DATE``
-     - 
-     - ``DATE``
-   * - ``FLOAT(p)``
-     - p <= 63
-     - ``REAL``
-   * - ``FLOAT(p)``
-     - p > 63
-     - ``FLOAT``, ``DOUBLE``
-
-   * - ``NCHAR(n)``, ``NVARCHAR2(n)``
-     - Any ``n``
-     - ``TEXT`` (alias of ``NVARCHAR``)
-
-   * - ``NUMBER(p)``, ``NUMBER(p,0)``
-     - p < 5
-     - ``SMALLINT``
-
-   * - ``NUMBER(p)``, `NUMBER(p,0)``
-     - p < 9
-     - ``INT``
-
-   * - ``NUMBER(p)``, `NUMBER(p,0)``
-     - p < 19
-     - ``INT``
-
-   * - ``NUMBER(p)``, `NUMBER(p,0)``
-     - p >= 20
-     - ``BIGINT``
-
-   * - ``NUMBER(p,f)``, ``NUMBER(*,f)``
-     - f > 0
-     - ``FLOAT`` / ``DOUBLE``
-
-   * - ``VARCHAR(n)``, ``VARCHAR2(n)``
-     - Any ``n``
-     - ``VARCHAR(n)`` or ``TEXT``
-   * - ``TIMESTAMP``
-     -  
-     - ``DATETIME``
-
-Read more about :ref:`supported data types in SQream DB`.
-
-Additional considerations
------------------------------
-
-* Understand how :ref:`tables are created in SQream DB`
-
-* Learn how :ref:`SQream DB handles null values`, particularly with regards to constraints.
-
-* Oracle roles and user management commands need to be rewritten to SQream DB's format. SQream DB supports :ref:`full role-based access control (RBAC)` similar to Oracle.
-
-5. Create the tables in SQream DB
-======================================
-
-After rewriting the table strucutres, create them in SQream DB.
-
-Example
----------
-
-
-Consider Oracle's ``HR.EMPLOYEES`` sample table:
-
-.. code-block:: sql
-
-      CREATE TABLE employees
-         ( employee_id NUMBER(6)
-         , first_name VARCHAR2(20)
-         , last_name VARCHAR2(25)
-         CONSTRAINT emp_last_name_nn NOT NULL
-         , email VARCHAR2(25)
-         CONSTRAINT emp_email_nn NOT NULL
-         , phone_number VARCHAR2(20)
-         , hire_date DATE
-         CONSTRAINT emp_hire_date_nn NOT NULL
-         , job_id VARCHAR2(10)
-         CONSTRAINT emp_job_nn NOT NULL
-         , salary NUMBER(8,2)
-         , commission_pct NUMBER(2,2)
-         , manager_id NUMBER(6)
-         , department_id NUMBER(4)
-         , CONSTRAINT emp_salary_min
-         CHECK (salary > 0) 
-         , CONSTRAINT emp_email_uk
-         UNIQUE (email)
-         ) ;
-      CREATE UNIQUE INDEX emp_emp_id_pk
-               ON employees (employee_id) ;
-             
-      ALTER TABLE employees
-               ADD ( CONSTRAINT emp_emp_id_pk
-         PRIMARY KEY (employee_id)
-         , CONSTRAINT emp_dept_fk
-         FOREIGN KEY (department_id)
-         REFERENCES departments
-         , CONSTRAINT emp_job_fk
-         FOREIGN KEY (job_id)
-         REFERENCES jobs (job_id)
-         , CONSTRAINT emp_manager_fk
-         FOREIGN KEY (manager_id)
-         REFERENCES employees
-         ) ;
-
-This table rewritten for SQream DB would be created like this:
-
-.. code-block:: postgres
-   
-   CREATE TABLE employees
-   (
-     employee_id      INT NOT NULL,
-     first_name       VARCHAR(20),
-     last_name        VARCHAR(25) NOT NULL,
-     email            VARCHAR(25) NOT NULL,
-     phone_number     VARCHAR(20),
-     hire_date        DATE NOT NULL,
-     job_id           VARCHAR(10) NOT NULL,
-     salary           FLOAT,
-     commission_pct   REAL,
-     manager_id       INT,
-     department_id    SMALLINT
-   );
-
-
-6. Export tables to CSVs
-===============================
-
-Exporting CSVs from Oracle servers is not a trivial task.
-
-.. contents:: Options for exporting to CSVs
-   :local:
-
-Using SQL*Plus to export data lists
-------------------------------------------
-
-Here's a sample SQL*Plus script that exports pipe-separated values (PSVs) in a format that SQream DB can read:
-
-:download:`Download to_csv.sql `
-
-.. literalinclude:: to_csv.sql
-    :language: sql
-    :caption: Oracle SQL*Plus CSV export script
-    :linenos:
-
-Enter SQL*Plus and export tables one-by-one interactively:
-
-.. code-block:: console
-   
-   $ sqlplus rhendricks/secretpassword
-
-   @to_csv employees
-   @to_csv jobs
-   [...]
-   EXIT
-
-Each table is exported as a data list file (``.lst``).
-
-Creating CSVs using stored procedures
--------------------------------------------
-
-You can use stored procedures if you have them set up.
-
-Examples of `stored procedures for generating CSVs `_ can be found in the Ask The Oracle Mentors forums.
-
-CSV generation considerations
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-* Each file should be a valid CSV. By default, SQream DB's CSV parser can handle `RFC 4180 standard CSVs `_, but it can also be modified to support non-standard CSVs (with multi-character delimiters, unquoted fields, etc.).
-
-* Files are UTF-8 or ASCII encoded
-
-* Field delimiter is an ASCII character or characters
-
-* Record delimiter, also known as a new line separator, is a Unix-style newline (``\n``), DOS-style newline (``\r\n``), or Mac-style newline (``\r``).
-
-* Fields are optionally enclosed by double-quotes, but must be quoted if they contain one of the following characters:
-
-   * The record delimiter or field delimiter
-
-   * A double quote character
-
-   * A newline
-
-* 
-   If a field is quoted, any double quote that appears must be double-quoted (similar to the :ref:`string literals quoting rules`). For example, to encode ``What are "birds"?``, the field should appear as ``"What are ""birds""?"``.
-   
-   Other modes of escaping are not supported (e.g. ``1,"What are \"birds\"?"`` is not a valid way of escaping CSV values).
-
-* ``NULL`` values can be marked in two ways in the CSV:
-   
-   - An explicit null marker. For example, ``col1,\N,col3``
-   - An empty field delimited by the field delimiter. For example, ``col1,,col3``
-   
-   .. note:: If a text field is quoted but contains no content (``""``) it is considered an empty text field. It is not considered ``NULL``.
-
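-For example, the following rows illustrate quoting and ``NULL`` marking (the values are illustrative):
-
-.. code-block:: text
-   
-   id,name,comment
-   1,"What are ""birds""?",a regular value
-   2,\N,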
-
-7. Place CSVs where SQream DB workers can access
-=======================================================
-
-During data load, the :ref:`copy_from` command can run on any worker (unless explicitly specified with the :ref:`workload_manager`).
-It is important that every node has the same view of the storage being used - meaning, every SQream DB worker should have access to the files.
-
-* For files hosted on NFS, ensure that the mount is accessible from all servers.
-
-* For HDFS, ensure that SQream DB servers can access the HDFS name node with the correct user-id.
-
-* For S3, ensure network access to the S3 endpoint.
-
-8. Bulk load the CSVs
-=================================
-
-Issue the :ref:`copy_from` commands to SQream DB to load each table from the CSVs you created.
-
-Repeat the ``COPY FROM`` command for each table exported from Oracle.
-
-Example
--------------
-
-For the ``employees`` table, run the following command:
-
-.. code-block:: postgres
-   
-   COPY employees FROM 'employees.lst' WITH DELIMITER '|';
-
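-As a quick sanity check after each load, you can compare row counts with the Oracle source (a minimal example using the table loaded above):
-
-.. code-block:: postgres
-   
-   -- Should match SELECT COUNT(*) FROM employees on the Oracle side
-   SELECT COUNT(*) FROM employees;
-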
-9. Rewrite Oracle queries
-=====================================
-
-SQream DB supports a large subset of ANSI SQL.
-
-You will have to refactor much of Oracle's SQL, which often relies on functions and syntax that are not ANSI SQL.
-
-We recommend the following resources:
-
-* :ref:`sql_feature_support` - to understand SQream DB's SQL feature support.
-
-* :ref:`sql_best_practices` - to understand best practices for SQL queries and schema design.
-
-* :ref:`common_table_expressions` - CTEs can be used to rewrite complex queries in a compact form.
-
-* :ref:`concurrency_and_locks` - to understand the difference between Oracle's transactions and SQream DB's concurrency.
-
-* :ref:`identity` - SQream DB supports sequences, but no triggers for auto-increment.
-
-* :ref:`joins` - SQream DB supports ANSI join syntax. Oracle's proprietary ``(+)`` outer join operator is not supported (see the rewrite example at the end of this list).
-
-* :ref:`saved_queries` - Saved queries can be used to emulate some stored procedures.
-
-* :ref:`subqueries` - SQream DB supports a limited set of subqueries.
-
-* :ref:`python_functions` - SQream DB supports Python User Defined Functions which can be used to run complex operations in-line.
-
-* :ref:`Views` - SQream DB supports logical views, but does not support materialized views.
-
-* :ref:`window_functions` - SQream DB supports a wide array of window functions.
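-
-For example, here is how an Oracle-style outer join written with the proprietary ``(+)`` operator could be rewritten in ANSI syntax (a sketch; the column names are illustrative):
-
-.. code-block:: postgres
-   
-   -- Oracle-style outer join (not supported by SQream DB):
-   --   SELECT e.last_name, d.department_name
-   --   FROM employees e, departments d
-   --   WHERE e.department_id = d.department_id (+);
-   
-   -- Equivalent ANSI join syntax supported by SQream DB:
-   SELECT e.last_name, d.department_name
-   FROM employees e
-   LEFT JOIN departments d ON e.department_id = d.department_id;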
\ No newline at end of file
diff --git a/data_ingestion/orc.rst b/data_ingestion/orc.rst
index d199e958e..5be02eb72 100644
--- a/data_ingestion/orc.rst
+++ b/data_ingestion/orc.rst
@@ -1,21 +1,36 @@
 .. _orc:
 
-**********************
-Inserting Data from an ORC File
-**********************
+***
+ORC
+***
 
-This guide covers inserting data from ORC files into SQream DB using :ref:`FOREIGN TABLE`. 
+This guide covers ingesting data from ORC files into SQream DB using :ref:`FOREIGN TABLE`. 
 
+.. contents:: 
+   :local:
+   :depth: 1
 
-1. Prepare the files
-=====================
+Foreign Data Wrapper Prerequisites
+===================================
+
+Before proceeding, ensure the following Foreign Data Wrapper (FDW) prerequisites:
+
+* **File Existence:** Verify that the file you are ingesting data from exists at the specified path.
+
+* **Path Accuracy:** Confirm that all path elements are present and correctly spelled. Any inaccuracies may lead to data retrieval issues.
+* **Bucket Access Permissions:** Ensure that you have the necessary access permissions to the bucket from which you are ingesting data. Lack of permissions can hinder the data retrieval process.
+
+* **Wildcard Accuracy:** If using wildcards, double-check their spelling and configuration. Misconfigured wildcards may result in unintended data ingestion.
+
+Prepare the files
+=================
 
 Prepare the source ORC files, with the following requirements:
 
 .. list-table:: 
-   :widths: auto
+   :widths: 5 5 70 70 70 70 5 5 5 5 5
    :header-rows: 1
-   :stub-columns: 1
+
    
    * -   SQream DB type →
    
@@ -27,15 +42,15 @@ Prepare the source ORC files, with the following requirements:
      - ``BIGINT``
      - ``REAL``
      - ``DOUBLE``
-     - Text [#f0]_
+     - ``TEXT`` [#f0]_
      - ``DATE``
      - ``DATETIME``
    * - ``boolean``
-     - ✓ 
-     - ✓ [#f5]_
-     - ✓ [#f5]_
-     - ✓ [#f5]_
-     - ✓ [#f5]_
+     - Supported 
+     - Supported [#f5]_
+     - Supported [#f5]_
+     - Supported [#f5]_
+     - Supported [#f5]_
      - 
      - 
      - 
@@ -43,10 +58,10 @@ Prepare the source ORC files, with the following requirements:
      - 
    * - ``tinyint``
      - ○ [#f6]_
-     - ✓
-     - ✓
-     - ✓
-     - ✓
+     - Supported
+     - Supported
+     - Supported
+     - Supported
      - 
      - 
      - 
@@ -55,9 +70,9 @@ Prepare the source ORC files, with the following requirements:
    * - ``smallint``
      - ○ [#f6]_
      - ○ [#f7]_
-     - ✓
-     - ✓
-     - ✓
+     - Supported
+     - Supported
+     - Supported
      - 
      - 
      - 
@@ -67,8 +82,8 @@ Prepare the source ORC files, with the following requirements:
      - ○ [#f6]_
      - ○ [#f7]_
      - ○ [#f7]_
-     - ✓
-     - ✓
+     - Supported
+     - Supported
      - 
      - 
      - 
@@ -79,7 +94,7 @@ Prepare the source ORC files, with the following requirements:
      - ○ [#f7]_
      - ○ [#f7]_
      - ○ [#f7]_
-     - ✓
+     - Supported
      - 
      - 
      - 
@@ -91,8 +106,8 @@ Prepare the source ORC files, with the following requirements:
      - 
      - 
      - 
-     - ✓
-     - ✓
+     - Supported
+     - Supported
      - 
      - 
      - 
@@ -102,12 +117,12 @@ Prepare the source ORC files, with the following requirements:
      - 
      - 
      - 
-     - ✓
-     - ✓
+     - Supported
+     - Supported
      - 
      - 
      - 
-   * - ``string`` / ``char`` / ``varchar``
+   * - ``string`` / ``char`` / ``text``
      - 
      - 
      - 
@@ -115,7 +130,7 @@ Prepare the source ORC files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
    * - ``date``
@@ -127,8 +142,8 @@ Prepare the source ORC files, with the following requirements:
      - 
      - 
      - 
-     - ✓
-     - ✓
+     - Supported
+     - Supported
    * - ``timestamp``, ``timestamp`` with timezone
      - 
      - 
@@ -139,13 +154,13 @@ Prepare the source ORC files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
 
 * If an ORC file has an unsupported type such as ``binary``, ``list``, ``map``, or ``union``, but the data is not referenced in the table (it does not appear in the :ref:`SELECT` query), the statement will succeed. If the column is referenced, an error will be thrown to the user, explaining that the type is not supported, but the column may be omitted. This can be worked around. See more information in the examples.
 
 .. rubric:: Footnotes
 
-.. [#f0] Text values include ``TEXT``, ``VARCHAR``, and ``NVARCHAR``
+.. [#f0] Text values include ``TEXT``
 
 .. [#f5] Boolean values are cast to 0, 1
 
@@ -153,8 +168,8 @@ Prepare the source ORC files, with the following requirements:
 
 .. [#f7] Will succeed if all values fit the destination type
 
-2. Place ORC files where SQream DB workers can access them
-================================================================
+Place ORC files where SQream DB workers can access them
+=======================================================
 
 Any worker may try to access files (unless explicitly specified with the :ref:`workload_manager`).
 It is important that every node has the same view of the storage being used - meaning, every SQream DB worker should have access to the files.
@@ -165,8 +180,8 @@ It is important that every node has the same view of the storage being used - me
 
 * For S3, ensure network access to the S3 endpoint. See our :ref:`s3` guide for more information.
 
-3. Figure out the table structure
-===============================================
+Figure out the table structure
+==============================
 
 Prior to loading data, you will need to write out the table structure, so that it matches the file structure.
 
@@ -184,23 +199,21 @@ We will make note of the file structure to create a matching ``CREATE FOREIGN TA
 
 .. code-block:: postgres
    
-   CREATE FOREIGN TABLE ext_nba
-   (
-        Name       VARCHAR(40),
-        Team       VARCHAR(40),
-        Number     BIGINT,
-        Position   VARCHAR(2),
-        Age        BIGINT,
-        Height     VARCHAR(4),
-        Weight     BIGINT,
-        College    VARCHAR(40),
-        Salary     FLOAT
-    )
-      WRAPPER orc_fdw
-      OPTIONS
-        (
-           LOCATION = 's3://sqream-demo-data/nba.orc'
-        );
+	CREATE FOREIGN TABLE ext_nba (
+	  Name TEXT(40),
+	  Team TEXT(40),
+	  Number BIGINT,
+	  Position TEXT(2),
+	  Age BIGINT,
+	  Height TEXT(4),
+	  Weight BIGINT,
+	  College TEXT(40),
+	  Salary FLOAT
+	)
+	WRAPPER
+	  orc_fdw
+	OPTIONS
+	  (LOCATION = 's3://sqream-docs/nba.orc');
 
 .. tip:: 
 
@@ -209,41 +222,46 @@ We will make note of the file structure to create a matching ``CREATE FOREIGN TA
    If the column type isn't supported, a possible workaround is to set it to any arbitrary type and then exclude it from subsequent queries.
 
 
-4. Verify table contents
-====================================
+Verify table contents
+=====================
 
 External tables do not verify file integrity or structure, so verify that the table definition matches up and contains the correct data.
 
 .. code-block:: psql
    
-   t=> SELECT * FROM ext_nba LIMIT 10;
-   Name          | Team           | Number | Position | Age | Height | Weight | College           | Salary  
-   --------------+----------------+--------+----------+-----+--------+--------+-------------------+---------
-   Avery Bradley | Boston Celtics |      0 | PG       |  25 | 6-2    |    180 | Texas             |  7730337
-   Jae Crowder   | Boston Celtics |     99 | SF       |  25 | 6-6    |    235 | Marquette         |  6796117
-   John Holland  | Boston Celtics |     30 | SG       |  27 | 6-5    |    205 | Boston University |         
-   R.J. Hunter   | Boston Celtics |     28 | SG       |  22 | 6-5    |    185 | Georgia State     |  1148640
-   Jonas Jerebko | Boston Celtics |      8 | PF       |  29 | 6-10   |    231 |                   |  5000000
-   Amir Johnson  | Boston Celtics |     90 | PF       |  29 | 6-9    |    240 |                   | 12000000
-   Jordan Mickey | Boston Celtics |     55 | PF       |  21 | 6-8    |    235 | LSU               |  1170960
-   Kelly Olynyk  | Boston Celtics |     41 | C        |  25 | 7-0    |    238 | Gonzaga           |  2165160
-   Terry Rozier  | Boston Celtics |     12 | PG       |  22 | 6-2    |    190 | Louisville        |  1824360
-   Marcus Smart  | Boston Celtics |     36 | PG       |  22 | 6-4    |    220 | Oklahoma State    |  3431040
+	SELECT * FROM ext_nba LIMIT 10;
+	
+	Name          | Team           | Number | Position | Age | Height | Weight | College           | Salary  
+	--------------+----------------+--------+----------+-----+--------+--------+-------------------+---------
+	Avery Bradley | Boston Celtics |      0 | PG       |  25 | 6-2    |    180 | Texas             |  7730337
+	Jae Crowder   | Boston Celtics |     99 | SF       |  25 | 6-6    |    235 | Marquette         |  6796117
+	John Holland  | Boston Celtics |     30 | SG       |  27 | 6-5    |    205 | Boston University |         
+	R.J. Hunter   | Boston Celtics |     28 | SG       |  22 | 6-5    |    185 | Georgia State     |  1148640
+	Jonas Jerebko | Boston Celtics |      8 | PF       |  29 | 6-10   |    231 |                   |  5000000
+	Amir Johnson  | Boston Celtics |     90 | PF       |  29 | 6-9    |    240 |                   | 12000000
+	Jordan Mickey | Boston Celtics |     55 | PF       |  21 | 6-8    |    235 | LSU               |  1170960
+	Kelly Olynyk  | Boston Celtics |     41 | C        |  25 | 7-0    |    238 | Gonzaga           |  2165160
+	Terry Rozier  | Boston Celtics |     12 | PG       |  22 | 6-2    |    190 | Louisville        |  1824360
+	Marcus Smart  | Boston Celtics |     36 | PG       |  22 | 6-4    |    220 | Oklahoma State    |  3431040
 
 If any errors show up at this stage, verify the structure of the ORC files and match them to the external table structure you created.
 
-5. Copying data into SQream DB
-===================================
+Copying data into SQream DB
+===========================
 
 To load the data into SQream DB, use the :ref:`create_table_as` statement:
 
 .. code-block:: postgres
    
-   CREATE TABLE nba AS
-      SELECT * FROM ext_nba;
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  *
+	FROM
+	  ext_nba;
 
-Working around unsupported column types
----------------------------------------------
+Working Around Unsupported Column Types
+---------------------------------------
 
 Suppose you only want to load some of the columns - for example, if one of the columns isn't supported.
 
@@ -252,15 +270,27 @@ By ommitting unsupported columns from queries that access the ``EXTERNAL TABLE``
 For this example, assume that the ``Position`` column isn't supported because of its type.
 
 .. code-block:: postgres
-   
-   CREATE TABLE nba AS
-      SELECT Name, Team, Number, NULL as Position, Age, Height, Weight, College, Salary FROM ext_nba;
+
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  Name,
+	  Team,
+	  Number,
+	  NULL as Position,
+	  Age,
+	  Height,
+	  Weight,
+	  College,
+	  Salary
+	FROM
+	  ext_nba;
    
    -- We omitted the unsupported column `Position` from this query, and replaced it with a default ``NULL`` value, to maintain the same table structure.
 
 
 Modifying data during the copy process
-------------------------------------------
+--------------------------------------
 
 One of the main reasons for staging data with ``EXTERNAL TABLE`` is to examine the contents and modify them before loading them.
 
@@ -270,46 +300,77 @@ Similar to the previous example, we will also set the ``Position`` column as a d
 
 .. code-block:: postgres
    
-   CREATE TABLE nba AS 
-      SELECT name, team, number, NULL as position, age, height, (weight / 2.205) as weight, college, salary 
-              FROM ext_nba
-              ORDER BY weight;
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  name,
+	  team,
+	  number,
+	  NULL as position,
+	  age,
+	  height,
+	  (weight / 2.205) as weight,
+	  college,
+	  salary
+	FROM
+	  ext_nba
+	ORDER BY
+	  weight;
 
 
 Further ORC loading examples
-=======================================
+============================
 
 :ref:`create_foreign_table` contains several configuration options. See more in :ref:`the CREATE FOREIGN TABLE parameters section`.
 
 
 Loading a table from a directory of ORC files on HDFS
-------------------------------------------------------------
+-----------------------------------------------------
 
 .. code-block:: postgres
 
-   CREATE FOREIGN TABLE ext_users
-     (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL)  
-   WRAPPER orc_fdw
-     OPTIONS
-       ( 
-         LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.ORC'
-       );
+	CREATE FOREIGN TABLE ext_users (
+	  id INT NOT NULL,
+	  name TEXT(30) NOT NULL,
+	  email TEXT(50) NOT NULL
+	)
+	WRAPPER
+	  orc_fdw
+	OPTIONS
+	  (
+	    LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.ORC'
+	  );
    
-   CREATE TABLE users AS SELECT * FROM ext_users;
+	CREATE TABLE
+	  users AS
+	SELECT
+	  *
+	FROM
+	  ext_users;
 
 Loading a table from a bucket of files on S3
------------------------------------------------
+--------------------------------------------
 
 .. code-block:: postgres
 
-   CREATE FOREIGN TABLE ext_users
-     (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL)  
-   WRAPPER orc_fdw
-   OPTIONS
-     (  LOCATION = 's3://pp-secret-bucket/users/*.ORC',
-        AWS_ID = 'our_aws_id',
-        AWS_SECRET = 'our_aws_secret'
-      )
-   ;
-   
-   CREATE TABLE users AS SELECT * FROM ext_users;
+	CREATE FOREIGN TABLE ext_users (
+	  id INT NOT NULL,
+	  name TEXT(30) NOT NULL,
+	  email TEXT(50) NOT NULL
+	)
+	WRAPPER
+	  orc_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/users/*.ORC',
+	    AWS_ID = 'our_aws_id',
+	    AWS_SECRET = 'our_aws_secret'
+	  );
+
+
+	CREATE TABLE
+	  users AS
+	SELECT
+	  *
+	FROM
+	  ext_users;
\ No newline at end of file
diff --git a/data_ingestion/parquet.rst b/data_ingestion/parquet.rst
index 800c3122a..fb51a49f4 100644
--- a/data_ingestion/parquet.rst
+++ b/data_ingestion/parquet.rst
@@ -1,27 +1,38 @@
 .. _parquet:
 
-**********************
-Inserting Data from a Parquet File
-**********************
+*******
+Parquet
+*******
 
-This guide covers inserting data from Parquet files into SQream DB using :ref:`FOREIGN TABLE`. 
+Ingesting Parquet files into SQream is generally useful when you want to store the data permanently and run frequent queries on it; ingested data is also easier to join with other tables in your database. However, because Parquet is an open-source, column-oriented storage format, you may prefer to retain your data in external Parquet files and query them directly using a :ref:`FOREIGN TABLE`.
 
-.. contents:: In this topic:
+.. contents:: 
    :local:
+   :depth: 1
+   
+Foreign Data Wrapper Prerequisites
+===================================
+
+Before proceeding, ensure the following Foreign Data Wrapper (FDW) prerequisites:
+
+* **File Existence:** Verify that the file you are ingesting data from exists at the specified path.
 
-1. Prepare the files
-=====================
+* **Path Accuracy:** Confirm that all path elements are present and correctly spelled. Any inaccuracies may lead to data retrieval issues.
+* **Bucket Access Permissions:** Ensure that you have the necessary access permissions to the bucket from which you are ingesting data. Lack of permissions can hinder the data retrieval process.
 
-Prepare the source Parquet files, with the following requirements:
+* **Wildcard Accuracy:** If using wildcards, double-check their spelling and configuration. Misconfigured wildcards may result in unintended data ingestion.
+   
+Preparing Your Parquet Files
+============================
+
+Prepare your source Parquet files according to the requirements described in the following table:
 
 .. list-table:: 
-   :widths: auto
+   :widths: 40 5 20 20 20 20 5 5 5 5 10
    :header-rows: 1
-   :stub-columns: 1
-   
-   * -   SQream DB type →
    
-         Parquet source
+   * -   SQream Type →
+         Parquet Source ↓
      - ``BOOL``
      - ``TINYINT``
      - ``SMALLINT``
@@ -29,11 +40,11 @@ Prepare the source Parquet files, with the following requirements:
      - ``BIGINT``
      - ``REAL``
      - ``DOUBLE``
-     - Text [#f0]_
+     - ``TEXT`` [#f0]_
      - ``DATE``
      - ``DATETIME``
    * - ``BOOLEAN``
-     - ✓ 
+     - Supported 
      - 
      - 
      - 
@@ -46,7 +57,7 @@ Prepare the source Parquet files, with the following requirements:
    * - ``INT16``
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
      - 
@@ -58,7 +69,7 @@ Prepare the source Parquet files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
      - 
@@ -70,7 +81,7 @@ Prepare the source Parquet files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
      - 
@@ -82,7 +93,7 @@ Prepare the source Parquet files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
      - 
@@ -94,7 +105,7 @@ Prepare the source Parquet files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
      - 
@@ -106,7 +117,7 @@ Prepare the source Parquet files, with the following requirements:
      - 
      - 
      - 
-     - ✓
+     - Supported
      - 
      - 
    * - ``INT96`` [#f3]_
@@ -119,13 +130,13 @@ Prepare the source Parquet files, with the following requirements:
      - 
      - 
      - 
-     - ✓ [#f4]_
+     - Supported [#f4]_
 
-* If a Parquet file has an unsupported type like ``enum``, ``uuid``, ``time``, ``json``, ``bson``, ``lists``, ``maps``, but the data is not referenced in the table (it does not appear in the :ref:`SELECT` query), the statement will succeed. If the column is referenced, an error will be thrown to the user, explaining that the type is not supported, but the column may be ommited. This can be worked around. See more information in the examples.
+Your statements will succeed even if your Parquet file contains unsupported types, such as ``enum``, ``uuid``, ``time``, ``json``, ``bson``, ``lists``, and ``maps``, as long as the data is not referenced in the table (it does not appear in the :ref:`SELECT` query). If a column containing an unsupported type is referenced, an error message is displayed explaining that the type is not supported and that the column may be omitted. For solutions to this error, see **Omitting Unsupported Column Types** in the **Examples** section.
 
 .. rubric:: Footnotes
 
-.. [#f0] Text values include ``TEXT``, ``VARCHAR``, and ``NVARCHAR``
+.. [#f0] Text values include ``TEXT``
 
 .. [#f2] With UTF8 annotation
 
@@ -133,163 +144,206 @@ Prepare the source Parquet files, with the following requirements:
 
 .. [#f4] Any microseconds will be rounded down to milliseconds.
 
-2. Place Parquet files where SQream DB workers can access them
-================================================================
+Making Parquet Files Accessible to Workers
+==========================================
 
-Any worker may try to access files (unless explicitly speficied with the :ref:`workload_manager`).
-It is important that every node has the same view of the storage being used - meaning, every SQream DB worker should have access to the files.
+To give workers access to files, every node must have the same view of the storage being used.
 
 * For files hosted on NFS, ensure that the mount is accessible from all servers.
 
-* For HDFS, ensure that SQream DB servers can access the HDFS name node with the correct user-id. See our :ref:`hdfs` guide for more information.
+* For HDFS, ensure that SQream servers have access to the HDFS name node with the correct user-id. For more information, see the :ref:`hdfs` guide.
 
-* For S3, ensure network access to the S3 endpoint. See our :ref:`s3` guide for more information.
+* For S3, ensure network access to the S3 endpoint. For more information, see the :ref:`s3` guide.
 
+Creating a Table
+================
 
-3. Figure out the table structure
-===============================================
+Before loading data, you must create a foreign table whose structure matches that of the file you wish to ingest.
 
-Prior to loading data, you will need to write out the table structure, so that it matches the file structure.
-
-For example, to import the data from ``nba.parquet``, we will first look at the source table:
+The example in this section is based on the source ``nba.parquet`` file shown below:
 
 .. csv-table:: nba.parquet
    :file: nba-t10.csv
    :widths: auto
    :header-rows: 1 
 
-* The file is stored on S3, at ``s3://sqream-demo-data/nba.parquet``.
-
-
-We will make note of the file structure to create a matching ``CREATE EXTERNAL TABLE`` statement.
+The following example shows a :ref:`FOREIGN TABLE` statement whose structure matches the ``nba.parquet`` file:
 
 .. code-block:: postgres
    
-   CREATE FOREIGN TABLE ext_nba
-   (
-        Name       VARCHAR(40),
-        Team       VARCHAR(40),
-        Number     BIGINT,
-        Position   VARCHAR(2),
-        Age        BIGINT,
-        Height     VARCHAR(4),
-        Weight     BIGINT,
-        College    VARCHAR(40),
-        Salary     FLOAT
-    )
-    WRAPPER parquet_fdw
-    OPTIONS
-    (
-      LOCATION =  's3://sqream-demo-data/nba.parquet'
-    );
-
-.. tip:: 
-
-   Types in SQream DB must match Parquet types exactly.
-   
-   If the column type isn't supported, a possible workaround is to set it to any arbitrary type and then exclude it from subsequent queries.
-
-
-4. Verify table contents
-====================================
-
-External tables do not verify file integrity or structure, so verify that the table definition matches up and contains the correct data.
-
-.. code-block:: psql
+	CREATE FOREIGN TABLE ext_nba (
+	  Name TEXT(40),
+	  Team TEXT(40),
+	  Number BIGINT,
+	  Position TEXT(2),
+	  Age BIGINT,
+	  Height TEXT(4),
+	  Weight BIGINT,
+	  College TEXT(40),
+	  Salary FLOAT
+	)
+	WRAPPER
+	  parquet_fdw
+	OPTIONS
+	  (LOCATION = 's3://sqream-docs/nba.parquet');
+
+.. tip:: An exact match must exist between the SQream and Parquet types. For unsupported column types, you can set the column to any arbitrary type and then exclude it from subsequent queries.
+
+.. note:: The **nba.parquet** file is stored on S3 at ``s3://sqream-docs/nba.parquet``.
+
+Ingesting Data into SQream
+==========================
    
-   t=> SELECT * FROM ext_nba LIMIT 10;
-   Name          | Team           | Number | Position | Age | Height | Weight | College           | Salary  
-   --------------+----------------+--------+----------+-----+--------+--------+-------------------+---------
-   Avery Bradley | Boston Celtics |      0 | PG       |  25 | 6-2    |    180 | Texas             |  7730337
-   Jae Crowder   | Boston Celtics |     99 | SF       |  25 | 6-6    |    235 | Marquette         |  6796117
-   John Holland  | Boston Celtics |     30 | SG       |  27 | 6-5    |    205 | Boston University |         
-   R.J. Hunter   | Boston Celtics |     28 | SG       |  22 | 6-5    |    185 | Georgia State     |  1148640
-   Jonas Jerebko | Boston Celtics |      8 | PF       |  29 | 6-10   |    231 |                   |  5000000
-   Amir Johnson  | Boston Celtics |     90 | PF       |  29 | 6-9    |    240 |                   | 12000000
-   Jordan Mickey | Boston Celtics |     55 | PF       |  21 | 6-8    |    235 | LSU               |  1170960
-   Kelly Olynyk  | Boston Celtics |     41 | C        |  25 | 7-0    |    238 | Gonzaga           |  2165160
-   Terry Rozier  | Boston Celtics |     12 | PG       |  22 | 6-2    |    190 | Louisville        |  1824360
-   Marcus Smart  | Boston Celtics |     36 | PG       |  22 | 6-4    |    220 | Oklahoma State    |  3431040
-
-If any errors show up at this stage, verify the structure of the Parquet files and match them to the external table structure you created.
-
-5. Copying data into SQream DB
-===================================
+Syntax
+------
 
-To load the data into SQream DB, use the :ref:`create_table_as` statement:
+You can use the :ref:`create_table_as` statement to load the data into SQream, as shown below:
 
 .. code-block:: postgres
    
-   CREATE TABLE nba AS
-      SELECT * FROM ext_nba;
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  *
+	FROM
+	  ext_nba;
 
-Working around unsupported column types
----------------------------------------------
+Examples
+--------
+
+.. contents:: 
+   :local:
+   :depth: 1
 
-Suppose you only want to load some of the columns - for example, if one of the columns isn't supported.
+Omitting Unsupported Column Types
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-By ommitting unsupported columns from queries that access the ``EXTERNAL TABLE``, they will never be called, and will not cause a "type mismatch" error.
+When loading data, you can omit a column by selecting ``NULL`` in its place. You can use this technique to omit unsupported columns from queries that access external tables; because the omitted columns are never read, they do not generate a "type mismatch" error.
 
-For this example, assume that the ``Position`` column isn't supported because of its type.
+In the example below, the ``Position`` column is not supported due to its type.
 
 .. code-block:: postgres
    
-   CREATE TABLE nba AS
-      SELECT Name, Team, Number, NULL as Position, Age, Height, Weight, College, Salary FROM ext_nba;
-   
-   -- We ommitted the unsupported column `Position` from this query, and replaced it with a default ``NULL`` value, to maintain the same table structure.
-
-
-Modifying data during the copy process
-------------------------------------------
-
-One of the main reasons for staging data with ``EXTERNAL TABLE`` is to examine the contents and modify them before loading them.
-
-Assume we are unhappy with weight being in pounds, because we want to use kilograms instead. We can apply the transformation as part of the :ref:`create_table_as` statement.
-
-Similar to the previous example, we will also set the ``Position`` column as a default ``NULL``.
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  Name,
+	  Team,
+	  Number,
+	  NULL as Position,
+	  Age,
+	  Height,
+	  Weight,
+	  College,
+	  Salary
+	FROM
+	  ext_nba;
+
+Modifying Data Before Loading
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+One of the main reasons for staging data with an ``EXTERNAL TABLE`` is to examine and modify table contents before loading them into SQream.
+
+For example, we can convert weights from **pounds** to **kilograms** using the ``CREATE TABLE AS`` statement.
+
+In the example below, the ``Position`` column is also set to the default ``NULL``.
 
 .. code-block:: postgres
    
-   CREATE TABLE nba AS 
-      SELECT name, team, number, NULL as position, age, height, (weight / 2.205) as weight, college, salary 
-              FROM ext_nba
-              ORDER BY weight;
+	CREATE TABLE
+	  nba AS
+	SELECT
+	  name,
+	  team,
+	  number,
+	  NULL as position,
+	  age,
+	  height,
+	  (weight / 2.205) as weight,
+	  college,
+	  salary
+	FROM
+	  ext_nba
+	ORDER BY
+	  weight;
+
+Loading a Table from a Directory of Parquet Files on HDFS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following is an example of loading a table from a directory of Parquet files on HDFS:
 
+.. code-block:: postgres
 
-Further Parquet loading examples
-=======================================
+	CREATE FOREIGN TABLE ext_users (
+	  id INT NOT NULL,
+	  name TEXT(30) NOT NULL,
+	  email TEXT(50) NOT NULL
+	)
+	WRAPPER
+	  parquet_fdw
+	OPTIONS
+	  (
+	    LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.parquet'
+	  );
+
+	CREATE TABLE
+	  users AS
+	SELECT
+	  *
+	FROM
+	  ext_users;
+
+Loading a Table from a Directory of Parquet Files on S3
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following is an example of loading a table from a directory of Parquet files on S3:
 
-:ref:`create_foreign_table` contains several configuration options. See more in :ref:`the CREATE FOREIGN TABLE parameters section`.
+.. code-block:: postgres
 
+	CREATE FOREIGN TABLE ext_users (
+	  id INT NOT NULL,
+	  name TEXT(30) NOT NULL,
+	  email TEXT(50) NOT NULL
+	)
+	WRAPPER
+	  parquet_fdw
+	OPTIONS
+	  (
+	    LOCATION = 's3://sqream-docs/users/*.parquet',
+	    AWS_ID = 'our_aws_id',
+	    AWS_SECRET = 'our_aws_secret'
+	  );
 
-Loading a table from a directory of Parquet files on HDFS
-------------------------------------------------------------
+	CREATE TABLE
+	  users AS
+	SELECT
+	  *
+	FROM
+	  ext_users;
 
-.. code-block:: postgres
+For more configuration option examples, navigate to the :ref:`create_foreign_table` page and see the **Parameters** table.
 
-   CREATE FOREIGN TABLE ext_users
-     (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL)  
-   WRAPPER parquet_fdw
-   OPTIONS
-     (
-        LOCATION =  'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.parquet'
-     );
-   
-   CREATE TABLE users AS SELECT * FROM ext_users;
+Best Practices
+==============
 
-Loading a table from a bucket of files on S3
------------------------------------------------
+Because external tables do not automatically verify file integrity or structure, SQream recommends manually verifying your table output when ingesting Parquet files into SQream. This lets you confirm that the table output matches the source data.
 
-.. code-block:: postgres
+The following is an example of the output based on the **nba.parquet** table:
 
-   CREATE FOREIGN TABLE ext_users
-     (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL)  
-   WRAPPER parquet_fdw
-   OPTIONS
-     ( LOCATION = 's3://pp-secret-bucket/users/*.parquet',
-       AWS_ID = 'our_aws_id',
-       AWS_SECRET = 'our_aws_secret'
-      );
+.. code-block:: psql
    
-   CREATE TABLE users AS SELECT * FROM ext_users;
+	SELECT * FROM ext_nba LIMIT 10;
+	
+	Name          | Team           | Number | Position | Age | Height | Weight | College           | Salary  
+	--------------+----------------+--------+----------+-----+--------+--------+-------------------+---------
+	Avery Bradley | Boston Celtics |      0 | PG       |  25 | 6-2    |    180 | Texas             |  7730337
+	Jae Crowder   | Boston Celtics |     99 | SF       |  25 | 6-6    |    235 | Marquette         |  6796117
+	John Holland  | Boston Celtics |     30 | SG       |  27 | 6-5    |    205 | Boston University |         
+	R.J. Hunter   | Boston Celtics |     28 | SG       |  22 | 6-5    |    185 | Georgia State     |  1148640
+	Jonas Jerebko | Boston Celtics |      8 | PF       |  29 | 6-10   |    231 |                   |  5000000
+	Amir Johnson  | Boston Celtics |     90 | PF       |  29 | 6-9    |    240 |                   | 12000000
+	Jordan Mickey | Boston Celtics |     55 | PF       |  21 | 6-8    |    235 | LSU               |  1170960
+	Kelly Olynyk  | Boston Celtics |     41 | C        |  25 | 7-0    |    238 | Gonzaga           |  2165160
+	Terry Rozier  | Boston Celtics |     12 | PG       |  22 | 6-2    |    190 | Louisville        |  1824360
+	Marcus Smart  | Boston Celtics |     36 | PG       |  22 | 6-4    |    220 | Oklahoma State    |  3431040
+
+.. note:: If your table output has errors, verify that the structure of the Parquet files correctly corresponds to the external table structure that you created.
\ No newline at end of file
diff --git a/data_ingestion/preparing_oracle_for_data_migration.rst b/data_ingestion/preparing_oracle_for_data_migration.rst
new file mode 100644
index 000000000..86124baed
--- /dev/null
+++ b/data_ingestion/preparing_oracle_for_data_migration.rst
@@ -0,0 +1,216 @@
+.. _preparing_oracle_for_data_migration:
+
+***********************************
+Preparing Oracle for Data Migration
+***********************************
+
+The preparation of incremental and Change Data Capture (CDC) tables is essential for efficiently tracking and managing changes to data over time, enabling streamlined data synchronization and replication.
+
+Preparing CDC Tables
+====================
+
+1. Prepare the data table:
+
+   .. code-block:: sql
+
+	-- Drop the existing table if it exists
+	DROP TABLE cdc_example;
+
+	-- Create the main data table
+	CREATE TABLE cdc_example (
+	    id NUMBER(8) PRIMARY KEY,
+	    id_name VARCHAR(8),
+	    dttm TIMESTAMP,
+	    f_col FLOAT
+	);
+
+	-- Insert initial data into the table
+	INSERT INTO cdc_example (id, id_name, dttm, f_col) VALUES (-1, 'A', CURRENT_TIMESTAMP, 0);
+
+	-- Verify the data in the table
+	SELECT * FROM cdc_example ORDER BY id DESC;
+
+
+2. Prepare the CDC catalog:
+
+   .. code-block:: sql
+
+	-- Drop the CDC table if it exists
+	DROP TABLE cdc_example_cdc;
+
+	-- Create the CDC table to store change data
+	CREATE TABLE cdc_example_cdc (
+	    id NUMBER(8),
+	    id_name VARCHAR(8),
+	    row_id ROWID,
+	    updated_dttm DATE,
+	    type VARCHAR2(1)
+	);
+
+	-- Insert record to CDC_TABLES in the catalog
+	INSERT INTO public.CDC_TABLES (
+	    DB_NAME, 
+	    SCHEMA_NAME, 
+	    TABLE_NAME, 
+	    TABLE_NAME_FULL, 
+	    TABLE_NAME_CDC, 
+	    INC_COLUMN_NAME, 
+	    INC_COLUMN_TYPE, 
+	    LOAD_TYPE, 
+	    FREQ_TYPE, 
+	    FREQ_INTERVAL, 
+	    IS_ACTIVE, 
+	    STATUS_LOAD, 
+	    INC_GAP_VALUE
+	) VALUES (
+	    'ORCL', 
+	    'QA', 
+	    'CDC_EXAMPLE', 
+	    'QA.CDC_EXAMPLE', 
+	    'QA.CDC_EXAMPLE_CDC', 
+	    NULL, 
+	    NULL, 
+	    'CDC', 
+	    NULL, 
+	    NULL, 
+	    1, 
+	    0, 
+	    0
+	);
+
+	-- Insert record to primary keys table in the catalog
+	INSERT INTO public.CDC_TABLE_PRIMARY_KEYS (
+	    DB_NAME, 
+	    SCHEMA_NAME, 
+	    TABLE_NAME, 
+	    TABLE_NAME_FULL, 
+	    CONSTRAINT_NAME, 
+	    COLUMN_NAME, 
+	    IS_NULLABLE
+	) VALUES (
+	    'ORCL', 
+	    'QA', 
+	    'CDC_EXAMPLE', 
+	    'QA.CDC_EXAMPLE', 
+	    NULL, 
+	    'ID', 
+	   0
+	);
+
+
+3. Create a trigger on the data table:
+
+   .. code-block:: sql
+
+	-- Create a trigger on the data table to track changes and populate the CDC table
+	CREATE OR REPLACE TRIGGER cdc_example_tracking 
+	AFTER UPDATE OR INSERT OR DELETE ON cdc_example 
+	FOR EACH ROW 
+	DECLARE 
+	    l_xtn VARCHAR2(1); 
+	    l_id INTEGER; 
+	    l_id_name VARCHAR2(8); -- sized to match cdc_example.id_name
+	    r_rowid ROWID; 
+	BEGIN 
+	    l_xtn := CASE 
+	                 WHEN UPDATING THEN 'U' 
+	                 WHEN INSERTING THEN 'I' 
+	                 WHEN DELETING THEN 'D' 
+	             END; 
+				 
+		l_id_name := CASE 
+	                     WHEN UPDATING THEN :NEW.id_name 
+	                     WHEN INSERTING THEN :NEW.id_name 
+	                     WHEN DELETING THEN :OLD.id_name 
+	                 END; 
+					 
+		l_id := CASE 
+	                WHEN UPDATING THEN :NEW.id 
+	                WHEN INSERTING THEN :NEW.id 
+	                WHEN DELETING THEN :OLD.id 
+	            END; 
+				
+		r_rowid := CASE 
+	                   WHEN UPDATING THEN :NEW.rowid 
+	                   WHEN INSERTING THEN :NEW.rowid 
+	                   WHEN DELETING THEN :OLD.rowid 
+	               END; 
+				   
+		INSERT INTO cdc_example_cdc (
+	        id, 
+	        id_name, 
+	        row_id, 
+	        updated_dttm, 
+	        type
+		) VALUES (
+	        l_id, 
+	        l_id_name, 
+	        r_rowid, 
+	        SYSDATE, 
+	        l_xtn
+	   ); 
+	END;
+
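+4. Optionally, verify that the trigger populates the CDC table; the test values below are illustrative:
+
+   .. code-block:: sql
+
+	-- Fire the trigger with a test change on the data table
+	UPDATE cdc_example SET id_name = 'B' WHERE id = -1;
+
+	-- Each tracked change should appear as a row in the CDC table
+	SELECT * FROM cdc_example_cdc ORDER BY updated_dttm DESC;
+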
+Preparing Incremental Table
+===========================
+
+1. Prepare the data table:
+
+   .. code-block:: sql
+
+	-- Create the data table for incremental loading
+	CREATE TABLE inc_example (
+	    ID INT PRIMARY KEY,
+	    name VARCHAR(8)
+	);
+
+	-- Insert initial data into the table
+	INSERT INTO inc_example (ID, name) VALUES (1, 'A');
+
+	-- Verify the data in the table
+	SELECT * FROM inc_example;
+	
+2. Prepare the CDC catalog:
+
+   .. code-block:: sql
+
+	-- Insert record into CDC_TABLES in the catalog
+	INSERT INTO public.CDC_TABLES (
+	    DB_NAME, 
+	    SCHEMA_NAME, 
+	    TABLE_NAME, 
+	    TABLE_NAME_FULL, 
+	    INC_COLUMN_NAME, 
+	    INC_COLUMN_TYPE, 
+	    LOAD_TYPE, 
+	    IS_ACTIVE, 
+	    STATUS_LOAD
+	) VALUES (
+	    'ORCL', 
+	    'QA', 
+	    'INC_EXAMPLE', 
+	    'QA.INC_EXAMPLE', 
+	    'ID', 
+	    'INT', 
+	    'INC', 
+	    1, 
+	    0
+	);
+
+	-- Insert record into primary keys table in the catalog
+	INSERT INTO public.CDC_TABLE_PRIMARY_KEYS (
+	    DB_NAME, 
+	    SCHEMA_NAME, 
+	    TABLE_NAME, 
+	    TABLE_NAME_FULL, 
+	    COLUMN_NAME, 
+	    IS_NULLABLE
+	) VALUES (
+	    'ORCL', 
+	    'QA', 
+	    'INC_EXAMPLE', 
+	    'QA.INC_EXAMPLE', 
+	    'ID', 
+	    0
+	);
+
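+3. Optionally, simulate new incremental data and verify it in the source table; the inserted rows below are illustrative:
+
+   .. code-block:: sql
+
+	-- Simulate new incremental data arriving in the source table
+	INSERT INTO inc_example (ID, name) VALUES (2, 'B');
+	INSERT INTO inc_example (ID, name) VALUES (3, 'C');
+
+	-- Verify the data in the table
+	SELECT * FROM inc_example ORDER BY ID;
+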
diff --git a/data_ingestion/sqloader.rst b/data_ingestion/sqloader.rst
new file mode 100644
index 000000000..1b81484a7
--- /dev/null
+++ b/data_ingestion/sqloader.rst
@@ -0,0 +1,1170 @@
+.. _sqloader:
+
+*********************
+SQLoader As a Service
+*********************
+
+The **SQLoader** is a Java service that enables you to ingest data into SQreamDB from other DBMSs and DBaaS platforms through HTTP requests, using network insert.
+
+**SQLoader** supports ingesting data from the following DBMSs:
+
+* Greenplum
+* Microsoft SQL Server
+* Oracle (including Oracle Autonomous Database)
+* PostgreSQL
+* SAP HANA
+* Sybase
+* Teradata
+* SQreamDB 4.5.15 or later
+
+.. contents:: 
+   :local:
+   :depth: 1
+   
+Before You Begin
+================
+
+It is essential that you have the following:
+
+* Java 17
+* :ref:`SQLoader configuration files`
+* :ref:`SQLoader.jar file`
+
+Minimum Hardware Requirements
+------------------------------
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+
+   * - Component
+     - Minimum Requirement
+   * - CPU cores
+     - 16
+   * - RAM
+     - 32GB
+
+.. _sqloader_thread_sizing_guideline:
+
+Sizing Guidelines 
+------------------
+
+SQLoader sizing is determined by the number of concurrent tables and threads. Thread count is based on the available CPU cores, limited to the number of cores minus one, with the remaining core reserved for the operating system. Each SQLoader request runs on a single table, so concurrent imports of multiple tables require multiple requests. Note that for partitioned tables, each partition consumes a thread; for performance efficiency, you must take a table's partition count into account when managing thread allocation.
+
+Maximum heap memory formula (in GB): :math:`⌊ 0.8 * (TotalMemory - 4) ⌋`
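+
+For example, on a machine with the minimum recommended 32 GB of RAM, the formula yields :math:`⌊ 0.8 * (32 - 4) ⌋ = 22` GB of heap memory.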
+
+Installation and Connectivity
+=============================
+
+.. _getting_the_sqloader_configuration_and_jar_files:
+
+Getting All Configuration and JAR Files
+---------------------------------------
+
+#. Download the `SQLoader binary `_.
+
+#. Extract the ``.tar`` file using the following command:
+
+   .. code-block:: bash
+
+	tar -xf sqloader_*.tar.gz
+
+   A folder named ``sqloader`` with the following files is created:
+   
+   .. code-block:: 
+
+	├── sqloader-v1.sh
+	├── bin
+	│   ├── sqloader-admin-server-1.1.jar
+	│   └── sqloader-service-8.2.jar
+	├── config
+		├── reserved_words.txt
+		├── sqload-jdbc.properties
+		└── sqream-mapping.json
+   
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - File Name
+     - Description
+   * - ``sqream-mapping.json``
+     - Maps foreign DBMS and DBaaS data types into SQreamDB data types during ingestion
+   * - ``sqload-jdbc.properties``
+     - Used for defining a connection string and may also be used to reconfigure data loading
+   * - ``reserved_words.txt``
+     - A list of reserved words which cannot be used as table and/or column names. 
+   * - ``sqloader-service-8.2.jar``
+     - The SQLoader service JAR file 
+   * - ``sqloader-admin-server-1.0.jar``
+     - The SQLoader admin server JAR file
+   * - ``sqloader-v1.sh``
+     - SQLoader service installer bash file
+	 
+Installation
+------------
+
+Deployment Parameters
+^^^^^^^^^^^^^^^^^^^^^
+
+When using the ``sqloader-v1.sh`` file (installer), the following flags are already configured. 
+
+None of the deployment flags are dynamically adjustable at runtime.
+
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - State
+     - Default
+     - Example 
+     - Description
+   * - ``configDir``
+     - Optional
+     - ``config``
+     - ``java -jar sqloader-service-8.2.jar --configDir=<path>``
+     - Defines the path to the folder containing both the data type mapping and the reserved words files. The defined folder must contain both files or else you will receive an error. This flag affects the mapping and reserved words files and does not affect the properties file 
+   * - ``hzClusterName``
+     - Optional
+     - 
+     - ``java -jar sqloader-service-8.2.jar --hzClusterName=<cluster-name>``
+     - In Hazelcast, a cluster refers to a group of connected Hazelcast instances across different JVMs or machines. By default, these instances connect to the same cluster on the network level, meaning that all SQLoader services that start on a network will connect to each other and share the same queue. An admin can connect to only one Hazelcast cluster at a time. If you start multiple clusters and want to connect them to the admin service, you will need to start multiple admin services, with each service connecting to one of your clusters. It is essential that this flag has the same value across all SQLoader instances.
+   * - ``LOG_DIR``
+     - Optional
+     - ``logs``
+     - ``java -jar -DLOG_DIR=/path/to/log/directory sqloader-service-8.2.jar``
+     - Defines the path of log directory created when loading data. If no value is specified, a ``logs`` folder is created under the same location as the ``sqloader.jar`` file
+   * - ``spring.boot.admin.client.url``
+     - Optional
+     - ``http://localhost:7070``
+     - ``java -jar sqloader-service-8.2.jar --spring.boot.admin.client.url=http://IP:PORT``
+     - SQLoader admin server connection flag
+   * - ``Xmx``
+     - Optional
+     - 
+     - ``java -jar -Xmx<size>g sqloader-service-8.2.jar``
+     - We recommend using the ``Xmx`` flag to set the maximum heap memory allocation for the service. If a single service is running on the machine, we suggest allocating 80% of the total memory minus approximately 4GB, which the service typically needs on average. If multiple services are running on the same machine, calculate the recommended heap size for one service and then divide it by the number of services. Compute formula: :math:`⌊ 0.8 * (TotalMemory - 4) ⌋`
+   * - ``DEFAULT_PROPERTIES``
+     - Mandatory
+     - ``sqload-jdbc.properties``
+     - ``java -jar -DDEFAULT_PROPERTIES=/path/to/file/sqload-jdbc.properties sqloader-service-8.2.jar``
+     - When the service initializes, it looks for the variable DEFAULT_PROPERTIES, which corresponds to the default sqload-jdbc.properties file. Once the service is running with a specified properties file, this setting will remain unchanged as long as the service is operational. To modify it, you must shut down the service, edit the properties file, and then restart the service. Alternatively, you can modify it via a POST request, but this change will only affect the specific load request and not the default setting for all requests.
+	 
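+As a quick reference, the following sketch combines several of the flags above into one launch command; the memory size, paths, and cluster name are illustrative values based on the installer defaults shown below:
+
+.. code-block:: bash
+
+	java -jar -Xmx22g \
+	  -DLOG_DIR=/var/log/sqloader-service \
+	  -DDEFAULT_PROPERTIES=/usr/local/sqloader/config/sqload-jdbc.properties \
+	  sqloader-service-8.2.jar \
+	  --configDir=/usr/local/sqloader/config \
+	  --hzClusterName=sqcluster \
+	  --spring.boot.admin.client.url=http://192.168.5.234:7070
+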
+Installing the Admin Server and SQLoader Service
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+	 
+1. To install the admin server, run the following command (install it only once on one machine):
+
+.. code-block:: 
+
+	sudo ./sqloader-v1.sh -admin
+	
+Output:
+
+.. code-block::
+
+	##################################################################################
+	Welcome to SQloader Admin-Service installation
+	##################################################################################
+	Please Enter JAVA_HOME PATH
+	/opt/java
+	##################################################################################
+	The default PATH to install SQloader Admin Service is /usr/local/sqloader-admin
+	Do you want to change the default PATH ? (y/N)
+	##################################################################################
+	The default PATH to SQloader-Admin logs directory is /var/log/sqloader-admin/logs
+	Do you want to change the default? (y/N)
+	##################################################################################
+	Please enter HZCLUSTERNAME
+	sqcluster
+	##################################################################################
+	SQloader-Admin default port is 7070 , Do you want to change the default port ? (y/N)
+	##################################################################################
+	JAVA_HOME=/opt/java
+	BINDIR=/usr/local/sqloader-admin/
+	LOG_DIR=/var/log/sqloader-admin/
+	JAR=sqloader-admin-server-1.0.jar
+	ADMINPORT=7070
+	HZCLUSTERNAME=sqcluster
+	##################################################################################
+	############# SQLoader-Admin Service installed successfuly #######################
+	##################################################################################
+	To Start SQLoader-Admin Service: sudo systemctl start sqloader-admin
+	To View SQLoader-Admin Service status: sudo systemctl status sqloader-admin
+	##################################################################################
+	
+2. To start the admin server, run the following command:
+
+.. code-block::
+
+	sudo systemctl start sqloader-admin
+	
+3. To verify admin server start status, run the following command (optional):
+
+.. code-block::
+
+	sudo systemctl status sqloader-admin
+	
+4. To install the SQLoader service, run the following command (you can install one instance per machine):
+
+.. code-block:: 
+	
+	sudo ./sqloader-v1.sh -service
+   
+Output:
+
+.. code-block::
+
+	##################################################################################
+	Welocome to SQloader service installation
+	##################################################################################
+	Please Enter JAVA_HOME Path
+	/opt/java
+	##################################################################################
+	The Default PATH to install SQloader Service is /usr/local/sqloader
+	Do you want to change the default? (y/N)
+	##################################################################################
+	The default PATH to SQloader Service logs directory is /var/log/sqloader-service
+	Do you want to change The default? (y/N)
+	##################################################################################
+	Please enter SQloader Admin IP address
+	192.168.5.234
+	##################################################################################
+	Please enter SQloader MEM size in GB
+	20
+	##################################################################################
+	Please enter HZCLUSTERNAME
+	sqcluster
+	##################################################################################
+	Default CONFDIR is /usr/local/sqloader/config , Do you want to change the default CONFDIR ? (y/N)
+	##################################################################################
+	Default SQloader Admin port is 7070 , Do you want to change the default port ? (y/N)
+	##################################################################################
+	Default SQloader Service port is 6060 , Do you want to change the default port ? (y/N)
+	##################################################################################
+	Default sqload-jdbc.properties is /usr/local/sqloader/config, Do you want to change the default? (y/N)
+	Using default sqload-jdbc.properties PATH
+	/usr/local/sqloader/config
+	##################################################################################
+	##################################################################################
+	Using /usr/local/sqloader/config/sqload-jdbc.properties
+	##################################################################################
+	JAVA_HOME=/opt/java
+	BINDIR=/usr/local/sqloader/bin
+	LOG_DIR=/var/log/sqloader-service
+	CONFDIR=/usr/local/sqloader/config
+	JAR=sqloader-service-8.2.jar
+	PROPERTIES_FILE=/usr/local/sqloader/config/sqload-jdbc.properties
+	PORT=6060
+	ADMINIP=192.168.5.234
+	ADMINPORT=7070
+	MEM=20
+	HZCLUSTERNAME=sqcluster
+	##################################################################################
+	############# SQLoader Service installed successfuly #######################
+	##################################################################################
+	To Start SQLoader Service: sudo systemctl start sqloader-service
+	To View SQLoader Service status: sudo systemctl status sqloader-service
+	##################################################################################
+
+5. To start the SQLoader service, run the following command:
+
+.. code-block::
+
+	sudo systemctl start sqloader-service
+	
+6. To verify SQLoader service start status, run the following command (optional):
+
+.. code-block::
+
+	sudo systemctl status sqloader-service
+   
+Reconfiguration
+---------------
+
+**Admin server**
+
+You may reconfigure the admin server even after you have started it.
+
+
+1. To get the configuration path, run the following command:
+
+.. code-block::
+
+	cat /usr/lib/systemd/system/sqloader-admin.service | grep 'EnvironmentFile'
+	
+Output:
+	
+.. code-block::
+
+	EnvironmentFile=/usr/local/sqloader-admin/config/sqloader_admin.conf
+
+2. Restart the admin server:
+
+.. code-block::
+
+	sudo systemctl restart sqloader-admin
+
+**SQLoader service**
+
+You may reconfigure the SQLoader service even after you have started it.
+
+1. To get the configuration path, run the following command:
+
+.. code-block::
+
+	cat /usr/lib/systemd/system/sqloader-service.service | grep 'EnvironmentFile'
+	
+Output:
+	
+.. code-block::
+
+	EnvironmentFile=/usr/local/sqloader/config/sqloader_service.conf
+
+2. Restart the SQLoader service:
+
+.. code-block::
+
+	sudo systemctl restart sqloader-service
+   
+Connection String
+-----------------
+
+It is recommended that the ``sqload-jdbc.properties`` file contain a connection string.
+
+1. Open the ``sqload-jdbc.properties`` file.
+2. Configure connection parameters for:
+
+   a. The source connection string: Greenplum, Microsoft SQL Server, Oracle, PostgreSQL, SAP HANA, Sybase, or Teradata
+   b. The target connection string: SQreamDB
+   c. The :ref:`catalog` connection string: Greenplum, Microsoft SQL Server, Oracle, PostgreSQL, SAP HANA, SQreamDB, Sybase, or Teradata
+
+.. list-table:: Connection String Parameters
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - Description
+   * - ``HostIp:port``
+     - The host IP address and port number
+   * - ``database_name``
+     - The name of the database from which data is loaded
+   * - ``user``
+     - Username of a role to use for connection
+   * - ``password``
+     - Specifies the password of the selected role
+   * - ``ssl``
+     - Specifies SSL for this connection
+
+.. literalinclude:: connection_string.ini
+    :language: ini
+    :caption: Properties File Sample
+    :linenos:
+
+SQLoader Service Interface
+==========================
+
+The SQLoader service automatically detects the IP addresses of incoming HTTP requests, even if the request originates from the same IP address as the one hosting the service. If you are accessing the service using a proxy server, you can include the client IP address in the request itself by using the ``X-Forwarded-For`` HTTP header, as in the following example:
+
+.. code-block::
+
+	curl -X POST -H 'X-Forwarded-For: 192.168.1.2' -H 'Content-Type: application/json' --data '{"loadTypeName": "inc", "sourceSchema": "QA", "sourceTable": "MY_TABLE", "sqreamTable": "MY_TABLE", "sqreamSchema": "QA"}' http://MyPc:6060/load
+
+Supported HTTP Requests
+-----------------------
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+
+   * - Request Type
+     - Request Name
+     - cURL Command
+     - Description
+     - Example
+   * - POST
+     - ``load``
+     - ``curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/load``
+     - Sends a request to the service and returns immediately. This HTTP request is utilized within a load-balancing queue shared across multiple instances. This setup ensures efficient resource utilization by distributing incoming load requests evenly across all available instances. Additionally, the system incorporates :ref:`high availability <high_availability>` mechanisms to recover failed jobs in case an instance crashes, ensuring continuous operation and reliability even during instance failures. Note that at least one instance must remain operational to recover and execute pending jobs; if all instances crash, pending jobs cannot be recovered.
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/load``
+   * - POST
+     - ``syncLoad``
+     - ``curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/syncLoad``
+     - Sends a request to the service and returns once the request is complete. There is no load-balancing queue shared across multiple instances; therefore, it is advised that ``syncLoad`` requests be monitored by the user and sent sparingly. Monitor them using the ``getActiveLoads`` request.
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/syncLoad``
+   * - POST
+     - ``filterLogs``
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"requestId":"", "outputFilePath": ""}' http://127.0.0.1:6060/filterLogs``
+     - Retrieves logs for a specific request ID
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"requestId":"request-1-6a2884a3", "outputFilePath": "/home/avivs/sqloader_request.log"}' http://127.0.0.1:6060/filterLogs``
+   * - GET
+     - ``getActiveLoads``
+     - ``curl --header "Content-Type: application/json" --request GET http://127.0.0.1:6060/getActiveLoads``
+     - Returns a list of all active loads currently running across all services
+     - 
+   * - GET
+     - ``cancelRequest``
+     - ``curl --request GET http://127.0.0.1:6061/cancelRequest/``
+     - Cancels an active request by request ID
+     - ``curl --request GET http://127.0.0.1:6061/cancelRequest/request-2-6aa3c53d``
+
+.. _high_availability:
+
+High Availability
+-----------------
+
+SQLoader as a service supports high availability for asynchronous load requests only. When a service crashes, another service will take over the tasks and execute them from the beginning. However, there are some limited cases where high availability will not provide coverage:
+
+* **At least one service must remain operational**: After a crash, at least one service must be up and running to ensure that tasks can be recovered and executed.
+
+* **Limitations for specific tasks**: When any of the following is configured: 
+
+	* A task is covered only if its ``clustered`` flag is set to ``true``; requests sent with ``clustered=false`` are not recovered.
+
+	* A task involving a full load with ``truncate=false`` and ``drop=false`` will not rerun to prevent data duplication. In this type of load, data is inserted directly into the target table rather than a temporary table, making it impossible to determine if any data was inserted before the crash. 
+
+This setup ensures that asynchronous load requests are handled reliably, even in the event of service failures.
+
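+For example, an asynchronous load request that is eligible for high-availability recovery might look as follows (schema and table names are illustrative):
+
+.. code-block::
+
+	curl --header "Content-Type: application/json" --request POST --data '{"sourceSchema": "QA", "sourceTable": "MY_TABLE", "loadTypeName": "full", "truncate": true, "clustered": true}' http://127.0.0.1:6060/load
+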
+Log Rotation
+------------
+
+Log rotation is based on time and size. At midnight (00:00) or when the file reaches 100MB, rotation occurs. Rotation means the log file ``SQLoader_service.log`` is renamed to ``SQLoader_service_%d_%i.log`` (%d=date, %i=rotation number), and a new, empty ``SQLoader_service.log`` file is created for the SQLoader service to continue writing to.
+
+Automatic Log Cleanup
+^^^^^^^^^^^^^^^^^^^^^
+
+The maximum number of archived log files to keep is set to 360, so Logback will retain the latest 360 log files in the logs directory. Additionally, the total file size in the directory is limited to 50 GB. If the total size of archived log files exceeds this limit, older log files will be deleted to make room for new ones.
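+
+This behavior corresponds to a Logback size-and-time-based rolling policy along the following lines (a sketch; the configuration actually shipped with the service may differ):
+
+.. code-block:: xml
+
+	<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
+	  <file>${LOG_DIR}/SQLoader_service.log</file>
+	  <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
+	    <!-- %d rolls the file daily at midnight; %i increments when maxFileSize is reached -->
+	    <fileNamePattern>${LOG_DIR}/SQLoader_service_%d_%i.log</fileNamePattern>
+	    <maxFileSize>100MB</maxFileSize>
+	    <maxHistory>360</maxHistory>
+	    <totalSizeCap>50GB</totalSizeCap>
+	  </rollingPolicy>
+	  <encoder>
+	    <pattern>%d{yyyy-MM-dd HH:mm:ss} %-5level %logger - %msg%n</pattern>
+	  </encoder>
+	</appender>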
+
+SQLoader Request Parameters
+---------------------------
+
+Mandatory parameters must be configured either as HTTP request parameters or in the ``properties`` file; an example request follows the table.
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - HTTP Parameter
+     - State
+     - Default
+     - Description
+   * - ``clustered``
+     - Optional
+     - ``true``
+     - This flag is relevant only for ``load`` requests (``async``), not for ``syncLoad``. Note that this flag affects :ref:`high availability <high_availability>`. When set to ``true``, the request is directed to one of the available instances within a cluster, often through a load balancer. When set to ``false``, the request goes directly to the specified host without load balancing.
+   * - ``configFile``
+     - Optional
+     - ``sqload-jdbc.properties``
+     - Defines the path to the configuration file you wish to use. If not specified, the service will use the default path provided upon service deployment.
+   * - ``connectionStringSqream``
+     - Mandatory
+     - 
+     - JDBC connection string to SQreamDB
+   * - ``connectionStringSource``
+     - Mandatory
+     - 
+     - JDBC connection string to source database
+   * - ``connectionStringCatalog``
+     - Mandatory
+     - 
+     - JDBC connection string to catalog database
+   * - ``cdcCatalogTable``
+     - Optional
+     - 
+     - Part of the schema within the catalog database. Holds all inc/cdc tables and their settings
+   * - ``cdcTrackingTable``
+     - Optional
+     - 
+     - Part of the schema within the catalog database. Holds the last tracking value for every inc/cdc table from ``cdcCatalogTable`` table	 
+   * - ``cdcPrimaryKeyTable``
+     - Optional
+     - 
+     - Part of the schema within the catalog database. Holds all primary keys for every inc/cdc table from ``cdcCatalogTable`` table	 
+   * - ``loadSummaryTable``
+     - Mandatory
+     - 
+     - Part of the schema within the catalog database. A pre-aggregated table that stores summarized loads, which helps with monitoring and analyzing load activity
+   * - ``batchSize``
+     - Optional
+     - ``10,000``
+     - The number of records to be inserted into SQreamDB at once. Please note that the configured batch size may impact chunk sizes.
+   * - ``caseSensitive``
+     - Optional
+     - ``false``
+     - If ``true``, preserves uppercase and lowercase characters in the table name when the table is created in SQreamDB
+   * - ``checkCdcChain``
+     - Optional
+     - ``false``
+     - Check CDC chain between tracking table and source table 
+   * - ``chunkSize``
+     - Optional
+     - ``0``
+     - The number of records read at once from the source database
+   * - ``columnListFilePath``
+     - Optional
+     - 
+     - The name of the file that contains all column names. Columns must be separated using ``\n``. Expected file type is ``.txt`` 
+   * - ``selectedColumns``
+     - Optional
+     - All columns
+     - The name or names of columns to be loaded into SQreamDB ("col1,col2, ..."). For column names containing uppercase characters, maintain the uppercase format, avoid using double quotes or apostrophes, and ensure that the ``caseSensitive`` parameter is set to ``true``. **Note:** In versions prior to 8.5, the ``selectedColumns`` parameter was named ``columns``.
+   * - ``count``
+     - Optional
+     - ``true``
+     - Defines whether or not table rows will be counted before being loaded into SQreamDB 
+   * - ``cdcDelete``
+     - Optional
+     - ``true``
+     - Defines whether or not loading using Change Data Capture (CDC) includes deleted rows
+   * - ``drop``
+     - Optional
+     - ``false``
+     - Defines whether or not a new target table in SQreamDB is created. If ``false``, you will need to configure a target table name using the ``target`` parameter
+   * - ``fetchSize``
+     - Optional
+     - ``100000``
+     - The number of records to be read at once from source database. 
+   * - ``filter``
+     - Optional
+     - ``1=1``
+     - An SQL filter condition; only records that satisfy it are loaded. The default ``1=1`` loads all records
+   * - ``h, help``
+     - Optional
+     - 
+     - Displays the help menu and exits
+   * - ``limit``
+     - Optional
+     - ``0`` (no limit)
+     - Limits the number of rows to be loaded
+   * - ``loadDttm``
+     - Optional
+     - ``true``
+     - Adds an additional column that records the time and date of loading; the column name is set by ``loadDttmColumnName``
+   * - ``loadDttmColumnName``
+     - Optional
+     - ``sq_load_dttm``
+     - Specifies the name of the additional column that records the time and date of loading. This parameter works in conjunction with the ``loadDttm`` parameter. If ``loadDttm`` is enabled, the column defined by ``loadDttmColumnName`` will be added to the target table.
+   * - ``loadTypeName``
+     - Optional
+     - ``full``
+     - Defines a loading type that affects the table that is created in SQreamDB. Options are ``full``, ``cdc``, or ``inc``. Please note that ``cdc`` is supported for Oracle only and that ``inc`` is supported for Oracle, PostgreSQL, and SQreamDB
+   * - ``lockCheck``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader will check whether the source table is locked before the loading starts
+   * - ``lockTable``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader will lock the target table before the loading starts
+   * - ``partitionName``
+     - Optional
+     - 
+     - Specifies the name of a table partition. If configured, data is loaded according to the specified partition. You may configure the ``threadCount`` parameter for parallel loading of your table partitions; if you do, please ensure that the number of threads does not exceed the number of partitions.
+   * - ``port``
+     - Optional
+     - ``6060``
+     - The SQLoader service port to which the request is sent
+   * - ``rowid``
+     - Optional
+     - ``false``
+     - Defines whether or not SQLoader will get row IDs from Oracle tables
+   * - ``sourceDatabaseName``
+     - Optional
+     - ``ORCL``
+     - Defines the source database name. It does not modify the database connection string but impacts the storage and retrieval of data within catalog tables.
+   * - ``splitByColumn``
+     - Optional
+     - 
+     - Column name for split (required for multi-thread loads)
+   * - ``sourceSchema``
+     - Mandatory
+     -  
+     - Source schema name to load data from
+   * - ``sourceTable``
+     - Mandatory
+     - 
+     - Source table name to load data from
+   * - ``sqreamSchema``
+     - Optional 
+     - The schema name defined in the ``sourceSchema`` flag
+     - Target schema name to load data into
+   * - ``sqreamTable``
+     - Optional
+     - The table name defined in the ``sourceTable`` flag
+     - Target table name to load data into
+   * - ``threadCount``
+     - Optional
+     - ``1``
+     - Number of threads to use for loading. Using multiple threads can significantly improve loading performance, especially when dealing with columns that have metadata statistics (e.g., min/max values). SQLoader automatically divides the data into batches based on the specified thread count, allowing for parallel processing. You may use ``threadCount`` both for tables that are partitioned and tables that are not. See :ref:`Sizing Guidelines <sqloader_thread_sizing_guideline>`
+   * - ``truncate``
+     - Optional
+     - ``false``
+     - Truncate target table before loading
+   * - ``typeMappingPath``
+     - Optional
+     - ``config/sqream-mapping.json``
+     - A mapping file that converts source data types into SQreamDB data types.
+   * - ``useDbmsLob``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader uses ``dbms_lob_substr`` function for ``CLOB`` and ``BLOB`` data types
+   * - ``usePartitions``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader uses partitions in ``SELECT`` statements
+   * - ``validateSourceTable``
+     - Optional
+     - ``true``
+     - Allows control over the validation of table existence during the load.
+   * - ``warnOnIncLoadFilterChanges``
+     - Optional
+     - ``true``
+     - Warns if the filter has changed since the last load and fails the load. If set to ``false``, the new filter is accepted and stored, and future filters are compared against it until it changes again.
+   * - ``autoCreateNewNullableColumn``
+     - Optional
+     - ``false``
+     - Automatically adds new columns found in the source table to the target table during an incremental (``inc``) load. When set to ``false``, a warning is issued and the load fails if the source table has new columns; when set to ``true``, the new columns are added to the target table as well.
+	 
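+
+For example, a request combining several of these parameters might look as follows (schema, table, and column names are illustrative):
+
+.. code-block::
+
+	curl --header "Content-Type: application/json" --request POST --data '{"sourceSchema": "QA", "sourceTable": "MY_TABLE", "selectedColumns": "col1,col2", "caseSensitive": true, "splitByColumn": "col1", "threadCount": 4, "loadTypeName": "full"}' http://127.0.0.1:6060/load
+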
+.. _load_type_name:
+
+Using the ``loadTypeName`` Parameter
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Using the ``loadTypeName`` parameter, you can define how changes to records are captured, in order to track inserts, updates, and deletes for data synchronization and auditing purposes.
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Loading Type
+     - Parameter Option
+     - Supported Databases
+     - Description
+   * - Full Table
+     - ``full``
+     - All supported DBs
+     - The entire data of the source table is loaded into SQreamDB
+   * - Change Data Capture (CDC)
+     - ``cdc``
+     - Oracle
+     - Only changes made to the source table data since the last load are loaded into SQreamDB. Changes include transactions of ``INSERT``, ``UPDATE``, and ``DELETE`` statements. SQLoader recognizes tables by table name and metadata.
+   * - Incremental
+     - ``inc``
+     - Oracle, PostgreSQL, SQreamDB
+     - Only changes made to the source table data since the last load are loaded into SQreamDB. Changes include ``INSERT`` transactions only. SQLoader recognizes the table by table name and metadata.
+	
+	
+Using the SQLoader Service Web Interface
+----------------------------------------
+
+The SQLoader Admin Server is a web-based administration tool specifically designed to manage and monitor the SQLoader service. It provides a user-friendly interface for monitoring data loading processes, managing configurations, and troubleshooting issues related to data loading into SQreamDB.
+
+
+SQLoader Service Web Interface Features
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* Monitor Services:
+
+	* Health Checks: Monitor the health status of services to ensure they are functioning properly.
+	* Metrics: Monitor real-time performance metrics, including CPU usage, memory usage, and response times.
+	* Logging: View logs generated by services for troubleshooting and debugging purposes, and dynamically modify log levels during runtime to adjust verbosity for troubleshooting or performance monitoring.
+	
+* Manage Active Load Requests:
+
+	* View a list of currently active data loading requests, including their status, progress, and relevant metadata.
+
+Creating Summary and Catalog Tables
+===================================
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+The summary and catalog tables are pre-aggregated tables that store summarized or aggregated data.
+
+Creating a Summary Table
+------------------------
+
+The summary table is part of the schema within the database catalog.
+
+The first summary table DDL below uses Oracle syntax; the second uses SQreamDB syntax.
+
+.. note:: 
+
+  If you are migrating from :ref:`SQLoader as a process` to **SQLoader as a service**, as described on this page, it is highly recommended that you add the following columns to your existing summary table instead of re-creating it.
+
+  .. code-block:: sql
+
+	request_id            varchar2(200) default NULL,
+	client_ip             varchar2(200) default NULL,
+	requested_host        varchar2(200) default NULL,
+	acquired_host         varchar2(200) default NULL
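+
+  On Oracle, for example, you might add the columns as follows (a sketch of the corresponding ``ALTER TABLE`` statement):
+
+  .. code-block:: sql
+
+	alter table sqload_summary add (
+	  request_id            varchar2(200) default NULL,
+	  client_ip             varchar2(200) default NULL,
+	  requested_host        varchar2(200) default NULL,
+	  acquired_host         varchar2(200) default NULL
+	);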
+
+.. code-block:: sql
+
+  -- Use this DDL to create summary tables on an Oracle database
+	create table sqload_summary (
+	  db_name               varchar2(200 byte),
+	  schema_name           varchar2(200 byte),
+	  table_name            varchar2(200 byte),
+	  table_name_full       varchar2(200 byte),
+	  load_type             varchar2(200 byte),
+	  updated_dttm_from     date,
+	  updated_dttm_to       date,
+	  last_val_int          number(22,0),
+	  last_val_ts           date,
+	  start_time            timestamp (6),
+	  finish_time           timestamp (6),
+	  elapsed_sec           number(*,0),
+	  row_count             number(*,0),
+	  sql_filter            varchar2(800 byte),
+	  partition             varchar2(200 byte),
+	  stmt_type             varchar2(200 byte),
+	  status                varchar2(200 byte),
+	  log_file              varchar2(200 byte),
+	  db_url                varchar2(400 byte),
+	  partition_count       number(*,0) default 0,
+	  thread_count          number(*,0) default 1,
+	  elapsed_ms            number(*,0) default 0,
+	  status_code           number(*,0) default 0,
+	  elapsed_source_ms     number(38,0) default NULL,
+	  elapsed_source_sec    number(38,0) default NULL,
+	  elapsed_target_ms     number(38,0) default NULL,
+	  elapsed_target_sec    number(38,0) default NULL,
+	  target_db_url         varchar2(200) default NULL,
+	  sqloader_version      varchar2(20) default NULL,
+	  host                  varchar2(200) default NULL,
+	  request_id            varchar2(200) default NULL,
+	  client_ip             varchar2(200) default NULL,
+	  requested_host        varchar2(200) default NULL,
+	  acquired_host         varchar2(200) default NULL
+	);
+
+	create index sqload_summary_idx1 on sqload_summary (db_name,schema_name,table_name);
+	create index sqload_summary_idx2 on sqload_summary (start_time);
+	create index sqload_summary_idx3 on sqload_summary (finish_time);
+	  
+.. code-block:: sql
+
+  -- Use this DDL to create summary tables on SQreamDB databases
+	CREATE TABLE sqload_summary (
+	 DB_NAME TEXT,
+	 SCHEMA_NAME TEXT,
+	 TABLE_NAME TEXT,
+	 TABLE_NAME_FULL TEXT,
+	 LOAD_TYPE TEXT,
+	 UPDATED_DTTM_FROM DATE ,
+	 UPDATED_DTTM_TO DATE ,
+	 LAST_VAL_INT NUMBER,
+	 LAST_VAL_TS DATETIME ,
+	 LAST_VAL_DT2 DATETIME2 ,
+	 START_TIME DATETIME ,
+	 FINISH_TIME DATETIME ,
+	 ELAPSED_SEC NUMBER ,
+	 ROW_COUNT NUMBER ,
+	 SQL_FILTER TEXT,
+	 PARTITION TEXT,
+	 STMT_TYPE TEXT,
+	 STATUS TEXT,
+	 LOG_FILE TEXT,
+	 DB_URL TEXT,
+	 PARTITION_COUNT NUMBER  DEFAULT 0,
+	 THREAD_COUNT NUMBER  DEFAULT 1,
+	 ELAPSED_MS NUMBER  DEFAULT 0,
+	 STATUS_CODE NUMBER  DEFAULT 0,
+	 ELAPSED_SOURCE_MS NUMBER DEFAULT NULL,
+	 ELAPSED_SOURCE_SEC NUMBER DEFAULT NULL,
+	 ELAPSED_TARGET_MS NUMBER DEFAULT NULL,
+	 ELAPSED_TARGET_SEC NUMBER DEFAULT NULL,
+	 TARGET_DB_URL TEXT DEFAULT NULL,
+	 SQLOADER_VERSION TEXT DEFAULT NULL,
+	 CLIENT_IP TEXT DEFAULT NULL,
+	 REQUESTED_HOST TEXT DEFAULT NULL,
+	 ACQUIRED_HOST TEXT DEFAULT NULL,
+	 REQUEST_ID TEXT  DEFAULT NULL
+	 );
+
+.. _creating_catalog_tables:
+
+Creating Catalog Tables
+-----------------------
+
+CDC (Change Data Capture) and Incremental tables are database tables that record changes made to data in order to track inserts, updates, and deletes for data synchronization and auditing purposes.
+
+See :ref:`load_type_name`
+
+.. code-block:: sql
+
+    -- To be used for Oracle
+	create table cdc_tables (
+	  db_name               varchar2(200 byte),
+	  schema_name           varchar2(200 byte),
+	  table_name            varchar2(200 byte),
+	  table_name_full       varchar2(200 byte),
+	  table_name_cdc        varchar2(200 byte),
+	  inc_column_name       varchar2(200 byte),
+	  inc_column_type       varchar2(200 byte),
+	  load_type             varchar2(200 byte),
+	  freq_type             varchar2(200 byte),
+	  freq_interval         number(22,0),
+	  is_active             number(*,0) default 0,
+	  status_load           number(*,0));
+
+	create index cdc_tables_idx1 on cdc_tables (db_name,table_name_full);
+ 
+	create table cdc_table_primary_keys (
+	  db_name               varchar2(200 byte),
+	  schema_name           varchar2(200 byte),
+	  table_name            varchar2(200 byte),
+	  table_name_full       varchar2(200 byte),
+	  constraint_name       varchar2(200 byte),
+	  column_name           varchar2(200 byte),
+	  is_nullable           number(*,0));
+
+	create index cdc_table_primary_keys_idx1 on cdc_table_primary_keys (db_name,table_name_full);
+
+
+	create table cdc_tracking (
+	  db_name               varchar2(200 byte),
+	  schema_name           varchar2(200 byte),
+	  table_name            varchar2(200 byte),
+	  table_name_full       varchar2(200 byte),
+	  last_updated_dttm     date,
+	  last_val_int          number(22,0) default 0,
+	  last_val_ts           timestamp (6),
+	  last_val_dt           date,
+	  filter                varchar2(2000 byte));
+
+	create index cdc_tracking_idx1 on cdc_tracking (db_name,table_name_full);
+
+
+.. code-block:: sql
+	
+	-- To be used for SQreamDB
+	CREATE TABLE cdc_tracking (
+	  DB_NAME TEXT,
+	  SCHEMA_NAME TEXT,
+	  TABLE_NAME TEXT,
+	  TABLE_NAME_FULL TEXT,
+	  LAST_UPDATED_DTTM DATE ,
+	  LAST_VAL_INT NUMBER DEFAULT 0,
+	  LAST_VAL_TS DATETIME,
+	  LAST_VAL_DT DATETIME,
+	  LAST_VAL_DT2 DATETIME2,
+	  FILTER TEXT
+	);
+	
+	CREATE TABLE public.CDC_TABLES (
+	  DB_NAME TEXT(200),
+	  SCHEMA_NAME TEXT(200),
+	  TABLE_NAME TEXT(200),
+	  TABLE_NAME_FULL TEXT(200),
+	  TABLE_NAME_CDC TEXT(200),
+	  INC_COLUMN_NAME TEXT(200),
+	  INC_COLUMN_TYPE TEXT(200),
+	  LOAD_TYPE TEXT(200),
+	  FREQ_TYPE TEXT(200),
+	  FREQ_INTERVAL BIGINT,
+	  IS_ACTIVE INT DEFAULT 0,
+	  STATUS_LOAD INT DEFAULT 0,
+	  INC_GAP_VALUE INT DEFAULT 0
+	);
+
+	CREATE TABLE public.CDC_TABLE_PRIMARY_KEYS (
+	  DB_NAME TEXT(200),
+	  SCHEMA_NAME TEXT(200),
+	  TABLE_NAME TEXT(200),
+	  TABLE_NAME_FULL TEXT(200),
+	  CONSTRAINT_NAME TEXT(200),
+	  COLUMN_NAME TEXT(200),
+	  IS_NULLABLE INT DEFAULT 0
+	);
+
+
+
+Data Type Mapping 
+=================
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+Automatic Mapping
+------------------
+
+The **SQLoader** automatically maps data types used in Greenplum, Microsoft SQL Server, Oracle, Postgresql, Sybase, SAP HANA, and Teradata tables that are loaded into SQreamDB.
+
+Greenplum
+^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Greenplum Type
+     - SQreamDB Type
+   * - ``CHAR``, ``VARCHAR``, ``CHARACTER``
+     - ``TEXT``
+   * - ``TEXT``
+     - ``TEXT``
+   * - ``INT``, ``SMALLINT``, ``BIGINT``, ``INT2``, ``INT4``, ``INT8`` 
+     - ``BIGINT``
+   * - ``DATETIME``, ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BIT``, ``BOOL``
+     - ``BOOL``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``REAL``, ``FLOAT4``
+     - ``REAL``
+
+Microsoft SQL Server
+^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Microsoft SQL Server Type
+     - SQreamDB Type
+   * - ``CHAR``, ``NCHAR``, ``VARCHAR``, ``NVARCHAR``, ``NVARCHAR2``, ``CHARACTER``, ``TEXT``, ``NTEXT``
+     - ``TEXT``
+   * - ``BIGINT``, ``INT``, ``SMALLINT``, ``TINYINT``
+     - ``BIGINT``
+   * - ``DATETIME``, ``TIMESTAMP``, ``SMALLDATETIME``, ``DATETIMEOFFSET``, ``DATETIME2``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BIT``
+     - ``BOOL``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``REAL``
+     - ``REAL``
+   * - ``VARBINARY``
+     - ``TEXT``
+
+Oracle
+^^^^^^^ 
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Oracle Type
+     - SQreamDB Type
+   * - ``BIGINT``, ``INT``, ``SMALLINT``, ``INTEGER``
+     - ``BIGINT``
+   * - ``CHAR``, ``NCHAR``, ``VARCHAR``, ``VARCHAR2``, ``NVARCHAR``, ``CHARACTER``
+     - ``TEXT``
+   * - ``DATE``, ``DATETIME``
+     - ``DATETIME``
+   * - ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BOOLEAN``
+     - ``BOOL``
+   * - ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``CLOB``
+     - ``TEXT``
+   * - ``BLOB``
+     - ``TEXT``
+   * - ``RAW``
+     - ``TEXT``
+
+
+Postgresql
+^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Postgresql Type
+     - SQreamDB Type
+   * - ``CHAR``, ``VARCHAR``, ``CHARACTER``
+     - ``TEXT``
+   * - ``TEXT``
+     - ``TEXT``
+   * - ``INT``, ``SMALLINT``, ``BIGINT``, ``INT2``, ``INT4``, ``INT8`` 
+     - ``BIGINT``
+   * - ``DATETIME``, ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BIT``, ``BOOL``
+     - ``BOOL``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``REAL``, ``FLOAT4``
+     - ``REAL``
+
+SAP HANA
+^^^^^^^^
+	 
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - SAP HANA Type
+     - SQreamDB Type
+   * - ``BIGINT``, ``INT``, ``SMALLINT``, ``INTEGER``, ``TINYINT``
+     - ``BIGINT``
+   * - ``CHAR``, ``VARCHAR``, ``NVARCHAR``, ``TEXT``, ``VARCHAR2``, ``NVARCHAR2``
+     - ``TEXT``
+   * - ``DATETIME``, ``TIMESTAMP``, ``SECONDDATE``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BOOLEAN``
+     - ``TEXT``
+   * - ``DECIMAL``, ``SMALLDECIMAL``, ``BIGDECIMAL``
+     - ``NUMERIC``
+   * - ``DOUBLE``, ``REAL``
+     - ``FLOAT``
+   * - ``TEXT``
+     - ``TEXT``
+   * - ``BIGINT``
+     - ``BIGINT``
+   * - ``INT``
+     - ``INT``
+   * - ``SMALLINT``
+     - ``SMALLINT``
+   * - ``TINYINT``
+     - ``TINYINT``
+   * - ``DATETIME``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BOOL``
+     - ``BOOL``
+   * - ``NUMERIC``
+     - ``NUMERIC``
+   * - ``DOUBLE``
+     - ``DOUBLE``
+   * - ``FLOAT``
+     - ``FLOAT``
+   * - ``REAL``
+     - ``REAL``	 
+	 
+Sybase
+^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Sybase Type
+     - SQreamDB Type
+   * - ``CHAR``, ``VARCHAR``, ``LONG VARCHAR``, ``CHARACTER``, ``TEXT``
+     - ``TEXT``
+   * - ``TINYINT``
+     - ``TINYINT``
+   * - ``SMALLINT``
+     - ``SMALLINT``   
+   * - ``INT``, ``INTEGER``
+     - ``INT``
+   * - ``BIGINT``
+     - ``BIGINT``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``   
+   * - ``NUMERIC(126,38)``
+     - ``NUMERIC(38,10)``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``DATE``
+     - ``DATE``   
+   * - ``DATETIME``, ``TIMESTAMP``, ``TIME``
+     - ``DATETIME``   
+   * - ``BIT``
+     - ``BOOL``   
+   * - ``VARBINARY``, ``BINARY``, ``LONG BINARY``
+     - ``TEXT``   
+
+Teradata
+^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Teradata Type
+     - SQreamDB Type
+   * - ``F``
+     - ``DOUBLE``
+   * - ``N``, ``D``
+     - ``NUMERIC``
+   * - ``CO``
+     - ``TEXT``
+   * - ``BO``
+     - ``TEXT``
+   * - ``A1``, ``AN``, ``AT``, ``BF``, ``BV``, ``CF``, ``CV``, ``JN``, ``PD``, ``PM``, ``PS``, ``PT``, ``PZ``, ``SZ``, ``TZ``
+     - ``TEXT``
+   * - ``I``, ``I4``, ``I(4)``  
+     - ``INT``
+   * - ``I2``, ``I(2)``
+     - ``SMALLINT``
+   * - ``I1``, ``I(1)``
+     - ``TINYINT``
+   * - ``DH``, ``DM``, ``DS``, ``DY``, ``HM``, ``HS``, ``HR``, ``I8``, ``MO``, ``MS``, ``MI``, ``SC``, ``YM``, ``YR``
+     - ``BIGINT``
+   * - ``TS``, ``DATETIME``
+     - ``DATETIME``
+   * - ``DA``
+     - ``DATE``
+   * - ``BIT``
+     - ``BOOL``
+   * - ``REAL``, ``DOUBLE``
+     - ``DOUBLE``
+
+Manually Adjusting Mapping
+----------------------------
+
+You can adjust the mapping process to your specific needs using the ``names`` method described below.
+
+``names`` Method
+^^^^^^^^^^^^^^^^^
+
+To map one or more columns in your table to a specific data type, duplicate the code block that maps to the SQreamDB data type you want and include the ``names`` parameter in it. The SQLoader maps the specified columns to the specified SQreamDB data type. After the specified columns are mapped, the SQLoader continues to search for other source data types that convert to the same SQreamDB data type.
+
+In this example, ``column1``, ``column2``, and ``column3`` are mapped to ``BIGINT`` and the Oracle data types ``BIGINT``, ``INT``, ``SMALLINT``, ``INTEGER`` are also mapped to ``BIGINT``.
+
+.. code-block:: json
+
+	{
+	  "oracle": [
+		{
+		  "names": ["column1", "column2", "column3"],
+		  "sqream": "bigint",
+		  "java": "int",
+		  "length": false
+		},
+		{
+		  "type": ["bigint","int","smallint","integer"],
+		  "sqream": "bigint",
+		  "java": "int",
+		  "length": false
+		}
+	  ]
+	}
+
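+To have the SQLoader use an adjusted mapping file, reference it with the ``typeMappingPath`` request parameter (the file path is illustrative):
+
+.. code-block::
+
+	curl --header "Content-Type: application/json" --request POST --data '{"sourceSchema": "QA", "sourceTable": "MY_TABLE", "typeMappingPath": "/usr/local/sqloader/config/my-mapping.json"}' http://127.0.0.1:6060/load
+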
+
+.. toctree::
+   :maxdepth: 1
+   :glob:
+   :hidden:
+   
+   preparing_oracle_for_data_migration
\ No newline at end of file
diff --git a/data_ingestion/sqloader.rst.bck b/data_ingestion/sqloader.rst.bck
new file mode 100644
index 000000000..0f5644a52
--- /dev/null
+++ b/data_ingestion/sqloader.rst.bck
@@ -0,0 +1,1126 @@
+.. _sqloader:
+
+*********************
+SQLoader As a Service
+*********************
+
+The **SQLoader** is a Java service that enables you to ingest data into SQreamDB from other DBMS and DBaaS through HTTP requests using network insert.
+
+**SQLoader** supports ingesting data from the following DBMSs:
+
+* Greenplum
+* Microsoft SQL Server
+* Oracle (including Oracle Autonomous Database)
+* Postgresql
+* SAP HANA
+* Sybase
+* Teradata
+* SQreamDB 4.5.15 or later
+
+.. contents:: 
+   :local:
+   :depth: 1
+   
+Before You Begin
+================
+
+It is essential that you have the following:
+
+* Java 17
+* :ref:`SQLoader configuration files`
+* :ref:`SQLoader.jar file`
+
+Minimum Hardware Requirements
+------------------------------
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+
+   * - Component
+     - Type
+   * - CPU cores
+     - 16
+   * - RAM
+     - 32GB
+
+.. _sqloader_thread_sizing_guideline:
+
+Sizing Guidelines 
+------------------
+
+The SQLoader sizing is determined by the number of concurrent tables and threads based on the available CPU cores, limiting it to the number of cores minus one, with the remaining core reserved for the operating system. Each SQLoader request runs on a single table, meaning concurrent imports of multiple tables require multiple requests. Additionally, it is important to note that for partitioned tables, each partition consumes a thread. Therefore, for performance efficiency, considering the table's partition count when managing thread allocation is a must.
+
+Compute formula: :math:`⌊ 0.8 * (TotalMemory - 4) ⌋`
+
+Installation and Connectivity
+=============================
+
+.. _getting_the_sqloader_configuration_and_jar_files:
+
+Getting All Configuration and JAR Files
+---------------------------------------
+
+#. Download the SQLoader zip file:
+
+   .. code-block:: console
+
+	https://storage.cloud.google.com/cicd-storage/sqloader_release/sqloader-release-v1.1.zip
+
+#. Extract the ``.tar`` file using the following command:
+
+   .. code-block:: bash
+
+	tar -xf sqloader_srv_v8.2.tar.gz
+
+   A folder named ``sqloader`` with the following files is created:
+   
+   .. code-block:: 
+
+	├── sqloader-v1.sh
+	├── bin
+	│   ├── sqloader-admin-server-1.1.jar
+	│   └── sqloader-service-8.2.jar
+	├── config
+		├── reserved_words.txt
+		├── sqload-jdbc.properties
+		└── sqream-mapping.json
+   
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - File Name
+     - Description
+   * - ``sqream-mapping.json``
+     - Maps foreign DBMS and DBaaS data types into SQreamDB data types during ingestion
+   * - ``sqload-jdbc.properties``
+     - Used for defining a connection string and may also be used to reconfigure data loading
+   * - ``reserved_words.txt``
+     - A list of reserved words which cannot be used as table and/or column names. 
+   * - ``sqloader-service-8.2.jar``
+     - The SQLoader service JAR file 
+   * - ``sqloader-admin-server-1.0.jar``
+     - The SQLoader admin server JAR file
+   * - ``sqloader-v1.sh``
+     - SQLoader service installer bash file
+	 
+Installation
+------------
+
+Deployment Parameters
+^^^^^^^^^^^^^^^^^^^^^
+
+When using the ``sqloader-v1.sh`` file (installer), the following flags are already configured. 
+
+All deployment flags are not dynamically adjustable at runtime. 
+
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - State
+     - Default
+     - Example 
+     - Description
+   * - ``configDir``
+     - Optional
+     - ``config``
+     - ``java -jar sqloaderService-8.2.jar --configDir=``
+     - Defines the path to the folder containing both the data type mapping and the reserved words files. The defined folder must contain both files or else you will receive an error. This flag affects the mapping and reserved words files and does not affect the properties file 
+   * - ``hzClusterName=``
+     - Optional
+     - 
+     - ``java -jar sqloader-service-8.2.jar --hzClusterName=``
+     - In Hazelcast, a cluster refers to a group of connected Hazelcast instances across different JVMs or machines. By default, these instances connect to the same cluster on the network level, meaning that all SQLoader services that start on a network will connect to each other and share the same queue. An admin can connect to only one Hazelcast cluster at a time. If you start multiple clusters and want to connect them to the admin service, you will need to start multiple admin services, with each service connecting to one of your clusters. It is essential that this flag has the same name used here and across all SQLoader instances.
+   * - ``LOG_DIR``
+     - Optional
+     - ``logs``
+     - ``java -jar -DLOG_DIR=/path/to/log/directory sqloader-service-8.2.jar``
+     - Defines the path of log directory created when loading data. If no value is specified, a ``logs`` folder is created under the same location as the ``sqloader.jar`` file
+   * - ``spring.boot.admin.client.url``
+     - Optional
+     - ``http://localhost:7070``
+     - ``java -jar sqloader-service-8.2.jar --spring.boot.admin.client.url=http://IP:PORT``
+     - SQLoader admin server connection flag
+   * - ``Xmx``
+     - Optional
+     - 
+     - ``java -jar -Xmxg sqloader-service-8.2.jar``
+     - We recommend using the ``Xmx`` flag to set the maximum heap memory allocation for the service. If a single service is running on the machine, we suggest allocating 80% of the total memory minus approximately 4GB, which the service typically needs on average. If multiple services are running on the same machine, calculate the recommended heap size for one service and then divide it by the number of services. Compute formula: :math:`⌊ 0.8 * (TotalMemory - 4) ⌋`
+   * - ``DEFAULT_PROPERTIES``
+     - Mandatory
+     - ``sqload-jdbc.properties``
+     - ``java -jar -DDEFAULT_PROPERTIES=/path/to/file/sqload-jdbc.properties sqloader-service-8.2.jar``
+     - When the service initializes, it looks for the variable DEFAULT_PROPERTIES, which corresponds to the default sqload-jdbc.properties file. Once the service is running with a specified properties file, this setting will remain unchanged as long as the service is operational. To modify it, you must shut down the service, edit the properties file, and then restart the service. Alternatively, you can modify it via a POST request, but this change will only affect the specific load request and not the default setting for all requests.
+	 
+Installing the Admin Server and SQLoader Service
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+	 
+1. To install the admin server, run the following command (install it only once on one machine):
+
+.. code-block:: 
+
+	sudo ./sqloader-v1.sh -admin
+	
+Output:
+
+.. code-block::
+
+	##################################################################################
+	Welcome to SQloader Admin-Service installation
+	##################################################################################
+	Please Enter JAVA_HOME PATH
+	/opt/java
+	##################################################################################
+	The default PATH to install SQloader Admin Service is /usr/local/sqloader-admin
+	Do you want to change the default PATH ? (y/N)
+	##################################################################################
+	The default PATH to SQloader-Admin logs directory is /var/log/sqloader-admin/logs
+	Do you want to change the default? (y/N)
+	##################################################################################
+	Please enter HZCLUSTERNAME
+	sqcluster
+	##################################################################################
+	SQloader-Admin default port is 7070 , Do you want to change the default port ? (y/N)
+	##################################################################################
+	JAVA_HOME=/opt/java
+	BINDIR=/usr/local/sqloader-admin/
+	LOG_DIR=/var/log/sqloader-admin/
+	JAR=sqloader-admin-server-1.0.jar
+	ADMINPORT=7070
+	HZCLUSTERNAME=sqcluster
+	##################################################################################
+	############# SQLoader-Admin Service installed successfuly #######################
+	##################################################################################
+	To Start SQLoader-Admin Service: sudo systemctl start sqloader-admin
+	To View SQLoader-Admin Service status: sudo systemctl status sqloader-admin
+	##################################################################################
+	
+2. To start the admin server, run the following command:
+
+.. code-block::
+
+	sudo systemctl start sqloader-admin
+	
+3. To verify admin server start status, run the following command (optional):
+
+.. code-block::
+
+	sudo systemctl status sqloader-admin
+	
+4. To install SQLoader service, run the following command (you can install per machine):
+
+.. code-block:: 
+	
+	sudo ./sqloader-v1.sh -service
+   
+Output:
+
+.. code-block::
+
+	##################################################################################
+	Welocome to SQloader service installation
+	##################################################################################
+	Please Enter JAVA_HOME Path
+	/opt/java
+	##################################################################################
+	The Default PATH to install SQloader Service is /usr/local/sqloader
+	Do you want to change the default? (y/N)
+	##################################################################################
+	The default PATH to SQloader Service logs directory is /var/log/sqloader-service
+	Do you want to change The default? (y/N)
+	##################################################################################
+	Please enter SQloader Admin IP address
+	192.168.5.234
+	##################################################################################
+	Please enter SQloader MEM size in GB
+	20
+	##################################################################################
+	Please enter HZCLUSTERNAME
+	sqcluster
+	##################################################################################
+	Default CONFDIR is /usr/local/sqloader/config , Do you want to change the default CONFDIR ? (y/N)
+	##################################################################################
+	Default SQloader Admin port is 7070 , Do you want to change the default port ? (y/N)
+	##################################################################################
+	Default SQloader Service port is 6060 , Do you want to change the default port ? (y/N)
+	##################################################################################
+	Default sqload-jdbc.properties is /usr/local/sqloader/config, Do you want to change the default? (y/N)
+	Using default sqload-jdbc.properties PATH
+	/usr/local/sqloader/config
+	##################################################################################
+	##################################################################################
+	Using /usr/local/sqloader/config/sqload-jdbc.properties
+	##################################################################################
+	JAVA_HOME=/opt/java
+	BINDIR=/usr/local/sqloader/bin
+	LOG_DIR=/var/log/sqloader-service
+	CONFDIR=/usr/local/sqloader/config
+	JAR=sqloader-service-8.2.jar
+	PROPERTIES_FILE=/usr/local/sqloader/config/sqload-jdbc.properties
+	PORT=6060
+	ADMINIP=192.168.5.234
+	ADMINPORT=7070
+	MEM=20
+	HZCLUSTERNAME=sqcluster
+	##################################################################################
+	############# SQLoader Service installed successfuly #######################
+	##################################################################################
+	To Start SQLoader Service: sudo systemctl start sqloader-service
+	To View SQLoader Service status: sudo systemctl status sqloader-service
+	##################################################################################
+
+5. To start the SQLoader service, run the following command:
+
+.. code-block::
+
+	sudo systemctl start sqloader-service
+	
+6. To verify SQLoader service start status, run the following command (optional):
+
+.. code-block::
+
+	sudo systemctl status sqloader-service
+   
+Reconfiguration
+---------------
+
+**Admin server**
+
+You may reconfigure the admin server even after you have started it.
+
+
+1. To get the configuration path, run the following command:
+
+.. code-block::
+
+	cat /usr/lib/systemd/system/sqloader-admin.service | grep 'EnvironmentFile'
+	
+Output:
+	
+.. code-block::
+
+	EnvironmentFile=/usr/local/sqloader-admin/config/sqloader_admin.conf
+
+2. Restart the admin server:
+
+.. code-block::
+
+	sudo systemctl restart sqloader-admin
+
+**SQLoader service**
+
+You may reconfigure the SQLoader service even after you have started it.
+
+1. To get the configuration path, run the following command:
+
+.. code-block::
+
+	cat /usr/lib/systemd/system/sqloader-service.service | grep 'EnvironmentFile'
+	
+Output:
+	
+.. code-block::
+
+	EnvironmentFile=/usr/local/sqloader/config/sqloader_service.conf
+
+2. Restart the SQLoader service:
+
+.. code-block::
+
+	sudo systemctl restart sqloader-service
+   
+Connection String
+-----------------
+
+It is recommended that the ``sqload-jdbc.properties`` file will contain a connection string.
+
+1. Open the ``sqload-jdbc.properties`` file.
+2. Configure connection parameters for:
+
+   a. The source connection string: Greenplum, Microsoft SQL Server, Oracle, Postgresql, SAP HANA, Sybase or Teradata
+   b. The target connection string: SQreamDB
+   c. The :ref:`catalog` connection string: Greenplum, Microsoft SQL Server, Oracle, Postgresql, SAP HANA, SQreamDB, Sybase, or Teradata
+
+.. list-table:: Connection String Parameters
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - Description
+   * - ``HostIp:port``
+     - The host and IP address number
+   * - ``database_name``
+     - The name of the database from which data is loaded
+   * - ``user``
+     - Username of a role to use for connection
+   * - ``password``
+     - Specifies the password of the selected role
+   * - ``ssl``
+     - Specifies SSL for this connection
+
+.. literalinclude:: connection_string.ini
+    :language: ini
+    :caption: Properties File Sample
+    :linenos:
+
+SQLoader Service Interface
+==========================
+
+The SQLoader service automatically detects the IP addresses of incoming HTTP requests, even if the request originates from the same IP address as the one hosting the service. If you are accessing the service using a proxy server, you can include the client IP address in the request itself by using the ``X-Forwarded-For`` HTTP header, as in the following example:
+
+.. code-block::
+
+	curl -X POST -H 'X-Forwarded-For: 192.168.1.2' -H 'Content-Type: application/json' --data '{"loadTypeName": "inc", "sourceSchema": "QA", "sourceTable": "MY_TABLE", "sqreamTable": "MY_TABLE", "sqreamSchema": "QA"}' http://MyPc:6060/load
+
+Supported HTTP Requests
+-----------------------
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+
+   * - Request Type
+     - Request Name
+     - cURL Command
+     - Description
+     - Example
+   * - POST
+     - ``load``
+     - ``curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/load``
+     - Sends a request to the service and returns immediately. This HTTP request is utilized within a load-balancing queue shared across multiple instances. This setup ensures efficient resource utilization by distributing incoming load requests evenly across all available instances. Additionally, the system incorporates :ref:`high availability` mechanisms to recover failed jobs in case an instance crashes, ensuring continuous operation and reliability even during instance failures. Note that if all instances crash, at least one instance must remain operational to recover and execute pending jobs.
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/load``
+   * - POST
+     - ``syncLoad``
+     - ``curl --header "Content-Type: application/json" --request POST --data '{}' http://127.0.0.1:6060/syncLoad``
+     - Sends a request to the service and returns once the request is complete. There's no load-balancing queue shared across multiple instances; therefore, it's advised that ``syncLoad`` requests be monitored by the user and not heavily sent. Monitor using the ``getActiveLoads`` cURL.
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"sourceTable": "AVIV_INC", "sqreamTable": "t_inc", "limit":2000, "loadTypeName":"full"}' http://127.0.0.1:6060/syncLoad``
+   * - POST
+     - ``filterLogs``
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"requestId":"", "outputFilePath": ""}' http://127.0.0.1:6060/filterLogs``
+     - Retrieves logs for a specific request ID
+     - ``curl --header "Content-Type: application/json" --request POST --data '{"requestId":"request-1-6a2884a3", "outputFilePath": "/home/avivs/sqloader_request.log"}' http://127.0.0.1:6060/filterLogs``
+   * - GET
+     - ``getActiveLoads``
+     - ``curl --header "Content-Type: application/json" --request GET http://127.0.0.1:6060/getActiveLoads``
+     - Returns a list of all active loads currently running across all services
+     - 
+   * - GET
+     - ``cancelRequest``
+     - ``curl --request GET http://127.0.0.1:6061/cancelRequest/``
+     - Cancels an active request by request ID
+     - ``curl --request GET http://127.0.0.1:6061/cancelRequest/request-2-6aa3c53d``
+
+.. _high_availability:
+
+High Availability
+-----------------
+
+SQLoader as a service supports high availability for asynchronous load requests only. When a service crashes, another service will take over the tasks and execute them from the beginning. However, there are some limited cases where high availability will not provide coverage:
+
+* **At least one service must remain operational**: After a crash, at least one service must be up and running to ensure that tasks can be recovered and executed.
+
+* **Limitations for specific tasks**: When any of the following is configured: 
+
+	* A task involving a ``clustered`` flag must be set to ``true`` to enable high availability.
+
+	* A task involving a full load with ``truncate=false`` and ``drop=false`` will not rerun to prevent data duplication. In this type of load, data is inserted directly into the target table rather than a temporary table, making it impossible to determine if any data was inserted before the crash. 
+
+This setup ensures that asynchronous load requests are handled reliably, even in the event of service failures.
+
+Log Rotation
+------------
+
+Log rotation is based on time and size. At midnight (00:00) or when the file reaches 100MB, rotation occurs. Rotation means the log file ``SQLoader_service.log`` is renamed to ``SQLoader_service_%d_%i.log`` (%d=date, %i=rotation number), and a new, empty ``SQLoader_service.log`` file is created for the SQLoader service to continue writing to.
+
+Log Automatic cleanup
+^^^^^^^^^^^^^^^^^^^^^
+
+The maximum number of archived log files to keep is set to 360, so Logback will retain the latest 360 log files in the logs directory. Additionally, the total file size in the directory is limited to 50 GB. If the total size of archived log files exceeds this limit, older log files will be deleted to make room for new ones.
+
+SQLoader Request Parameters
+---------------------------
+
+Mandatory flags must be configured using HTTP flags or the ``properties`` file.
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - HTTP Parameter
+     - State
+     - Default
+     - Description
+   * - ``clustered``
+     - Optional
+     - ``true``
+     - This flag is relevant only for ``load`` requests (``async``), not for ``syncLoad``. Note that this flag affects :ref:`high availability`. When set to ``true``: the request is directed to one of the available instances within a cluster, often through a load balancer. When set to ``false``: the request goes directly to the specified host without load balancing.
+   * - ``configFile``
+     - Optional
+     - ``sqload-jdbc.properties``
+     - Defines the path to the configuration file you wish to use. If not specified, the service will use the default path provided upon service deployment.
+   * - ``connectionStringSqream``
+     - Mandatory
+     - 
+     - JDBC connection string to SQreamDB
+   * - ``connectionStringSource``
+     - Mandatory
+     - 
+     - JDBC connection string to source database
+   * - ``connectionStringCatalog``
+     - Mandatory
+     - 
+     - JDBC connection string to catalog database
+   * - ``cdcCatalogTable``
+     - Optional
+     - 
+     - Part of the schema within the catalog database. Holds all inc/cdc tables and their settings
+   * - ``cdcTrackingTable``
+     - Optional
+     - 
+     - Part of the schema within the catalog database. Holds the last tracking value for every inc/cdc table from ``cdcCatalogTable`` table	 
+   * - ``cdcPrimaryKeyTable``
+     - Optional
+     - 
+     - Part of the schema within the catalog database. Holds all primary keys for every inc/cdc table from ``cdcCatalogTable`` table	 
+   * - ``loadSummaryTable``
+     - Mandatory
+     - 
+     - Part of the schema within the catalog database. Pre-aggregated table that stores summarized loads which can help monitoring and analyzing load	 
+   * - ``batchSize``
+     - Optional
+     - ``10.000``
+     - The number of records to be inserted into SQreamDB at once. Please note that the configured batch size may impact chunk sizes.
+   * - ``caseSensitive``
+     - Optional
+     - ``false``
+     - If ``true``, keeps table name uppercase and lowercase characters when table is created in SQreamDB
+   * - ``checkCdcChain``
+     - Optional
+     - ``false``
+     - Check CDC chain between tracking table and source table 
+   * - ``chunkSize``
+     - Optional
+     - ``0``
+     - The number of records read at once from the source database
+   * - ``columnListFilePath``
+     - Optional
+     - 
+     - The name of the file that contains all column names. Columns must be separated using ``\n``. Expected file type is ``.txt`` 
+   * - ``columns``
+     - Optional
+     - All columns
+     - The name or names of columns to be loaded into SQreamDB ("col1,col2, ..."). For column names containing uppercase characters, maintain the uppercase format, avoid using double quotes or apostrophes, and ensure that the ``caseSensitive`` parameter is set to true
+   * - ``count``
+     - Optional
+     - ``true``
+     - Defines whether or not table rows will be counted before being loaded into SQreamDB 
+   * - ``cdcDelete``
+     - Optional
+     - ``true``
+     - Defines whether or not loading using Change Data Capture (CDC) includes deleted rows
+   * - ``drop``
+     - Optional
+     - ``false``
+     - Defines whether or not a new target table in SQreamDB is created. If ``false``, you will need to configure a target table name using the ``target`` parameter
+   * - ``fetchSize``
+     - Optional
+     - ``100000``
+     - The number of records to be read at once from source database. 
+   * - ``filter``
+     - Optional
+     - ``1=1``
+     - Defines whether or not only records with SQL conditions are loaded
+   * - ``h, help``
+     - Optional
+     - 
+     - Displays the help menu and exits
+   * - ``limit``
+     - Optional
+     - ``0`` (no limit)
+     - Limits the number of rows to be loaded
+   * - ``loadDttm``
+     - Optional
+     - ``true``
+     - Add an additional ``loadDttm`` column that defines the time and date of loading
+   * - ``loadDttmColumnName``
+     - Optional
+     - ``sq_load_dttm``
+     - Specifies the name of the additional column that records the time and date of loading. This parameter works in conjunction with the ``loadDttm`` parameter. If ``loadDttm`` is enabled, the column defined by ``loadDttmColumnName`` will be added to the target table.
+   * - ``loadTypeName``
+     - Optional
+     - ``full``
+     - Defines a loading type that affects the table that is created in SQreamDB. Options are ``full``, ``cdc``, or ``inc``. Please note that ``cdc`` is supported for Oracle only and that ``inc`` is supported for Oracle and Postgresql
+   * - ``lockCheck``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader will check source table is locked before the loading starts
+   * - ``lockTable``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader will lock target table before the loading starts
+   * - ``partitionName``
+     - Optional
+     - 
+     - Specifies the number of table partitions. If configured, ``partition`` ensures that data is loaded according to the specified partition. You may configure the ``thread`` parameter for parallel loading of your table partitions. If you do, please ensure that the number of threads does not exceed the number of partitions.
+   * - ``port``
+     - Optional
+     - ``6060``
+     - 
+   * - ``rowid``
+     - Optional
+     - ``false``
+     - Defines whether or not SQLoader will get row IDs from Oracle tables
+   * - ``sourceDatabaseName``
+     - Optional
+     - ``ORCL``
+     - Defines the source database name. It does not modify the database connection string but impacts the storage and retrieval of data within catalog tables.
+   * - ``splitByColumn``
+     - Optional
+     - 
+     - Column name for split (required for multi-thread loads)
+   * - ``sourceSchema``
+     - Mandatory
+     -  
+     - Source schema name to load data from
+   * - ``sourceTable``
+     - Mandatory
+     - 
+     - Source table name to load data from
+   * - ``sqreamSchema``
+     - Optional 
+     - The schema name defined in the ``sourceSchema`` flag
+     - Target schema name to load data into
+   * - ``sqreamTable``
+     - Optional
+     - The table name defined in the ``sourceTable`` flag
+     - Target table name to load data into
+   * - ``threadCount``
+     - Optional
+     - ``1``
+     - Number of threads to use for loading. Using multiple threads can significantly improve the loading performance, especially when dealing with columns that have metadata statistics (e.g., min/max values). SQLoader will automatically divide the data into batches based on the specified thread number, allowing for parallel processing. You may use ``thread`` both for tables that are partitioned and tables that are not. See :ref:`Sizing Guidelines`
+   * - ``truncate``
+     - Optional
+     - ``false``
+     - Truncate target table before loading
+   * - ``typeMappingPath``
+     - Optional
+     - ``config/sqream-mapping.json``
+     - A mapping file that converts source data types into SQreamDB data types.
+   * - ``useDbmsLob``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader uses ``dbms_lob_substr`` function for ``CLOB`` and ``BLOB`` data types
+   * - ``usePartitions``
+     - Optional
+     - ``true``
+     - Defines whether or not SQLoader uses partitions in ``SELECT`` statements
+   * - ``validateSourceTable``
+     - Optional
+     - ``true``
+     - Allows control over the validation of table existence during the load.
+	 
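+The exact request format depends on how your SQLoader service is deployed; the following is a minimal, hypothetical sketch of a load request using the parameters documented above. The JSON shape and all values are illustrative assumptions only:
+
+.. code-block:: json
+
+	{
+	  "sourceSchema": "public",
+	  "sourceTable": "orders",
+	  "sqreamTable": "orders",
+	  "loadTypeName": "full",
+	  "threadCount": 4,
+	  "splitByColumn": "order_id",
+	  "truncate": true
+	}
+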
+.. _load_type_name:
+
+Using the ``loadTypeName`` Parameter
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Using the ``loadTypeName`` parameter, you can define how changes to records are applied during loading in order to track inserts, updates, and deletes for data synchronization and auditing purposes.
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Loading Type
+     - Parameter Option
+     - Supported Databases
+     - Description
+   * - Full Table
+     - ``full``
+     - All
+     - The entire data of the source table is loaded into SQreamDB
+   * - Change Data Capture (CDC)
+     - ``cdc``
+     - Oracle
+     - Only changes made to the source table data since the last load are loaded into SQreamDB. Changes include transactions of ``INSERT``, ``UPDATE``, and ``DELETE`` statements. SQLoader recognizes tables by table name and metadata.
+   * - Incremental
+     - ``inc``
+     - Oracle, PostgreSQL, SQreamDB
+     - Only changes made to the source table data since the last load are loaded into SQreamDB. Changes include transactions of ``INSERT`` statements. SQLoader recognizes tables by table name and metadata.
+
+For the catalog tables required by the ``cdc`` and ``inc`` loading types, see :ref:`creating_catalog_tables`.
+
+	
+Using the SQLoader Service Web Interface
+----------------------------------------
+
+The SQLoader Admin Server is a web-based administration tool specifically designed to manage and monitor the SQLoader service. It provides a user-friendly interface for monitoring data loading processes, managing configurations, and troubleshooting issues related to data loading into SQreamDB.
+
+
+SQLoader Service Web Interface Features
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* Monitor Services:
+
+	* Health Checks: Monitor the health status of services to ensure they are functioning properly.
+	* Metrics: Monitor real-time performance metrics, including CPU usage, memory usage, and response times.
+	* Logging: View logs generated by services for troubleshooting and debugging purposes, and dynamically modify log levels during runtime to adjust verbosity for troubleshooting or performance monitoring.
+	
+* Manage Active Load Requests:
+
+	* View a list of currently active data loading requests, including their status, progress, and relevant metadata.
+
+Creating Summary and Catalog Tables
+===================================
+
+The summary and catalog tables store aggregated information about SQLoader activity: the summary table records the outcome of each load, and the catalog tables track CDC and incremental loads.
+
+Creating a Summary Table
+------------------------
+
+The summary table is part of the schema within the database catalog.
+
+The following summary table DDL statements are provided in both Oracle syntax (for non-SQreamDB databases) and SQreamDB syntax.
+
+.. note:: 
+
+  If you are migrating from :ref:`SQLoader as a process` to **SQLoader as a service**, as described on this page, it is highly recommended that you add the following columns to your existing summary table instead of re-creating it.
+
+  .. code-block:: sql
+
+    REQUEST_ID TEXT (200 BYTE) VISIBLE DEFAULT NULL    
+    CLIENT_IP TEXT (200 BYTE) VISIBLE DEFAULT NULL
+    REQUESTED_HOST TEXT (200 BYTE) VISIBLE DEFAULT NULL
+    ACQUIRED_HOST TEXT (200 BYTE) VISIBLE DEFAULT NULL
+
+.. code-block:: sql
+
+	-- Use this DDL to create summary tables on non-SQreamDB databases
+	CREATE TABLE public.SQLOAD_SUMMARY (
+	 DB_NAME TEXT(200 BYTE) VISIBLE,
+	 SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
+	 TABLE_NAME TEXT(200 BYTE) VISIBLE,
+	 TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
+	 LOAD_TYPE TEXT(200 BYTE) VISIBLE,
+	 UPDATED_DTTM_FROM DATE VISIBLE,
+	 UPDATED_DTTM_TO DATE VISIBLE,
+	 LAST_VAL_INT NUMBER(22,0) VISIBLE,
+	 LAST_VAL_TS TIMESTAMP(6) VISIBLE,
+	 START_TIME TIMESTAMP(6) VISIBLE,
+	 FINISH_TIME TIMESTAMP(6) VISIBLE,
+	 ELAPSED_SEC NUMBER VISIBLE,
+	 ROW_COUNT NUMBER VISIBLE,
+	 SQL_FILTER TEXT(200 BYTE) VISIBLE,
+	 PARTITION TEXT(200 BYTE) VISIBLE,
+	 STMT_TYPE TEXT(200 BYTE) VISIBLE,
+	 STATUS TEXT(200 BYTE) VISIBLE,
+	 LOG_FILE TEXT(200 BYTE) VISIBLE,
+	 DB_URL TEXT(200 BYTE) VISIBLE,
+	 PARTITION_COUNT NUMBER VISIBLE DEFAULT 0,
+	 THREAD_COUNT NUMBER VISIBLE DEFAULT 1,
+	 ELAPSED_MS NUMBER VISIBLE DEFAULT 0,
+	 STATUS_CODE NUMBER VISIBLE DEFAULT 0,
+	 ELAPSED_SOURCE_MS NUMBER(38,0) DEFAULT NULL,
+	 ELAPSED_SOURCE_SEC NUMBER(38,0) DEFAULT NULL,
+	 ELAPSED_TARGET_MS NUMBER(38,0) DEFAULT NULL,
+	 ELAPSED_TARGET_SEC NUMBER(38,0) DEFAULT NULL,
+	 TARGET_DB_URL TEXT (200 BYTE) DEFAULT NULL,
+	 SQLOADER_VERSION TEXT (200 BYTE) DEFAULT NULL,
+	 CLIENT_IP TEXT (200 BYTE) DEFAULT NULL,
+	 REQUESTED_HOST TEXT (200 BYTE) DEFAULT NULL,
+	 ACQUIRED_HOST TEXT (200 BYTE) DEFAULT NULL,
+	 REQUEST_ID TEXT (200 BYTE) VISIBLE DEFAULT NULL
+	);
+  
+.. code-block:: sql
+
+	-- Use this DDL to create summary tables on SQreamDB databases
+	CREATE TABLE sqload_summary (
+	 DB_NAME TEXT,
+	 SCHEMA_NAME TEXT,
+	 TABLE_NAME TEXT,
+	 TABLE_NAME_FULL TEXT,
+	 LOAD_TYPE TEXT,
+	 UPDATED_DTTM_FROM DATE ,
+	 UPDATED_DTTM_TO DATE ,
+	 LAST_VAL_INT NUMBER,
+	 LAST_VAL_TS DATETIME ,
+	 START_TIME DATETIME ,
+	 FINISH_TIME DATETIME ,
+	 ELAPSED_SEC NUMBER ,
+	 ROW_COUNT NUMBER ,
+	 SQL_FILTER TEXT,
+	 PARTITION TEXT,
+	 STMT_TYPE TEXT,
+	 STATUS TEXT,
+	 LOG_FILE TEXT,
+	 DB_URL TEXT,
+	 PARTITION_COUNT NUMBER  DEFAULT 0,
+	 THREAD_COUNT NUMBER  DEFAULT 1,
+	 ELAPSED_MS NUMBER  DEFAULT 0,
+	 STATUS_CODE NUMBER  DEFAULT 0,
+	 ELAPSED_SOURCE_MS NUMBER DEFAULT NULL,
+	 ELAPSED_SOURCE_SEC NUMBER DEFAULT NULL,
+	 ELAPSED_TARGET_MS NUMBER DEFAULT NULL,
+	 ELAPSED_TARGET_SEC NUMBER DEFAULT NULL,
+	 TARGET_DB_URL TEXT DEFAULT NULL,
+	 SQLOADER_VERSION TEXT DEFAULT NULL,
+	 CLIENT_IP TEXT DEFAULT NULL,
+	 REQUESTED_HOST TEXT DEFAULT NULL,
+	 ACQUIRED_HOST TEXT DEFAULT NULL,
+	 REQUEST_ID TEXT  DEFAULT NULL
+	 );
+
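+Once loads have run, the summary table can be queried directly to review recent activity. The following sketch uses the column names defined in the DDL above:
+
+.. code-block:: sql
+
+	SELECT table_name_full, load_type, status, row_count, elapsed_sec
+	FROM sqload_summary
+	ORDER BY start_time DESC
+	LIMIT 10;
+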
+.. _creating_catalog_tables:
+
+Creating Catalog Tables
+-----------------------
+
+CDC (Change Data Capture) and Incremental tables are database tables that record changes made to data in order to track inserts, updates, and deletes for data synchronization and auditing purposes.
+
+See :ref:`load_type_name` for the loading types that rely on these tables.
+
+.. code-block:: sql
+
+    -- To be used for Oracle
+    CREATE TABLE public.CDC_TRACKING (
+	  DB_NAME TEXT(200 BYTE) VISIBLE,
+	  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
+	  LAST_UPDATED_DTTM DATE VISIBLE,
+	  LAST_VAL_INT NUMBER(22,0) VISIBLE DEFAULT 0,
+	  LAST_VAL_TS TIMESTAMP(6) VISIBLE,
+	  LAST_VAL_DT DATE VISIBLE
+	);
+
+.. code-block:: sql
+	
+	-- To be used for SQreamDB
+	CREATE TABLE cdc_tracking (
+	 DB_NAME TEXT,
+	 SCHEMA_NAME TEXT,
+	 TABLE_NAME TEXT,
+	 TABLE_NAME_FULL TEXT,
+	 LAST_UPDATED_DTTM DATE ,
+	 LAST_VAL_INT NUMBER DEFAULT 0,
+	 LAST_VAL_TS DATETIME,
+	 LAST_VAL_DT DATETIME
+	);
+
+.. code-block:: sql
+
+	CREATE TABLE public.CDC_TABLES (
+	  DB_NAME TEXT(200 BYTE) VISIBLE,
+	  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME_CDC TEXT(200 BYTE) VISIBLE,
+	  INC_COLUMN_NAME TEXT(200 BYTE) VISIBLE,
+	  INC_COLUMN_TYPE TEXT(200 BYTE) VISIBLE,
+	  LOAD_TYPE TEXT(200 BYTE) VISIBLE,
+	  FREQ_TYPE TEXT(200 BYTE) VISIBLE,
+	  FREQ_INTERVAL NUMBER(22,0) VISIBLE,
+	  IS_ACTIVE NUMBER VISIBLE DEFAULT 0,
+	  STATUS_LOAD NUMBER VISIBLE DEFAULT 0,
+	  INC_GAP_VALUE NUMBER VISIBLE DEFAULT 0
+	);
+
+	CREATE TABLE public.CDC_TABLE_PRIMARY_KEYS (
+	  DB_NAME TEXT(200 BYTE) VISIBLE,
+	  SCHEMA_NAME TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME TEXT(200 BYTE) VISIBLE,
+	  TABLE_NAME_FULL TEXT(200 BYTE) VISIBLE,
+	  CONSTRAINT_NAME TEXT(200 BYTE) VISIBLE,
+	  COLUMN_NAME TEXT(200 BYTE) VISIBLE,
+	  IS_NULLABLE NUMBER VISIBLE DEFAULT 0
+	);
+
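+As a hypothetical example, registering a source table for incremental loading could look like the following insert into ``CDC_TABLES``. Whether you populate this table manually depends on your deployment, and all values below are illustrative assumptions:
+
+.. code-block:: sql
+
+	INSERT INTO public.CDC_TABLES (
+	  DB_NAME, SCHEMA_NAME, TABLE_NAME, TABLE_NAME_FULL, TABLE_NAME_CDC,
+	  INC_COLUMN_NAME, INC_COLUMN_TYPE, LOAD_TYPE, IS_ACTIVE
+	) VALUES (
+	  'ORCL', 'HR', 'EMPLOYEES', 'HR.EMPLOYEES', 'HR.EMPLOYEES_CDC',
+	  'UPDATED_AT', 'TIMESTAMP', 'inc', 1
+	);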
+
+Data Type Mapping 
+=================
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+Automatic Mapping
+------------------
+
+The **SQLoader** automatically maps data types used in Greenplum, Microsoft SQL Server, Oracle, PostgreSQL, Sybase, SAP HANA, and Teradata tables that are loaded into SQreamDB.
+
+Greenplum
+^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Greenplum Type
+     - SQreamDB Type
+   * - ``CHAR``, ``VARCHAR``, ``CHARACTER``
+     - ``TEXT``
+   * - ``TEXT``
+     - ``TEXT``
+   * - ``INT``, ``SMALLINT``, ``BIGINT``, ``INT2``, ``INT4``, ``INT8`` 
+     - ``BIGINT``
+   * - ``DATETIME``, ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BIT``, ``BOOL``
+     - ``BOOL``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``REAL``, ``FLOAT4``
+     - ``REAL``
+
+Microsoft SQL Server
+^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Microsoft SQL Server Type
+     - SQreamDB Type
+   * - ``CHAR``, ``NCHAR``, ``VARCHAR``, ``NVARCHAR``, ``NVARCHAR2``, ``CHARACTER``, ``TEXT``, ``NTEXT``
+     - ``TEXT``
+   * - ``BIGINT``, ``INT``, ``SMALLINT``, ``INT``, ``TINYINT``
+     - ``BIGINT``
+   * - ``DATETIME``, ``TIMESTAMP``, ``SMALLDATETIME``, ``DATETIMEOFFSET``, ``DATETIME2``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BIT``
+     - ``BOOL``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``REAL``
+     - ``REAL``
+   * - ``VARBINARY``
+     - ``TEXT``
+
+Oracle
+^^^^^^^ 
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Oracle Type
+     - SQreamDB Type
+   * - ``BIGINT``, ``INT``, ``SMALLINT``, ``INTEGER``
+     - ``BIGINT``
+   * - ``CHAR``, ``NCHAR``, ``VARCHAR``, ``VARCHAR2``, ``NVARCHAR``, ``CHARACTER``
+     - ``TEXT``
+   * - ``DATE``, ``DATETIME``
+     - ``DATETIME``
+   * - ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BOOLEAN``
+     - ``BOOL``
+   * - ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``CLOB``
+     - ``TEXT``
+   * - ``BLOB``
+     - ``TEXT``
+   * - ``RAW``
+     - ``TEXT``
+
+
+PostgreSQL
+^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - PostgreSQL Type
+     - SQreamDB Type
+   * - ``CHAR``, ``VARCHAR``, ``CHARACTER``
+     - ``TEXT``
+   * - ``TEXT``
+     - ``TEXT``
+   * - ``INT``, ``SMALLINT``, ``BIGINT``, ``INT2``, ``INT4``, ``INT8`` 
+     - ``BIGINT``
+   * - ``DATETIME``, ``TIMESTAMP``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BIT``, ``BOOL``
+     - ``BOOL``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``REAL``, ``FLOAT4``
+     - ``REAL``
+
+SAP HANA
+^^^^^^^^
+	 
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - SAP HANA Type
+     - SQreamDB Type
+   * - ``BIGINT``, ``INT``, ``SMALLINT``, ``INTEGER``, ``TINYINT``
+     - ``BIGINT``
+   * - ``CHAR``, ``VARCHAR``, ``NVARCHAR``, ``TEXT``, ``VARCHAR2``, ``NVARCHAR2``
+     - ``TEXT``
+   * - ``DATETIME``, ``TIMESTAMP``, ``SECONDDATE``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BOOLEAN``
+     - ``TEXT``
+   * - ``DECIMAL``, ``SMALLDECIMAL``, ``BIGDECIMAL``
+     - ``NUMERIC``
+   * - ``DOUBLE``, ``REAL``
+     - ``FLOAT``
+   * - ``TEXT``
+     - ``TEXT``
+   * - ``BIGINT``
+     - ``BIGINT``
+   * - ``INT``
+     - ``INT``
+   * - ``SMALLINT``
+     - ``SMALLINT``
+   * - ``TINYINT``
+     - ``TINYINT``
+   * - ``DATETIME``
+     - ``DATETIME``
+   * - ``DATE``
+     - ``DATE``
+   * - ``BOOL``
+     - ``BOOL``
+   * - ``NUMERIC``
+     - ``NUMERIC``
+   * - ``DOUBLE``
+     - ``DOUBLE``
+   * - ``FLOAT``
+     - ``FLOAT``
+   * - ``REAL``
+     - ``REAL``	 
+	 
+Sybase
+^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Sybase Type
+     - SQreamDB Type
+   * - ``CHAR``, ``VARCHAR``, ``LONG VARCHAR``, ``CHARACTER``, ``TEXT``
+     - ``TEXT``
+   * - ``TINYINT``
+     - ``TINYINT``
+   * - ``SMALLINT``
+     - ``SMALLINT``   
+   * - ``INT``, ``INTEGER``
+     - ``INT``
+   * - ``BIGINT``
+     - ``BIGINT``
+   * - ``DECIMAL``, ``NUMERIC``
+     - ``NUMERIC``   
+   * - ``NUMERIC(126,38)``
+     - ``NUMERIC(38,10)``
+   * - ``FLOAT``, ``DOUBLE``
+     - ``DOUBLE``
+   * - ``DATE``
+     - ``DATE``   
+   * - ``DATETIME``, ``TIMESTAMP``, ``TIME``
+     - ``DATETIME``   
+   * - ``BIT``
+     - ``BOOL``   
+   * - ``VARBINARY``, ``BINARY``, ``LONG BINARY``
+     - ``TEXT``   
+
+Teradata
+^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Teradata Type
+     - SQreamDB Type
+   * - ``F``
+     - ``DOUBLE``
+   * - ``N``, ``D``
+     - ``NUMERIC``
+   * - ``CO``
+     - ``TEXT``
+   * - ``BO``
+     - ``TEXT``
+   * - ``A1``, ``AN``, ``AT``, ``BF``, ``BV``, ``CF``, ``CV``, ``JN``, ``PD``, ``PM``, ``PS``, ``PT``, ``PZ``, ``SZ``, ``TZ``
+     - ``TEXT``
+   * - ``I``, ``I4``, ``I(4)``  
+     - ``INT``
+   * - ``I2``, ``I(2)``
+     - ``SMALLINT``
+   * - ``I1``, ``I(1)``
+     - ``TINYINT``
+   * - ``DH``, ``DM``, ``DS``, ``DY``, ``HM``, ``HS``, ``HR``, ``I8``, ``MO``, ``MS``, ``MI``, ``SC``, ``YM``, ``YR``
+     - ``BIGINT``
+   * - ``TS``, ``DATETIME``
+     - ``DATETIME``
+   * - ``DA``
+     - ``DATE``
+   * - ``BIT``
+     - ``BOOL``
+   * - ``REAL``, ``DOUBLE``
+     - ``DOUBLE``
+
+Manually Adjusting Mapping
+----------------------------
+
+You can adjust the mapping process to your specific needs using any of the following methods.
+
+``names`` Method
+^^^^^^^^^^^^^^^^^
+
+To map one or more columns in your table to a specific data type, duplicate the code block that maps to the SQreamDB data type you want and include the ``names`` parameter in your code block. SQLoader maps the specified columns to the specified SQreamDB data type. After the specified columns are mapped, SQLoader continues to search for how to convert other data types to the same data type as the specified columns.
+
+In this example, ``column1``, ``column2``, and ``column3`` are mapped to ``BIGINT`` and the Oracle data types ``BIGINT``, ``INT``, ``SMALLINT``, ``INTEGER`` are also mapped to ``BIGINT``.
+
+.. code-block:: json
+
+	{
+	  "oracle": [
+		{
+		  "names": ["column1", "column2", "column3"],
+		  "sqream": "bigint",
+		  "java": "int",
+		  "length": false
+		},
+		{
+		  "type": ["bigint","int","smallint","integer"],
+		  "sqream": "bigint",
+		  "java": "int",
+		  "length": false
+		}
+	  ]
+	}
+
+
+.. toctree::
+   :maxdepth: 1
+   :glob:
+   :hidden:
+   
+   preparing_oracle_for_data_migration
\ No newline at end of file
diff --git a/data_type_guides/converting_and_casting_types.rst b/data_type_guides/converting_and_casting_types.rst
index ee5e273da..ba9efc9fd 100644
--- a/data_type_guides/converting_and_casting_types.rst
+++ b/data_type_guides/converting_and_casting_types.rst
@@ -1,9 +1,10 @@
 .. _converting_and_casting_types:
 
-*************************
-Converting and Casting Types
-*************************
-SQream supports explicit and implicit casting and type conversion. The system may automatically add implicit casts when combining different data types in the same expression. In many cases, while the details related to this are not important, they can affect the query results of a query. When necessary, an explicit cast can be used to override the automatic cast added by SQream DB.
+*********************
+Casts and Conversions
+*********************
+
+SQreamDB supports explicit and implicit casting and type conversion. The system may automatically add implicit casts when combining different data types in the same expression. In many cases, while the details related to this are not important, they can affect the results of a query. When necessary, an explicit cast can be used to override the automatic cast added by SQreamDB.
 
 For example, the ANSI standard defines a ``SUM()`` aggregation over an ``INT`` column as an ``INT``. However, when dealing with large amounts of data this could cause an overflow. 
 
@@ -13,14 +14,116 @@ You can rectify this by casting the value to a larger data type, as shown below:
 
    SUM(some_int_column :: BIGINT)
 
-SQream supports the following three data conversion types:
+Conversion Methods
+==================
+
+SQreamDB supports the following data conversion methods:
 
-* ``CAST( TO )``, to convert a value from one type to another. For example, ``CAST('1997-01-01' TO DATE)``, ``CAST(3.45 TO SMALLINT)``, ``CAST(some_column TO VARCHAR(30))``.
+* ``CAST(<value> AS <type>)``, to convert a value from one type to another.
 
-   ::
+  For example: 
+  
+  .. code-block:: postgres
+	
+	CAST('1997-01-01' AS DATE)
+	CAST(3.45 AS SMALLINT)
+	CAST(some_column AS TEXT)
   
-* `` :: ``, a shorthand for the ``CAST`` syntax. For example, ``'1997-01-01' :: DATE``, ``3.45 :: SMALLINT``, ``(3+5) :: BIGINT``.
+* ``<value> :: <type>``, a shorthand for the ``CAST`` syntax.
 
-   ::
+  For example: 
+  
+  .. code-block:: postgres
+  
+	'1997-01-01' :: DATE 
+	3.45 :: SMALLINT 
+	(3+5) :: BIGINT
   
-* See the :ref:`SQL functions reference ` for additional functions that convert from a specific value which is not an SQL type, such as :ref:`from_unixts`, etc.
\ No newline at end of file
+* See the :ref:`SQL functions reference ` for additional functions that convert from a specific value which is not an SQL type, such as :ref:`from_unixts`, etc.
+
+
+Supported Casts
+===============
+
+The following table of supported casts also applies to the :ref:`sql_data_type_array` data type. For instance, you can cast a ``NUMERIC[]`` array to a ``TEXT[]`` array.
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - **FROM / TO**
+     - **BOOL**
+     - **TINYINT**/**SMALLINT**/**INT**/**BIGINT**
+     - **REAL/FLOAT**
+     - **NUMERIC**
+     - **DATE**/**DATETIME**/**DATETIME2**
+     - **TEXT**
+   * - **BOOL**
+     - N/A
+     - ✓
+     - ✗
+     - ✗
+     - ✗
+     - ✓
+   * - **TINYINT**/**SMALLINT**/**INT**/**BIGINT**
+     - ✓
+     - N/A
+     - ✓
+     - ✓
+     - ✗
+     - ✓
+   * - **REAL/FLOAT**
+     - ✗
+     - ✓
+     - N/A
+     - ✓
+     - ✗
+     - ✓
+   * - **NUMERIC**
+     - ✗
+     - ✓
+     - ✓
+     - ✓
+     - ✗
+     - ✓
+   * - **DATE**/**DATETIME**/**DATETIME2**
+     - ✗
+     - ✗
+     - ✗
+     - ✗
+     - ✓
+     - ✓
+   * - **TEXT**
+     - ✓
+     - ✓
+     - ✓
+     - ✓
+     - ✓
+     - N/A
+
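+For example, the following casts are all permitted by the matrix above:
+
+.. code-block:: postgres
+
+	SELECT true :: INT;           -- BOOL to integer
+	SELECT 42 :: TEXT;            -- integer to TEXT
+	SELECT '2023-06-01' :: DATE;  -- TEXT to DATE
+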
+Value-Dependent Conversions
+---------------------------
+
+Conversions between certain data types may be value-dependent, as the outcome can vary based on the specific values being converted and their compatibility with the target data type's range or precision.
+
+For example:
+
+.. code-block:: postgres
+
+	CREATE OR REPLACE TABLE t(xint INT, xtext TEXT);
+	INSERT INTO t VALUES(1234567, 'abc');
+
+	-- yields cast overflow:
+	SELECT xint::TINYINT FROM t;
+
+	-- yields Unsupported conversion attempt from string to number - not all strings are numbers:
+	SELECT xtext::INT FROM t;
+	
+	
+	CREATE OR REPLACE TABLE t(xint INT, xtext TEXT);
+	INSERT INTO t VALUES(12, '12');
+
+	-- yields 12 in both cases:
+	SELECT xint::TINYINT FROM t;
+	
+	SELECT xtext::INT FROM t;
\ No newline at end of file
diff --git a/data_type_guides/index.rst b/data_type_guides/index.rst
index 6afec60cf..84678b7dc 100644
--- a/data_type_guides/index.rst
+++ b/data_type_guides/index.rst
@@ -1,14 +1,15 @@
 .. _data_type_guides:
 
-*************************
-Data Type Guides
-*************************
+**********
+Data Types
+**********
+
 This section describes the following:
 
 .. toctree::
    :maxdepth: 1
    :glob:
 
-   converting_and_casting_types
    supported_data_types
-   supported_casts
\ No newline at end of file
+   converting_and_casting_types
+   supported_casts
diff --git a/data_type_guides/sql_data_type_array.rst b/data_type_guides/sql_data_type_array.rst
new file mode 100644
index 000000000..03ccffec7
--- /dev/null
+++ b/data_type_guides/sql_data_type_array.rst
@@ -0,0 +1,317 @@
+.. _sql_data_type_array:
+
+*****
+Array
+*****
+
+The ``ARRAY`` data type offers a convenient way to store ordered collections of elements in a single column. It provides storage efficiency by allowing multiple values of the same data type to be compactly stored, optimizing space utilization and enhancing database performance. Working with ``ARRAY`` simplifies queries as operations and manipulations can be performed on the entire ``ARRAY``, resulting in more concise and readable code.
+
+An ``ARRAY`` represents a sequence of zero or more elements of the same data type. Arrays in the same column can contain varying numbers of elements across different rows. Arrays can include null values, eliminating the need for separate SQL declarations.
+
+Each data type has its companion ``ARRAY`` type, such as ``INT[]`` for integers and ``TEXT[]`` for text values.
+
+You may use the ``ARRAY`` data type with all :ref:`SQreamDB connectors `, except for ODBC since the ODBC protocol does not support ``ARRAY``. 
+
+The maximum size of an ``ARRAY``, indicating the number of elements it can hold, is 65535. You have the option to specify the size of an ``ARRAY``, providing a maximum allowed size, while each row can have a different number of elements up to the specified maximum. If the ``ARRAY`` size is not specified, the maximum size is assumed. 
+
+.. seealso:: A full list of :ref:`data types` supported by SQreamDB.
+
+Syntax
+======
+
+Defining an ``ARRAY`` is done by appending the ``[]`` notation to a supported data type, for example, ``INT[]`` for an array of integers.
+
+.. code-block:: sql
+
+	CREATE TABLE
+	  <table_name> (<column1> TEXT[], <column2> INT[]);
+	
+	INSERT INTO
+	  <table_name>
+	VALUES
+	  (ARRAY['a','b','c'], ARRAY[1,2,NULL]);
+
+
+Supported Operators
+===================
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Operator
+     - Description
+     - Example	 
+   * - Literals ``ARRAY []``
+     - Literals are created using the ``ARRAY`` operator
+     - ``ARRAY[1,2,3]``
+   * - Mapping
+     - Parquet, ORC, JSON, and AVRO ``ARRAY`` types may be mapped into SQreamDB ``ARRAY``
+     - See extended section under **Examples** 
+   * - Indexing
+     - Access specific elements within the array by using a **zero-based index**
+     - ``SELECT (<column_name>[2]) FROM <table_name>`` returns the third element of the specified column
+   * - ``UNNEST``
+     - Converts the arrayed elements within a single row into a set of rows
+     - ``SELECT UNNEST(<column_name>) FROM <table_name>``
+   * - Concatenate ``||``
+     - Converts arrayed elements into one string
+     - * ``SELECT (<column_1>) || (<column_2>) FROM <table_name>``
+       * ``SELECT (<column_name>) || ARRAY[1,2,3] FROM <table_name>``
+   * - ``array_length``
+     - Returns the number of arrayed elements within the specified column
+     - ``SELECT array_length(<column_name>) FROM <table_name>``
+   * - ``array_position``
+     - Locates the position of the specified value within the specified array. Returns ``NULL`` if the value is not found
+     - ``SELECT array_position(<column_name>, <value>) FROM <table_name>;``
+   * - ``array_remove``
+     - Returns the specified ``ARRAY`` column with the specified value removed
+     - ``SELECT array_remove(<column_name>, <value>) FROM <table_name>;``
+   * - ``array_replace``
+     - Enables replacing values within an ``ARRAY`` column
+     - ``SELECT array_replace(<column_name>, <old_value>, <new_value>) FROM <table_name>;``
+   * - Limiting number of arrayed elements
+     - You may limit the number of arrayed elements within an ``ARRAY``
+     - Limiting the number of arrayed elements to 4: ``CREATE TABLE <table_name> (<column_name> TEXT[4]);``
+   * - Compression
+     - You may follow the SQreamDB :ref:`compression guide ` for compression types and methods
+     - ``CREATE TABLE t (comp_dict INT[] CHECK('CS "dict"'));``
+   * - Aggregation
+     - The ``array_agg()`` function aggregates the values of groups created using the ``GROUP BY`` clause into arrays
+     - ``CREATE TABLE t2 (x INT, y INT);``
+
+       ``SELECT x, array_agg(y) FROM t2 GROUP BY x;``
+   * - Sorting
+     - ``TEXT[]`` elements are considered together as a single text, and comparisons are made based on their lexicographic order. In contrast, for arrays of non-``TEXT`` data types, comparisons are performed on the individual elements of the arrays
+     - ``CREATE TABLE t (x TEXT[]);``
+
+       ``INSERT INTO t VALUES (ARRAY['1']),(ARRAY['1','22']),(ARRAY['1','3']);``
+
+       ``SELECT x FROM t ORDER BY x;``
+
+       Output: ``['1']``, ``['1','22']``, ``['1','3']``
+
+Examples
+========
+
+.. contents:: 
+   :local:
+   :depth: 1
+
+``ARRAY`` Statements
+--------------------
+
+Creating a table with arrayed columns:
+
+.. code-block:: sql
+
+	CREATE TABLE
+	  my_array (
+	    clmn1 TEXT [],
+	    clmn2 TEXT [],
+	    clmn3 INT [],
+	    clmn4 NUMERIC(38, 20) []
+	  );
+	
+Inserting arrayed values into a table:
+
+.. code-block:: sql
+	
+	INSERT INTO
+	  my_array
+	VALUES
+	  (
+	    ARRAY ['1','2','3'],
+	    ARRAY ['4','5','6'],
+	    ARRAY [7,8,9,10],
+	    ARRAY [0.4354,0.5365435,3.6456]
+	  );
+	
+Converting arrayed elements into a set of rows:
+
+.. code-block:: sql
+	
+	SELECT
+	  UNNEST(clmn1) FROM my_array;
+
+.. code-block:: console
+	
+	 clmn1  |     
+	--------+
+	 1      |     
+	 2      |       
+	 3      |      
+
+Updating table values:
+
+.. code-block:: sql
+	
+	UPDATE
+	  my_array
+	SET
+	  clmn1 [0] = 'A';
+	
+	SELECT
+	  *
+	FROM
+	  my_array;
+	
+.. code-block:: console
+
+	clmn1                | clmn2            | clmn3
+	---------------------+------------------+-----------
+	["A","1","2","3"]    | ["4","5","6"]    | [7,8,9,10]
+
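+The array functions listed under **Supported Operators** can be combined in ordinary queries. For example, using the ``my_array`` table created above:
+
+.. code-block:: sql
+
+	SELECT
+	  array_length(clmn3),
+	  array_position(clmn3, 8),
+	  array_remove(clmn3, 9)
+	FROM
+	  my_array;
+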
+Ingesting Arrayed Data from External Files
+------------------------------------------
+
+Consider the following JSON file named ``t.json``, located under ``/tmp/``:
+
+.. code-block:: json
+
+
+[
+    {
+        "name": "Avery Bradley",
+        "age": 25,
+        "position": "PG",
+        "years_in_nba": [
+            2010,
+            2011,
+            2012,
+            2013,
+            2014,
+            2015,
+            2016,
+            2017,
+            2018,
+            2019,
+            2020,
+            2021
+        ]
+    },
+    {
+        "name": "Jae Crowder",
+        "age": 25,
+        "position": "PG",
+        "years_in_nba": [
+            2012,
+            2013,
+            2014,
+            2015,
+            2016,
+            2017,
+            2018,
+            2019,
+            2020,
+            2021
+        ]
+    },
+    {
+        "name": "John Holland",
+        "age": 27,
+        "position": "SG",
+        "years_in_nba": [
+            2017,
+            2018
+        ]
+    }
+]
+
+Execute the following statement:
+
+.. code-block:: sql
+
+	CREATE FOREIGN TABLE nba (name text, age int, position text, years_in_nba int [])
+	WRAPPER
+	  json_fdw
+	OPTIONS
+	  (location = '/tmp/t.json');
+	
+	SELECT
+	  *
+	FROM
+	  nba;
+	
+Output:
+
+.. code-block:: console
+
+	name           | age    | position    | years_in_nba
+	---------------+--------+-------------+-------------------------------------------------------------------------
+	Avery Bradley  | 25     | PG          | [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021]
+	Jae Crowder    | 25     | PG          | [2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021]
+	John Holland   | 27     | SG          | [2017, 2018]
+
+Limitations
+===========
+
+Casting Limitations
+-------------------
+
+``NUMERIC``
+"""""""""""
+
+Numeric data types smaller than ``INT``, such as ``TINYINT``, ``SMALLINT``, and ``BOOL``, must be explicitly cast.
+
+.. code-block:: sql
+
+	CREATE OR REPLACE TABLE my_array (clmn1 tinyint []); 
+	SELECT array_replace(clmn1 , 4::tinyint, 5::tinyint) FROM my_array;  
+	
+	CREATE OR REPLACE TABLE my_array (clmn1 bool []); 
+	SELECT array_replace(clmn1 , 0::bool, 1::bool) FROM my_array;
+	
+``TEXT``
+""""""""
+
+Casting ``TEXT`` to non-``TEXT`` and non-``TEXT`` to ``TEXT`` data types is not supported. For example, the final statement below fails because it casts an ``INT[]`` array to ``TEXT[]``:
+	
+.. code-block:: sql
+
+
+	CREATE TABLE t_text (xtext TEXT[]);
+	CREATE TABLE t_int (xint INT[]);
+	INSERT INTO t_int VALUES (array[1,2,3]);
+	INSERT INTO t_text SELECT xint::TEXT[] FROM t_int;
+
+Connectors
+----------
+
+``.NET`` and ``ODBC``
+"""""""""""""""""""""
+
+Please note that the SQreamDB ODBC and .NET connectors do not support the ``ARRAY`` data type. If your database schema includes ``ARRAY`` columns, you may encounter compatibility issues when using these connectors.
+
+``Pysqream``
+""""""""""""
+
+Please note that SQLAlchemy does not support the ``ARRAY`` data type.
+
+Functions
+---------
+
+|| (Concatenate)
+""""""""""""""""
+
+Using the ``||`` (Concatenate) function with two different data types requires explicit casting.
+
+.. code-block:: sql
+
+	SELECT (clmn1, 4::tinyint) || (clmn2, 5::tinyint) FROM my_array;
+	
+UNNEST
+""""""
+
+The ``UNNEST`` function is computationally intensive; therefore, it is recommended to use up to 10 ``UNNEST`` clauses per statement as a best practice.
+To improve performance, consider filtering the results before applying ``UNNEST`` to reduce data volume and optimize query runtime.
+
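+For example, a hypothetical query that filters on array length before unnesting:
+
+.. code-block:: sql
+
+	SELECT UNNEST(clmn1)
+	FROM my_array
+	WHERE array_length(clmn1) > 2;
+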
+Window
+""""""
+
+Window functions are not supported.
+
diff --git a/data_type_guides/sql_data_types_boolean.rst b/data_type_guides/sql_data_types_boolean.rst
index 84b7c14ce..8142ae6f6 100644
--- a/data_type_guides/sql_data_types_boolean.rst
+++ b/data_type_guides/sql_data_types_boolean.rst
@@ -3,6 +3,7 @@
 *************************
 Boolean
 *************************
+
 The following table describes the Boolean data type.
 
 .. list-table::
@@ -17,7 +18,8 @@ The following table describes the Boolean data type.
      - 1 byte, but resulting average data sizes may be lower after compression.
 	 
 Boolean Examples
-^^^^^^^^^^
+^^^^^^^^^^^^^^^^
+
 The following is an example of the Boolean syntax:
 
 .. code-block:: postgres
@@ -37,7 +39,7 @@ The following is an example of the correct output:
    "kiwi","Is not angry"
 
 Boolean Casts and Conversions
-^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The following table shows the possible Boolean value conversions:
 
diff --git a/data_type_guides/sql_data_types_date.rst b/data_type_guides/sql_data_types_date.rst
index da83f80cc..f4bcf7421 100644
--- a/data_type_guides/sql_data_types_date.rst
+++ b/data_type_guides/sql_data_types_date.rst
@@ -3,11 +3,13 @@
 *************************
 Date
 *************************
+
 ``DATE`` is a type designed for storing year, month, and day. ``DATETIME`` is a type designed for storing year, month, day, hour, minute, seconds, and milliseconds in UTC with 1 millisecond precision.
 
 
 Date Types
 ^^^^^^^^^^^^^^^^^^^^^^
+
 The following table describes the Date types:
 
 .. list-table:: Date Types
@@ -35,9 +37,10 @@ Aliases
 
 Syntax
 ^^^^^^^^
+
 ``DATE`` values are formatted as string literals. 
 
-The following is an example of the DATETIME syntax:
+The following is an example of the DATE syntax:
 
 .. code-block:: console
      
@@ -60,12 +63,14 @@ SQream attempts to guess if the string literal is a date or datetime based on co
 
 Size
 ^^^^^^
-A ``DATE`` column is 4 bytes in length, while a ``DATETIME`` column is 8 bytes in length.
+
+A ``DATE`` column is 4 bytes, while a ``DATETIME`` column is 8 bytes.
 
 However, the size of these values is compressed by SQream DB.
 
 Date Examples
-^^^^^^^^^^
+^^^^^^^^^^^^^
+
 The following is an example of the Date syntax:
 
 .. code-block:: postgres
@@ -95,10 +100,8 @@ The following is an example of the correct output:
    1997-01-01 00:00:00.0,1955-11-05
    
 
-.. warning:: Some client applications may alter the ``DATETIME`` value by modifying the timezone.
-
 Date Casts and Conversions
-^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The following table shows the possible ``DATE`` and ``DATETIME`` value conversions:
 
@@ -108,5 +111,5 @@ The following table shows the possible ``DATE`` and ``DATETIME`` value conversio
    
    * - Type
      - Details
-   * - ``VARCHAR(n)``
+   * - ``TEXT``
      - ``'1997-01-01'`` → ``'1997-01-01'``, ``'1955-11-05 01:24'`` → ``'1955-11-05 01:24:00.000'``
\ No newline at end of file
diff --git a/data_type_guides/sql_data_types_datetime2.rst b/data_type_guides/sql_data_types_datetime2.rst
new file mode 100644
index 000000000..004bffc84
--- /dev/null
+++ b/data_type_guides/sql_data_types_datetime2.rst
@@ -0,0 +1,76 @@
+.. _sql_data_types_datetime2:
+
+*************************
+DateTime2
+*************************
+
+``DATETIME2`` is a type designed for storing year, month, day, hour, minute, seconds, milliseconds, microseconds, nanoseconds and UTC offset with 1 nanosecond precision.
+
+
+Aliases
+^^^^^^^^^^
+
+NA
+
+
+Syntax
+^^^^^^^^
+
+``DATETIME2`` values are formatted as string literals. 
+
+The following is an example of the DATETIME2 syntax:
+
+.. code-block:: console
+     
+   ``1955-11-05 01:24:00.000000000 +00:00``
+
+
+``DATETIME2`` values are formatted as string literals conforming to `ISO 8601 `_.
+
+SQream attempts to guess if the string literal is a datetime2 based on context, for example when used in date-specific functions.
+
+Size
+^^^^^^
+
+A ``DATETIME2`` column is 16 bytes.
+
+However, the size of these values is compressed by SQream DB.
+
+DateTime2 Examples
+^^^^^^^^^^^^^^^^^^
+
+The following is an example of the DATETIME2 syntax:
+
+.. code-block:: postgres
+   
+   CREATE TABLE important_dates (a DATETIME2);
+
+   INSERT INTO important_dates VALUES ('1955-11-05 01:24:00.000000000 -08:00');
+
+   SELECT * FROM important_dates;
+   
+The following is an example of the correct output:
+
+.. code-block:: text
+
+ 1955-11-05 01:24:00.000 -0800
+   
+
+
+DateTime2 Casts and Conversions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following table shows the possible ``DATETIME2`` value conversions:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Type
+     - Details
+   * - ``TEXT``
+     - ``'1997-01-01'`` → ``'1997-01-01 00:00:00.000000 +00:00'``, ``'1955-11-05 01:24'`` → ``'1955-11-05 01:24:00.000000 +00:00'``
+   * - ``DATE``
+     - ``'1997-01-01'`` → ``'1997-01-01 00:00:00.000000 +00:00'``
+   * - ``DATETIME``
+     - ``'1955-11-05 01:24'`` → ``'1955-11-05 01:24:00.000000 +00:00'``
\ No newline at end of file
diff --git a/data_type_guides/sql_data_types_floating_point.rst b/data_type_guides/sql_data_types_floating_point.rst
index 18227140c..4ea388dd7 100644
--- a/data_type_guides/sql_data_types_floating_point.rst
+++ b/data_type_guides/sql_data_types_floating_point.rst
@@ -3,12 +3,14 @@
 *************************
 Floating Point
 *************************
+
 The **Floating Point** data types (``REAL`` and ``DOUBLE``) store extremely close value approximations, and are therefore recommended for values that tend to be inexact, such as Scientific Notation. While Floating Point generally runs faster than Numeric, it has a lower precision of ``9`` (``REAL``) or ``17`` (``DOUBLE``) compared to Numeric's ``38``. For operations that require a higher level of precision, using :ref:`Numeric ` is recommended.
 
 The floating point representation is based on `IEEE 754 `_.
 
 Floating Point Types
 ^^^^^^^^^^^^^^^^^^^^^^
+
 The following table describes the Floating Point data types.
 
 .. list-table:: 
@@ -42,7 +44,8 @@ The following table shows information relevant to the Floating Point data types.
      - Floating point types are either 4 or 8 bytes, but size could be lower after compression.
 
 Floating Point Examples
-^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^
+
 The following are examples of the Floating Point syntax:
 
 .. code-block:: postgres
@@ -61,7 +64,8 @@ The following are examples of the Floating Point syntax:
 .. note:: Most SQL clients control display precision of floating point numbers, and values may appear differently in some clients.
 
 Floating Point Casts and Conversions
-^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 The following table shows the possible Floating Point value conversions:
 
 .. list-table:: 
@@ -74,7 +78,6 @@ The following table shows the possible Floating Point value conversions:
      - ``1.0`` → ``true``, ``0.0`` → ``false``
    * - ``TINYINT``, ``SMALLINT``, ``INT``, ``BIGINT``
      - ``2.0`` → ``2``, ``3.14159265358979`` → ``3``, ``2.718281828459`` → ``2``, ``0.5`` → ``0``, ``1.5`` → ``1``
-   * - ``VARCHAR(n)`` (n > 6 recommended)
-     - ``1`` → ``'1.0000'``, ``3.14159265358979`` → ``'3.1416'``
+
 
 .. note:: As shown in the above examples, casting ``real`` to ``int`` rounds down.
\ No newline at end of file
diff --git a/data_type_guides/sql_data_types_integer.rst b/data_type_guides/sql_data_types_integer.rst
index 9d4210731..24ace31b9 100644
--- a/data_type_guides/sql_data_types_integer.rst
+++ b/data_type_guides/sql_data_types_integer.rst
@@ -3,12 +3,14 @@
 *************************
 Integer
 *************************
+
 Integer data types are designed to store whole numbers.
 
 For more information about identity sequences (sometimes called auto-increment or auto-numbers), see :ref:`identity`.
 
 Integer Types
 ^^^^^^^^^^^^^^^^^^^
+
 The following table describes the Integer types.
 
 .. list-table:: 
@@ -48,7 +50,8 @@ The following table describes the Integer data type.
      - Integer types range between 1, 2, 4, and 8 bytes - but resulting average data sizes could be lower after compression.
 
 Integer Examples
-^^^^^^^^^^
+^^^^^^^^^^^^^^^^
+
 The following is an example of the Integer syntax:
 
 .. code-block:: postgres
@@ -67,7 +70,7 @@ The following is an example of the correct output:
    -5,127,32000,45000000000
 
 Integer Casts and Conversions
-^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The following table shows the possible Integer value conversions:
 
@@ -79,5 +82,5 @@ The following table shows the possible Integer value conversions:
      - Details
    * - ``REAL``, ``DOUBLE``
      - ``1`` → ``1.0``, ``-32`` → ``-32.0``
-   * - ``VARCHAR(n)`` (All numberic values must fit in the string length)
+   * - ``TEXT`` (All numeric values must fit in the string length)
      - ``1`` → ``'1'``, ``2451`` → ``'2451'``
\ No newline at end of file
diff --git a/data_type_guides/sql_data_types_numeric.rst b/data_type_guides/sql_data_types_numeric.rst
index 8a78e19fa..e97528b63 100644
--- a/data_type_guides/sql_data_types_numeric.rst
+++ b/data_type_guides/sql_data_types_numeric.rst
@@ -1,22 +1,23 @@
 .. _sql_data_types_numeric:
 
-*************************
+*******
 Numeric
-*************************
-The **Numeric** data type (also known as **Decimal**) is recommended for values that tend to occur as exact decimals, such as in Finance. While Numeric has a fixed precision of ``38``, higher than ``REAL`` (``9``) or ``DOUBLE`` (``17``), it runs calculations more slowly. For operations that require faster performance, using :ref:`Floating Point ` is recommended.
+*******
 
-The correct syntax for Numeric is ``numeric(p, s)``), where ``p`` is the total number of digits (``38`` maximum), and ``s`` is the total number of decimal digits.
+The **Numeric** data type (also known as **Decimal**) is recommended for values that tend to occur as exact decimals, such as in Finance. While Numeric has a fixed precision of ``38``, higher than ``REAL`` (``9``) or ``DOUBLE`` (``17``), it runs calculations more slowly. For operations that require faster performance, using :ref:`Floating Point ` is recommended.
+
+The correct syntax for Numeric is ``numeric(p, s)``), where ``p`` is the total number of digits (``38`` maximum), and ``s`` is the total number of decimal digits. If no parameters are specified, Numeric defaults to ``numeric(38, 0)``.
 
 Numeric Examples
-^^^^^^^^^^
+^^^^^^^^^^^^^^^^
 
 The following is an example of the Numeric syntax:
 
 .. code-block:: postgres
 
-   $ create or replace table t(x numeric(20, 10), y numeric(38, 38));
-   $ insert into t values(1234567890.1234567890, 0.123245678901234567890123456789012345678);
-   $ select x + y from t;
+   CREATE OR REPLACE table t(x numeric(20, 10), y numeric(38, 38));
+   INSERT INTO t VALUES(1234567890.1234567890, 0.12324567890123456789012345678901234567);
+   SELECT x + y FROM t;
    
 The following table shows information relevant to the Numeric data type:
 
@@ -29,10 +30,10 @@ The following table shows information relevant to the Numeric data type:
      - Example	 
    * - 38 digits
      - 16 bytes
-     - ``0.123245678901234567890123456789012345678``
+     - ``0.12324567890123456789012345678901234567``
 
 Numeric supports the following operations:
 
    * All join types.
    * All aggregation types (not including Window functions).
-   * Scalar functions (not including some trigonometric and logarithmic functions).
\ No newline at end of file
+   * Scalar functions (not including some trigonometric and logarithmic functions).
diff --git a/data_type_guides/sql_data_types_primitives.rst b/data_type_guides/sql_data_types_primitives.rst
new file mode 100644
index 000000000..bd49468b2
--- /dev/null
+++ b/data_type_guides/sql_data_types_primitives.rst
@@ -0,0 +1,78 @@
+.. _sql_data_types_primitives:
+
+********************
+Primitive Data Types
+********************
+
+SQreamDB compresses all columns and types. The data size noted is the maximum data size allocation for uncompressed data.
+
+.. list-table::
+   :widths: 20 15 20 30 20
+   :header-rows: 1
+   
+   * - Name
+     - Description
+     - Data Size (Not Null, Uncompressed)
+     - Example
+     - Alias
+   * - ``BOOL``
+     - Boolean values (``true``, ``false``)
+     - 1 byte
+     - ``true``
+     - ``BIT``
+   * - ``TINYINT``
+     - Unsigned integer (0 - 255)
+     - 1 byte
+     - ``5``
+     - NA
+   * - ``SMALLINT``
+     - Integer (-32,768 - 32,767)
+     - 2 bytes
+     - ``-155``
+     - NA
+   * - ``INT``
+     - Integer (-2,147,483,648 - 2,147,483,647)
+     - 4 bytes
+     - ``1648813``
+     - ``INTEGER``
+   * - ``BIGINT``
+     - Integer (-9,223,372,036,854,775,808 - 9,223,372,036,854,775,807)
+     - 8 bytes
+     - ``36124441255243``
+     - NA
+   * - ``REAL``
+     - Floating point (inexact)
+     - 4 bytes
+     - ``3.141``
+     - NA
+   * - ``DOUBLE``
+     - Floating point (inexact)
+     - 8 bytes
+     - ``0.000003``
+     - ``FLOAT``/``DOUBLE PRECISION``
+   * - ``TEXT (n)``
+     - Variable length string - UTF-8 unicode
+     - Up to ``4*n`` bytes
+     - ``'Kiwis have tiny wings, but cannot fly.'``
+     - ``CHAR VARYING``, ``CHAR``, ``CHARACTER VARYING``, ``CHARACTER``, ``NATIONAL CHARACTER VARYING``, ``NATIONAL CHARACTER``, ``NCHAR VARYING``, ``NCHAR``, ``NATIONAL CHAR``, ``NATIONAL CHAR VARYING``
+   * - ``NUMERIC``
+     -  38 digits
+     - 16 bytes
+     - ``0.12324567890123456789012345678901234567``
+     - ``DECIMAL``, ``NUMBER``
+   * - ``DATE``
+     - Date
+     - 4 bytes
+     - ``'1955-11-05'``
+     - NA
+   * - ``DATETIME``
+     - Date and time pairing in UTC
+     - 8 bytes
+     - ``'1955-11-05 01:24:00.000'``
+     -  ``TIMESTAMP``
+   * - ``DATETIME2``
+     - Date and time in nanosecond precision including UTC offset
+     - 16 bytes
+     - ``'1955-11-05 01:24:00.000000000 +00:00'``
+     -  NA
+
diff --git a/data_type_guides/sql_data_types_string.rst b/data_type_guides/sql_data_types_string.rst
index beb970b8d..c8986a088 100644
--- a/data_type_guides/sql_data_types_string.rst
+++ b/data_type_guides/sql_data_types_string.rst
@@ -1,56 +1,35 @@
 .. _sql_data_types_string:
 
-*************************
+******
 String
-*************************
-``TEXT`` and ``VARCHAR`` are types designed for storing text or strings of characters.
+******
 
-SQream separates ASCII (``VARCHAR``) and UTF-8 representations (``TEXT``).
-
-.. note:: The data type ``NVARCHAR`` has been deprecated by ``TEXT`` as of version 2020.1.
-
-String Types
-^^^^^^^^^^^^^^^^^^^^^^
-The following table describes the String types:
-
-.. list-table:: 
-   :widths: auto
-   :header-rows: 1
-   
-   * - Name
-     - Details
-     - Data Size (Not Null, Uncompressed)
-     - Example
-   * - ``TEXT [(n)]``, ``NVARCHAR (n)``
-     - Varaiable length string - UTF-8 unicode. ``NVARCHAR`` is synonymous with ``TEXT``.
-     - Up to ``4*n`` bytes
-     - ``'キウイは楽しい鳥です'``
-   * - ``VARCHAR (n)``
-     - Variable length string - ASCII only
-     - ``n`` bytes
-     - ``'Kiwis have tiny wings, but cannot fly.'``
+``TEXT`` is designed for storing text or strings of characters. SQreamDB blocks non-UTF8 string inputs. 
 
 Length
-^^^^^^^^^
-When using ``TEXT``, specifying a size is optional. If not specified, the text field carries no constraints. To limit the size of the input, use ``VARCHAR(n)`` or ``TEXT(n)``, where ``n`` is the permitted number of characters.
+^^^^^^
+
+When using ``TEXT``, specifying a size is optional. If not specified, the text field carries no constraints. To limit the size of the input, use ``TEXT(n)``, where ``n`` is the permitted number of characters.
 
 The following apply to setting the String type length:
 
-* If the data exceeds the column length limit on ``INSERT`` or ``COPY`` operations, SQream DB will return an error.
-* When casting or converting, the string has to fit in the target. For example, ``'Kiwis are weird birds' :: VARCHAR(5)`` will return an error. Use ``SUBSTRING`` to truncate the length of the string.
-* ``VARCHAR`` strings are padded with spaces.
+* If the data exceeds the column length limit on ``INSERT`` or ``COPY`` operations, SQreamDB will return an error.
+* When casting or converting, the string has to fit in the target. For example, ``'Kiwis are weird birds' :: TEXT(5)`` will return an error. Use ``SUBSTRING`` to truncate the length of the string.
 
 Syntax
-^^^^^^^^
+^^^^^^
+
 String types can be written with standard SQL string literals, which are enclosed with single quotes, such as
 ``'Kiwi bird'``. To include a single quote in the string, use double quotations, such as ``'Kiwi bird''s wings are tiny'``. String literals can also be dollar-quoted with the dollar sign ``$``, such as ``$$Kiwi bird's wings are tiny$$`` is the same as ``'Kiwi bird''s wings are tiny'``.
 
 Size
-^^^^^^
-``VARCHAR(n)`` can occupy up to *n* bytes, whereas ``TEXT(n)`` can occupy up to *4*n* bytes. However, the size of strings is variable and is compressed by SQream.
+^^^^
+
+``TEXT(n)`` can occupy up to *4*n* bytes. However, the size of strings is variable and is compressed by SQreamDB.
 
 String Examples
-^^^^^^^^^^
+^^^^^^^^^^^^^^^
+
 The following is an example of the String syntax: 
 
 .. code-block:: postgres
@@ -73,7 +52,8 @@ The following is an example of the correct output:
 .. note:: Most clients control the display precision of floating point numbers, and values may appear differently in some clients.
 
 String Casts and Conversions
-^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 The following table shows the possible String value conversions:
 
 .. list-table:: 
diff --git a/data_type_guides/supported_casts.rst b/data_type_guides/supported_casts.rst
index 34053b647..4c0bb08f2 100644
--- a/data_type_guides/supported_casts.rst
+++ b/data_type_guides/supported_casts.rst
@@ -3,6 +3,7 @@
 *************************
 Supported Casts
 *************************
+
 The **Supported Casts** section describes supported casts for the following types:
 
 .. toctree::
@@ -14,4 +15,5 @@ The **Supported Casts** section describes supported casts for the following type
    sql_data_types_integer
    sql_data_types_floating_point
    sql_data_types_string
-   sql_data_types_date
\ No newline at end of file
+   sql_data_types_date
+   sql_data_types_datetime2
diff --git a/data_type_guides/supported_data_types.rst b/data_type_guides/supported_data_types.rst
index 2743b054b..26951cee9 100644
--- a/data_type_guides/supported_data_types.rst
+++ b/data_type_guides/supported_data_types.rst
@@ -1,80 +1,17 @@
 .. _supported_data_types:
 
-*************************
+********************
 Supported Data Types
-*************************
-The **Supported Data Types** page describes SQream's supported data types:
+********************
 
-The following table shows the supported data types.
+Data types define the type of data that a column can hold in a table. They ensure that data is handled accurately and efficiently. Common data types include integers, decimals, characters, and dates. For example, an ``INT`` data type is used for whole numbers, ``TEXT`` for variable-length character strings, and ``DATE`` for date values. 
 
-.. list-table::
-   :widths: 20 15 20 30 20
-   :header-rows: 1
-   
-   * - Name
-     - Description
-     - Data Size (Not Null, Uncompressed)
-     - Example
-     - Alias
-   * - ``BOOL``
-     - Boolean values (``true``, ``false``)
-     - 1 byte
-     - ``true``
-     - ``BIT``
-   * - ``TINYINT``
-     - Unsigned integer (0 - 255)
-     - 1 byte
-     - ``5``
-     - NA
-   * - ``SMALLINT``
-     - Integer (-32,768 - 32,767)
-     - 2 bytes
-     - ``-155``
-     - NA
-   * - ``INT``
-     - Integer (-2,147,483,648 - 2,147,483,647)
-     - 4 bytes
-     - ``1648813``
-     - ``INTEGER``
-   * - ``BIGINT``
-     - Integer (-9,223,372,036,854,775,808 - 9,223,372,036,854,775,807)
-     - 8 bytes
-     - ``36124441255243``
-     - ``NUMBER``
-   * - ``REAL``
-     - Floating point (inexact)
-     - 4 bytes
-     - ``3.141``
-     - NA
-   * - ``DOUBLE``
-     - Floating point (inexact)
-     - 8 bytes
-     - ``0.000003``
-     - ``FLOAT``/``DOUBLE PRECISION``
-   * - ``TEXT [(n)]``, ``NVARCHAR (n)``
-     - Variable length string - UTF-8 unicode
-     - Up to ``4*n`` bytes
-     - ``'キウイは楽しい鳥です'``
-     - ``CHAR VARYING``, ``CHAR``, ``CHARACTER VARYING``, ``CHARACTER``, ``NATIONAL CHARACTER VARYING``, ``NATIONAL CHARACTER``, ``NCHAR VARYING``, ``NCHAR``, ``NVARCHAR``
-   * - ``NUMERIC``
-     -  38 digits
-     - 16 bytes
-     - ``0.123245678901234567890123456789012345678``
-     - ``DECIMAL``
-   * - ``VARCHAR (n)``
-     - Variable length string - ASCII only
-     - ``n`` bytes
-     - ``'Kiwis have tiny wings, but cannot fly.'``
-     - ``SQL VARIANT``
-   * - ``DATE``
-     - Date
-     - 4 bytes
-     - ``'1955-11-05'``
-     - NA
-   * - ``DATETIME``
-     - Date and time pairing in UTC
-     - 8 bytes
-     - ``'1955-11-05 01:24:00.000'``
-     -  ``TIMESTAMP``, ``DATETIME2``
 
-.. note:: SQream compresses all columns and types. The data size noted is the maximum data size allocation for uncompressed data.
\ No newline at end of file
+
+.. toctree::
+   :maxdepth: 1
+   :glob:
+
+
+   sql_data_types_primitives
+   sql_data_type_array
diff --git a/external_storage_platforms/azure.rst b/external_storage_platforms/azure.rst
new file mode 100644
index 000000000..42a95f20f
--- /dev/null
+++ b/external_storage_platforms/azure.rst
@@ -0,0 +1,51 @@
+.. _azure:
+
+***********************
+Azure Blob Storage
+***********************
+
+Azure Blob Storage (ABS) is a scalable object storage solution within Microsoft Azure, designed to store and manage vast amounts of unstructured data.
+
+ABS Bucket File Location
+=================================
+
+Use the following syntax to specify single or multiple file locations within an ABS bucket:
+
+.. code-block:: sql
+ 
+	azure://accountname.core.windows.net/path
+
+Connection String
+===================
+
+Connection String Example:
+
+.. code-block:: json
+
+	"DefaultEndpointsProtocol=https;AccountName=myaccount101;AccountKey=#######################################==;EndpointSuffix=core.windows.net"
+
+Use the following parameters within your SQreamDB legacy configuration file for authentication:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - Description
+   * - ``DefaultEndpointsProtocol``
+     - Specifies the protocol (e.g., https or http) used for accessing the storage service
+   * - ``AccountName``
+     - Represents the unique name of your Azure Storage account
+   * - ``AccountKey``
+     - Acts as the primary access key for securely authenticating and accessing resources within the storage account
+   * - ``EndpointSuffix``
+     - Denotes the Azure Storage service endpoint suffix for a specific region or deployment, such as ``.core.windows.net``
+
+
+   
+Examples
+============
+
+.. code-block:: sql
+
+	COPY table_name FROM WRAPPER csv_fdw OPTIONS(location = 'azure://sqreamrole.core.windows.net/sqream-demo-data/file.csv');
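+
+A foreign table can likewise be defined over the same ABS location. The following sketch assumes a hypothetical two-column CSV layout:
+
+.. code-block:: sql
+
+	CREATE FOREIGN TABLE example_table (col1 TEXT, col2 INT)
+	WRAPPER csv_fdw
+	OPTIONS (location = 'azure://sqreamrole.core.windows.net/sqream-demo-data/file.csv');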
\ No newline at end of file
diff --git a/external_storage_platforms/gcp.rst b/external_storage_platforms/gcp.rst
new file mode 100644
index 000000000..32fc23994
--- /dev/null
+++ b/external_storage_platforms/gcp.rst
@@ -0,0 +1,107 @@
+.. _gcp:
+
+***********************
+Google Cloud Platform
+***********************
+
+Ingesting data using Google Cloud Platform (GCP) requires configuring Google Cloud Storage (GCS) bucket access. You may configure SQreamDB to separate source and destination by granting read access to one bucket and write access to a different bucket. Such separation requires that each bucket be individually configured.
+
+
+GCP Bucket File Location
+=================================
+
+Use the following syntax to specify single or multiple file locations within a GCP bucket:
+
+.. code-block:: sql
+ 
+	gs://<bucket_name>/<file_name>
+   
+GCP Access
+====================
+
+Before You Begin
+----------------
+
+It is essential that you have a GCP service account string.
+
+String example:
+
+.. code-block::
+
+	sample_service_account@sample_project.iam.gserviceaccount.com
+
+Granting GCP Access
+---------------------
+
+#. In your Google Cloud console, go to **Select a project** and select the desired project.
+
+#. From the **PRODUCTS** menu, select **Cloud Storage** > **Buckets**.
+
+#. Select the bucket you wish to configure; or create a new bucket by selecting **CREATE** and following the **Create a bucket** procedure, and select the newly created bucket.
+
+#. Select **UPLOAD FILES** and upload the data files you wish SQreamDB to ingest.
+
+#. Go to **PERMISSIONS** and select **GRANT ACCESS**.
+
+#. Under **Add principals**, in the **New principals** box, paste your service account string.
+
+#. Under **Assign roles**, in the **Select a role** box, select **Storage Admin**.
+
+#. Select **ADD ANOTHER ROLE** and in the newly created **Select a role** box, select **Storage Object Admin**.
+
+#. Select **SAVE**.
+
+.. note::
+
+	Optimize access time to your data by configuring the location of your bucket according to `Google Cloud location considerations `_.
+   
+   
+   
+Examples
+============
+
+Using the ``COPY FROM`` command:
+
+.. code-block:: sql
+
+	CREATE TABLE nba
+	  (
+	    name     TEXT,
+	    team     TEXT,
+	    number   TEXT,
+	    position TEXT,
+	    age      TEXT,
+	    height   TEXT,
+	    weight   TEXT,
+	    college  TEXT,
+	    salary   TEXT
+	  ); 
+
+.. code-block:: sql
+
+	COPY nba FROM
+	WRAPPER csv_fdw
+	OPTIONS(location = 'gs://blue_docs/nba.csv');
+	
+Using the ``CREATE FOREIGN TABLE`` command:
+
+.. code-block:: sql
+
+	CREATE FOREIGN TABLE nba
+	(
+	  Name       TEXT,
+	  Team       TEXT,
+	  Number     TEXT,
+	  Position   TEXT,
+	  Age        TEXT,
+	  Height     TEXT,
+	  Weight     TEXT,
+	  College    TEXT,
+	  Salary     TEXT
+	 )
+	 WRAPPER csv_fdw
+	 OPTIONS
+	 (
+	   LOCATION =  'gs://blue_docs/nba.csv'
+	 );
+    
diff --git a/external_storage_platforms/hdfs.rst b/external_storage_platforms/hdfs.rst
new file mode 100644
index 000000000..bf4e59cc6
--- /dev/null
+++ b/external_storage_platforms/hdfs.rst
@@ -0,0 +1,238 @@
+.. _hdfs:
+
+HDFS Environment
+================
+
+.. _configuring_an_hdfs_environment_for_the_user_sqream:
+
+Configuring an HDFS Environment for the User **sqream**
+-------------------------------------------------------
+
+This section describes how to configure an HDFS environment for the user **sqream** and is only relevant for users with an HDFS environment.
+
+**To configure an HDFS environment for the user sqream:**
+
+1. Open your **bash_profile** configuration file for editing:
+
+   .. code-block:: console
+     
+       vim /home/sqream/.bash_profile
+       
+
+   Add the following environment variables to the file:
+
+   .. code-block:: console
+     
+      #PATH=$PATH:$HOME/.local/bin:$HOME/bin
+
+      #export PATH
+
+      # PS1
+      #MYIP=$(curl -s -XGET "http://ip-api.com/json" | python -c 'import json,sys; jstr=json.load(sys.stdin); print jstr["query"]')
+      #PS1="\[\e[01;32m\]\D{%F %T} \[\e[01;33m\]\u@\[\e[01;36m\]$MYIP \[\e[01;31m\]\w\[\e[37;36m\]\$ \[\e[1;37m\]"
+
+      SQREAM_HOME=/usr/local/sqream
+      export SQREAM_HOME
+
+      export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk
+      export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop
+      export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob`
+      export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
+      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR
+
+
+      PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin
+      export PATH
+
+2. Reload the profile to apply the changes:
+
+   .. code-block:: console
+     
+      source /home/sqream/.bash_profile
+       
+3. Check if you can access Hadoop from your machine:       
+       
+   .. code-block:: console
+
+      hadoop fs -ls hdfs://:8020/
+      
+..
+   Comment: - 
+   **NOTICE:** If you cannot access Hadoop from your machine because it uses Kerberos, see `Connecting a SQream Server to Cloudera Hadoop with Kerberos `_
+
+
+4. Verify that an HDFS environment exists for SQream services:
+
+   .. code-block:: console
+     
+      $ ls -l /etc/sqream/sqream_env.sh
+	  
+.. _step_6:
+
+5. If an HDFS environment does not exist for SQream services, create one (``sqream_env.sh``):
+   
+   .. code-block:: console
+     
+      #!/bin/bash
+
+      SQREAM_HOME=/usr/local/sqream
+      export SQREAM_HOME
+
+      export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk
+      export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop
+      export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob`
+      export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
+      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR
+
+
+      PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin
+      export PATH
+
+.. _authenticate_hadoop_servers_that_require_kerberos:
+
+Authenticating Hadoop Servers that Require Kerberos
+---------------------------------------------------
+
+If your Hadoop server requires Kerberos authentication, do the following:
+
+1. Create a principal for the user **sqream**.
+
+   .. code-block:: console
+   
+      kadmin -p root/admin@SQ.COM
+      addprinc sqream@SQ.COM
+      
+2. If you do not know your Kerberos root credentials, connect to the Kerberos server as root using ssh and run:
+
+   .. code-block:: console
+   
+      kadmin.local
+      
+   Running ``kadmin.local`` does not require a password.
+
+3. Set the password for the ``sqream@SQ.COM`` principal:
+
+   .. code-block:: console
+   
+      change_password sqream@SQ.COM
+
+4. Connect to the Hadoop name node using ssh and navigate to the Cloudera agent process directory:
+
+   .. code-block:: console
+   
+      cd /var/run/cloudera-scm-agent/process
+
+5. List the directory contents so that the most recently modified entries appear last:
+
+   .. code-block:: console
+   
+      ls -lrt
+
+6. Look for a recently updated folder containing the text **hdfs**.
+
+   The following is an example of the correct folder name:
+
+   .. code-block:: console
+   
+      cd -hdfs-
+	  
+   This folder should contain a file named **hdfs.keytab** or a similar ``.keytab`` file.
+   
+
+7. Copy the ``.keytab`` file to the home directory of the user **sqream** on each remote machine on which you plan to use Hadoop.
+
+8. Copy the following files to the Hadoop configuration directory on the SQream server (``sqream@server:/hdfs/hadoop/etc/hadoop``):
+
+   * core-site.xml
+   * hdfs-site.xml
+
+9. Connect to the sqream server and verify that the ``.keytab`` file is owned by the user **sqream** and has the correct permissions:
+
+   .. code-block:: console
+   
+      sudo chown sqream:sqream /home/sqream/hdfs.keytab
+      sudo chmod 600 /home/sqream/hdfs.keytab
+
+10. Log into the sqream server.
+
+11. Log in as the user **sqream**.
+
+12. Navigate to the home directory and check the names of the Kerberos principals contained in the ``.keytab`` file:
+
+   .. code-block:: console
+   
+      klist -kt hdfs.keytab
+
+   The following is an example of the correct output:
+
+   .. code-block:: console
+   
+      sqream@Host-121 ~ $ klist -kt hdfs.keytab
+      Keytab name: FILE:hdfs.keytab
+      KVNO Timestamp           Principal
+      ---- ------------------- ------------------------------------------------------
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+         5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+
+13. Verify that the HDFS service principal **hdfs/nn1@SQ.COM** is shown in the output above.
+
+14. Obtain a Kerberos ticket using the keytab:
+
+   .. code-block:: console
+   
+      kinit -kt hdfs.keytab hdfs/nn1@SQ.COM
+
+15. Verify that a ticket was granted:
+  
+   .. code-block:: console
+   
+      klist
+      
+   The following is an example of the correct output:
+
+   .. code-block:: console
+   
+      Ticket cache: FILE:/tmp/krb5cc_1000
+      Default principal: sqream@SQ.COM
+      
+      Valid starting       Expires              Service principal
+      09/16/2020 13:44:18  09/17/2020 13:44:18  krbtgt/SQ.COM@SQ.COM
+
+16. List the files located at the defined server name or IP address:
+
+   .. code-block:: console
+   
+      hadoop fs -ls hdfs://:8020/
+
+17. Do one of the following:
+
+    * If a file listing is output, continue with the next step.
+    * If no file listing is output, verify that your environment has been set up correctly.
+
+    If any of the following variables are empty, verify that you correctly followed :ref:`Step 5 ` of the **Configuring an HDFS Environment for the User sqream** section above:
+
+    .. code-block:: console
+
+       echo $JAVA_HOME
+       echo $SQREAM_HOME
+       echo $CLASSPATH
+       echo $HADOOP_COMMON_LIB_NATIVE_DIR
+       echo $LD_LIBRARY_PATH
+       echo $PATH
+
+18. Verify that you copied the correct keytab file.
+
+19. Review this procedure to verify that you have followed each step.
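+
+Once Kerberos authentication succeeds, SQreamDB statements can read directly from ``hdfs://`` paths. The following is a minimal sketch, assuming the ``nba`` table used elsewhere in this documentation, a name node reachable at ``hadoop-nn.example.local`` on port 8020, and a CSV file at ``/demo/nba.csv`` (the host and path are hypothetical):
+
+.. code-block:: sql
+
+   COPY nba FROM
+   WRAPPER csv_fdw
+   OPTIONS(location = 'hdfs://hadoop-nn.example.local:8020/demo/nba.csv');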
diff --git a/external_storage_platforms/index.rst b/external_storage_platforms/index.rst
new file mode 100644
index 000000000..ec8aaf2d0
--- /dev/null
+++ b/external_storage_platforms/index.rst
@@ -0,0 +1,26 @@
+.. _external_storage_platforms:
+
+**************************
+External Storage Platforms
+**************************
+
+SQream supports the following external storage platforms:
+
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
+
+   azure
+   gcp
+   hdfs
+   s3
+
+For more information, see the following:
+
+* :ref:`foreign_tables`
+   
+* :ref:`copy_from`
+   
+* :ref:`copy_to`
diff --git a/external_storage_platforms/nba-t10.csv b/external_storage_platforms/nba-t10.csv
new file mode 100644
index 000000000..c9816d654
--- /dev/null
+++ b/external_storage_platforms/nba-t10.csv
@@ -0,0 +1,10 @@
+Name,Team,Number,Position,Age,Height,Weight,College,Salary
+Avery Bradley,Boston Celtics,0,PG,25,6-2,180,Texas,7730337
+Jae Crowder,Boston Celtics,99,SF,25,6-6,235,Marquette,6796117
+John Holland,Boston Celtics,30,SG,27,6-5,205,Boston University,
+R.J. Hunter,Boston Celtics,28,SG,22,6-5,185,Georgia State,1148640
+Jonas Jerebko,Boston Celtics,8,PF,29,6-10,231,,5000000
+Amir Johnson,Boston Celtics,90,PF,29,6-9,240,,12000000
+Jordan Mickey,Boston Celtics,55,PF,21,6-8,235,LSU,1170960
+Kelly Olynyk,Boston Celtics,41,C,25,7-0,238,Gonzaga,2165160
+Terry Rozier,Boston Celtics,12,PG,22,6-2,190,Louisville,1824360
diff --git a/operational_guides/s3.rst b/external_storage_platforms/s3.rst
similarity index 54%
rename from operational_guides/s3.rst
rename to external_storage_platforms/s3.rst
index bba878830..39aaec59f 100644
--- a/operational_guides/s3.rst
+++ b/external_storage_platforms/s3.rst
@@ -1,79 +1,88 @@
 .. _s3:
 
 ***********************
-Amazon S3
+Amazon Web Services
 ***********************
 
-SQream uses a native S3 connector for inserting data. The ``s3://`` URI specifies an external file path on an S3 bucket. File names may contain wildcard characters, and the files can be in CSV or columnar format, such as Parquet and ORC.
-
-The **Amazon S3** describes the following topics:
-
-.. contents::
-   :local:
+SQreamDB uses a native Amazon Simple Storage Service (S3) connector for inserting data.
    
-S3 Configuration
-==============================
-
-Any database host with access to S3 endpoints can access S3 without any configuration. To read files from an S3 bucket, the database must have listable files.
-
-S3 URI Format
-===============
-
-With S3, specify a location for a file (or files) when using :ref:`copy_from` or :ref:`external_tables`.
+S3 Bucket File Location
+========================
 
-The following is an example of the general S3 syntax:
+Use the following syntax to specify the location of a single file or multiple files within an S3 bucket:
 
 .. code-block:: console
  
-   s3://bucket_name/path
-
-Authentication
-=================
-
-SQream supports ``AWS ID`` and ``AWS SECRET`` authentication. These should be specified when executing a statement.
+   s3://bucket_name/path   
+   
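+For example, a hedged sketch that loads every CSV file under a hypothetical bucket prefix in a single statement, assuming a one-line header as in the documented examples:
+
+.. code-block:: postgres
+
+   COPY nba FROM 's3://my-bucket/data/*.csv' WITH OFFSET 2 RECORD DELIMITER '\r\n';
+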
+S3 Access 
+======================
 
-Examples
-==========
+A best practice for granting access to AWS S3 is to create an `Identity and Access Management (IAM) `_ user account. If creating an IAM user account is not possible, you may follow AWS guidelines for `using the global configuration object `_ and setting an `AWS region `_.
 
-Use a foreign table to stage data from S3 before loading from CSV, Parquet, or ORC files.
+Authentication
+==============
 
-The **Examples** section includes the following examples:
+After being granted access to an S3 bucket, you'll be able to execute statements using the ``AWS ID`` and ``AWS SECRET`` parameters for authentication.
 
-.. contents::
-   :local:
-   :depth: 1
+Connecting to S3 Using SQreamDB Legacy Configuration File
+=========================================================
 
 
+You may use the following parameters within your SQreamDB legacy configuration file:
 
-Planning for Data Staging
---------------------------------
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+   
+   * - Parameter
+     - Description
+     - Parameter Value
+     - Example
+   * - ``AwsEndpointOverride``
+     - Overrides the AWS S3 HTTP endpoint when using Virtual Private Cloud (VPC)
+     - ``URL``. Default: None
+     - .. code-block::
+	 
+			sqream_config_legacy.json:
+			{
+			  ...,	
+			  "AwsEndpointOverride": "https://my.endpoint.local"
+			}		   
+   * - ``AwsObjectAccessStyle``
+     - Enables configuration of S3 object access styles, which determine how you can access and interact with the objects stored in an S3 bucket
+     - ``virtual-host`` or ``path``. Default is ``virtual-host``
+     - .. code-block::
+	 
+			sqream_config_legacy.json:
+			{
+			  ...,
+			  "AwsObjectAccessStyle": "path"
+			}
 
-The examples in this section are based on a CSV file, as shown in the following table:
 
-.. csv-table:: nba.csv
-   :file: ../nba-t10.csv
-   :widths: auto
-   :header-rows: 1 
 
-The file is stored on Amazon S3, and this bucket is public and listable. To create a matching ``CREATE FOREIGN TABLE`` statement you can make note of the file structure.
+Examples
+========
 
 Creating a Foreign Table
------------------------------
+------------------------
 
 Based on the source file's structure, you can create a foreign table with the appropriate structure, and point it to your file as shown in the following example:
 
-.. code-block:: postgres
+.. code-block:: sql
    
    CREATE FOREIGN TABLE nba
    (
-      Name varchar(40),
-      Team varchar(40),
+      Name text(40),
+      Team text(40),
       Number tinyint,
-      Position varchar(2),
+      Position text(2),
       Age tinyint,
-      Height varchar(4),
+      Height text(4),
       Weight real,
-      College varchar(40),
+      College text(40),
       Salary float
     )
     WRAPPER csv_fdw
@@ -86,13 +95,8 @@ Based on the source file's structure, you can create a foreign table with the ap
 
 In the example above the file format is CSV, and it is stored as an S3 object. If the path is on HDFS, you must change the URI accordingly. Note that the record delimiter is a DOS newline (``\r\n``).
 
-For more information, see the following:
-
-* **Creating a foreign table** - see :ref:`create a foreign table`.
-* **Using SQream in an HDFS environment** - see :ref:`hdfs`.
-
 Querying Foreign Tables
-------------------------------
+-----------------------
 
 The following shows the data in the foreign table:
 
@@ -113,26 +117,22 @@ The following shows the data in the foreign table:
    Marcus Smart  | Boston Celtics |     36 | PG       |  22 | 6-4    |    220 | Oklahoma State    |  3431040
    
 Bulk Loading a File from a Public S3 Bucket
-----------------------------------------------
+-------------------------------------------
 
 The ``COPY FROM`` command can also be used to load data without staging it first.
 
-.. note:: The bucket must be publicly available and objects can be listed.
-
-The following is an example of bulk loading a file from a public S3 bucket:
+The bucket must be publicly accessible and its objects must be listable.
 
 .. code-block:: postgres
 
    COPY nba FROM 's3://sqream-demo-data/nba.csv' WITH OFFSET 2 RECORD DELIMITER '\r\n';
-   
-For more information on the ``COPY FROM`` command, see :ref:`copy_from`.
+  
 
 Loading Files from an Authenticated S3 Bucket
 ---------------------------------------------------
-The following is an example of loading fles from an authenticated S3 bucket:
 
 .. code-block:: postgres
 
    COPY nba FROM 's3://secret-bucket/*.csv' WITH OFFSET 2 RECORD DELIMITER '\r\n' 
    AWS_ID '12345678'
-   AWS_SECRET 'super_secretive_secret';
+   AWS_SECRET 'super_secretive_secret';
\ No newline at end of file
diff --git a/feature_guides/.DS_Store b/feature_guides/.DS_Store
new file mode 100644
index 000000000..5008ddfcf
Binary files /dev/null and b/feature_guides/.DS_Store differ
diff --git a/feature_guides/compression.rst b/feature_guides/compression.rst
index 710641036..21e39b082 100644
--- a/feature_guides/compression.rst
+++ b/feature_guides/compression.rst
@@ -1,63 +1,61 @@
 .. _compression:
 
-***********************
+***********************
 Compression
 ***********************
 
-SQream DB uses compression and encoding techniques to optimize query performance and save on disk space.
+SQreamDB uses a variety of compression and encoding methods to optimize query performance and to save disk space.
+
+.. contents:: 
+   :local:
+   :depth: 1
 
 Encoding
 =============
 
-Encoding converts data into a common format.
-
-When data is stored in a columnar format, it is often in a common format. This is in contrast with data stored in CSVs for example, where everything is stored in a text format.
-
-Because encoding uses specific data formats and encodings, it increases performance and reduces data size. 
-
-SQream DB encodes data in several ways depending on the data type. For example, a date is stored as an integer, with March 1st 1CE as the start. This is a lot more efficient than encoding the date as a string, and offers a wider range than storing it relative to the Unix Epoch. 
-
-Compression
-==============
+**Encoding** is an automatic operation used to convert data into common formats. For example, certain formats are often used for data stored in columnar format, in contrast with data stored in a CSV file, which stores all data in text format.
 
-Compression transforms data into a smaller format without losing accuracy (lossless).
+Encoding enhances performance and reduces data size by using specific data formats and encoding methods. SQream encodes data in a number of ways in accordance with the data type. For example, a **date** is stored as an **integer**, starting with **March 1st 1CE**, which is significantly more efficient than encoding the date as a string. In addition, it offers a wider range than storing it relative to the Unix Epoch. 
 
-After encoding a set of column values, SQream DB packs the data and compresses it.
+Lossless Compression
+=======================
 
-Before data can be accessed, SQream DB needs to decompress it.
+**Compression** transforms data into a smaller format without sacrificing accuracy, known as **lossless compression**.
 
-Depending on the compression scheme, the operations can be performed on the CPU or the GPU. Some users find that GPU compressions perform better for their data.
+After encoding a set of column values, SQream packs and compresses the data, decompressing it again when the data is accessed. Depending on the compression scheme used, these operations can be performed on the CPU or the GPU. Some users find that GPU compression provides better performance.
 
-Automatic compression
+Automatic Compression
 ------------------------
 
-By default, SQream DB automatically compresses every column (see :ref:`Specifying compressions` below for overriding default compressions). This feature is called **automatic adaptive compression** strategy.
+By default, SQream automatically compresses every column (see :ref:`Specifying Compression Strategies` below for overriding default compression). This feature is called the **automatic adaptive compression** strategy.
 
-When loading data, SQream DB automatically decides on the compression schemes for specific chunks of data by trying several compression schemes and selecting the one that performs best. SQream DB tries to balance more agressive compressions with the time and CPU/GPU time required to compress and decompress the data.
+When loading data, SQreamDB automatically decides on the compression schemes for specific chunks of data by trying several compression schemes and selecting the one that performs best. SQreamDB tries to balance more aggressive compression with the time and CPU/GPU time required to compress and decompress the data.
 
-Compression strategies
+Compression Methods
 ------------------------
 
+The following table shows the supported compression methods:
+
 .. list-table:: 
    :widths: auto
    :header-rows: 1
 
-   * - Compression name
-     - Supported data types
+   * - Compression Method
+     - Supported Data Types
      - Description
      - Location
    * - ``FLAT``
      - All types
      - No compression (forced)
-     - -
+     - NA
    * - ``DEFAULT``
      - All types
      - Automatic scheme selection
-     - -
+     - NA
    * - ``DICT``
-     - Integer types, dates and timestamps, short texts
+     - All types
      - 
-         Dictionary compression with RLE. For each chunk, SQream DB creates a dictionary of distinct values and stores only their indexes.
+         Dictionary compression with RLE. For each chunk, SQreamDB creates a dictionary of distinct values and stores only their indexes.
          
          Works best for integers and texts shorter than 120 characters, with <10% unique values.
          
@@ -66,7 +64,7 @@ Compression strategies
          If the data is optionally sorted, this compression will perform even better.
      - GPU
    * - ``P4D``
-     - Integer types, dates and timestamps
+     - ``integer``, ``date``, ``timestamp``, and ``float`` types
      - 
          Patched frame-of-reference + Delta 
          
@@ -82,33 +80,39 @@ Compression strategies
      - General purpose compression, used for texts
      - CPU
    * - ``RLE``
-     - Integer types, dates and timestamps
-     - Run-length encoding. This replaces sequences of values with a single pair. It is best for low cardinality columns that are used to sort data (``ORDER BY``).
+     - ``integer``, ``date``, ``timestamp``, and ``text`` types
+     - Run-Length Encoding. This replaces sequences of values with a single pair. It is best for low cardinality columns that are used to sort data (``ORDER BY``).
      - GPU
    * - ``SEQUENCE``
-     - Integer types
+     - ``integer``, ``date``, and ``timestamp`` types
      - Optimized RLE + Delta type for built-in :ref:`identity columns`. 
      - GPU
 
+
 .. _specifying_compressions:
 
-Specifying compression strategies
+Specifying Compression Strategies
 ----------------------------------
 
-When creating a table without any compression specifications, SQream DB defaults to automatic adaptive compression (``"default"``).
+When you create a table without defining any compression specifications, SQream defaults to automatic adaptive compression (``"default"``). However, you can prevent this by specifying a compression strategy when creating a table.
 
-However, this can be overriden by specifying a compression strategy when creating a table.
+This section describes the following compression strategies:
 
-Explicitly specifying automatic compression
+.. contents:: 
+   :local:
+   :depth: 1
+
+Explicitly Specifying Automatic Compression
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-The following two are equivalent:
+When you explicitly specify automatic compression, the following two statements are equivalent:
 
 .. code-block:: postgres
    
    CREATE TABLE t (
       x INT,
-      y VARCHAR(50)
+      y TEXT(50)
    );
 
 In this version, the default compression is specified explicitly:
@@ -117,47 +121,54 @@ In this version, the default compression is specified explicitly:
    
    CREATE TABLE t (
       x INT CHECK('CS "default"'),
-      y VARCHAR(50) CHECK('CS "default"')
+      y TEXT(50) CHECK('CS "default"')
    );
 
-Forcing no compression (flat)
+Forcing No Compression
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-In some cases, you may wish to remove compression entirely on some columns,
-in order to reduce CPU or GPU resource utilization at the expense of increased I/O.
+**Forcing no compression** is also known as "flat", and can be used when you want to disable compression entirely on some columns. This may be useful for reducing CPU or GPU resource utilization at the expense of increased I/O.
+
+The following is an example of removing compression:
 
 .. code-block:: postgres
    
    CREATE TABLE t (
       x INT NOT NULL CHECK('CS "flat"'), -- This column won't be compressed
-      y VARCHAR(50) -- This column will still be compressed automatically
+      y TEXT(50) -- This column will still be compressed automatically
    );
 
-
-Forcing compressions
+Forcing Compression
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-In some cases, you may wish to force SQream DB to use a specific compression scheme based
-on your knowledge of the data. 
-
-For example:
+In other cases, you may want to force SQream to use a specific compression scheme based on your knowledge of the data, as shown in the following example:
 
 .. code-block:: postgres
    
    CREATE TABLE t (
       id BIGINT NOT NULL CHECK('CS "sequence"'),
-      y VARCHAR(110) CHECK('CS "lz4"'), -- General purpose text compression
-      z VARCHAR(80) CHECK('CS "dict"'), -- Low cardinality column
+      y TEXT(110) CHECK('CS "lz4"'), -- General purpose text compression
+      z TEXT(80) CHECK('CS "dict"') -- Low cardinality column
       
    );
 
+However, if SQream finds that the given compression method cannot effectively compress the data, it falls back to the default compression type.
 
-Examining compression effectiveness
+Examining Compression Effectiveness
 --------------------------------------
 
-Queries to the internal metadata catalog can expose how effective the compression is, as well as what compression schemes were selected.
+Queries made on the internal metadata catalog can expose how effective the compression is, as well as what compression schemes were selected.
+
+This section describes the following:
+
+.. contents:: 
+   :local:
+   :depth: 1
 
-Here is a sample query we can use to query the catalog:
+Querying the Catalog
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following is a sample query that can be used to query the catalog:
 
 .. code-block:: postgres
    
@@ -178,7 +189,10 @@ Here is a sample query we can use to query the catalog:
       GROUP BY 1,
                2;
 
-Example (subset) from the ``ontime`` table:
+Example Subset from "Ontime" Table
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following is an example (subset) from the ``ontime`` table:
 
 .. code-block:: psql
    
@@ -268,43 +282,48 @@ Example (subset) from the ``ontime`` table:
    uniquecarrier             | dict               |     578221 |      7230705 |                     11.96 | default             
    year                      | rle                |          6 |      2065915 |                 317216.08 | default             
 
+Notes on Reading the "Ontime" Table
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Notes on reading this table:
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The following are some useful notes on reading the "Ontime" table shown above:
 
-#. Higher numbers in the *effectiveness* column represent better compressions. 0 represents a column that wasn't compressed at all.
+#. Higher numbers in the **Compression effectiveness** column represent better compressions. **0** represents a column that has **not been compressed**.
 
-#. Column names are the internal representation. Names with ``@null`` and ``@val`` suffixes represent a nullable column's null (boolean) and values respectively, but are treated as one logical column.
 
+#. Column names are an internal representation. Names with ``@null`` and ``@val`` suffixes represent a nullable column's null (boolean) and values respectively, but are treated as one logical column.
+
 #. The query lists all actual compressions for a column, so it may appear several times if the compression has changed mid-way through the loading (as with the ``carrierdelay`` column).
 
-#. When ``default`` is the compression strategy, the system automatically selects the best compression. This can also mean no compression at all (``flat``).
+#. When your compression strategy is ``default``, the system automatically selects the best compression, including no compression at all (``flat``).
 
-Compression best practices
+Best Practices
 ==============================
 
-Let SQream DB decide on the compression strategy
-----------------------------------------------------
-
-In general, SQream DB will decide on the best compression strategy in most cases.
+This section describes the best compression practices:
 
-When overriding compression strategies, we recommend benchmarking not just storage size but also query and load performance.
+.. contents:: 
+   :local:
+   :depth: 1
+   
+Letting SQream Determine the Best Compression Strategy
+-------------------------------------------------------
 
+In general, SQream determines the best compression strategy. If you decide to override SQream's selected compression strategies, we recommend benchmarking your query and load performance **in addition to** your storage size.
 
-Maximize the advantage of each compression schemes
+Maximizing the Advantage of Each Compression Scheme
 -------------------------------------------------------
 
-Some compression schemes perform better when data is organized in a specific way.
-
-For example, to take advantage of RLE, sorting a column may result in better performance and reduced disk-space and I/O usage.
+Some compression schemes perform better when data is organized in a specific way. For example, to take advantage of RLE, sorting a column may result in better performance and reduced disk-space and I/O usage.
 Sorting a column partially may also be beneficial. As a rule of thumb, aim for run-lengths of more than 10 consecutive values.
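+
+For instance, a hedged sketch of forcing ``RLE`` on a column that arrives naturally sorted with long runs of repeated values (the table and column names are hypothetical):
+
+.. code-block:: postgres
+
+   CREATE TABLE daily_events (
+      event_date DATE CHECK('CS "rle"'), -- sorted, low cardinality: long runs
+      payload TEXT
+   );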
 
-Choose data types that fit the data
+Choosing Data Types that Fit Your Data
 ---------------------------------------
 
-Adapting to the narrowest data type will improve query performance and also reduce disk space usage.
-However, smaller data types may compress better than larger types.
-
-For example, use the smallest numeric data type that will accommodate your data. Using ``BIGINT`` for data that fits in ``INT`` or ``SMALLINT`` can use more disk space and memory for query execution.
+Adapting to the narrowest data type improves query performance while reducing disk space usage. Moreover, smaller data types may compress better than larger types.
 
-Using ``FLOAT`` to store integers will reduce compression's effectiveness significantly.
\ No newline at end of file
+For example, SQream recommends using the smallest numeric data type that will accommodate your data. Using ``BIGINT`` for data that fits in ``INT`` or ``SMALLINT`` can use more disk space and memory for query execution. Using ``FLOAT`` to store integers will reduce compression's effectiveness significantly.
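+
+For instance, a hedged sketch under the assumption that the values never exceed the ``SMALLINT`` range (the table and column names are hypothetical):
+
+.. code-block:: postgres
+
+   CREATE TABLE measurements (
+      sensor_reading SMALLINT NOT NULL -- narrowest type that fits; avoids BIGINT overhead
+   );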
\ No newline at end of file
diff --git a/feature_guides/concurrency_and_locks.rst b/feature_guides/concurrency_and_locks.rst
index e18dea015..af300b7c6 100644
--- a/feature_guides/concurrency_and_locks.rst
+++ b/feature_guides/concurrency_and_locks.rst
@@ -4,16 +4,16 @@
 Concurrency and Locks
 ***********************
 
-Locks are used in SQream DB to provide consistency when there are multiple concurrent transactions updating the database. 
+Locks are used in SQreamDB to provide consistency when there are multiple concurrent transactions updating the database. 
 
 Read only transactions are never blocked, and never block anything. Even if you drop a database while concurrently running a query on it, both will succeed correctly (as long as the query starts running before the drop database commits).
 
 .. _locking_modes:
 
-Locking modes
+Locking Modes
 ================
 
-SQream DB has two kinds of locks:
+SQreamDB has two kinds of locks:
 
 * 
    ``exclusive`` - this lock mode prevents the resource from being modified by other statements
@@ -27,7 +27,7 @@ SQream DB has two kinds of locks:
    
    This lock allows other statements to insert or delete data from a table, but they'll have to wait in order to run DDL.
 
-When are locks obtained?
+When Are Locks Obtained?
 ============================
 
 .. list-table::
@@ -64,44 +64,29 @@ When are locks obtained?
 
 Statements that wait will exit with an error if they hit the lock timeout. The default timeout is 3 seconds, see ``statementLockTimeout``.
 
-Global locks
-----------------
-
-Some operations require exclusive global locks at the cluster level. These usually short-lived locks will be obtained for the following operations:
-
-   * :ref:`create_database`
-   * :ref:`create_role`
-   * :ref:`create_table`
-   * :ref:`alter_role`
-   * :ref:`alter_table`
-   * :ref:`drop_database`
-   * :ref:`drop_role`
-   * :ref:`drop_table`
-   * :ref:`grant`
-   * :ref:`revoke`
-
-Monitoring locks
+Monitoring Locks
 ===================
 
 Monitoring locks across the cluster can be useful when transaction contention takes place, and statements appear "stuck" while waiting for a previous statement to release locks.
 
 The utility :ref:`show_locks` can be used to see the active locks.
 
-In this example, we create a table based on results (:ref:`create_table_as`), but we are also effectively dropping the previous table (by using ``OR REPLACE`` which also :ref:`drops the table`). Thus, SQream DB applies locks during the table creation process to prevent the table from being altered during it's creation.
+In this example, we create a table based on results (:ref:`create_table_as`), but we are also effectively dropping the previous table (by using ``OR REPLACE`` which also :ref:`drops the table`). Thus, SQreamDB applies locks during the table creation process to prevent the table from being altered during its creation.
 
 
 .. code-block:: psql
 
-   t=> SELECT SHOW_LOCKS();
-   statement_id | statement_string                                                                                | username | server       | port | locked_object                   | lockmode  | statement_start_time | lock_start_time    
-   -------------+-------------------------------------------------------------------------------------------------+----------+--------------+------+---------------------------------+-----------+----------------------+--------------------
-   287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | database$t                      | Inclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
-   287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | globalpermission$               | Exclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
-   287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | schema$t$public                 | Inclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
-   287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | table$t$public$nba2$Insert      | Exclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
-   287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | table$t$public$nba2$Update      | Exclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
+	SELECT SHOW_LOCKS();
+	statement_id | statement_string                                                                                | username | server       | port | locked_object                   | lockmode  | statement_start_time | lock_start_time    
+	-------------+-------------------------------------------------------------------------------------------------+----------+--------------+------+---------------------------------+-----------+----------------------+--------------------
+	287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | database$t                      | Inclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
+	287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | globalpermission$               | Exclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
+	287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | schema$t$public                 | Inclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
+	287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | table$t$public$nba2$Insert      | Exclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
+	287          | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream   | 192.168.1.91 | 5000 | table$t$public$nba2$Update      | Exclusive | 2019-12-26 00:03:30  | 2019-12-26 00:03:30
 
-For more information on troubleshooting lock related issues, see 
+.. note:: A ``SUPERUSER`` can remove :ref:`unaccounted-for locks` and has the ability to :ref:`clear all locks` in the system.
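+
+For illustration, a hedged sketch of a ``SUPERUSER`` clearing stale locks, assuming the ``release_defunct_locks`` utility referenced in the note above:
+
+.. code-block:: psql
+
+   SELECT RELEASE_DEFUNCT_LOCKS();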
 
 
 
+For more information on troubleshooting lock-related issues, see :ref:`lock_related_issues`.
\ No newline at end of file
diff --git a/feature_guides/concurrency_and_scaling_in_sqream.rst b/feature_guides/concurrency_and_scaling_in_sqream.rst
deleted file mode 100644
index 0370913fa..000000000
--- a/feature_guides/concurrency_and_scaling_in_sqream.rst
+++ /dev/null
@@ -1,36 +0,0 @@
-.. _concurrency_and_scaling_in_sqream:
-
-***************************************
-Concurrency and Scaling in SQream DB
-***************************************
-
-A SQream DB cluster can concurrently run one regular statement per worker process. A number of small statements will execute alongside these statements without waiting or blocking anything.
-
-SQream DB supports ``n`` concurrent statements by having ``n`` workers in a cluster. Each worker uses a fixed slice of a GPU's memory, with usual values are around 8-16GB of GPU memory per worker. This size is ideal for queries running on large data with potentially large row sizes.
-
-Scaling when data sizes grow
---------------------------------
-
-For many statements, SQream DB scales linearly when adding more storage and querying on large data sets. It uses very optimised 'brute force' algorithms and implementations, which don't suffer from sudden performance cliffs at larger data sizes.
-
-Scaling when queries are queueing
----------------------------------------
-
-SQream DB scales well by adding more workers, GPUs, and nodes to support more concurrent statements.
-
-What to do when queries are slow
-----------------------------------
-
-Adding more workers or GPUs does not boost the performance of a single statement or query. 
-
-To boost the performance of a single statement, start by examining the :ref:`best practices` and ensure the guidelines are followed.
-
-.. TODO: we have a lot of techniques to speed up statements which aren't ready for customers to use without support - add something here and in the best practices about this
-
-Adding additional RAM to nodes, using more GPU memory, and faster CPUs or storage can also sometimes help.
-
-.. rubric:: Need help?
-
-Analyzing complex workloads can be challenging. SQream's experienced customer support has the experience to advise on these matters to ensure the best experience.
-
-Visit `SQream's support portal `_ for additional support.
diff --git a/feature_guides/data_encryption.rst b/feature_guides/data_encryption.rst
new file mode 100644
index 000000000..c38d126fc
--- /dev/null
+++ b/feature_guides/data_encryption.rst
@@ -0,0 +1,16 @@
+.. _data_encryption:
+
+***********************
+Data Encryption
+***********************
+The **Data Encryption** page describes the following:
+
+   
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
+
+   data_encryption_overview
+   data_encryption_methods
+   data_encryption_types
+   data_encryption_syntax
\ No newline at end of file
diff --git a/feature_guides/data_encryption_methods.rst b/feature_guides/data_encryption_methods.rst
new file mode 100644
index 000000000..250ca06de
--- /dev/null
+++ b/feature_guides/data_encryption_methods.rst
@@ -0,0 +1,17 @@
+.. _data_encryption_methods:
+
+***********************
+Encryption Methods
+***********************
+Data exists in one of the following states, which determines the appropriate encryption method:
+
+
+Encrypting Data in Transit
+--------------------------
+**Data in transit** refers to data that is being transferred between physical or remote locations, for example by email or by uploading documents to the cloud. Such data is typically stored in a database and accessed on a regular basis through applications or programs, and must therefore be protected while **in transit**. SQream encrypts data in transit using SSL when, for example, users insert data files from external repositories over a JDBC or ODBC connection.
+
+For more information, see `Use TLS/SSL When Possible <../operational_guides/security.html#use-tls-ssl-when-possible>`_.
+
+Encrypting Data at Rest
+-----------------------
+**Data at rest** refers to data stored on your hard drive or in the cloud. Because this data can potentially be intercepted **physically**, it requires a form of encryption that protects it wherever it is stored. SQream facilitates this by letting you encrypt any column.
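+
+For illustration, a minimal sketch of encrypting a sensitive column at rest (the table and column names are hypothetical):
+
+.. code-block:: psql
+
+   CREATE TABLE customers (
+      card_number TEXT ENCRYPT,
+      full_name TEXT);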
\ No newline at end of file
diff --git a/feature_guides/data_encryption_overview.rst b/feature_guides/data_encryption_overview.rst
new file mode 100644
index 000000000..3df4820a9
--- /dev/null
+++ b/feature_guides/data_encryption_overview.rst
@@ -0,0 +1,18 @@
+.. _data_encryption_overview:
+
+***********************
+Overview
+***********************
+**Data Encryption** helps protect sensitive data at rest by concealing it from unauthorized users in the event of a breach. This is achieved by scrambling the content into an unreadable format based on encryption and decryption keys. Typically, this data pertains to **PII (Personally Identifiable Information)**, such as credit card numbers and other information related to an identifiable person.
+
+Users encrypt their data on a column basis by specifying ``column_name`` in the encryption syntax.
+
+The demand for confidentiality has steadily increased to protect the growing volumes of private data stored on computer systems and transmitted over the internet. To this end, regulatory bodies such as the **General Data Protection Regulation (GDPR)** have produced requirements to standardize and enforce compliance aimed at protecting customer data.
+
+SQream enables customers to implement a symmetric encryption solution using secret keys that they provide and manage themselves.
+The chosen encryption algorithm is AES-256, known for its strength and security. It is crucial to ensure that the secret key length is precisely 256 bits (32 bytes).
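+
+For illustration, a 256-bit key can be supplied as a 64-character hexadecimal string, as in the following hedged sketch of the ``ENCRYPT`` method (the key and value are hypothetical):
+
+.. code-block:: psql
+
+   -- 64 hexadecimal characters = 32 bytes = 256 bits
+   ENCRYPT ( <value> , '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f' )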
+
+
+For more information on the encryption syntax, see :ref:`data_encryption_syntax`.
+
+For more information on GDPR compliance requirements, see the `GDPR checklist `_.
\ No newline at end of file
diff --git a/feature_guides/data_encryption_syntax.rst b/feature_guides/data_encryption_syntax.rst
new file mode 100644
index 000000000..5642c7092
--- /dev/null
+++ b/feature_guides/data_encryption_syntax.rst
@@ -0,0 +1,121 @@
+.. _data_encryption_syntax:
+
+***********************
+Syntax
+***********************
+Encrypting columns in a new table
+
+.. code-block:: psql
+     
+   CREATE TABLE <table_name> (
+      <column_name> <data_type> ENCRYPT,
+      <column_name> <data_type> NULL ENCRYPT,
+      <column_name> <data_type> NOT NULL ENCRYPT
+   );
+
+Adding an encrypted column to an existing table
+
+.. code-block:: psql
+
+   ALTER TABLE <table_name> ADD COLUMN <column_name> <data_type> ENCRYPT;
+		
+		
+Encryption methods
+
+.. code-block:: psql
+
+   ENCRYPT ( <value> , <encryption_key> )
+
+
+Decryption method
+
+.. code-block:: psql
+
+   DECRYPT ( <column_name> , <encryption_key> )
+
+***********************
+Examples
+***********************
+
+Encrypting a new table
+
+.. code-block:: psql
+     
+   CREATE TABLE client_name  (
+        id BIGINT NOT NULL ENCRYPT,
+        first_name TEXT ENCRYPT,
+        last_name TEXT,
+        salary INT ENCRYPT);
+
+Inserting an encrypted player salary (``INT`` data type)
+
+.. code-block:: psql
+
+	INSERT INTO NBA (player_name, team_name, jersey_number, position, age, height, weight, college, salary)
+	VALUES ('Jayson Christopher Tatum', 'Boston Celtics', 0, 'SF', 25, '6-8', 210 , 'Duke', ENCRYPT ( 32600060 , '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f'));
+
+Similar example using ``COPY FROM``
+
+.. code-block:: psql
+
+	COPY NBA
+	(
+	player_name, team_name, jersey_number, position, age, height, weight, college, 
+	ENCRYPT (salary, '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f')
+	)
+	FROM WRAPPER csv_fdw 
+	OPTIONS
+	(location = '/tmp/source_file.csv', quote='@');
+
+Query the encrypted data
+
+.. code-block:: psql
+
+	SELECT player_name, DECRYPT( salary, '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f') FROM NBA
+	WHERE player_name ='Jayson Christopher Tatum';
+
+	player_name             |salary    |
+	------------------------+----------+
+	Jayson Christopher Tatum|1500000   |
+
+Query the encrypted data using a ``WHERE`` clause on an encrypted column
+
+.. code-block:: psql
+
+	SELECT player_name, DECRYPT( salary, '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f')
+	FROM NBA
+	WHERE DECRYPT( salary, '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f') > 1000000;
+	
+	player_name             |salary    |
+	------------------------+----------+
+	Jayson Christopher Tatum|1500000   |
+	------------------------+----------+
+	Marcus Smart            |1350000   |
+
+Example of ``COPY TO`` using ``DECRYPT``
+
+.. code-block:: psql
+
+	COPY 
+	  (SELECT player_name, DECRYPT( salary, '6a8431f6e9c2777ee356c0b8aa3c12c0c63bdf366ac3342c4c9184b51697b47f')
+	  FROM NBA
+	  WHERE player_name ='Jayson Christopher Tatum') 
+	TO WRAPPER parquet_fdw 
+	OPTIONS (LOCATION = '/tmp/file.parquet');
+
+
+***********************
+Limitations
+***********************
+* The following functionality is not supported by the encryption feature: ``Catalog queries``, ``Utility commands``, ``Foreign Tables``, ``Create AS SELECT``.
+* A single encryption key must be used per column - using a different key would result in an error.
+* Compression of encrypted columns is limited to the following types: ``Flat``, ``LZ4``, ``P4D``, ``DICT``, ``RLE``.
+* This feature is not backward compatible with previous versions of SQreamDB.
+* The encryption feature affects performance and compression.
+
+
+
+***********************
+Permissions
+***********************
+The Data Encryption feature does not require a specific permission; users with the relevant **TABLE** and **COLUMN** `permissions <../operational_guides/access_control_permissions.html#permissions>`_ may utilize it.
\ No newline at end of file
diff --git a/feature_guides/data_encryption_types.rst b/feature_guides/data_encryption_types.rst
new file mode 100644
index 000000000..a42d22da9
--- /dev/null
+++ b/feature_guides/data_encryption_types.rst
@@ -0,0 +1,14 @@
+.. _data_encryption_types:
+
+***********************
+Data Types
+***********************
+Typically, sensitive data pertains to **PII (Personally Identifiable Information)**, such as credit card numbers and other information related to an identifiable person.
+
+SQream's data encryption feature supports encrypting column-based data belonging to the following data types:
+
+* ``INT``
+* ``BIGINT``
+* ``TEXT``
+
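+For illustration, a hedged sketch encrypting one column of each supported type (the table and column names are hypothetical):
+
+.. code-block:: psql
+
+   CREATE TABLE pii (
+      salary INT ENCRYPT,
+      account_id BIGINT ENCRYPT,
+      ssn TEXT ENCRYPT);
+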
+For more information on the above data types, see :ref:`supported_data_types`.
\ No newline at end of file
diff --git a/feature_guides/delete.rst b/feature_guides/delete.rst
deleted file mode 100644
index 24ab5a218..000000000
--- a/feature_guides/delete.rst
+++ /dev/null
@@ -1,214 +0,0 @@
-.. _delete_guide:
-
-***********************
-Deleting Data
-***********************
-
-SQream DB supports deleting data, but it's important to understand how this works and how to maintain deleted data.
-
-How does deleting in SQream DB work?
-========================================
-
-In SQream DB, when you run a delete statement, any rows that match the delete predicate will no longer be returned when running subsequent queries.
-Deleted rows are tracked in a separate location, in *delete predicates*.
-
-After the delete statement, a separate process can be used to reclaim the space occupied by these rows, and to remove the small overhead that queries will have until this is done. 
-
-Some benefits to this design are:
-
-#. Delete transactions complete quickly
-
-#. The total disk footprint overhead at any time for a delete transaction or cleanup process is small and bounded (while the system still supports low overhead commit, rollback and recovery for delete transactions).
-
-
-Phase 1: Delete
----------------------------
-
-.. TODO: isn't the delete cleanup able to complete a certain amount of work transactionally, so that you can do a massive cleanup in stages?
-
-.. TODO: our current best practices is to use a cron job with sqream sql to run the delete cleanup. we should document how to do this, we have customers with very different delete schedules so we can give a few extreme examples and when/why you'd use them
-   
-When a :ref:`delete` statement is run, SQream DB records the delete predicates used. These predicates will be used to filter future statements on this table until all this delete predicate's matching rows have been physically cleaned up.
-
-This filtering process takes full advantage of SQream's zone map feature.
-
-Phase 2: Clean-up
---------------------
-
-The cleanup process is not automatic. This gives control to the user or DBA, and gives flexibility on when to run the clean up.
-
-Files marked for deletion during the logical deletion stage are removed from disk. This is achieved by calling both utility function commands: ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` sequentially.
-
-.. note::
-   * :ref:`alter_table` and other DDL operations are blocked on tables that require clean-up. See more in the :ref:`concurrency_and_locks` guide.
-   * If the estimated time for a cleanup processs is beyond a threshold, you will get an error message about it. The message will explain how to override this limitation and run the process anywhere.
-
-Notes on data deletion
-=========================================
-
-.. note::
-   * If the number of deleted records crosses the threshold defined by the ``mixedColumnChunksThreshold`` parameter, the delete operation will be aborted.
-   * This is intended to alert the user that the large number of deleted records may result in a large number of mixed chuncks.
-   * To circumvent this alert, replace XXX with the desired number of records before running the delete operation:
-
-.. code-block:: postgres
-
-   set mixedColumnChunksThreshold=XXX;
-   
-
-Deleting data does not free up space
------------------------------------------
-
-With the exception of a full table delete (:ref:`TRUNCATE`), deleting data does not free up disk space. To free up disk space, trigger the cleanup process.
-
-``SELECT`` performance on deleted rows
-----------------------------------------
-
-Queries on tables that have deleted rows may have to scan data that hasn't been cleaned up.
-In some cases, this can cause queries to take longer than expected. To solve this issue, trigger the cleanup process.
-
-Use ``TRUNCATE`` instead of ``DELETE``
----------------------------------------
-For tables that are frequently emptied entirely, consider using :ref:`truncate` rather than :ref:`delete`. TRUNCATE removes the entire content of the table immediately, without requiring a subsequent cleanup to free up disk space.
-
-Cleanup is I/O intensive
--------------------------------
-
-The cleanup process actively compacts tables by writing a complete new version of column chunks with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes.
-
-Cleanup operations can create significant I/O load on the database. Consider this when planning the best time for the cleanup process.
-
-If this is an issue with your environment, consider using ``CREATE TABLE AS`` to create a new table and then rename and drop the old table.
-
-
-Example
-=============
-
-Deleting values from a table
-------------------------------
-
-.. code-block:: psql
-
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   4,Elephant            ,6500
-   5,Rhinoceros          ,2100
-   6,\N,\N
-   
-   6 rows
-   
-   farm=> DELETE FROM cool_animals WHERE weight > 1000;
-   executed
-   
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   6,\N,\N
-   
-   4 rows
-
-Deleting values based on more complex predicates
----------------------------------------------------
-
-.. code-block:: psql
-
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   4,Elephant            ,6500
-   5,Rhinoceros          ,2100
-   6,\N,\N
-   
-   6 rows
-   
-   farm=> DELETE FROM cool_animals WHERE weight > 1000;
-   executed
-   
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   6,\N,\N
-   
-   4 rows
-
-Identifying and cleaning up tables
----------------------------------------
-
-List tables that haven't been cleaned up
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: psql
-   
-   farm=> SELECT t.table_name FROM sqream_catalog.delete_predicates dp
-      JOIN sqream_catalog.tables t
-      ON dp.table_id = t.table_id
-      GROUP BY 1;
-   cool_animals
-   
-   1 row
-
-Identify predicates for clean-up
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: psql
-
-   farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp
-      JOIN sqream_catalog.tables t
-      ON dp.table_id = t.table_id
-      WHERE t.table_name = 'cool_animals';
-   weight > 1000
-   
-   1 row
-
-Triggering a cleanup
-^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: psql
-
-   -- Chunk reorganization (aka SWEEP)
-   farm=> SELECT CLEANUP_CHUNKS('public','cool_animals');
-   executed
-
-   -- Delete leftover files (aka VACUUM)
-   farm=> SELECT CLEANUP_EXTENTS('public','cool_animals');
-   executed
-   
-   
-   farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp
-      JOIN sqream_catalog.tables t
-      ON dp.table_id = t.table_id
-      WHERE t.table_name = 'cool_animals';
-   
-   0 rows
-
-
-
-Best practices for data deletion
-=====================================
-
-* Run ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` after large ``DELETE`` operations.
-
-* When deleting large proportions of data from very large tables, consider running a ``CREATE TABLE AS`` operation instead, then rename and drop the original table.
-
-* Avoid killing ``CLEANUP_EXTENTS`` operations after they've started.
-
-* SQream DB is optimised for time-based data. When data is naturally ordered by a date or timestamp, deleting based on those columns will perform best. For more information, see our :ref:`time based data management guide`.
-
-
-
-.. soft update concept
-
-.. delete cleanup and it's properties. automatic/manual, in transaction or background
-
-.. automatic background gives fast delete, minimal transaction overhead,
-.. small cost to queries until background reorganised
-
-.. when does delete use the metadata effectively
-
-.. more examples
-
diff --git a/feature_guides/delete_guide.rst b/feature_guides/delete_guide.rst
deleted file mode 100644
index 24ab5a218..000000000
--- a/feature_guides/delete_guide.rst
+++ /dev/null
@@ -1,214 +0,0 @@
-.. _delete_guide:
-
-***********************
-Deleting Data
-***********************
-
-SQream DB supports deleting data, but it's important to understand how this works and how to maintain deleted data.
-
-How does deleting in SQream DB work?
-========================================
-
-In SQream DB, when you run a delete statement, any rows that match the delete predicate will no longer be returned when running subsequent queries.
-Deleted rows are tracked in a separate location, in *delete predicates*.
-
-After the delete statement, a separate process can be used to reclaim the space occupied by these rows, and to remove the small overhead that queries will have until this is done. 
-
-Some benefits to this design are:
-
-#. Delete transactions complete quickly
-
-#. The total disk footprint overhead at any time for a delete transaction or cleanup process is small and bounded (while the system still supports low overhead commit, rollback and recovery for delete transactions).
-
-
-Phase 1: Delete
----------------------------
-
-.. TODO: isn't the delete cleanup able to complete a certain amount of work transactionally, so that you can do a massive cleanup in stages?
-
-.. TODO: our current best practices is to use a cron job with sqream sql to run the delete cleanup. we should document how to do this, we have customers with very different delete schedules so we can give a few extreme examples and when/why you'd use them
-   
-When a :ref:`delete` statement is run, SQream DB records the delete predicates used. These predicates will be used to filter future statements on this table until all this delete predicate's matching rows have been physically cleaned up.
-
-This filtering process takes full advantage of SQream's zone map feature.
-
-Phase 2: Clean-up
---------------------
-
-The cleanup process is not automatic. This gives control to the user or DBA, and gives flexibility on when to run the clean up.
-
-Files marked for deletion during the logical deletion stage are removed from disk. This is achieved by calling both utility function commands: ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` sequentially.
-
-.. note::
-   * :ref:`alter_table` and other DDL operations are blocked on tables that require clean-up. See more in the :ref:`concurrency_and_locks` guide.
-   * If the estimated time for a cleanup processs is beyond a threshold, you will get an error message about it. The message will explain how to override this limitation and run the process anywhere.
-
-Notes on data deletion
-=========================================
-
-.. note::
-   * If the number of deleted records crosses the threshold defined by the ``mixedColumnChunksThreshold`` parameter, the delete operation will be aborted.
-   * This is intended to alert the user that the large number of deleted records may result in a large number of mixed chuncks.
-   * To circumvent this alert, replace XXX with the desired number of records before running the delete operation:
-
-.. code-block:: postgres
-
-   set mixedColumnChunksThreshold=XXX;
-   
-
-Deleting data does not free up space
------------------------------------------
-
-With the exception of a full table delete (:ref:`TRUNCATE`), deleting data does not free up disk space. To free up disk space, trigger the cleanup process.
-
-``SELECT`` performance on deleted rows
-----------------------------------------
-
-Queries on tables that have deleted rows may have to scan data that hasn't been cleaned up.
-In some cases, this can cause queries to take longer than expected. To solve this issue, trigger the cleanup process.
-
-Use ``TRUNCATE`` instead of ``DELETE``
----------------------------------------
-For tables that are frequently emptied entirely, consider using :ref:`truncate` rather than :ref:`delete`. TRUNCATE removes the entire content of the table immediately, without requiring a subsequent cleanup to free up disk space.
-
-Cleanup is I/O intensive
--------------------------------
-
-The cleanup process actively compacts tables by writing a complete new version of column chunks with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes.
-
-Cleanup operations can create significant I/O load on the database. Consider this when planning the best time for the cleanup process.
-
-If this is an issue with your environment, consider using ``CREATE TABLE AS`` to create a new table and then rename and drop the old table.
-
-
-Example
-=============
-
-Deleting values from a table
-------------------------------
-
-.. code-block:: psql
-
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   4,Elephant            ,6500
-   5,Rhinoceros          ,2100
-   6,\N,\N
-   
-   6 rows
-   
-   farm=> DELETE FROM cool_animals WHERE weight > 1000;
-   executed
-   
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   6,\N,\N
-   
-   4 rows
-
-Deleting values based on more complex predicates
----------------------------------------------------
-
-.. code-block:: psql
-
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   4,Elephant            ,6500
-   5,Rhinoceros          ,2100
-   6,\N,\N
-   
-   6 rows
-   
-   farm=> DELETE FROM cool_animals WHERE weight > 1000;
-   executed
-   
-   farm=> SELECT * FROM cool_animals;
-   1,Dog                 ,7
-   2,Possum              ,3
-   3,Cat                 ,5
-   6,\N,\N
-   
-   4 rows
-
-Identifying and cleaning up tables
----------------------------------------
-
-List tables that haven't been cleaned up
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: psql
-   
-   farm=> SELECT t.table_name FROM sqream_catalog.delete_predicates dp
-      JOIN sqream_catalog.tables t
-      ON dp.table_id = t.table_id
-      GROUP BY 1;
-   cool_animals
-   
-   1 row
-
-Identify predicates for clean-up
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: psql
-
-   farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp
-      JOIN sqream_catalog.tables t
-      ON dp.table_id = t.table_id
-      WHERE t.table_name = 'cool_animals';
-   weight > 1000
-   
-   1 row
-
-Triggering a cleanup
-^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: psql
-
-   -- Chunk reorganization (aka SWEEP)
-   farm=> SELECT CLEANUP_CHUNKS('public','cool_animals');
-   executed
-
-   -- Delete leftover files (aka VACUUM)
-   farm=> SELECT CLEANUP_EXTENTS('public','cool_animals');
-   executed
-   
-   
-   farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp
-      JOIN sqream_catalog.tables t
-      ON dp.table_id = t.table_id
-      WHERE t.table_name = 'cool_animals';
-   
-   0 rows
-
-
-
-Best practices for data deletion
-=====================================
-
-* Run ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` after large ``DELETE`` operations.
-
-* When deleting large proportions of data from very large tables, consider running a ``CREATE TABLE AS`` operation instead, then rename and drop the original table.
-
-* Avoid killing ``CLEANUP_EXTENTS`` operations after they've started.
-
-* SQream DB is optimised for time-based data. When data is naturally ordered by a date or timestamp, deleting based on those columns will perform best. For more information, see our :ref:`time based data management guide`.
-
-
-
-.. soft update concept
-
-.. delete cleanup and it's properties. automatic/manual, in transaction or background
-
-.. automatic background gives fast delete, minimal transaction overhead,
-.. small cost to queries until background reorganised
-
-.. when does delete use the metadata effectively
-
-.. more examples
-
diff --git a/feature_guides/flexible_data_clustering.rst b/feature_guides/flexible_data_clustering.rst
deleted file mode 100644
index ce0f3d321..000000000
--- a/feature_guides/flexible_data_clustering.rst
+++ /dev/null
@@ -1,16 +0,0 @@
-.. _flexible_data_clustering:
-
-***********************
-Flexible Data Clustering
-***********************
-The **Flexible Data Clustering** section describes the following:
-
-.. toctree::
-   :maxdepth: 4
-   :titlesonly:
-
-   flexible_data_clustering_overview
-   flexible_data_clustering_chunks
-   flexible_data_clustering_data_clustering_methods
-   flexible_data_clustering_data_rechunking_data
-   flexible_data_clustering_data_examples
\ No newline at end of file
diff --git a/feature_guides/flexible_data_clustering_chunks.rst b/feature_guides/flexible_data_clustering_chunks.rst
deleted file mode 100644
index b8146d0fc..000000000
--- a/feature_guides/flexible_data_clustering_chunks.rst
+++ /dev/null
@@ -1,18 +0,0 @@
-.. _flexible_data_clustering_chunks:
-
-***********************
-What are Chunks?
-***********************
-Chunks, sometimes referred to as **partitions**, are a contiguous number of rows in a specific column. SQream relies on an advanced partitioning method called **chunking**, which provides all static partitioning capabilities without the known limitations.
-
-The following figure shows a table rows grouped as chunks:
-
-.. figure:: /_static/images/chunking2.png
-   :scale: 75 %
-   :align: center
-   
-The following figure shows the rows from the table above converted into chunks:
-   
-.. figure:: /_static/images/chunking_metadata2.png
-   :scale: 75 %
-   :align: center
\ No newline at end of file
diff --git a/feature_guides/flexible_data_clustering_data_clustering_methods.rst b/feature_guides/flexible_data_clustering_data_clustering_methods.rst
deleted file mode 100644
index 347be830a..000000000
--- a/feature_guides/flexible_data_clustering_data_clustering_methods.rst
+++ /dev/null
@@ -1,180 +0,0 @@
-.. _flexible_data_clustering_data_clustering_methods:
-
-***********************
-Data Clustering Methods
-***********************
-The following data clustering methods can be used in tandem or separately to enhance query performance:
-
-.. contents:: 
-   :local:
-   :depth: 1
-   
-Using Time-Based Data Management
-============
-Overview
-~~~~~~~~~~
-**Time-based data management** refers to sorting table data along naturally occuring dimensions. The most common and naturally occuring sorting mechanism is a **timestamp**, which indicates the point in time at which data was inserted into SQream. Because SQream is a columnar storage system, timestamped metadata facilitates quick and easy query processing.
-
-The following is the correct syntax for timestamping a chunk:
-
-.. code-block:: postgres
-
-   SELECT DATEPART(HOUR, timestamp),
-          MIN(transaction_amount),
-          MAX(transaction_amount),
-          avg(transaction_amount)
-   FROM transactions
-   WHERE timestamp BETWEEN (CURRENT_TIMESTAMP AND DATEADD(MONTH,-3,CURRENT_TIMESTAMP))
-   GROUP BY 1;
-
-Timestamping data includes the following properties:
-
-* Data is loaded in a natural order while being inserted.
-
-   ::
-   
-* Updates are infrequent or non-existent. Updates occur by inserting new rows, which have their own timestamps.
-
-   ::
-   
-* Queries on timestamped data is typically on continuous time range.
-
-   ::
-   
-* Inserting and reading data are performed independently, not in the operation or transaction.
-
-   ::
-  
-* Timestamped data has a high data volume and accumulates faster than typical online transactional processing workloads.
-
-The following are some scenarios ideal for timestamping:
-
-* Running analytical queries spanning specific date ranges (such as the sum of transactions during August-July 2020 versus August-July 2019).
-
-   ::
-   
-* Deleting data older than a specific number of months old.
-
-   ::
-
-* Regulations require you to maintain several years of data that you do not need to query on a regular basis.
-
-Best Practices for Time-Based Management
-~~~~~~~~~~
-Data inserted in bulks is automatically timestamped with the insertion date and time. Therefore, inserting data through small and frequent bulks has the effect of naturally ordering data according to timestamp. Frequent bulks generally refers to short time frames, such as at 15-minute, hourly, or daily intervals. As you insert new data, SQream chunks and appends it into your existing tables according to its timestamp.
-
-The ``DATE`` and ``DATETIME`` types were created to improve performance, minimze storage size, and maintain data integrity. SQream recommends using them instead of ``VARCHAR``.
-
-Using Clustering Keys
-============
-Overview
-~~~~~~~~~~
-While data clustering occurs relatively naturally within a table, certain practices can be used to actively enhance query performance and runtime. Defining **clustering keys** increases performance by explicitly co-locating your data, enabling SQream to avoid processing irrelevant chunks.
-
-A clustering key is a subset of table columns or expressions and is defined using the ``CLUSTER BY`` statement, as shown below:
-
-.. code-block:: postgres
-     
-   CREATE TABLE users (
-      name VARCHAR(30) NOT NULL,
-      start_date datetime not null,
-      country VARCHAR(30) DEFAULT 'Unknown' NOT NULL
-   ) CLUSTER BY country;
-   
-
-   
-The ``CLUSTER BY`` statement splits ingested data based on the range of data corresponding to the clustering key. This helps create chunks based on specific or related data, avoiding mixed chunks as much as possible. For example, instead of creating chunks based on a fixed number of rows, the ``CLUSTER_BY`` statement creates them based on common values. This optimizes the ``DELETE`` command as well, which deletes rows based on their location in a table.
-
-For more information, see the following:
-
-* `The CLUSTER_BY statement `_
-* `The DELETE statement `_
-* `The Deleting Data Guide `_
-
-Inspecting Clustered Table Health
-~~~~~~~~~~
-You can use the ``clustering_health`` utility function to check how well a table is clustered, as shown below:
-
-.. code-block:: postgres
-
-   SELECT CLUSTERING_HEALTH('table_name','clustering_keys');
-   
-The ``CLUSTERING_HEALTH`` function returns the average clustering depth of your table relative to the clustering keys. A lower value indicates a well-clustered table.
-
-Clustering keys are useful for restructuring large tables not optimally ordered when inserted or as a result of extensive DML. A table that uses clustering keys is referred to as a **clustered table**. Tables that are not clustered require SQream's query optimizer to scan entire tables while running queries, dramatically increasing runtime. Some queries significantly benefit from clustering, such as filtering or joining extensively on clustered columns.
-
-SQream partially sorts data that you load into a clustered table. Note that while clustering tables increases query performance, clustering during the insertion stage can decrease performance by 75%. Nevertheless, once a table is clustered subsequent queries run more quickly.
-
-.. note:: 
-
-   To determine whether clustering will enhance performance, SQream recommends end-to-end testing your clustering keys on a small subset of your data before committing them to permanent use. This is relevant for testing insert and query performance.   
-
-For more information, see the following:
-
-* **Data Manipulation commands (DML)** - see `Data Manipulation Commands (DML) `_.
-
-* **Creating tables** - see :ref:`create_table`. When you create a table, all new data is clustered upon insert.
-   
-* **Modifying tables** - see :ref:`cluster_by`.
-   
-* **Modifying a table schema** - see :ref:`alter_table`.
-
-Using Metadata
-============
-SQream uses an automated and transparent system for collecting metadata describing each chunk. This metadata enables skipping unnecessary chunks and extents during query runtime. The system collects chunk metadata when data is inserted into SQream. This is done by splitting data into chunks and collecting and storing specific parameters to be used later.
-
-Because collecting metadata is not process-heavy and does not contribute significantly to query processing, it occurs continuously as a background process. Most metadata collection is typically performed by the GPU. For example, for a 10TB dataset, the metadata storage overhead is approximately 0.5GB.
-
-When a query includes a filter (such as a ``WHERE`` or ``JOIN`` condition) on a range of values spanning a fraction of the table values, SQream scans only the filtered segment of the table.
-
-Once collected, several metadata parameters are stored for later use, including:
- 
-* The range of values on each column chunk (minimum, maximum).
-
-   ::
- 
-* The number of values.
-
-   ::
- 
-* Additional information for query optimization.
-
-Data is collected automatically and transparently on every column type.
-
-Queries filtering highly granular date and time ranges are the most effective, particularly when data is timestamped, and when tables contain a large amount of historical data.
-
-Using Chunks and Extents
-============
-SQream stores data in logical tables made up of rows spanning one or more columns. Internally, data is stored in vertical partitions by column, and horizontally by chunks. The **Using Chunks and Extents** section describes how to leverge chunking to optimize query performance.
-
-A **chunk** is a contiguous number of rows in a specific column. Depending on data type, a chunk's uncompressed size typically ranges between 1MB and a few hundred megabytes. This size range is suitable for filtering and deleting data from large tables, which may contain between hundreds, millions, or billions of chunks.
-   
-An **extent** is a specific number of contiguous chunks. Extents optimize disk access patterns, at around 20MB uncompressed, on-disk. Extents typically include between one and 25 chunks based on the compressed size of each chunk.
-
-.. note:: 
-
-   SQream compresses all data. In addition, all tables are automatically and transparently chunked.
-
-Unlike node-partitioning (or sharding), chunks are:
-
-* Small enough to be read concurrently by multiple workers.
-
-   ::
-   
-* Optimized for inserting data quickly.
-
-   ::
-  
-* Capable of carrying metadata, which narrows down their contents for the query optimizer.
-
-   ::
- 
-* Ideal for data retension because they can be deleted in bulk.
-
-   ::
- 
-* Optimized for reading into RAM and the GPU.
-
-   ::
- 
-* Compressed individually to improve compression and data locality.
\ No newline at end of file
diff --git a/feature_guides/flexible_data_clustering_data_examples.rst b/feature_guides/flexible_data_clustering_data_examples.rst
deleted file mode 100644
index 0c720ff04..000000000
--- a/feature_guides/flexible_data_clustering_data_examples.rst
+++ /dev/null
@@ -1,22 +0,0 @@
-.. _flexible_data_clustering_data_examples:
-
-***********************
-Examples
-***********************
-The **Examples** includes the following examples:
-
-.. contents:: 
-   :local:
-   :depth: 1
-   
-Creating a Clustered Table
------------------------------
-The following is an example of syntax for creating a clustered table on a table naturally ordered by ``start_date``. An alternative cluster key can be defined on such a table to improve performance on queries already ordered by ``country``:
-
-.. code-block:: postgres
-
-   CREATE TABLE users (
-      name VARCHAR(30) NOT NULL,
-      start_date datetime not null,
-      country VARCHAR(30) DEFAULT 'Unknown' NOT NULL
-   ) CLUSTER BY country;
\ No newline at end of file
diff --git a/feature_guides/flexible_data_clustering_data_rechunking_data.rst b/feature_guides/flexible_data_clustering_data_rechunking_data.rst
deleted file mode 100644
index 30a74bbaa..000000000
--- a/feature_guides/flexible_data_clustering_data_rechunking_data.rst
+++ /dev/null
@@ -1,11 +0,0 @@
-.. _flexible_data_clustering_data_rechunking_data:
-
-***********************
-Rechunking Data
-***********************
-SQream performs background storage reorganization operations to optimize I/O and read patterns.
-
-For example, when small batches of data are inserted, SQream runs two background processes called **rechunk** and **reextent** to reorganize the data into larger contiguous chunks and extents. This is also what happens when data is deleted.
-
-
-Instead of overwriting data, SQream writes new optimized chunks and extents to replace old ones. After rewriting all old data, SQream switches to the new optimized chunks and extents and deletes the old data.
\ No newline at end of file
diff --git a/feature_guides/flexible_data_clustering_overview.rst b/feature_guides/flexible_data_clustering_overview.rst
deleted file mode 100644
index 3ba59a603..000000000
--- a/feature_guides/flexible_data_clustering_overview.rst
+++ /dev/null
@@ -1,16 +0,0 @@
-.. _flexible_data_clustering_overview:
-
-***********************
-Overeview
-***********************
-**Flexible data clustering** refers to sorting table data along naturally occuring dimensions, such as name, date, or location. Data clustering optimizes table structure to significantly improve query performance, especially on very large tables. A well-clustered table increases the effectivity of the metadata collected by focusing on a specific and limited range of rows, called **chunks**.
-
-The following are some scenarios ideal for data clustering:
-
-* Queries containg a ``WHERE`` predicate written as ``column COMPARISON value``, such as ``date_column > '2019-01-01'`` or ``id = 107`` when the columns referenced are clustering keys.
-
-  In such a case SQream reads the portion of data that contain values matching these predicates only.
-
-* Two clustered tables joined by their respective clustering keys.
-
-  In such a case SQream uses metadata to more easily identify matching chunks.
\ No newline at end of file
diff --git a/feature_guides/index.rst b/feature_guides/index.rst
index a0a996fc8..c94069c6b 100644
--- a/feature_guides/index.rst
+++ b/feature_guides/index.rst
@@ -1,23 +1,18 @@
 .. _feature_guides:
 
-***********************
+**************
 Feature Guides
-***********************
-The **Feature Guides** section describes background processes that SQream uses to manage several areas of operation, such as data ingestion, load balancing, and access control. 
+**************
 
-This section describes the following features: 
+The **Feature Guides** section describes background processes that SQreamDB uses to manage several areas of operation, such as data ingestion, load balancing, and access control. 
 
 .. toctree::
    :maxdepth: 1
-   :titlesonly:
+   :titlesonly:  
 
-   delete_guide
+   query_healer
    compression
-   flexible_data_clustering
    python_functions
-   saved_queries
-   viewing_system_objects_as_ddl
    workload_manager
-   transactions
    concurrency_and_locks
-   concurrency_and_scaling_in_sqream
\ No newline at end of file
+   data_encryption
\ No newline at end of file
diff --git a/feature_guides/python_functions.rst b/feature_guides/python_functions.rst
index 3717cdcd8..4df1a1eaa 100644
--- a/feature_guides/python_functions.rst
+++ b/feature_guides/python_functions.rst
@@ -1,92 +1,37 @@
 .. _python_functions:
 
 *************************************
-Python UDF (User-Defined Functions)
+Python User-Defined Functions
 *************************************
 
-User-defined functions (UDFs) are a feature that extends SQream DB's built in SQL functionality. SQream DB's Python UDFs allow developers to create new functionality in SQL by writing the lower-level language implementation in Python. 
+User-Defined Functions (UDFs) extend SQreamDB's built-in SQL functionality. They streamline statements: you create a function once, store it in the database, and call it any number of times within a statement. UDFs can also be shared among roles, so a database administrator can create them for other roles to use. Finally, they simplify downstream code, because a UDF can be modified in SQreamDB independently of program source code.
 
-.. contents:: In this topic:
-   :local:
-
-A simple example
-=====================
-
-Most databases have an :ref:`UPPER` function, including SQream DB. However, assume that this function is missing for the sake of this example.
-
-You can write a function in Python to uppercase a text value using the :ref:`create_function` syntax.
-
-.. code-block:: postgres
-
-   CREATE FUNCTION my_upper (x1 text)
-     RETURNS text
-     AS $$  
-   return x1.upper()
-   $$ LANGUAGE PYTHON;
-
-Let's break down this example:
-
-* ``CREATE FUNCTION my_upper`` - :ref:`Create a function` called ``my_upper``. This name must be unique in the current database
-* ``(x1 text)`` - the function accepts one argument named ``x1`` which is of the SQL type ``TEXT``. All :ref:`data types` are supported.
-* ``RETURNS text`` - the function returns the same type - ``TEXT``. All :ref:`data types` are supported.
-* ``AS $$`` - what follows is some code that we don't want to quote, so we use dollar-quoting (``$$``) instead of single quotes (``'``).
-* ``return x1.upper()`` - the Python function's body is the argument named ``x1``, uppercased.
-* ``$$ LANGUAGE PYTHON`` - this is the end of the function, and it's in the Python language.
-
-.. rubric:: Running this example
-
-After creating the function, you can use it in any SQL query.
 
-For example:
-
-.. code-block:: psql
-   
-   master=>CREATE TABLE jabberwocky(line text);
-   executed
-   master=> INSERT INTO jabberwocky VALUES 
-   .   ('''Twas brillig, and the slithy toves '), ('      Did gyre and gimble in the wabe: ')
-   .   ,('All mimsy were the borogoves, '), ('      And the mome raths outgrabe. ')
-   .   ,('"Beware the Jabberwock, my son! '), ('      The jaws that bite, the claws that catch! ')
-   .   ,('Beware the Jubjub bird, and shun '), ('      The frumious Bandersnatch!" ');
-   executed
-   master=> SELECT line, my_upper(line) FROM jabberwocky;
-   line                                             | my_upper                                        
-   -------------------------------------------------+-------------------------------------------------
-   'Twas brillig, and the slithy toves              | 'TWAS BRILLIG, AND THE SLITHY TOVES             
-         Did gyre and gimble in the wabe:           |       DID GYRE AND GIMBLE IN THE WABE:          
-   All mimsy were the borogoves,                    | ALL MIMSY WERE THE BOROGOVES,                   
-         And the mome raths outgrabe.               |       AND THE MOME RATHS OUTGRABE.              
-   "Beware the Jabberwock, my son!                  | "BEWARE THE JABBERWOCK, MY SON!                 
-         The jaws that bite, the claws that catch!  |       THE JAWS THAT BITE, THE CLAWS THAT CATCH! 
-   Beware the Jubjub bird, and shun                 | BEWARE THE JUBJUB BIRD, AND SHUN                
-         The frumious Bandersnatch!"                |       THE FRUMIOUS BANDERSNATCH!"               
-
-Why use UDFs?
-=====================
+.. contents::
+   :local:
+   :depth: 1
 
-* They allow simpler statements - You can create the function once, store it in the database, and call it any number of times in a statement.
+Before You Begin
+=================
 
-* They can be shared - UDFs can be created by a database administrator, and then used by other roles.
+* Ensure you have Python 3.11 or newer installed
 
-* They can simplify downstream code - UDFs can be modified in SQream DB independently of program source code.
+.. note::  This feature is deprecated as of Q3 2025 and will be replaced by an enhanced implementation. Please consult SQreamDB support for usage guidance.
 
-SQream DB's UDF support
+SQreamDB's UDF Support
 =============================
 
-Scalar functions
+Scalar Functions
 ---------------------
 
-SQream DB's UDFs are scalar functions. This means that the UDF returns a single data value of the type defined in the ``RETURNS`` clause. For an inline scalar function, the returned scalar value is the result of a single statement.
+SQreamDB's UDFs are scalar functions. This means that the UDF returns a single data value of the type defined in the ``RETURNS`` clause. For an inline scalar function, the returned scalar value is the result of a single statement.
 
 Python
--------------------
+---------
 
-At this time, SQream DB's UDFs are supported for Python.
+Python is installed alongside SQreamDB, for use exclusively by SQreamDB. You may have a different version of Python installed on your server.
 
-Python 3.6.7 is installed alongside SQream DB, for use exclusively by SQream DB.
-You may have a different version of Python installed on your server.
-
-To find which version of Python is installed for use by SQream DB, create and run this UDF:
+To find which version of Python is installed for use by SQreamDB, create and run this UDF:
 
 .. code-block:: psql
    
@@ -98,19 +43,18 @@ To find which version of Python is installed for use by SQream DB, create and ru
    .  $$ LANGUAGE PYTHON;
    executed
    master=> SELECT py_version();
-   py_version                                                                           
-   -------------------------------------------------------------------------------------
-   Python version: 3.6.7 (default, Jul 22 2019, 11:03:54) [GCC 5.4.0].
-   Path: /opt/sqream/python-3.6.7-5.4.0
+   "Python version: 3.11.7 (main, Dec 22 2024, 18:29:20) [GCC 11.1.0]. Path: /usr/local"
 
-Using modules
+Using Modules
 ---------------------
 
 To import a Python module, use the standard ``import`` syntax in the first lines of the user-defined function.
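+
+For example, here is a minimal sketch of a UDF that imports the standard ``math`` module (the function name ``my_sqrt`` and the ``DOUBLE`` argument type are illustrative assumptions, following the :ref:`create_function` syntax used elsewhere on this page):
+
+.. code-block:: postgres
+
+   CREATE FUNCTION my_sqrt (x1 double)
+     RETURNS double
+     AS $$
+   import math
+   return math.sqrt(x1)
+   $$ LANGUAGE PYTHON;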
 
+Working with Existing UDFs
+===========================
 
-Finding existing UDFs in the catalog
-========================================
+Finding Existing UDFs in the Catalog
+----------------------------------------
 
 The ``user_defined_functions`` catalog view contains function information.
 
@@ -124,8 +68,8 @@ Here's how you'd list all UDFs in the system:
    master        |           1 | my_upper  
 
 
-Getting the DDL for a function
-=====================================
+Getting Function DDL
+----------------------
 
 .. code-block:: psql
 
@@ -140,12 +84,12 @@ Getting the DDL for a function
 
 See :ref:`get_function_ddl` for more information.
 
-Error handling
-=====================
+Handling Errors
+-----------------
 
 In UDFs, any error that occurs causes the execution of the function to stop. This in turn causes the statement that invoked the function to be canceled.
 
-Permissions and sharing
+Permissions and Sharing
 ============================
 
 To create a UDF, the creator needs the ``CREATE FUNCTION`` permission at the database level.
@@ -154,7 +98,7 @@ For example, to grant ``CREATE FUNCTION`` to a non-superuser role:
 
 .. code-block:: postgres
    
-   GRANT CREATE FUNCTION ON DATABASE master TO mjordan;
+   GRANT CREATE FUNCTION ON DATABASE master TO role1;
 
 To execute a UDF, the role needs the ``EXECUTE FUNCTION`` permission for every function. 
 
@@ -168,9 +112,62 @@ For example, to grant the permission to the ``r_bi_users`` role group, run:
 
 See more information about permissions in the :ref:`Access control guide`.
 
+Example
+=========
+
+Most databases have an :ref:`UPPER` function, including SQreamDB. However, assume that this function is missing for the sake of this example.
+
+You can write a function in Python to uppercase a text value using the :ref:`create_function` syntax.
+
+.. code-block:: postgres
+
+   CREATE FUNCTION my_upper (x1 text)
+     RETURNS text
+     AS $$  
+   return x1.upper()
+   $$ LANGUAGE PYTHON;
+
+Let's break down this example:
+
+* ``CREATE FUNCTION my_upper`` - :ref:`Create a function` called ``my_upper``. This name must be unique in the current database.
+* ``(x1 text)`` - the function accepts one argument named ``x1`` which is of the SQL type ``TEXT``. All :ref:`data types` are supported.
+* ``RETURNS text`` - the function returns the same type - ``TEXT``. All :ref:`data types` are supported.
+* ``AS $$`` - what follows is some code that we don't want to quote, so we use dollar-quoting (``$$``) instead of single quotes (``'``).
+* ``return x1.upper()`` - the Python function's body is the argument named ``x1``, uppercased.
+* ``$$ LANGUAGE PYTHON`` - this is the end of the function, and it's in the Python language.
+
+.. rubric:: Running this example
+
+After creating the function, you can use it in any SQL query.
+
+For example:
+
+.. code-block:: psql
+   
+   master=>CREATE TABLE jabberwocky(line text);
+   executed
+   master=> INSERT INTO jabberwocky VALUES 
+   .   ('''Twas brillig, and the slithy toves '), ('      Did gyre and gimble in the wabe: ')
+   .   ,('All mimsy were the borogoves, '), ('      And the mome raths outgrabe. ')
+   .   ,('"Beware the Jabberwock, my son! '), ('      The jaws that bite, the claws that catch! ')
+   .   ,('Beware the Jubjub bird, and shun '), ('      The frumious Bandersnatch!" ');
+   executed
+   master=> SELECT line, my_upper(line) FROM jabberwocky;
+   line                                             | my_upper                                        
+   -------------------------------------------------+-------------------------------------------------
+   'Twas brillig, and the slithy toves              | 'TWAS BRILLIG, AND THE SLITHY TOVES             
+         Did gyre and gimble in the wabe:           |       DID GYRE AND GIMBLE IN THE WABE:          
+   All mimsy were the borogoves,                    | ALL MIMSY WERE THE BOROGOVES,                   
+         And the mome raths outgrabe.               |       AND THE MOME RATHS OUTGRABE.              
+   "Beware the Jabberwock, my son!                  | "BEWARE THE JABBERWOCK, MY SON!                 
+         The jaws that bite, the claws that catch!  |       THE JAWS THAT BITE, THE CLAWS THAT CATCH! 
+   Beware the Jubjub bird, and shun                 | BEWARE THE JUBJUB BIRD, AND SHUN                
+         The frumious Bandersnatch!"                |       THE FRUMIOUS BANDERSNATCH!"               
+
+
 
-Best practices
-=====================
+Best Practices
+===============
 
 Although user-defined functions add flexibility, they may have some performance drawbacks. They are not usually a replacement for subqueries or views.
 
diff --git a/feature_guides/query_healer.rst b/feature_guides/query_healer.rst
new file mode 100644
index 000000000..a619a4dbb
--- /dev/null
+++ b/feature_guides/query_healer.rst
@@ -0,0 +1,67 @@
+.. _query_healer:
+
+************
+Query Healer
+************
+ 
+
+The **Query Healer** periodically examines the progress of running statements and connections, creating a log entry for every statement deemed stuck (one that has made no progress for a defined time period) and for every connection with no data transfer over a specified time.
+It can also take action based on its findings for two kinds of issue: a stuck query or a hung connection.
+The Query Healer runs on a separate thread of each worker, enabling it to take action even when the worker it runs on has a problem.
+
+Configuration
+-------------
+
+The following worker-level flags configure the Query Healer:
+
+.. list-table:: 
+   :widths: auto
+   :header-rows: 1
+
+   * - Flag
+     - Description
+   * - ``isHealerOn``
+     - The :ref:`is_healer_on` enables and disables the Query Healer.
+   * - ``healerDetectionFrequencySeconds``
+     - The :ref:`healer_detection_frequency_seconds` triggers the healer to examine the progress of running statements. The default setting is one hour.
+   * - ``maxStatementInactivitySeconds``
+     - The :ref:`max_statement_inactivity_seconds` defines the threshold for logging a slow statement. The log includes information about memory, CPU, and GPU usage. If a statement makes no progress during this time, it is considered stuck. The default setting is five hours.
+   * - ``healerRunActionAutomatically``
+     - The :ref:`healer_run_action_automatically` triggers the healer to take action once it detects a problem. For the healer to take automatic corrective action, both this flag and the flag that relates to the detected problem must be set to ``true``. The default setting is true.
+   * - ``healerActionGracefulShutdown``
+     - The :ref:`healer_action_graceful_shutdown` triggers the healer to restart a stuck worker automatically (both this flag and ``healerRunActionAutomatically`` must be ``true``). The default setting is false.
+   * - ``healerActionCleanupConnection``
+     - The :ref:`healer_action_cleanup_connection` triggers the healer to close a hung connection automatically (both this flag and ``healerRunActionAutomatically`` must be ``true``). The default setting is true.
+
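+A minimal sketch of how these flags might appear in a worker configuration file (the placement is an assumption based on the worker configuration examples elsewhere in this documentation set; the detection frequency is shown at its one-hour default and the inactivity threshold at its five-hour default):
+
+.. code-block:: json
+
+   {
+     "isHealerOn": true,
+     "healerDetectionFrequencySeconds": 3600,
+     "maxStatementInactivitySeconds": 18000,
+     "healerRunActionAutomatically": true,
+     "healerActionGracefulShutdown": false,
+     "healerActionCleanupConnection": true
+   }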
+
+Query Log
+---------
+
+The following is an example of a log record for a query stuck in the query detection phase for more than five hours:
+
+.. code-block:: console
+
+   |INFO|0x00007f9a497fe700:Healer|192.168.4.65|5001|-1|master|sqream|-1|sqream|0|"[ERROR]|cpp/SqrmRT/healer.cpp:140 |"Stuck query found. Statement ID: 72, Last chunk producer updated: 1.
+
+Once you identify the stuck worker, you can execute the ``shutdown_server`` utility function from this specific worker, as described in the next section.
+
+Activating a Graceful Shutdown
+------------------------------
+
+You can activate a graceful shutdown if your log entry says ``Stuck query found``, as shown in the example above. You do this by executing the ``shutdown_server`` utility function: ``SELECT SHUTDOWN_SERVER();``.
+
+**To activate a graceful shutdown:**
+
+1. Locate the IP and the Port of the stuck worker from the logs.
+
+   .. note:: The log in the previous section identifies the IP (**192.168.4.65**) and port (**5001**) of the worker running the stuck query.
+
+2. From the machine of the stuck query (IP: **192.168.4.65**, port: **5001**), connect to the SQream SQL client:
+
+   .. code-block:: console
+
+      ./sqream sql --port=$STUCK_WORKER_PORT --username=$SQREAM_USER --password=$SQREAM_PASSWORD --databasename=$SQREAM_DATABASE
+
+3. Execute ``shutdown_server``.
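+
+   For example, from the client session opened in step 2 (the ``master=>`` prompt is illustrative):
+
+   .. code-block:: psql
+
+      master=> SELECT SHUTDOWN_SERVER();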
+
+For more information, see the :ref:`shutdown_server_command` utility function, which describes all of the ``shutdown_server`` options.
diff --git a/feature_guides/sqream_installer_cli_ref_admin.rst b/feature_guides/sqream_installer_cli_ref_admin.rst
deleted file mode 100644
index fad8107b9..000000000
--- a/feature_guides/sqream_installer_cli_ref_admin.rst
+++ /dev/null
@@ -1,142 +0,0 @@
-.. _sqream_installer_cli_ref_admin:
-
-*********************************
-SQream Installer
-*********************************
-``sqream-installer`` is an application that prepares and configures a dockerized SQream DB installation.
-
-This page serves as a reference for the options and parameters. 
-
-.. contents:: In this topic:
-   :local:
-
-
-Operations and flag reference
-===============================
-
-Command line flags
------------------------
-
-.. list-table:: 
-   :widths: auto
-   :header-rows: 1
-   
-   * - Flag
-     - Description
-   * - ``-i``
-     - Loads the docker images for installation
-   * - ``-k``
-     - Load new licenses from the ``license`` subdirectory
-   * - ``-K``
-     - Validate licenses
-   * - ``-f``
-     - Force overwrite any existing installation **and data directories currently in use**
-   * - ``-c ``
-     - Specifies a path to read and store configuration files in. Defaults to ``/etc/sqream``.
-   * - ``-v ``
-     - Specifies a path to the storage cluster. The path is created if it does not exist.
-   * - ``-l ``
-     - Specifies a path to store system startup logs. Defaults to ``/var/log/sqream``
-   * - ``-d ``
-     - Specifies a path to expose to SQream DB workers. To expose several paths, repeat the usage of this flag.
-   * - ``-s``
-     - Shows system settings
-   * - ``-r``
-     - Reset the system configuration. This flag can't be combined with other flags.
-
-Usage
-=============
-
-Install SQream DB for the first time
-----------------------------------------
-
-Assuming license package tarball has been placed in the ``license`` subfolder.
-
-* The path where SQream DB will store data is ``/home/rhendricks/sqream_storage``.
-
-* Logs will be stored in /var/log/sqream
-
-* Source CSV, Parquet, and ORC files can be accessed from ``/home/rhendricks/source_data``. All other directory paths are hidden from the Docker container.
-
-.. code-block:: console
-   
-   # ./sqream-install -i -k -v /home/rhendricks/sqream_storage -l /var/log/sqream -c /etc/sqream -d /home/rhendricks/source_data
-
-.. note:: Installation commands should be run with ``sudo`` or root access.
-
-Modify exposed directories
--------------------------------
-
-To expose more directory paths for SQream DB to read and write data from, re-run the installer with additional directory flags.
-
-.. code-block:: console
-   
-   # ./sqream-install -d /home/rhendricks/more_source_data
-
-There is no need to specify the initial installation flags - only the modified exposed directory paths flag.
-
-
-Install a new license package
-----------------------------------
-
-Assuming license package tarball has been placed in the ``license`` subfolder.
-
-.. code-block:: console
-   
-   # ./sqream-install -k
-
-View system settings
-----------------------------
-
-This information may be useful to identify problems accessing directory paths, or locating where data is stored.
-
-.. code-block:: console
-   
-   # ./sqream-install -s
-   SQREAM_CONSOLE_TAG=1.7.4
-   SQREAM_TAG=2020.1
-   SQREAM_EDITOR_TAG=3.1.0
-   license_worker_0=[...]
-   license_worker_1=[...]
-   license_worker_2=[...]
-   license_worker_3=[...]
-   SQREAM_VOLUME=/home/rhendricks/sqream_storage
-   SQREAM_DATA_INGEST=/home/rhendricks/source_data
-   SQREAM_CONFIG_DIR=/etc/sqream/
-   LICENSE_VALID=true
-   SQREAM_LOG_DIR=/var/log/sqream/
-   SQREAM_USER=sqream
-   SQREAM_HOME=/home/sqream
-   SQREAM_ENV_PATH=/home/sqream/.sqream/env_file
-   PROCESSOR=x86_64
-   METADATA_PORT=3105
-   PICKER_PORT=3108
-   NUM_OF_GPUS=8
-   CUDA_VERSION=10.1
-   NVIDIA_SMI_PATH=/usr/bin/nvidia-smi
-   DOCKER_PATH=/usr/bin/docker
-   NVIDIA_DRIVER=418
-   SQREAM_MODE=single_host
-
-
-.. _upgrade_with_docker:
-
-Upgrading to a new version of SQream DB
-----------------------------------------------
-
-When upgrading to a new version with Docker, most settings don't need to be modified.
-
-The upgrade process replaces the existing docker images with new ones.
-
-#. Obtain the new tarball, and untar it to an accessible location. Enter the newly extracted directory.
-
-#. 
-   Install the new images
-   
-   .. code-block:: console
-   
-      # ./sqream-install -i
-
-#. The upgrade process will check for running SQream DB processes. If any are found running, the installer will ask to stop them in order to continue the upgrade process. Once all services are stopped, the new version will be loaded. 
-
-#. After the upgrade, open :ref:`sqream_console_cli_reference` and restart the desired services.
\ No newline at end of file
diff --git a/feature_guides/transactions.rst b/feature_guides/transactions.rst
deleted file mode 100644
index 675957d51..000000000
--- a/feature_guides/transactions.rst
+++ /dev/null
@@ -1,17 +0,0 @@
-.. _transactions:
-
-***********************
-Transactions
-***********************
-
-SQream DB supports serializable transactions. This is also called 'ACID compliance'. 
-
-The implementation of transactions means that commit, rollback and recovery are all extremely fast.
-
-SQream DB has extremely fast bulk insert speed, with minimal slowdown when running concurrent inserts. There is no performance reason to break large inserts up into multiple transactions.
-
-The phrase "supporting transactions" for a database system sometimes means having good performance for OLTP workloads, SQream DB's transaction system does not have high performance for high concurrency OLTP workloads.
-
-SQream DB also supports :ref:`transactional DDL`.
-
-
diff --git a/feature_guides/workload_manager.rst b/feature_guides/workload_manager.rst
index c8eb2468d..aa18c4823 100644
--- a/feature_guides/workload_manager.rst
+++ b/feature_guides/workload_manager.rst
@@ -4,16 +4,11 @@
 Workload Manager
 ***********************
 
-The **Workload Manager** allows SQream DB workers to identify their availability to clients with specific service names. The load balancer uses that information to route statements to specific workers. 
-
-Overview
-===============================
-
-The Workload Manager allows a system engineer or database administrator to allocate specific workers and compute resoucres for various tasks.
+The Workload Manager enables SQream workers to identify their availability to clients with specific service names, allowing a system engineer or database administrator to allocate specific workers and compute resources for various tasks. The load balancer then uses this information to route statements to the designated workers.
 
 For example:
 
-#. Creating a service queue named ``ETL`` and allocating two workers exclusively to this service prevents non-``ETL`` statements from utilizing these compute resources.
+#. Creating a service queue named ``ETL`` and allocating two workers exclusively to this service prevents non-``ETL`` statements from using these compute resources.
 
 #. Creating a service for the company's leadership during working hours for dedicated access, and disabling this service at night to allow maintenance operations to use the available compute.
 
@@ -60,63 +55,57 @@ The configuration in this example allocates resources as shown below:
      - ✓
      - ✓
 
-This configuration gives the ETL queue dedicated access to two workers, one of which cannot be used by regular queries.
+This configuration gives the ETL queue dedicated access to one worker, which cannot be used by regular queries.
 
 Queries from management use any available worker.
 
 Creating the Configuration
 -----------------------------------
 
-The persistent configuration for this set-up is listed in the four configuration files shown below.
-
-Each worker gets a comma-separated list of service queues that it subscribes to. These services are specified in the ``initialSubscribedServices`` attribute.
-
 .. code-block:: json
-   :caption: Worker #1
-   :emphasize-lines: 7
 
-   {
-       "compileFlags": {
-       },
-       "runtimeFlags": {
-       },
-       "runtimeGlobalFlags": {
-          "initialSubscribedServices" : "etl,management"
-       },
-       "server": {
-           "gpu": 0,
-           "port": 5000,
-           "cluster": "/home/rhendricks/raviga_database",
-           "licensePath": "/home/sqream/.sqream/license.enc"
-       }
-   }
 
-.. code-block:: json
-   :caption: Workers #2, #3, #4
-   :emphasize-lines: 7
+	{
+	  "cluster": "/home/rhendricks/raviga_database",
+	  "cudaMemQuota": 25,
+	  "gpu": 0,
+	  "maxConnectionInactivitySeconds": 120,
+	  "legacyConfigFilePath": "tzah_legacy.json",
+	  "licensePath": "/home/sqream/.sqream/license.enc",
+	  "metadataServerIp": "192.168.0.103",
+	  "limitQueryMemoryGB": 250,
+	  "machineIP": "192.168.0.103",
+	  "metadataServerPort": 3105,
+	  "port": 5000,
+	  "useConfigIP": true
+	}
 
-   {
-       "compileFlags": {
-       },
-       "runtimeFlags": {
-       },
-       "runtimeGlobalFlags": {
-          "initialSubscribedServices" : "query,management"
-       },
-       "server": {
-           "gpu": 1,
-           "port": 5001,
-           "cluster": "/home/rhendricks/raviga_database",
-           "licensePath": "/home/sqream/.sqream/license.enc"
-       }
-   }
+.. code-block:: json
+   :caption: Legacy File
+
+
+	{
+	  "debugNetworkSession": false,
+	  "diskSpaceMinFreePercent": 1,
+	  "maxNumAutoCompressedChunksThreshold" : 1,
+	  "insertMergeRowsThreshold":40000000,
+	  "insertCompressors": 8,
+	  "insertParsers": 8,
+	  "nodeInfoLoggingSec": 60,
+	  "reextentUse": true,
+	  "separatedGatherThreads": 16,
+	  "showFullExceptionInfo": true,
+	  "spoolMemoryGB":200,
+	  "useClientLog": true,
+	  "useMetadataServer":true
+	}
 
 .. tip:: You can create this configuration temporarily (for the current session only) by using the :ref:`subscribe_service` and :ref:`unsubscribe_service` statements.
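+
+   A minimal sketch of such a temporary change, assuming the utility-function invocation form used by other administrative commands in this documentation (verify the exact syntax against the linked references):
+
+   .. code-block:: postgres
+
+      SELECT SUBSCRIBE_SERVICE('etl');
+      SELECT UNSUBSCRIBE_SERVICE('etl');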
 
 Verifying the Configuration
 -----------------------------------
 
-Use :ref:`show_subscribed_instances` to view service subscriptions for each worker. Use `SHOW_SERVER_STATUS `_ to see the statement queues.
+Use :ref:`show_subscribed_instances` to view service subscriptions for each worker. Use :ref:`SHOW_SERVER_STATUS` to see the statement queues.
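+
+For example (the output shown is an illustrative sketch; the exact columns depend on your cluster):
+
+.. code-block:: psql
+
+   master=> SELECT SHOW_SUBSCRIBED_INSTANCES();
+   etl,192.168.0.103,5000,0
+   management,192.168.0.103,5000,0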
 
 
 
@@ -152,7 +141,7 @@ When using **SQream Studio**, you can configure a client connection to a specifi
 
 
 
-For more information, in Studio, see `Executing Statements from the Toolbar `_.
+For more information, see :ref:`Executing Statements from the Toolbar` in Studio.
 
 
 
diff --git a/getting_started/connecting_clients_to_sqream.rst b/getting_started/connecting_clients_to_sqream.rst
deleted file mode 100644
index a1ed59cf1..000000000
--- a/getting_started/connecting_clients_to_sqream.rst
+++ /dev/null
@@ -1,26 +0,0 @@
-.. _connecting_clients_to_sqream:
-
-****************************
-Connecting Clients to SQream
-****************************
-SQream supports the most common database tools and interfaces, giving you direct access through a variety of drivers, connectors, and visualiztion tools and utilities.
-
-SQream supports the following client platforms:
-
-* Connect to SQream Using SQL Workbench
-* Connecting to SQream Using Tableau
-* Connect to SQream Using Pentaho Data Integration
-* Connect to SQream Using MicroStrategy
-* Connect to SQream Using R
-* Connect to SQream Using PHP
-* Connect to SQream Using SAS Viya
-
-SQream supports the following client drivers:
-
-* JDBC
-* Python (pysqream)
-* Node.JS
-* ODBC
-* C++ Driver
-
-For more information, see `Third Party Tools`_.
\ No newline at end of file
diff --git a/getting_started/creating_a_database.rst b/getting_started/creating_a_database.rst
new file mode 100644
index 000000000..c7378b5c3
--- /dev/null
+++ b/getting_started/creating_a_database.rst
@@ -0,0 +1,37 @@
+:orphan:
+
+.. _creating_a_database:
+
+****************************
+Creating a Database
+****************************
+
+Once you've installed SQreamDB, you can create a database.
+
+**To create a database:**
+
+1. Write a :ref:`create_database` statement.
+
+   The following is an example of creating a new database:
+
+   .. code-block:: psql
+
+      master=> CREATE DATABASE test;
+      executed
+
+2. Reconnect to the newly created database.
+
+   1. Exit the client by typing ``\q`` and pressing **Enter**.
+   2. From the Linux shell, restart the client with the new database name:
+
+      .. code-block:: psql
+
+         $ sqream sql --port=5000 --username=rhendricks -d test
+         Password:
+   
+         Interactive client mode
+         To quit, use ^D or \q.
+   
+         test=> _
+
+   The name of the new database that you are connected to is displayed in the prompt.
\ No newline at end of file
diff --git a/getting_started/creating_your_first_table.rst b/getting_started/creating_your_first_table.rst
index c070b43ed..f02ab3b21 100644
--- a/getting_started/creating_your_first_table.rst
+++ b/getting_started/creating_your_first_table.rst
@@ -3,6 +3,7 @@
 ****************************
 Creating Your First Table
 ****************************
+
 The **Creating Your First Table** section describes the following:
 
 * :ref:`Creating a table`
@@ -21,7 +22,7 @@ The ``CREATE TABLE`` syntax is used to create your first table. This table inclu
 
    CREATE TABLE cool_animals (
       id INT NOT NULL,
-      name VARCHAR(20),
+      name TEXT(20),
       weight INT
    );
 
@@ -37,7 +38,7 @@ You can drop an existing table and create a new one by adding the ``OR REPLACE``
 
    CREATE OR REPLACE TABLE cool_animals (
       id INT NOT NULL,
-      name VARCHAR(20),
+      name TEXT(20),
       weight INT
    );
 
@@ -54,13 +55,13 @@ You can list the full, verbose ``CREATE TABLE`` statement for a table by using t
    test=> SELECT GET_DDL('cool_animals');
    create table "public"."cool_animals" (
    "id" int not null,
-   "name" varchar(20),
+   "name" text(20),
    "weight" int
    );
 
 .. note:: 
 
-   * SQream DB identifier names such as table names and column names are not case sensitive. SQream DB lowercases all identifiers bu default. If you want to maintain case, enclose the identifiers with double-quotes.
+   * SQreamDB identifier names such as table names and column names are not case sensitive. SQreamDB lowercases all identifiers by default. If you want to maintain case, enclose the identifiers in double quotes.
    * SQream DB places all tables in the `public` schema, unless another schema is created and specified as part of the table name.
    
 For information on listing a ``CREATE TABLE`` statement, see :ref:`get_ddl`.
diff --git a/getting_started/deleting_rows.rst b/getting_started/deleting_rows.rst
index 2ff33b8dc..6ce20d489 100644
--- a/getting_started/deleting_rows.rst
+++ b/getting_started/deleting_rows.rst
@@ -3,6 +3,7 @@
 ****************************
 Deleting Rows
 ****************************
+
 The **Deleting Rows** section describes the following:
 
 * :ref:`Deleting selected rows`
@@ -16,17 +17,16 @@ You can delete rows in a table selectively using the ``DELETE`` command. You mus
 
 .. code-block:: psql
 
-   test=> DELETE FROM cool_animals WHERE weight is null;
+   DELETE FROM cool_animals WHERE weight is null;
    
    executed
-   master=> SELECT  * FROM cool_animals;
+   SELECT  * FROM cool_animals;
    1,Dog                 ,7
    2,Possum              ,3
    3,Cat                 ,5
    4,Elephant            ,6500
    5,Rhinoceros          ,2100
 
-   5 rows
 
 .. _deleting_all_rows:
 
@@ -36,9 +36,8 @@ You can delete all rows in a table using the ``TRUNCATE`` command followed by th
 
 .. code-block:: psql
 
-   test=> TRUNCATE TABLE cool_animals;
-   
-   executed
+   TRUNCATE TABLE cool_animals;
+  
 
 .. note:: While :ref:`truncate` deletes data from disk immediately, :ref:`delete` does not physically remove the deleted rows.
 
diff --git a/getting_started/executing_statements_in_sqream.rst b/getting_started/executing_statements_in_sqream.rst
index 26916f1e6..a9fb5e7a5 100644
--- a/getting_started/executing_statements_in_sqream.rst
+++ b/getting_started/executing_statements_in_sqream.rst
@@ -1,10 +1,19 @@
 .. _executing_statements_in_sqream:
 
-****************************
-Executing Statements in SQream
-****************************
-You can execute statements in SQream using one of the following tools:
+********************************
+Executing Statements in SQreamDB
+********************************
 
-* `SQream SQL CLI `_ - a command line interface
-* `SQream Acceleration Studio `_ - an intuitive and easy-to-use interface.
+You may choose one of the following tools for executing statements in SQreamDB:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Tool
+     - Description
+   * - :ref:`SQream SQL CLI `
+     - A command line interface
+   * - :ref:`SQreamDB Acceleration Studio `
+     - An intuitive and easy-to-use interface
 
diff --git a/getting_started/hardware_guide.rst b/getting_started/hardware_guide.rst
new file mode 100644
index 000000000..b1df6d969
--- /dev/null
+++ b/getting_started/hardware_guide.rst
@@ -0,0 +1,233 @@
+.. _hardware_guide:
+
+**************
+Hardware Guide
+**************
+
+The **Hardware Guide** describes the SQreamDB reference architecture, emphasizing the benefits to the technical audience, and provides guidance for end-users on selecting the right configuration for a SQreamDB installation.
+
+.. rubric:: Need help?
+
+This page is intended as a reference for suggested hardware. However, different workloads require different solution sizes. SQreamDB's experienced customer support team can advise on these matters to ensure the best experience.
+
+Visit `SQreamDB's support portal `_ for additional support.
+
+.. contents:: 
+   :local:
+   :depth: 2
+
+
+Cluster Architectures
+=====================
+
+SQreamDB recommends rackmount servers by server manufacturers Dell, Lenovo, HP, Cisco, Supermicro, IBM, and others.
+
+A typical SQreamDB cluster includes one or more nodes, consisting of:
+
+* Two-socket enterprise processors, such as Intel® Xeon® Gold processors or the IBM® POWER9 processors, providing the high performance required for compute-bound database workloads.
+
+* NVIDIA Tesla GPU accelerators, with up to 5,120 CUDA and Tensor cores, running on PCIe or fast NVLINK busses, delivering high core counts and high-throughput performance on massive datasets.
+
+* High density chassis design, offering between 2 and 4 GPUs in a 1U, 2U, or 3U package, for best-in-class performance per cm\ :sup:`2`.
+
+Single-Node Cluster
+-------------------
+
+A single-node SQreamDB cluster can handle between 1 and 8 concurrent users, with up to 1PB of data storage (when connected via NAS).
+
+An average single-node cluster can be a rackmount server or workstation, containing the following components:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Component
+     - Type
+   * - Server
+     - Dell R750, Dell R940xa, HP ProLiant DL380 Gen10 or similar (Intel only)
+   * - Processors
+     - 2x Intel Xeon Gold 6348 (28C/56HT) 3.5GHz or similar
+   * - RAM
+     - 1.5 TB
+   * - Onboard storage
+     - 
+         * 2x 960GB SSD 2.5in hot plug for OS, RAID1
+         * 2x 2TB SSD or NVMe, for temporary spooling, RAID0
+         * 10x 3.84TB SSD 2.5in Hot plug for storage, RAID6
+
+   * - GPU
+     - 
+        2x NVIDIA A100, H100, or L40S
+    
+   * - Operating System
+     - Red Hat Enterprise Linux v8.9 or Amazon Linux 2
+
+.. note:: If you are using internal storage, your volumes must be formatted as XFS.
+
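+A minimal sketch for formatting and mounting such a volume is shown below; the device name ``/dev/sdb`` and the mount point ``/mnt/sqream-storage`` are placeholders for your environment:
+
+.. code-block:: console
+
+   $ # Format the data volume as XFS and mount it
+   $ sudo mkfs.xfs /dev/sdb
+   $ sudo mkdir -p /mnt/sqream-storage
+   $ sudo mount -t xfs /dev/sdb /mnt/sqream-storage
+   $ # Confirm the filesystem type
+   $ df -hT /mnt/sqream-storage
+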
+In this system configuration, SQreamDB can store about 100TB of raw data: RAID6 across the 10x 3.84TB drives leaves roughly 30TB of usable storage (two drives' worth of capacity goes to parity), and an average compression ratio of about 3:1 brings the effective raw-data capacity to approximately 100TB.
+
+If a NAS is used, the 10x SSD drives can be omitted, but SQreamDB recommends 2TB of local spool space on SSD or NVMe drives.
+
+Multi-Node Cluster
+------------------
+
+Multi-node clusters can handle any number of concurrent users. A typical SQreamDB cluster relies on a minimum of two GPU-enabled servers and shared storage connected over a network fabric, such as InfiniBand EDR, 40GbE, or 100GbE.
+
+The following table shows SQreamDB's recommended hardware specifications for a multi-node cluster:
+
+.. list-table::
+   :widths: 15 65
+   :header-rows: 1
+   
+   * - Component
+     - Type
+   * - Server
+     - Dell R750, Dell R940xa, HP ProLiant DL380 Gen10 or similar (Intel only)
+   * - Processors
+     - 2x Intel Xeon Gold 6348 (28C/56HT) 3.5GHz or similar
+   * - RAM
+     - 2 TB
+   * - Onboard storage
+     -   
+         * 2x 960GB SSD 2.5in hot plug for OS, RAID1
+         * 2x 2TB SSD or NVMe, for temporary spooling, RAID0
+   * - Network Card (Storage)
+     - 2x Mellanox ConnectX-6 Single Port HDR VPI InfiniBand Adapter cards at 100GbE or similar.
+   * - Network Card (Client)
+     - 2x 1 GbE cards or similar   
+   * - External Storage
+     -   
+         * Mellanox ConnectX-5/6 100G NVIDIA network card (if applicable), or another high-speed network card (minimum 40G) compatible with the customer's infrastructure
+         * 50 TB NAS connected over GPFS, Lustre, Weka, or VAST (GPFS recommended)
+   * - GPU
+     - 2x NVIDIA A100, H100, or L40S
+   * - Operating System
+     - Red Hat Enterprise Linux v8.9 or Amazon Linux 2
+   
+Metadata Server
+---------------
+   
+The following table shows SQreamDB's recommended metadata server specifications:
+
+.. list-table::
+   :widths: 15 90
+   :header-rows: 1
+   
+   * - Component
+     - Type
+   * - Server
+     - Dell R750, Dell R940xa, HP ProLiant DL380 Gen10 or similar (Intel only)
+   * - Processors
+     - 2x Intel Xeon Gold 6342 2.8 Ghz 24C processors or similar
+   * - RAM
+     - 512GB DDR4 RAM 8x64GB RDIMM or similar
+   * - Onboard storage
+     - 2x 960 GB NVMe SSD drives in RAID 1 or similar
+   * - Network Card (Storage)
+     - 2x Mellanox ConnectX-6 Single Port HDR VPI InfiniBand Adapter cards at 100GbE or similar.
+   * - Network Card (Client)
+     - 2x 1 GbE cards or similar
+   * - Operating System
+     - Red Hat Enterprise Linux v8.9 or Amazon Linux 2
+
+.. note:: With a NAS connected over GPFS, Lustre, Weka, or VAST, each SQreamDB worker can read data at 5GB/s or more.
+
+SQreamDB Studio Server
+----------------------
+
+The following table shows SQreamDB's recommended Studio server specifications:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Component
+     - Type
+   * - Server
+     - Physical or virtual machine
+   * - Processor
+     - 1x Intel Core i7
+   * - RAM
+     - 16 GB
+   * - Onboard storage
+     - 50 GB SSD 2.5in Hot-plug for OS, RAID1
+   * - Operating System
+     - Red Hat Enterprise Linux v8.9
+
+Cluster Design Considerations
+=============================
+
+This section describes the following cluster design considerations:
+
+* In a SQreamDB installation, the storage and computing are logically separated. While they may reside on the same machine in a standalone installation, they may also reside on different hosts, providing additional flexibility and scalability.
+
+* SQreamDB uses all resources in a machine, including CPU, RAM, and GPU to deliver the best performance. At least 256GB of RAM per physical GPU is recommended.
+
+* Local disk space is required for good temporary spooling performance, particularly when performing intensive operations exceeding the available RAM, such as sorting. SQreamDB recommends an SSD or NVMe drive in RAID0 configuration with about twice the RAM size available for temporary storage. This can be shared with the operating system drive if necessary.
+
+* When using NAS devices, SQreamDB recommends approximately 5GB/s of burst throughput from storage per GPU.
+
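+As a rough sketch of the spooling recommendation above, the following stripes two NVMe drives into a RAID0 array and formats the array as XFS; the device names and mount point are placeholders for your environment:
+
+.. code-block:: console
+
+   $ # Stripe two NVMe drives (RAID0) for temporary spool space
+   $ sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
+   $ sudo mkfs.xfs /dev/md0
+   $ sudo mkdir -p /mnt/spool
+   $ sudo mount /dev/md0 /mnt/spool
+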
+Balancing Cost and Performance
+------------------------------
+
+Prior to designing and deploying a SQreamDB cluster, a number of important factors must be considered. 
+
+The **Balancing Cost and Performance** section provides a breakdown of deployment details to ensure that the installation meets or exceeds the stated requirements. The rationale provided includes the information necessary for modifying configurations to suit the customer's use case, as shown in the following table:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Component
+     - Value
+   * - Compute - CPU
+     - Balance price and performance
+   * - Compute – GPU
+     - Balance price with performance and concurrency
+   * - Memory – GPU RAM
+     - Balance price with concurrency and performance
+   * - Memory - RAM
+     - Balance price and performance
+   * - Operating System
+     - Availability, reliability, and familiarity
+   * - Storage
+     - Balance price with capacity and performance
+   * - Network
+     - Balance price and performance
+
+CPU Compute
+-----------
+
+SQreamDB relies on multi-core Intel Gold Xeon processors or IBM POWER9 processors and recommends a dual-socket machine populated with CPUs with 18C/36HT or better. While a higher core count may not necessarily affect query performance, more cores will enable higher concurrency and better load performance.
+
+GPU Compute and RAM
+-------------------
+
+The NVIDIA Tesla range of high-throughput GPU accelerators provides the best performance for enterprise environments. Most cards have ECC memory, which is crucial for delivering correct results every time. SQreamDB recommends the NVIDIA Tesla A100 80GB GPU for the best performance and highest concurrent user support.
+
+GPU RAM, sometimes called GRAM or VRAM, is used for processing queries. It is possible to select GPUs with less RAM. However, the smaller GPU RAM results in reduced concurrency, as the GPU RAM is used extensively in operations like JOINs, ORDER BY, GROUP BY, and all SQL transforms.
+
+RAM
+---
+
+SQreamDB requires **Error-Correcting Code (ECC) memory**, which is standard on most enterprise servers. Large amounts of memory are required to improve the performance of heavy external operations, such as sorting and joining.
+
+SQreamDB recommends at least 256GB of RAM per GPU on your machine. 
+
+Operating System
+----------------
+
+SQreamDB can run on the following 64-bit Linux operating systems:
+
+   * Red Hat Enterprise Linux v8.9
+   * Amazon Linux 2
+
+
+Storage
+-------
+
+For clustered scale-out installations, SQreamDB relies on NAS storage. For stand-alone installations, SQreamDB relies on redundant disk configurations, such as RAID 5, 6, or 10. These RAID configurations replicate blocks of data between disks to avoid data loss or system unavailability. 
+
+SQreamDB recommends using enterprise-grade SAS SSD or NVMe drives. The number of GPUs should scale with the number of users: SQreamDB recommends one Tesla A100, H100, or L40S GPU per two users for full, uninterrupted, dedicated access, so a 32-user configuration calls for about 16 GPUs.
diff --git a/getting_started/index.rst b/getting_started/index.rst
index f9a57a460..5f37bfc41 100644
--- a/getting_started/index.rst
+++ b/getting_started/index.rst
@@ -1,9 +1,10 @@
 .. _getting_started:
 
-*************************
+***************
 Getting Started
-*************************
-The **Getting Started** page describes the following things you need to start using SQream:
+***************
+
+The **Getting Started** page describes what you need to start using SQreamDB:
 
 .. toctree::
    :maxdepth: 1
@@ -11,6 +12,7 @@ The **Getting Started** page describes the following things you need to start us
 
    preparing_your_machine_to_install_sqream
    installing_sqream
-   creating_a_database
    executing_statements_in_sqream
-   performing_basic_sqream_operations
\ No newline at end of file
+   performing_basic_sqream_operations
+   hardware_guide
+   non_production_hardware_guide
diff --git a/getting_started/ingesting_data.rst b/getting_started/ingesting_data.rst
index ff82a685e..4da4d764c 100644
--- a/getting_started/ingesting_data.rst
+++ b/getting_started/ingesting_data.rst
@@ -3,6 +3,7 @@
 ****************************
 Ingesting Data
 ****************************
+
 After creating a database you can begin ingesting data into SQream.
 
 For more information about ingesting data, see `Data Ingestion Guides `_.
\ No newline at end of file
diff --git a/getting_started/inserting_rows.rst b/getting_started/inserting_rows.rst
index 890befebf..aa2bcd534 100644
--- a/getting_started/inserting_rows.rst
+++ b/getting_started/inserting_rows.rst
@@ -1,8 +1,9 @@
 .. _inserting_rows:
 
-****************************
+**************
 Inserting Rows
-****************************
+**************
+
 The **Inserting Rows** section describes the following:
 
 * :ref:`Inserting basic rows`
@@ -19,9 +20,8 @@ You can insert basic rows into a table using the ``INSERT`` statement. The inser
 
 .. code-block:: psql
 
-   test=> INSERT INTO cool_animals VALUES (1, 'Dog', 7);
+   INSERT INTO cool_animals VALUES (1, 'Dog', 7);
    
-   executed
 
 .. _changing_value_order:
 
@@ -31,9 +31,8 @@ You can change the order of values by specifying the column order, as shown in t
 
 .. code-block:: psql
 
-   test=> INSERT INTO cool_animals(weight, id, name) VALUES (3, 2, 'Possum');
+   INSERT INTO cool_animals(weight, id, name) VALUES (3, 2, 'Possum');
    
-   executed
 
 .. _inserting_multiple_rows:
 
@@ -43,14 +42,13 @@ You can insert multiple rows using the ``INSERT`` statement by using sets of par
 
 .. code-block:: psql
 
-   test=> INSERT INTO cool_animals VALUES
-         (3, 'Cat', 5) ,
-         (4, 'Elephant', 6500) ,
-         (5, 'Rhinoceros', 2100);
+   INSERT INTO cool_animals VALUES
+     (3, 'Cat', 5) ,
+     (4, 'Elephant', 6500) ,
+     (5, 'Rhinoceros', 2100);
    
-   executed
 
-.. note:: You can load large data sets using bulk loading methods instead. For more information, see :ref:`inserting_data`.
+.. note:: You can load large data sets using bulk loading methods instead. For more information, see :ref:`ingesting_data`.
 
 .. _omitting_columns:
 
@@ -60,16 +58,16 @@ Omitting columns that have a default values (including default ``NULL`` values)
 
 .. code-block:: psql
 
-   test=> INSERT INTO cool_animals (id) VALUES (6);
+   INSERT INTO cool_animals (id) VALUES (6);
    
-   executed
+
 
 .. code-block:: psql
 
-   test=> INSERT INTO cool_animals (id) VALUES (6);
+   INSERT INTO cool_animals (id) VALUES (6);
    
-   executed
-   test=> SELECT * FROM cool_animals;
+
+   SELECT * FROM cool_animals;
    1,Dog                 ,7
    2,Possum              ,3
    3,Cat                 ,5
@@ -77,7 +75,7 @@ Omitting columns that have a default values (including default ``NULL`` values)
    5,Rhinoceros          ,2100
    6,\N,\N
    
-   6 rows
+
 
 .. note:: Null row values are represented as ``\N``
 
diff --git a/getting_started/installing_sqream.rst b/getting_started/installing_sqream.rst
index 2ed54c1ab..09eefed8b 100644
--- a/getting_started/installing_sqream.rst
+++ b/getting_started/installing_sqream.rst
@@ -1,11 +1,14 @@
 .. _installing_sqream:
 
-****************************
-Installing SQream
-****************************
+*******************
+Installing SQreamDB
+*******************
 
-The **Installing SQream** section includes the following SQream installation methods:
-
-* `Installing SQream natively `_ - Describes installing SQream using binary packages provided by SQream.
-* `Installing SQream with Kubernetes `_ - Describes installing SQream using the Kubernetes open source platform.
-* `Installing and running SQream in a Docker container `_ - Describes how to run SQream in a Docker container.
\ No newline at end of file
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Method
+     - Description
+   * - :ref:`Installing SQreamDB natively `
+     - Describes installing SQreamDB using binary packages provided by SQreamDB
\ No newline at end of file
diff --git a/getting_started/listing_tables.rst b/getting_started/listing_tables.rst
index d376cb7ee..a197207e1 100644
--- a/getting_started/listing_tables.rst
+++ b/getting_started/listing_tables.rst
@@ -3,11 +3,11 @@
 ****************************
 Listing Tables
 ****************************
+
 To see the tables in the current database you can query the catalog, as shown in the following example:
 
 .. code-block:: psql
 
-   test=> SELECT table_name FROM sqream_catalog.tables;
+   SELECT table_name FROM sqream_catalog.tables;
    cool_animals
    
-   1 rows
\ No newline at end of file
diff --git a/getting_started/non_production_hardware_guide.rst b/getting_started/non_production_hardware_guide.rst
new file mode 100644
index 000000000..54c7780b0
--- /dev/null
+++ b/getting_started/non_production_hardware_guide.rst
@@ -0,0 +1,54 @@
+.. _non_production_hardware_guide:
+
+**************************************
+Staging and Development Hardware Guide
+**************************************
+
+The **Staging and Development Hardware Guide** describes the hardware SQreamDB recommends for development, staging, and QA desktops and servers.
+
+.. warning:: The hardware specifications on this page are not intended for production use!
+
+Development Desktop
+-------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Component
+     - Type
+   * - Server
+     - PC
+   * - Processor
+     - Intel i7
+   * - RAM
+     - 64GB RAM
+   * - Onboard storage
+     - 2TB SSD
+   * - GPU
+     - 1x NVIDIA RTX A4000 16GB
+   * - Operating System
+     - Red Hat Enterprise Linux v8.8
+
+
+Lab Server
+----------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+   
+   * - Component
+     - Type
+   * - Server
+     - Dell R640 or similar
+   * - Processor
+     - 2x Intel(R) Xeon(R) Silver 4112 CPU @ 2.60GHz
+   * - RAM
+     - 128 or 256 GB
+   * - Onboard storage
+     - "2x 960GB SSD 2.5in hot plug for OS, RAID1, 1(or more)x 3.84TB SSD 2.5in Hot plug for storage, RAID5"
+   * - GPU
+     - 1x NVIDIA A40 or A10
+   * - Operating System
+     - Red Hat Enterprise Linux v8.8 
diff --git a/getting_started/performing_basic_sqream_operations.rst b/getting_started/performing_basic_sqream_operations.rst
index ba0a6fc3f..af12ed813 100644
--- a/getting_started/performing_basic_sqream_operations.rst
+++ b/getting_started/performing_basic_sqream_operations.rst
@@ -1,8 +1,9 @@
 .. _performing_basic_sqream_operations:
 
-****************************
+**********************************
 Performing Basic SQream Operations
-****************************
+**********************************
+
 After installing SQream you can perform the operations described on this page:
 
 .. toctree::
@@ -15,4 +16,9 @@ After installing SQream you can perform the operations described on this page:
    inserting_rows
    running_queries
    deleting_rows
-   saving_query_results_to_a_csv_or_psv_file
\ No newline at end of file
+   saving_query_results_to_a_csv_or_psv_file
+
+For more information on other basic SQream operations, see the following:
+
+* `Creating a Database `_
+* :ref:`data_ingestion`
\ No newline at end of file
diff --git a/getting_started/preparing_your_machine_to_install_sqream.rst b/getting_started/preparing_your_machine_to_install_sqream.rst
index 435f35de0..01dc3560d 100644
--- a/getting_started/preparing_your_machine_to_install_sqream.rst
+++ b/getting_started/preparing_your_machine_to_install_sqream.rst
@@ -1,25 +1,22 @@
 .. _preparing_your_machine_to_install_sqream:
 
-****************************
-Preparing Your Machine to Install SQream
-****************************
-To prepare your machine to install SQream, do the following:
+*******************************************
+Preparing Your Machine to Install SQreamDB
+*******************************************
 
- * Set up your local machine according to SQream's recommended pre-installation configurations.
- 
-    ::
+To prepare your machine to install SQreamDB, do the following:
+
+ * Set up your local machine according to SQreamDB's recommended pre-installation configurations.
    
  * Verify you have an NVIDIA-capable server, either on-premise or on supported cloud platforms: 
 
-   * Red Hat Enterprise Linux v7.x   
- 
-   * CentOS v7.x
+   * Red Hat Enterprise Linux v8.9 
  
-   * Amazon Linux 7
+   * Amazon Linux 2
 	 
  * Verify that you have the following:
  
-   * An NVIDIA GPU - SQream recommends using a Tesla GPU.
+   * An NVIDIA GPU - SQreamDB recommends using a Tesla GPU.
  
 
    * An SSH connection to your server.
@@ -28,8 +25,8 @@ To prepare your machine to install SQream, do the following:
    * SUDO permissions for installation and configuration purposes.
  
  
-   * A SQream license - Contact support@sqream.com or your SQream account manager for your license key.
+   * A SQreamDB license - Contact `SQreamDB Support `_ for your license key.
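+
+To quickly verify that the server exposes an NVIDIA GPU, you can run the standard NVIDIA utility (assuming the NVIDIA driver is already installed):
+
+.. code-block:: console
+
+   $ nvidia-smi
+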
 For more information, see the following:
 
-* :ref:`recommended_pre-installation_configurations`
-* `Hardware Guide `_
+* :ref:`pre-installation_configurations`
+* :ref:`hardware_guide`
\ No newline at end of file
diff --git a/getting_started/querying_data.rst b/getting_started/querying_data.rst
deleted file mode 100644
index 36bf9e78b..000000000
--- a/getting_started/querying_data.rst
+++ /dev/null
@@ -1,51 +0,0 @@
-.. _querying_data:
-
-****************************
-Querying Data
-****************************
-One of the most basic operations when using SQream is querying data.
-
-To begin familiarizing yourself with querying data, you can create the following table using the ``CREATE TABLE`` statement:
-
-.. code-block:: postgres
-   
-   CREATE TABLE nba
-   (
-      Name varchar(40),
-      Team varchar(40),
-      Number tinyint,
-      Position varchar(2),
-      Age tinyint,
-      Height varchar(4),
-      Weight real,
-      College varchar(40),
-      Salary float
-    );
-
-
-You can down download the above (:download:`nba.csv table `) if needed, shown below:
-
-.. csv-table:: nba.csv
-   :file: nba-t10.csv
-   :widths: auto
-   :header-rows: 1
-
-The above query gets the following from the table above, limited to showing the first ten results:
-
-* Name
-* Team name
-* Age
-
-.. code-block:: psql
-   
-   nba=> SELECT Name, Team, Age FROM nba LIMIT 10;
-   Avery Bradley,Boston Celtics,25
-   Jae Crowder,Boston Celtics,25
-   John Holland,Boston Celtics,27
-   R.J. Hunter,Boston Celtics,22
-   Jonas Jerebko,Boston Celtics,29
-   Amir Johnson,Boston Celtics,29
-   Jordan Mickey,Boston Celtics,21
-   Kelly Olynyk,Boston Celtics,25
-   Terry Rozier,Boston Celtics,22
-   Marcus Smart,Boston Celtics,22
\ No newline at end of file
diff --git a/getting_started/running_queries.rst b/getting_started/running_queries.rst
index 57f6811d7..fe3891e1c 100644
--- a/getting_started/running_queries.rst
+++ b/getting_started/running_queries.rst
@@ -3,6 +3,7 @@
 ****************************
 Running Queries
 ****************************
+
 The **Running Queries** section describes the following:
 
 * :ref:`Running basic queries`
@@ -21,15 +22,14 @@ You can run a basic query using the ``SELECT`` keyword, followed by a list of co
 
 .. code-block:: psql
 
-   test=> SELECT id, name, weight FROM cool_animals;
+   SELECT id, name, weight FROM cool_animals;
    1,Dog                 ,7
    2,Possum              ,3
    3,Cat                 ,5
    4,Elephant            ,6500
    5,Rhinoceros          ,2100
    6,\N,\N
-   
-   6 rows
+  
    
 For more information on the ``SELECT`` keyword, see :ref:`select`.
 
@@ -41,15 +41,14 @@ You can output all columns without specifying them using the star operator ``*``
 
 .. code-block:: psql
 
-   test=> SELECT * FROM cool_animals;
+   SELECT * FROM cool_animals;
    1,Dog                 ,7
    2,Possum              ,3
    3,Cat                 ,5
    4,Elephant            ,6500
    5,Rhinoceros          ,2100
    6,\N,\N
-   
-   6 rows
+  
 
 .. _outputting_shorthand_table_values:
 
@@ -59,10 +58,9 @@ You can output the number of values in a table without getting the full result s
 
 .. code-block:: psql
 
-   test=> SELECT COUNT(*) FROM cool_animals;
+   SELECT COUNT(*) FROM cool_animals;
    6
-   
-   1 row
+  
 
 .. _filtering_results:
 
@@ -72,11 +70,10 @@ You can filter results by adding a ``WHERE`` clause and specifying the filter co
 
 .. code-block:: psql
 
-   test=> SELECT id, name, weight FROM cool_animals WHERE weight > 1000;
+   SELECT id, name, weight FROM cool_animals WHERE weight > 1000;
    4,Elephant            ,6500
    5,Rhinoceros          ,2100
-   
-   2 rows
+  
 
 .. _sorting_results:
 
@@ -86,7 +83,7 @@ You can sort results by adding an ``ORDER BY`` clause and specifying ascending (
 
 .. code-block:: psql
 
-   test=> SELECT * FROM cool_animals ORDER BY weight DESC;
+   SELECT * FROM cool_animals ORDER BY weight DESC;
    4,Elephant            ,6500
    5,Rhinoceros          ,2100
    1,Dog                 ,7
@@ -94,8 +91,6 @@ You can sort results by adding an ``ORDER BY`` clause and specifying ascending (
    2,Possum              ,3
    6,\N,\N
 
-   6 rows
-
 .. _filtering_null_rows:
 
 **Filtering Null Rows**
@@ -104,14 +99,13 @@ You can filter null rows by adding an ``IS NOT NULL`` filter, as shown in the fo
 
 .. code-block:: psql
 
-   test=> SELECT * FROM cool_animals WHERE weight IS NOT NULL ORDER BY weight DESC;
+   SELECT * FROM cool_animals WHERE weight IS NOT NULL ORDER BY weight DESC;
    4,Elephant            ,6500
    5,Rhinoceros          ,2100
    1,Dog                 ,7
    3,Cat                 ,5
    2,Possum              ,3
 
-   5 rows
    
 For more information, see the following:
 
diff --git a/getting_started/running_the_sqream_sql_client.rst b/getting_started/running_the_sqream_sql_client.rst
index faba6494d..0731a20aa 100644
--- a/getting_started/running_the_sqream_sql_client.rst
+++ b/getting_started/running_the_sqream_sql_client.rst
@@ -1,8 +1,9 @@
 .. _running_the_sqream_sql_client:
 
-****************************
+*****************************
 Running the SQream SQL Client
-****************************
+*****************************
+
 The following example shows how to run the SQream SQL client:
 
 .. code-block:: psql
@@ -12,8 +13,7 @@ The following example shows how to run the SQream SQL client:
    
    Interactive client mode
    To quit, use ^D or \q.
-   
-   master=> _
+  
 
 Running the SQream SQL client prompts you to provide your password. Use the username and password that you have set up, or your DBA has provided.
   
diff --git a/getting_started/saving_query_results_to_a_csv_or_psv_file.rst b/getting_started/saving_query_results_to_a_csv_or_psv_file.rst
index 9a40ad440..f51ddaa10 100644
--- a/getting_started/saving_query_results_to_a_csv_or_psv_file.rst
+++ b/getting_started/saving_query_results_to_a_csv_or_psv_file.rst
@@ -1,15 +1,16 @@
 .. _saving_query_results_to_a_csv_or_psv_file:
 
-****************************
+*****************************************
 Saving Query Results to a CSV or PSV File
-****************************
+*****************************************
+
 You can save query results to a CSV or PSV file using the ``sqream sql`` command from a CLI client. This saves your query results to the selected delimited file format, as shown in the following example:
 
 .. code-block:: console
 
    $ sqream sql --username=mjordan --database=nba --host=localhost --port=5000 -c "SELECT * FROM nba LIMIT 5" --results-only --delimiter='|' > nba.psv
    $ cat nba.psv
-   Avery Bradley           |Boston Celtics        |0|PG|25|6-2 |180|Texas                |7730337
+   Avery Bradley           |Boston Celtics        |0|PG|25|6-2 |180|Texas                 |7730337
    Jae Crowder             |Boston Celtics        |99|SF|25|6-6 |235|Marquette            |6796117
    John Holland            |Boston Celtics        |30|SG|27|6-5 |205|Boston University    |\N
    R.J. Hunter             |Boston Celtics        |28|SG|22|6-5 |185|Georgia State        |1148640
@@ -21,4 +22,4 @@ For more output options, see :ref:`Controlling the Client Output`.
 * See the full :ref:`SQream SQL CLI reference `.
-* Connect a :ref:`third party tool ` to start analyzing data.
\ No newline at end of file
+* Connect a :ref:`third party tool ` to start analyzing data.
\ No newline at end of file
diff --git a/glossary.rst b/glossary.rst
index fd3cb246e..818cf3cf0 100644
--- a/glossary.rst
+++ b/glossary.rst
@@ -1,7 +1,7 @@
 .. glossary:
 
 Glossary
-=====================================
+========
 
 
 The following table shows the **Glossary** descriptions: 
@@ -27,7 +27,7 @@ The following table shows the **Glossary** descriptions:
 +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | Node            | A machine used to run SQream workers.                                                                                                                                                                        |
 +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| Role            | A group or a user. For more information see `SQream Studio `_.                             |
+| Role            | A group or a user. For more information see :ref:`SQream Studio `.                                                                                    |
 +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | Storage cluster | The directory where SQream stores data.                                                                                                                                                                      |
 +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
diff --git a/index.rst b/index.rst
index e1455f21a..cc7d23ed0 100644
--- a/index.rst
+++ b/index.rst
@@ -1,97 +1,64 @@
 .. _index:
 
 *************************
-SQream DB Documentation
+SQreamDB Documentation
 *************************
 
-For SQream version 2021.2.
 
-.. only:: html
-
-   .. tip::
-      Want to read this offline?
-      `Download the documentation as a single PDF `_ .
-
-.. only:: pdf or latex
-   
-   .. tip:: This documentation is available online at https://docs.sqream.com/
-
-SQream DB is a columnar analytic SQL database management system. 
-
-SQream DB supports regular SQL including :ref:`a substantial amount of ANSI SQL`, uses :ref:`serializable transactions`, and :ref:`scales horizontally` for concurrent statements.
-
-Even a :ref:`basic SQream DB machine` can support tens to hundreds of terabytes of data.
-
-SQream DB easily plugs in to third-party tools like :ref:`Tableau` comes with standard SQL client drivers, including :ref:`JDBC`, :ref:`ODBC`, and :ref:`Python DB-API`.
-
-.. 
-   .. ref`features_tour`
-
-.. list-table::
-   :widths: 33 33 33
-   :header-rows: 0
-
-   * - **Get Started**
-     - **Reference**
-     - **Guides**
-   * -
-         `Getting Started `_
-         
-         :ref:`sql_feature_support`
-         
-         :ref:`Bulk load CSVs`
-     - 
-         :ref:`SQL Reference`
-         
-         :ref:`sql_statements`
-         
-         :ref:`sql_functions`
-     - 
-         `Setting up SQream `_
-         
-         :ref:`Best practices`
-         
-         :ref:`connect_to_tableau`
-
-   * - **Releases**
-     - **Driver and Deployment**
-     - **Help and Support**
-   * -
-         :ref:`2021.2<2021.2>`
-
-         :ref:`2021.1<2021.1>`
-        
-         :ref:`2020.3<2020.3>`
-
-         :ref:`2020.2<2020.2>`
-         
-         :ref:`2020.1<2020.1>`
-                  
-         :ref:`All recent releases`
+SQreamDB is a columnar analytic SQL database management system. SQreamDB supports regular SQL including :ref:`a substantial amount of ANSI SQL`, uses :ref:`serializable transactions`, and :ref:`scales horizontally` for concurrent statements. Even a :ref:`basic SQreamDB machine` can support tens to hundreds of terabytes of data. SQreamDB easily plugs into third-party tools like :ref:`Tableau` and comes with standard SQL client drivers, including :ref:`JDBC`, :ref:`ODBC`, and :ref:`Python DB-API`.
+
+
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| Topic                                                       | Description                                                                                                                              |
++=============================================================+==========================================================================================================================================+
+| **Getting Started**                                                                                                                                                                                    |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`preparing_your_machine_to_install_sqream`             | Set up your local machine according to SQreamDB’s recommended pre-installation configurations.                                           |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`executing_statements_in_sqream`                       | Provides more information about the available methods for executing statements in SQreamDB.                                              |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`performing_basic_sqream_operations`                   | Provides more information on performing basic operations.                                                                                |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`hardware_guide`                                       | Describes SQreamDB’s mandatory and recommended hardware settings, designed for a technical audience.                                     |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| **Installation Guides**                                                                                                                                                                                |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`installing_and_launching_sqream`                      | Refers to SQreamDB’s installation guides.                                                                                                |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`sqream_studio_installation`                           | Refers to all installation guides required for installations related to Studio.                                                          |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| **Ingesting Data**                                                                                                                                                                                     |
++--------------------------+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`csv`               | :ref:`avro`                      |                                                                                                                                          |
++--------------------------+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`parquet`           | :ref:`orc`                       |                                                                                                                                          |
++--------------------------+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`json`              | :ref:`sqloader_as_a_service`     |                                                                                                                                          |
++--------------------------+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| **Connecting to SQreamDB**                                                                                                                                                                             |
++--------------------------+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`client_platforms`                                     | Describes how to install and connect a variety of third party connection platforms and tools.                                            |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`client_drivers`                                       | Describes how to use the SQreamDB client drivers and client applications with SQreamDB.                                                  |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`sqream_sql_cli_reference`                             | Describes how to use the SQreamDB client command line tool (CLI) with SQreamDB.                                                          |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| **External Storage Platforms**                                                                                                                                                                         |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`s3`                                                   | Describes how to insert data over a native S3 connector.                                                                                 |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
+| :ref:`hdfs`                                                 | Describes how to configure an HDFS environment for the user sqream and is only relevant for users with an HDFS environment.              |
++-------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------+
 
-     - 
-         :ref:`Client drivers`
-
-         :ref:`Third party tools integration`
-
-         :ref:`connect_to_tableau`
-     - 
-         :ref:`troubleshooting` guide
-         
-         :ref:`information_for_support`
 
 .. rubric:: Need help?
 
-If you couldn't find what you're looking for, we're always happy to help. Visit `SQream's support portal `_ for additional support.
-
-
-.. rubric:: Looking for older versions?
-
-This version of the documentation is for SQream DB Version 2021.2.
+If you couldn't find what you're looking for, we're always happy to help. Visit `SQreamDB's support portal `_ for additional support.
 
-If you're looking for an older version of the documentation, versions 1.10 through 2019.2.1 are available at http://previous.sqream.com .
 
 .. toctree::
    :caption: Contents:
@@ -102,13 +69,15 @@ If you're looking for an older version of the documentation, versions 1.10 throu
 
    getting_started/index
    installation_guides/index
-   data_ingestion/index
-   third_party_tools/index
-   feature_guides/index
+   sqreamdb_on_aws/index
    operational_guides/index
-   sqream_studio_5.4.3/index
-   architecture/index
    configuration_guides/index
+   architecture/index
+   sqream_studio/index
+   connecting_to_sqream/index
+   data_ingestion/index
+   external_storage_platforms/index
+   feature_guides/index
    reference/index
    data_type_guides/index
    releases/index
diff --git a/installation_guides/index.rst b/installation_guides/index.rst
index 47f89c219..09bc68023 100644
--- a/installation_guides/index.rst
+++ b/installation_guides/index.rst
@@ -3,7 +3,8 @@
 *************************
 Installation Guides
 *************************
-Before you get started using SQream, consider your business needs and available resources. SQream was designed to run in a number of environments, and to be installed using different methods depending on your requirements. This determines which installation method to use.
+
+Before you get started using SQreamDB, consider your business needs and available resources. SQreamDB was designed to run in a number of environments and to be installed using different methods; your requirements determine which installation method to use.
 
 The **Installation Guides** section describes the following installation guide sets:
 
@@ -12,4 +13,5 @@ The **Installation Guides** section describes the following installation guide s
    :glob:
 
    installing_and_launching_sqream
-   sqream_studio_installation
\ No newline at end of file
+   sqream_studio_installation
+   upgrade_guide/index   
\ No newline at end of file
diff --git a/installation_guides/installing_and_launching_sqream.rst b/installation_guides/installing_and_launching_sqream.rst
index 4ef1ef706..506d8e814 100644
--- a/installation_guides/installing_and_launching_sqream.rst
+++ b/installation_guides/installing_and_launching_sqream.rst
@@ -1,20 +1,16 @@
 .. _installing_and_launching_sqream:
 
-*************************
-Installing and Launching SQream
-*************************
-The **Installing SQream Studio** page incudes the following installation guides:
+*********************************
+Installing and Launching SQreamDB
+*********************************
+
+The **Installing and Launching SQreamDB** page includes the following installation guides:
 
 .. toctree::
    :maxdepth: 1
    :glob:
 
-   recommended_pre-installation_configurations
+   pre-installation_configurations
    installing_sqream_with_binary
-   running_sqream_in_a_docker_container
-   installing_sqream_with_kubernetes
    installing_monit
-   launching_sqream_with_monit
-
-
-
+   launching_sqream_with_monit
\ No newline at end of file
diff --git a/installation_guides/installing_dashboard_data_collector.rst b/installation_guides/installing_dashboard_data_collector.rst
deleted file mode 100644
index ba0475a6d..000000000
--- a/installation_guides/installing_dashboard_data_collector.rst
+++ /dev/null
@@ -1,163 +0,0 @@
-.. _installing_dashboard_data_collector:
-
-
-
-***********************
-Installing the Dashboard Data Collector
-***********************
-
-Installing the Dashboard Data Collector
-^^^^^^^^^^^^^^^
-After accessing the Prometheus user interface, you can install the **Dashboard Data Collector**. You must install the Dashboard Data Collector to enable the Dashboard in Studio.
-
-.. note:: Before installing the Dashboard Data collector, verify that Prometheus has been installed and configured for the cluster.
-
-How to install Prometheus from tarball - **Comment - this needs to be its own page.**
-
-**To install the Dashboard Data Collector:**
-
-1. Store the Data Collector Package obtained from `SQream Artifactory `_.
-
-  ::
-
-2. Extract and rename the package:
-
-   .. code-block:: console
-   
-      $ tar -xvf dashboard-data-collector-0.5.2.tar.gz 
-      $ mv package dashboard-data-collector
-	  
-3. Change your directory to the location of the package folder: 
-
-   .. code-block:: console
-   
-      $ cd dashboard-data-collector
-
-4. Set up the data collection by modifying the SQream and Data Collector IPs, ports, user name, and password according to the cluster:
-
-   .. code-block:: console
-   
-      $ npm run setup -- \
-      $ 	--host=127.0.0.1 \
-      $ 	--port=3108 \
-      $ 	--database=master \
-      $ 	--is-cluster=true \
-      $ 	--service=sqream \
-      $ 	--dashboard-user=sqream \
-      $ 	--dashboard-password=sqream \
-      $ 	--prometheus-url=http://127.0.0.1:9090/api/v1/query
-
-5. Debug the Data Collector: (**Comment** - *using the npm project manager*).
-
-   .. code-block:: console
-   
-      $ npm start
-
-   A json file is generated in the log, as shown below:   
-
-   .. code-block:: console
-   
-      $ {
-      $   "machines": [
-      $     {
-      $       "machineId": "dd4af489615",
-      $       "name": "Server 0",
-      $       "location": "192.168.4.94",
-      $       "totalMemory": 31.19140625,
-      $       "gpus": [
-      $         {
-      $           "gpuId": "GPU-b17575ec-eeba-3e0e-99cd-963967e5ee3f",
-      $           "machineId": "dd4af489615",
-      $           "name": "GPU 0",
-      $           "totalMemory": 3.9453125
-      $         }
-      $       ],
-      $       "workers": [
-      $         {
-      $           "workerId": "sqream_01",
-      $           "gpuId": "",
-      $           "name": "sqream_01"
-      $         }
-      $       ],
-      $       "storageWrite": 0,
-      $       "storageRead": 0,
-      $       "freeStorage": 0
-      $     },
-      $     {
-      $       "machineId": "704ec607174",
-      $       "name": "Server 1",
-      $       "location": "192.168.4.95",
-      $       "totalMemory": 31.19140625,
-      $       "gpus": [
-      $         {
-      $           "gpuId": "GPU-8777c14f-7611-517a-e9c7-f42eeb21700b",
-      $           "machineId": "704ec607174",
-      $           "name": "GPU 0",
-      $           "totalMemory": 3.9453125
-      $         }
-      $       ],
-      $       "workers": [
-      $         {
-      $           "workerId": "sqream_02",
-      $           "gpuId": "",
-      $           "name": "sqream_02"
-      $         }
-      $       ],
-      $       "storageWrite": 0,
-      $       "storageRead": 0,
-      $       "freeStorage": 0
-      $     }
-      $   ],
-      $   "clusterStatus": true,
-      $   "storageStatus": {
-      $     "dataStorage": 49.9755859375,
-      $     "totalDiskUsage": 52.49829018075231,
-      $     "storageDetails": {
-      $       "data": 0,
-      $       "freeData": 23.7392578125,
-      $       "tempData": 0,
-      $       "deletedData": 0,
-      $       "other": 26.236328125
-      $     },
-      $     "avgThroughput": {
-      $       "read": 0,
-      $       "write": 0
-      $     },
-      $     "location": "/"
-      $   },
-      $   "queues": [
-      $     {
-      $       "queueId": "sqream",
-      $       "name": "sqream",
-      $       "workerIds": [
-      $         "sqream_01",
-      $         "sqream_02"
-      $       ]
-      $     }
-      $   ],
-      $   "queries": [],
-      $   "collected": true,
-      $   "lastCollect": "2021-11-17T12:46:31.601Z"
-      $ }
-	  
-.. note:: Verify that all machines and workers are correctly registered.
-
-
-6. Press **CTRL + C** to stop ``npm start`` (**Comment** - *It may be better to refer to it as the npm project manager*).
-
-  ::
-
-
-7. Start the Data Collector with the ``pm2`` service:
-
-   .. code-block:: console
-   
-      $ pm2 start ./index.js --name=dashboard-data-collector
-	  
-8. Add the following parameter to the SQream Studio setup defined in :ref:`Step 4` in **Installing Studio** below.
-
-   .. code-block:: console
-   
-      --data-collector-url=http://127.0.0.1:8100/api/dashboard/data
-
-Back to :ref:`Installing Studio on a Stand-Alone Server`
diff --git a/installation_guides/installing_monit.rst b/installation_guides/installing_monit.rst
index b27800cce..51260f0b5 100644
--- a/installation_guides/installing_monit.rst
+++ b/installation_guides/installing_monit.rst
@@ -1,319 +1,207 @@
-.. _installing_monit:
-
-*********************************************
-Installing Monit
-*********************************************
-
-Getting Started
-==============================
-
-Before installing SQream with Monit, verify that you have followed the required :ref:`recommended pre-installation configurations `.
-
-The procedures in the **Installing Monit** guide must be performed on each SQream cluster node.
-
-
-
-
-
-.. _back_to_top:
-
-Overview
-==============================
-
-
-Monit is a free open source supervision utility for managing and monitoring Unix and Linux. Monit lets you view system status directly from the command line or from a native HTTP web server. Monit can be used to conduct automatic maintenance and repair, such as executing meaningful causal actions in error situations.
-
-SQream uses Monit as a watchdog utility, but you can use any other utility that provides the same or similar functionality.
-
-The **Installing Monit** procedures describes how to install, configure, and start Monit.
-
-You can install Monit in one of the following ways:
-
-* :ref:`Installing Monit on CentOS `
-* :ref:`Installing Monit on CentOS offline `
-* :ref:`Installing Monit on Ubuntu `
-* :ref:`Installing Monit on Ubuntu offline `
- 
- 
- 
-
-
-
-
-.. _installing-monit-on-centos:
-
-Installing Monit on CentOS:
-------------------------------------
-
-
-
-**To install Monit on CentOS:**   
-   
-1. Install Monit as a superuser on CentOS:
- 
-    .. code-block:: console
-     
-       $ sudo yum install monit  
-       
-       
-.. _installing-monit-on-centos-offline:
-
-
-	   
-Installing Monit on CentOS Offline:
-------------------------------------
-
-
-Installing Monit on CentOS offline can be done in either of the following ways:
-
-* :ref:`Building Monit from Source Code `
-* :ref:`Building Monit from Pre-Built Binaries `
-
- 
- 
- 
-.. _building_monit_from_source_code:
-
-Building Monit from Source Code
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-
-
-**To build Monit from source code:**
-
-1. Copy the Monit package for the current version:
-       
-   .. code-block:: console
-     
-      $ tar zxvf monit-.tar.gz
-       
- The value ``x.y.z`` denotes the version numbers.
-       
-2. Navigate to the directory where you want to store the package:
-
-   .. code-block:: console
-     
-      $ cd monit-x.y.z
- 
-3. Configure the files in the package:
-
-   .. code-block:: console
-     
-      $ ./configure (use ./configure --help to view available options)
- 
-4. Build and install the package:
-
-   .. code-block:: console
-     
-      $ make && make install
-      
-The following are the default storage directories:
-
-* The Monit package: **/usr/local/bin/**
-* The **monit.1 man-file**: **/usr/local/man/man1/**
-
-5. **Optional** - To change the above default location(s), use the **--prefix** option to ./configure.
-
-..
-  _**Comment - I took this line directly from the external online documentation. Is the "prefix option" referrin gto the "--help" in Step 3? URL: https://mmonit.com/wiki/Monit/Installation**
-
-6. **Optional** - Create an RPM package for CentOS directly from the source code:
-
-   .. code-block:: console
-     
-      $ rpmbuild -tb monit-x.y.z.tar.gz
-      
-..
-  _**Comment - Is this an optional or mandatory step?**
-
- 
-
-
-.. _building_monit_from_pre_built_binaries:   
-
-Building Monit from Pre-Built Binaries
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-**To build Monit from pre-built binaries:**
-
-1. Copy the Monit package for the current version:
-       
-   .. code-block:: console
-
-      $ tar zxvf monit-x.y.z-linux-x64.tar.gz
-      
-   The value ``x.y.z`` denotes the version numbers.
-
-2. Navigate to the directory where you want to store the package:
-
-   .. code-block:: console$ cd monit-x.y.z
-   
-3. Copy the **bin/monit** and **/usr/local/bin/** directories:
- 
-    .. code-block:: console
-
-      $ cp bin/monit /usr/local/bin/
- 
-4. Copy the **conf/monitrc** and **/etc/** directories:
- 
-    .. code-block:: console
-
-      $ cp conf/monitrc /etc/
-       
-..
-  _**Comment - please review this procedure.**
-
-For examples of pre-built Monit binarties, see :ref:`Download Precompiled Binaries`.
-
-:ref:`Back to top `
-
-
-
-.. _installing-monit-on-ubuntu:
-
-
-      
-Installing Monit on Ubuntu:
-------------------------------------
-
-
-**To install Monit on Ubuntu:**   
-   
-1. Install Monit as a superuser on Ubuntu:
-
-    .. code-block:: console
-     
-       $ sudo apt-get install monit
-	   
-:ref:`Back to top `
-
-
-	   
-.. _installing-monit-on-ubuntu-offline:
-
-
-Installing Monit on Ubuntu Offline:
--------------------------------------
-
-
-You can install Monit on Ubuntu when you do not have an internet connection.
-
-**To install Monit on Ubuntu offline:**   
-   
-1. Compress the required file:
-
-   .. code-block:: console
-     
-      $ tar zxvf monit--linux-x64.tar.gz
-      
-   **NOTICE:** ** denotes the version number.
-
-2. Navigate to the directory where you want to save the file:
-   
-   .. code-block:: console
-     
-      $ cd monit-x.y.z
-       
-3. Copy the **bin/monit** directory into the **/usr/local/bin/** directory:
-
-   .. code-block:: console
-     
-      $ cp bin/monit /usr/local/bin/
-       
-4. Copy the **conf/monitrc** directory into the **/etc/** directory:
-       
-   .. code-block:: console
-     
-      $ cp conf/monitrc /etc/
-	  
-:ref:`Back to top `
-
-       
-Configuring Monit
-====================================
-
-When the installation is complete, you can configure Monit. You configure Monit by modifying the Monit configuration file, called **monitrc**. This file contains blocks for each service that you want to monitor.
-
-The following is an example of a service block:
-
-    .. code-block:: console
-     
-       $ #SQREAM1-START
-       $ check process sqream1 with pidfile /var/run/sqream1.pid
-       $ start program = "/usr/bin/systemctl start sqream1"
-       $ stop program = "/usr/bin/systemctl stop sqream1"
-       $ #SQREAM1-END
-
-For example, if you have 16 services, you can configure this block by copying the entire block 15 times and modifying all service names as required, as shown below:
-
-    .. code-block:: console
-     
-       $ #SQREAM2-START
-       $ check process sqream2 with pidfile /var/run/sqream2.pid
-       $ start program = "/usr/bin/systemctl start sqream2"
-       $ stop program = "/usr/bin/systemctl stop sqream2"
-       $ #SQREAM2-END
-       
-For servers that don't run the **metadataserver** and **serverpicker** commands, you can use the block example above, but comment out the related commands, as shown below:
-
-    .. code-block:: console
-     
-       $ #METADATASERVER-START
-       $ #check process metadataserver with pidfile /var/run/metadataserver.pid
-       $ #start program = "/usr/bin/systemctl start metadataserver"
-       $ #stop program = "/usr/bin/systemctl stop metadataserver"
-       $ #METADATASERVER-END
-
-**To configure Monit:**   
-   
-1. Copy the required block for each required service.
-2. Modify all service names in the block.
-3. Copy the configured **monitrc** file to the **/etc/monit.d/** directory:
-
-   .. code-block:: console
-     
-      $ cp monitrc /etc/monit.d/
-       
-4. Set file permissions to **600** (full read and write access):
- 
-    .. code-block:: console
-
-       $ sudo chmod 600 /etc/monit.d/monitrc
-       
-5. Reload the system to activate the current configurations:
- 
-    .. code-block:: console
-     
-       $ sudo systemctl daemon-reload
- 
-6. **Optional** - Navigate to the **/etc/sqream** directory and create a symbolic link to the **monitrc** file:
- 
-    .. code-block:: console
-     
-      $ cd /etc/sqream
-      $ sudo ln -s /etc/monit.d/monitrc monitrc    
-         
-Starting Monit
-====================================  
-
-After configuring Monit, you can start it.
-
-**To start Monit:**
-
-1. Start Monit as a super user:
-
-   .. code-block:: console
-     
-      $ sudo systemctl start monit   
- 
-2. View Monit's service status:
-
-   .. code-block:: console
-     
-      $ sudo systemctl status monit
-
-3. If Monit is functioning correctly, enable the Monit service to start on boot:
-    
-   .. code-block:: console
-     
-      $ sudo systemctl enable monit
+.. _installing_monit:
+
+****************
+Installing Monit
+****************
+
+Monit is a free, open-source supervision utility for managing and monitoring Unix and Linux systems. Monit lets you view system status directly from the command line or from a native HTTP web server, and can conduct automatic maintenance and repair, such as executing meaningful causal actions in error situations.
+
+SQreamDB uses Monit as a watchdog utility, but you can use any other utility that provides the same or similar functionality.
+
+The **Installing Monit** procedure describes how to install, configure, and start Monit.
+
+You can install Monit from the RHEL package repository, build it from source code, or install it from pre-built binaries, as described below.
+
+Getting Started
+===============
+
+Before installing SQreamDB with Monit, verify that you have completed the required :ref:`pre-installation_configurations`.
+
+The procedures in the **Installing Monit** guide must be performed on each SQreamDB cluster node.
+
+Installing Monit on RHEL
+=========================
+   
+1. Install Monit as a superuser:
+ 
+    .. code-block:: console
+     
+       $ sudo yum install monit  
+       
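+2. Optionally, confirm that the installation succeeded by printing the installed Monit version (``monit -V`` prints the version and patch level):
+
+    .. code-block:: console
+
+       $ monit -V
+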
+.. _building_monit_from_source_code:
+
+Building Monit
+==============
+
+Building Monit from Source Code
+-------------------------------
+
+**To build Monit from source code:**
+
+1. Extract the Monit package for the current version:
+       
+   .. code-block:: console
+     
+      $ tar zxvf monit-x.y.z.tar.gz
+       
+   The value ``x.y.z`` denotes the version numbers.
+       
+2. Navigate to the directory where you want to store the package:
+
+   .. code-block:: console
+     
+      $ cd monit-x.y.z
+ 
+3. Configure the files in the package:
+
+   .. code-block:: console
+     
+      $ ./configure
+
+   To view the available configuration options, run ``./configure --help``.
+ 
+4. Build and install the package:
+
+   .. code-block:: console
+     
+      $ make && make install
+      
+The following are the default storage directories:
+
+* The Monit package: **/usr/local/bin/**
+* The **monit.1 man-file**: **/usr/local/man/man1/**
+
+5. **Optional** - To change the default location(s) above, pass the ``--prefix`` option to ``./configure``.
+
+
+6. **Optional** - Create an RPM package for RHEL directly from the source code:
+
+   .. code-block:: console
+     
+      $ rpmbuild -tb monit-x.y.z.tar.gz
+      
+
+.. _building_monit_from_pre_built_binaries:   
+
+Building Monit from Pre-Built Binaries
+--------------------------------------
+
+**To build Monit from pre-built binaries:**
+
+1. Copy the Monit package for the current version:
+       
+   .. code-block:: console
+
+      $ tar zxvf monit-x.y.z-linux-x64.tar.gz
+      
+   The value ``x.y.z`` denotes the version numbers.
+
+2. Navigate to the directory where you want to store the package:
+
+   .. code-block:: console
+
+      $ cd monit-x.y.z
+   
+3. Copy the **bin/monit** file to the **/usr/local/bin/** directory:
+ 
+    .. code-block:: console
+
+      $ cp bin/monit /usr/local/bin/
+ 
+4. Copy the **conf/monitrc** file to the **/etc/** directory:
+ 
+    .. code-block:: console
+
+      $ cp conf/monitrc /etc/
+       
+For examples of pre-built Monit binaries, see `Download Precompiled Binaries `_.
+
+       
+Configuring Monit
+=================
+
+When the installation is complete, you can configure Monit by modifying the Monit configuration file, called **monitrc**. This file contains a block for each service that you want to monitor.
+
+The following is an example of a service block:
+
+    .. code-block:: console
+     
+       #SQREAM1-START
+       check process sqream1 with pidfile /var/run/sqream1.pid
+       start program = "/usr/bin/systemctl start sqream1"
+       stop program = "/usr/bin/systemctl stop sqream1"
+       #SQREAM1-END
+
+For example, if you have 16 services, you can configure this block by copying the entire block 15 times and modifying all service names as required, as shown below:
+
+    .. code-block:: console
+     
+       #SQREAM2-START
+       check process sqream2 with pidfile /var/run/sqream2.pid
+       start program = "/usr/bin/systemctl start sqream2"
+       stop program = "/usr/bin/systemctl stop sqream2"
+       #SQREAM2-END
+       
+For servers that don't run the **metadataserver** and **serverpicker** commands, you can use the block example above, but comment out the related commands, as shown below:
+
+    .. code-block:: console
+     
+       #METADATASERVER-START
+       #check process metadataserver with pidfile /var/run/metadataserver.pid
+       #start program = "/usr/bin/systemctl start metadataserver"
+       #stop program = "/usr/bin/systemctl stop metadataserver"
+       #METADATASERVER-END
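+
+Although not shown above, you can handle ``serverpicker`` the same way. The following is a sketch of a commented-out ``serverpicker`` block, assuming a systemd unit and PID file path analogous to the examples above:
+
+    .. code-block:: console
+     
+       #SERVERPICKER-START
+       #check process serverpicker with pidfile /var/run/serverpicker.pid
+       #start program = "/usr/bin/systemctl start serverpicker"
+       #stop program = "/usr/bin/systemctl stop serverpicker"
+       #SERVERPICKER-END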
+
+**To configure Monit:**   
+   
+1. Copy the required block for each required service.
+2. Modify all service names in the block.
+3. Copy the configured **monitrc** file to the **/etc/monit.d/** directory:
+
+   .. code-block:: console
+     
+      $ cp monitrc /etc/monit.d/
+       
+4. Set the file permissions to **600** (read and write access for the file owner only):
+ 
+    .. code-block:: console
+
+       $ sudo chmod 600 /etc/monit.d/monitrc
+       
+5. Reload the systemd manager configuration to register the changes:
+ 
+    .. code-block:: console
+     
+       $ sudo systemctl daemon-reload
+ 
+6. **Optional** - Navigate to the **/etc/sqream** directory and create a symbolic link to the **monitrc** file:
+ 
+    .. code-block:: console
+     
+      $ cd /etc/sqream
+      $ sudo ln -s /etc/monit.d/monitrc monitrc    
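+
+7. **Optional** - Validate the control file syntax before starting Monit. The following check assumes the default ``/etc/monitrc`` control file includes the **/etc/monit.d/** directory:
+
+    .. code-block:: console
+
+       $ sudo monit -t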
+         
+Starting Monit
+==============  
+
+After configuring Monit, you can start it.
+
+**To start Monit:**
+
+1. Start Monit as a superuser:
+
+   .. code-block:: console
+     
+      $ sudo systemctl start monit   
+ 
+2. View Monit's service status:
+
+   .. code-block:: console
+     
+      $ sudo systemctl status monit
+
+3. If Monit is functioning correctly, enable the Monit service to start on boot:
+    
+   .. code-block:: console
+     
+      $ sudo systemctl enable monit
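+
+Once Monit is running, you can also query the state of the monitored services with Monit's own CLI. The following is a sketch, assuming the Monit HTTP interface is enabled in **monitrc** (the CLI requires it):
+
+.. code-block:: console
+
+   $ sudo monit summary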
diff --git a/installation_guides/installing_nginx_proxy_over_secure_connection.rst b/installation_guides/installing_nginx_proxy_over_secure_connection.rst
new file mode 100644
index 000000000..e4290b6fc
--- /dev/null
+++ b/installation_guides/installing_nginx_proxy_over_secure_connection.rst
@@ -0,0 +1,385 @@
+.. _installing_nginx_proxy_over_secure_connection:
+
+**************************************************
+Installing an NGINX Proxy Over a Secure Connection
+**************************************************
+
+Configuring your NGINX server to use strong encryption for client connections secures server requests, preventing outside parties from gaining access to your traffic.
+
+The Node.js platform that SQreamDB uses with our Studio user interface is susceptible to web exposure. This page describes how to implement HTTPS access on your proxy server to establish a secure connection.
+
+**TLS (Transport Layer Security)** and its predecessor, **SSL (Secure Sockets Layer)**, are standard web protocols used for wrapping normal traffic in a protected, encrypted layer. This technology prevents the interception of server-client traffic. It also uses a certificate system that helps users verify the identity of sites they visit. The **Installing an NGINX Proxy Over a Secure Connection** guide describes how to set up a self-signed SSL certificate for use with an NGINX web server on a RHEL server.
+
+.. note:: A self-signed certificate encrypts communication between your server and any clients. However, because it is not signed by trusted certificate authorities included with web browsers, you cannot use the certificate to automatically validate the identity of your server.
+
+A self-signed certificate may be appropriate if your domain name is not associated with your server, and in cases where your encrypted web interface is not user-facing. If you do have a domain name, using a CA-signed certificate is generally preferable.
+
+
+.. contents::
+   :local:
+   :depth: 1
+
+Prerequisites
+=============
+
+The following prerequisites are required for installing an NGINX proxy over a secure connection:
+
+* Super user privileges
+   
+* A domain name to create a certificate for
+
+Installing NGINX and Adjusting the Firewall
+===========================================
+
+After confirming the above prerequisites, you must install the NGINX web server on your machine.
+
+Though NGINX is not available in the default RHEL repositories, it is available from the **EPEL (Extra Packages for Enterprise Linux)** repository.
+
+**To install NGINX and adjust the firewall:**
+
+1. Enable the EPEL repository to enable server access to the NGINX package:
+
+   .. code-block:: console
+
+      sudo yum install epel-release
+
+2. Install NGINX:
+
+   .. code-block:: console
+
+      sudo yum install nginx
+ 
+3. Start the NGINX service:
+
+   .. code-block:: console
+
+      sudo systemctl start nginx
+ 
+4. Verify that the service is running:
+
+   .. code-block:: console
+
+      systemctl status nginx
+
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      ● nginx.service - The nginx HTTP and reverse proxy server
+         Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
+         Active: active (running) since Fri 2017-01-06 17:27:50 UTC; 28s ago
+
+5. Enable NGINX to start when your server boots up:
+
+   .. code-block:: console
+
+      sudo systemctl enable nginx
+ 
+6. Verify that access to **ports 80 and 443** is not blocked by a firewall.
+
+	
+7. Do one of the following:
+
+   * If you are not using a firewall, skip to :ref:`creating_your_ssl_certificate`.
+
+	  
+   * If you have a running firewall, open ports 80 and 443:
+
+     .. code-block:: console
+
+        sudo firewall-cmd --add-service=http
+        sudo firewall-cmd --add-service=https
+        sudo firewall-cmd --runtime-to-permanent 
+
+8. If you have a running **iptables firewall**, for a basic rule set, add HTTP and HTTPS access:
+
+   .. code-block:: console
+
+      sudo iptables -I INPUT -p tcp -m tcp --dport 80 -j ACCEPT
+      sudo iptables -I INPUT -p tcp -m tcp --dport 443 -j ACCEPT
+
+   .. note:: The commands in Step 8 above are highly dependent on your current rule set.
+
+9. Verify that you can access the default NGINX page from a web browser.
+
+.. _creating_your_ssl_certificate:
+
+Creating Your SSL Certificate
+=============================
+
+After installing NGINX and adjusting your firewall, you must create your SSL certificate.
+
+TLS/SSL combines public certificates with private keys. The SSL key, kept private on your server, is used to encrypt content sent to clients, while the SSL certificate is publicly shared with anyone requesting content. In addition, the SSL certificate can be used to decrypt the content signed by the associated SSL key. Your public certificate is located in the **/etc/ssl/certs** directory on your server.
+
+This section describes how to create the **/etc/ssl/private** directory, used for storing your private key file. Because the privacy of this key is essential for security, the permissions must be locked down to prevent unauthorized access.
+
+**To create your SSL certificate:**
+
+1. Create the private key directory and restrict its permissions:
+
+   .. code-block:: console
+
+      sudo mkdir /etc/ssl/private
+      sudo chmod 700 /etc/ssl/private
+ 
+2. Create a self-signed key and certificate pair with OpenSSL using the following command:
+
+   .. code-block:: console
+
+      sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/ssl/private/nginx-selfsigned.key -out /etc/ssl/certs/nginx-selfsigned.crt
+ 
+   The following list describes the elements in the command above:
+   
+   * **openssl** - The basic command line tool used for creating and managing OpenSSL certificates, keys, and other files.
+
+
+   * **req** - A subcommand for managing X.509 **Certificate Signing Requests (CSRs)**. X.509 is a public key infrastructure standard that SSL and TLS adhere to for key and certificate management.
+
+
+   * **-x509** - Modifies the ``req`` subcommand to output a self-signed certificate instead of generating a certificate signing request.
+
+
+   * **-nodes** - Sets **OpenSSL** to skip securing the certificate with a passphrase, letting NGINX read the file without user intervention when the server starts. If you do not use **-nodes**, you must enter your passphrase after every restart.
+
+   * **-days 365** - Sets the certificate's validity period to one year.
+
+
+   * **-newkey rsa:2048** - Simultaneously generates a new certificate and a new key. Because the key required to sign the certificate was not created in a previous step, it must be created along with the certificate. The **rsa:2048** option generates an RSA key 2048 bits long.
+
+
+   * **-keyout** - Determines the location of the generated private key file.
+
+
+   * **-out** - Determines the location of the certificate.
+
+   After creating a self-signed key and certificate pair with OpenSSL, a series of prompts about your server is presented so that the information you provide is correctly embedded in the certificate.
+
+3. Provide the information requested by the prompts.
+
+   The most important piece of information is the **Common Name**, which is either the server **FQDN** or **your** name. You must enter the domain name associated with your server or your server’s public IP address.
+
+   The following is an example of a filled out set of prompts:
+
+   .. code-block:: console
+
+      Country Name (2 letter code) [AU]:US
+      State or Province Name (full name) [Some-State]:New York
+      Locality Name (eg, city) []:New York City
+      Organization Name (eg, company) [Internet Widgits Pty Ltd]:Bouncy Castles, Inc.
+      Organizational Unit Name (eg, section) []:Ministry of Water Slides
+      Common Name (e.g. server FQDN or YOUR name) []:server_IP_address
+      Email Address []:admin@your_domain.com
+
+   Both files you create are stored in their own subdirectories of the **/etc/ssl** directory.
+
+   In addition to using OpenSSL, we recommend creating a strong **Diffie-Hellman** group, used for negotiating **Perfect Forward Secrecy** with clients.
+   
+4. Create a strong Diffie-Hellman group:
+
+   .. code-block:: console
+
+      sudo openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048
+ 
+   Creating a Diffie-Hellman group takes a few minutes. The result is stored as the **dhparam.pem** file in the **/etc/ssl/certs** directory and can be used in the configuration.
+   
+Configuring NGINX to use SSL
+============================
+
+After creating your SSL certificate, you must configure NGINX to use SSL.
+
+The default RHEL NGINX configuration is fairly unstructured, with the default HTTP server block located in the main configuration file. NGINX checks for files ending in **.conf** in the **/etc/nginx/conf.d** directory for additional configuration.
+
+SQreamDB creates a new file in the **/etc/nginx/conf.d** directory to configure a server block. This block serves content using the certificate files we generated. In addition, the default server block can be optionally configured to redirect HTTP requests to HTTPS.
+
+.. note:: The example on this page uses the IP address **127.0.0.1**, which you should replace with your machine's IP address.
+
+**To configure NGINX to use SSL:**
+
+1. Create and open a file called **ssl.conf** in the **/etc/nginx/conf.d** directory:
+
+   .. code-block:: console
+
+      sudo vi /etc/nginx/conf.d/ssl.conf
+
+2. In the file you created in Step 1 above, open a server block:
+
+   1. Listen to **port 443**, which is the TLS/SSL default port.
+   
+   
+   2. Set the ``server_name`` to the server’s domain name or IP address you used as the Common Name when generating your certificate.
+
+	   
+   3. Use the ``ssl_certificate``, ``ssl_certificate_key``, and ``ssl_dhparam`` directives to set the location of the SSL files you generated, as shown in the **/etc/nginx/conf.d/ssl.conf** file below:
+   
+   .. code-block:: console
+
+      upstream ui {
+          server 127.0.0.1:8080;
+      }
+
+      server {
+          listen 443 http2 ssl;
+          listen [::]:443 http2 ssl;
+
+          server_name nginx.sq.l;
+
+          ssl_certificate /etc/ssl/certs/nginx-selfsigned.crt;
+          ssl_certificate_key /etc/ssl/private/nginx-selfsigned.key;
+          ssl_dhparam /etc/ssl/certs/dhparam.pem;
+
+          root /usr/share/nginx/html;
+
+          location / {
+              proxy_pass http://ui;
+              proxy_set_header           X-Forwarded-Proto https;
+              proxy_set_header           X-Forwarded-For $proxy_add_x_forwarded_for;
+              proxy_set_header           X-Real-IP       $remote_addr;
+              proxy_set_header           Host $host;
+              add_header                 Front-End-Https   on;
+              add_header                 X-Cache-Status $upstream_cache_status;
+              proxy_cache                off;
+              proxy_cache_revalidate     off;
+              proxy_cache_min_uses       1;
+              proxy_cache_valid          200 302 1h;
+              proxy_cache_valid          404 3s;
+              proxy_cache_use_stale      error timeout invalid_header updating http_500 http_502 http_503 http_504;
+              proxy_no_cache             $cookie_nocache $arg_nocache $arg_comment $http_pragma $http_authorization;
+              proxy_redirect             default;
+              proxy_max_temp_file_size   0;
+              proxy_connect_timeout      90;
+              proxy_send_timeout         90;
+              proxy_read_timeout         90;
+              proxy_buffer_size          4k;
+              proxy_buffering            on;
+              proxy_buffers              4 32k;
+              proxy_busy_buffers_size    64k;
+              proxy_temp_file_write_size 64k;
+              proxy_intercept_errors     on;
+
+              proxy_set_header           Upgrade $http_upgrade;
+              proxy_set_header           Connection "upgrade";
+          }
+
+          error_page 404 /404.html;
+          location = /404.html {
+          }
+
+          error_page 500 502 503 504 /50x.html;
+          location = /50x.html {
+          }
+      }
+ 
+3. Open and modify the **nginx.conf** file located in the **/etc/nginx/conf.d** directory as follows:
+
+   .. code-block:: console
+
+      sudo vi /etc/nginx/conf.d/nginx.conf
+	 
+   .. code-block:: console      
+
+       server {
+           listen       80;
+           listen       [::]:80;
+           server_name  _;
+           root         /usr/share/nginx/html;
+
+           # Load configuration files for the default server block.
+           include /etc/nginx/default.d/*.conf;
+
+           error_page 404 /404.html;
+           location = /404.html {
+           }
+
+           error_page 500 502 503 504 /50x.html;
+           location = /50x.html {
+           }
+       }
+	   
+Redirecting Studio Access from HTTP to HTTPS
+============================================
+
+After configuring NGINX to use SSL, you must redirect Studio access from HTTP to HTTPS.
+
+According to your current configuration, NGINX responds with encrypted content for requests on port 443, but with **unencrypted** content for requests on **port 80**. This means that our site offers encryption, but does not enforce its usage. This may be fine for some use cases, but it is usually better to require encryption. This is especially important when confidential data like passwords may be transferred between the browser and the server.
+
+The default NGINX configuration file allows us to easily add directives to the default port 80 server block by adding files to the **/etc/nginx/default.d** directory.
+
+**To create a redirect from HTTP to HTTPS:**
+
+1. Create a new file called **ssl-redirect.conf** and open it for editing:
+
+   .. code-block:: console
+
+      sudo vi /etc/nginx/default.d/ssl-redirect.conf
+
+2. Copy and paste this line:
+
+   .. code-block:: console
+
+      return 301 https://$host$request_uri;
+	  
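+After restarting NGINX (described in the next section), you can verify the redirect with a plain HTTP request. The following check uses ``curl``, assuming your server answers on port 80; a successful redirect returns a **301 Moved Permanently** response with a **Location** header pointing to the HTTPS URL:
+
+.. code-block:: console
+
+   curl -I http://server_IP_address/
+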
+Activating Your NGINX Configuration
+===================================
+
+After redirecting from HTTP to HTTPS, you must restart NGINX to activate your new configuration.
+
+**To activate your NGINX configuration:**
+
+1. Verify that your files contain no syntax errors:
+
+   .. code-block:: console
+
+      sudo nginx -t
+   
+   The following output is generated if your files contain no syntax errors:
+
+   .. code-block:: console
+
+      nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
+      nginx: configuration file /etc/nginx/nginx.conf test is successful
+
+2. Restart NGINX to activate your configuration:
+
+   .. code-block:: console
+
+      sudo systemctl restart nginx
+
+Verifying that NGINX is Running
+===============================
+
+After activating your NGINX configuration, you must verify that NGINX is running correctly.
+
+**To verify that NGINX is running correctly:**
+
+1. Check that the service is up and running:
+
+   .. code-block:: console
+
+      systemctl status nginx
+  
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      ● nginx.service - The nginx HTTP and reverse proxy server
+         Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
+         Active: active (running) since Fri 2017-01-06 17:27:50 UTC; 28s ago
+
+ 
+2. Run the following command:
+
+   .. code-block:: console
+
+      sudo netstat -nltp |grep nginx
+ 
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      [sqream@dorb-pc etc]$ sudo netstat -nltp |grep nginx
+      tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      15486/nginx: master 
+      tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      15486/nginx: master 
+      tcp6       0      0 :::80                   :::*                    LISTEN      15486/nginx: master 
+      tcp6       0      0 :::443                  :::*                    LISTEN      15486/nginx: master
\ No newline at end of file
diff --git a/installation_guides/installing_prometheus_exporters.rst b/installation_guides/installing_prometheus_exporters.rst
deleted file mode 100644
index a1381fea3..000000000
--- a/installation_guides/installing_prometheus_exporters.rst
+++ /dev/null
@@ -1,196 +0,0 @@
-.. _installing_prometheus_exporters:
-
-*********************************************
-Installing Prometheus Exporter
-*********************************************
-
-The **Installing Prometheus Exporters** guide includes the following sections:
-
-.. contents::
-   :local:
-   :depth: 1
-
-Overview
-==============================
-The **Prometheus** exporter is an open-source systems monitoring and alerting toolkit. It is used for collecting metrics from an operating system and exporting them to a graphic user interface. 
-
-The Installing Prometheus Exporters guide describes how to installing the following exporters:
-
-* The **Node_exporter** - the basic exporter used for displaying server metrics, such as CPU and memory.
-
-* The **Nvidia_exporter** - shows Nvidia GPU metrics.
-
-* The **process_exporter** - shows data belonging to the server's running processes.
-
-For information about more exporters, see `Exporters and Integration `_
-
-Adding a User and Group
-=====================
-Adding a user and group determines who can run processes.
-
-You can add users with the following command:
-
-.. code-block:: console
-     
-   $ sudo groupadd --system prometheus
-	  
-You can add groups with the following command:
-
-.. code-block:: console
-     
-   $ sudo useradd -s /sbin/nologin --system -g prometheus prometheus
-
-Cloning the Prometheus GIT Project
-=====================
-After adding a user and group you must clone the Prometheus GIT project.
-
-You can clone the Prometheus GIT project with the following command:
-
-.. code-block:: console
-     
-   $ git clone http://gitlab.sq.l/IT/promethues.git prometheus
-	  
-.. note:: If you experience difficulties cloning the Prometheus GIT project or receive an error, contact your IT department.
-
-The following shows the result of cloning your Prometheus GIT project:
-
-.. code-block:: console
-     
-   $ prometheus/
-   $ ├── node_exporter
-   $ │   └── node_exporter
-   $ ├── nvidia_exporter
-   $ │   └── nvidia_exporter
-   $ ├── process_exporter
-   $ │   └── process-exporter_0.5.0_linux_amd64.rpm
-   $ ├── README.md
-   $ └── services
-   $     ├── node_exporter.service
-   $     └── nvidia_exporter.service	  
-	  
-Installing the Node Exporter and NVIDIA Exporter
-=====================
-After cloning the Prometheus GIT project you must install the **node_exporter** and **NVIDIA_exporter**.
-
-**To install the node_exporter and NVIDIA_exporter:**
-
-1. Navigate to the cloned folder:
-
-   .. code-block:: console
-     
-      $ cd prometheus
-   
-2. Copy **node_exporter** and **nvidia_exporter** to **/usr/bin/**.	  
-
-   .. code-block:: console
-     
-      $ sudo cp node_exporter/node_exporter /usr/bin/
-      $ sudo cp nvidia_exporter/nvidia_exporter /usr/bin/
-   	  
-3. Copy the **services** files to the services folder:	  
-
-   .. code-block:: console
-     
-      $ sudo cp services/node_exporter.service /etc/systemd/system/
-      $ sudo cp services/nvidia_exporter.service /etc/systemd/system/
-   	  
-4. Reload the services so that they can be run:	  
-
-   .. code-block:: console
-     
-      $ sudo systemctl daemon-reload  
-   	  
-5. Set the permissions and group for both service files:
-
-   .. code-block:: console
-     
-      $ sudo chown prometheus:prometheus /usr/bin/node_exporter
-      $ sudo chmod u+x /usr/bin/node_exporter
-      $ sudo chown prometheus:prometheus /usr/bin/nvidia_exporter
-      $ sudo chmod u+x /usr/bin/nvidia_exporter
-   
-6. Start both services:
-
-   .. code-block:: console
-     
-      $ sudo systemctl start node_exporter && sudo systemctl enable node_exporter
-   
-7. Set both services to start automatically when the server is booted up:
-
-   .. code-block:: console
-
-      $ sudo systemctl start nvidia_exporter && sudo systemctl enable nvidia_exporter
-   
-8. Verify that the server's status is **active (running)**:
-
-   .. code-block:: console
-     
-      $ sudo systemctl status node_exporter && sudo systemctl status nvidia_exporter
-   
-   The following is the correct output:
-
-   .. code-block:: console
-     
-      $ ● node_exporter.service - Node Exporter
-      $    Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
-      $    Active: active (running) since Wed 2019-12-11 12:28:31 IST; 1 months 5 days ago
-      $  Main PID: 28378 (node_exporter)
-      $    CGroup: /system.slice/node_exporter.service
-      $ 
-      $ ● nvidia_exporter.service - Nvidia Exporter
-      $    Loaded: loaded (/etc/systemd/system/nvidia_exporter.service; enabled; vendor preset: disabled)
-      $    Active: active (running) since Wed 2020-01-22 13:40:11 IST; 31min ago
-      $  Main PID: 1886 (nvidia_exporter)
-      $    CGroup: /system.slice/nvidia_exporter.service
-      $            └─1886 /usr/bin/nvidia_exporter
-   	  
-Installing the Process Exporter
-=====================
-After installing the **node_exporter** and **Nvidia_exporter** you must install the **process_exporter**.
-
-**To install the process_exporter:**
-
-1. Do one of the following:
-
-   * For **CentOS**, run ``sudo rpm -i process_exporter/process-exporter_0.5.0_linux_amd64.rpm``.
-   * For **Ubuntu**, run ``sudo dpkg -i process_exporter/process-exporter_0.6.0_linux_amd64.deb``.
-   
-2. Verify that the process_exporter is running:
-
-   .. code-block:: console
-     
-      $ sudo systemctl status process-exporter  
-	  
-3. Set the process_exporter to start automatically when the server is booted up:
-	  
-   .. code-block:: console
-     
-      $ sudo systemctl enable process-exporter
-	  
-Opening the Firewall Ports
-=====================
-After installing the **process_exporter** you must open the firewall ports for the following services:
-
-* **node_exporter** - port: 9100
-
-* **nvidia_exporter** - port: 9445
-
-* **process-exporter** - port: 9256
-
-.. note:: This procedure is only relevant if your firwall is running.
-
-**To open the firewall ports:**
-
-1. Run the following command:
-	  
-   .. code-block:: console
-     
-      $ sudo firewall-cmd --zone=public --add-port=/tcp --permanent
-	  
-2. Reload the firewall:
-	  
-   .. code-block:: console
-     
-      $ sudo firewall-cmd --reload
-	  
-3. Verify that the changes have taken effect.
\ No newline at end of file
diff --git a/installation_guides/installing_prometheus_using_binary_packages.rst b/installation_guides/installing_prometheus_using_binary_packages.rst
deleted file mode 100644
index a6104bdd0..000000000
--- a/installation_guides/installing_prometheus_using_binary_packages.rst
+++ /dev/null
@@ -1,241 +0,0 @@
-.. _installing_prometheus_using_binary_packages:
-
-.. _install_prometheus_binary_top:
-
-***********************
-Installing Prometheus Using Binary Packages
-***********************
-
-
-
-The **Installing Prometheus Using Binary Packages** guide includes the following sections:
-
-.. contents::
-   :local:
-   :depth: 1
-
-Overview
-^^^^^^^^^^^^^^^
-Prometheus is an application used for event monitoring and alerting.
-
-Installing Prometheus
-^^^^^^^^^^^^^^^
-You must install Prometheus before installing the Dashboard Data Collector.
-
-**To install Prometheus:**
-
-1. Verify the following:
-
-   1. That you have **sudo** access to your Linux server.
-   2. That your server has access to the internet (for downloading the Prometheus binary package).
-   3. That your firewall rules are opened for accessing Prometheus Port 9090.
-   
-2. Navigate to the Prometheus `Download `_ page and download the **prometheus-2.32.0-rc.1.linux-amd64.tar.gz** package.
-
-    ::
-
-3. Do the following:
-
-   1. Download the source using the ``curl`` command:
-
-      .. code-block:: console
-     
-         $ curl -LO url -LO https://github.com/prometheus/prometheus/releases/download/v2.22.0/prometheus-2.22.0.linux-amd64.tar.gz
-
-   2. Extract the file contents:
-
-      .. code-block:: console
-     
-         $ tar -xvf prometheus-2.22.0.linux-amd64.tar.gz
-
-   3. Rename the extracted folder **prometheus-files**:
-
-      .. code-block:: console
-     
-         $ mv prometheus-2.22.0.linux-amd64 prometheus-files
-
-4. Create a Prometheus user:
-
-   .. code-block:: console
-     
-      $ sudo useradd --no-create-home --shell /bin/false prometheus
-
-5. Create your required directories:
-
-   .. code-block:: console
-     
-      $ sudo mkdir /etc/prometheus
-      $ sudo mkdir /var/lib/prometheus
-	  
-6. Set the Prometheus user as the owner of your required directories:
-
-   .. code-block:: console
-     
-      $ sudo chown prometheus:prometheus /etc/prometheus
-      $ sudo chown prometheus:prometheus /var/lib/prometheus
-	  
-7. Copy the Prometheus and Promtool binary packages from the **prometheus-files** folder to **/usr/local/bin**:
-
-   .. code-block:: console
-     
-      $ sudo cp prometheus-files/prometheus /usr/local/bin/
-      $ sudo cp prometheus-files/promtool /usr/local/bin/
-
-8. Change the ownership to the prometheus user:
-
-   .. code-block:: console
-     
-      $ sudo chown prometheus:prometheus /usr/local/bin/prometheus
-      $ sudo chown prometheus:prometheus /usr/local/bin/promtool
-
-9. Move the **consoles** and **consoles_libraries** directories from **prometheus-files** folder to **/etc/prometheus** folder:
-
-   .. code-block:: console
-     
-      $ sudo cp -r prometheus-files/consoles /etc/prometheus
-	  $ sudo cp -r prometheus-files/console_libraries /etc/prometheus
-
-10. Change the ownership to the prometheus user:
-
-    .. code-block:: console
-     
-       $ sudo chown -R prometheus:prometheus /etc/prometheus/consoles
-       $ sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries
-
-For more information on installing the Dashboard Data Collector, see `Installing the Dashboard Data Collector `_.
-
-Back to :ref:`Installing Prometheus Using Binary Packages`
-
-Configuring Your Prometheus Settings
-^^^^^^^^^^^^^^^
-After installing Prometheus you must configure your Prometheus settings. You must perform all Prometheus configurations in the **/etc/prometheus/prometheus.yml** file.
-
-**To configure your Prometheus settings:**
-
-1. Create your **prometheus.yml** file:
-
-   .. code-block:: console
-     
-      $ sudo vi /etc/prometheus/prometheus.yml
-	  
-2. Copy the contents below into your prometheus.yml file:
-
-   .. code-block:: console
-     
-      $ #node_exporter port : 9100
-      $ #nvidia_exporter port: 9445
-      $ #process-exporter port: 9256
-      $ 
-      $ global:
-      $   scrape_interval: 10s
-      $ 
-      $ scrape_configs:
-      $   - job_name: 'prometheus'
-      $     scrape_interval: 5s
-      $     static_configs:
-      $       - targets:
-      $         - :9090
-      $   - job_name: 'processes'
-      $     scrape_interval: 5s
-      $     static_configs:
-      $       - targets:
-      $         - :9256
-      $         - :9256
-      $   - job_name: 'nvidia'
-      $     scrape_interval: 5s
-      $     static_configs:
-      $       - targets:
-      $         - :9445
-      $         - :9445
-      $   - job_name: 'nodes'
-      $     scrape_interval: 5s
-      $     static_configs:
-      $       - targets:
-      $         - :9100
-      $         - :9100
-  
-3. Change the ownership of the file to the prometheus user:
-
-   .. code-block:: console
-     
-      $ sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
-	  
-Back to :ref:`Installing Prometheus Using Binary Packages`
-
-Configuring Your Prometheus Service File	  
-^^^^^^^^^^^^^^^
-After configuring your Prometheus settings you must configure your Prometheus service file.
-
-**To configure your Prometheus service file**:
-
-1. Create your **prometheus.yml** file:
-
-   .. code-block:: console
-     
-      $ sudo vi /etc/systemd/system/prometheus.service
-	  
-2. Copy the contents below into your prometheus service file:
-
-   .. code-block:: console
-     
-      $ [Unit]
-      $ Description=Prometheus
-      $ Wants=network-online.target
-      $ After=network-online.target
-      $ 
-      $ [Service]
-      $ User=prometheus
-      $ Group=prometheus
-      $ Type=simple
-      $ ExecStart=/usr/local/bin/prometheus \
-      $     --config.file /etc/prometheus/prometheus.yml \
-      $     --storage.tsdb.path /var/lib/prometheus/ \
-      $     --web.console.templates=/etc/prometheus/consoles \
-      $     --web.console.libraries=/etc/prometheus/console_libraries
-      $ 
-      $ [Install]
-      $ WantedBy=multi-user.target
-
-3. Register the prometheus service by reloading the **systemd** service:
-
-   .. code-block:: console
-     
-      $ sudo systemctl daemon-reload
-	  
-4. Start the prometheus service:
-
-   .. code-block:: console
-     
-      $ sudo systemctl start prometheus
-
-5. Check the status of the prometheus service:
-
-   .. code-block:: console
-     
-      $ sudo systemctl status prometheus
-	  
- If the status is ``active (running)``, you have configured your Prometheus service file correctly.
-
-Back to :ref:`Installing Prometheus Using Binary Packages`
-
-Accessing the Prometheus User Interface
-^^^^^^^^^^^^^^^
-After configuring your prometheus service file, you can access the Prometheus user interface.
-
-You can access the Prometheus user interface by running the following command:
-
-.. code-block:: console
-     
-   $ http://:9090/graph
-
-The Prometheus user interface is displayed.
-
-From the **Query** tab you can query metrics, as shown below:
-
-.. list-table::
-   :widths: auto
-   :header-rows: 0
-   
-   * - .. image:: /_static/images/3c9c4e8b-49bd-44a8-9829-81d1772ed962.gif   
-
-Back to :ref:`Installing Prometheus Using Binary Packages`
diff --git a/installation_guides/installing_sqream_with_binary.rst b/installation_guides/installing_sqream_with_binary.rst
index dd1207ab7..9c79c1688 100644
--- a/installation_guides/installing_sqream_with_binary.rst
+++ b/installation_guides/installing_sqream_with_binary.rst
@@ -1,279 +1,175 @@
-.. _installing_sqream_with_binary:
-
-*********************************************
-Installing SQream Using Binary Packages
-*********************************************
-This procedure describes how to install SQream using Binary packages and must be done on all servers.
-
-**To install SQream using Binary packages:**
-
-1. Copy the SQream package to the **/home/sqream** directory for the current version:
-
-   .. code-block:: console
-   
-      $ tar -xf sqream-db-v<2020.2>.tar.gz
-
-2. Append the version number to the name of the SQream folder. The version number in the following example is **v2020.2**:
-
-   .. code-block:: console
-   
-      $ mv sqream sqream-db-v<2020.2>
-
-3. Move the new version of the SQream folder to the **/usr/local/** directory:
-
-   .. code-block:: console
-   
-      $ sudo mv sqream-db-v<2020.2> /usr/local/
-      
-4. Change the ownership of the folder to **sqream folder**:
-
-   .. code-block:: console
-   
-      $ sudo chown -R sqream:sqream  /usr/local/sqream-db-v<2020.2>
-
-5. Navigate to the **/usr/local/** directory and create a symbolic link to SQream:
-
-   .. code-block:: console
-   
-      $ cd /usr/local
-      $ sudo ln -s sqream-db-v<2020.2> sqream
-      
-6. Verify that the symbolic link that you created points to the folder that you created:
-
-   .. code-block:: console
-   
-      $ ls -l
-      
-7. Verify that the symbolic link that you created points to the folder that you created:
-
-   .. code-block:: console
-   
-      $ sqream -> sqream-db-v<2020.2>
-      
-8. Create the SQream configuration file destination folders and set their ownership to **sqream**:
-
-   .. code-block:: console
-   
-      $ sudo mkdir /etc/sqream
-      $ sudo chown -R sqream:sqream /etc/sqream
-      
-9. Create the SQream service log destination folders and set their ownership to **sqream**:
-
-   .. code-block:: console
-   
-      $ sudo mkdir /var/log/sqream
-      $ sudo chown -R sqream:sqream /var/log/sqream
-
-10. Navigate to the **/usr/local/** directory and copy the SQream configuration files from them:
-
-   .. code-block:: console
-   
-      $ cd /usr/local/sqream/etc/
-      $ cp * /etc/sqream
-      
-The configuration files are **service configuration files**, and the JSON files are **SQream configuration files**, for a total of four files. The number of SQream configuration files and JSON files must be identical.
-      
-**NOTICE** - Verify that the JSON files have been configured correctly and that all required flags have been set to the correct values.
-
-In each JSON file, the following parameters **must be updated**:
-
-* instanceId
-* machineIP
-* metadataServerIp
-* spoolMemoryGB
-* limitQueryMemoryGB
-* gpu
-* port
-* ssl_port
-
-Note the following:
-
-* The value of the **metadataServerIp** parameter must point to the IP that the metadata is running on.
-* The value of the **machineIP** parameter must point to the IP of your local machine.
-
-It would be same on server running metadataserver and different on other server nodes.
-
-11. **Optional** - To run additional SQream services, copy the required configuration files and create additional JSON files:
-
-   .. code-block:: console
-   
-      $ cp sqream2_config.json sqream3_config.json
-      $ vim sqream3_config.json
-      
-**NOTICE:** A unique **instanceID** must be used in each JSON file. IN the example above, the instanceID **sqream_2** is changed to **sqream_3**.
-
-12. **Optional** - If you created additional services in **Step 11**, verify that you have also created their additional configuration files:
-
-    .. code-block:: console
-   
-       $ cp sqream2-service.conf sqream3-service.conf
-       $ vim sqream3-service.conf
-      
-13. For each SQream service configuration file, do the following:
-
-    1. Change the **SERVICE_NAME=sqream2** value to **SERVICE_NAME=sqream3**.
-    
-    2. Change **LOGFILE=/var/log/sqream/sqream2.log** to **LOGFILE=/var/log/sqream/sqream3.log**.
-    
-**NOTE:** If you are running SQream on more than one server, you must configure the ``serverpicker`` and ``metadatserver`` services to start on only one of the servers. If **metadataserver** is running on the first server, the ``metadataServerIP`` value in the second server's /etc/sqream/sqream1_config.json file must point to the IP of the server on which the ``metadataserver`` service is running.
-    
-14. Set up **servicepicker**:
-
-    1. Do the following:
-
-       .. code-block:: console
-   
-          $ vim /etc/sqream/server_picker.conf
-    
-    2. Change the IP **127.0.0.1** to the IP of the server that the **metadataserver** service is running on.    
-    
-    3. Change the **CLUSTER** to the value of the cluster path.
-     
-15. Set up your service files:      
-      
-    .. code-block:: console
-   
-       $ cd /usr/local/sqream/service/
-       $ cp sqream2.service sqream3.service
-       $ vim sqream3.service      
-       
-16. Increment each **EnvironmentFile=/etc/sqream/sqream2-service.conf** configuration file for each SQream service file, as shown below:
-
-    .. code-block:: console
-     
-       $ EnvironmentFile=/etc/sqream/sqream<3>-service.conf
-       
-17. Copy and register your service files into systemd:       
-       
-    .. code-block:: console
-     
-       $ sudo cp metadataserver.service /usr/lib/systemd/system/
-       $ sudo cp serverpicker.service /usr/lib/systemd/system/
-       $ sudo cp sqream*.service /usr/lib/systemd/system/
-       
-18. Verify that your service files have been copied into systemd:
-
-    .. code-block:: console
-     
-       $ ls -l /usr/lib/systemd/system/sqream*
-       $ ls -l /usr/lib/systemd/system/metadataserver.service
-       $ ls -l /usr/lib/systemd/system/serverpicker.service
-       $ sudo systemctl daemon-reload       
-       
-19. Copy the license into the **/etc/license** directory:
-
-    .. code-block:: console
-     
-       $ cp license.enc /etc/sqream/   
-
-       
-If you have an HDFS environment, see :ref:`Configuring an HDFS Environment for the User sqream `.
-
-
-
-
-
-
-Upgrading SQream Version
--------------------------
-Upgrading your SQream version requires stopping all running services while you manually upgrade SQream.
-
-**To upgrade your version of SQream:**
-
-1. Stop all actively running SQream services.
-
-**Notice-** All SQream services must remain stopped while the upgrade is in process. Ensuring that SQream services remain stopped depends on the tool being used.
-
-For an example of stopping actively running SQream services, see :ref:`Launching SQream with Monit `.
-
-
-      
-2. Verify that SQream has stopped listening on ports **500X**, **510X**, and **310X**:
-
-   .. code-block:: console
-
-      $ sudo netstat -nltp    #to make sure sqream stopped listening on 500X, 510X and 310X ports.
-
-3. Replace the old version ``sqream-db-v2020.2``, with the new version ``sqream-db-v2021.1``:
-
-   .. code-block:: console
-    
-      $ cd /home/sqream
-      $ mkdir tempfolder
-      $ mv sqream-db-v2021.1.tar.gz tempfolder/
-      $ tar -xf sqream-db-v2021.1.tar.gz
-      $ sudo mv sqream /usr/local/sqream-db-v2021.1
-      $ cd /usr/local
-      $ sudo chown -R sqream:sqream sqream-db-v2021.1
-   
-4. Remove the symbolic link:
-
-   .. code-block:: console
-   
-      $ sudo rm sqream
-   
-5. Create a new symbolic link named "sqream" pointing to the new version:
-
-   .. code-block:: console  
-
-      $ sudo ln -s sqream-db-v2021.1 sqream
-
-6. Verify that the symbolic SQream link points to the real folder:
-
-   .. code-block:: console  
-
-      $ ls -l
-	 
-   The following is an example of the correct output:
-
-   .. code-block:: console
-    
-      $ sqream -> sqream-db-v2021.1
-
-7. **Optional-** (for major versions) Upgrade your version of SQream storage cluster, as shown in the following example:
-
-   .. code-block:: console  
-
-      $ cat /etc/sqream/sqream1_config.json |grep cluster
-      $ ./upgrade_storage 
-	  
-   The following is an example of the correct output:
-	  
-   .. code-block:: console  
-
-	  get_leveldb_version path{}
-	  current storage version 23
-      upgrade_v24
-      upgrade_storage to 24
-	  upgrade_storage to 24 - Done
-	  upgrade_v25
-	  upgrade_storage to 25
-	  upgrade_storage to 25 - Done
-	  upgrade_v26
-	  upgrade_storage to 26
-	  upgrade_storage to 26 - Done
-	  validate_leveldb
-	  ...
-      upgrade_v37
-	  upgrade_storage to 37
-	  upgrade_storage to 37 - Done
-	  validate_leveldb
-      storage has been upgraded successfully to version 37
- 
-8. Verify that the latest version has been installed:
-
-   .. code-block:: console
-    
-      $ ./sqream sql --username sqream --password sqream --host localhost --databasename master -c "SELECT SHOW_VERSION();"
-      
-   The following is an example of the correct output:
- 
-   .. code-block:: console
-    
-      v2021.1
-      1 row
-      time: 0.050603s 
- 
-For more information, see the `upgrade_storage `_ command line program.
-
-For more information about installing Studio on a stand-alone server, see `Installing Studio on a Stand-Alone Server `_.
\ No newline at end of file
+.. _installing_sqream_with_binary:
+
+*********************************************
+Installing SQream Using Binary Packages
+*********************************************
+
+This procedure describes how to install SQream using binary packages and must be performed on all servers.
+
+**To install SQream using binary packages:**
+
+1. Copy the SQream package for the current version to the **/home/sqream** directory and extract it:
+
+   .. code-block:: console
+   
+      $ tar -xf sqream-db-v<2020.2>.tar.gz
+
+2. Append the version number to the name of the SQream folder. The version number in the following example is **v2020.2**:
+
+   .. code-block:: console
+   
+      $ mv sqream sqream-db-v<2020.2>
+
+3. Move the new version of the SQream folder to the **/usr/local/** directory:
+
+   .. code-block:: console
+   
+      $ sudo mv sqream-db-v<2020.2> /usr/local/
+      
+4. Change the ownership of the folder to the **sqream** user:
+
+   .. code-block:: console
+   
+      $ sudo chown -R sqream:sqream  /usr/local/sqream-db-v<2020.2>
+
+5. Navigate to the **/usr/local/** directory and create a symbolic link to SQream:
+
+   .. code-block:: console
+   
+      $ cd /usr/local
+      $ sudo ln -s sqream-db-v<2020.2> sqream
+      
+6. Verify that the symbolic link that you created points to the folder that you created:
+
+   .. code-block:: console
+   
+      $ ls -l
+      
+7. Confirm that the output shows the symbolic link pointing to the new folder:
+
+   .. code-block:: console
+   
+      $ sqream -> sqream-db-v<2020.2>
+      
+8. Create the SQream configuration file destination folders and set their ownership to **sqream**:
+
+   .. code-block:: console
+   
+      $ sudo mkdir /etc/sqream
+      $ sudo chown -R sqream:sqream /etc/sqream
+      
+9. Create the SQream service log destination folders and set their ownership to **sqream**:
+
+   .. code-block:: console
+   
+      $ sudo mkdir /var/log/sqream
+      $ sudo chown -R sqream:sqream /var/log/sqream
+
+10. Navigate to the **/usr/local/sqream/etc/** directory and copy the SQream configuration files from it:
+
+   .. code-block:: console
+   
+      $ cd /usr/local/sqream/etc/
+      $ cp * /etc/sqream
+      
+The ``.conf`` files are **service configuration files**, and the JSON files are **SQream configuration files**, for a total of four files. The number of service configuration files and JSON files must be identical.
+      
+.. note:: Verify that the JSON files have been configured correctly and that all required flags have been set to the correct values.
+
+In each JSON file, the following parameters **must be updated**:
+
+* instanceId
+* machineIP
+* metadataServerIp
+* spoolMemoryGB
+* limitQueryMemoryGB
+* gpu
+* port
+* ssl_port
+
+See how to :ref:`configure ` the Spool Memory and Limit Query Memory.
+
+Note the following:
+
+* The value of the **metadataServerIp** parameter must point to the IP that the metadata is running on.
+* The value of the **machineIP** parameter must point to the IP of your local machine.
+
+The **metadataServerIp** value is identical to the **machineIP** value on the server running **metadataserver**, and differs from it on the other server nodes.
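+
+The following is a minimal sketch of how these parameters might look in one service's JSON file. The values shown (IP addresses, memory sizes, GPU ordinal, and ports) are illustrative assumptions, and the real file contains additional flags:
+
+.. code-block:: json
+
+   {
+       "instanceId": "sqream_1",
+       "machineIP": "192.168.0.11",
+       "metadataServerIp": "192.168.0.10",
+       "spoolMemoryGB": 64,
+       "limitQueryMemoryGB": 120,
+       "gpu": 0,
+       "port": 5000,
+       "ssl_port": 5100
+   }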
+
+11. **Optional** - To run additional SQream services, copy the required configuration files and create additional JSON files:
+
+   .. code-block:: console
+   
+      $ cp sqream2_config.json sqream3_config.json
+      $ vim sqream3_config.json
+      
+.. note:: A unique **instanceID** must be used in each JSON file. In the example above, the instanceID **sqream_2** is changed to **sqream_3**.
+
+12. **Optional** - If you created additional services in **Step 11**, verify that you have also created their additional configuration files:
+
+    .. code-block:: console
+   
+       $ cp sqream2-service.conf sqream3-service.conf
+       $ vim sqream3-service.conf
+      
+13. For each SQream service configuration file, do the following:
+
+    1. Change the **SERVICE_NAME=sqream2** value to **SERVICE_NAME=sqream3**.
+    
+    2. Change **LOGFILE=/var/log/sqream/sqream2.log** to **LOGFILE=/var/log/sqream/sqream3.log**.
+    
+.. note:: If you are running SQream on more than one server, you must configure the ``serverpicker`` and ``metadataserver`` services to start on only one of the servers. If **metadataserver** is running on the first server, the ``metadataServerIP`` value in the second server's **/etc/sqream/sqream1_config.json** file must point to the IP of the server on which the ``metadataserver`` service is running.
+    
+14. Set up **serverpicker**:
+
+    1. Do the following:
+
+       .. code-block:: console
+   
+          $ vim /etc/sqream/server_picker.conf
+    
+    2. Change the IP **127.0.0.1** to the IP of the server that the **metadataserver** service is running on.    
+    
+    3. Change the **CLUSTER** to the value of the cluster path.
+     
+15. Set up your service files:      
+      
+    .. code-block:: console
+   
+       $ cd /usr/local/sqream/service/
+       $ cp sqream2.service sqream3.service
+       $ vim sqream3.service      
+       
+16. In each SQream service file, increment the **EnvironmentFile=/etc/sqream/sqream2-service.conf** entry to point to the matching service configuration file, as shown below:
+
+    .. code-block:: console
+     
+       $ EnvironmentFile=/etc/sqream/sqream<3>-service.conf
+       
+17. Copy and register your service files into systemd:       
+       
+    .. code-block:: console
+     
+       $ sudo cp metadataserver.service /usr/lib/systemd/system/
+       $ sudo cp serverpicker.service /usr/lib/systemd/system/
+       $ sudo cp sqream*.service /usr/lib/systemd/system/
+       
+18. Verify that your service files have been copied into systemd:
+
+    .. code-block:: console
+     
+       $ ls -l /usr/lib/systemd/system/sqream*
+       $ ls -l /usr/lib/systemd/system/metadataserver.service
+       $ ls -l /usr/lib/systemd/system/serverpicker.service
+       $ sudo systemctl daemon-reload       
+       
+19. Copy the license into the **/etc/sqream** directory:
+
+    .. code-block:: console
+     
+       $ cp license.enc /etc/sqream/   
+
+       
+If you have an HDFS environment, see :ref:`Configuring an HDFS Environment for the User sqream `.
+
+
diff --git a/installation_guides/installing_sqream_with_kubernetes.rst b/installation_guides/installing_sqream_with_kubernetes.rst
deleted file mode 100644
index 093f21ba3..000000000
--- a/installation_guides/installing_sqream_with_kubernetes.rst
+++ /dev/null
@@ -1,1805 +0,0 @@
-.. _installing_sqream_with_kubernetes:
-
-*********************************************
-Installing SQream with Kubernetes
-*********************************************
-**Kubernetes**, also known as **k8s**, is a portable open source platform that automates Linux container operations. Kubernetes supports outsourcing data centers to public cloud service providers or can be scaled for web hosting. SQream uses Kubernetes as an orchestration and recovery solution.
-
-The **Installing SQream with Kubernetes** guide describes the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-   
-.. _preparing_sqream_environment:
-   
-Preparing the SQream Environment to Launch SQream Using Kubernetes
-===============
-
-The **Preparing the SQream environment to Launch SQream Using Kubernetes** section describes the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-   
-Overview
---------------
-   
-A minimum of three servers is required for preparing the SQream environment using Kubernetes.
-
-Kubernetes uses clusters, which are sets of nodes running containterized applications. A cluster consists of at least two GPU nodes and one additional server without GPU to act as the quorum manager.
-
-Each server must have the following IP addresses:
-
-* An IP address located in the management network.
-* An additional IP address from the same subnet to function as a floating IP.
-
-All servers must be mounted in the same shared storage folder.
-
-The following list shows the server host name format requirements:
-
-* A maximum of 253 characters.
-* Only lowercase alphanumeric characters, such as ``-`` or ``.``.
-* Starts and ends with alphanumeric characters.
-
-Go back to :ref:`Preparing the SQream Environment to Launch SQream Using Kubernetes`
-
-
-Operating System Requirements
-------------------------------
-The required operating system is a version of x86 CentOS/RHEL between 7.6 and 7.9. Regarding PPC64le, the required version is RHEL 7.6.
-
-Go back to :ref:`Preparing the SQream Environment to Launch SQream Using Kubernetes`
-
-
-Compute Server Specifications
-------------------------------
-Installing SQream with Kubernetes includes the following compute server specifications:
-
-* **CPU:** 4 cores
-* **RAM:** 16GB
-* **HD:** 500GB
-
-Go back to :ref:`Preparing the SQream Environment to Launch SQream Using Kubernetes`
-
-.. _set_up_your_hosts:
-
-Setting Up Your Hosts
-===============================
-SQream requires you to set up your hosts. Setting up your hosts requires the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-
-Configuring the Hosts File
---------------------------------
-**To configure the /etc/hosts file:**
-
-1. Edit the **/etc/hosts** file:
-
-   .. code-block:: console
-
-      $ sudo vim /etc/hosts
-
-2. Call your local host:
-
-   .. code-block:: console
-
-      $ 127.0.0.1	localhost
-      $ 	
- 
-
-Installing the Required Packages
-----------------------------------
-The first step in setting up your hosts is to install the required packages.
-
-**To install the required packages:**
-
-1. Run the following command based on your operating system:
-
-   * RHEL:
-    
-    .. code-block:: postgres
-   
-       $ sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm    
- 
-   * CentOS:
-    
-    .. code-block:: postgres
-   
-       $ sudo yum install epel-release	   
-       $ sudo yum install pciutils openssl-devel python36 python36-pip kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc jq net-tools ntp
-
-2. Verify that that the required packages were successfully installed. The following is the correct output:
-    
-    .. code-block:: postgres
-   
-       ntpq --version
-       jq --version
-       python3 --version
-       pip3 --version
-       rpm -qa |grep kernel-devel-$(uname -r)
-       rpm -qa |grep kernel-headers-$(uname -r)
-       gcc --version
-
-3. Enable the **ntpd (Network Time Protocol daemon)** program on all servers:
-    
-    .. code-block:: postgres
-   
-       $ sudo systemctl start ntpd
-       $ sudo systemctl enable ntpd
-       $ sudo systemctl status ntpd
-       $ sudo ntpq -p
-	   
-Go back to :ref:`Setting Up Your Hosts`
-
-     
-Disabling the Linux UI
-----------------------------------
-After installing the required packages, you must disable the Linux UI if it has been installed.
-
-You can disable Linux by running the following command:
-
-   .. code-block:: postgres
-   
-      $ sudo systemctl set-default multi-user.target
-
-Go back to :ref:`Setting Up Your Hosts`
-
-
-Disabling SELinux
-----------------------------------
-After disabling the Linux UI you must disable SELinux.
-
-**To disable SELinux:**
-
- 1.  Run the following command:
- 
-    .. code-block:: postgres
-   
-       $ sed -i -e s/enforcing/disabled/g /etc/selinux/config
-       $ sudo reboot
-      
- 2. Reboot the system as a root user:
-      
-    .. code-block:: postgres
-   
-       $ sudo reboot      
-
-Go back to :ref:`Setting Up Your Hosts`
-
-Disabling Your Firewall
-----------------------------------
-After disabling SELinux, you must disable your firewall by running the following commands:
-   
-      .. code-block:: postgres
-   
-         $ sudo systemctl stop firewalld
-         $ sudo systemctl disable firewalld
-
-Go back to :ref:`Setting Up Your Hosts`
-
-  
-Checking the CUDA Version
-----------------------------------
-After completing all of the steps above, you must check the CUDA version.
-
-**To check the CUDA version:**
-
-1. Check the CUDA version:
-
-   .. code-block:: console
-   
-      $ nvidia-smi
-      
-   The following is an example of the correct output:
-
-   .. code-block:: console
-   
-      +-----------------------------------------------------------------------------+
-      | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
-      |-------------------------------+----------------------+----------------------+
-      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
-      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
-      |===============================+======================+======================|
-      |   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
-      | 32%   38C    P0    N/A /  75W |      0MiB /  4039MiB |      0%      Default |
-      +-------------------------------+----------------------+----------------------+
-                                                                                     
-      +-----------------------------------------------------------------------------+
-      | Processes:                                                       GPU Memory |
-      |  GPU       PID   Type   Process name                             Usage      |
-      |=============================================================================|
-      |  No running processes found                                                 |
-      +-----------------------------------------------------------------------------+
-
-In the above output, the CUDA version is **10.1**.
-
-If the above output is not generated, CUDA has not been installed. To install CUDA, see `Installing the CUDA driver `_.
-
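-If you prefer a check that is easier to script, most recent NVIDIA drivers also support querying the driver version directly (an optional convenience; flag availability depends on your driver release):
-
-.. code-block:: console
-   
-   $ nvidia-smi --query-gpu=driver_version --format=csv,noheader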
-
-Go back to :ref:`Setting Up Your Hosts`
-
-.. _install_kubernetes_cluster:
-
-Installing Your Kubernetes Cluster
-===================================
-After setting up your hosts, you must install your Kubernetes cluster. The Kubernetes and SQream software must be installed from the management host, which can be any server in the cluster.
-
-Installing your Kubernetes cluster requires the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-
-Generating and Sharing SSH Keypairs Across All Existing Nodes
---------------------------------------------------------------
-Sharing SSH keypairs across all nodes enables passwordless access from the management server to every node in the cluster. All nodes in the cluster require passwordless access.
-
-.. note::  You must generate and share an SSH keypair across all nodes even if you are installing the Kubernetes cluster on a single host.
-
-**To generate and share an SSH keypair:**
-
-1. Switch to root user access:
-
-  .. code-block:: console
-   
-     $ sudo su -
-
-2. Generate an RSA key pair:
-
-  .. code-block:: console
-   
-     $ ssh-keygen
-
-The following is an example of the correct output:
-
-  .. code-block:: console
-   
-     $ ssh-keygen
-     Generating public/private rsa key pair.
-     Enter file in which to save the key (/root/.ssh/id_rsa):
-     Created directory '/root/.ssh'.
-     Enter passphrase (empty for no passphrase):
-     Enter same passphrase again:
-     Your identification has been saved in /root/.ssh/id_rsa.
-     Your public key has been saved in /root/.ssh/id_rsa.pub.
-     The key fingerprint is:
-     SHA256:xxxxxxxxxxxxxxdsdsdffggtt66gfgfg root@localhost.localdomain
-     The key's randomart image is:
-     +---[RSA 2048]----+
-     |            =*.  |
-     |            .o   |
-     |            ..o o|
-     |     .     .oo +.|
-     |      = S =...o o|
-     |       B + *..o+.|
-     |      o * *..o .+|
-     |       o * oo.E.o|
-     |      . ..+..B.+o|
-     +----[SHA256]-----+
-
-The generated file is ``/root/.ssh/id_rsa.pub``.
-	 
-3. Copy the public key to all servers in the cluster, including the one that you are running on, replacing ``remote-host`` with each host's IP address:
-
-  .. code-block:: console
-   
-     $ ssh-copy-id -i ~/.ssh/id_rsa.pub root@remote-host
-      
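-If you have several nodes, a small loop saves repeating the command (the IP addresses below are placeholders for your own node addresses), and a follow-up ``ssh`` call verifies that the login is now passwordless:
-
-.. code-block:: console
-   
-   $ for host in 192.168.0.92 192.168.0.93; do ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host; done
-   $ ssh root@192.168.0.92 hostname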
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-Installing and Deploying a Kubernetes Cluster Using Kubespray
--------------------------------------------------------------
-SQream uses the Kubespray software package to install and deploy Kubernetes clusters.
-
-**To install and deploy a Kubernetes cluster using Kubespray:**
-
-
-1. Clone Kubernetes:
-
-   1. Clone the **kubespray.git** repository:
-
-      .. code-block:: console
-   
-         $ git clone https://github.com/kubernetes-incubator/kubespray.git
-     
-   2. Navigate to the **kubespray** directory:
-     
-      .. code-block:: console
-   
-         $ cd kubespray
-     
-   3. Install the Python packages listed in the **requirements.txt** file:
-   
-      .. code-block:: console
-   
-         $ pip3 install -r requirements.txt		 
-
-2. Create your SQream inventory directory:
-
-   1. Copy the sample inventory as a new **sqream** inventory:
-   
-      .. code-block:: console
-   
-         $ cp -rp inventory/sample inventory/sqream
-   
-   2. Declare the cluster node pairs, replacing each ``<hostname>,<ip>`` pair with a defined cluster node hostname and IP address:
-   
-      .. code-block:: console
-   
-         $ declare -a IPS=(<hostname-1>,<ip-1> <hostname-2>,<ip-2>) 
-   
-      For example, the following declares the pairs ``host-93,192.168.0.93`` and ``host-92,192.168.0.92``:
-
-      .. code-block:: console
-   
-         $ declare -a IPS=(host-93,192.168.0.93 host-92,192.168.0.92)
-
-   Note the following:
-
-   * Running a declare requires defining a pair (host name and cluster node IP address), as shown in the above example.
-   * You can define more than one pair.
- 
-3. Switch back to the root user:
-   
-   .. code-block:: console
-   
-      $ sudo su -
-
-4. Navigate to **/root/kubespray**:
-   
-   .. code-block:: console
-   
-      $ cd /root/kubespray
-
-5. Copy ``inventory/sample`` as ``inventory/sqream``, if you have not already done so in Step 2:
-   
-   .. code-block:: console
-   
-      $ cp -rfp inventory/sample inventory/sqream
-
-6. Declare the node hostname and IP address pairs for the inventory builder:
-   
-   .. code-block:: console
-   
-      $ declare -a IPS=(<hostname-1>,<ip-1> <hostname-2>,<ip-2> <hostname-3>,<ip-3>)
-	   
-7. Update the Ansible inventory file (**hosts.yml**) with the inventory builder: 
-
-   .. code-block:: console
-   
-      $ CONFIG_FILE=inventory/sqream/hosts.yml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
-	  
-   If you do not set specific hostnames in the declare, the server hostnames will change to ``node1``, ``node2``, and so on. To maintain specific hostnames, run the declare as in the following example:
-
-   .. code-block:: console
-   
-      $ declare -a IPS=(eks-rhl-1,192.168.5.81 eks-rhl-2,192.168.5.82 eks-rhl-3,192.168.5.83)
-
-   Note that the declare must contain ``hostname,ip`` pairs.
-   
-::
-	  
-8. Verify the following:
- 
-   * The **hosts.yml** file is configured correctly.
-   * All children are included with their relevant nodes.
-
-   You can keep your current server hostnames by using them in the declared ``hostname,ip`` pairs, as described above.
-
-9. Display the contents of the **hosts.yml** file:
-
-   .. code-block:: console
-   
-      $ cat inventory/sqream/hosts.yml
-	  
-   Each hostname must be lowercase, may contain only ``-`` or ``.`` as special characters, and must match the server's hostname.
-
-The following is an example of the correct output. Each host and IP address that you provided should be displayed once:
-
-   .. code-block:: console
-   
-      all:
-        hosts:
-          node1:
-            ansible_host: 192.168.5.81
-            ip: 192.168.5.81
-            access_ip: 192.168.5.81
-          node2:
-            ansible_host: 192.168.5.82
-            ip: 192.168.5.82
-            access_ip: 192.168.5.82
-          node3:
-            ansible_host: 192.168.5.83
-            ip: 192.168.5.83
-            access_ip: 192.168.5.83
-        children:
-          kube-master:
-            hosts:
-              node1:
-              node2:
-              node3:
-          kube-node:
-            hosts:
-              node1:
-              node2:
-              node3:
-          etcd:
-            hosts:
-              node1:
-              node2:
-              node3:
-          k8s-cluster:
-            children:
-              kube-master:
-              kube-node:
-          calico-rr:
-            hosts: {}
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-     
-Adjusting Kubespray Deployment Values
--------------------------------------    
-After downloading and configuring Kubespray, you can adjust your Kubespray deployment values. A script is used to modify how the Kubernetes cluster is deployed, and you must set the cluster name variable before running this script.
-
-.. note:: The script must be run from the **kubespray** folder.
-
-**To adjust Kubespray deployment values:**
-
-1. Add the following export to the local user's **~/.bashrc** file, replacing ``<VIP>`` with the user's virtual IP address:
-
-   .. code-block:: console
-   
-      $ export VIP_IP=<VIP>
-	  
-2. Log out, log back in, and verify that the variable is set:
-
-   .. code-block:: console
-   
-      $ echo $VIP_IP
-	  
-3. Create the **kubespray_settings.sh** script with the following content:
-
-   .. code-block:: console
-   
-      $ cat <<EOF > kubespray_settings.sh
-      sed -i "/cluster_name: cluster.local/c   \cluster_name: cluster.local.$cluster_name" inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      sed -i "/dashboard_enabled/c   \dashboard_enabled\: "false"" inventory/sqream/group_vars/k8s-cluster/addons.yml
-      sed -i "/kube_version/c   \kube_version\: "v1.18.3"" inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      sed -i "/metrics_server_enabled/c   \metrics_server_enabled\: "true"" inventory/sample/group_vars/k8s-cluster/addons.yml
-      echo 'kube_apiserver_node_port_range: "3000-6000"' >> inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      echo 'kube_controller_node_monitor_grace_period: 20s' >> inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      echo 'kube_controller_node_monitor_period: 2s' >> inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      echo 'kube_controller_pod_eviction_timeout: 30s' >> inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      echo 'kubelet_status_update_frequency: 4s' >> inventory/sqream/group_vars/k8s-cluster/k8s-cluster.yml
-      echo 'ansible ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers
-      EOF
-	  
-.. note:: In most cases, the Docker data resides on the system disk. Because Docker requires a high volume of data (images, containers, volumes, etc.), you can change the default Docker data location to prevent the system disk from running out of space.
-
-4. *Optional* - Change the default Docker data location, replacing ``<docker-data-dir>`` with the new location:
-
-   .. code-block:: console
-   
-      $ sed -i "/docker_daemon_graph/c   \docker_daemon_graph\: "<docker-data-dir>"" inventory/sqream/group_vars/all/docker.yml
- 	  
-5. Make the **kubespray_settings.sh** file executable for your user:
-
-   .. code-block:: console
-   
-      $ chmod u+x kubespray_settings.sh
-	  
-6. Run the script:
-
-   .. code-block:: console
-   
-      $ ./kubespray_settings.sh
-
-7. Run the **cluster.yml** playbook against the **inventory/sqream/hosts.yml** inventory:
-
-   .. code-block:: console
-   
-      $ ansible-playbook -i inventory/sqream/hosts.yml cluster.yml -v
-
-The Kubespray installation takes approximately 10 - 15 minutes.
-
-The following is an example of the correct output:
-
-   .. code-block:: console
-   
-      PLAY RECAP
-      *********************************************************************************************
-      node-1             : ok=680  changed=133  unreachable=0    failed=0
-      node-2             : ok=583  changed=113  unreachable=0    failed=0
-      node-3             : ok=586  changed=115  unreachable=0    failed=0
-      localhost          : ok=1    changed=0    unreachable=0    failed=0
-
-If the output is incorrect, or a failure occurs during the installation, contact a SQream customer support representative.
-
-Go back to :ref:`Installing Your Kubernetes Cluster`.   
-      
-Checking Your Kubernetes Status
--------------------------------
-After adjusting your Kubespray deployment values, you must check your Kubernetes status.
-
-**To check your Kubernetes status:**
-
-1. Check the status of the nodes:
-
-   .. code-block:: console
-   
-      $ kubectl get nodes
-	  
-   The following is an example of the correct output:
-
-   .. code-block:: console
-   
-      NAME        STATUS   ROLES                  AGE   VERSION
-      eks-rhl-1   Ready    control-plane,master   29m   v1.21.1
-      eks-rhl-2   Ready    control-plane,master   29m   v1.21.1
-      eks-rhl-3   Ready    <none>                 28m   v1.21.1
-
-2. Check the status of the pods:
-
-   .. code-block:: console
-   
-      $ kubectl get pods --all-namespaces 
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-   
-      NAMESPACE                NAME                                         READY   STATUS    RESTARTS   AGE
-      kube-system              calico-kube-controllers-68dc8bf4d5-n9pbp     1/1     Running   0          160m
-      kube-system              calico-node-26cn9                            1/1     Running   1          160m
-      kube-system              calico-node-kjsgw                            1/1     Running   1          160m
-      kube-system              calico-node-vqvc5                            1/1     Running   1          160m
-      kube-system              coredns-58687784f9-54xsp                     1/1     Running   0          160m
-      kube-system              coredns-58687784f9-g94xb                     1/1     Running   0          159m
-      kube-system              dns-autoscaler-79599df498-hlw8k              1/1     Running   0          159m
-      kube-system              kube-apiserver-k8s-host-1-134                1/1     Running   0          162m
-      kube-system              kube-apiserver-k8s-host-194                  1/1     Running   0          161m
-      kube-system              kube-apiserver-k8s-host-68                   1/1     Running   0          161m
-      kube-system              kube-controller-manager-k8s-host-1-134       1/1     Running   0          162m
-      kube-system              kube-controller-manager-k8s-host-194         1/1     Running   0          161m
-      kube-system              kube-controller-manager-k8s-host-68          1/1     Running   0          161m
-      kube-system              kube-proxy-5f42q                             1/1     Running   0          161m
-      kube-system              kube-proxy-bbwvk                             1/1     Running   0          161m
-      kube-system              kube-proxy-fgcfb                             1/1     Running   0          161m
-      kube-system              kube-scheduler-k8s-host-1-134                1/1     Running   0          161m
-      kube-system              kube-scheduler-k8s-host-194                  1/1     Running   0          161m
-
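-Rather than re-running ``kubectl get nodes`` until everything settles, you can optionally let ``kubectl`` block until every node reports ready:
-
-.. code-block:: console
-   
-   $ kubectl wait --for=condition=Ready nodes --all --timeout=300s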
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-        
-Adding a SQream Label to Your Kubernetes Cluster Nodes
--------------------------------------------------------
-After checking your Kubernetes status, you must add a SQream label on your Kubernetes cluster nodes.
-
-**To add a SQream label on your Kubernetes cluster nodes:**
-
-1. Get the cluster node list:
-
-   .. code-block:: console
-   
-      $ kubectl get nodes
-	  
-   The following is an example of the correct output:
-   
-   .. code-block:: console
-   
-      NAME        STATUS   ROLES                  AGE   VERSION
-      eks-rhl-1   Ready    control-plane,master   29m   v1.21.1
-      eks-rhl-2   Ready    control-plane,master   29m   v1.21.1
-      eks-rhl-3   Ready    <none>                 28m   v1.21.1
-	  
-2. Set the node label, replacing ``<node-name>`` with each node NAME shown in the above output:
-
-   .. code-block:: console
-
-      $ kubectl label nodes <node-name> cluster=sqream
-   
-   The following is an example of the correct output:
-
-   .. code-block:: console
-   
-      [root@edk-rhl-1 kubespray]# kubectl label nodes eks-rhl-1 cluster=sqream
-      node/eks-rhl-1 labeled
-      [root@edk-rhl-1 kubespray]# kubectl label nodes eks-rhl-2 cluster=sqream
-      node/eks-rhl-2 labeled
-      [root@edk-rhl-1 kubespray]# kubectl label nodes eks-rhl-3 cluster=sqream
-      node/eks-rhl-3 labeled
-
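-To confirm that the label was applied to every node, you can optionally filter the node list by the new label:
-
-.. code-block:: console
-   
-   $ kubectl get nodes -l cluster=sqream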
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-   
-Copying Your Kubernetes Configuration API File to the Master Cluster Nodes
----------------------------------------------------------------------------
-After adding a SQream label on your Kubernetes cluster nodes, you must copy your Kubernetes configuration API file to your Master cluster nodes.
-
-When the Kubernetes cluster installation is complete, an API configuration file is automatically created in the **.kube** folder of the root user. This file enables the **kubectl** command to access Kubernetes' internal API service. Following this step lets you run **kubectl** commands from any node in the cluster.
-
-
-.. warning:: You must perform this on the management server only!
-
-**To copy your Kubernetes configuration API file to your Master cluster nodes:**
-
-1. Create the **.kube** folder in the local user's home directory:
-
-   .. code-block:: console
-   
-      $ mkdir /home/<local-user>/.kube
-
-2. Copy the configuration file from the root user directory to the local user's directory:
-
-   .. code-block:: console
-   
-      $ sudo cp /root/.kube/config /home/<local-user>/.kube
-
-3. Change the file owner from the **root user** to the local user:
-
-   .. code-block:: console
-   
-      $ sudo chown <local-user>.<local-user> /home/<local-user>/.kube/config
-
-4. Create the **.kube** folder in the local user's home directory on the other nodes:
-
-   .. code-block:: console
-   
-      $ ssh <local-user>@<node-ip> mkdir .kube
-
-5. Copy the configuration file from the management node to the other nodes:
-
-   .. code-block:: console
-   
-      $ scp /home/<local-user>/.kube/config <local-user>@<node-ip>:/home/<local-user>/.kube/
-	  
-6. On each server that you copied the **.kube** folder to, run the following command as the local user:
-
-   .. code-block:: console
-   
-      $ sudo usermod -aG docker $USER
-
-This grants the local user the necessary permissions to run Docker commands.
-
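-A quick optional check that the copied configuration works is to run a read-only command from one of the other nodes (``<local-user>`` and ``<node-ip>`` are the same placeholders used above):
-
-.. code-block:: console
-   
-   $ ssh <local-user>@<node-ip> kubectl get nodes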
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-Creating an env_file in Your Home Directory
--------------------------------------------------
-After copying your Kubernetes configuration API file to your Master cluster nodes, you must create an **env_file** in your home directory, and must set the VIP address as a variable.
-
-.. warning:: You must perform this on the management server only!
-
-
-
-**To create an env_file for local users in the user's home directory:**
-
-1. Set a variable that includes the VIP address, replacing ``<VIP>`` with your virtual IP address:
-
-   .. code-block:: console
-   
-      $ export VIP_IP=<VIP>
-	  
-.. note:: If you use Kerberos, replace the ``KRB5_SERVER`` value with the IP address of your Kerberos server.
-   
-2. Create the **.sqream** directory in the local user's home directory:
-
-   .. code-block:: console
-   
-      $ mkdir /home/$USER/.sqream
-   
-3. Create the **env_file** with the following content, verifying that the ``KRB5_SERVER`` parameter is set to your Kerberos server IP address:
-
-   .. code-block:: console  
-   
-      $ cat <<EOF > /home/$USER/.sqream/env_file
-      SQREAM_K8S_VIP=$VIP_IP
-      SQREAM_ADMIN_UI_PORT=8080
-      SQREAM_DASHBOARD_DATA_COLLECTOR_PORT=8100
-      SQREAM_DATABASE_NAME=master
-      SQREAM_K8S_ADMIN_UI=sqream-admin-ui
-      SQREAM_K8S_DASHBOARD_DATA_COLLECTOR=dashboard-data-collector
-      SQREAM_K8S_METADATA=sqream-metadata
-      SQREAM_K8S_NAMESPACE=sqream
-      SQREAM_K8S_PICKER=sqream-picker
-      SQREAM_K8S_PROMETHEUS=prometheus
-      SQREAM_K8S_REGISTRY_PORT=6000
-      SQREAM_METADATA_PORT=3105
-      SQREAM_PICKER_PORT=3108
-      SQREAM_PROMETHEUS_PORT=9090
-      SQREAM_SPOOL_MEMORY_RATIO=0.25
-      SQREAM_WORKER_0_PORT=5000
-      KRB5CCNAME=FILE:/tmp/tgt
-      KRB5_SERVER=kdc.sq.com:1
-      KRB5_CONFIG_DIR=${SQREAM_MOUNT_DIR}/krb5
-      KRB5_CONFIG_FILE=${KRB5_CONFIG_DIR}/krb5.conf
-      HADOOP_CONFIG_DIR=${SQREAM_MOUNT_DIR}/hadoop
-      HADOOP_CORE_XML=${HADOOP_CONFIG_DIR}/core-site.xml
-      HADOOP_HDFS_XML=${HADOOP_CONFIG_DIR}/hdfs-site.xml
-      EOF  
-
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-		
-
-
-
-
-Creating a Base Kubernetes Namespace
-------------------------------------
-After creating an env_file in the user's home directory, you must create a base Kubernetes namespace.
-
-You can create a Kubernetes namespace by running the following command:
-
-.. code-block:: console
-   
-   $ kubectl create namespace sqream-init    
-   
-The following is an example of the correct output:
-
-.. code-block:: console
-   
-   namespace/sqream-init created
-   
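-If you want to double-check that the namespace now exists, listing it by name is a harmless optional verification:
-
-.. code-block:: console
-   
-   $ kubectl get namespace sqream-init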
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-   
-Pushing the **env_file** File to the Kubernetes Configmap
-----------------------------------------------------------
-After creating a base Kubernetes namespace, you must push the **env_file** file to the Kubernetes **configmap** in the **sqream-init** namespace.
-
-This is done by running the following command:
-
-.. code-block:: console
-   
-   $ kubectl create configmap sqream-init -n sqream-init --from-env-file=/home/$USER/.sqream/env_file
-
-The following is an example of the correct output:
-
-.. code-block:: console
-   
-   configmap/sqream-init created
-
-
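-You can optionally inspect what was stored, for example to confirm that the VIP value was written correctly, by printing the configmap back out:
-
-.. code-block:: console
-   
-   $ kubectl get configmap sqream-init -n sqream-init -o yaml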
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-
-Installing the NVIDIA Docker2 Toolkit
--------------------------------------
-After pushing the **env_file** file to the Kubernetes configmap, you must install the NVIDIA Docker2 Toolkit. The **NVIDIA Docker2 Toolkit** lets users build and run GPU-accelerated Docker containers, and must be run only on GPU servers. The NVIDIA Docker2 Toolkit includes a container runtime library and utilities that automatically configure containers to leverage NVIDIA GPUs.
-
-
-Installing the NVIDIA Docker2 Toolkit on an x86_64 Processor on CentOS
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-**To install the NVIDIA Docker2 Toolkit on an x86_64 processor on CentOS:**
-
-1. Add the repository for your distribution:
-
-   .. code-block:: console
-   
-      $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
-      $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
-        sudo tee /etc/yum.repos.d/nvidia-docker.repo
-
-2. Install the **nvidia-docker2** package and reload the Docker daemon configuration:
-   
-   .. code-block:: console
-   
-      $ sudo yum install nvidia-docker2
-      $ sudo pkill -SIGHUP dockerd
-
-3. Verify that the **nvidia-docker2** package has been installed correctly:
-
-   .. code-block:: console
-   
-      $ docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-   
-      $ docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
-      Unable to find image 'nvidia/cuda:10.1-base' locally
-      10.1-base: Pulling from nvidia/cuda
-      d519e2592276: Pull complete 
-      d22d2dfcfa9c: Pull complete 
-      b3afe92c540b: Pull complete 
-      13a10df09dc1: Pull complete 
-      4f0bc36a7e1d: Pull complete 
-      cd710321007d: Pull complete 
-      Digest: sha256:635629544b2a2be3781246fdddc55cc1a7d8b352e2ef205ba6122b8404a52123
-      Status: Downloaded newer image for nvidia/cuda:10.1-base
-      Sun Feb 14 13:27:58 2021       
-      +-----------------------------------------------------------------------------+
-      | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
-      |-------------------------------+----------------------+----------------------+
-      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
-      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
-      |===============================+======================+======================|
-      |   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
-      | 32%   37C    P0    N/A /  75W |      0MiB /  4039MiB |      0%      Default |
-      +-------------------------------+----------------------+----------------------+
-                                                                                     
-      +-----------------------------------------------------------------------------+
-      | Processes:                                                       GPU Memory |
-      |  GPU       PID   Type   Process name                             Usage      |
-      |=============================================================================|
-      |  No running processes found                                                 |
-      +-----------------------------------------------------------------------------+
-
-For more information on installing the NVIDIA Docker2 Toolkit on an x86_64 processor on CentOS, see `NVIDIA Docker Installation - CentOS distributions `_
-     
-Installing the NVIDIA Docker2 Toolkit on an x86_64 Processor on Ubuntu
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-**To install the NVIDIA Docker2 Toolkit on an x86_64 processor on Ubuntu:**
-
-1. Add the repository for your distribution:
-
-   .. code-block:: console
-   
-      $ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
-        sudo apt-key add -     
-      $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)  
-      $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
-        sudo tee /etc/apt/sources.list.d/nvidia-docker.list  
-      $ sudo apt-get update
-     
-2. Install the **nvidia-docker2** package and reload the Docker daemon configuration:
-   
-   .. code-block:: console
-   
-      $ sudo apt-get install nvidia-docker2
-      $ sudo pkill -SIGHUP dockerd
-     
-3. Verify that the **nvidia-docker2** package has been installed correctly:
-
-   .. code-block:: console
-   
-      $ docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi  
-     
-For more information on installing the NVIDIA Docker2 Toolkit on an x86_64 processor on Ubuntu, see `NVIDIA Docker Installation - Ubuntu distributions `_
-
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-    
-Modifying the Docker Daemon JSON File for GPU and Compute Nodes
-----------------------------------------------------------------
-After installing the NVIDIA Docker2 toolkit, you must modify the Docker daemon JSON file for GPU and Compute nodes.
-
-Modifying the Docker Daemon JSON File for GPU Nodes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Modifying the Docker daemon JSON file enables GPU support and sets HTTP access to the local Kubernetes Docker registry.
-
-.. note:: The Docker daemon JSON file must be modified on all GPU nodes.
-
-.. note:: Contact your IT department for a virtual IP.
-
-**To modify the Docker daemon JSON file for GPU nodes:**
-
-1. Connect as a root user:
-
-   .. code-block:: console
-   
-      $ sudo -i
-     
-2. Set a variable that includes the VIP address, replacing ``<VIP>`` with your assigned VIP address:    
-     
-   .. code-block:: console
-   
-      $ export VIP_IP=<VIP>
-
-3. Create the **/etc/docker/daemon.json** file:      
-     
-   .. code-block:: console
-   
-      $ cat <<EOF > /etc/docker/daemon.json
-      {
-          "insecure-registries": ["$VIP_IP:6000"],
-          "default-runtime": "nvidia",
-          "runtimes": {
-              "nvidia": {
-                  "path": "nvidia-container-runtime",
-                  "runtimeArgs": []
-              }
-          }
-      }
-      EOF   
-
-4. Apply the changes and restart Docker:
-
-   .. code-block:: console
-   
-      $ systemctl daemon-reload && systemctl restart docker
-      
-5. Exit the root user:
- 
-   .. code-block:: console
-   
-      $ exit
-	 
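-To confirm that Docker picked up the new default runtime after the restart, you can optionally ask the daemon directly (the exact output format varies by Docker version):
-
-.. code-block:: console
-   
-   $ docker info | grep -i runtime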
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-      
-Modifying the Docker Daemon JSON File for Compute Nodes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-You must follow this procedure only if you have a Compute node.
-
-**To modify the Docker daemon JSON file for Compute nodes:**
-
-1. Switch to a root user:
-
-   .. code-block:: console
-   
-      $ sudo -i
-
-2. Set a variable that includes the VIP address, as shown in Step 2 of the GPU node procedure above.
-
-   .. note:: Contact your IT department for a virtual IP.
-
-3. Create the **/etc/docker/daemon.json** file, with the variable supplying your assigned VIP address:
-
-   .. code-block:: console
-   
-      $ cat <<EOF > /etc/docker/daemon.json
-      {
-          "insecure-registries": ["$VIP_IP:6000"]
-      }
-      EOF 
-
-4. Restart the services:
-
-   .. code-block:: console
-   
-      $ systemctl daemon-reload && systemctl restart docker
-
-5. Exit the root user:
- 
-   .. code-block:: console
-   
-      $ exit
-
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-   
-Installing the Nvidia-device-plugin Daemonset
-----------------------------------------------
-After modifying the Docker daemon JSON file for GPU or Compute nodes, you must install the Nvidia-device-plugin daemonset. The Nvidia-device-plugin daemonset is only relevant to GPU nodes.
-
-**To install the Nvidia-device-plugin daemonset:**
-
-1. Set ``nvidia.com/gpu`` to ``true`` on all GPU nodes, replacing ``<node-name>`` with your GPU node name:
-
-   .. code-block:: console
-   
-      $ kubectl label nodes <node-name> nvidia.com/gpu=true
-      
-   For a complete list of GPU node names, run the ``kubectl get nodes`` command.
-
-   The following is an example of the correct output:
-   
-   .. code-block:: console
-   
-      [root@eks-rhl-1 ~]# kubectl label nodes eks-rhl-1 nvidia.com/gpu=true
-      node/eks-rhl-1 labeled
-      [root@eks-rhl-1 ~]# kubectl label nodes eks-rhl-2 nvidia.com/gpu=true
-      node/eks-rhl-2 labeled
-      [root@eks-rhl-1 ~]# kubectl label nodes eks-rhl-3 nvidia.com/gpu=true
-      node/eks-rhl-3 labeled  
-
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-   
-Creating an Nvidia Device Plugin
-----------------------------------------------
-After installing the Nvidia-device-plugin daemonset, you must create an Nvidia device plugin by running the following command:
-
-.. code-block:: console
-   
-   $ kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta6/nvidia-device-plugin.yml
-   
-If needed, you can check the status of the Nvidia-device-plugin-daemonset pods:
-
-.. code-block:: console
-   
-   $ kubectl get pods -n kube-system -o wide | grep nvidia-device-plugin
-
-The following is an example of the correct output:   
-
-.. code-block:: console
-   
-   NAME                                       READY   STATUS    RESTARTS   AGE
-   nvidia-device-plugin-daemonset-fxfct       1/1     Running   0          6h1m
-   nvidia-device-plugin-daemonset-jdvxs       1/1     Running   0          6h1m
-   nvidia-device-plugin-daemonset-xpmsv       1/1     Running   0          6h1m
-
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-Checking GPU Resources Allocatable to GPU Nodes
------------------------------------------------
-After creating an Nvidia device plugin, you must check the GPU resources allocatable to the GPU nodes. Each GPU node has records, such as ``nvidia.com/gpu:     <#>``. The ``#`` indicates the number of allocatable, or available, GPUs in each node.
-
-You can output a description of allocatable resources by running the following command:
-
-.. code-block:: console
-   
-   $ kubectl describe node | grep -i -A 7 -B 2 allocatable: 
-
-The following is an example of the correct output:
-
-.. code-block:: console
-   
-   Allocatable:
-    cpu:                3800m
-    ephemeral-storage:  94999346224
-    hugepages-1Gi:      0
-    hugepages-2Mi:      0
-    memory:             15605496Ki
-    nvidia.com/gpu:     1
-    pods:               110 
-
-Go back to :ref:`Installing Your Kubernetes Cluster`     
-
-Preparing the WatchDog Monitor
-------------------------------
-SQream's deployment includes installing two watchdog services. These services monitor Kubernetes management and the server's storage network.
-
-You can enable the storage watchdogs by adding entries in the **/etc/hosts** file on each server:
-
-.. code-block:: console
-   
-   <IP-address> k8s-node1.storage
-   <IP-address> k8s-node2.storage
-   <IP-address> k8s-node3.storage
-
-The following is an example of the correct syntax:
-
-.. code-block:: console
-   
-   10.0.0.1 k8s-node1.storage
-   10.0.0.2 k8s-node2.storage
-   10.0.0.3 k8s-node3.storage
-
-Go back to :ref:`Installing Your Kubernetes Cluster`
-
-.. _installing_sqream_software:
-
-Installing the SQream Software
-=================================
-Once you've prepared the SQream environment for launching it using Kubernetes, you can begin installing the SQream software.
-
-The **Installing the SQream Software** section describes the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-
-
-.. _getting_sqream_package:
-
-Getting the SQream Package
---------------------------------
-The first step in installing the SQream software is getting the SQream package. Please contact the SQream Support team to get the **sqream_k8s-nnn-DBnnn-COnnn-SDnnn-<arch>.tar.gz** tarball file.
-
-This file name includes the following values:
-
-* **sqream_k8s-** - the SQream installer version.
-* **DB** - the SQreamDB version.
-* **CO** - the SQream console version.
-* **SD** - the SQream Acceleration Studio version.
-* **arch** - the server architecture.
-
-You can extract the contents of the tarball by running the following command:
-
-.. code-block:: console
-
-   $ tar -xvf sqream_k8s-1.0.15-DB2020.1.0.2-SD0.7.3-x86_64.tar.gz
-   $ cd sqream_k8s-1.0.15-DB2020.1.0.2-SD0.7.3-x86_64
-   $ ls
-
-Extracting the contents of the tarball file generates a new folder with the same name as the tarball file.
-
-The following shows the contents of the extracted folder:
-
-.. code-block:: console
-
-   drwxrwxr-x. 2 sqream sqream    22 Jan 27 11:39 license
-   lrwxrwxrwx. 1 sqream sqream    49 Jan 27 11:39 sqream -> .sqream/sqream-sql-v2020.3.1_stable.x86_64/sqream
-   -rwxrwxr-x. 1 sqream sqream  9465 Jan 27 11:39 sqream-install
-   -rwxrwxr-x. 1 sqream sqream 12444 Jan 27 11:39 sqream-start
-
-Go back to :ref:`Installing Your SQream Software`
-
-Setting Up and Configuring Hadoop
-----------------------------------
-After getting the SQream package, you can set up and configure Hadoop by configuring the **keytab** and **krb5.conf** files.
-
-.. note:: You only need to configure the **keytab** and **krb5.conf** files if you use Hadoop with Kerberos authentication.
-
-
-**To set up and configure Hadoop:**
-
-1. Contact IT for the **keytab** and **krb5.conf** files.
-
-::
-
-2. Copy both sets of files into the respective empty **.krb5/** and **.hadoop/** directories:
-
-   .. code-block:: console
-
-      $ cp hdfs.keytab krb5.conf .krb5/
-      $ cp core-site.xml hdfs-site.xml .hadoop/
-
-The SQream installer automatically copies the above files during the installation process.
-
-Go back to :ref:`Installing Your SQream Software`
-
-Starting a Local Docker Image Registry
----------------------------------------
-After getting the SQream package, or (optionally) setting up and configuring Hadoop, you must start a local Docker image registry. Because Kubernetes is based on Docker, you must start the local Docker image registry on the host's shared folder. This allows all hosts to pull the SQream Docker images.
-
-**To start a local Docker image registry:**
-
-1. Create a Docker registry folder:
-
-   .. code-block:: console
-
-      $ mkdir /docker-registry/
-
-2. Set the ``docker_path`` for the Docker registry folder, replacing ``<docker-registry-folder>`` with your folder:
-
-   .. code-block:: console
-
-      $ export docker_path=<docker-registry-folder>
-
-3. Apply the **docker-registry** service to the cluster:
-
-   .. code-block:: console
-
-      $ cat .k8s/admin/docker_registry.yaml | envsubst | kubectl create -f -
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      namespace/sqream-docker-registry created
-      configmap/sqream-docker-registry-config created
-      deployment.apps/sqream-docker-registry created
-      service/sqream-docker-registry created
-
-4. Check the pod status of the **docker-registry** service:
-
-   .. code-block:: console
-
-      $ kubectl get pods -n sqream-docker-registry
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      NAME                                      READY   STATUS    RESTARTS   AGE
-      sqream-docker-registry-655889fc57-hmg7h   1/1     Running   0          6h40m
-
-Go back to :ref:`Installing Your SQream Software`
-
-Installing the Kubernetes Dashboard
-------------------------------------
-After starting a local Docker image registry, you must install the Kubernetes dashboard. The Kubernetes dashboard lets you see the Kubernetes cluster, nodes, services, and pod status.
-
-**To install the Kubernetes dashboard:**
-
-1. Apply the **k8s-dashboard** service to the cluster:
-
-   .. code-block:: console
-
-      $ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      namespace/kubernetes-dashboard created
-      serviceaccount/kubernetes-dashboard created
-      service/kubernetes-dashboard created
-      secret/kubernetes-dashboard-certs created
-      secret/kubernetes-dashboard-csrf created
-      secret/kubernetes-dashboard-key-holder created
-      configmap/kubernetes-dashboard-settings created
-      role.rbac.authorization.k8s.io/kubernetes-dashboard created
-      clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
-      rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
-      clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
-      deployment.apps/kubernetes-dashboard created
-      service/dashboard-metrics-scraper created
-      deployment.apps/dashboard-metrics-scraper created
-
-2. Grant the user external access to the Kubernetes dashboard:
-
-   .. code-block:: console
-
-      $ cat .k8s/admin/kubernetes-dashboard-svc-metallb.yaml | envsubst | kubectl create -f -
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      service/kubernetes-dashboard-nodeport created
-
-3. Create the ``cluster-admin-sa.yaml`` file:
-
-   .. code-block:: console
-
-      $ kubectl create -f .k8s/admin/cluster-admin-sa.yaml
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      clusterrolebinding.rbac.authorization.k8s.io/cluster-admin-sa-cluster-admin created
-
-4. Check the pod status of the **k8s-dashboard** service:
-
-   .. code-block:: console
-
-      $ kubectl get pods -n kubernetes-dashboard
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      NAME                                         READY   STATUS    RESTARTS   AGE
-      dashboard-metrics-scraper-6b4884c9d5-n8p57   1/1     Running   0          4m32s
-      kubernetes-dashboard-7b544877d5-qc8b4        1/1     Running   0          4m32s
-
-5. Obtain the **k8s-dashboard** access token:
-
-   .. code-block:: console
-
-      $ kubectl -n kube-system describe secrets cluster-admin-sa-token
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      Name:         cluster-admin-sa-token-rbl9p
-      Namespace:    kube-system
-      Labels:       <none>
-      Annotations:  kubernetes.io/service-account.name: cluster-admin-sa
-                    kubernetes.io/service-account.uid: 81866d6d-8ef3-4805-840d-58618235f68d
-
-      Type:  kubernetes.io/service-account-token
-
-      Data
-      ====
-      ca.crt:     1025 bytes
-      namespace:  11 bytes
-      token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IjRMV09qVzFabjhId09oamQzZGFFNmZBeEFzOHp3SlJOZWdtVm5lVTdtSW8ifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJjbHVzdGVyLWFkbWluLXNhLXRva2VuLXJibDlwIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImNsdXN0ZXItYWRtaW4tc2EiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiI4MTg2NmQ2ZC04ZWYzLTQ4MDUtODQwZC01ODYxODIzNWY2OGQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06Y2x1c3Rlci1hZG1pbi1zYSJ9.mNhp8JMr5y3hQ44QrvRDCMueyjSHSrmqZcoV00ZC7iBzNUqh3n-fB99CvC_GR15ys43jnfsz0tdsTy7VtSc9hm5ENBI-tQ_mwT1Zc7zJrEtgFiA0o_eyfYZOARdhdyFEJg84bzkIxJFPKkBWb4iPWU1Xb7RibuMCjNTarZMZbqzKYfQEcMZWJ5UmfUqp-HahZZR4BNbjSWybs7t6RWdcQZt6sO_rRCDrOeEJlqKKjx4-5jFZB8Du_0kKmnw2YJmmSCEOXrpQCyXIiZJpX08HyDDYfFp8IGzm61arB8HDA9dN_xoWvuz4Cj8klUtTzL9effJJPjHJlZXcEqQc9hE3jw
-
-6. Navigate to ``https://<VIP>:5999``.
-
-::
-
-7. Select the **Token** radio button, paste the token from the previous command output, and click **Sign in**.
-
-The Kubernetes dashboard is displayed.
-
-Go back to :ref:`Installing Your SQream Software`
-
-Installing the SQream Prometheus Package
------------------------------------------
-After installing the Kubernetes dashboard, you must install the SQream Prometheus package. To properly monitor the host and GPU statistics, the **exporter service** must be installed on each Kubernetes cluster node.
-
-This section describes how to install the following:
-
-* **node_exporter** - collects host data, such as CPU memory usage.
-* **nvidia_exporter** - collects GPU utilization data.
-
-
-.. note:: The steps in this section must be done on **all** cluster nodes.
-
-To install the **sqream-prometheus** package, you must do the following:
-
-1. :ref:`Install the exporter service <install_exporter_service>`
-
-::
-
-2. :ref:`Check the exporter service <check_exporter_status>`
-
-Go back to :ref:`Installing Your SQream Software`
-
-
-.. _install_exporter_service:
-
-Installing the Exporter Service
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-**To install the exporter service:**
-
-1. Create a user and group that will be used to run the exporter services:
-
-   .. code-block:: console
-
-      $ sudo groupadd --system prometheus && sudo useradd -s /sbin/nologin --system -g prometheus prometheus
-
-2. Extract the **sqream_exporters_prometheus.0.1.tar.gz** file:
-
-   .. code-block:: console
-
-      $ cd .prometheus
-      $ tar -xf sqream_exporters_prometheus.0.1.tar.gz
-
-3. Copy the exporter software files to the **/usr/bin** directory:
-
-   .. code-block:: console
-
-      $ cd sqream_exporters_prometheus.0.1
-      $ sudo cp node_exporter/node_exporter /usr/bin/
-      $ sudo cp nvidia_exporter/nvidia_exporter /usr/bin/
-
-4. Copy the exporter service files to the **/etc/systemd/system/** directory:
-
-   .. code-block:: console
-
-      $ sudo cp services/node_exporter.service /etc/systemd/system/
-      $ sudo cp services/nvidia_exporter.service /etc/systemd/system/
-
-5. Set the permission and group of the service files:
-
-   .. code-block:: console
-
-      $ sudo chown prometheus:prometheus /usr/bin/node_exporter
-      $ sudo chmod u+x /usr/bin/node_exporter
-      $ sudo chown prometheus:prometheus /usr/bin/nvidia_exporter
-      $ sudo chmod u+x /usr/bin/nvidia_exporter
-
-6. Reload the services:
-
-   .. code-block:: console
-
-      $ sudo systemctl daemon-reload
-
-7. Start both services and set them to start when the server is booted up:
-
-   * Node_exporter:
-
-     .. code-block:: console
-
-        $ sudo systemctl start node_exporter && sudo systemctl enable node_exporter
-
-   * Nvidia_exporter:
-
-     .. code-block:: console
-
-        $ sudo systemctl start nvidia_exporter && sudo systemctl enable nvidia_exporter
-
-
-
-.. _check_exporter_status:
-
-Checking the Exporter Status
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-After installing the **exporter** service, you must check its status.
-
-You can check the exporter status by running the following command:
-
-.. code-block:: console
-
-   $ sudo systemctl status node_exporter && sudo systemctl status nvidia_exporter
-
-Go back to :ref:`Installing Your SQream Software`
-
-
-.. _running_sqream_install_service:
-
-Running the Sqream-install Service
-===================================
-The **Running the Sqream-install Service** section describes the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-
-Installing Your License
---------------------------------
-After installing the SQream Prometheus package, you must install your license.
-
-**To install your license:**
-
-1. Copy your license package to the sqream **/license** folder.
-
-.. note:: You do not need to untar the license package after copying it to the **/license** folder because the installer script does it automatically.
-
-The following flags are **mandatory** during your first run:
-
-.. code-block:: console
-
-   $ sudo ./sqream-install -i -k -m <sqream-mount-dir>
-
-.. note:: If you cannot run the script with **sudo**, verify that you have the right permission (**rwx** for the user) on the relevant directories (config, log, volume, and data-in directories).
-
-Go back to :ref:`Running the SQream_install Service`.
-
-Changing Your Data Ingest Folder
---------------------------------
-After installing your license, you must change your data ingest folder.
-
-You can change your data ingest folder by running the following command:
-
-.. code-block:: console
-
-   $ sudo ./sqream-install -d /media/nfs/sqream/data_in
-
-Go back to :ref:`Running the SQream_install Service`.
-
-Checking Your System Settings
---------------------------------
-After changing your data ingest folder, you must check your system settings.
-
-The following command shows you all the variables that your SQream system is running with:
-
-.. code-block:: console
-
-   $ ./sqream-install -s
-
-After optionally checking your system settings, you can use the **sqream-start** application to control your Kubernetes cluster.
-
-Go back to :ref:`Running the SQream_install Service`.
-
-SQream Installation Command Reference
---------------------------------------
-If needed, you can use the **sqream-install** flag reference for any needed flags by typing:
-
-.. code-block:: console
-
-   $ ./sqream-install --help
-
-The following shows the **sqream-install** flag descriptions:
-
-.. list-table::
-   :widths: 22 59 25
-   :header-rows: 1
-   
-   * - Flag
-     - Function
-     - Note
-   * - **-i**
-     - Loads all the software from the hidden **.docker** folder.
-     - Mandatory
-   * - **-k**
-     - Loads the license package from the **/license** directory.
-     - Mandatory
-   * - **-m**
-     - Sets the relative path for all SQream folders under the shared filesystem available from all nodes (sqreamdb, config, logs and data_in). No other flags are required if you use this flag (such as c, v, l or d).
-     - Mandatory
-   * - **-c**
-     - Sets the path where to write/read SQream configuration files from. The default is **/etc/sqream/**.
-     - Optional
-   * - **-v**
-     - Shows the location of the SQream cluster. ``v`` creates a cluster if none exists, and mounts it if one does.
-     - Optional
-   * - **-l**
-     - Shows the location of the SQream system startup logs. The logs contain startup and Docker logs. The default is **/var/log/sqream/**.
-     - Optional
-   * - **-d**
-     - Shows the folder containing data that you want to import into or copy from SQream.
-     - Optional
-   * - **-n**
-     - Sets the Kubernetes namespace. The default is **sqream**.
-     - Optional
-   * - **-N**
-     - Deletes a specific Kubernetes namespace and sets the factory default namespace (sqream).
-     - Optional
-   * - **-f**
-     - Overwrites existing folders and all files located in mounted directories.
-     - Optional
-   * - **-r**
-     - Resets the system configuration. This flag is run without any other variables.
-     - Optional
-   * - **-s**
-     - Shows the system settings.
-     - Optional
-   * - **-e**
-     - Sets the Kubernetes cluster's virtual IP address.
-     - Optional
-   * - **-h**
-     - Help, shows all available flags.
-     - Optional
-
-Go back to :ref:`Running the SQream_install Service`.
-
-Controlling Your Kubernetes Cluster Using SQream Flags
--------------------------------------------------------
-You can control your Kubernetes cluster using SQream flags.
-
-The following command shows you the available Kubernetes cluster control options:
-
-.. code-block:: console
-
-   $ ./sqream-start -h
-
-The following describes the **sqream-start** flags:
-
-.. list-table::
-   :widths: 22 59 25
-   :header-rows: 1
-   
-   * - Flag
-     - Function
-     - Note
-   * - **-s**
-     - Starts the sqream services, starting metadata, server picker, and workers. The number of workers started is based on the number of available GPUs.
-     - Mandatory
-   * - **-p**
-     - Sets specific ports for the worker services. You must enter the starting port for the sqream-start application to allocate it based on the number of workers.
-     - 
-   * - **-j**
-     - Uses an external .json configuration file. The file must be located in the **configuration** directory.
-     - The workers must each be started individually.
-   * - **-m**
-     - Allocates worker spool memory.
-     - The workers must each be started individually.
-   * - **-a**
-     - Starts the SQream Administration dashboard and specifies the listening port.
-     - 
-   * - **-d**
-     - Deletes all running SQream services.
-     - 
-   * - **-h**
-     - Shows all available flags.
-     - Help
-
-Go back to :ref:`Running the SQream_install Service`.
-
-.. _using_sqream_start_commands:
-
-Using the sqream-start Commands
-================================
-In addition to controlling your Kubernetes cluster using SQream flags, you can control it using **sqream-start** commands.
-
-The **Using the sqream-start Commands** section describes the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-
-Starting Your SQream Services
-------------------------------
-You can run the **sqream-start** command with the **-s** flag to start SQream services on all available GPUs:
-
-.. code-block:: console
-
-   $ sudo ./sqream-start -s
-
-This command starts the SQream metadata, server picker, and sqream workers on all available GPUs in the cluster.
-
-The following is an example of the correct output:
-
-.. code-block:: console
-
-   ./sqream-start -s
-   Initializing network watchdogs on 3 hosts...
-   Network watchdogs are up and running
-   
-   Initializing 3 worker data collectors ...
-   Worker data collectors are up and running
-   
-   Starting Prometheus ...
-   Prometheus is available at 192.168.5.100:9090
-   
-   Starting SQream master ...
-   SQream master is up and running
-   
-   Starting up 3 SQream workers ...
-   All SQream workers are up and running, SQream-DB is available at 192.168.5.100:3108
-
-
-Go back to :ref:`Using the SQream-start Commands`.
-
-Starting Your SQream Services in Split Mode
---------------------------------------------
-Starting SQream services in split mode refers to running multiple SQream workers on a single GPU. You can do this by running the **sqream-start** command with the **-s** and **-z** flags. In addition, you can define the number of hosts to run the multiple workers on. In the example below, the command defines to run the multiple workers on three hosts.
-
-**To start SQream services in split mode:**
-
-1. Run the following command:
-
-   .. code-block:: console
-
-      $ ./sqream-start -s -z 3
-
-   This command starts the SQream metadata, server picker, and sqream workers on a single GPU for three hosts.
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      Initializing network watchdogs on 3 hosts...
-      Network watchdogs are up and running
-      
-      Initializing 3 worker data collectors ...
-      Worker data collectors are up and running
-      
-      Starting Prometheus ...
-      Prometheus is available at 192.168.5.101:9090
-      
-      Starting SQream master ...
-      SQream master is up and running
-      
-      Starting up 9 SQream workers over <#> available GPUs ...
-      All SQream workers are up and running, SQream-DB is available at 192.168.5.101:3108
-
-2. Verify that all pods are properly running in the k8s cluster (**STATUS** column):
-
-   .. code-block:: console
-
-      kubectl -n sqream get pods
-      
-      NAME                                          READY   STATUS     RESTARTS   AGE
-      prometheus-bcf877867-kxhld                    1/1     Running    0          106s
-      sqream-metadata-fbcbc989f-6zlkx               1/1     Running    0          103s
-      sqream-picker-64b8c57ff5-ndfr9                1/1     Running    2          102s
-      sqream-split-workers-0-1-2-6bdbfbbb86-ml7kn   1/1     Running    0          57s
-      sqream-split-workers-3-4-5-5cb49d49d7-596n4   1/1     Running    0          57s
-      sqream-split-workers-6-7-8-6d598f4b68-2n9z5   1/1     Running    0          56s
-      sqream-workers-start-xj75g                    1/1     Running    0          58s
-      watchdog-network-management-6dnfh             1/1     Running    0          115s
-      watchdog-network-management-tfd46             1/1     Running    0          115s
-      watchdog-network-management-xct4d             1/1     Running    0          115s
-      watchdog-network-storage-lr6v4                1/1     Running    0          116s
-      watchdog-network-storage-s29h7                1/1     Running    0          116s
-      watchdog-network-storage-sx9mw                1/1     Running    0          116s
-      worker-data-collector-62rxs                   0/1     Init:0/1   0          54s
-      worker-data-collector-n8jsv                   0/1     Init:0/1   0          55s
-      worker-data-collector-zp8vf                   0/1     Init:0/1   0          54s
-
-Go back to :ref:`Using the SQream-start Commands`.
-
-Starting the Sqream Studio UI
---------------------------------
-You can run the following command to start the SQream Studio UI (Editor and Dashboard):
-
-.. code-block:: console
-
-   $ ./sqream-start -a
-
-The following is an example of the correct output:
-
-.. code-block:: console
-
-   $ ./sqream-start -a
-   Please enter USERNAME:
-   sqream
-   Please enter PASSWORD:
-   ******
-   Please enter port value or press ENTER to keep 8080:
-   
-   Starting up SQream Admin UI...
-   SQream admin ui is available at 192.168.5.100:8080
-
-Go back to :ref:`Using the SQream-start Commands`.
-
-Stopping the SQream Services
---------------------------------
-You can run the following command to stop all SQream services:
-
-.. code-block:: console
-
-   $ ./sqream-start -d
-
-The following is an example of the correct output:
-
-.. code-block:: console
-
-   $ ./sqream-start -d
-   Cleaning all SQream services in sqream namespace ...
-   All SQream services removed from sqream namespace
-
-Go back to :ref:`Using the SQream-start Commands`.
-
-Advanced sqream-start Commands
---------------------------------
-
-Controlling Your SQream Spool Size
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-If you do not specify the SQream spool size, the console automatically distributes the available RAM between all running workers.
-
-You can define a specific spool size by running the following command:
-
-.. code-block:: console
-
-   $ ./sqream-start -s -m 4
-
-Using a Custom .json File
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-You have the option of using your own .json file for your own custom configurations. Your .json file must be placed within the path mounted in the installation. SQream recommends placing your .json file in the **configuration** folder.
-
-The SQream console does not validate the integrity of external .json files.
-
-You can use the following command (using the ``-j`` flag) to set the full path of your .json file to the configuration file:
-
-.. code-block:: console
-
-   $ ./sqream-start -s -j <json-file-name>.json
-
-This command starts one worker with an external configuration file.
-
-.. note:: The configuration file must be available in the shared configuration folder.
-
-
-Checking the Status of the SQream Services
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-You can show all running SQream services by running the following command:
-
-.. code-block:: console
-
-   $ kubectl get pods -n <namespace> -o wide
-
-This command shows all running services in the cluster and which nodes they are running in.
-
-Go back to :ref:`Using the SQream-start Commands`.
-
-Upgrading Your SQream Version
-================================
-The **Upgrading Your SQream Version** section describes the following:
-
-.. contents:: 
-   :local:
-   :depth: 1
-
-Before Upgrading Your System
-----------------------------
-Before upgrading your system you must do the following:
-
-1. Contact SQream support for a new SQream package tarball file.
-
-::
-
-2. Set a maintenance window.
-
-
-.. note:: You must stop the system while upgrading it.
-
-
-Upgrading Your System
-----------------------------
-After completing the steps in **Before Upgrading Your System** above, you can upgrade your system.
-
-**To upgrade your system:**
-
-1. Extract the contents of the tarball file that you received from SQream support. Make sure to extract the contents to the same directory as in :ref:`getting_sqream_package` and for the same user:
-
-   .. code-block:: console
-
-      $ tar -xvf sqream_installer-2.0.5-DB2019.2.1-CO1.6.3-ED3.0.0-x86_64/
-      $ cd sqream_installer-2.0.5-DB2019.2.1-CO1.6.3-ED3.0.0-x86_64/
-
-2. To start the upgrade process, run the following command:
-
-   .. code-block:: console
-
-      $ ./sqream-install -i
-
-   The upgrade process checks if the SQream services are running and prompts you to stop them.
-
-3. Do one of the following:
-
-   * Stop the upgrade by writing ``No``.
-   * Continue the upgrade by writing ``Yes``.
-
-   If you continue upgrading, all running SQream workers (master and editor) are stopped. When all services have been stopped, the new version is loaded.
-
-
-.. note:: SQream periodically upgrades its metadata structure. If an upgrade version includes an upgraded metadata service, an approval request message is displayed. This approval is required to finish the upgrade process. Because SQream supports only specific metadata versions, all SQream services must be upgraded at the same time.
-
-
-4. When SQream has successfully upgraded, load the SQream console and restart your services.
-
-For questions, contact SQream Support.
diff --git a/installation_guides/installing_studio_on_stand_alone_server.rst b/installation_guides/installing_studio_on_stand_alone_server.rst
index 874adba8d..1e069cbdc 100644
--- a/installation_guides/installing_studio_on_stand_alone_server.rst
+++ b/installation_guides/installing_studio_on_stand_alone_server.rst
@@ -1,180 +1,102 @@
 .. _installing_studio_on_stand_alone_server:

-.. _install_studio_top:
-
-***********************
+*****************************************
 Installing Studio on a Stand-Alone Server
-***********************
-
-
-The **Installing Studio on a Stand-Alone Server** guide describes how to install SQream Studio on a stand-alone server. A stand-alone server is a server that does not run SQream based on binary files, Docker, or Kubernetes.
+*****************************************

-The Installing Studio on a Stand-Alone Server guide includes the following sections:
+A stand-alone server is a server that does not run SQreamDB based on binary files.

 .. contents:: 
    :local:
    :depth: 1

-Installing NodeJS Version 12 on the Server
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Before installing Studio you must install NodeJS version 12 on the server.
-
-**To install NodeJS version 12 on the server:**
-
-1. Check if a version of NodeJS older than version *12.* has been installed on the target server.
-
-   .. code-block:: console
-
-      $ node -v
-
-   The following is the output if no version of NodeJS is installed on the target server:
-
-   .. code-block:: console
-
-      bash: /usr/bin/node: No such file or directory
-
-2. If a version of NodeJS older than *12.* has been installed, remove it as follows:
-
-   * On CentOS:
-
-     .. code-block:: console
-
-        $ sudo yum remove -y nodejs
-
-   * On Ubuntu:
-
-     .. code-block:: console
-
-        $ sudo apt remove -y nodejs
-
-3. If you have not installed NodeJS version 12, run the following commands:
-
-   * On CentOS:
-
-     .. code-block:: console
-
-        $ curl -sL https://rpm.nodesource.com/setup_12.x | sudo bash -
-        $ sudo yum clean all && sudo yum makecache fast
-        $ sudo yum install -y nodejs
-
-   * On Ubuntu:
-
-     .. code-block:: console
-
-        $ curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
-        $ sudo apt-get install -y nodejs
-
-   The following output is displayed if your installation has completed successfully:
-
-   .. code-block:: console
-
-      Transaction Summary
-      ==============================================================================
-      Install  1 Package
-      
-      Total download size: 22 M
-      Installed size: 67 M
-      Downloading packages:
-      warning: /var/cache/yum/x86_64/7/nodesource/packages/nodejs-12.22.1-1nodesource.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 34fa74dd: NOKEY
-      Public key for nodejs-12.22.1-1nodesource.x86_64.rpm is not installed
-      nodejs-12.22.1-1nodesource.x86_64.rpm                |  22 MB  00:00:02
-      Retrieving key from file:///etc/pki/rpm-gpg/NODESOURCE-GPG-SIGNING-KEY-EL
-      Importing GPG key 0x34FA74DD:
-       Userid     : "NodeSource <gpg@nodesource.com>"
-       Fingerprint: 2e55 207a 95d9 944b 0cc9 3261 5ddb e8d4 34fa 74dd
-       Package    : nodesource-release-el7-1.noarch (installed)
-       From       : /etc/pki/rpm-gpg/NODESOURCE-GPG-SIGNING-KEY-EL
-      Running transaction check
-      Running transaction test
-      Transaction test succeeded
-      Running transaction
-      Warning: RPMDB altered outside of yum.
-        Installing : 2:nodejs-12.22.1-1nodesource.x86_64          1/1
-        Verifying  : 2:nodejs-12.22.1-1nodesource.x86_64          1/1
-      
-      Installed:
-        nodejs.x86_64 2:12.22.1-1nodesource
-      
-      Complete!
-
-4. Confirm the Node version.
-
-   .. code-block:: console
-
-      $ node -v
-
-   The following is an example of the correct output:
-
-   .. code-block:: console
-
-      v12.22.1
-
-5. Install Prometheus using binary packages.
-
-   For more information on installing Prometheus using binary packages, see :ref:`installing_prometheus_using_binary_packages`.
-
-Back to :ref:`Installing Studio on a Stand-Alone Server`
+Before You Begin
+================
+It is essential you have :ref:`NodeJS 16 installed `.

 Installing Studio
-^^^^^^^^^^^^^^^^^
-After installing the Dashboard Data Collector, you can install Studio.
-
-**To install Studio:**
-
-1. Copy the SQream Studio package from SQream Artifactory into the target server. For access to the Sqream Studio package, contact SQream Support.
+=================

-::
+1. Copy the SQreamDB Studio package from SQreamDB Artifactory into the target server.
+   
+   For access to the SQreamDB Studio package, contact `SQreamDB Support `_.

 2. Extract the package:

    .. code-block:: console

-      $ tar -xvf sqream-acceleration-studio-.x86_64.tar.gz
+      tar -xvf sqream-acceleration-studio-.x86_64.tar.gz

-::
-
 3. Navigate to the new package folder.

    .. code-block:: console

-      $ cd sqream-admin
+      cd sqream-admin

 .. _add_parameter:

-4. Build the configuration file to set up Sqream Studio. You can use IP address **127.0.0.1** on a single server.
+4. Build the configuration file to set up SQreamDB Studio. You can use IP address **127.0.0.1** on a single server.

    .. code-block:: console

-      $ npm run setup -- -y --host= --port=3108
+      npm run setup -- -y --host= --port=3108

    The above command creates the **sqream-admin-config.json** configuration file in the **sqream-admin** folder and shows the following output:

    .. code-block:: console

       Config generated successfully. Run `npm start` to start the app.
+
+5. To make the communication between Studio and SQreamDB secure, in your configuration file do the following:

-   For more information about the available set-up arguments, see :ref:`Set-Up Arguments`.
-
-   ::
+   a. Change your ``port`` value to **3109**.
+   
+   b. Change your ``ssl`` flag value to **true**.
+   
+   The following is an example of the correctly modified configuration file:
+   
+   ..
code-block:: json + + { + "debugSqream": false, + "webHost": "localhost", + "webPort": 8080, + "webSslPort": 8443, + "logsDirectory": "", + "clusterType": "standalone", + "dataCollectorUrl": "", + "connections": [ + { + "host": "127.0.0.1", + "port":3109, + "isCluster": true, + "name": "default", + "service": "sqream", + "ssl":true, + "networkTimeout": 60000, + "connectionTimeout": 3000 + } + ] + } -5. If you have installed Studio on a server where SQream is already installed, move the **sqream-admin-config.json** file to **/etc/sqream/**: + Note that for the ``host`` value, it is essential that you use the IP address of your SQreamDB machine. + +6. If you have installed Studio on a server where SQreamDB is already installed, move the **sqream-admin-config.json** file to **/etc/sqream/**: .. code-block:: console - $ mv sqream-admin-config.json /etc/sqream + mv sqream-admin-config.json /etc/sqream -Back to :ref:`Installing Studio on a Stand-Alone Server` +Starting Studio +--------------- -Starting Studio Manually -^^^^^^^^^^^^^^^ -You can start Studio manually by running the following command: +Start Studio by running the following command: .. code-block:: console - $ cd /home/sqream/sqream-admin - $ NODE_ENV=production pm2 start ./server/build/main.js --name=sqream-studio -- start + cd /home/sqream/sqream-admin + NODE_ENV=production pm2 start ./server/build/main.js --name=sqream-studio -- start --config-location=/etc/sqream/sqream-admin-config.json The following output is displayed: @@ -188,61 +110,25 @@ The following output is displayed: │ 0 │ sqream-studio │ default │ 0.1.0 │ fork │ 11540 │ 0s │ 0 │ online │ 0% │ 15.6mb │ sqream │ disabled │ └─────┴──────────────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘ -Starting Studio as a Service -^^^^^^^^^^^^^^^ -Sqream uses the **Process Manager (PM2)** to maintain Studio. - -**To start Studio as a service:** -1. Run the following command: +1. If the **sqream-admin-config.json** file is not located in **/etc/sqream/**, run the following command: .. code-block:: console - $ sudo npm install -g pm2 - -:: - -2. Verify that the PM2 has been installed successfully. - - .. code-block:: console - - $ pm2 list - - The following is the output: + cd /home/sqream/sqream-admin + NODE_ENV=production pm2 start ./server/build/main.js --name=sqream-studio -- start - .. code-block:: console +2. To verify the process is running, use the ``pm2 list`` command: - ┌─────┬──────────────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐ - │ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │ - ├─────┼──────────────────┼─────────────┼─────────┼─────────┼──────────┼────────┼──────┼───────────┼──────────┼──────────┼──────────┼──────────┤ - │ 0 │ sqream-studio │ default │ 0.1.0 │ fork │ 11540 │ 2m │ 0 │ online │ 0% │ 31.5mb │ sqream │ disabled │ - └─────┴──────────────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘ + .. code-block:: -:: - -2. Start the service with PM2: - - * If the **sqream-admin-config.json** file is located in **/etc/sqream/**, run the following command: - - .. 
code-block:: console - - $ cd /home/sqream/sqream-admin - $ NODE_ENV=production pm2 start ./server/build/main.js --name=sqream-studio -- start --config-location=/etc/sqream/sqream-admin-config.json - - * If the **sqream-admin-config.json** file is not located in **/etc/sqream/**, run the following command: - - .. code-block:: console - - $ cd /home/sqream/sqream-admin - $ NODE_ENV=production pm2 start ./server/build/main.js --name=sqream-studio -- start - -:: + pm2 list -3. Verify that Studio is running. +3. Verify that Studio is running: .. code-block:: console - $ netstat -nltp + netstat -nltp 4. Verify that SQream_studio is listening on port 8080, as shown below: @@ -258,75 +144,63 @@ Sqream uses the **Process Manager (PM2)** to maintain Studio. tcp6 0 0 :::22 :::* LISTEN - tcp6 0 0 ::1:25 :::* LISTEN - - - -:: -5. Verify the following: +5. Verify that you can: - 1. That you can access Studio from your browser (``http://:8080``). - - :: + a. Access Studio from your browser (``http://:8080``) - 2. That you can log in to SQream. + b. Log in to SQreamDB -6. Save the configuration to run on boot. +6. Save the configuration to run on boot: .. code-block:: console - $ pm2 startup + pm2 startup The following is an example of the output: .. code-block:: console - $ sudo env PATH=$PATH:/usr/bin /usr/lib/node_modules/pm2/bin/pm2 startup systemd -u sqream --hp /home/sqream + sudo env PATH=$PATH:/usr/bin /usr/lib/node_modules/pm2/bin/pm2 startup systemd -u sqream --hp /home/sqream 7. Copy and paste the output above and run it. -:: -8. Save the configuration. + +8. Save the configuration: .. code-block:: console - $ pm2 save - -Back to :ref:`Installing Studio on a Stand-Alone Server` + pm2 save Accessing Studio -^^^^^^^^^^^^^^^ +---------------- + The Studio page is available on port 8080: ``http://:8080``. If port 8080 is blocked by the server firewall, you can unblock it by running the following command: .. code-block:: console - $ firewall-cmd --zone=public --add-port=8080/tcp --permanent - $ firewall-cmd --reload + firewall-cmd --zone=public --add-port=8080/tcp --permanent + firewall-cmd --reload -Back to :ref:`Installing Studio on a Stand-Alone Server` - Maintaining Studio with the Process Manager (PM2) -^^^^^^^^^^^^^^^ -Sqream uses the **Process Manager (PM2)** to maintain Studio. +------------------------------------------------- + +SQreamDB uses the **Process Manager (PM2)** to maintain Studio. You can use PM2 to do one of the following: * To check the PM2 service status: ``pm2 list`` - - :: * To restart the PM2 service: ``pm2 reload sqream-studio`` - - :: * To see the PM2 service logs: ``pm2 logs sqream-studio`` -Back to :ref:`Installing Studio on a Stand-Alone Server` - Upgrading Studio -^^^^^^^^^^^^^^^ +---------------- + To upgrade Studio you need to stop the version that you currently have. **To stop the current version of Studio:** @@ -335,244 +209,44 @@ To upgrade Studio you need to stop the version that you currently have. .. code-block:: console - $ pm2 list + pm2 list - The process name is displayed. + The process name is displayed.: .. code-block:: console - -:: 2. Run the following command with the process name: .. code-block:: console - $ pm2 stop + pm2 stop -:: - 3. If only one process is running, run the following command: .. code-block:: console - $ pm2 stop all + pm2 stop all -:: - -4. Change the name of the current **sqream-admin** folder to the old version. +4. Change the name of the current **sqream-admin** folder to the old version: .. 
code-block:: console - $ mv sqream-admin sqream-admin- + mv sqream-admin sqream-admin- -:: - -5. Extract the new Studio version. +5. Extract the new Studio version: .. code-block:: console - $ tar -xf sqream-acceleration-studio-tar.gz - -:: - -6. Rebuild the configuration file. You can use IP address **127.0.0.1** on a single server. - - .. code-block:: console - - $ npm run setup -- -y --host= --port=3108 - - The above command creates the **sqream-admin-config.json** configuration file in the **sqream_admin** folder. - -:: - -7. Copy the **sqream-admin-config.json** configuration file to **/etc/sqream/** to overwrite the old configuration file. - -:: - -8. Start PM2. - - .. code-block:: console + tar -xf sqream-acceleration-studio-tar.gz - $ pm2 start all - -Back to :ref:`Installing Studio on a Stand-Alone Server` - -.. _install_studio_docker_container: - -Installing Studio in a Docker Container -^^^^^^^^^^^^^^^^^^^^^^^ -This guide explains how to install SQream Studio in a Docker container and includes the following sections: - -.. contents:: - :local: - :depth: 1 - -Installing Studio --------------- -If you have already installed Docker, you can install Studio in a Docker container. - -**To install Studio:** - -1. Copy the downloaded image onto the target server. - -:: - -2. Load the Docker image. - - .. code-block:: console - - $ docker load -i - -:: - -3. If the downloaded image is called **sqream-acceleration-studio-5.1.3.x86_64.docker18.0.3.tar,** run the following command: +6. Start PM2: .. code-block:: console - $ docker load -i sqream-acceleration-studio-5.1.3.x86_64.docker18.0.3.tar + pm2 start all -:: - -4. Start the Docker container. - - .. code-block:: console - - $ docker run -d --restart=unless-stopped -p :8080 -e runtime=docker -e SQREAM_K8S_PICKER= -e SQREAM_PICKER_PORT= -e SQREAM_DATABASE_NAME= -e SQREAM_ADMIN_UI_PORT=8080 --name=sqream-admin-ui - - The following is an example of the command above: - - .. code-block:: console - - $ docker run -d --name sqream-studio -p 8080:8080 -e runtime=docker -e SQREAM_K8S_PICKER=192.168.0.183 -e SQREAM_PICKER_PORT=3108 -e SQREAM_DATABASE_NAME=master -e SQREAM_ADMIN_UI_PORT=8080 sqream-acceleration-studio:5.1.3 - -Back to :ref:`Installing Studio in a Docker Container` - -Accessing Studio ------------------ -You can access Studio from Port 8080: ``http://:8080``. - -If you want to use Studio over a secure connection (https), you must use the parameter values shown in the following table: - -.. list-table:: - :widths: 10 25 65 - :header-rows: 1 - - * - Parameter - - Default Value - - Description - * - ``--web-ssl-port`` - - 8443 - - - * - ``--web-ssl-key-path`` - - None - - The path of SSL key PEM file for enabling https. Leave empty to disable. - * - ``--web-ssl-cert-path`` - - None - - The path of SSL certificate PEM file for enabling https. Leave empty to disable. - -You can configure the above parameters using the following syntax: - -.. code-block:: console - - $ npm run setup -- -y --host=127.0.0.1 --port=3108 - -.. _using_docker_container_commands: - -Back to :ref:`Installing Studio in a Docker Container` - -Using Docker Container Commands ---------------- -When installing Studio in Docker, you can run the following commands: - -* View Docker container logs: - - .. code-block:: console - - $ docker logs -f sqream-admin-ui - -* Restart the Docker container: - - .. code-block:: console - - $ docker restart sqream-admin-ui - -* Kill the Docker container: - - .. 
code-block:: console - - $ docker rm -f sqream-admin-ui - -Back to :ref:`Installing Studio in a Docker Container` - -Setting Up Argument Configurations ----------------- -When creating the **sqream-admin-config.json** configuration file, you can add ``-y`` to create the configuration file in non-interactive mode. Configuration files created in non-interactive mode use all the parameter defaults not provided in the command. - -The following table shows the available arguments: - -.. list-table:: - :widths: 10 25 65 - :header-rows: 1 - - * - Parameter - - Default Value - - Description - * - ``--web--host`` - - 8443 - - - * - ``--web-port`` - - 8080 - - - * - ``--web-ssl-port`` - - 8443 - - - * - ``--web-ssl-key-path`` - - None - - The path of the SSL Key PEM file for enabling https. Leave empty to disable. - * - ``--web-ssl-cert-path`` - - None - - The path of the SSL Certificate PEM file for enabling https. Leave empty to disable. - * - ``--debug-sqream (flag)`` - - false - - - * - ``--host`` - - 127.0.0.1 - - - * - ``--port`` - - 3108 - - - * - ``is-cluster (flag)`` - - true - - - * - ``--service`` - - sqream - - - * - ``--ssl (flag)`` - - false - - Enables the SQream SSL connection. - * - ``--name`` - - default - - - * - ``--data-collector-url`` - - localhost:8100/api/dashboard/data - - Enables the Dashboard. Leaving this blank disables the Dashboard. Using a mock URL uses mock data. - * - ``--cluster-type`` - - standalone (``standalone`` or ``k8s``) - - - * - ``--config-location`` - - ./sqream-admin-config.json - - - * - ``--network-timeout`` - - 60000 (60 seconds) - - - * - ``--access-key`` - - None - - If defined, UI access is blocked unless ``?ui-access=`` is included in the URL. - -Back to :ref:`Installing Studio in a Docker Container` +7. To access Studio over a secure (HTTPS) connection, follow :ref:`NGINX instructions`. - :: -Back to :ref:`Installing Studio on a Stand-Alone Server` diff --git a/installation_guides/launching_sqream_with_monit.rst b/installation_guides/launching_sqream_with_monit.rst index b8dc4f1a6..5e0822734 100644 --- a/installation_guides/launching_sqream_with_monit.rst +++ b/installation_guides/launching_sqream_with_monit.rst @@ -3,6 +3,7 @@ ********************************************* Launching SQream with Monit ********************************************* + This procedure describes how to launch SQream using Monit. Launching SQream diff --git a/installation_guides/pre-installation_configurations.rst b/installation_guides/pre-installation_configurations.rst new file mode 100644 index 000000000..8a04f7c82 --- /dev/null +++ b/installation_guides/pre-installation_configurations.rst @@ -0,0 +1,979 @@ +.. _pre-installation_configurations: + +****************************** +Pre-Installation Configuration +****************************** + +Before installing SQreamDB, it is essential that you tune your system for better performance and stability. + +.. contents:: + :local: + :depth: 1 + +Basic Input/Output System Settings +================================== + +The first step when setting your pre-installation configurations is to use the basic input/output system (BIOS) settings. + +The BIOS settings may have a variety of names, or may not exist on your system. Each system vendor has a different set of settings and variables. It is safe to skip any and all of the configuration steps, but this may impact performance. + +If any doubt arises, consult the documentation for your server or your hardware vendor for the correct way to apply the settings. + +.. 
list-table::
+   :widths: 25 25 50
+   :header-rows: 1
+
+   * - Item
+     - Setting
+     - Rationale
+   * - **Management console access**
+     - **Connected**
+     - Connection to out-of-band (OOB) management is required to preserve continuous network uptime.
+   * - **All drives**
+     - **Connected and displayed on RAID interface**
+     - Prerequisite for cluster or OS installation.
+   * - **RAID volumes**
+     - **Configured according to project guidelines. Must be rebooted to take effect.**
+     - Clustered to increase logical volume and provide redundancy.
+   * - **Fan speed / Thermal Configuration**
+     - Dell fan speed: **High Maximum**. Specified minimum setting: **60**. HPe thermal configuration: **Increased cooling**.
+     - NVIDIA Tesla GPUs are passively cooled and require high airflow to operate at full performance.
+   * - **Power regulator or iDRAC power unit policy**
+     - HPe: **HP static high performance** mode enabled. Dell: **iDRAC power unit policy** (power cap policy) disabled.
+     - Other power profiles (such as "balanced") throttle the CPU and diminish performance. Throttling may also cause GPU failure.
+   * - **System Profile**, **Power Profile**, or **Performance Profile**
+     - **High Performance**
+     - The Performance profile provides potentially increased performance by maximizing processor frequency and disabling certain power-saving features such as C-states. Use this setting for environments that are not sensitive to power consumption.
+   * - **Power Cap Policy** or **Dynamic power capping**
+     - **Disabled**
+     - Other power profiles (like "balanced") throttle the CPU and may diminish performance or cause GPU failure. This setting may appear together with the above (Power profile or Power regulator). This setting allows disabling system ROM power calibration during the boot process. Power regulator settings are named differently in BIOS and iLO/iDRAC.
+   * - **Intel Turbo Boost**
+     - **Enabled**
+     - Intel Turbo Boost enables overclocking the processor to boost CPU-bound operation performance. Overclocking may risk computational jitter due to changes in the processor's turbo frequency. This causes brief pauses in processor operation, introducing uncertainty into application processing time. Turbo operation is a function of power consumption, processor temperature, and the number of active cores.
+   * - **Intel Virtualization Technology** (VT-d)
+     - **Disable**
+     - VT-d is optimal for running VMs. However, when running Linux natively, disabling VT-d boosts performance by up to 10%.
+   * - **Logical Processor**
+     - **HPe**: Enable **Hyperthreading** **Dell**: Enable **Logical Processor**
+     - Hyperthreading doubles the amount of logical processors, which may improve performance by ~5-10% for CPU-bound operations.
+   * - **Processor C-States** (Minimum processor idle power core state)
+     - **Disable**
+     - Processor C-States reduce server power when the system is in an idle state. This causes slower cold-starts when the system transitions from an idle to a load state, and may reduce query performance by up to 15%.
+   * - **HPe**: **Energy/Performance bias**
+     - **Maximum performance**
+     - Configures processor sub-systems for high-performance and low-latency. Other power profiles (like "balanced") throttle the CPU and may diminish performance. Use this setting for environments that are not sensitive to power consumption.
+   * - **HPe**: **DIMM voltage**
+     - **Optimized for Performance**
+     - Setting a higher voltage for DIMMs may increase performance.
+   * - **Memory Operating Mode**
+     - **Optimizer Mode**, **Disable Node Interleaving**, **Auto Memory Operating Voltage**
+     - Memory Operating Mode is tuned for performance in **Optimizer** mode. Other modes may improve reliability, but reduce performance. **Node Interleaving** should be disabled because enabling it interleaves the memory between memory nodes, which harms NUMA-aware applications such as SQreamDB.
+   * - **HPe**: **Memory power savings mode**
+     - **Maximum performance**
+     - This setting configures several memory parameters to optimize the performance of memory sub-systems. The default setting is **Balanced**.
+   * - **HPe ACPI SLIT**
+     - **Enabled**
+     - ACPI SLIT sets the relative access times between processors and memory and I/O sub-systems. ACPI SLIT enables operating systems to use this data to improve performance by more efficiently allocating resources and workloads.
+   * - **QPI Snoop**
+     - **Cluster on Die** or **Home Snoop**
+     - QPI (QuickPath Interconnect) Snoop lets you configure different Snoop modes that impact the QPI interconnect. Changing this setting may improve the performance of certain workloads. The default setting of **Home Snoop** provides high memory bandwidth in an average NUMA environment. **Cluster on Die** may provide increased memory bandwidth in highly optimized NUMA workloads. **Early Snoop** may decrease memory latency, but may result in lower overall bandwidth compared to other modes.
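+
+Once the operating system is installed, you can spot-check some of these settings from Linux. For example, with Hyperthreading (Logical Processor) enabled, ``lscpu`` should report two threads per core; a minimal check:
+
+.. code-block:: console
+
+   lscpu | grep -E "Thread|Core|Socket"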
+
+Installing the Operating System
+===============================
+
+Before You Begin
+----------------
+
+* Your system must have at least 200 gigabytes of free space on the root ``/`` mount.
+
+* For a multi-node cluster, you must have external shared storage provided by systems like General Parallel File System (GPFS), Weka, or VAST.
+
+* Once the BIOS settings have been set, you must install the operating system.
+
+* Make sure you use a supported OS version, as listed in the release notes of the installed version.
+
+* Verify the exact RHEL8 version with your storage vendor to avoid driver incompatibility.
+
+Installation
+------------
+
+#. Select a language (English recommended).
+#. From **Software Selection**, select **Minimal** and check the **Development Tools** group checkbox.
+
+   Selecting the **Development Tools** group installs the following tools:
+
+   * autoconf
+   * automake
+   * binutils
+   * bison
+   * flex
+   * gcc
+   * gcc-c++
+   * gettext
+   * libtool
+   * make
+   * patch
+   * pkgconfig
+   * redhat-rpm-config
+   * rpm-build
+   * rpm-sign
+
+#. Continue the installation.
+#. Set up the necessary drives and users as per the installation process.
+
+   The OS shell is booted up.
+
+Configuring the Operating System
+================================
+
+When configuring the operating system, several basic settings related to creating a new server are required. Configuring these as part of your basic set-up increases your server's security and usability.
+
+Creating a ``sqream`` User
+--------------------------
+
+**The sqream user must have the same UID and GID across all servers in your cluster.**
+
+If the ``sqream`` user does not have the same UID and GID across all servers, and there is no critical data stored under ``/home/sqream``, it is recommended to delete the ``sqream`` user and the ``sqream`` group from your servers. Subsequently, create new ones with the same ID, using the following commands:
+
+   .. code-block:: console
+
+      sudo userdel sqream
+      sudo rm /var/spool/mail/sqream
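+
+You can quickly confirm that the UID and GID match on every server. The following is a minimal sketch that assumes passwordless SSH and the hypothetical hostnames ``node1`` through ``node3``:
+
+.. code-block:: console
+
+   for h in node1 node2 node3; do ssh "$h" id sqream; done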
+
+Before adding a user with a specific UID and GID, it is crucial to verify that such IDs do not already exist.
+
+The steps below guide you through creating a ``sqream`` user with an example ID of ``1111``.
+
+1. Verify that a ``1111`` UID does not already exist:
+
+   .. code-block:: console
+
+      cat /etc/passwd | grep 1111
+
+2. Verify that a ``1111`` GID does not already exist:
+
+   .. code-block:: console
+
+      cat /etc/group | grep 1111
+
+3. Add a user with an identical UID on all cluster nodes:
+
+   .. code-block:: console
+
+      useradd -u 1111 sqream
+
+4. Add the ``sqream`` user to the ``wheel`` group:
+
+   .. code-block:: console
+
+      sudo usermod -aG wheel sqream
+
+   You can remove the ``sqream`` user from the ``wheel`` group when the installation and configuration are complete.
+
+5. Set a password for the ``sqream`` user:
+
+   .. code-block:: console
+
+      passwd sqream
+
+6. Log out and log back in as ``sqream``.
+
+7. If you deleted the ``sqream`` user and recreated it with a new ID, you must change the ownership of ``/home/sqream`` to avoid permission errors:
+
+   .. code-block:: console
+
+      sudo chown -R sqream:sqream /home/sqream
+
+Setting Up A Locale
+-------------------
+
+SQreamDB enables you to set up a locale using your own location. To list the available time zones, run the ``timedatectl list-timezones`` command.
+
+Set the language of the locale:
+
+.. code-block:: console
+
+   sudo localectl set-locale LANG=en_US.UTF-8
+
+Installing Required Software
+----------------------------
+
+.. contents::
+   :local:
+   :depth: 1
+
+Installing EPEL Repository
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   .. code-block:: console
+
+      sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
+
+Enabling Additional Red Hat Repositories
+""""""""""""""""""""""""""""""""""""""""
+
+Enabling additional Red Hat repositories is essential for installing the required packages in the subsequent procedures.
+
+   .. code-block:: console
+
+      sudo subscription-manager release --set=8.9
+      sudo subscription-manager repos --enable codeready-builder-for-rhel-8-x86_64-rpms
+      sudo subscription-manager repos --enable rhel-8-for-x86_64-appstream-rpms
+      sudo subscription-manager repos --enable rhel-8-for-x86_64-baseos-rpms
+
+Installing Required Packages
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   .. code-block:: console
+
+      sudo dnf install chrony pciutils monit zlib-devel openssl-devel kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc net-tools wget jq libffi-devel xz-devel ncurses-compat-libs libnsl gdbm-devel tk-devel sqlite-devel readline-devel texinfo
+
+Installing Recommended Tools
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   .. code-block:: console
+
+      sudo dnf install bash-completion.noarch vim-enhanced vim-common net-tools iotop htop psmisc screen xfsprogs wget yum-utils dos2unix
+
+**For SQreamDB version 4.4 or newer, install Python 3.9.13.**
+
+1. Download the Python 3.9.13 source code tarball file from the following URL into the ``/home/sqream`` directory:
+
+   .. code-block:: console
+
+      wget https://www.python.org/ftp/python/3.9.13/Python-3.9.13.tar.xz
+
+2. Extract the Python 3.9.13 source code into your current directory:
+
+   .. code-block:: console
+
+      tar -xf Python-3.9.13.tar.xz
+
+3. Navigate to the Python 3.9.13 directory:
+
+   .. code-block:: console
+
+      cd Python-3.9.13
+
+4. Run the ``./configure`` script:
+
+   .. code-block:: console
+
+      ./configure --enable-loadable-sqlite-extensions
+
+5. Build the software:
+
+   .. code-block:: console
+
+      make -j30
+
+6. Install the software:
+
+   .. code-block:: console
+
+      sudo make install
+
+7. Verify that Python 3.9.13 has been installed:
+
+   .. code-block:: console
+
+      python3 --version
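+
+Since the build above is configured with ``--enable-loadable-sqlite-extensions``, you may also want to confirm that the SQLite and SSL modules were built successfully; a minimal check:
+
+.. code-block:: console
+
+   python3 -c "import ssl, sqlite3; print(sqlite3.sqlite_version)"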
+
+.. _installing_nodejs:
+
+Installing NodeJS
+^^^^^^^^^^^^^^^^^
+
+NodeJS is necessary only when the UI runs on the same server as SQreamDB. If not, you can skip this step.
+
+1. Download the NodeJS binary tarball file from the following URL into the ``/home/sqream`` directory and extract it:
+
+   .. code-block:: console
+
+      wget https://nodejs.org/dist/v16.20.0/node-v16.20.0-linux-x64.tar.xz
+      tar -xf node-v16.20.0-linux-x64.tar.xz
+
+2. Move the node-v16.20.0-linux-x64 directory to the */usr/local* directory:
+
+   .. code-block:: console
+
+      sudo mv node-v16.20.0-linux-x64 /usr/local
+
+3. Navigate to the ``/usr/bin/`` directory:
+
+   .. code-block:: console
+
+      cd /usr/bin
+
+4. Create a symbolic link named ``node`` to ``../local/node-v16.20.0-linux-x64/bin/node``:
+
+   .. code-block:: console
+
+      sudo ln -s ../local/node-v16.20.0-linux-x64/bin/node node
+
+5. Create a symbolic link named ``npm`` to ``../local/node-v16.20.0-linux-x64/bin/npm``:
+
+   .. code-block:: console
+
+      sudo ln -s ../local/node-v16.20.0-linux-x64/bin/npm npm
+
+6. Create a symbolic link named ``npx`` to ``../local/node-v16.20.0-linux-x64/bin/npx``:
+
+   .. code-block:: console
+
+      sudo ln -s ../local/node-v16.20.0-linux-x64/bin/npx npx
+
+7. Install the ``pm2`` process manager:
+
+   .. code-block:: console
+
+      sudo npm install pm2 -g
+      cd /usr/bin
+      sudo ln -s ../local/node-v16.20.0-linux-x64/bin/pm2 pm2
+
+8. If installing the ``pm2`` process manager fails, install it offline:
+
+   a. On a machine with internet access, install the following:
+
+      * nodejs
+      * npm
+      * pm2
+
+   b. Archive the ``pm2`` module:
+
+      .. code-block:: console
+
+         cd /usr/local/node-v16.20.0-linux-x64/lib/node_modules
+         tar -czvf pm2_x86.tar.gz pm2
+
+   c. Copy the ``pm2_x86.tar.gz`` file to a server without access to the internet and extract it.
+
+   d. Move the ``pm2`` folder to the ``/usr/local/node-v16.20.0-linux-x64/lib/node_modules`` directory:
+
+      .. code-block:: console
+
+         sudo mv pm2 /usr/local/node-v16.20.0-linux-x64/lib/node_modules
+
+   e. Navigate back to the ``/usr/bin`` directory:
+
+      .. code-block:: console
+
+         cd /usr/bin
+
+   f. Create a symbolic link to the ``pm2`` service:
+
+      .. code-block:: console
+
+         sudo ln -s /usr/local/node-v16.20.0-linux-x64/lib/node_modules/pm2/bin/pm2 pm2
+
+   g. Verify that the installation was successful without using ``sudo``:
+
+      .. code-block:: console
+
+         pm2 list
+
+   h. Verify that the NodeJS version is correct:
+
+      .. code-block:: console
+
+         node --version
+
+Configuring Chrony for RHEL8 Only
+---------------------------------
+
+#. Start the Chrony service:
+
+   .. code-block:: console
+
+      sudo systemctl start chronyd
+
+#. Enable the Chrony service to start automatically at boot time:
+
+   .. code-block:: console
+
+      sudo systemctl enable chronyd
+
+#. Check the status of the Chrony service:
+
+   .. code-block:: console
+
+      sudo systemctl status chronyd
+
+Configuring the Server to Boot Without Linux GUI
+------------------------------------------------
+
+We recommend that you configure your server to boot without a Linux GUI by running the following command:
+
+   .. code-block:: console
+
+      sudo systemctl set-default multi-user.target
+
+Running this command activates the **NO-UI** server mode.
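+
+To confirm that the change took effect, you can query the default systemd target; the expected output is ``multi-user.target``:
+
+.. code-block:: console
+
+   systemctl get-default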
+
+Configuring the Security Limits
+-------------------------------
+
+The security limits refer to the maximum number of open files, processes, and so on.
+
+   .. code-block:: console
+
+      sudo bash
+
+   .. code-block:: console
+
+      echo -e "sqream soft nproc 1000000\nsqream hard nproc 1000000\nsqream soft nofile 1000000\nsqream hard nofile 1000000\nroot soft nproc 1000000\nroot hard nproc 1000000\nroot soft nofile 1000000\nroot hard nofile 1000000\nsqream soft core unlimited\nsqream hard core unlimited" >> /etc/security/limits.conf
+
+Configuring the Kernel Parameters
+---------------------------------
+
+1. Insert a new line after each kernel parameter:
+
+   .. code-block:: console
+
+      echo -e "vm.dirty_background_ratio = 5 \n vm.dirty_ratio = 10 \n vm.swappiness = 10 \n vm.vfs_cache_pressure = 200 \n vm.zone_reclaim_mode = 0 \n" >> /etc/sysctl.conf
+
+2. Check the maximum value of ``fs.file-max``:
+
+   .. code-block:: console
+
+      sysctl -n fs.file-max
+
+Configuring the Firewall
+------------------------
+
+The example in this section shows the open ports for four ``sqreamd`` sessions. If more than four are required, open the required ports as needed. Port 8080 in the example below is a new UI port.
+
+The ports listed below are required, and the same logic applies to all additional SQreamDB Worker ports.
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Port
+     - Use
+   * - 8080
+     - UI port
+   * - 443
+     - UI over HTTPS (requires NGINX installation)
+   * - 3105
+     - SQreamDB metadataserver service
+   * - 3108
+     - SQreamDB serverpicker service
+   * - 3109
+     - SQreamDB serverpicker service over SSL
+   * - 5000
+     - SQreamDB first worker default port
+   * - 5100
+     - SQreamDB first worker over SSL default port
+   * - 5001
+     - SQreamDB second worker default port
+   * - 5101
+     - SQreamDB second worker over SSL default port
+
+1. Start the ``firewalld`` service:
+
+   .. code-block:: console
+
+      systemctl start firewalld
+
+2. Add the following ports to the permanent firewall:
+
+   .. code-block:: console
+
+      firewall-cmd --zone=public --permanent --add-port=8080/tcp
+      firewall-cmd --zone=public --permanent --add-port=3105/tcp
+      firewall-cmd --zone=public --permanent --add-port=3108/tcp
+      firewall-cmd --zone=public --permanent --add-port=5000-5003/tcp
+      firewall-cmd --zone=public --permanent --add-port=5100-5103/tcp
+      firewall-cmd --permanent --list-all
+
+3. Reload the firewall:
+
+   .. code-block:: console
+
+      firewall-cmd --reload
+
+4. Enable ``firewalld`` on boot:
+
+   .. code-block:: console
+
+      systemctl enable firewalld
+
+   If you do not need the firewall, you can disable it:
+
+   .. code-block:: console
+
+      sudo systemctl stop firewalld
+      sudo systemctl disable firewalld
+
+Disabling SELinux
+-----------------
+
+Disabling SELinux is a recommended action.
+
+1. Show the status of ``selinux``:
+
+   .. code-block:: console
+
+      sudo sestatus
+
+2. If the output is not ``disabled``, edit the ``/etc/selinux/config`` file:
+
+   .. code-block:: console
+
+      sudo vim /etc/selinux/config
+
+3. Change ``SELINUX=enforcing`` to ``SELINUX=disabled``.
+
+   The above change only takes effect after rebooting the server.
+
+   You can disable SELinux immediately, without rebooting the server, by running the following command:
+
+   .. code-block:: console
+
+      sudo setenforce 0
+
+Configuring the ``/etc/hosts`` File
+-----------------------------------
+
+1. Edit the ``/etc/hosts`` file:
+
+   .. code-block:: console
+
+      sudo vim /etc/hosts
+
+2. Add the localhost entry:
+
+   .. code-block:: console
+
+      127.0.0.1 localhost
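+
+Before moving on to the CUDA driver, you can confirm that the SELinux change took effect; ``getenforce`` should report ``Permissive`` (or ``Disabled`` after a reboot):
+
+.. code-block:: console
+
+   getenforce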
+
+Installing the NVIDIA CUDA Driver
+=================================
+
+After configuring your operating system, you must install the NVIDIA CUDA driver.
+
+.. warning:: If your Linux GUI runs on the server, it must be stopped before installing the CUDA drivers.
+
+Before You Begin
+----------------
+
+1. Verify that the NVIDIA card has been installed and is detected by the system:
+
+   .. code-block:: console
+
+      lspci | grep -i nvidia
+
+2. Verify that ``gcc`` has been installed:
+
+   .. code-block:: console
+
+      gcc --version
+
+3. If ``gcc`` has not been installed, install it for RHEL:
+
+   .. code-block:: console
+
+      sudo yum install -y gcc
+
+Updating the Kernel Headers
+---------------------------
+
+1. Update the kernel headers on RHEL:
+
+   .. code-block:: console
+
+      sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
+
+2. Make sure that ``kernel-devel`` and ``kernel-headers`` match the installed kernel:
+
+   .. code-block:: console
+
+      uname -r
+      rpm -qa | grep kernel-devel-$(uname -r)
+      rpm -qa | grep kernel-headers-$(uname -r)
+
+Disabling Nouveau
+-----------------
+
+Disable Nouveau, which is the operating system's default open-source NVIDIA driver.
+
+1. Check if the Nouveau driver has been loaded:
+
+   .. code-block:: console
+
+      lsmod | grep nouveau
+
+   If the Nouveau driver has been loaded, the command above generates output. If the Nouveau driver has not been loaded, you may skip steps 2 and 3.
+
+2. Blacklist the Nouveau driver to disable it:
+
+   .. code-block:: console
+
+      cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
+      blacklist nouveau
+      options nouveau modeset=0
+      EOF
+
+3. Regenerate the initramfs and reboot the server so that the change takes effect:
+
+   .. code-block:: console
+
+      sudo dracut --force
+      sudo reboot
+
+For more information, see the `NVIDIA CUDA Installation Guide for Linux <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/>`_.
+
+Installing the CUDA Driver from the Repository
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Installing the CUDA driver from the repository is the recommended installation method.
+
+1. Install the CUDA dependencies for RHEL:
+
+   .. code-block:: console
+
+      sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
+
+2. (Optional) Install the CUDA dependencies from the ``epel`` repository:
+
+   .. code-block:: console
+
+      sudo yum install dkms libvdpau
+
+   Installing the CUDA dependencies from the ``epel`` repository is only required for the ``runfile`` installation method.
+
+3. Download and install the required local repository:
+
+   * **RHEL8.8/8.9 CUDA 12.3.2 repository (Intel) installation (required for H/L Series GPU models):**
+
+   .. code-block:: console
+
+      wget https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda-repo-rhel8-12-3-local-12.3.2_545.23.08-1.x86_64.rpm
+      sudo dnf localinstall cuda-repo-rhel8-12-3-local-12.3.2_545.23.08-1.x86_64.rpm
+
+   .. code-block:: console
+
+      sudo dnf clean all
+      sudo dnf -y module install nvidia-driver:latest-dkms
+
+Tuning Up NVIDIA Performance
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following procedures exclusively relate to Intel.
+
+.. contents::
+   :local:
+   :depth: 1
+
+Tune Up NVIDIA Performance when the Driver Is Installed from the Repository
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+1. Check the service status:
+
+   .. code-block:: console
+
+      sudo systemctl status nvidia-persistenced
+
+   If the service exists, it will be stopped by default.
+
+2. Start the service:
+
+   .. code-block:: console
+
+      sudo systemctl start nvidia-persistenced
+
+3. Verify that no errors have occurred:
+
+   .. code-block:: console
+
+      sudo systemctl status nvidia-persistenced
+
+4. Enable the service to start up on boot:
+
+   .. code-block:: console
+
+      sudo systemctl enable nvidia-persistenced
+
+5. For **H100/A100**, add the following line:
+
+   .. code-block:: console
+
+      nvidia-persistenced
+
+6. Reboot the server and run the **NVIDIA System Management Interface (NVIDIA SMI)**:
+
+   .. code-block:: console
+
+      nvidia-smi
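+
+To confirm that persistence mode is active after the reboot, you can query the GPU state; a minimal check:
+
+.. code-block:: console
+
+   nvidia-smi -q | grep -i "persistence mode"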
+
+Tune Up NVIDIA Performance when the Driver Is Installed from the Runfile
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+1. Change the permissions on the ``rc.local`` file to ``executable``:
+
+   .. code-block:: console
+
+      sudo chmod +x /etc/rc.local
+
+2. Edit the ``/etc/rc.local`` file:
+
+   .. code-block:: console
+
+      sudo vim /etc/rc.local
+
+3. Add the following lines:
+
+   * **For H100/A100**:
+
+   .. code-block:: console
+
+      nvidia-persistenced
+
+4. Reboot the server and run the **NVIDIA System Management Interface (NVIDIA SMI)**:
+
+   .. code-block:: console
+
+      nvidia-smi
+
+Enabling Core Dumps
+===================
+
+While this procedure is optional, SQreamDB recommends that core dumps be enabled. Note that the default ``abrt`` format is not ``gdb`` compatible, and that for SQreamDB support to be able to analyze your core dumps, they must be ``gdb`` compatible.
+
+.. contents::
+   :local:
+   :depth: 1
+
+Checking the ``abrtd`` Status
+-----------------------------
+
+1. Check if ``abrtd`` is running:
+
+   .. code-block:: console
+
+      sudo ps -ef | grep abrt
+
+2. If **abrtd** is running, stop it:
+
+   .. code-block:: console
+
+      for i in abrt-ccpp.service abrtd.service abrt-oops.service abrt-pstoreoops.service abrt-vmcore.service abrt-xorg.service ; do sudo systemctl disable $i; sudo systemctl stop $i; done
+
+Setting the Limits
+------------------
+
+1. Check the current core file size limit:
+
+   .. code-block:: console
+
+      ulimit -c
+
+2. If the output is ``0``, add the following lines to the ``/etc/security/limits.conf`` file:
+
+   .. code-block:: console
+
+      * soft core unlimited
+      * hard core unlimited
+
+3. To apply the limit changes, log out and log back in.
+
+Creating the Core Dump Directory
+--------------------------------
+
+Because the core dump file may be the size of the total RAM on the server, verify that you have sufficient disk space. In the example below, core dumps are written to the ``/tmp/core_dumps`` directory. If necessary, replace the path according to your own environment and disk space.
+
+1. Make the ``/tmp/core_dumps`` directory:
+
+   .. code-block:: console
+
+      mkdir /tmp/core_dumps
+
+2. Set the ownership of the ``/tmp/core_dumps`` directory:
+
+   .. code-block:: console
+
+      sudo chown sqream:sqream /tmp/core_dumps
+
+3. Grant read, write, and execute permissions to all users:
+
+   .. code-block:: console
+
+      sudo chmod -R 777 /tmp/core_dumps
+
+Setting the Output Directory on the ``/etc/sysctl.conf`` File
+-------------------------------------------------------------
+
+1. Open the ``/etc/sysctl.conf`` file in the Vim text editor:
+
+   .. code-block:: console
+
+      sudo vim /etc/sysctl.conf
+
+2. Add the following to the bottom of the file:
+
+   .. code-block:: console
+
+      kernel.core_uses_pid = 1
+      kernel.core_pattern = /tmp/core_dumps/core-%e-%s-%u-%g-%p-%t
+      fs.suid_dumpable = 2
+
+3. To apply the changes without rebooting the server, run the following:
+
+   .. code-block:: console
+
+      sudo sysctl -p
+
+4. Check that the core output directory points to the following:
+
+   .. code-block:: console
+
+      sudo cat /proc/sys/kernel/core_pattern
+
+   The following shows the correct generated output:
+
+   .. code-block:: console
+
+      /tmp/core_dumps/core-%e-%s-%u-%g-%p-%t
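+
+Before involving SQreamDB, you can sanity-check the core dump pipeline with a disposable process; killing it with ``SIGSEGV`` should leave a new core file in ``/tmp/core_dumps``:
+
+.. code-block:: console
+
+   sleep 60 &
+   kill -s SIGSEGV $!
+   ls -l /tmp/core_dumps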
+
+Verifying that the Core Dumps Work
+----------------------------------
+
+You can verify that the core dumps work only after installing and running SQreamDB. Running the command below causes the server to crash and a new ``core.xxx`` file to be written to the directory configured in ``/etc/sysctl.conf``.
+
+1. Stop and restart all SQreamDB services.
+
+   ::
+
+2. Connect to SQreamDB with ClientCmd and run the following command:
+
+   .. code-block:: console
+
+      select abort_server();
+
+Verify Your SQreamDB Installation
+---------------------------------
+
+1. Verify that the ``sqream`` user exists and has the same ID on all cluster servers:
+
+   .. code-block:: console
+
+      id sqream
+
+2. Verify that the storage is mounted on all cluster servers:
+
+   .. code-block:: console
+
+      mount
+
+3. Make sure that the NVIDIA driver is properly installed:
+
+   .. code-block:: console
+
+      nvidia-smi
+
+4. Verify that the kernel file-handles allocation is greater than or equal to ``2097152``:
+
+   .. code-block:: console
+
+      sysctl -n fs.file-max
+
+5. Verify the limits (run this command as the ``sqream`` user):
+
+   .. code-block:: console
+
+      ulimit -c -u -n
+
+   Desired output::
+
+      core file size          (blocks, -c) unlimited
+      max user processes              (-u) 1000000
+      open files                      (-n) 1000000
+
+Troubleshooting Core Dumping
+----------------------------
+
+This section describes the troubleshooting procedure to be followed if all parameters have been configured correctly, but the cores have not been created.
+
+1. Reboot the server.
+
+   ::
+
+2. Verify that you have folder permissions:
+
+   .. code-block:: console
+
+      sudo chmod -R 777 /tmp/core_dumps
+
+3. Verify that the limits have been set correctly:
+
+   .. code-block:: console
+
+      ulimit -c
+
+   If all parameters have been configured correctly, the correct output is:
+
+   .. code-block:: console
+
+      core file size (blocks, -c) unlimited
+
+4. If all parameters have been configured correctly, but running ``ulimit -c`` outputs ``0``, run the following:
+
+   .. code-block:: console
+
+      sudo vim /etc/profile
+
+5. Search for the following line and disable it using the ``#`` symbol:
+
+   .. code-block:: console
+
+      ulimit -S -c 0 > /dev/null 2>&1
+
+6. Log out and log back in.
+
+   ::
+
+7. Run the ``ulimit -a`` command and confirm that the core file size is now ``unlimited``:
+
+   .. code-block:: console
+
+      ulimit -a
+
+8. If the line is not found in ``/etc/profile``, do the following:
+
+   a. Run the following command:
+
+      .. code-block:: console
+
+         sudo vim /etc/init.d/functions
+
+   b. Search for the following line, disable it using the ``#`` symbol, and reboot the server:
+
+      .. code-block:: console
+
+         ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0} >/dev/null 2>&1
diff --git a/installation_guides/recommended_pre-installation_configurations.rst b/installation_guides/recommended_pre-installation_configurations.rst
deleted file mode 100644
index cf6328a34..000000000
--- a/installation_guides/recommended_pre-installation_configurations.rst
+++ /dev/null
@@ -1,1289 +0,0 @@
-.. _recommended_pre-installation_configurations:
-
-*********************************************
-Recommended Pre-Installation Configuration
-*********************************************
-Before :ref:`installing SQream DB`, SQream recommends you to tune your system for better performance and stability.
-
-This page provides recommendations for production deployments of SQream and describes the following:
-
-.. 
contents:: - :local: - :depth: 1 - -Recommended BIOS Settings -========================== -The first step when setting your pre-installation configurations is to use the recommended BIOS settings. - -The BIOS settings may have a variety of names, or may not exist on your system. Each system vendor has a different set of settings and variables. It is safe to skip any and all of the configuration steps, but this may impact performance. - -If any doubt arises, consult the documentation for your server or your hardware vendor for the correct way to apply the settings. - -.. list-table:: - :widths: 25 25 50 - :header-rows: 1 - - * - Item - - Setting - - Rationale - * - **Management console access** - - **Connected** - - Connection to OOB required to preserve continuous network uptime. - * - **All drives** - - **Connected and displayed on RAID interface** - - Prerequisite for cluster or OS installation. - * - **RAID volumes.** - - **Configured according to project guidelines. Must be rebooted to take effect.** - - Clustered to increase logical volume and provide redundancy. - * - **Fan speed Thermal Configuration.** - - Dell fan speed: **High Maximum**. Specified minimum setting: **60**. HPe thermal configuration: **Increased cooling**. - - NVIDIA Tesla GPUs are passively cooled and require high airflow to operate at full performance. - * - **Power regulator or iDRAC power unit policy** - - HPe: **HP static high performance** mode enabled. Dell: **iDRAC power unit policy** (power cap policy) disabled. - - Other power profiles (such as "balanced") throttle the CPU and diminishes performance. Throttling may also cause GPU failure. - * - **System Profile**, **Power Profile**, or **Performance Profile** - - **High Performance** - - The Performance profile provides potentially increased performance by maximizing processor frequency, and the disabling certain power saving features such as C-states. Use this setting for environments that are not sensitive to power consumption. - * - **Power Cap Policy** or **Dynamic power capping** - - **Disabled** - - Other power profiles (like "balanced") throttle the CPU and may diminish performance or cause GPU failure. This setting may appear together with the above (Power profile or Power regulator). This setting allows disabling system ROM power calibration during the boot process. Power regulator settings are named differently in BIOS and iLO/iDRAC. - * - **Intel Turbo Boost** - - **Enabled** - - Intel Turbo Boost enables overclocking the processor to boost CPU-bound operation performance. Overclocking may risk computational jitter due to changes in the processor's turbo frequency. This causes brief pauses in processor operation, introducing uncertainty into application processing time. Turbo operation is a function of power consumption, processor temperature, and the number of active cores. - * - **Logical Processor** - - **HPe**: Enable **Hyperthreading** **Dell**: Enable **Logical Processor** - - Hyperthreading doubles the amount of logical processors, which may improve performance by ~5-10% for CPU-bound operations. - * - **Intel Virtualization Technology** (VT-d) - - **Disable** - - VT-d is optimal for running VMs. However, when running Linux natively, disabling VT-d boosts performance by up to 10%. - * - **Logical Processor** - - **HPe**: Enable **Hyperthreading** **Dell**: Enable **Logical Processor** - - Hyperthreading doubles the amount of logical processors, which may improve performance by ~5-10% for CPU-bound operations. 
- * - **Intel Virtualization Technology** (VT-d) - - **Disable** - - VT-d is optimal for running VMs. However, when running Linux natively, disabling VT-d boosts performance by up to 10%. - * - **Processor C-States** (Minimum processor idle power core state) - - **Disable** - - Processor C-States reduce server power when the system is in an idle state. This causes slower cold-starts when the system transitions from an idle to a load state, and may reduce query performance by up to 15%. - * - **HPe**: **Energy/Performance bias** - - **Maximum performance** - - Configures processor sub-systems for high-performance and low-latency. Other power profiles (like "balanced") throttle the CPU and may diminish performance. Use this setting for environments that are not sensitive to power consumption. - * - **HPe**: **DIMM voltage** - - **Optimized for Performance** - - Setting a higher voltage for DIMMs may increase performance. - * - **Memory Operating Mode** - - **Optimizer Mode**, **Disable Node Interleaving**, **Auto Memory Operating Voltage** - - Memory Operating Mode is tuned for performance in **Optimizer** mode. Other modes may improve reliability, but reduce performance. **Node Interleaving** should be disabled because enabling it interleaves the memory between memory nodes, which harms NUMA-aware applications such as SQream DB. - * - **HPe**: **Memory power savings mode** - - **Maximum performance** - - This setting configures several memory parameters to optimize the performance of memory sub-systems. The default setting is **Balanced**. - * - **HPe ACPI SLIT** - - **Enabled** - - ACPI SLIT sets the relative access times between processors and memory and I/O sub-systems. ACPI SLIT enables operating systems to use this data to improve performance by more efficiently allocating resources and workloads. - * - **QPI Snoop** - - **Cluster on Die** or **Home Snoop** - - QPI (QuickPath Interconnect) Snoop lets you configure different Snoop modes that impact the QPI interconnect. Changing this setting may improve the performance of certain workloads. The default setting of **Home Snoop** provides high memory bandwidth in an average NUMA environment. **Cluster on Die** may provide increased memory bandwidth in highly optimized NUMA workloads. **Early Snoop** may decrease memory latency, but may result in lower overall bandwidth compared to other modes. - -Installing the Operating System -=================================================== -Once the BIOS settings have been set, you must install the operating system. Either the CentOS (versions 7.6-7.9) or RHEL (versions 7.6-7.9) must be installed before installing the SQream database, by either the customer or a SQream representative. - -**To install the operating system:** - -#. Select a language (English recommended). -#. From **Software Selection**, select **Minimal**. -#. Select the **Development Tools** group checkbox. -#. Continue the installation. -#. Set up the necessary drives and users as per the installation process. - - Using Debugging Tools is recommended for future problem-solving if necessary. - -Selecting the **Development Tools** group installs the following tools: - - * autoconf - * automake - * binutils - * bison - * flex - * gcc - * gcc-c++ - * gettext - * libtool - * make - * patch - * pkgconfig - * redhat-rpm-config - * rpm-build - * rpm-sign - -The root user is created and the OS shell is booted up. 
- -Configuring the Operating System -=================================================== -Once you've installted your operation system, you can configure it. When configuring the operating system, several basic settings related to creating a new server are required. Configuring these as part of your basic set-up increases your server's security and usability. - -Logging In to the Server --------------------------------- -You can log in to the server using the server's IP address and password for the **root** user. The server's IP address and **root** user were created while installing the operating system above. - -Automatically Creating a SQream User ------------------------------------- - -**To automatically create a SQream user:** - -#. If a SQream user was created during installation, verify that the same ID is used on every server: - - .. code-block:: console - - $ sudo id sqream - - The ID **1000** is used on each server in the following example: - - .. code-block:: console - - $ uid=1000(sqream) gid=1000(sqream) groups=1000(sqream) - -2. If the ID's are different, delete the SQream user and SQream group from both servers: - - .. code-block:: console - - $ sudo userdel sqream - -3. Recreate it using the same ID: - - .. code-block:: console - - $ sudo rm /var/spool/mail/sqream - -Manually Creating a SQream User --------------------------------- - -**To manually create a SQream user:** - -SQream enables you to manually create users. This section shows you how to manually create a user with the UID **1111**. You cannot manually create during the operating system installation procedure. - -1. Add a user with an identical UID on all cluster nodes: - - .. code-block:: console - - $ useradd -u 1111 sqream - -2. Add the user **sqream** to the **wheel** group. - - .. code-block:: console - - $ sudo usermod -aG wheel sqream - - You can remove the SQream user from the **wheel** group when the installation and configuration are complete: - - .. code-block:: console - - $ passwd sqream - -3. Log out and log back in as **sqream**. - - .. note:: If you deleted the **sqream** user and recreated it with different ID, to avoid permission errors, you must change its ownership to /home/sqream. - -4. Change the **sqream** user's ownership to /home/sqream: - - .. code-block:: console - - $ sudo chown -R sqream:sqream /home/sqream - -Setting Up A Locale --------------------------------- - -SQream enables you to set up a locale. In this example, the locale used is your own location. - -**To set up a locale:** - -1. Set the language of the locale: - - .. code-block:: console - - $ sudo localectl set-locale LANG=en_US.UTF-8 - -2. Set the time stamp (time and date) of the locale: - - .. code-block:: console - - $ sudo timedatectl set-timezone Asia/Jerusalem - -If needed, you can run the **timedatectl list-timezones** command to see your current time-zone. - - -Installing the Required Packages --------------------------------- -You can install the required packages by running the following command: - -.. code-block:: console - - $ sudo yum install ntp pciutils monit zlib-devel openssl-devel kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc net-tools wget jq - - -Installing the Recommended Tools --------------------------------- -You can install the recommended tools by running the following command: - -.. 
Installing the Required Packages
--------------------------------
You can install the required packages by running the following command:

.. code-block:: console

   $ sudo yum install ntp pciutils monit zlib-devel openssl-devel kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc net-tools wget jq

Installing the Recommended Tools
--------------------------------
You can install the recommended tools by running the following command:

.. code-block:: console

   $ sudo yum install bash-completion.noarch vim-enhanced vim-common net-tools iotop htop psmisc screen xfsprogs wget yum-utils deltarpm dos2unix

Installing Python 3.6.7
--------------------------------
1. Download the Python 3.6.7 source code tarball file from the following URL into the **/home/sqream** directory:

   .. code-block:: console

      $ wget https://www.python.org/ftp/python/3.6.7/Python-3.6.7.tar.xz

2. Extract the Python 3.6.7 source code into your current directory:

   .. code-block:: console

      $ tar -xf Python-3.6.7.tar.xz

3. Navigate to the Python 3.6.7 directory:

   .. code-block:: console

      $ cd Python-3.6.7

4. Run the **./configure** script:

   .. code-block:: console

      $ ./configure

5. Build the software:

   .. code-block:: console

      $ make -j30

6. Install the software:

   .. code-block:: console

      $ sudo make install

7. Verify that Python 3.6.7 has been installed:

   .. code-block:: console

      $ python3

Installing NodeJS on CentOS
--------------------------------
**To install node.js on CentOS:**

1. Download and run the **setup_12.x** script as a root user logged-in shell:

   .. code-block:: console

      $ curl -sL https://rpm.nodesource.com/setup_12.x | sudo bash -

2. Clear the YUM cache and update the local metadata:

   .. code-block:: console

      $ sudo yum clean all && sudo yum makecache fast

3. Install the **node.js** package:

   .. code-block:: console

      $ sudo yum install -y nodejs

4. Install the **pm2** package via npm and make it available for all users:

   .. code-block:: console

      $ sudo npm install pm2 -g

Installing NodeJS on Ubuntu
--------------------------------
**To install node.js on Ubuntu:**

1. Download and run the **setup_12.x** script as a root user logged-in shell (note that Ubuntu uses the **deb** repository rather than the **rpm** repository):

   .. code-block:: console

      $ curl -sL https://deb.nodesource.com/setup_12.x | sudo bash -

2. Install the node.js package:

   .. code-block:: console

      $ sudo apt-get install -y nodejs

3. Install the **pm2** package via npm and make it available for all users:

   .. code-block:: console

      $ sudo npm install pm2 -g

Installing NodeJS Offline
-------------------------------------------
**To install NodeJS offline:**

1. Download the NodeJS tarball file from the following URL into the **/home/sqream** directory:

   .. code-block:: console

      $ wget https://nodejs.org/dist/v12.13.0/node-v12.13.0-linux-x64.tar.xz

2. Extract the tarball file and move the resulting **node-v12.13.0-linux-x64** directory to the */usr/local* directory:

   .. code-block:: console

      $ tar -xf node-v12.13.0-linux-x64.tar.xz
      $ sudo mv node-v12.13.0-linux-x64 /usr/local

3. Navigate to the */usr/bin/* directory:

   .. code-block:: console

      $ cd /usr/bin

4. Create a symbolic link to the *../local/node-v12.13.0-linux-x64/bin/node* binary:

   .. code-block:: console

      $ sudo ln -s ../local/node-v12.13.0-linux-x64/bin/node node

5. Create a symbolic link to the *../local/node-v12.13.0-linux-x64/bin/npm* binary:

   .. code-block:: console

      $ sudo ln -s ../local/node-v12.13.0-linux-x64/bin/npm npm

6. Create a symbolic link to the *../local/node-v12.13.0-linux-x64/bin/npx* binary:

   .. code-block:: console

      $ sudo ln -s ../local/node-v12.13.0-linux-x64/bin/npx npx

7. Verify that the node version is correct:

   .. code-block:: console

      $ node --version
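Because the npm and npx symbolic links were created the same way as the node link, it is worth confirming that they resolve as well. A quick check (the version numbers shown are those bundled with Node.js 12.13.0 and may differ on your system):

.. code-block:: console

   $ npm --version
   6.12.0
   $ npx --version
   6.12.0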
Installing the pm2 Service Offline
-------------------------------------------
**To install the pm2 service offline:**

1. On a machine with internet access, install the following:

   * nodejs
   * npm
   * pm2

2. Archive the pm2 module from its module directory:

   .. code-block:: console

      $ cd /usr/local/node-v12.13.0-linux-x64/lib/node_modules
      $ tar -czvf pm2_x86.tar.gz pm2

3. Copy the **pm2_x86.tar.gz** file to a server without access to the internet and extract it.

4. Move the **pm2** folder to the */usr/local/node-v12.13.0-linux-x64/lib/node_modules* directory:

   .. code-block:: console

      $ sudo mv pm2 /usr/local/node-v12.13.0-linux-x64/lib/node_modules

5. Navigate back to the */usr/bin* directory:

   .. code-block:: console

      $ cd /usr/bin

6. Create a symbolic link to the **pm2** service:

   .. code-block:: console

      $ sudo ln -s /usr/local/node-v12.13.0-linux-x64/lib/node_modules/pm2/bin/pm2 pm2

7. Verify that the installation was successful:

   .. code-block:: console

      $ pm2 list

   .. note:: This must be done as a **sqream** user, and not as a **sudo** user.

8. Verify that the node version is correct:

   .. code-block:: console

      $ node -v

Configuring the Network Time Protocol
-------------------------------------------
This section describes how to configure the **Network Time Protocol (NTP)**.

If you don't have internet access, see `Configure NTP Client to Synchronize with NTP Server `__.

**To configure your NTP:**

1. Install the NTP package:

   .. code-block:: console

      $ sudo yum install ntp

2. Enable the **ntpd** service:

   .. code-block:: console

      $ sudo systemctl enable ntpd

3. Start the **ntpd** service:

   .. code-block:: console

      $ sudo systemctl start ntpd

4. Print a list of peers known to the server and a summary of their states:

   .. code-block:: console

      $ sudo ntpq -p

Configuring the Network Time Protocol Server
--------------------------------------------
If your organization has an NTP server, you can configure it.

**To configure your NTP server:**

1. Append your NTP server address to the ``/etc/ntp.conf`` file:

   .. code-block:: console

      $ echo -e "\nserver <your NTP server address>\n" | sudo tee -a /etc/ntp.conf

2. Restart the service:

   .. code-block:: console

      $ sudo systemctl restart ntpd

3. Check that synchronization is enabled:

   .. code-block:: console

      $ sudo timedatectl

   Checking that synchronization is enabled generates the following output:

   .. code-block:: console

      Local time: Sat 2019-10-12 17:26:13 EDT
      Universal time: Sat 2019-10-12 21:26:13 UTC
      RTC time: Sat 2019-10-12 21:26:13
      Time zone: America/New_York (EDT, -0400)
      NTP enabled: yes
      NTP synchronized: yes
      RTC in local TZ: no
      DST active: yes
      Last DST change: DST began at
                       Sun 2019-03-10 01:59:59 EST
                       Sun 2019-03-10 03:00:00 EDT
      Next DST change: DST ends (the clock jumps one hour backwards) at
                       Sun 2019-11-03 01:59:59 EDT
                       Sun 2019-11-03 01:00:00 EST

Configuring the Server to Boot Without the UI
---------------------------------------------
You can configure your server to boot without a UI in cases when it is not required (recommended) by running the following command:

.. code-block:: console

   $ sudo systemctl set-default multi-user.target

Running this command activates the **NO-UI** server mode.

Configuring the Security Limits
--------------------------------
The security limits refer to the number of open files, processes, and so on.

You can configure the security limits by running the **echo -e** command as a root user logged-in shell:

.. code-block:: console

   $ sudo bash

.. code-block:: console

   $ echo -e "sqream soft nproc 1000000\nsqream hard nproc 1000000\nsqream soft nofile 1000000\nsqream hard nofile 1000000\nsqream soft core unlimited\nsqream hard core unlimited" >> /etc/security/limits.conf
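To confirm that the new limits are picked up, open a fresh shell as the **sqream** user and query them; the values should match those written to ``limits.conf`` above (illustrative check):

.. code-block:: console

   $ su - sqream -c "ulimit -c -u -n"
   core file size          (blocks, -c) unlimited
   max user processes              (-u) 1000000
   open files                      (-n) 1000000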
Configuring the Kernel Parameters
---------------------------------
**To configure the kernel parameters:**

1. Append the kernel parameters, one per line, to the **/etc/sysctl.conf** file:

   .. code-block:: console

      $ echo -e "vm.dirty_background_ratio = 5 \n vm.dirty_ratio = 10 \n vm.swappiness = 10 \n vm.vfs_cache_pressure = 200 \n vm.zone_reclaim_mode = 0 \n" >> /etc/sysctl.conf

   .. note:: In the past, the **vm.zone_reclaim_mode** parameter was set to **7**. In the latest SQream version, the **vm.zone_reclaim_mode** parameter must be set to **0**. If it is not set to **0**, when a NUMA node runs out of memory the system will get stuck and will be unable to pull memory from other NUMA nodes.

2. Check the maximum value of **fs.file-max**:

   .. code-block:: console

      $ sysctl -n fs.file-max

3. If the maximum value of **fs.file-max** is smaller than **2097152**, run the following command:

   .. code-block:: console

      $ echo "fs.file-max=2097152" >> /etc/sysctl.conf

   **IP4 forward** must be enabled for Docker and K8s installations only.

4. Run the following command:

   .. code-block:: console

      $ echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf

5. Reboot your system:

   .. code-block:: console

      $ sudo reboot

Configuring the Firewall
--------------------------------
The example in this section shows the open ports for four sqreamd sessions. If more than four are required, open the required ports as needed. Port 8080 in the example below is a new UI port.

**To configure the firewall:**

1. Start the **firewalld** service:

   .. code-block:: console

      $ systemctl start firewalld

2. Add the following ports to the permanent firewall:

   .. code-block:: console

      $ firewall-cmd --zone=public --permanent --add-port=8080/tcp
      $ firewall-cmd --zone=public --permanent --add-port=3105/tcp
      $ firewall-cmd --zone=public --permanent --add-port=3108/tcp
      $ firewall-cmd --zone=public --permanent --add-port=5000-5003/tcp
      $ firewall-cmd --zone=public --permanent --add-port=5100-5103/tcp
      $ firewall-cmd --permanent --list-all

3. Reload the firewall:

   .. code-block:: console

      $ firewall-cmd --reload

4. Enable **firewalld** on boot:

   .. code-block:: console

      $ systemctl enable firewalld

   If you do not need the firewall, you can disable it:

   .. code-block:: console

      $ sudo systemctl disable firewalld

Disabling SELinux
--------------------------------
**To disable SELinux:**

1. Show the status of **SELinux**:

   .. code-block:: console

      $ sudo sestatus

2. If the output is not **disabled**, edit the **/etc/selinux/config** file:

   .. code-block:: console

      $ sudo vim /etc/selinux/config

3. Change **SELINUX=enforcing** to **SELINUX=disabled**.

   The above change only takes effect after rebooting the server.

   You can disable SELinux immediately, without rebooting the server, by running the following command:

   .. code-block:: console

      $ sudo setenforce 0

Configuring the /etc/hosts File
--------------------------------
**To configure the /etc/hosts file:**

1. Edit the **/etc/hosts** file:

   .. code-block:: console

      $ sudo vim /etc/hosts

2. List your local host, followed by each cluster server:

   .. code-block:: console

      127.0.0.1 localhost
      <server ip> <server name>
      <server ip> <server name>
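For example, a two-node cluster might list each server's address and host name after the localhost entry (the names and addresses below are illustrative only):

.. code-block:: console

   127.0.0.1      localhost
   192.168.0.111  sqream-node1
   192.168.0.112  sqream-node2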
Configuring the DNS
--------------------------------
**To configure the DNS:**

1. Run the **ifconfig** command to check your NIC name, and edit that NIC's configuration file. In the following example, **eth0** is the NIC name:

   .. code-block:: console

      $ sudo vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Replace the DNS lines from the example above with your own DNS addresses:

   .. code-block:: console

      DNS1="4.4.4.4"
      DNS2="8.8.8.8"

Installing the Nvidia CUDA Driver
===================================================
After configuring your operating system, you must install the Nvidia CUDA driver.

.. warning:: If your UI runs on the server, the server must be stopped before installing the CUDA drivers.

CUDA Driver Prerequisites
--------------------------------
1. Verify that the NVIDIA card has been installed and is detected by the system:

   .. code-block:: console

      $ lspci | grep -i nvidia

2. Check which version of gcc has been installed:

   .. code-block:: console

      $ gcc --version

3. If gcc has not been installed, install it on one of the following operating systems:

   * On RHEL/CentOS:

     .. code-block:: console

        $ sudo yum install -y gcc

   * On Ubuntu:

     .. code-block:: console

        $ sudo apt-get install gcc

Updating the Kernel Headers
--------------------------------
**To update the kernel headers:**

1. Update the kernel headers on one of the following operating systems:

   * On RHEL/CentOS:

     .. code-block:: console

        $ sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)

   * On Ubuntu:

     .. code-block:: console

        $ sudo apt-get install linux-headers-$(uname -r)

2. Install **wget** on one of the following operating systems:

   * On RHEL/CentOS:

     .. code-block:: console

        $ sudo yum install wget

   * On Ubuntu:

     .. code-block:: console

        $ sudo apt-get install wget

Disabling Nouveau
--------------------------------
You can disable Nouveau, which is the default driver.

**To disable Nouveau:**

1. Check if the Nouveau driver has been loaded:

   .. code-block:: console

      $ lsmod | grep nouveau

   If the Nouveau driver has been loaded, the command above generates output.

2. Blacklist the Nouveau drivers to disable them:

   .. code-block:: console

      $ cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
      blacklist nouveau
      options nouveau modeset=0
      EOF

.. note:: Setting up the NVIDIA POWER9 CUDA driver includes additional set-up requirements. See `POWER9 Setup `__ for the additional set-up requirements.

To Tune Up NVIDIA Performance when the Driver Was Installed from the Runfile
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**To tune up NVIDIA performance when the driver was installed from the runfile:**

1. Change the permissions on the **rc.local** file to **executable**:

   .. code-block:: console

      $ sudo chmod +x /etc/rc.local

2. Edit the **/etc/rc.local** file:

   .. code-block:: console

      $ sudo vim /etc/rc.local

3. Add the following lines:

   * **For V100/A100**:

     .. code-block:: console

        $ nvidia-persistenced

   * **For IBM (mandatory)**:

     .. code-block:: console

        $ sudo systemctl start nvidia-persistenced
        $ sudo systemctl enable nvidia-persistenced

   * **For K80**:

     .. code-block:: console

        $ nvidia-persistenced
        $ nvidia-smi -pm 1
        $ nvidia-smi -acp 0
        $ nvidia-smi --auto-boost-permission=0
        $ nvidia-smi --auto-boost-default=0

4. Reboot the server and run the **NVIDIA System Management Interface (NVIDIA SMI)**:

   .. code-block:: console

      $ nvidia-smi

.. note:: Setting up the NVIDIA POWER9 CUDA driver includes additional set-up requirements. The NVIDIA POWER9 CUDA driver will not function properly if the additional set-up requirements are not followed. See `POWER9 Setup `__ for the additional set-up requirements.
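After the reboot, you can confirm that persistence mode is actually enabled on each GPU; ``nvidia-smi`` can query it directly (illustrative output for a V100):

.. code-block:: console

   $ nvidia-smi --query-gpu=name,persistence_mode --format=csv
   name, persistence_mode
   Tesla V100-SXM2-16GB, Enabled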
Disabling Automatic Bug Reporting Tools
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**To disable automatic bug reporting tools:**

1. Disable and stop the following **abrt** services:

   .. code-block:: console

      $ for i in abrt-ccpp.service abrtd.service abrt-oops.service abrt-pstoreoops.service abrt-vmcore.service abrt-xorg.service ; do sudo systemctl disable $i; sudo systemctl stop $i; done

2. Run the following checks:

   a. Check the OS release:

      .. code-block:: console

         $ cat /etc/os-release

   b. Verify that a SQream user exists and has the same ID on all cluster member servers:

      .. code-block:: console

         $ id sqream

   c. Verify that the storage is mounted:

      .. code-block:: console

         $ mount

   d. Verify that the driver has been installed correctly:

      .. code-block:: console

         $ nvidia-smi

   e. Check the maximum value of **fs.file-max**:

      .. code-block:: console

         $ sysctl -n fs.file-max

   f. Run the following command as a SQream user:

      .. code-block:: console

         $ ulimit -c -u -n

      The following shows the desired output:

      .. code-block:: console

         core file size (blocks, -c) unlimited
         max user processes (-u) 1000000
         open files (-n) 1000000

The server is now ready for the SQream software installation.

Enabling Core Dumps
===================================================
After installing the Nvidia CUDA driver, you can enable your core dumps. While SQream recommends enabling your core dumps, it is optional.

The **Enabling Core Dumps** section describes the following:

.. contents::
   :local:
   :depth: 1

Checking the abrtd Status
---------------------------------------------------
**To check the abrtd status:**

1. Check if **abrtd** is running:

   .. code-block:: console

      $ sudo ps -ef | grep abrt

2. If **abrtd** is running, stop it:

   .. code-block:: console

      $ sudo service abrtd stop
      $ sudo chkconfig abrt-ccpp off
      $ sudo chkconfig abrt-oops off
      $ sudo chkconfig abrt-vmcore off
      $ sudo chkconfig abrt-xorg off
      $ sudo chkconfig abrtd off

Setting the Limits
---------------------------------------------------
**To set the limits:**

1. Check the current core file size limit:

   .. code-block:: console

      $ ulimit -c

2. If the output is **0**, add the following lines to the **limits.conf** file (/etc/security):

   .. code-block:: console

      * soft core unlimited
      * hard core unlimited

3. Log out and log in to apply the limit changes.

Creating the Core Dumps Directory
---------------------------------------------------
**To create the core dumps directory:**

1. Make the **/tmp/core_dumps** directory:

   .. code-block:: console

      $ mkdir /tmp/core_dumps

2. Set the ownership of the **/tmp/core_dumps** directory:

   .. code-block:: console

      $ sudo chown sqream.sqream /tmp/core_dumps

3. Grant read, write, and execute permissions to all users:

   .. code-block:: console

      $ sudo chmod -R 777 /tmp/core_dumps

.. warning:: Because the core dump file may be the size of the total RAM on the server, verify that you have sufficient disk space. In the example above, the core dumps are configured to the */tmp/core_dumps* directory. You must replace the path according to your own environment and disk space.
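Since a single core file can be as large as the server's total RAM, a quick comparison of memory size against free space on the target file system is worthwhile before continuing; for example:

.. code-block:: console

   $ free -g | grep Mem          # total RAM in GiB
   $ df -h /tmp/core_dumps       # free space where the cores will be written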
Setting the Output Directory of the /etc/sysctl.conf File
-----------------------------------------------------------------
**To set the output directory of the /etc/sysctl.conf file:**

1. Edit the **/etc/sysctl.conf** file:

   .. code-block:: console

      $ sudo vim /etc/sysctl.conf

2. Add the following to the bottom of the file, replacing the core pattern path with your own core dumps directory:

   .. code-block:: console

      kernel.core_uses_pid = 1
      kernel.core_pattern = /tmp/core_dumps/core-%e-%s-%u-%g-%p-%t
      fs.suid_dumpable = 2

3. To apply the changes without rebooting the server, run the following:

   .. code-block:: console

      $ sudo sysctl -p

4. Check that the core output directory points to the following:

   .. code-block:: console

      $ sudo cat /proc/sys/kernel/core_pattern

   The following shows the correct generated output:

   .. code-block:: console

      /tmp/core_dumps/core-%e-%s-%u-%g-%p-%t

5. After SQream has been installed and is running, verify that core dumping works (see the next section):

   .. code-block:: console

      $ select abort_server();

Verifying that the Core Dumps Work
---------------------------------------------------
You can verify that the core dumps work only after installing and running SQream. Running the statement below causes the server to crash and a new core.xxx file to be written to the directory defined in **/etc/sysctl.conf**.

**To verify that the core dumps work:**

1. Stop and restart all SQream services.

2. Connect to SQream with ClientCmd and run the following statement:

   .. code-block:: console

      $ select abort_server();

Troubleshooting Core Dumping
---------------------------------------------------
This section describes the troubleshooting procedure to follow if all parameters have been configured correctly but the cores have not been created.

**To troubleshoot core dumping:**

1. Reboot the server.

2. Verify that you have folder permissions:

   .. code-block:: console

      $ sudo chmod -R 777 /tmp/core_dumps

3. Verify that the limits have been set correctly:

   .. code-block:: console

      $ ulimit -c

   If all parameters have been configured correctly, the correct output is:

   .. code-block:: console

      core file size (blocks, -c) unlimited
      open files (-n) 1000000

4. If all parameters have been configured correctly, but running **ulimit -c** outputs **0**, edit the **/etc/profile** file:

   .. code-block:: console

      $ sudo vim /etc/profile

5. Search for the following line and comment it out with a hash (#) symbol:

   .. code-block:: console

      ulimit -S -c 0 > /dev/null 2>&1

6. Log out and log in.

7. Run the **ulimit -c** command:

   .. code-block:: console

      $ ulimit -c

8. If the line is not found in the **/etc/profile** file, do the following:

   a. Edit the **/etc/init.d/functions** file:

      .. code-block:: console

         $ sudo vim /etc/init.d/functions

   b. Search for the following line:

      .. code-block:: console

         ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0} >/dev/null 2>&1

   c. If the line is found, comment it out with a hash (#) symbol and reboot the server.

diff --git a/installation_guides/running_sqream_in_a_docker_container.rst b/installation_guides/running_sqream_in_a_docker_container.rst
deleted file mode 100644
index 040223936..000000000
--- a/installation_guides/running_sqream_in_a_docker_container.rst
+++ /dev/null
@@ -1,1488 +0,0 @@
.. _running_sqream_in_a_docker_container:

****************************************************
Installing and Running SQream in a Docker Container
****************************************************
The **Installing and Running SQream in a Docker Container** page describes how to prepare your machine's environment for installing and running SQream in a Docker container.

This page describes the following:

.. contents::
   :local:
   :depth: 1

Setting Up a Host
====================================

Operating System Requirements
------------------------------------
SQream was tested and verified on the following versions of Linux:

 * x86 CentOS/RHEL 7.6 - 7.9
 * IBM RHEL 7.6

SQream recommends installing a clean OS on the host to avoid any installation issues.

.. warning:: Docker-based installation supports single-host deployment only and cannot be used on a multi-node cluster. If you install Docker on a single host, you will not be able to scale it to a multi-node cluster.

Creating a Local User
---------------------
To run SQream in a Docker container you must create a local user.

**To create a local user:**

1. Add a local user:

   .. code-block:: console

      $ useradd -m -U <local user name>

2. Set the local user's password:

   .. code-block:: console

      $ passwd <local user name>

3. Add the local user to the ``wheel`` group:

   .. code-block:: console

      $ usermod -aG wheel <local user name>

   You can remove the local user from the ``wheel`` group when you have completed the installation.

4. Log out and log back in as the local user.

Setting a Local Language
------------------------
After creating a local user you must set a local language.

**To set a local language:**

1. Set the local language:

   .. code-block:: console

      $ sudo localectl set-locale LANG=en_US.UTF-8

2. Set the time zone of the locale:

   .. code-block:: console

      $ sudo timedatectl set-timezone Asia/Jerusalem

You can run the ``timedatectl list-timezones`` command to list the available time zones.

Adding the EPEL Repository
--------------------------
After setting a local language you must add the EPEL repository.

**To add the EPEL repository:**

As a root user, install the **epel-release** repository package:

1. RedHat (RHEL 7):

   .. code-block:: console

      $ sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

2. CentOS 7:

   .. code-block:: console

      $ sudo yum install epel-release

Installing the Required NTP Packages
------------------------------------
After adding the EPEL repository, you must install the required NTP packages.

You can install the required NTP packages by running the following command:

.. code-block:: console

   $ sudo yum install ntp pciutils python36 kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc

Installing the Recommended Tools
--------------------------------
After installing the required NTP packages you must install the recommended tools.

SQream recommends installing the following tools:

.. code-block:: console

   $ sudo yum install bash-completion.noarch vim-enhanced.x86_64 vim-common.x86_64 net-tools iotop htop psmisc screen xfsprogs wget yum-utils deltarpm dos2unix

Updating to the Current Version of the Operating System
--------------------------------------------------------
After installing the recommended tools you must update to the current version of the operating system. SQream recommends updating to the current version of the operating system; note, however, that this is not recommended if the Nvidia driver has **not been installed**.
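On CentOS/RHEL 7, the update itself is a single command (shown here for reference; review the note above about the Nvidia driver before running it):

.. code-block:: console

   $ sudo yum update -y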
Configuring the NTP Package
---------------------------
After updating to the current version of the operating system you must configure the NTP package.

**To configure the NTP package:**

1. Add your local servers to the NTP configuration.

2. Configure the **ntpd** service to begin running when your machine is started:

   .. code-block:: console

      $ sudo systemctl enable ntpd
      $ sudo systemctl start ntpd
      $ sudo ntpq -p

Configuring the Performance Profile
-----------------------------------
After configuring the NTP package you must configure the performance profile.

**To configure the performance profile:**

1. Switch the active profile:

   .. code-block:: console

      $ sudo tuned-adm profile throughput-performance

2. Change the default run level to multi-user:

   .. code-block:: console

      $ sudo systemctl set-default multi-user.target

Configuring Your Security Limits
--------------------------------
After configuring the performance profile you must configure your security limits. Configuring your security limits refers to configuring the number of open files, processes, and so on.

**To configure your security limits:**

1. Run the **bash** shell as a super-user:

   .. code-block:: console

      $ sudo bash

2. Run the following command:

   .. code-block:: console

      $ echo -e "sqream soft nproc 500000\nsqream hard nproc 500000\nsqream soft nofile 500000\nsqream hard nofile 500000\nsqream soft core unlimited\nsqream hard core unlimited" >> /etc/security/limits.conf

3. Run the following command:

   .. code-block:: console

      $ echo -e "vm.dirty_background_ratio = 5 \n vm.dirty_ratio = 10 \n vm.swappiness = 10 \n vm.zone_reclaim_mode = 0 \n vm.vfs_cache_pressure = 200 \n" >> /etc/sysctl.conf

Disabling Automatic Bug-Reporting Tools
---------------------------------------
After configuring your security limits you must disable the following automatic bug-reporting tools:

* ccpp.service
* oops.service
* pstoreoops.service
* vmcore.service
* xorg.service

You can disable the above bug-reporting tools by running the following command:

.. code-block:: console

   $ for i in abrt-ccpp.service abrtd.service abrt-oops.service abrt-pstoreoops.service abrt-vmcore.service abrt-xorg.service ; do sudo systemctl disable $i; sudo systemctl stop $i; done

Installing the Nvidia CUDA Driver
-------------------------------------

1. Verify that the Tesla NVIDIA card has been installed and is detected by the system:

   .. code-block:: console

      $ lspci | grep -i nvidia

   The correct output is a list of Nvidia graphic cards. If you do not receive this output, verify that an NVIDIA GPU card has been installed.

#. Verify that the open-source upstream Nvidia driver is not running:

   .. code-block:: console

      $ lsmod | grep nouveau

   No output should be generated.

#. If you receive any output, do the following:

   1. Disable the open-source upstream Nvidia driver:

      .. code-block:: console

         $ sudo bash
         $ echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nouveau.conf
         $ echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf
         $ dracut --force
         $ modprobe --showconfig | grep nouveau

   2. Reboot the server and verify that the Nouveau module has not been loaded:

      .. code-block:: console

         $ lsmod | grep nouveau

#. Check if the Nvidia CUDA driver has already been installed:

   .. code-block:: console

      $ nvidia-smi

   The following is an example of the correct output:
   .. code-block:: console

      nvidia-smi
      Wed Oct 30 14:05:42 2019
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
      | N/A   32C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+
      |   1  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
      | N/A   33C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+

      +-----------------------------------------------------------------------------+
      | Processes:                                                       GPU Memory |
      |  GPU       PID   Type   Process name                             Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+

#. Verify that the installed CUDA version shown in the output above is ``10.1``.

#. Do one of the following:

   1. If CUDA version 10.1 has already been installed, skip to **Installing the Docker Engine (Community Edition)**.

   2. If CUDA version 10.1 has not been installed yet, continue with the next step.

#. Do one of the following:

   * Install :ref:`CUDA Driver version 10.1 for x86_64 <CUDA_10.1_x8664>`.

   * Install :ref:`CUDA driver version 10.1 for IBM Power9 <CUDA_10.1_IBMPower9>`.

.. _CUDA_10.1_x8664:

Installing the CUDA Driver Version 10.1 for x86_64
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**To install the CUDA driver version 10.1 for x86_64:**

1. Make the following target platform selections:

   * **Operating system**: Linux
   * **Architecture**: x86_64
   * **Distribution**: CentOS
   * **Version**: 7
   * **Installer type**: the relevant installer type

   For the installer type, SQream recommends selecting **runfile (local)**. The available selections show only the supported platforms.

2. Download the base installer for Linux CentOS 7 x86_64:

   .. code-block:: console

      $ wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.x86_64.rpm

3. Install the base installer for Linux CentOS 7 x86_64 by running the following commands:

   .. code-block:: console

      $ sudo yum localinstall cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.x86_64.rpm
      $ sudo yum clean all
      $ sudo yum install nvidia-driver-latest-dkms

   .. warning:: Verify that the output indicates that driver **418.87** will be installed.

4. Follow the command line prompts.

5. Enable the Nvidia service to start at boot and start it:

   .. code-block:: console

      $ sudo systemctl enable nvidia-persistenced.service && sudo systemctl start nvidia-persistenced.service

6. Create a symbolic link from the **/etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service** file to the **/usr/lib/systemd/system/nvidia-persistenced.service** file.

7. Reboot the server.

8. Verify that the Nvidia driver has been installed and shows all available GPUs:
   .. code-block:: console

      $ nvidia-smi

   The following is the correct output:

   .. code-block:: console

      nvidia-smi
      Wed Oct 30 14:05:42 2019
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
      | N/A   32C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+
      |   1  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
      | N/A   33C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+

      +-----------------------------------------------------------------------------+
      | Processes:                                                       GPU Memory |
      |  GPU       PID   Type   Process name                             Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+

.. _CUDA_10.1_IBMPower9:

Installing the CUDA Driver Version 10.1 for IBM Power9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**To install the CUDA driver version 10.1 for IBM Power9:**

1. Download the base installer for Linux CentOS 7 PPC64le:

   .. code-block:: console

      $ wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.ppc64le.rpm

#. Install the base installer for Linux CentOS 7 PPC64le by running the following commands:

   .. code-block:: console

      $ sudo rpm -i cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.ppc64le.rpm
      $ sudo yum clean all
      $ sudo yum install nvidia-driver-latest-dkms

   .. warning:: Verify that the output indicates that driver **418.87** will be installed.

#. Copy the **40-redhat.rules** file to the **/etc/udev/rules.d** directory:

   .. code-block:: console

      $ sudo cp /lib/udev/rules.d/40-redhat.rules /etc/udev/rules.d

#. If you are using an RHEL 7 version (7.6 or later), comment out, remove, or change the hot-pluggable memory rule located in the copied file by running the following command:

   .. code-block:: console

      $ sudo sed -i 's/SUBSYSTEM!="memory",.*GOTO="memory_hotplug_end"/SUBSYSTEM=="*", GOTO="memory_hotplug_end"/' /etc/udev/rules.d/40-redhat.rules

#. Enable the **nvidia-persistenced.service** file:

   .. code-block:: console

      $ sudo systemctl enable nvidia-persistenced.service

#. Create a symbolic link from the **/etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service** file to the **/usr/lib/systemd/system/nvidia-persistenced.service** file.

#. Reboot your system to initialize the above modifications.

#. Verify that the Nvidia driver and the **nvidia-persistenced.service** file are running:

   .. code-block:: console

      $ nvidia-smi

   The following is the correct output:
   .. code-block:: console

      nvidia-smi
      Wed Oct 30 14:05:42 2019
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
      | N/A   32C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+
      |   1  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
      | N/A   33C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
      +-------------------------------+----------------------+----------------------+

      +-----------------------------------------------------------------------------+
      | Processes:                                                       GPU Memory |
      |  GPU       PID   Type   Process name                             Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+

#. Verify that the **nvidia-persistenced** service is running:

   .. code-block:: console

      $ systemctl status nvidia-persistenced

   The following is the correct output:

   .. code-block:: console

      [root@gpudb ~]# systemctl status nvidia-persistenced
      nvidia-persistenced.service - NVIDIA Persistence Daemon
         Loaded: loaded (/usr/lib/systemd/system/nvidia-persistenced.service; enabled; vendor preset: disabled)
         Active: active (running) since Tue 2019-10-15 21:43:19 KST; 11min ago
        Process: 8257 ExecStart=/usr/bin/nvidia-persistenced --verbose (code=exited, status=0/SUCCESS)
       Main PID: 8265 (nvidia-persiste)
          Tasks: 1
         Memory: 21.0M
         CGroup: /system.slice/nvidia-persistenced.service
                 └─8265 /usr/bin/nvidia-persistenced --verbose

Installing the Docker Engine (Community Edition)
================================================
After installing the Nvidia CUDA driver you must install the Docker engine.

This section describes how to install the Docker engine on the following processors:

* :ref:`x86_64 processor on CentOS <dockerx8664centos>`
* :ref:`x86_64 processor on Ubuntu <dockerx8664ubuntu>`
* :ref:`IBM Power9 (PPC64le) processor <docker_ibmpower9>`

.. _dockerx8664centos:

Installing the Docker Engine Using an x86_64 Processor on CentOS
----------------------------------------------------------------
The x86_64 processor supports installing the **Docker Community Edition (CE)** versions 18.03 and higher.

For more information on installing the Docker Engine CE on an x86_64 processor, see `Install Docker Engine on CentOS `_.

.. _dockerx8664ubuntu:

Installing the Docker Engine Using an x86_64 Processor on Ubuntu
----------------------------------------------------------------

The x86_64 processor supports installing the **Docker Community Edition (CE)** versions 18.03 and higher.

For more information on installing the Docker Engine CE on an x86_64 processor, see `Install Docker Engine on Ubuntu `_.

.. _docker_ibmpower9:

Installing the Docker Engine on an IBM Power9 Processor
-------------------------------------------------------
The IBM Power9 processor supports installing the **Docker Community Edition (CE)** version 18.03 only.

**To install the Docker Engine on an IBM Power9 processor:**

You can install the Docker Engine on an IBM Power9 processor by running the following commands:

.. code-block:: console

   $ wget http://ftp.unicamp.br/pub/ppc64el/rhel/7_1/docker-ppc64el/container-selinux-2.9-4.el7.noarch.rpm
   $ wget http://ftp.unicamp.br/pub/ppc64el/rhel/7_1/docker-ppc64el/docker-ce-18.03.1.ce-1.el7.centos.ppc64le.rpm
   $ yum install -y container-selinux-2.9-4.el7.noarch.rpm docker-ce-18.03.1.ce-1.el7.centos.ppc64le.rpm
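Once the packages are installed, you can confirm that the expected 18.03 engine is present (illustrative output):

.. code-block:: console

   $ docker --version
   Docker version 18.03.1-ce, build 9ee9f40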
For more information on installing the Docker Engine CE on an IBM Power9 processor, see `Install Docker Engine on Ubuntu `_.

Docker Post-Installation
=================================
After installing the Docker engine you must configure Docker on your local machine.

**To configure Docker on your local machine:**

1. Enable Docker to start on boot:

   .. code-block:: console

      $ sudo systemctl enable docker && sudo systemctl start docker

2. Enable managing Docker as a non-root user:

   .. code-block:: console

      $ sudo usermod -aG docker $USER

3. Log out and log back in via SSH. This causes Docker to re-evaluate your group membership.

4. Verify that you can run the following Docker command as a non-root user (without ``sudo``):

   .. code-block:: console

      $ docker run hello-world

If you can run the above Docker command as a non-root user, the following occurs:

* Docker downloads a test image and runs it in a container.
* When the container runs, it prints an informational message and exits.

For more information on the Docker post-installation steps, see `Docker Post-Installation `_.

Installing the Nvidia Docker2 ToolKit
==========================================
After configuring Docker on your local machine you must install the Nvidia Docker2 ToolKit. The NVIDIA Docker2 Toolkit lets you build and run GPU-accelerated Docker containers. The Toolkit includes a container runtime library and related utilities for automatically configuring containers to leverage NVIDIA GPUs.

This section describes the following:

* :ref:`Installing the NVIDIA Docker2 Toolkit on an x86_64 processor <install_nvidia_docker2_toolkit_x8664_processor>`
* :ref:`Installing the NVIDIA Docker2 Toolkit on a PPC64le processor <install_nvidia_docker2_toolkit_ppc64le_processor>`

.. _install_nvidia_docker2_toolkit_x8664_processor:

Installing the NVIDIA Docker2 Toolkit on an x86_64 Processor
------------------------------------------------------------

This section describes the following:

* :ref:`Installing the NVIDIA Docker2 Toolkit on a CentOS operating system <install_nvidia_docker2_toolkit_centos>`

* :ref:`Installing the NVIDIA Docker2 Toolkit on an Ubuntu operating system <install_nvidia_docker2_toolkit_ubuntu>`

.. _install_nvidia_docker2_toolkit_centos:

Installing the NVIDIA Docker2 Toolkit on a CentOS Operating System
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**To install the NVIDIA Docker2 Toolkit on a CentOS operating system:**

1. Install the repository for your distribution:

   .. code-block:: console

      $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
      $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
          sudo tee /etc/yum.repos.d/nvidia-docker.repo

2. Install the ``nvidia-docker2`` package and reload the Docker daemon configuration:

   .. code-block:: console

      $ sudo yum install nvidia-docker2
      $ sudo pkill -SIGHUP dockerd

3. Do one of the following:

   * If you received an error when installing the ``nvidia-docker2`` package, skip to :ref:`Step 4 <step_4_centos>`.
   * If you successfully installed the ``nvidia-docker2`` package, skip to :ref:`Step 5 <step_5_centos>`.

.. _step_4_centos:

4. Do the following:
   1. If the following error is displayed when installing the ``nvidia-docker2`` package, run the ``sudo vi /etc/yum.repos.d/nvidia-docker.repo`` command:

      .. code-block:: console

         https://nvidia.github.io/nvidia-docker/centos7/ppc64le/repodata/repomd.xml:
         [Errno -1] repomd.xml signature could not be verified for nvidia-docker

   2. Change ``repo_gpgcheck=1`` to ``repo_gpgcheck=0``.

.. _step_5_centos:

5. Verify that the NVIDIA Docker runtime has been installed correctly:

   .. code-block:: console

      $ docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi

For more information on installing the NVIDIA Docker2 Toolkit on a CentOS operating system, see :ref:`Installing the NVIDIA Docker2 Toolkit on a CentOS operating system <install_nvidia_docker2_toolkit_centos>`.

.. _install_nvidia_docker2_toolkit_ubuntu:

Installing the NVIDIA Docker2 Toolkit on an Ubuntu Operating System
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**To install the NVIDIA Docker2 Toolkit on an Ubuntu operating system:**

1. Install the repository for your distribution:

   .. code-block:: console

      $ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
          sudo apt-key add -
      $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
      $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
          sudo tee /etc/apt/sources.list.d/nvidia-docker.list
      $ sudo apt-get update

2. Install the ``nvidia-docker2`` package and reload the Docker daemon configuration:

   .. code-block:: console

      $ sudo apt-get install nvidia-docker2
      $ sudo pkill -SIGHUP dockerd

3. Do one of the following:

   * If you received an error when installing the ``nvidia-docker2`` package, skip to :ref:`Step 4 <step_4_ubuntu>`.
   * If you successfully installed the ``nvidia-docker2`` package, skip to :ref:`Step 5 <step_5_ubuntu>`.

.. _step_4_ubuntu:

4. Do the following:

   1. If the following error is displayed when installing the ``nvidia-docker2`` package, run the ``sudo vi /etc/yum.repos.d/nvidia-docker.repo`` command:

      .. code-block:: console

         https://nvidia.github.io/nvidia-docker/centos7/ppc64le/repodata/repomd.xml:
         [Errno -1] repomd.xml signature could not be verified for nvidia-docker

   2. Change ``repo_gpgcheck=1`` to ``repo_gpgcheck=0``.

.. _step_5_ubuntu:

5. Verify that the NVIDIA Docker runtime has been installed correctly:

   .. code-block:: console

      $ docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi

For more information on installing the NVIDIA Docker2 Toolkit on an Ubuntu operating system, see :ref:`Installing the NVIDIA Docker2 Toolkit on an Ubuntu operating system <install_nvidia_docker2_toolkit_ubuntu>`.

.. _install_nvidia_docker2_toolkit_ppc64le_processor:

Installing the NVIDIA Docker2 Toolkit on a PPC64le Processor
------------------------------------------------------------

This section describes how to install the NVIDIA Docker2 Toolkit on an IBM RHEL operating system.

**To install the NVIDIA Docker2 Toolkit on an IBM RHEL operating system:**

1. Import the repository and install the ``libnvidia-container`` and ``nvidia-container-runtime`` packages:

   .. code-block:: console

      $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
      $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
          sudo tee /etc/yum.repos.d/nvidia-docker.repo
      $ sudo yum install -y libnvidia-container*

2. Do one of the following:

   * If you received an error when installing the packages, skip to :ref:`Step 3 <step_3_installing_nvidia_docker2_toolkit_ppc64le_processor>`.
   * If you successfully installed the packages, skip to :ref:`Step 4 <step_4_installing_nvidia_docker2_toolkit_ppc64le_processor>`.
.. _step_3_installing_nvidia_docker2_toolkit_ppc64le_processor:

3. Do the following:

   1. If the following error is displayed when installing the packages, run the ``sudo vi /etc/yum.repos.d/nvidia-docker.repo`` command:

      .. code-block:: console

         https://nvidia.github.io/nvidia-docker/centos7/ppc64le/repodata/repomd.xml:
         [Errno -1] repomd.xml signature could not be verified for nvidia-docker

   2. Change ``repo_gpgcheck=1`` to ``repo_gpgcheck=0``.

   3. Install the ``libnvidia-container`` package:

      .. code-block:: console

         $ sudo yum install -y libnvidia-container*

.. _step_4_installing_nvidia_docker2_toolkit_ppc64le_processor:

4. Install the ``nvidia-container-runtime`` package:

   .. code-block:: console

      $ sudo yum install -y nvidia-container-runtime*

5. Add the ``nvidia`` runtime to the Docker daemon by creating an override file:

   .. code-block:: console

      $ sudo mkdir -p /etc/systemd/system/docker.service.d/
      $ sudo vi /etc/systemd/system/docker.service.d/override.conf

      [Service]
      ExecStart=
      ExecStart=/usr/bin/dockerd

6. Restart Docker:

   .. code-block:: console

      $ sudo systemctl daemon-reload
      $ sudo systemctl restart docker

7. Verify that the NVIDIA Docker runtime has been installed correctly:

   .. code-block:: console

      $ docker run --runtime=nvidia --rm nvidia/cuda-ppc64le nvidia-smi

.. _accessing_hadoop_kubernetes_configuration_files:

Accessing the Hadoop and Kubernetes Configuration Files
-------------------------------------------------------
The information in this section is optional and is only relevant for Hadoop users. If you require Hadoop and Kubernetes (Krb5) connectivity, contact your IT department for access to the following configuration files:

* Hadoop configuration files:

  * core-site.xml
  * hdfs-site.xml

* Kubernetes files:

  * Configuration file - krb.conf
  * Kubernetes Hadoop client certificate - hdfs.keytab

Once you have the above files, you must copy them into the correct folders in your working directory.

For more information about the correct directory to copy the above files into, see the :ref:`Installing the SQream Software <installing_sqream_software>` section below.

For related information, see the following sections:

* :ref:`Configuring the Hadoop and Kubernetes Configuration Files <configure_hadoop_kubernetes_configuration_files>`
* :ref:`Setting the Hadoop and Kubernetes Configuration Parameters <setting_hadoop_kubernetes_connectivity_parameters>`

.. _installing_sqream_software:

Installing the SQream Software
==============================

Preparing Your Local Environment
--------------------------------
After installing the Nvidia Docker2 ToolKit you must prepare your local environment.

.. note:: You must install the SQream software as a *sqream* user, not as a *root* user.

The Linux user preparing the local environment must have **read/write** access to the following directories so that the SQream software can correctly read and write the required resources:

* **Log directory** - default: /var/log/sqream/
* **Configuration directory** - default: /etc/sqream/
* **Cluster directory** - the location where SQream writes its DB system, such as */mnt/sqreamdb*
* **Ingest directory** - the location where the required data is loaded, such as */mnt/data_source/*

.. _download_sqream_software:

Deploying the SQream Software
-----------------------------
After preparing your local environment you must deploy the SQream software. Deploying the SQream software requires you to access and extract the required files and to place them in the correct directory.

**To deploy the SQream software:**
1. Contact the SQream Support team for access to the **sqream_installer-nnn-DBnnn-COnnn-EDnnn-<arch>.tar.gz** file.

   The file name includes the following parameter values:

   * **sqream_installer-nnn** - sqream installer version
   * **DBnnn** - SQreamDB version
   * **COnnn** - SQream console version
   * **EDnnn** - SQream editor version
   * **arch** - server architecture (applicable to x86_64 and ppc64le)

2. Extract the tarball file:

   .. code-block:: console

      $ tar -xvf sqream_installer-1.1.5-DB2019.2.1-CO1.5.4-ED3.0.0-x86_64.tar.gz

   When the tarball file has been extracted, a new folder is created. The new folder is automatically given the name of the tarball file:

   .. code-block:: console

      drwxrwxr-x 9 sqream sqream       4096 Aug 11 11:51 sqream_installer-1.1.5-DB2019.2.1-CO1.5.4-ED3.0.0-x86_64/
      -rw-rw-r-- 1 sqream sqream 3130398797 Aug 11 11:20 sqream_installer-1.1.5-DB2019.2.1-CO1.5.4-ED3.0.0-x86_64.tar.gz

3. Change the directory to the new folder that was created in the previous step.

4. Verify that the new folder contains all of the required files:

   .. code-block:: console

      $ ls -la

   The following is an example of the files included in the new folder:

   .. code-block:: console

      drwxrwxr-x. 10 sqream sqream  198 Jun  3 17:57 .
      drwx------. 25 sqream sqream 4096 Jun  7 18:11 ..
      drwxrwxr-x.  2 sqream sqream  226 Jun  7 18:09 .docker
      drwxrwxr-x.  2 sqream sqream   64 Jun  3 12:55 .hadoop
      drwxrwxr-x.  2 sqream sqream 4096 May 31 14:18 .install
      drwxrwxr-x.  2 sqream sqream   39 Jun  3 12:53 .krb5
      drwxrwxr-x.  2 sqream sqream   22 May 31 14:18 license
      drwxrwxr-x.  2 sqream sqream   82 May 31 14:18 .sqream
      -rwxrwxr-x.  1 sqream sqream 1712 May 31 14:18 sqream-console
      -rwxrwxr-x.  1 sqream sqream 4608 May 31 14:18 sqream-install

For information relevant to Hadoop users, see the following sections:

* :ref:`Accessing the Hadoop and Kubernetes Configuration Files <accessing_hadoop_kubernetes_configuration_files>`
* :ref:`Configuring the Hadoop and Kubernetes Configuration Files <configure_hadoop_kubernetes_configuration_files>`
* :ref:`Setting the Hadoop and Kubernetes Configuration Parameters <setting_hadoop_kubernetes_connectivity_parameters>`

.. _configure_hadoop_kubernetes_configuration_files:

Configuring the Hadoop and Kubernetes Configuration Files
---------------------------------------------------------
The information in this section is optional and is only relevant for Hadoop users. If you require Hadoop and Kubernetes (Krb5) connectivity, you must copy the Hadoop and Kubernetes files into the correct folders in your working directory, as shown below:

* .hadoop/core-site.xml
* .hadoop/hdfs-site.xml
* .krb5/krb5.conf
* .krb5/hdfs.keytab

For related information, see the following sections:

* :ref:`Accessing the Hadoop and Kubernetes Configuration Files <accessing_hadoop_kubernetes_configuration_files>`
* :ref:`Setting the Hadoop and Kubernetes Configuration Parameters <setting_hadoop_kubernetes_connectivity_parameters>`

Configuring the SQream Software
-------------------------------
After deploying the SQream software, and optionally configuring the Hadoop and Kubernetes configuration files, you must configure the SQream software.

Configuring the SQream software requires you to do the following:

* Configure your local environment
* Understand the ``sqream-install`` flags
* Install your SQream license
* Validate your SQream license
* Change your data ingest folder

Configuring Your Local Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once you've downloaded the SQream software, you can begin configuring your local environment. The following commands must be run (as **sudo**) from the same directory in which you placed your packages.

For example, you may have saved your packages in **/home/sqream/sqream-console-package/**.
The following table shows the flags that you can use to configure your local directory:

.. list-table::
   :widths: 10 50 40
   :header-rows: 1

   * - Flag
     - Function
     - Note
   * - **-i**
     - Loads all software from the hidden folder **.docker**.
     - Mandatory
   * - **-k**
     - Loads all license packages from the **/license** directory.
     - Mandatory
   * - **-f**
     - Overwrites existing folders. **Note:** Using ``-f`` overwrites **all files** located in mounted directories.
     - Mandatory
   * - **-c**
     - Defines the origin path for writing/reading SQream configuration files. The default location is ``/etc/sqream/``.
     - If you are installing the Docker version on a server that already works with SQream, do not use the default path.
   * - **-v**
     - The SQream cluster location. If a cluster does not exist yet, ``-v`` creates one. If a cluster already exists, ``-v`` mounts it.
     - Mandatory
   * - **-l**
     - SQream system startup logs location, including startup logs and Docker logs. The default location is ``/var/log/sqream/``.
     -
   * - **-d**
     - The directory containing customer data to be imported and/or copied to SQream.
     -
   * - **-s**
     - Shows system settings.
     -
   * - **-r**
     - Resets the system configuration. This flag is run without any other variables.
     - Mandatory
   * - **-h**
     - Help. Shows the available flags.
     - Mandatory
   * - **-K**
     - Runs license validation.
     -
   * - **-e**
     - Used for inserting your Krb5 server DNS name. For more information on setting your Kerberos configuration parameters, see :ref:`Setting the Hadoop and Kubernetes Configuration Parameters <setting_hadoop_kubernetes_connectivity_parameters>`.
     -
   * - **-p**
     - Used for inserting your Kerberos user name. For more information on setting your Kerberos configuration parameters, see :ref:`Setting the Hadoop and Kubernetes Configuration Parameters <setting_hadoop_kubernetes_connectivity_parameters>`.
     -

Installing Your License
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once you've configured your local environment, you must install your license by copying it into the SQream installation package folder located in the **./license** folder:

.. code-block:: console

   $ sudo ./sqream-install -k

You do not need to extract this folder after uploading it into the **./license** folder.

Validating Your License
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can validate the license package that you copied into the **/license** folder by running the following command:

.. code-block:: console

   $ sudo ./sqream-install -K

The following mandatory flags must be used in the first run:

.. code-block:: console

   $ sudo ./sqream-install -i -k -v <cluster location>

The following is an example of the correct command syntax:

.. code-block:: console

   $ sudo ./sqream-install -i -k -c /etc/sqream -v /home/sqream/sqreamdb -l /var/log/sqream -d /home/sqream/data_ingest

.. _setting_hadoop_kubernetes_connectivity_parameters:

Setting the Hadoop and Kubernetes Connectivity Parameters
----------------------------------------------------------
The information in this section is optional and is only relevant for Hadoop users. If you require Hadoop and Kubernetes (Krb5) connectivity, you must set their connectivity parameters.

The following is the correct syntax when setting the Hadoop and Kubernetes connectivity parameters:

.. code-block:: console

   $ sudo ./sqream-install -p <Kerberos user name> -e <Krb5 server DNS name>:<IP address>

The following is an example of setting the Hadoop and Kubernetes connectivity parameters:

.. code-block:: console

   $ sudo ./sqream-install -p <Kerberos user name> -e kdc.sq.com:<192.168.1.111>
For related information, see the following sections:

* :ref:`Accessing the Hadoop and Kubernetes Configuration Files <accessing_hadoop_kubernetes_configuration_files>`
* :ref:`Configuring the Hadoop and Kubernetes Configuration Files <configure_hadoop_kubernetes_configuration_files>`

Modifying Your Data Ingest Folder
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once you've validated your license, you can modify your data ingest folder after the first run by running the following command:

.. code-block:: console

   $ sudo ./sqream-install -d /home/sqream/data_in

Configuring Your Network for Docker
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once you've modified your data ingest folder (if needed), you must validate that the server network and the Docker network that you are setting up do not overlap.

**To configure your network for Docker:**

1. Verify that your server network and Docker network do not overlap:

   .. code-block:: console

      $ ifconfig | grep 172.

2. Do one of the following:

   * If running the above command outputs no results, continue the installation process.
   * If running the above command outputs results, run the following command:

     .. code-block:: console

        $ ifconfig | grep 192.168.

Checking and Verifying Your System Settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once you've configured your network for Docker, you can check and verify your system settings.

Running the following command shows you all the variables used by your SQream system:

.. code-block:: console

   $ ./sqream-install -s

The following is an example of the correct output:

.. code-block:: console

   SQREAM_CONSOLE_TAG=1.5.4
   SQREAM_TAG=2019.2.1
   SQREAM_EDITOR_TAG=3.0.0
   license_worker_0=f0:cc:
   license_worker_1=26:91:
   license_worker_2=20:26:
   license_worker_3=00:36:
   SQREAM_VOLUME=/media/sqreamdb
   SQREAM_DATA_INGEST=/media/sqreamdb/data_in
   SQREAM_CONFIG_DIR=/etc/sqream/
   LICENSE_VALID=true
   SQREAM_LOG_DIR=/var/log/sqream/
   SQREAM_USER=sqream
   SQREAM_HOME=/home/sqream
   SQREAM_ENV_PATH=/home/sqream/.sqream/env_file
   PROCESSOR=x86_64
   METADATA_PORT=3105
   PICKER_PORT=3108
   NUM_OF_GPUS=2
   CUDA_VERSION=10.1
   NVIDIA_SMI_PATH=/usr/bin/nvidia-smi
   DOCKER_PATH=/usr/bin/docker
   NVIDIA_DRIVER=418
   SQREAM_MODE=single_host

Using the SQream Console
=========================
After configuring the SQream software and verifying your system settings you can begin using the SQream console.

SQream Console - Basic Commands
---------------------------------
The SQream console offers the following basic commands:

* :ref:`Starting your SQream console <starting_sqream_console>`
* :ref:`Starting the SQream Master <starting_metadata_and_picker>`
* :ref:`Starting SQream Workers <starting_running_services>`
* :ref:`Listing the running services <listing_running_services>`
* :ref:`Stopping the running services <stopping_running_services>`
* :ref:`Using SQream Studio <using_sqream_editor>`
* :ref:`Using the SQream Client <using_sqream_client>`

.. _starting_sqream_console:

Starting Your SQream Console
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can start your SQream console by running the following command:

.. code-block:: console

   $ ./sqream-console
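Putting the basic commands together, a first session on a two-GPU host might look like the following sequence (ports and GPU allocations will vary with your configuration):

.. code-block:: console

   sqream-console> sqream master --start
   sqream-console> sqream worker --start 2
   sqream-console> sqream master --list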
-   .. code-block:: console
-
-      sqream-console> sqream master --start
-      starting master server in single_host mode ...
-      sqream_single_host_master is up and listening on ports: 3105,3108
-
-
-2. *Optional* - Change the metadata server and picker ports by adding ``-p <port>`` and ``-m <port>``:
-
-   .. code-block:: console
-
-      sqream-console> sqream master --start -p 4105 -m 4108
-      starting master server in single_host mode ...
-      sqream_single_host_master is up and listening on ports: 4105,4108
-
-
-
-.. _starting_running_services:
-
-Starting SQream Workers
-~~~~~~~~~~~~~~~~~~~~~~~
-
-
-When starting SQream workers, the ``<number_of_workers>`` value sets how many workers to start. Leaving the value unspecified uses all of the available resources.
-
-
-.. code-block:: console
-
-   $ sqream worker --start <number_of_workers>
-
-The following is an example of the expected output when setting the value to ``2``:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --start 2
-   started sqream_single_host_worker_0 on port 5000, allocated gpu: 0
-   started sqream_single_host_worker_1 on port 5001, allocated gpu: 1
-
-
-.. _listing_running_services:
-
-Listing the Running Services
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You can list the running SQream services to look up container names and IDs by running the following command:
-
-.. code-block:: console
-
-   $ sqream master --list
-
-The following is an example of the expected output:
-
-.. code-block:: console
-
-   sqream-console> sqream master --list
-   container name: sqream_single_host_worker_0, container id: c919e8fb78c8
-   container name: sqream_single_host_master, container id: ea7eef80e038
-
-
-.. _stopping_running_services:
-
-Stopping the Running Services
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You can stop running services either for a single SQream worker, or all SQream services for both master and workers.
-
-The following is the command for stopping a running service for a single SQream worker:
-
-.. code-block:: console
-
-   $ sqream worker --stop <worker_name>
-
-The following is an example of the expected output when stopping a running service for a single SQream worker:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --stop sqream_single_host_worker_0
-   stopped container sqream_single_host_worker_0, id: 892a8f1a58c5
-
-
-You can stop all running SQream services (both master and workers) by running the following command:
-
-.. code-block:: console
-
-   $ sqream master --stop --all
-
-The following is an example of the expected output when stopping all running services:
-
-.. code-block:: console
-
-   sqream-console> sqream master --stop --all
-   stopped container sqream_single_host_worker_0, id: 892a8f1a58c5
-   stopped container sqream_single_host_master, id: 55cb7e38eb22
-
-
-.. _using_sqream_editor:
-
-Using SQream Studio
-~~~~~~~~~~~~~~~~~~~
-SQream Studio is an SQL statement editor.
-
-**To start SQream Studio:**
-
-1. Run the following command:
-
-   .. code-block:: console
-
-      $ sqream studio --start
-
-   The following is an example of the expected output:
-
-   .. code-block:: console
-
-      SQream Acceleration Studio is available at http://192.168.1.62:8080
-
-2. Click the ``http://192.168.1.62:8080`` link shown in the CLI.
-
-
-**To stop SQream Studio:**
-
-You can stop SQream Studio by running the following command:
-
-.. code-block:: console
-
-   $ sqream studio --stop
-
-The following is an example of the expected output:
-
-.. code-block:: console
-
-   sqream_admin stopped
-
-.. _using_sqream_client:
-
-Using the SQream Client
-~~~~~~~~~~~~~~~~~~~~~~~
-
-
-You can use the embedded SQream Client on the following nodes:
-
-* Master node
-* Worker node
-
-
-When using the SQream Client on the Master node, the following default settings are used:
-
-* **Default port**: 3108. You can change the default port using the ``-p`` variable.
-* **Default database**: master. You can change the default database using the ``-d`` variable.
-
-The following is an example:
-
-.. code-block:: console
-
-   $ sqream client --master -u sqream -w sqream
-
-
-When using the SQream Client on a Worker node (or nodes), you should use the ``-p`` variable to specify the Worker port. The default database is ``master``, but you can use the ``-d`` variable to change databases.
-
-The following is an example:
-
-.. code-block:: console
-
-   $ sqream client --worker -p 5000 -u sqream -w sqream
-
-
-Moving from Docker Installation to Standard On-Premises Installation
----------------------------------------------------------------------
-
-Because Docker creates all files and directories on the host at the **root** level, you must grant ownership of the SQream storage folder to the working directory user.
-
-SQream Console - Advanced Commands
------------------------------------
-
-The SQream console offers the following advanced commands:
-
-
-* :ref:`Controlling the spool size <controlling_spool_size>`
-* :ref:`Splitting a GPU <splitting_gpu>`
-* :ref:`Splitting a GPU and setting the spool size <splitting_gpu_setting_spool_size>`
-* :ref:`Using a custom configuration file <using_custom_configuration_file>`
-* :ref:`Clustering your Docker environment <clustering_docker_environment>`
-
-
-
-
-.. _controlling_spool_size:
-
-Controlling the Spool Size
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-From the console you can define a spool size value.
-
-The following example shows the spool size being set to ``50``:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --start 2 -m 50
-
-
-If you don't define the SQream spool size, the SQream console automatically distributes the available RAM between all running workers.
-
-.. _splitting_gpu:
-
-Splitting a GPU
-~~~~~~~~~~~~~~~~~~
-
-You can start more than one sqreamd on a single GPU by splitting it.
-
-
-The following example shows the GPU in **slot 0** being split into **two** sqreamd processes:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --start 2 -g 0
-
-.. _splitting_gpu_setting_spool_size:
-
-Splitting a GPU and Setting the Spool Size
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You can simultaneously split a GPU and set the spool size by appending the ``-m`` flag:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --start 2 -g 0 -m 50
-
-.. note:: The console does not validate whether the user-defined spool size is available. Before setting the spool size, verify that the requested resources are available.
-
-.. _using_custom_configuration_file:
-
-Using a Custom Configuration File
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-SQream lets you use your own external custom configuration JSON files. You must place these JSON files in the path mounted in the installation. SQream recommends placing the JSON files in the configuration folder.
-
-The SQream console does not validate the integrity of your external configuration files.
-
-When using your custom configuration file, you can use the ``-j`` flag to define the full path to the configuration file, as in the example below:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --start 1 -j /etc/sqream/configfile.json
-
-.. note:: To start more than one sqream daemon, you must provide a file for each daemon, as in the example below:
-
-.. code-block:: console
-
-   $ sqream worker --start 2 -j /etc/sqream/configfile.json /etc/sqream/configfile2.json
-
-.. note:: To split a specific GPU, you must also list the GPU flag, as in the example below:
-
-.. code-block:: console
-
-   $ sqream worker --start 2 -g 0 -j /etc/sqream/configfile.json /etc/sqream/configfile2.json
-
-.. _clustering_docker_environment:
-
-Clustering Your Docker Environment
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-SQream lets you connect to a remote Master node to start Docker in Distributed mode. When a Worker node is connected to a remote Master in Distributed mode, the **sqream master** and **sqream client** commands are available only on the Master node.
-
-Use the ``--master-host`` flag to point a Worker at the remote Master node, as in the following example:
-
-.. code-block:: console
-
-   sqream-console> sqream worker --start 1 --master-host 192.168.0.102
-
-Checking the Status of SQream Services
----------------------------------------
-SQream lets you check the status of SQream services from the following locations:
-
-* :ref:`From the SQream console <inside_sqream_console>`
-* :ref:`From outside the SQream console <outside_sqream_console>`
-
-.. _inside_sqream_console:
-
-Checking the Status of SQream Services from the SQream Console
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-From the SQream console, you can check the status of SQream services by running the following command:
-
-.. code-block:: console
-
-   sqream-console> sqream master --list
-
-The following is an example of the expected output:
-
-.. code-block:: console
-
-   sqream-console> sqream master --list
-   checking 3 sqream services:
-   sqream_single_host_worker_1 up, listens on port: 5001, allocated gpu: 1
-   sqream_single_host_worker_0 up, listens on port: 5000, allocated gpu: 0
-   sqream_single_host_master up, listens on ports: 3105,3108
-
-.. _outside_sqream_console:
-
-Checking the Status of SQream Services from Outside the SQream Console
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-From outside the SQream console, you can check the status of SQream services by running the following command:
-
-.. code-block:: console
-
-   $ sqream-status
-   NAMES                          STATUS                   PORTS
-   sqream_single_host_worker_1    Up 3 minutes             0.0.0.0:5001->5001/tcp
-   sqream_single_host_worker_0    Up 3 minutes             0.0.0.0:5000->5000/tcp
-   sqream_single_host_master      Up 3 minutes             0.0.0.0:3105->3105/tcp, 0.0.0.0:3108->3108/tcp
-   sqream_editor_3.0.0            Up 3 hours (healthy)     0.0.0.0:3000->3000/tcp
-
-Upgrading Your SQream System
------------------------------
-This section describes how to upgrade your SQream system.
-
-**To upgrade your SQream system:**
-
-1. Contact the SQream Support team for access to the new SQream package tarball file.
-
-   ::
-
-2. Set a maintenance window to enable stopping the system while upgrading it.
-
-   ::
-
-3. Extract the tarball file received from the SQream Support team, using the same user and the same folder that you used while :ref:`downloading the SQream software <download_sqream_software>`:
-
-
-   .. code-block:: console
-
-      $ tar -xvf sqream_installer-2.0.5-DB2019.2.1-CO1.6.3-ED3.0.0-x86_64.tar.gz
-
-4. Navigate to the new folder created as a result of extracting the tarball file:
-
-   .. code-block:: console
-
-      $ cd sqream_installer-2.0.5-DB2019.2.1-CO1.6.3-ED3.0.0-x86_64/
-
-5. Initiate the upgrade process:
-
-   .. code-block:: console
-
-      $ ./sqream-install -i
-
-   Initiating the upgrade process checks whether any SQream services are running. If any services are running, you are prompted to stop them.
-
-6. Do one of the following:
-
-   * Select **Yes** to stop all running SQream workers (Master and Editor) and continue the upgrade process.
-   * Select **No** to stop the upgrade process.
-
-   SQream periodically upgrades the metadata structure. If an upgrade version includes a change to the metadata structure, you will be prompted with an approval request message. Your approval is required to finish the upgrade process.
-
-   Because SQream supports only certain metadata versions, all SQream services must be upgraded at the same time.
-
-7. When the upgrade is complete, load the SQream console and restart your services.
-
-   For assistance, contact SQream Support.
diff --git a/installation_guides/sqream_studio_installation.rst b/installation_guides/sqream_studio_installation.rst
index 8d6c16546..8891dfcde 100644
--- a/installation_guides/sqream_studio_installation.rst
+++ b/installation_guides/sqream_studio_installation.rst
@@ -10,10 +10,4 @@ The **Installing SQream Studio** page includes the following installation guides:
    :glob:
 
    installing_studio_on_stand_alone_server
-   installing_prometheus_exporters
-   installing_prometheus_using_binary_packages
-   installing_dashboard_data_collector
-
-
-
-
+   installing_nginx_proxy_over_secure_connection
\ No newline at end of file
diff --git a/installation_guides/upgrade_guide/index.rst b/installation_guides/upgrade_guide/index.rst
new file mode 100644
index 000000000..0186499c3
--- /dev/null
+++ b/installation_guides/upgrade_guide/index.rst
@@ -0,0 +1,15 @@
+.. _upgrade_guides:
+
+*****************
+Upgrade Guides
+*****************
+
+Refer to the :ref:`version_upgrade` guide to upgrade from your current SQreamDB version, and explore the :ref:`version_upgrade_configurations` guide, which provides a breakdown of the system modifications required for the specific version you're upgrading to, ensuring a thorough and effective upgrade process.
+
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
+   :hidden:
+
+   version_upgrade
+   version_upgrade_configurations
\ No newline at end of file
diff --git a/installation_guides/upgrade_guide/version_upgrade.rst b/installation_guides/upgrade_guide/version_upgrade.rst
new file mode 100644
index 000000000..f556e74e9
--- /dev/null
+++ b/installation_guides/upgrade_guide/version_upgrade.rst
@@ -0,0 +1,120 @@
+.. _version_upgrade:
+
+*****************
+Version Upgrade
+*****************
+
+Upgrading your SQreamDB version requires stopping all running services.
+
+1. Stop all actively running SQreamDB services.
+
+   How you stop SQreamDB services depends on the tool you use to manage them.
+
+2. Verify that SQreamDB has stopped listening on ports **500X**, **510X**, and **310X**:
+
+.. code-block:: console
+
+   $ sudo netstat -nltp #to make sure SQreamDB stopped listening on 500X, 510X and 310X ports
+
+3. Replace the old SQreamDB version with the new version, as in the following example:
+
+.. code-block:: console
+
+   $ cd /home/sqream
+   $ mkdir tempfolder
+   $ mv sqream-db-v2021.1.tar.gz tempfolder/
+   $ cd tempfolder/
+   $ tar -xf sqream-db-v2021.1.tar.gz
+   $ sudo mv sqream /usr/local/sqream-db-v2021.1
+   $ cd /usr/local
+   $ sudo chown -R sqream:sqream sqream-db-v2021.1
+
+4. Remove the symbolic link:
+
+.. code-block:: console
+
+   $ sudo rm sqream
+
+5. Create a new symbolic link named ``sqream`` pointing to the new version:
+
+.. code-block:: console
+
+   $ sudo ln -s sqream-db-v2021.1 sqream
+
+6. Verify that the ``sqream`` symbolic link points to the new folder:
+
+.. code-block:: console
+
+   $ ls -l
+
+   -- Output example:
+
+   sqream -> sqream-db-v2021.1
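+
+Optionally, you can also resolve the link directly as a sanity check. This is a minimal sketch; it assumes the cluster was installed under ``/usr/local`` as in the steps above:
+
+.. code-block:: console
+
+   $ readlink -f /usr/local/sqream
+
+   -- Output example:
+
+   /usr/local/sqream-db-v2021.1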
+
+7. Upgrade your version of SQreamDB storage.
+
+   a. SQreamDB recommends keeping a local copy of the generated backup in case it is needed. To generate a backup of the metadata, run the following command:
+
+      .. code-block:: console
+
+         $ select backup_metadata('out_path');
+
+      SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+   b. Shut down all SQreamDB services.
+
+   c. Extract the recently created backup file.
+
+   d. Replace your current metadata with the metadata stored in the backup file.
+
+   e. Navigate to the new SQreamDB package bin folder.
+
+   f. Get the cluster path:
+
+      .. code-block:: console
+
+         $ cat /etc/sqream/sqream1_config.json | grep cluster
+
+   g. Run the following command:
+
+      .. code-block:: console
+
+         $ ./upgrade_storage <cluster_path>
+
+         -- Output example:
+
+         get_leveldb_version path{}
+         current storage version 23
+         upgrade_v24
+         upgrade_storage to 24
+         upgrade_storage to 24 - Done
+         upgrade_v25
+         upgrade_storage to 25
+         upgrade_storage to 25 - Done
+         upgrade_v26
+         upgrade_storage to 26
+         upgrade_storage to 26 - Done
+         validate_leveldb
+         ...
+         upgrade_v37
+         upgrade_storage to 37
+         upgrade_storage to 37 - Done
+         validate_leveldb
+         storage has been upgraded successfully to version 37
+
+8. Verify that the latest version has been installed:
+
+.. code-block:: console
+
+   $ ./sqream sql --username sqream --password sqream --host localhost --databasename master -c "SELECT SHOW_VERSION();"
+
+   -- Output example:
+
+   v2021.1
+   1 row
+   time: 0.050603s
+
+For more information, see the :ref:`upgrade_storage` command line program.
+
+9. After completing the upgrade process, ensure that all :ref:`operational and configuration <version_upgrade_configurations>` changes introduced in versions newer than the version you are upgrading from are applied before returning to regular SQreamDB operations.
+
diff --git a/installation_guides/upgrade_guide/version_upgrade_configurations.rst b/installation_guides/upgrade_guide/version_upgrade_configurations.rst
new file mode 100644
index 000000000..1dc0bcc18
--- /dev/null
+++ b/installation_guides/upgrade_guide/version_upgrade_configurations.rst
@@ -0,0 +1,81 @@
+.. _version_upgrade_configurations:
+
+******************************************
+Upgrade-Related Configuration Changes
+******************************************
+
+
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - SQreamDB Version
+     - Storage Version
+     - Configurations and Changes
+   * - 4.4
+     - 49
+     - New Releases:
+         * Pysqream 5.0.0 Connector is released
+
+         * JDBC 5.0.0 Connector is released
+   * - 4.3
+     - 49
+     - Configuration:
+         * Two new :ref:`AWS S3` object access style and endpoint URL with Virtual Private Cloud (VPC) configuration flags: ``AwsEndpointOverride``, ``AwsObjectAccessStyle``
+         * **RHEL 8.x** is now officially supported
+   * - 4.2
+     - 46
+     - New Releases:
+         * Pysqream 3.2.5 Connector is released
+
+         * ODBC 4.4.4 Connector is released
+
+         * JDBC 4.5.8 Connector is released
+
+         * Apache Spark 5.0.0 Connector is released
+
+         * The ``INT96`` data type is deprecated
+
+       Configuration:
+
+         * :ref:`Access control permissions` in SQreamDB have been expanded. Learn how to reconfigure access control permissions when :ref:`upgrading from version 4.2`
+   * - 4.1
+     - 45
+     - New Releases:
+         * JDBC 4.5.7 Connector
+
+         * SQream Studio v5.5.4
+   * - 4.0
+     - 45
+     - None
+   * - 2022.1.7
+     - 43
+     - None
+   * - 2022.1.6
+     - 42
+     - None
+   * - 2022.1.5
+     - 42
+     - None
+   * - 2022.1.4
+     - 42
+     - None
+   * - 2022.1.3
+     - 42
+     - The ``VARCHAR`` data type has been deprecated and replaced with ``TEXT``.
+   * - 2022.1.2
+     - 41
+     - None
+   * - 2022.1.1
+     - 40
+     - * In compliance with GDPR standards, version 2022.1.1 requires a strong password policy when accessing the CLI and Studio. For more information, see :ref:`Password Policy`.
+
+       * The ``login_max_retries`` configuration flag is required for adjusting the permitted log-in attempts. For more information, see :ref:`Adjusting the Permitted Log-In Attempts`.
+   * - 2022.1
+     - 40
+     - * In SQream version 2022.1, the ``VARCHAR`` data type has been deprecated and replaced with ``TEXT``. SQream will maintain ``VARCHAR`` in all previous versions until the migration to ``TEXT`` is complete, at which point it will be deprecated in all earlier versions. SQream also provides an automated and secure tool to facilitate and simplify the migration from ``VARCHAR`` to ``TEXT``.
+
+       * If you are using an earlier version of SQreamDB, see the :ref:`Using Legacy String Literals` configuration flag.
+
+
diff --git a/login_5.3.1.png b/login_5.3.1.png
deleted file mode 100644
index 48c725a4c..000000000
Binary files a/login_5.3.1.png and /dev/null differ
diff --git a/operational_guides/accelerating_filtered_queries_with_metadata_partitions.rst b/operational_guides/accelerating_filtered_queries_with_metadata_partitions.rst
new file mode 100644
index 000000000..d91e38b75
--- /dev/null
+++ b/operational_guides/accelerating_filtered_queries_with_metadata_partitions.rst
@@ -0,0 +1,75 @@
+.. _accelerating_filtered_statements:
+
+********************************
+Accelerating Filtered Statements
+********************************
+
+This page outlines a feature designed to significantly improve the performance of statements that include filters on large tables. By using **Metadata Partitions**, SQreamDB minimizes the overhead associated with metadata scanning, leading to faster statement execution times, especially as tables grow.
+
+.. contents::
+   :local:
+   :depth: 1
+
+The Challenge: Metadata Scan Overhead
+=====================================
+
+When you execute a statement with a filter (e.g., ``SELECT x, y FROM table1 WHERE x=7;``), the system needs to scan the metadata of each data chunk within the table to identify the relevant chunks containing the data that satisfies your filter condition. As tables scale to trillions of rows, this metadata scanning process can become a significant bottleneck, adding substantial latency to your statements. In some cases, this overhead can reach tens of seconds for very large tables.
+
+The Solution: Metadata Partitions
+=================================
+
+To address this challenge, SQreamDB introduced **Metadata Partitions**. This feature creates an internal metadata partitioning structure for each table, grouping data chunks based on the minimum and maximum values of sorted columns within those chunks. This allows the system to efficiently identify and target only the relevant partitions during a filtered statement, drastically reducing the amount of metadata that needs to be scanned.
+
+
+Managing Metadata Partitions
+============================
+
+A new SQL function, ``recalculate_metadata_partition``, is introduced to manage and update the Metadata Partitions for your tables.
+
+Syntax
+======
+
+.. code-block:: postgres
+
+   SELECT recalculate_metadata_partition('<schema_name>', '<table_name>', '<column_name>', ['true'/'false']);
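+
+For example, to build the partition metadata for a hypothetical ``public.orders`` table on its sorted ``order_date`` column (the object names here are illustrative only):
+
+.. code-block:: postgres
+
+   SELECT recalculate_metadata_partition('public', 'orders', 'order_date');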
+
+Parameters
+==========
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Parameter
+     - Description
+   * - ``schema_name``
+     - The name of the schema
+   * - ``table_name``
+     - The name of the table
+   * - ``column_name``
+     - The name of the column
+   * - ``case_sensitive_flag``
+     - Optional. ``'false'`` (default) ignores case sensitivity; ``'true'`` enables case sensitivity.
+
+
+Important Considerations
+========================
+
+* The ``recalculate_metadata_partition`` function requires ``SUPERUSER`` privileges.
+* **Impact of Data Modifications:**
+
+  * ``INSERT``: new chunks will be added, and a full scan of these new chunks will be performed until the Metadata Partition is updated.
+  * ``DELETE``: the existing metadata partition might still be used, potentially leading to false positives (pointing to non-existent chunks), which are later filtered out of the statement results.
+  * ``UPDATE``: the existing metadata partition will become irrelevant and will not be used.
+  * ``CLEANUP_CHUNKS``, ``CLEANUP_EXTENTS``, ``RECHUNK``: these operations require dropping and recreating the Metadata Partition.
+
+* The ``recalculate_metadata_partition`` utility is designed to be CPU-based, ensuring that it does not impact GPU-intensive workloads.
+
+
+Monitoring Metadata Partitions
+==============================
+
+A new catalog statement is available to list the existing Metadata Partitions and their status:
+
+.. code-block:: postgres
+
+   SELECT db_name, schema_name, table_name, column_name, last_update, total_chunks_per_column, total_metadata_partitioned_chunks_per_column
+   FROM sqream_catalog.metadata_partitions;
diff --git a/operational_guides/access_control.rst b/operational_guides/access_control.rst
index 7f92f8eaf..9a1ce9f17 100644
--- a/operational_guides/access_control.rst
+++ b/operational_guides/access_control.rst
@@ -4,594 +4,12 @@
 Access Control
 **************
 
-.. contents:: In this topic:
-   :local:
-
-Overview
-==========
-
-Access control provides authentication and authorization in SQream DB.
-
-SQream DB manages authentication and authorization using a role-based access control system (RBAC), like ANSI SQL and other SQL products.
-
-SQream DB has a default permissions system which is inspired by Postgres, but with more power. In most cases, this allows an administrator to set things up so that every object gets permissions set automatically.
-
-In SQream DB, users log in from any worker which verifies their roles and permissions from the metadata server. Each statement issues commands as the currently logged in role.
-
-Roles are defined at the cluster level, meaning they are valid for all databases in the cluster.
-
-To bootstrap SQream DB, a new install will always have one ``SUPERUSER`` role, typically named ``sqream``. To create more roles, you should first connect as this role.
-
-
-Terminology
-================
-
-Roles
-----------
-
-:term:`Role` : a role can be a user, a group, or both.
-
-Roles can own database objects (e.g. tables), and can assign permissions on those objects to other roles.
-
-Roles can be members of other roles, meaning a user role can inherit permissions from its parent role.
-
-Authentication
---------------------
-
-:term:`Authentication` : verifying the identity of the role. User roles have usernames (:term:`role names`) and passwords.
-
-
-Authorization
-----------------
-
-:term:`Authorization` : checking the role has permissions to do a particular thing. The :ref:`grant` command is used for this.
-
-
-Roles
-=====
-
-Roles are used for both users and groups.
- -Roles are global across all databases in the SQream DB cluster. - -To use a ``ROLE`` as a user, it should have a password, the login permission, and connect permissions to the relevant databases. - -Creating new roles (users) ------------------------------- - -A user role can log in to the database, so it should have ``LOGIN`` permissions, as well as a password. - -For example: - -.. code-block:: postgres - - CREATE ROLE role_name ; - GRANT LOGIN to role_name ; - GRANT PASSWORD 'new_password' to role_name ; - GRANT CONNECT ON DATABASE database_name to role_name ; - -Examples: - -.. code-block:: postgres - - CREATE ROLE new_role_name ; - GRANT LOGIN TO new_role_name; - GRANT PASSWORD 'my_password' TO new_role_name; - GRANT CONNECT ON DATABASE master TO new_role_name; - -A database role may have a number of permissions that define what tasks it can perform. These are assigned using the :ref:`grant` command. - -Dropping a user ---------------- - -.. code-block:: postgres - - DROP ROLE role_name ; - -Examples: - -.. code-block:: postgres - - DROP ROLE admin_role ; - -Altering a user name ------------------------- - -Renaming a user's role: - -.. code-block:: postgres - - ALTER ROLE role_name RENAME TO new_role_name ; - -Examples: - -.. code-block:: postgres - - ALTER ROLE admin_role RENAME TO copy_role ; - -.. _change_password: - -Changing user passwords --------------------------- - -To change a user role's password, grant the user a new password. - -.. code-block:: postgres - - GRANT PASSWORD 'new_password' TO rhendricks; - -.. note:: Granting a new password overrides any previous password. Changing the password while the role has an active running statement does not affect that statement, but will affect subsequent statements. - -Public Role ------------ - -There is a public role which always exists. Each role is granted to the ``PUBLIC`` role (i.e. is a member of the public group), and this cannot be revoked. You can alter the permissions granted to the public role. - -The ``PUBLIC`` role has ``USAGE`` and ``CREATE`` permissions on ``PUBLIC`` schema by default, therefore, new users can create, :ref:`insert`, :ref:`delete`, and :ref:`select` from objects in the ``PUBLIC`` schema. - - -Role membership (groups) -------------------------- - -Many database administrators find it useful to group user roles together. By grouping users, permissions can be granted to, or revoked from a group with one command. In SQream DB, this is done by creating a group role, granting permissions to it, and then assigning users to that group role. - -To use a role purely as a group, omit granting it ``LOGIN`` and ``PASSWORD`` permissions. - -The ``CONNECT`` permission can be given directly to user roles, and/or to the groups they are part of. - -.. code-block:: postgres - - CREATE ROLE my_group; - -Once the group role exists, you can add user roles (members) using the ``GRANT`` command. For example: - -.. code-block:: postgres - - -- Add my_user to this group - GRANT my_group TO my_user; - - -To manage object permissions like databases and tables, you would then grant permissions to the group-level role (see :ref:`the permissions table` below. - -All member roles then inherit the permissions from the group. For example: - -.. 
code-block:: postgres - - -- Grant all group users connect permissions - GRANT CONNECT ON DATABASE a_database TO my_group; - - -- Grant all permissions on tables in public schema - GRANT ALL ON all tables IN schema public TO my_group; - -Removing users and permissions can be done with the ``REVOKE`` command: - -.. code-block:: postgres - - -- remove my_other_user from this group - REVOKE my_group FROM my_other_user; - -.. _permissions_table: - -Permissions -=========== - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Object/layer - - Permission - - Description - - * - all databases - - ``LOGIN`` - - use role to log into the system (the role also needs connect permission on the database it is connecting to) - - * - all databases - - ``PASSWORD`` - - the password used for logging into the system - - * - all databases - - ``SUPERUSER`` - - no permission restrictions on any activity - - * - database - - ``SUPERUSER`` - - no permission restrictions on any activity within that database (this does not include modifying roles or permissions) - - * - database - - ``CONNECT`` - - connect to the database - - * - database - - ``CREATE`` - - create schemas in the database - - * - database - - ``CREATE FUNCTION`` - - create and drop functions - - * - schema - - ``USAGE`` - - allows additional permissions within the schema - - * - schema - - ``CREATE`` - - create tables in the schema - - * - table - - ``SELECT`` - - :ref:`select` from the table - - * - table - - ``INSERT`` - - :ref:`insert` into the table - - * - table - - ``DELETE`` - - :ref:`delete` and :ref:`truncate` on the table - - * - table - - ``DDL`` - - drop and alter on the table - - * - table - - ``ALL`` - - all the table permissions - - * - function - - ``EXECUTE`` - - use the function - - * - function - - ``DDL`` - - drop and alter on the function - - * - function - - ``ALL`` - - all function permissions - -GRANT ------ - -:ref:`grant` gives permissions to a role. - -.. code-block:: postgres - - -- Grant permissions at the instance/ storage cluster level: - GRANT - - { SUPERUSER - | LOGIN - | PASSWORD '' - } - TO [, ...] - - -- Grant permissions at the database level: - GRANT {{CREATE | CONNECT| DDL | SUPERUSER | CREATE FUNCTION} [, ...] | ALL [PERMISSIONS]} - - ON DATABASE [, ...] - TO [, ...] - - -- Grant permissions at the schema level: - GRANT {{ CREATE | DDL | USAGE | SUPERUSER } [, ...] | ALL [ - PERMISSIONS ]} - ON SCHEMA [, ...] - TO [, ...] - - -- Grant permissions at the object level: - GRANT {{SELECT | INSERT | DELETE | DDL } [, ...] | ALL [PERMISSIONS]} - ON { TABLE [, ...] | ALL TABLES IN SCHEMA [, ...]} - TO [, ...] - - -- Grant execute function permission: - GRANT {ALL | EXECUTE | DDL} ON FUNCTION function_name - TO role; - - -- Allows role2 to use permissions granted to role1 - GRANT [, ...] - TO - - -- Also allows the role2 to grant role1 to other roles: - GRANT [, ...] - TO - WITH ADMIN OPTION - -``GRANT`` examples: - -.. code-block:: postgres - - GRANT LOGIN,superuser TO admin; - - GRANT CREATE FUNCTION ON database master TO admin; - - GRANT SELECT ON TABLE admin.table1 TO userA; - - GRANT EXECUTE ON FUNCTION my_function TO userA; - - GRANT ALL ON FUNCTION my_function TO userA; - - GRANT DDL ON admin.main_table TO userB; - - GRANT ALL ON all tables IN schema public TO userB; - - GRANT admin TO userC; - - GRANT superuser ON schema demo TO userA - - GRANT admin_role TO userB; - -REVOKE ------- - -:ref:`revoke` removes permissions from a role. - -.. 
code-block:: postgres - - -- Revoke permissions at the instance/ storage cluster level: - REVOKE - { SUPERUSER - | LOGIN - | PASSWORD - } - FROM [, ...] - - -- Revoke permissions at the database level: - REVOKE {{CREATE | CONNECT | DDL | SUPERUSER | CREATE FUNCTION}[, ...] |ALL [PERMISSIONS]} - ON DATABASE [, ...] - FROM [, ...] - - -- Revoke permissions at the schema level: - REVOKE { { CREATE | DDL | USAGE | SUPERUSER } [, ...] | ALL [PERMISSIONS]} - ON SCHEMA [, ...] - FROM [, ...] - - -- Revoke permissions at the object level: - REVOKE { { SELECT | INSERT | DELETE | DDL } [, ...] | ALL } - ON { [ TABLE ] [, ...] | ALL TABLES IN SCHEMA - - [, ...] } - FROM [, ...] - - -- Removes access to permissions in role1 by role 2 - REVOKE [, ...] FROM [, ...] WITH ADMIN OPTION - - -- Removes permissions to grant role1 to additional roles from role2 - REVOKE [, ...] FROM [, ...] WITH ADMIN OPTION - - -Examples: - -.. code-block:: postgres - - REVOKE superuser on schema demo from userA; - - REVOKE delete on admin.table1 from userB; - - REVOKE login from role_test; - - REVOKE CREATE FUNCTION FROM admin; - -Default permissions -------------------- - -The default permissions system (See :ref:`alter_default_permissions`) -can be used to automatically grant permissions to newly -created objects (See the departmental example below for one way it can be used). - -A default permissions rule looks for a schema being created, or a -table (possibly by schema), and is table to grant any permission to -that object to any role. This happens when the create table or create -schema statement is run. - - -.. code-block:: postgres - - - ALTER DEFAULT PERMISSIONS FOR target_role_name - [IN schema_name, ...] - FOR { TABLES | SCHEMAS } - { grant_clause | DROP grant_clause} - TO ROLE { role_name | public }; - - grant_clause ::= - GRANT - { CREATE FUNCTION - | SUPERUSER - | CONNECT - | CREATE - | USAGE - | SELECT - | INSERT - | DELETE - | DDL - | EXECUTE - | ALL - } - - -Departmental Example -======================= - -You work in a company with several departments. - -The example below shows you how to manage permissions in a database shared by multiple departments, where each department has different roles for the tables by schema. It walks you through how to set the permissions up for existing objects and how to set up default permissions rules to cover newly created objects. - -The concept is that you set up roles for each new schema with the correct permissions, then the existing users can use these roles. - -A superuser must do new setup for each new schema which is a limitation, but superuser permissions are not needed at any other time, and neither are explicit grant statements or object ownership changes. - -In the example, the database is called ``my_database``, and the new or existing schema being set up to be managed in this way is called ``my_schema``. - -.. figure:: /_static/images/access_control_department_example.png - :scale: 60 % - - Our departmental example has four user group roles and seven users roles - -There will be a group for this schema for each of the following: - -.. 
list-table:: - :widths: auto - :header-rows: 1 - - * - Group - - Activities - - * - database designers - - create, alter and drop tables - - * - updaters - - insert and delete data - - * - readers - - read data - - * - security officers - - add and remove users from these groups - -Setting up the department permissions ------------------------------------------- - -As a superuser, you connect to the system and run the following: - -.. code-block:: postgres - - -- create the groups - - CREATE ROLE my_schema_security_officers; - CREATE ROLE my_schema_database_designers; - CREATE ROLE my_schema_updaters; - CREATE ROLE my_schema_readers; - - -- grant permissions for each role - -- we grant permissions for existing objects here too, - -- so you don't have to start with an empty schema - - -- security officers - - GRANT connect ON DATABASE my_database TO my_schema_security_officers; - GRANT usage ON SCHEMA my_schema TO my_schema_security_officers; - - GRANT my_schema_database_designers TO my_schema_security_officers WITH ADMIN OPTION; - GRANT my_schema_updaters TO my_schema_security_officers WITH ADMIN OPTION; - GRANT my_schema_readers TO my_schema_security_officers WITH ADMIN OPTION; - - -- database designers - - GRANT connect ON DATABASE my_database TO my_schema_database_designers; - GRANT usage ON SCHEMA my_schema TO my_schema_database_designers; - - GRANT create,ddl ON SCHEMA my_schema TO my_schema_database_designers; - - -- updaters - - GRANT connect ON DATABASE my_database TO my_schema_updaters; - GRANT usage ON SCHEMA my_schema TO my_schema_updaters; - - GRANT SELECT,INSERT,DELETE ON ALL TABLES IN SCHEMA my_schema TO my_schema_updaters; - - -- readers - - GRANT connect ON DATABASE my_database TO my_schema_readers; - GRANT usage ON SCHEMA my_schema TO my_schema_readers; - - GRANT SELECT ON ALL TABLES IN SCHEMA my_schema TO my_schema_readers; - GRANT EXECUTE ON ALL FUNCTIONS TO my_schema_readers; - - - -- create the default permissions for new objects - - ALTER DEFAULT PERMISSIONS FOR my_schema_database_designers IN my_schema - FOR TABLES GRANT SELECT,INSERT,DELETE TO my_schema_updaters; - - -- For every table created by my_schema_database_designers, give access to my_schema_readers: - - ALTER DEFAULT PERMISSIONS FOR my_schema_database_designers IN my_schema - FOR TABLES GRANT SELECT TO my_schema_readers; - -.. note:: - * This process needs to be repeated by a user with ``SUPERUSER`` permissions each time a new schema is brought into this permissions management approach. - - * - By default, any new object created will not be accessible by our new ``my_schema_readers`` group. - Running a ``GRANT SELECT ...`` only affects objects that already exist in the schema or database. - - If you're getting a ``Missing the following permissions: SELECT on table 'database.public.tablename'`` error, make sure that - you've altered the default permissions with the ``ALTER DEFAULT PERMISSIONS`` statement. - -Creating new users in the departments ------------------------------------------ - -After the group roles have been created, you can now create user roles for each of your users. - -.. 
code-block:: postgres - - -- create the new database designer users - - CREATE ROLE ecodd; - GRANT LOGIN TO ecodd; - GRANT PASSWORD 'ecodds_secret_password' TO ecodd; - GRANT CONNECT ON DATABASE my_database TO ecodd; - GRANT my_schema_database_designers TO ecodd; - - CREATE ROLE ebachmann; - GRANT LOGIN TO ebachmann; - GRANT PASSWORD 'another_secret_password' TO ebachmann; - GRANT CONNECT ON DATABASE my_database TO ebachmann; - GRANT my_database_designers TO ebachmann; - - -- If a user already exists, we can assign that user directly to the group - - GRANT my_schema_updaters TO rhendricks; - - -- Create users in the readers group - - CREATE ROLE jbarker; - GRANT LOGIN TO jbarker; - GRANT PASSWORD 'action_jack' TO jbarker; - GRANT CONNECT ON DATABASE my_database TO jbarker; - GRANT my_schema_readers TO jbarker; - - CREATE ROLE lbream; - GRANT LOGIN TO lbream; - GRANT PASSWORD 'artichoke123' TO lbream; - GRANT CONNECT ON DATABASE my_database TO lbream; - GRANT my_schema_readers TO lbream; - - CREATE ROLE pgregory; - GRANT LOGIN TO pgregory; - GRANT PASSWORD 'c1ca6a' TO pgregory; - GRANT CONNECT ON DATABASE my_database TO pgregory; - GRANT my_schema_readers TO pgregory; - - -- Create users in the security officers group - - CREATE ROLE hoover; - GRANT LOGIN TO hoover; - GRANT PASSWORD 'mintchip' TO hoover; - GRANT CONNECT ON DATABASE my_database TO hoover; - GRANT my_schema_security_officers TO hoover; - - -.. todo: - create some example users - show that they have the right permission - try out the with admin option. we can't really do a security officer because - only superusers can create users and logins. see what can be done - need 1-2 users in each group, for at least 2 schemas/departments - this example will be very big just to show what this setup can do ... - example: a security officer for a department which will only have - read only access to a schema can only get that with admin option - access granted to them - -After this setup: - -* Database designers will be able to run any ddl on objects in the schema and create new objects, including ones created by other database designers -* Updaters will be able to insert and delete to existing and new tables -* Readers will be able to read from existing and new tables - -All this will happen without having to run any more ``GRANT`` statements. - -Any security officer will be able to add and remove users from these -groups. Creating and dropping login users themselves must be done by a -superuser. +.. toctree:: + :maxdepth: 1 + :titlesonly: + + access_control_overview + access_control_password_policy + access_control_managing_roles + access_control_permissions + access_control_departmental_example \ No newline at end of file diff --git a/operational_guides/access_control_departmental_example.rst b/operational_guides/access_control_departmental_example.rst new file mode 100644 index 000000000..0358f3a3e --- /dev/null +++ b/operational_guides/access_control_departmental_example.rst @@ -0,0 +1,180 @@ +.. _access_control_departmental_example: + +********************* +Departmental Example +********************* + +You work in a company with several departments. + +The example below shows you how to manage permissions in a database shared by multiple departments, where each department has different roles for the tables by schema. It walks you through how to set the permissions up for existing objects and how to set up default permissions rules to cover newly created objects. 
+
+The concept is that you set up roles for each new schema with the correct permissions, then the existing users can use these roles.
+
+A superuser must do new setup for each new schema, which is a limitation, but superuser permissions are not needed at any other time, and neither are explicit grant statements or object ownership changes.
+
+In the example, the database is called ``my_database``, and the new or existing schema being set up to be managed in this way is called ``my_schema``.
+
+Our departmental example has four user group roles and seven user roles.
+
+There will be a group for this schema for each of the following:
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Group
+     - Activities
+
+   * - database designers
+     - create, alter and drop tables
+
+   * - updaters
+     - insert and delete data
+
+   * - readers
+     - read data
+
+   * - security officers
+     - add and remove users from these groups
+
+Setting up the department permissions
+------------------------------------------
+
+As a superuser, you connect to the system and run the following:
+
+.. code-block:: postgres
+
+   -- create the groups
+
+   CREATE ROLE my_schema_security_officers;
+   CREATE ROLE my_schema_database_designers;
+   CREATE ROLE my_schema_updaters;
+   CREATE ROLE my_schema_readers;
+
+   -- grant permissions for each role
+   -- we grant permissions for existing objects here too,
+   -- so you don't have to start with an empty schema
+
+   -- security officers
+
+   GRANT connect ON DATABASE my_database TO my_schema_security_officers;
+   GRANT usage ON SCHEMA my_schema TO my_schema_security_officers;
+
+   GRANT my_schema_database_designers TO my_schema_security_officers WITH ADMIN OPTION;
+   GRANT my_schema_updaters TO my_schema_security_officers WITH ADMIN OPTION;
+   GRANT my_schema_readers TO my_schema_security_officers WITH ADMIN OPTION;
+
+   -- database designers
+
+   GRANT connect ON DATABASE my_database TO my_schema_database_designers;
+   GRANT usage ON SCHEMA my_schema TO my_schema_database_designers;
+
+   GRANT create,ddl ON SCHEMA my_schema TO my_schema_database_designers;
+
+   -- updaters
+
+   GRANT connect ON DATABASE my_database TO my_schema_updaters;
+   GRANT usage ON SCHEMA my_schema TO my_schema_updaters;
+
+   GRANT SELECT,INSERT,DELETE ON ALL TABLES IN SCHEMA my_schema TO my_schema_updaters;
+
+   -- readers
+
+   GRANT connect ON DATABASE my_database TO my_schema_readers;
+   GRANT usage ON SCHEMA my_schema TO my_schema_readers;
+
+   GRANT SELECT ON ALL TABLES IN SCHEMA my_schema TO my_schema_readers;
+   GRANT EXECUTE ON ALL FUNCTIONS TO my_schema_readers;
+
+
+   -- create the default permissions for new objects
+
+   ALTER DEFAULT PERMISSIONS FOR my_schema_database_designers IN my_schema
+   FOR TABLES GRANT SELECT,INSERT,DELETE TO my_schema_updaters;
+
+   -- For every table created by my_schema_database_designers, give access to my_schema_readers:
+
+   ALTER DEFAULT PERMISSIONS FOR my_schema_database_designers IN my_schema
+   FOR TABLES GRANT SELECT TO my_schema_readers;
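+
+A quick way to confirm the rules work as intended (the table name is illustrative): a member of ``my_schema_database_designers`` creates a table, and a member of ``my_schema_readers`` can immediately query it without any further ``GRANT`` statements:
+
+.. code-block:: postgres
+
+   -- run as a member of my_schema_database_designers
+   CREATE TABLE my_schema.t1 (x INT);
+
+   -- run as a member of my_schema_readers
+   SELECT * FROM my_schema.t1;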
+
+.. note::
+   * This process needs to be repeated by a user with ``SUPERUSER`` permissions each time a new schema is brought into this permissions management approach.
+
+   * By default, any new object created will not be accessible by our new ``my_schema_readers`` group.
+     Running a ``GRANT SELECT ...`` only affects objects that already exist in the schema or database.
+
+     If you're getting a ``Missing the following permissions: SELECT on table 'database.public.tablename'`` error, make sure that
+     you've altered the default permissions with the ``ALTER DEFAULT PERMISSIONS`` statement.
+
+Creating new users in the departments
+-----------------------------------------
+
+After the group roles have been created, you can now create user roles for each of your users.
+
+.. code-block:: postgres
+
+   -- create the new database designer users
+
+   CREATE ROLE ecodd;
+   GRANT LOGIN TO ecodd;
+   GRANT PASSWORD 'Passw0rd!' TO ecodd;
+   GRANT CONNECT ON DATABASE my_database TO ecodd;
+   GRANT USAGE ON SERVICE sqream TO ecodd;
+   GRANT my_schema_database_designers TO ecodd;
+
+   CREATE ROLE ebachmann;
+   GRANT LOGIN TO ebachmann;
+   GRANT PASSWORD 'Passw0rd!!!' TO ebachmann;
+   GRANT CONNECT ON DATABASE my_database TO ebachmann;
+   GRANT USAGE ON SERVICE sqream TO ebachmann;
+   GRANT my_schema_database_designers TO ebachmann;
+
+   -- If a user already exists, we can assign that user directly to the group
+
+   GRANT my_schema_updaters TO rhendricks;
+
+   -- Create users in the readers group
+
+   CREATE ROLE jbarker;
+   GRANT LOGIN TO jbarker;
+   GRANT PASSWORD 'action_jacC%k' TO jbarker;
+   GRANT CONNECT ON DATABASE my_database TO jbarker;
+   GRANT USAGE ON SERVICE sqream TO jbarker;
+   GRANT my_schema_readers TO jbarker;
+
+   CREATE ROLE lbream;
+   GRANT LOGIN TO lbream;
+   GRANT PASSWORD 'artichoke123O$' TO lbream;
+   GRANT CONNECT ON DATABASE my_database TO lbream;
+   GRANT USAGE ON SERVICE sqream TO lbream;
+   GRANT my_schema_readers TO lbream;
+
+   CREATE ROLE pgregory;
+   GRANT LOGIN TO pgregory;
+   GRANT PASSWORD 'c1ca6aG$' TO pgregory;
+   GRANT CONNECT ON DATABASE my_database TO pgregory;
+   GRANT USAGE ON SERVICE sqream TO pgregory;
+   GRANT my_schema_readers TO pgregory;
+
+   -- Create users in the security officers group
+
+   CREATE ROLE hoover;
+   GRANT LOGIN TO hoover;
+   GRANT PASSWORD 'mint*Rchip' TO hoover;
+   GRANT CONNECT ON DATABASE my_database TO hoover;
+   GRANT USAGE ON SERVICE sqream TO hoover;
+   GRANT my_schema_security_officers TO hoover;
+
+
+After this setup:
+
+* Database designers will be able to run any DDL on objects in the schema and create new objects, including ones created by other database designers
+* Updaters will be able to insert into and delete from existing and new tables
+* Readers will be able to read from existing and new tables
+
+All this will happen without having to run any more ``GRANT`` statements.
+
+Any security officer will be able to add and remove users from these
+groups. Creating and dropping login users themselves must be done by a
+superuser.
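+
+As a brief illustration using the roles created above, a security officer such as ``hoover`` can now manage group membership directly, because the group roles were granted with ``WITH ADMIN OPTION``:
+
+.. code-block:: postgres
+
+   -- run as hoover
+   GRANT my_schema_readers TO pgregory;
+   REVOKE my_schema_updaters FROM rhendricks;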
diff --git a/operational_guides/access_control_managing_roles.rst b/operational_guides/access_control_managing_roles.rst
new file mode 100644
index 000000000..b6e9985f8
--- /dev/null
+++ b/operational_guides/access_control_managing_roles.rst
@@ -0,0 +1,129 @@
+.. _access_control_managing_roles:
+
+**************
+Managing Roles
+**************
+
+Roles are used for both users and groups, and are global across all databases in the SQream cluster. For a ``ROLE`` to be used as a user, it requires a password, as well as login and connect permissions to the relevant databases.
+
+The Managing Roles section describes the following role-related operations:
+
+.. contents::
+   :local:
+   :depth: 1
+
+Creating New Roles (Users)
+------------------------------
+
+A user role logging in to the database requires ``LOGIN`` permissions and a password.
+
+The following is the syntax for creating a new role:
+
+.. code-block:: postgres
+
+   CREATE ROLE <role_name> ;
+   GRANT LOGIN TO <role_name> ;
+   GRANT PASSWORD <'new_password'> TO <role_name> ;
+   GRANT CONNECT ON DATABASE <database_name> TO <role_name> ;
+
+The following is an example of creating a new role:
+
+.. code-block:: postgres
+
+   CREATE ROLE new_role_name ;
+   GRANT LOGIN TO new_role_name;
+   GRANT PASSWORD 'Passw0rd!' TO new_role_name;
+   GRANT CONNECT ON DATABASE master TO new_role_name;
+
+A database role may have a number of permissions that define what tasks it can perform, which are assigned using the :ref:`grant` command.
+
+Dropping a User
+------------------------------
+
+The following is the syntax for dropping a user:
+
+.. code-block:: postgres
+
+   DROP ROLE <role_name> ;
+
+The following is an example of dropping a user:
+
+.. code-block:: postgres
+
+   DROP ROLE admin_role ;
+
+Altering a User Name
+------------------------------
+
+The following is the syntax for altering a user name:
+
+.. code-block:: postgres
+
+   ALTER ROLE <role_name> RENAME TO <new_role_name> ;
+
+The following is an example of altering a user name:
+
+.. code-block:: postgres
+
+   ALTER ROLE admin_role RENAME TO copy_role ;
+
+Changing a User Password
+------------------------------
+
+You can change a user role's password by granting the user a new password.
+
+The following is an example of changing a user password:
+
+.. code-block:: postgres
+
+   GRANT PASSWORD <'new_password'> TO rhendricks;
+
+.. note:: Granting a new password overrides any previous password. Changing the password while the role has an active running statement does not affect that statement, but will affect subsequent statements.
+
+Altering Public Role Permissions
+---------------------------------
+
+The database has a predefined ``PUBLIC`` role that cannot be deleted. Each user role is automatically granted membership in the ``PUBLIC`` role group, and this membership cannot be revoked. However, you have the capability to adjust the permissions associated with the ``PUBLIC`` role.
+
+The ``PUBLIC`` role has ``USAGE`` and ``CREATE`` permissions on the ``PUBLIC`` schema by default; therefore, newly created user roles are granted ``CREATE`` (:ref:`databases`, :ref:`schemas`, :ref:`roles`, :ref:`functions`, :ref:`views`, and :ref:`tables`) on the public schema. Other permissions, such as :ref:`insert`, :ref:`delete`, :ref:`select`, and :ref:`update` on objects in the public schema, are not automatically granted.
+
+
+Altering Role Membership (Groups)
+---------------------------------
+
+Many database administrators find it useful to group user roles together. By grouping users, permissions can be granted to, or revoked from, a group with one command. In SQream DB, this is done by creating a group role, granting permissions to it, and then assigning users to that group role.
+
+To use a role purely as a group, omit granting it ``LOGIN`` and ``PASSWORD`` permissions.
+
+The ``CONNECT`` permission can be given directly to user roles, and/or to the groups they are part of.
+
+.. code-block:: postgres
+
+   CREATE ROLE my_group;
+
+Once the group role exists, you can add user roles (members) using the ``GRANT`` command. For example:
+
+.. code-block:: postgres
+
+   -- Add my_user to this group
+   GRANT my_group TO my_user;
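+
+If the member should also be able to add and remove other members of the group, membership can be granted with the admin option:
+
+.. code-block:: postgres
+
+   -- Also allow my_user to grant my_group to other roles
+   GRANT my_group TO my_user WITH ADMIN OPTION;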
+
+To manage object permissions like databases and tables, you would then grant permissions to the group-level role (see :ref:`the permissions table <access_control_permissions>` below).
+
+All member roles then inherit the permissions from the group. For example:
+
+.. code-block:: postgres
+
+   -- Grant all group users connect permissions
+   GRANT CONNECT ON DATABASE a_database TO my_group;
+
+   -- Grant all permissions on tables in public schema
+   GRANT ALL ON all tables IN schema public TO my_group;
+
+Removing users and permissions can be done with the ``REVOKE`` command:
+
+.. code-block:: postgres
+
+   -- remove my_other_user from this group
+   REVOKE my_group FROM my_other_user;
diff --git a/operational_guides/access_control_overview.rst b/operational_guides/access_control_overview.rst
new file mode 100644
index 000000000..080797fec
--- /dev/null
+++ b/operational_guides/access_control_overview.rst
@@ -0,0 +1,20 @@
+.. _access_control_overview:
+
+**************
+Overview
+**************
+
+Access control refers to SQream's authentication and authorization operations, managed using a **Role-Based Access Control (RBAC)** system, similar to ANSI SQL and other SQL products. SQream's default permissions system is similar to Postgres, but is more powerful. SQream's method lets administrators prepare the system to automatically provide objects with their required permissions.
+
+SQream users can log in from any worker, which verifies their roles and permissions from the metadata server. Each statement issues commands as the currently logged-in role. Roles are defined at the cluster level, and are valid for all databases in the cluster. To bootstrap SQream, new installations require one ``SUPERUSER`` role, typically named ``sqream``. You can only create new roles by connecting as this role.
+
+Access control refers to the following basic concepts:
+
+   * **Role** - A role can be a user, a group, or both. Roles can own database objects (such as tables) and can assign permissions on those objects to other roles. Roles can be members of other roles, meaning a user role can inherit permissions from its parent role.
+
+   ::
+
+   * **Authentication** - Verifies the identity of the role. User roles have usernames (or **role names**) and passwords.
+
+   ::
+
+   * **Authorization** - Checks that a role has permissions to perform a particular operation, such as the :ref:`grant` command.
\ No newline at end of file
diff --git a/operational_guides/access_control_password_policy.rst b/operational_guides/access_control_password_policy.rst
new file mode 100644
index 000000000..6c69257ed
--- /dev/null
+++ b/operational_guides/access_control_password_policy.rst
@@ -0,0 +1,76 @@
+.. _access_control_password_policy:
+
+***************
+Password Policy
+***************
+
+The **Password Policy** page describes the following:
+
+.. contents::
+   :local:
+   :depth: 1
+
+Password Strength Requirements
+==============================
+
+As part of our compliance with GDPR standards, SQream relies on a strong password policy when accessing the CLI or Studio, with the following requirements:
+
+* At least eight characters long.
+
+  ::
+
+* Mandatory upper and lowercase letters.
+
+  ::
+
+* At least one numeric character.
+
+  ::
+
+* May not include a username.
+
+  ::
+
+* Must include at least one special character, such as **?**, **!**, **$**, etc.
+
+You can create a password by using the Studio graphic interface or using the CLI, as in the following example command:
+
+.. code-block:: console
+
+   CREATE ROLE user_a ;
+   GRANT LOGIN to user_a ;
+   GRANT PASSWORD 'BBAu47?fqPL' to user_a ;
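+
+For instance, the following attempt would be rejected because the password is too short and contains no uppercase letter or special character (the role and password are illustrative):
+
+.. code-block:: console
+
+   GRANT PASSWORD 'abc123' to user_a ;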
+
+Creating a password that does not comply with the password policy generates an error message listing the requirements that have not been met:
+
+.. code-block:: console
+
+   The password you attempted to create does not comply with SQream's security requirements.
+
+   Your password must:
+
+   * Be at least eight characters long.
+
+   * Contain upper and lowercase letters.
+
+   * Contain at least one numeric character.
+
+   * Not include a username.
+
+   * Include at least one special character, such as ?, !, $, etc.
+
+Brute Force Prevention
+==============================
+
+After five unsuccessful log-in attempts, the following message is displayed:
+
+.. code-block:: console
+
+   The user is locked. Please contact your system administrator to reset the password and regain access functionality.
+
+To release a locked user, a role with superuser permissions must grant the locked user a new password:
+
+.. code-block:: console
+
+   GRANT PASSWORD <'new_password'> TO <role_name>;
+
+For more information, see :ref:`login_max_retries`.
+
+.. warning:: Because superusers can also be blocked, **you must have** at least two superusers per cluster.
\ No newline at end of file
diff --git a/operational_guides/access_control_permissions.rst b/operational_guides/access_control_permissions.rst
new file mode 100644
index 000000000..2e0dc862a
--- /dev/null
+++ b/operational_guides/access_control_permissions.rst
@@ -0,0 +1,532 @@
+.. _access_control_permissions:
+
+***********
+Permissions
+***********
+
+SQreamDB's primary permission object is a role. The role operates in a dual capacity, as both a user and a group. As a user, a role may have permissions to execute operations like creating tables, querying data, and administering the database. The group attribute may be thought of as a membership. As a group, a role may extend its permissions to other roles defined as its group members. This comes in handy when privileged roles wish to extend their permissions and grant multiple permissions to multiple roles. The information about all system role permissions is stored in the metadata.
+
+There are two types of permissions: global and object-level. Global permissions belong to ``SUPERUSER`` roles, allowing unrestricted access to all system and database activities. Object-level permissions apply to non-``SUPERUSER`` roles and can be assigned to databases, schemas, tables, functions, views, foreign tables, catalogs, and services.
+
+The following table describes the required permissions for performing and executing operations on various SQreamDB objects.
+
+ ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Permission** | **Description** | ++======================+=========================================================================================================================+ +|**All Databases** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``LOGIN`` | Use role to log into the system (the role also needs connect permission on the database it is connecting to) | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``PASSWORD`` | The password used for logging into the system | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SUPERUSER`` | No permission restrictions on any activity | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Database** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SUPERUSER`` | No permission restrictions on any activity within that database (this does not include modifying roles or permissions) | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``CONNECT`` | Connect to the database | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``CREATE`` | Create and drop schemas in the database (the schema must be empty for ``DROP`` operation to succeed) | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``CREATEFUNCTION`` | Create and drop functions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | Drop and alter tables within the database | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All database permissions except for a SUPERUSER permission | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Schema** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``USAGE`` | Grants access to schema objects | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``CREATE`` | Create tables in the schema | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SUPERUSER`` | No permission restrictions on any activity within the schema (this does not include modifying roles or permissions) | 
++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | Drop and alter tables within the schema | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All schema permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Table** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SELECT`` | :ref:`select` from the table | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``INSERT`` | :ref:`insert` into the table | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``UPDATE`` | :ref:`update` the value of certain columns in existing rows | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DELETE`` | :ref:`delete` and :ref:`truncate` on the table | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | Drop and alter on the table | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All table permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Function** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``EXECUTE`` | Use the function | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | Drop and alter on the function | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All function permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Column** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SELECT`` | Select from column | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``INSERT`` | :ref:`insert` into the column | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``UPDATE`` | :ref:`update` the value of certain columns in existing rows | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **View** | 
++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SELECT`` | Select from view | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | DDL operations of view results | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All views permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Foreign Table** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SELECT`` | Select from foreign table | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | Foreign table DDL operations | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All foreign table permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Catalog** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SELECT`` | Select from catalog | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Services** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``USAGE`` | Using a specific service | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All services permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| **Saved Query** | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``SELECT`` | Executing saved query statements and utility functions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``DDL`` | Saved query DDL operations | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``USAGE`` | Grants access to saved query objects | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ +| ``ALL`` | All saved query permissions | ++----------------------+-------------------------------------------------------------------------------------------------------------------------+ + +Syntax +====== + +Permissions may be granted or revoked using the following syntax. + +GRANT +------ + +.. 
code-block:: postgres + + -- Grant permissions to all databases: + GRANT { + SUPERUSER + | LOGIN + | PASSWORD '<password>' } + TO <role_name> [, ...] + + -- Grant permissions at the database level: + GRANT { + CREATE + | CONNECT + | DDL + | SUPERUSER + | CREATE FUNCTION } [, ...] + | ALL [PERMISSIONS] + ON DATABASE <database_name> [, ...] + TO <role_name> [, ...] + + -- Grant permissions at the schema level: + GRANT { + CREATE + | DDL + | USAGE + | SUPERUSER } [, ...] + | ALL [PERMISSIONS] + ON SCHEMA <schema_name> [, ...] + TO <role_name> [, ...] + + -- Grant permissions at the object level: + GRANT { + SELECT + | INSERT + | DELETE + | DDL + | UPDATE } [, ...] + | ALL [PERMISSIONS] + ON {TABLE <table_name> [, ...] + | ALL TABLES IN SCHEMA <schema_name> [, ...]} + TO <role_name> [, ...] + + -- Grant permissions at the catalog level: + GRANT SELECT + ON { CATALOG <catalog_name> [, ...] } + TO <role_name> [, ...] + + -- Grant permissions on the foreign table level: + + GRANT { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { FOREIGN TABLE <table_name> [, ...] + | ALL FOREIGN TABLES IN SCHEMA <schema_name> [, ...]} + TO <role_name> [, ...] + + -- Grant function execution permission: + GRANT { + ALL + | EXECUTE + | DDL } + ON FUNCTION <function_name> + TO <role_name> + + -- Grant permissions at the column level: + GRANT + { + { SELECT + | INSERT + | UPDATE } [, ...] + | ALL [PERMISSIONS] + } + ON + { + COLUMN <column_name> [, ...] IN TABLE <table_name> + | COLUMN <column_name> [, ...] IN FOREIGN TABLE <table_name> + } + TO <role_name> [, ...] + + -- Grant permissions on the view level + GRANT { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { VIEW <view_name> [, ...] + | ALL VIEWS IN SCHEMA <schema_name> [, ...]} + TO <role_name> [, ...] + + -- Grant permissions at the Service level: + GRANT { + {USAGE} [, ...] + | ALL [PERMISSIONS] } + ON { SERVICE <service_name> [, ...] + | ALL SERVICES IN SYSTEM } + TO <role_name> [, ...] + + -- Grant saved query permissions + GRANT + SELECT + | DDL + | USAGE + | ALL + ON SAVED QUERY <saved_query_name> [,...] + TO <role_name> [,...] + + -- Allows role2 to use permissions granted to role1 + GRANT <role1> [, ...] + TO <role2> + + -- Also allows role2 to grant role1 to other roles: + GRANT <role1> [, ...] + TO <role2> [,...] [WITH ADMIN OPTION] + + +REVOKE +------- + +.. code-block:: postgres + + -- Revoke permissions from all databases: + REVOKE { + SUPERUSER + | LOGIN + | PASSWORD '<password>' } + FROM <role_name> [, ...] + + -- Revoke permissions at the database level: + REVOKE { + CREATE + | CONNECT + | DDL + | SUPERUSER + | CREATE FUNCTION } [, ...] + | ALL [PERMISSIONS] + ON DATABASE <database_name> [, ...] + FROM <role_name> [, ...] + + -- Revoke permissions at the schema level: + REVOKE { + CREATE + | DDL + | USAGE + | SUPERUSER } [, ...] + | ALL [PERMISSIONS] + ON SCHEMA <schema_name> [, ...] + FROM <role_name> [, ...] + + -- Revoke permissions at the object level: + REVOKE { + SELECT + | INSERT + | DELETE + | DDL + | UPDATE } [, ...] + | ALL [PERMISSIONS] + ON {TABLE <table_name> [, ...] + | ALL TABLES IN SCHEMA <schema_name> [, ...]} + FROM <role_name> [, ...] + + -- Revoke permissions at the catalog level: + REVOKE SELECT + ON { CATALOG <catalog_name> [, ...] } + FROM <role_name> [, ...] + + -- Revoke permissions on the foreign table level: + + REVOKE { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { FOREIGN TABLE <table_name> [, ...] + | ALL FOREIGN TABLES IN SCHEMA <schema_name> [, ...]} + FROM <role_name> [, ...] + + -- Revoke function execution permission: + REVOKE { + ALL + | EXECUTE + | DDL } + ON FUNCTION <function_name> + FROM <role_name> + + -- Revoke permissions at the column level: + REVOKE + { + { SELECT + | DDL + | INSERT + | UPDATE } [, ...] + | ALL [PERMISSIONS] + } + ON + { + COLUMN <column_name> [, ...] IN TABLE <table_name> + | COLUMN <column_name> [, ...] IN FOREIGN TABLE <table_name> + } + FROM <role_name> [, ...] + + -- Revoke permissions on the view level + REVOKE { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { VIEW <view_name> [, ...] + | ALL VIEWS IN SCHEMA <schema_name> [, ...]} + FROM <role_name> [, ...] + + -- Revoke permissions at the Service level: + REVOKE { + {USAGE} [, ...]
+ | ALL [PERMISSIONS] } + ON { SERVICE <service_name> [, ...] + | ALL SERVICES IN SYSTEM } + FROM <role_name> [, ...] + + -- Revoke saved query permissions + REVOKE + SELECT + | DDL + | USAGE + | ALL + ON SAVED QUERY <saved_query_name> [,...] + FROM <role_name> [,...] + + -- Removes access to permissions in role1 by role2 + REVOKE <role1> [, ...] + FROM <role2> [, ...] + + -- Removes permissions to grant role1 to additional roles from role2 + REVOKE [ADMIN OPTION FOR] <role1> [, ...] + FROM <role2> [, ...] + +Altering Default Permissions +----------------------------- + +The default permissions system (See :ref:`alter_default_permissions`) can be used to automatically grant permissions to newly created objects (see the departmental example below for one way it can be used, and the short sketch at the end of the GRANT examples that follow). + +A default permissions rule looks for a schema being created, or a table (possibly by schema), and is able to grant any permission on that object to any role. This happens when the ``CREATE TABLE`` or ``CREATE SCHEMA`` statement is run. + + +.. code-block:: postgres + + ALTER DEFAULT PERMISSIONS FOR modifying_role_name + [IN schema_name, ...] + FOR { + SCHEMAS + | TABLES + | FOREIGN TABLES + | VIEWS + | COLUMNS + | SAVED_QUERIES + | FUNCTIONS + } + { grant_clause + | DROP grant_clause } + TO { modified_role_name | public } + + grant_clause ::= + GRANT + { CREATE FUNCTION + | SUPERUSER + | CONNECT + | USAGE + | SELECT + | INSERT + | DELETE + | DDL + | UPDATE + | EXECUTE + | ALL + } + +Examples +======== + +GRANT +-------------- + +Grant superuser privileges and login capability to a role: + +.. code-block:: postgres + + GRANT SUPERUSER, LOGIN TO role_name; + +Grant specific permissions on a database to a role: + +.. code-block:: postgres + + GRANT CREATE, CONNECT, DDL, SUPERUSER, CREATE FUNCTION ON DATABASE database_name TO role_name; + +Grant various permissions on a schema to a role: + +.. code-block:: postgres + + GRANT CREATE, USAGE, SUPERUSER ON SCHEMA schema_name TO role_name; + +Grant permissions on specific objects (table, view, foreign table, or catalog) to a role: + +.. code-block:: postgres + + GRANT SELECT, INSERT, DELETE, DDL, UPDATE ON TABLE schema_name.table_name TO role_name; + +Grant execute function permission to a role: + +.. code-block:: postgres + + GRANT EXECUTE ON FUNCTION function_name TO role_name; + +Grant column-level permissions to a role: + +.. code-block:: postgres + + GRANT SELECT, DDL ON COLUMN column_name IN TABLE schema_name.table_name TO role_name; + +Grant view-level permissions to a role: + +.. code-block:: postgres + + GRANT ALL PERMISSIONS ON VIEW "view_name" IN SCHEMA "schema_name" TO role_name; + +Grant usage permissions on a service to a role: + +.. code-block:: postgres + + GRANT USAGE ON SERVICE service_name TO role_name; + +Grant role2 the ability to use permissions granted to role1: + +.. code-block:: postgres + + GRANT role1 TO role2; + +Grant role2 the ability to grant role1 to other roles: + +.. code-block:: postgres + + GRANT role1 TO role2 WITH ADMIN OPTION;
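+ +Grant a default permission, so that objects created later automatically receive permissions. This is a minimal sketch; the role names ``etl_loader`` and ``analysts`` and the schema ``staging`` are hypothetical: + +.. code-block:: postgres + + -- Every new table etl_loader creates in schema staging + -- is automatically SELECTable by analysts + ALTER DEFAULT PERMISSIONS FOR etl_loader + IN staging + FOR TABLES GRANT SELECT TO analysts; + + +REVOKE +--------------- + +Revoke superuser privileges or login capability from a role: + +.. code-block:: postgres + + REVOKE SUPERUSER, LOGIN FROM role_name; + +Revoke specific permissions on a database from a role: + +.. code-block:: postgres + + REVOKE CREATE, CONNECT, DDL, SUPERUSER, CREATE FUNCTION ON DATABASE database_name FROM role_name; + +Revoke permissions on a schema from a role: + +..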
code-block:: postgres + + REVOKE CREATE, USAGE, SUPERUSER ON SCHEMA schema_name FROM role_name; + +Revoke permissions on specific objects (table, view, foreign table, or catalog) from a role: + +.. code-block:: postgres + + REVOKE SELECT, INSERT, DELETE, DDL, UPDATE ON TABLE schema_name.table_name FROM role_name; + +Revoke execute function permission from a role: + +.. code-block:: postgres + + REVOKE EXECUTE ON FUNCTION function_name FROM role_name; + +Revoke column-level permissions from a role: + +.. code-block:: postgres + + REVOKE SELECT, DDL ON COLUMN column_name IN TABLE schema_name.table_name FROM role_name; + +Revoke view-level permissions from a role: + +.. code-block:: postgres + + REVOKE ALL PERMISSIONS ON VIEW "view_name" IN SCHEMA "schema_name" FROM role_name; + +Revoke usage permissions on a service from a role: + +.. code-block:: postgres + + REVOKE USAGE ON SERVICE service_name FROM role_name; + +Remove access to permissions in role1 by role2: + +.. code-block:: postgres + + REVOKE role1 FROM role2; + +Remove permissions to grant role1 to additional roles from role2: + +.. code-block:: postgres + + REVOKE ADMIN OPTION FOR role1 FROM role2; + + diff --git a/operational_guides/creating_or_cloning_a_storage_cluster.rst b/operational_guides/creating_or_cloning_a_storage_cluster.rst index 0406bda77..e99511a5d 100644 --- a/operational_guides/creating_or_cloning_a_storage_cluster.rst +++ b/operational_guides/creating_or_cloning_a_storage_cluster.rst @@ -1,12 +1,13 @@ .. _creating_or_cloning_a_storage_cluster: -**************************************** +************************************ Creating or Cloning Storage Clusters -**************************************** +************************************ + When SQream DB is installed, it comes with a default storage cluster. This guide will help if you need a fresh storage cluster or a separate copy of an existing storage cluster. Creating a new storage cluster -===================================== +============================== SQream DB comes with a CLI tool, :ref:`sqream_storage_cli_reference`. This tool can be used to create a new empty storage cluster. @@ -22,10 +23,10 @@ This can also be written shorthand as ``SqreamStorage -C -r /home/rhendricks/rav This ``Setting cluster version...`` message confirms the creation of the cluster successfully. Tell SQream DB to use this storage cluster -=============================================== +========================================== Permanently setting the storage cluster setting -------------------------------------------------------- +----------------------------------------------- To permanently set the new cluster location, change the ``"cluster"`` path listed in the configuration file. @@ -72,12 +73,12 @@ should be changed to Now, the cluster should be restarted for the changes to take effect. Start a temporary SQream DB worker with a storage cluster -------------------------------------------------------------- +--------------------------------------------------------- Starting a SQream DB worker with a custom cluster path can be done in two ways: Using a configuration file (recommended) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Similar to the technique above, create a configuration file with the correct cluster path. 
Then, start ``sqreamd`` using the ``-config`` flag: @@ -86,7 +87,7 @@ Similar to the technique above, create a configuration file with the correct clu $ sqreamd -config config_file.json Using the command line parameters -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Use sqreamd's command line parameters to override the default storage cluster path: @@ -97,7 +98,7 @@ Use sqreamd's command line parameters to override the default storage cluster pa .. note:: sqreamd's command line parameters' order is ``sqreamd `` Copying an existing storage cluster -===================================== +=================================== Copying an existing storage cluster to another path may be useful for testing or troubleshooting purposes. diff --git a/operational_guides/delete.rst b/operational_guides/delete.rst deleted file mode 100644 index 24ab5a218..000000000 --- a/operational_guides/delete.rst +++ /dev/null @@ -1,214 +0,0 @@ -.. _delete_guide: - -*********************** -Deleting Data -*********************** - -SQream DB supports deleting data, but it's important to understand how this works and how to maintain deleted data. - -How does deleting in SQream DB work? -======================================== - -In SQream DB, when you run a delete statement, any rows that match the delete predicate will no longer be returned when running subsequent queries. -Deleted rows are tracked in a separate location, in *delete predicates*. - -After the delete statement, a separate process can be used to reclaim the space occupied by these rows, and to remove the small overhead that queries will have until this is done. - -Some benefits to this design are: - -#. Delete transactions complete quickly - -#. The total disk footprint overhead at any time for a delete transaction or cleanup process is small and bounded (while the system still supports low overhead commit, rollback and recovery for delete transactions). - - -Phase 1: Delete ---------------------------- - -.. TODO: isn't the delete cleanup able to complete a certain amount of work transactionally, so that you can do a massive cleanup in stages? - -.. TODO: our current best practices is to use a cron job with sqream sql to run the delete cleanup. we should document how to do this, we have customers with very different delete schedules so we can give a few extreme examples and when/why you'd use them - -When a :ref:`delete` statement is run, SQream DB records the delete predicates used. These predicates will be used to filter future statements on this table until all this delete predicate's matching rows have been physically cleaned up. - -This filtering process takes full advantage of SQream's zone map feature. - -Phase 2: Clean-up --------------------- - -The cleanup process is not automatic. This gives control to the user or DBA, and gives flexibility on when to run the clean up. - -Files marked for deletion during the logical deletion stage are removed from disk. This is achieved by calling both utility function commands: ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` sequentially. - -.. note:: - * :ref:`alter_table` and other DDL operations are blocked on tables that require clean-up. See more in the :ref:`concurrency_and_locks` guide. - * If the estimated time for a cleanup processs is beyond a threshold, you will get an error message about it. The message will explain how to override this limitation and run the process anywhere. - -Notes on data deletion -========================================= - -.. 
note:: - * If the number of deleted records crosses the threshold defined by the ``mixedColumnChunksThreshold`` parameter, the delete operation will be aborted. - * This is intended to alert the user that the large number of deleted records may result in a large number of mixed chuncks. - * To circumvent this alert, replace XXX with the desired number of records before running the delete operation: - -.. code-block:: postgres - - set mixedColumnChunksThreshold=XXX; - - -Deleting data does not free up space ------------------------------------------ - -With the exception of a full table delete (:ref:`TRUNCATE`), deleting data does not free up disk space. To free up disk space, trigger the cleanup process. - -``SELECT`` performance on deleted rows ----------------------------------------- - -Queries on tables that have deleted rows may have to scan data that hasn't been cleaned up. -In some cases, this can cause queries to take longer than expected. To solve this issue, trigger the cleanup process. - -Use ``TRUNCATE`` instead of ``DELETE`` ---------------------------------------- -For tables that are frequently emptied entirely, consider using :ref:`truncate` rather than :ref:`delete`. TRUNCATE removes the entire content of the table immediately, without requiring a subsequent cleanup to free up disk space. - -Cleanup is I/O intensive -------------------------------- - -The cleanup process actively compacts tables by writing a complete new version of column chunks with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes. - -Cleanup operations can create significant I/O load on the database. Consider this when planning the best time for the cleanup process. - -If this is an issue with your environment, consider using ``CREATE TABLE AS`` to create a new table and then rename and drop the old table. - - -Example -============= - -Deleting values from a table ------------------------------- - -.. code-block:: psql - - farm=> SELECT * FROM cool_animals; - 1,Dog ,7 - 2,Possum ,3 - 3,Cat ,5 - 4,Elephant ,6500 - 5,Rhinoceros ,2100 - 6,\N,\N - - 6 rows - - farm=> DELETE FROM cool_animals WHERE weight > 1000; - executed - - farm=> SELECT * FROM cool_animals; - 1,Dog ,7 - 2,Possum ,3 - 3,Cat ,5 - 6,\N,\N - - 4 rows - -Deleting values based on more complex predicates ---------------------------------------------------- - -.. code-block:: psql - - farm=> SELECT * FROM cool_animals; - 1,Dog ,7 - 2,Possum ,3 - 3,Cat ,5 - 4,Elephant ,6500 - 5,Rhinoceros ,2100 - 6,\N,\N - - 6 rows - - farm=> DELETE FROM cool_animals WHERE weight > 1000; - executed - - farm=> SELECT * FROM cool_animals; - 1,Dog ,7 - 2,Possum ,3 - 3,Cat ,5 - 6,\N,\N - - 4 rows - -Identifying and cleaning up tables ---------------------------------------- - -List tables that haven't been cleaned up -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - farm=> SELECT t.table_name FROM sqream_catalog.delete_predicates dp - JOIN sqream_catalog.tables t - ON dp.table_id = t.table_id - GROUP BY 1; - cool_animals - - 1 row - -Identify predicates for clean-up -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp - JOIN sqream_catalog.tables t - ON dp.table_id = t.table_id - WHERE t.table_name = 'cool_animals'; - weight > 1000 - - 1 row - -Triggering a cleanup -^^^^^^^^^^^^^^^^^^^^^^ - -.. 
code-block:: psql - - -- Chunk reorganization (aka SWEEP) - farm=> SELECT CLEANUP_CHUNKS('public','cool_animals'); - executed - - -- Delete leftover files (aka VACUUM) - farm=> SELECT CLEANUP_EXTENTS('public','cool_animals'); - executed - - - farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp - JOIN sqream_catalog.tables t - ON dp.table_id = t.table_id - WHERE t.table_name = 'cool_animals'; - - 0 rows - - - -Best practices for data deletion -===================================== - -* Run ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` after large ``DELETE`` operations. - -* When deleting large proportions of data from very large tables, consider running a ``CREATE TABLE AS`` operation instead, then rename and drop the original table. - -* Avoid killing ``CLEANUP_EXTENTS`` operations after they've started. - -* SQream DB is optimised for time-based data. When data is naturally ordered by a date or timestamp, deleting based on those columns will perform best. For more information, see our :ref:`time based data management guide`. - - - -.. soft update concept - -.. delete cleanup and it's properties. automatic/manual, in transaction or background - -.. automatic background gives fast delete, minimal transaction overhead, -.. small cost to queries until background reorganised - -.. when does delete use the metadata effectively - -.. more examples - diff --git a/operational_guides/delete_guide.rst b/operational_guides/delete_guide.rst new file mode 100644 index 000000000..bec0f296d --- /dev/null +++ b/operational_guides/delete_guide.rst @@ -0,0 +1,240 @@ +.. _delete_guide: + +************* +Deleting Data +************* + +When working with a table in a database, deleting data typically involves removing rows, although it can also involve removing columns. The process for deleting data involves first deleting the desired content, followed by a cleanup operation that reclaims the space previously occupied by the deleted data. This process is further explained below. + +The ``DELETE`` statement is used to remove rows that match a specified predicate, thereby preventing them from being included in subsequent queries. For example, the following statement deletes all rows in the ``cool_animals`` table where the weight of the animal is greater than 1000 weight units: + +.. code-block:: psql + + DELETE FROM cool_animals WHERE weight > 1000; + +By using the WHERE clause in the DELETE statement, you can specify a condition or predicate that determines which rows should be deleted from the table. In this example, the predicate "weight > 1000" specifies that only rows with an animal weight greater than 1000 should be deleted. + +.. contents:: + :local: + :depth: 1 + +The Deletion Process +==================== + +When you delete rows from a SQL database, the actual deletion process occurs in two steps: + +* **Marking for Deletion**: When you issue a ``DELETE`` statement to remove one or more rows from a table, the database marks these rows for deletion. These rows are not actually removed from the database immediately, but are instead temporarily ignored when you run any query. + + :: + +* **Clean-up**: Once the rows have been marked for deletion, you need to trigger a clean-up operation to permanently remove them from the database. During the clean-up process, the database frees up the disk space previously occupied by the deleted rows. To remove all files associated with the deleted rows, you can use the utility function commands ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS``. 
These commands should be run sequentially to ensure that these files are removed from disk. + +If you want to delete all rows from a table, you can use the :ref:`TRUNCATE` command, which deletes all rows in a table and frees up the associated disk space. + + +Usage Notes +=========== + +General Notes +------------- + +* The :ref:`alter_table` command and other DDL operations are blocked on tables that require clean-up. If the estimated clean-up time exceeds the permitted threshold, an error message is displayed describing how to override the threshold limitation. For more information, see :ref:`concurrency_and_locks`. + + :: + +* If the number of deleted records exceeds the threshold defined by the ``mixedColumnChunksThreshold`` parameter, the delete operation is aborted. This alerts users that the large number of deleted records may result in a large number of mixed chunks. To circumvent this alert, use the following syntax (replacing ``XXX`` with the desired number of records) before running the delete operation: + + .. code-block:: postgres + + set mixedColumnChunksThreshold=XXX; + + +Clean-Up Operations Are I/O Intensive +------------------------------------- +The clean-up process reduces table size by removing all unused space from column chunks. While this reduces query time, the operation itself can take a long time and occupies additional disk space for the new copy of the table until it completes. + +.. tip:: Because clean-up operations can create significant I/O load on your database, consider scheduling them for off-peak hours. + +If this is an issue with your environment, consider using ``CREATE TABLE AS`` to create a new table and then rename and drop the old table. + +Examples +======== + +To follow along with the examples in this section, create the following table: + + .. code-block:: psql + + CREATE OR REPLACE TABLE cool_animals ( + animal_id INT, + animal_name TEXT, + animal_weight FLOAT + ); + +Insert the following content: + + .. code-block:: psql + + INSERT INTO cool_animals (animal_id, animal_name, animal_weight) + VALUES + (1, 'Dog', 7), + (2, 'Possum', 3), + (3, 'Cat', 5), + (4, 'Elephant', 6500), + (5, 'Rhinoceros', 2100), + (6, NULL, NULL); + +View table content: + +.. code-block:: psql + + farm=> SELECT * FROM cool_animals; + + Return: + + animal_id | animal_name | animal_weight + ------------+------------------+-------------------- + 1 | Dog | 7 + 2 | Possum | 3 + 3 | Cat | 5 + 4 | Elephant | 6500 + 5 | Rhinoceros | 2100 + 6 | NULL | NULL + +You can now work through the following examples: + +.. contents:: + :local: + :depth: 1 + +Deleting Rows from a Table +-------------------------- + +1. Delete rows from the table: + +.. code-block:: psql + + farm=> DELETE FROM cool_animals WHERE animal_weight > 1000; + +2. Display the table: + +.. code-block:: psql + + farm=> SELECT * FROM cool_animals; + + Return: + + animal_id | animal_name | animal_weight + ------------+------------------+-------------- + 1 | Dog | 7 + 2 | Possum | 3 + 3 | Cat | 5 + 6 | NULL | NULL + + +Deleting Values Based on Complex Predicates +------------------------------------------- + +1. Continuing from the previous example, delete additional rows from the table: + +.. code-block:: psql + + farm=> DELETE FROM cool_animals + WHERE animal_weight < 100 AND animal_name LIKE '%o%'; + +2. Display the table: + +.. 
code-block:: psql + + farm=> SELECT * FROM cool_animals; + + Return: + + animal_id | animal_name | animal_weight + ------------+------------------+-------------------- + 3 | Cat | 5 + 6 | NULL | NULL + +Identifying and Cleaning Up Tables +--------------------------------------- + +Listing tables that have not been cleaned up: + +.. code-block:: psql + + farm=> SELECT t.table_name FROM sqream_catalog.delete_predicates dp + JOIN sqream_catalog.tables t + ON dp.table_id = t.table_id + GROUP BY 1; + cool_animals + + 1 row + +Identifying predicates for Clean-Up: + +.. code-block:: psql + + farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp + JOIN sqream_catalog.tables t + ON dp.table_id = t.table_id + WHERE t.table_name = 'cool_animals'; + animal_weight > 1000 + + 1 row + + +Triggering a Clean-Up +^^^^^^^^^^^^^^^^^^^^^^ + +When running the clean-up operation, you need to specify two parameters: ``schema_name`` and ``table_name``. Note that both parameters are case-sensitive and will not match upper-cased schema or table names. + +Running a ``CLEANUP_CHUNKS`` command (also known as ``SWEEP``) to reorganize the chunks: + + .. code-block:: psql + + farm=> SELECT CLEANUP_CHUNKS('<schema_name>','<table_name>'); + +Running a ``CLEANUP_EXTENTS`` command (also known as ``VACUUM``) to delete the leftover files: + + .. code-block:: psql + + farm=> SELECT CLEANUP_EXTENTS('<schema_name>','<table_name>'); + + +To run a clean-up operation without worrying about uppercase and lowercase letters, you can add a third argument, set to ``true`` in the following examples, which allows the operation to accept both lowercase and upper-cased schema and table names: + + .. code-block:: psql + + farm=> SELECT CLEANUP_CHUNKS('<schema_name>','<table_name>', true); + + .. code-block:: psql + + farm=> SELECT CLEANUP_EXTENTS('<schema_name>','<table_name>', true); + + +To verify that the delete predicates have been removed: + + .. code-block:: psql + + farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp + JOIN sqream_catalog.tables t + ON dp.table_id = t.table_id + WHERE t.table_name = '<table_name>';
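+ +For the ``cool_animals`` walkthrough above (a minimal sketch, assuming the table was created in the ``public`` schema), the full clean-up sequence is: + + .. code-block:: psql + + farm=> SELECT CLEANUP_CHUNKS('public','cool_animals'); + + farm=> SELECT CLEANUP_EXTENTS('public','cool_animals'); + + farm=> SELECT delete_predicate FROM sqream_catalog.delete_predicates dp + JOIN sqream_catalog.tables t + ON dp.table_id = t.table_id + WHERE t.table_name = 'cool_animals'; + + 0 rows + +Best Practice +============= + + +* After running large ``DELETE`` operations, run ``CLEANUP_CHUNKS`` and ``CLEANUP_EXTENTS`` to improve performance and free up space. These commands remove empty chunks and extents, respectively, and can help prevent fragmentation of the table. + + :: + +* If you need to delete large segments of data from very large tables, consider using a ``CREATE TABLE AS`` operation instead. This involves creating a new table with the desired data and then renaming and dropping the original table. This approach can be faster and more efficient than running a large ``DELETE`` operation, especially if you don't need to preserve any data in the original table. + + :: + +* Avoid interrupting or killing ``CLEANUP_EXTENTS`` operations that are in progress. These operations can take a while to complete, especially if the table is very large or has a lot of fragmentation, but interrupting them can cause data inconsistencies or other issues. + + :: + +* SQream is optimized for time-based data, which means that data that is naturally ordered according to date or timestamp fields will generally perform better. If you need to delete rows from such tables, consider using the time-based columns in your ``DELETE`` predicates to improve performance. diff --git a/operational_guides/exporting_data.rst b/operational_guides/exporting_data.rst deleted file mode 100644 index 402887da7..000000000 --- a/operational_guides/exporting_data.rst +++ /dev/null @@ -1,15 +0,0 @@ -..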
_exporting_data: - -*********************** -Exporting Data -*********************** -You can export data from SQream, which you may want to do for the following reasons: - - -* To use data in external tables. See `Working with External Data `_. -* To share data with other clients or consumers with different systems. -* To copy data into another SQream cluster. - -SQream provides the following methods for exporting data: - -* Copying data from a SQream database table or query to another file - See `COPY TO `_. \ No newline at end of file diff --git a/operational_guides/external_data.rst b/operational_guides/external_data.rst index 98d157ab2..4134457a3 100644 --- a/operational_guides/external_data.rst +++ b/operational_guides/external_data.rst @@ -1,15 +1,23 @@ .. _external_data: -********************************** +************************** Working with External Data -********************************** +************************** -SQream DB supports external data sources for use with :ref:`external_tables`, :ref:`copy_from`, and :ref:`copy_to`. +SQreamDB supports the following external data sources: -.. toctree:: - :maxdepth: 1 - :titlesonly: +:ref:`s3` - s3 - hdfs +:ref:`hdfs` + +:ref:`gcp` + +:ref:`azure` + +For more information, see the following: + +:ref:`foreign_tables` + +:ref:`copy_from` +:ref:`copy_to` \ No newline at end of file diff --git a/operational_guides/external_tables.rst b/operational_guides/foreign_tables.rst similarity index 51% rename from operational_guides/external_tables.rst rename to operational_guides/foreign_tables.rst index 005dc961f..80da99170 100644 --- a/operational_guides/external_tables.rst +++ b/operational_guides/foreign_tables.rst @@ -1,81 +1,89 @@ -.. _external_tables: +.. _foreign_tables: -*********************** -External Tables -*********************** -External tables can be used to run queries directly on data without inserting it into SQream DB first. -SQream DB supports read only external tables, so you can query from external tables, but you cannot insert to them, or run deletes or updates on them. -Running queries directly on external data is most effectively used for things like one off querying. If you will be repeatedly querying data, the performance will usually be better if you insert the data into SQream DB first. -Although external tables can be used without inserting data into SQream DB, one of their main use cases is to help with the insertion process. An insert select statement on an external table can be used to insert data into SQream using the full power of the query engine to perform ETL. +************** +Foreign Tables +************** -.. contents:: In this topic: +Foreign tables can be used to run queries directly on data without inserting it into SQreamDB first. +SQreamDB supports read-only foreign tables so that you can query from foreign tables, but you cannot insert to them, or run deletes or updates on them. + +Running queries directly on foreign data is most effectively used for one-off querying. If you are repeatedly querying data, the performance will usually be better if you insert the data into SQreamDB first. + +Although foreign tables can be used without inserting data into SQreamDB, one of their main use cases is to help with the insertion process. An insert select statement on a foreign table can be used to insert data into SQream using the full power of the query engine to perform ETL. + +.. contents:: :local: + :depth: 1 -What kind of data is supported? 
-===================================== -SQream DB supports external tables over: +Supported Data Formats +====================== -* text files (e.g. CSV, PSV, TSV) -* ORC +SQreamDB supports foreign tables using the following file formats: + +* Text: CSV, TSV, and PSV * Parquet +* ORC +* Avro +* JSON + +Supported Data Staging +====================== -What kind of data staging is supported? -============================================ -SQream DB can stage data from: +SQreamDB can stage data from: -* a local filesystem (e.g. ``/mnt/storage/....``) -* :ref:`s3` buckets (e.g. ``s3://pp-secret-bucket/users/*.parquet``) -* :ref:`hdfs` (e.g. ``hdfs://hadoop-nn.piedpiper.com/rhendricks/*.csv``) +* A local filesystem (e.g. ``/mnt/storage/....``) +* :ref:`s3` buckets +* :ref:`hdfs` -Using external tables - a practical example -============================================== -Use an external table to stage data before loading from CSV, Parquet or ORC files. +Using Foreign Tables +==================== -Planning for data staging -------------------------------- -For the following examples, we will want to interact with a CSV file. Here's a peek at the table contents: +Use a foreign table to stage data before loading from CSV, Parquet or ORC files. -.. csv-table:: nba.csv +Planning for Data Staging +------------------------- - :file: nba-t10.csv - :widths: auto - :header-rows: 1 +For the following examples, we will interact with a CSV file. The file is stored on :ref:`s3`, at ``s3://sqream-demo-data/nba_players.csv``. -We will make note of the file structure, to create a matching ``CREATE_EXTERNAL_TABLE`` statement. +We will make note of the file structure, to create a matching ``CREATE FOREIGN TABLE`` statement. -Creating the external table ----------------------------- -Based on the source file structure, we we :ref:`create an external table` with the appropriate structure, and point it to the file. +Creating a Foreign Table +------------------------ + +Based on the source file structure, we :ref:`create a foreign table` with the appropriate structure, and point it to the file. .. code-block:: postgres - CREATE EXTERNAL TABLE nba + CREATE FOREIGN TABLE nba ( - Name varchar(40), - Team varchar(40), + Name varchar, + Team varchar, Number tinyint, - Position varchar(2), + Position varchar, Age tinyint, - Height varchar(4), + Height varchar, Weight real, - College varchar(40), + College varchar, Salary float ) - USING FORMAT CSV -- Text file - WITH PATH 's3://sqream-demo-data/nba_players.csv' - RECORD DELIMITER '\r\n'; -- DOS delimited file + WRAPPER csv_fdw + OPTIONS + ( LOCATION = 's3://sqream-demo-data/nba_players.csv', + RECORD_DELIMITER = '\r\n' -- DOS delimited file + ); + +The file format in this case is CSV, and it is stored as an Amazon Web Services object (if the path is on :ref:`hdfs`, change the URI accordingly). -The file format in this case is CSV, and it is stored as an :ref:`s3` object (if the path is on :ref:`hdfs`, change the URI accordingly). We also took note that the record delimiter was a DOS newline (``\r\n``). -Querying external tables ------------------------------- -Let's peek at the data from the external table: +Querying Foreign Tables +----------------------- .. 
code-block:: psql - t=> SELECT * FROM nba LIMIT 10; + SELECT * FROM nba LIMIT 10; + name | team | number | position | age | height | weight | college | salary --------------+----------------+--------+----------+-----+--------+--------+-------------------+--------- Avery Bradley | Boston Celtics | 0 | PG | 25 | 6-2 | 180 | Texas | 7730337 @@ -89,16 +97,17 @@ Let's peek at the data from the external table: Terry Rozier | Boston Celtics | 12 | PG | 22 | 6-2 | 190 | Louisville | 1824360 Marcus Smart | Boston Celtics | 36 | PG | 22 | 6-4 | 220 | Oklahoma State | 3431040 -Modifying data from staging -------------------------------- -One of the main reasons for staging data is to examine the contents and modify them before loading them. -Assume we are unhappy with weight being in pounds, because we want to use kilograms instead. We can apply the transformation as part of a query: +Modifying Data from Staging +--------------------------- + +One of the main reasons for staging data is to examine the content and modify it before loading. +Assume we are unhappy with weight being in pounds because we want to use kilograms instead. We can apply the transformation as part of a query: .. code-block:: psql - t=> SELECT name, team, number, position, age, height, (weight / 2.205) as weight, college, salary - . FROM nba - . ORDER BY weight; + SELECT name, team, number, position, age, height, (weight / 2.205) as weight, college, salary + FROM nba + ORDER BY weight; name | team | number | position | age | height | weight | college | salary -------------------------+------------------------+--------+----------+-----+--------+----------+-----------------------+--------- @@ -113,23 +122,23 @@ Assume we are unhappy with weight being in pounds, because we want to use kilogr Cristiano Felicio | Chicago Bulls | 6 | PF | 23 | 6-10 | 124.7166 | | 525093 [...] -Now, if we're happy with the results, we can convert the staged external table to a standard table +Now, if we're happy with the results, we can convert the staged foreign table to a standard table -Converting an external table to a standard database table ---------------------------------------------------------------- +Converting a Foreign Table to a Standard Database Table +------------------------------------------------------- -:ref:`create_table_as` can be used to materialize an external table into a regular table. +:ref:`create_table_as` can be used to materialize a foreign table into a regular table. -.. tip:: If you intend to use the table multiple times, convert the external table to a standard table. +.. tip:: If you intend to use the table multiple times, convert the foreign table to a standard table. .. code-block:: psql - t=> CREATE TABLE real_nba AS - . SELECT name, team, number, position, age, height, (weight / 2.205) as weight, college, salary - . FROM nba - . 
ORDER BY weight; - executed - t=> SELECT * FROM real_nba LIMIT 5; + CREATE TABLE real_nba AS + SELECT name, team, number, position, age, height, (weight / 2.205) as weight, college, salary + FROM nba + ORDER BY weight; + + SELECT * FROM real_nba LIMIT 5; name | team | number | position | age | height | weight | college | salary -----------------+------------------------+--------+----------+-----+--------+----------+-------------+--------- @@ -139,17 +148,20 @@ Converting an external table to a standard database table Jusuf Nurkic | Denver Nuggets | 23 | C | 21 | 7-0 | 126.9841 | | 1842000 Andre Drummond | Detroit Pistons | 0 | C | 22 | 6-11 | 126.5306 | Connecticut | 3272091 -Error handling and limitations -================================== -* Error handling in external tables is limited. Any error that occurs during source data parsing will result in the statement aborting. +Error Handling and Limitations +============================== + +* Error handling in foreign tables is limited. Any error that occurs during source data parsing will result in the statement aborting. * - External tables are logical and do not contain any data, their structure is not verified or enforced until a query uses the table. + Foreign tables are logical and do not contain any data, their structure is not verified or enforced until a query uses the table. For example, a CSV with the wrong delimiter may cause a query to fail, even though the table has been created successfully: .. code-block:: psql - t=> SELECT * FROM nba; - master=> select * from nba; + SELECT * FROM nba; + + SELECT * FROM nba; Record delimiter mismatch during CSV parsing. User defined line delimiter \n does not match the first delimiter \r\n found in s3://sqream-demo-data/nba.csv -* Since the data for an external table is not stored in SQream DB, it can be changed or removed at any time by an external process. As a result, the same query can return different results each time it runs against an external table. Similarly, a query might fail if the external data is moved, removed, or has changed structure. + +* Since the data for a foreign table is not stored in SQreamDB, it can be changed or removed at any time by an external process. As a result, the same query can return different results each time it runs against a foreign table. Similarly, a query might fail if the external data is moved, removed, or has changed structure. diff --git a/operational_guides/hardware_guide.rst b/operational_guides/hardware_guide.rst deleted file mode 100644 index d66797223..000000000 --- a/operational_guides/hardware_guide.rst +++ /dev/null @@ -1,202 +0,0 @@ -.. _hardware_guide: - -*********************** -Hardware Guide -*********************** - -This guide describes the SQream reference architecture, emphasizing the benefits to the technical audience, and provides guidance for end-users on selecting the right configuration for a SQream installation. - - -.. rubric:: Need help? - -This page is intended as a "reference" to suggested hardware. However, different workloads require different solution sizes. SQream's experienced customer support has the experience to advise on these matters to ensure the best experience. - -Visit `SQream's support portal `_ for additional support. - -A SQream Cluster -============================ - -SQream recommends rackmount servers by server manufacturers Dell, Lenovo, HP, Cisco, Supermicro, IBM, and others. 
- -A typical SQream cluster includes one or more nodes, consisting of - -* Two-socket enterprise processors, like the Intel® Xeon® Gold processor family or an IBM® POWER9 processors, providing the high performance required for compute-bound database workloads. - -* NVIDIA Tesla GPU accelerators, with up to 5,120 CUDA and Tensor cores, running on PCIe or fast NVLINK busses, delivering high core count, and high-throughput performance on massive datasets - -* High density chassis design, offering between 2 and 4 GPUs in a 1U, 2U, or 3U package, for best-in-class performance per cm\ :sup:`2`. - -Single-Node Cluster Example ------------------------------------ - -A single-node SQream cluster can handle between 1 and 8 concurrent users, with up to 1PB of data storage (when connected via NAS). - -An average single-node cluster can be a rackmount server or workstation, containing the following components: - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Component - - Type - * - Server - - Dell R750, Dell R940xa, HP ProLiant DL380 Gen10 or similar (Intel only) - * - Processor - - 2x Intel Xeon Gold 6240 (18C/36HT) 2.6GHz or similar - * - RAM - - 1.5 TB - * - Onboard storage - - - * 2x 960GB SSD 2.5in hot plug for OS, RAID1 - * 2x 2TB SSD or NVMe, for temporary spooling, RAID1 - * 10x 3.84TB SSD 2.5in Hot plug for storage, RAID6 - - * - GPU - - 2x A100 NVIDIA - * - Operating System - - Red Hat Enterprise Linux v7.x or CentOS v7.x or Amazon Linux - -.. note:: If you are using internal storage, your volumes must be formatted as xfs. - -In this system configuration, SQream can store about 200TB of raw data (assuming average compression ratio and ~50TB of usable raw storage). - -If a NAS is used, the 14x SSD drives can be omitted, but SQream recommends 2TB of local spool space on SSD or NVMe drives. - -Multi-Node Cluster Example ------------------------------------ - -Multi-node clusters can handle any number of concurrent users. A typical SQream cluster relies on several GPU-enabled servers and shared storage connected over a network fabric, such as InfiniBand EDR, 40GbE, or 100GbE. - -The following table shows SQream's recommended hardware specifications: - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Component - - Type - * - Server - - Dell R750, Dell R940xa, HP ProLiant DL380 Gen10 or similar (Intel only) - * - Processor - - 2x Intel Xeon Gold 6240 (18C/36HT) 2.6GHz or similar - * - RAM - - 2 TB - * - Onboard storage - - - * 2x 960GB SSD 2.5in hot plug for OS, RAID1 - * 2x 2TB SSD or NVMe, for temporary spooling, RAID1 - * - External Storage - - - * Mellanox Connectx5/6 100G NVIDIA Network Card (if applicable) or other high speed network card minimum 40G compatible to customer’s infrastructure - * 50 TB (NAS connected over GPFS, Lustre, or NFS) GPFS recommended - * - GPU - - 2x A100 NVIDIA - * - Operating System - - Red Hat Enterprise Linux v7.x or CentOS v7.x or Amazon Linux - -.. note:: With a NAS connected over GPFS, Lustre, or NFS, each SQream worker can read data at up to 5GB/s. - -SQream Studio Server Example ------------------------------------ -The following table shows SQream's recommended Studio server specifications: - -.. 
list-table:: - :widths: auto - :header-rows: 1 - - * - Component - - Type - * - Server - - Physical or virtual machine - * - Processor - - 1x Intel Core i7 - * - RAM - - 16 GB - * - Onboard storage - - 50 GB SSD 2.5in Hot plug for OS, RAID1 - * - Operating System - - Red Hat Enterprise Linux v7.x or CentOS v7.x - - - - -Cluster Design Considerations -==================================== - -* In a SQream installation, the storage and compute are logically separated. While they may reside on the same machine in a standalone installation, they may also reside on different hosts, providing additional flexibility and scalability. - - :: - -* SQream uses all resources in a machine, including CPU, RAM, and GPU to deliver the best performance. At least 256GB of RAM per physical GPU is recommended. - - :: - -* Local disk space is required for good temporary spooling performance, particularly when performing intensive operations exceeding the available RAM, such as sorting. SQream recommends an SSD or NVMe drive in RAID 1 configuration with about twice the RAM size available for temporary storage. This can be shared with the operating system drive if necessary. - - :: - -* When using SAN or NAS devices, SQream recommends approximately 5GB/s of burst throughput from storage per GPU. - -Balancing Cost and Performance --------------------------------- -Prior to designing and deploying a SQream cluster, a number of important factors must be considered. - -The **Balancing Cost and Performance** section provides a breakdown of deployment details to ensure that this installation exceeds or meets the stated requirements. The rationale provided includes the necessary information for modifying configurations to suit the customer use-case scenario, as shown in the following table: - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Component - - Value - * - Compute - CPU - - Balance price and performance - * - Compute – GPU - - Balance price with performance and concurrency - * - Memory – GPU RAM - - Balance price with concurrency and performance. - * - Memory - RAM - - Balance price and performance - * - Operating System - - Availability, reliability, and familiarity - * - Storage - - Balance price with capacity and performance - * - Network - - Balance price and performance - -CPU Compute -------------- - -SQream relies on multi-core Intel Gold Xeon processors or IBM POWER9 processors, and recommends a dual-socket machine populated with CPUs with 18C/36HT or better. While a higher core count may not necessarily affect query performance, more cores will enable higher concurrency and better load performance. - -GPU Compute and RAM -------------------------- - -The NVIDIA Tesla range of high-throughput GPU accelerators provides the best performance for enterprise environments. Most cards have ECC memory, which is crucial for delivering correct results every time. SQream recommends the NVIDIA Tesla V100 32GB or NVIDIA Tesla A100 40GB GPU for best performance and highest concurrent user support. - -GPU RAM, sometimes called GRAM or VRAM, is used for processing queries. It is possible to select GPUs with less RAM, like the NVIDIA Tesla V100 16GB or P100 16GB, or T4 16GB. However, the smaller GPU RAM results in reduced concurrency, as the GPU RAM is used extensively in operations like JOINs, ORDER BY, GROUP BY, and all SQL transforms. - -RAM --------- - -SQream requires using **Error-Correcting Code memory (ECC)**, standard on most enterprise servers. 
Large amounts of memory are required for improved performance for heavy external operations, such as sorting and joining. - -SQream recommends at least 256GB of RAM per GPU on your machine. - -Operating System ---------------------- -SQream can run on the following 64-bit Linux operating systems: - - * Red Hat Enterprise Linux (RHEL) v7 - * CentOS v7 - * Amazon Linux 2018.03 - * Ubuntu v16.04 LTS, v18.04 LTS - * Other Linux distributions may be supported via nvidia-docker - -Storage ------------ -For clustered scale-out installations, SQream relies on NAS/SAN storage. For stand-alone installations, SQream relies on redundant disk configurations, such as RAID 5, 6, or 10. These RAID configurations replicate blocks of data between disks to avoid data loss or system unavailability. - -SQream recommends using enterprise-grade SAS SSD or NVMe drives. For a 32-user configuration, the number of GPUs should roughly match the number of users. SQream recommends 1 Tesla V100 or A100 GPU per 2 users, for full, uninterrupted dedicated access. - -Download the full `SQream Reference Architecture `_ document. diff --git a/operational_guides/hdfs.rst b/operational_guides/hdfs.rst deleted file mode 100644 index 274926e36..000000000 --- a/operational_guides/hdfs.rst +++ /dev/null @@ -1,252 +0,0 @@ -.. _hdfs: - -.. _back_to_top_hdfs: - -Using SQream in an HDFS Environment -======================================= - -.. _configuring_an_hdfs_environment_for_the_user_sqream: - -Configuring an HDFS Environment for the User **sqream** ----------------------------------------------------------- - -This section describes how to configure an HDFS environment for the user **sqream** and is only relevant for users with an HDFS environment. - -**To configure an HDFS environment for the user sqream:** - -1. Open your **bash_profile** configuration file for editing: - - .. code-block:: console - - $ vim /home/sqream/.bash_profile - -.. - Comment: - see below; do we want to be a bit more specific on what changes we're talking about? - - .. code-block:: console - - $ #PATH=$PATH:$HOME/.local/bin:$HOME/bin - - $ #export PATH - - $ # PS1 - $ #MYIP=$(curl -s -XGET "http://ip-api.com/json" | python -c 'import json,sys; jstr=json.load(sys.stdin); print jstr["query"]') - $ #PS1="\[\e[01;32m\]\D{%F %T} \[\e[01;33m\]\u@\[\e[01;36m\]$MYIP \[\e[01;31m\]\w\[\e[37;36m\]\$ \[\e[1;37m\]" - - $ SQREAM_HOME=/usr/local/sqream - $ export SQREAM_HOME - - $ export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk - $ export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop - $ export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob` - $ export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native - $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR - - - $ PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin - $ export PATH - -3. Verify that the edits have been made: - - .. code-block:: console - - source /home/sqream/.bash_profile - -4. Check if you can access Hadoop from your machine: - - .. code-block:: console - - $ hadoop fs -ls hdfs://:8020/ - -.. - Comment: - - **NOTICE:** If you cannot access Hadoop from your machine because it uses Kerberos, see `Connecting a SQream Server to Cloudera Hadoop with Kerberos `_ - - -5. Verify that an HDFS environment exists for SQream services: - - .. code-block:: console - - $ ls -l /etc/sqream/sqream_env.sh - -.. _step_6: - - -6. If an HDFS environment does not exist for SQream services, create one (sqream_env.sh): - - .. 
code-block:: console - - $ #!/bin/bash - - $ SQREAM_HOME=/usr/local/sqream - $ export SQREAM_HOME - - $ export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk - $ export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop - $ export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob` - $ export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native - $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR - - - $ PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin - $ export PATH - -:ref:`Back to top ` - - -.. _authenticate_hadoop_servers_that_require_kerberos: - -Authenticating Hadoop Servers that Require Kerberos ---------------------------------------------------- - -If your Hadoop server requires Kerberos authentication, do the following: - -1. Create a principal for the user **sqream**. - - .. code-block:: console - - $ kadmin -p root/admin@SQ.COM - $ addprinc sqream@SQ.COM - -2. If you do not know yor Kerberos root credentials, connect to the Kerberos server as a root user with ssh and run **kadmin.local**: - - .. code-block:: console - - $ kadmin.local - - Running **kadmin.local** does not require a password. - -3. If a password is not required, change your password to **sqream@SQ.COM**. - - .. code-block:: console - - $ change_password sqream@SQ.COM - -4. Connect to the hadoop name node using ssh: - - .. code-block:: console - - $ cd /var/run/cloudera-scm-agent/process - -5. Check the most recently modified content of the directory above: - - .. code-block:: console - - $ ls -lrt - -5. Look for a recently updated folder containing the text **hdfs**. - -The following is an example of the correct folder name: - - .. code-block:: console - - cd -hdfs- - - This folder should contain a file named **hdfs.keytab** or another similar .keytab file. - - - -.. - Comment: - Does "something" need to be replaced with "file name" - - -6. Copy the .keytab file to user **sqream's** Home directory on the remote machines that you are planning to use Hadoop on. - -7. Copy the following files to the **sqream sqream@server:/hdfs/hadoop/etc/hadoop:** directory: - - * core-site.xml - * hdfs-site.xml - -8. Connect to the sqream server and verify that the .keytab file's owner is a user sqream and is granted the correct permissions: - - .. code-block:: console - - $ sudo chown sqream:sqream /home/sqream/hdfs.keytab - $ sudo chmod 600 /home/sqream/hdfs.keytab - -9. Log into the sqream server. - -10. Log in as the user **sqream**. - -11. Navigate to the Home directory and check the name of a Kerberos principal represented by the following .keytab file: - - .. code-block:: console - - $ klist -kt hdfs.keytab - - The following is an example of the correct output: - - .. 
code-block:: console - - $ sqream@Host-121 ~ $ klist -kt hdfs.keytab - $ Keytab name: FILE:hdfs.keytab - $ KVNO Timestamp Principal - $ ---- ------------------- ------------------------------------------------------ - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - $ 5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM - -12. Verify that the hdfs service named **hdfs/nn1@SQ.COM** is shown in the generated output above. - -13. Run the following: - - .. code-block:: console - - $ kinit -kt hdfs.keytab hdfs/nn1@SQ.COM - - 13. Check the output: - - .. code-block:: console - - $ klist - - The following is an example of the correct output: - - .. code-block:: console - - $ Ticket cache: FILE:/tmp/krb5cc_1000 - $ Default principal: sqream@SQ.COM - $ - $ Valid starting Expires Service principal - $ 09/16/2020 13:44:18 09/17/2020 13:44:18 krbtgt/SQ.COM@SQ.COM - -14. List the files located at the defined server name or IP address: - - .. code-block:: console - - $ hadoop fs -ls hdfs://:8020/ - -15. Do one of the following: - - * If the list below is output, continue with Step 16. - * If the list is not output, verify that your environment has been set up correctly. - -If any of the following are empty, verify that you followed :ref:`Step 6 ` in the **Configuring an HDFS Environment for the User sqream** section above correctly: - - .. code-block:: console - - $ echo $JAVA_HOME - $ echo $SQREAM_HOME - $ echo $CLASSPATH - $ echo $HADOOP_COMMON_LIB_NATIVE_DIR - $ echo $LD_LIBRARY_PATH - $ echo $PATH - -16. Verify that you copied the correct keytab file. - -17. Review this procedure to verify that you have followed each step. - -:ref:`Back to top ` \ No newline at end of file diff --git a/operational_guides/health_monitoring.rst b/operational_guides/health_monitoring.rst new file mode 100644 index 000000000..a476e934b --- /dev/null +++ b/operational_guides/health_monitoring.rst @@ -0,0 +1,575 @@ +.. _health_monitoring: + +***************** +Health Monitoring +***************** + +The Health Monitoring service enhances observability, enabling shorter investigation times and facilitating both high-level and detailed drill-downs. + +.. contents:: + :local: + :depth: 1 + +Before You Begin +================ + +Ensure that the following prerequisites are met: + +* :ref:`Log files` must be saved as ``JSON`` files + +* Configure `Grafana authentication `_, even if you're using `LDAP `_ for authentication management + + +Installation +============ + +All Health Monitoring components are installed on a stand-alone server. + +Grafana +------- + +Grafana is an open-source analytics and monitoring platform designed for visualizing and analyzing real-time and historical data through customizable dashboards. It offers both an open-source version and an enterprise edition, catering to varying needs and scales of deployment. + +For more details, refer to the `Grafana specification `_. + +.. note:: Log in as the root user.
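+ +For example, to switch to the root user before running the installation steps below (a minimal sketch, assuming your current account has sudo privileges): + +.. code-block:: console + + sudo -i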
+ +Disabling SELinux +~~~~~~~~~~~~~~~~~ + +#. Check for the current SELinux status: + + .. code-block:: console + + getenforce + +#. Open the SELinux configuration file: + + .. code-block:: console + + vim /etc/sysconfig/selinux + +#. Configure ``SELINUX`` to be ``disabled``: + + .. code-block:: console + + SELINUX=disabled + +#. Reboot your system: + + .. code-block:: console + + reboot + +Installing Grafana via YUM Repository +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +#. Create a repo file: + + .. code-block:: console + + vim /etc/yum.repos.d/grafana.repo + +#. Add the following flags to the repo file: + + .. code-block:: console + + [grafana] + name=grafana + baseurl=https://packages.grafana.com/oss/rpm + repo_gpgcheck=1 + enabled=1 + gpgcheck=1 + gpgkey=https://packages.grafana.com/gpg.key + sslverify=1 + sslcacert=/etc/pki/tls/certs/ca-bundle.crt + +#. Install Grafana: + + .. code-block:: console + + sudo yum install grafana + + The installed package performs the following actions: + + * Installs the Grafana server binary at ``/usr/sbin/grafana-server`` + * Copies the init.d script to ``/etc/init.d/grafana-server`` + * Places the default configuration file in ``/etc/sysconfig/grafana-server`` + * Copies the main configuration file to ``/etc/grafana/grafana.ini`` + * Installs the systemd service file (if systemd is supported) as ``grafana-server.service`` + * By default, logs are written to ``/var/log/grafana/grafana.log`` + +#. Install the fontconfig, FreeType, and URW fonts: + + .. code-block:: console + + yum install fontconfig + yum install freetype* + yum install urw-fonts + +Enabling the Grafana Service +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +#. Check for the service status: + + .. code-block:: console + + systemctl status grafana-server + +#. If not active, start the service: + + .. code-block:: console + + systemctl start grafana-server + +#. Enable the Grafana service on system boot: + + .. code-block:: console + + systemctl enable grafana-server.service + +Modifying Your Firewall +~~~~~~~~~~~~~~~~~~~~~~~ + +#. Enable the Grafana port: + + .. code-block:: console + + firewall-cmd --zone=public --add-port=3000/tcp --permanent + +#. Reload the firewall service: + + .. code-block:: console + + firewall-cmd --reload + +Prometheus +---------- + +Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. Prometheus can be used to scrape and store metrics, enabling real-time monitoring, alerting, and analysis of performance and health. + +Your SQreamDB installation includes a Prometheus ``yml`` file. + +#. Download Prometheus. + +#. Set the YML path: + + .. code-block:: console + + PROMETHEUS_YML_PATH=/ymls/prometheus.yml + +#. Run the following script: + + ..
code-block:: console + + Prometheus_Server_install () { + echo "Prometheus_Server_install" + sudo useradd --no-create-home --shell /bin/false prometheus + sudo mkdir /etc/prometheus + sudo mkdir /var/lib/prometheus + sudo touch /etc/prometheus/prometheus.yml + cat <:9256 + - :9256 + - job_name: 'nvidia' + scrape_interval: 5s + static_configs: + - targets: + - :9445 + - :9445 + - job_name: 'nodes' + scrape_interval: 5s + static_configs: + - targets: + - :9100 + - :9100 + EOF + # Assign ownership of the files above to prometheus user + sudo chown -R prometheus:prometheus /etc/prometheus + sudo chown prometheus:prometheus /var/lib/prometheus + + # Download prometheus and copy utilities to where they should be in the filesystem + #VERSION=2.2.1 + #VERSION=$(curl https://raw.githubusercontent.com/prometheus/prometheus/master/VERSION) + #wget https://github.com/prometheus/prometheus/releases/download/v2.31.1/prometheus-2.31.1.linux-amd64.tar.gz + wget ftp://drivers:drivers11@ftp.sq.l/IT-Scripts+Packages/prometheus-2.31.1.linux-amd64.tar.gz + + tar xvzf prometheus-2.31.1.linux-amd64.tar.gz + + sudo cp prometheus-2.31.1.linux-amd64/prometheus /usr/local/bin/ + sudo cp prometheus-2.31.1.linux-amd64/promtool /usr/local/bin/ + sudo cp -r prometheus-2.31.1.linux-amd64/consoles /etc/prometheus + sudo cp -r prometheus-2.31.1.linux-amd64/console_libraries /etc/prometheus + + # Assign the ownership of the tools above to prometheus user + sudo chown -R prometheus:prometheus /etc/prometheus/consoles + sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries + sudo chown prometheus:prometheus /usr/local/bin/prometheus + sudo chown prometheus:prometheus /usr/local/bin/promtool + + # Populate configuration files + #cat ./prometheus/prometheus.yml | sudo tee /etc/prometheus/prometheus.yml + #cat ./prometheus/prometheus.rules.yml | sudo tee /etc/prometheus/prometheus.rules.yml + cat < + User=root + Group= + + [Install] + WantedBy=multi-user.target + +#. Reload systemd to recognize the new service: + + .. code-block:: console + + systemctl daemon-reload + +#. Restart the Promtail service: + + .. code-block:: console + + sudo systemctl restart promtail + +Exporters +--------- + +An Exporter is a software component that gathers metrics from various sources (such as hardware, software, or services) and exposes them in a format that Prometheus can scrape and store. + +#. Download `Exporters`_. + +#. Install Exporters: + + .. code-block:: console + + rpm -i + +#. Reload your system: + + .. code-block:: console + + sudo systemctl daemon-reload + +#. Restart Exporters service: + + .. code-block:: console + + sudo systemctl restart nvidia_gpu_exporter + +CPU Exporter +~~~~~~~~~~~~ + +#. Download the `CPU Exporter `_. + +#. Extract package content: + + .. code-block:: console + + tar -xvf + +#. Move the ``node_exporter`` binary to the ``/usr/bin directory``: + + .. code-block:: console + + sudo mv /node_exporter /usr/bin + +#. Open the ``/etc/systemd/system/node_exporter.service`` file: + + .. code-block:: console + + sudo vim /etc/systemd/system/node_exporter.service + +Add the following to the service file: + + .. code-block:: console + + [Unit] + Description=Node Exporter + Wants=network-online.target + After=network-online.target + + [Service] + User=prometheus + Group=prometheus + Restart=always + SyslogIdentifier=prometheus + ExecStart=/usr/bin/node_exporter + + [Install] + WantedBy=default.target + +#. Reload the **systemd** manager configuration: + + .. 
code-block:: console + + sudo systemctl daemon-reload + +#. Restart the **Node Exporter** service managed by **systemd**: + + .. code-block:: console + + sudo systemctl restart node_exporter + +Process Exporter +~~~~~~~~~~~~~~~~ + +#. Download and install the **Process Exporter** package. + +#. Start the Exporter: + + .. code-block:: console + + /usr/bin/process-exporter --config.path /etc/process-exporter/all.yaml --web.listen-address=:9256 &> process_exporter.out & + +Deployment +========== + +Grafana +------- + +#. Access the Grafana web interface by entering your server IP address or host name into the following URL: + + .. code-block:: console + + http://:3000/ + +#. Type in ``admin`` for both user name and password. + +#. Change your password. + +#. Go to **Data Sources** and choose **prometheus**. + +#. Go to **Data Sources** and choose **loki**. + +#. Set **URL** to your Prometheus server IP address. + +#. Go to **Dashboards** and choose **Import**. + +#. Import dashboards one by one. + +Using the Monitor Service +========================= + +The Monitor service package includes two files (which must be placed in the same folder): + +* ``monitor_service`` (an executable) +* ``monitor_input.json`` + +Configuring the Monitor Service Worker +-------------------------------------- + +Before running the monitor service worker, ensure that the following SQreamDB configuration flags are properly set: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Flag + - Configuration File + - Description + * - ``"cudaMemQuota": 0`` + - :ref:`Worker configuration file` + - This setting disables GPU memory usage for the monitor service. Consequently, the Worker must be a non-GPU Worker to avoid exceptions from the monitor service. + * - ``"initialSubscribedServices": "monitor"`` + - :ref:`Worker configuration file` + - This configuration specifies that the monitor service should run on a non-GPU Worker. To avoid mixing with GPU Worker processes, the monitor service is set to operate on a designated non-GPU Worker. By default, it runs under the service name ``monitor``, but this can be adjusted if needed. + * - ``"enableNvprofMarkers" : false`` + - :ref:`Cluster and session configuration file` + - Enabling this flag while using a non-GPU Worker results in exceptions. Ensure this flag is turned off to avoid issues since there are no GPU instances involved. + +Execution Arguments +------------------- + +When executing the Monitor service, you can configure the following flags: + +..
list-table:: + :widths: auto + :header-rows: 1 + + * - Flag + - Type + - Description + - State + - Default + * - ``-h``, ``--help`` + - option + - Shows the help message and exits + - Optional + - + * - ``--host`` + - string + - The SQreamDB host address + - Optional + - ``localhost`` + * - ``--port`` + - integer + - The SQreamDB port number + - Optional + - ``5000`` + * - ``--database`` + - string + - The SQreamDB database name + - Optional + - ``master`` + * - ``--username`` + - string + - The SQreamDB username + - Mandatory + - ``sqream`` + * - ``--password`` + - string + - The SQreamDB password + - Mandatory + - ``sqream`` + * - ``--clustered`` + - option + - An option if the ``server_picker`` is running + - Optional + - ``False`` + * - ``--service`` + - string + - The SQreamDB service name + - Optional + - ``monitor`` + * - ``--loki_host`` + - string + - The Loki instance host address + - Optional + - ``localhost`` + * - ``--loki_port`` + - integer + - The Loki port number + - Optional + - ``3100`` + * - ``--log_file_path`` + - string + - The path to where log files are saved + - Optional + - NA + * - ``--metrics_json_path`` + - string + - The path to where the ``monitor_input.json`` file is stored + - Optional + - ```` + +Example +~~~~~~~ + +Execution example: + +.. code-block:: console + + ./monitor_service --username=sqream --password=sqream --host=1.2.3.4 --port=2711 --service=monitor --loki_host=1.2.3.5 --loki_port=3100 --metrics_json_path='/home/arielw/monitor_service/monitor_input.json' + +Monitor Service Output Example +------------------------------ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Type + - Color + * - Information about monitor service triggering + - Blue + * - Successful insertion + - Green + * - Error + - Red + +|monitor_service_example| + +.. |monitor_service_example| image:: /_static/images/monitor_service_example.png + :align: middle + + diff --git a/operational_guides/index.rst b/operational_guides/index.rst index b7ea1502d..e5f640a59 100644 --- a/operational_guides/index.rst +++ b/operational_guides/index.rst @@ -1,8 +1,9 @@ .. _operational_guides: -********************************** +****************** Operational Guides -********************************** +****************** + The **Operational Guides** section describes processes that SQream users can manage to affect the way their system operates, such as creating storage clusters and monitoring query performance. This section summarizes the following operational guides: @@ -13,15 +14,18 @@ This section summarizes the following operational guides: :titlesonly: access_control + accelerating_filtered_queries_with_metadata_partitions creating_or_cloning_a_storage_cluster external_data - external_tables - exporting_data + foreign_tables + health_monitoring + delete_guide logging + query_split monitoring_query_performance security saved_queries - seeing_system_objects_as_ddl - configuration optimization_best_practices - hardware_guide + oracle_migration + + \ No newline at end of file diff --git a/operational_guides/logging.rst b/operational_guides/logging.rst index a40e08601..6b6dbc5ab 100644 --- a/operational_guides/logging.rst +++ b/operational_guides/logging.rst @@ -1,13 +1,15 @@ .. _logging: -*********************** +******* Logging -*********************** +******* Locating the Log Files -========================== +====================== -The :ref:`storage cluster` contains a ``logs`` directory. Each worker produces a log file in its own directory, which can be identified by the worker's hostname and port.
+The ``logs`` directory path is controlled by the ``DefaultPathToLogs`` cluster flag (legacy config). By default, it is set to ``~/tmp_log``; best practice is to set it to ``/logs``. + +Each worker produces a log file in its own directory, which can be identified by the worker's hostname and port. .. TODO: expand this by giving some use cases for working with log files directly in sqream (troubleshooting, performance analysis, monitoring, that kind of thing. Stick to things customers actually use and/or we instruct them to do with the logs, not theoretical things they could do with the logs @@ -23,10 +25,14 @@ The worker logs contain information messages, warnings, and errors pertaining to * Statement execution success / failure * Statement execution statistics +.. _log_structure: + Log Structure and Contents ---------------------------------- +-------------------------- -The log is a CSV, with several fields. +By default, logs are saved as ``CSV`` files. To configure your log files to be saved as ``JSON`` instead, use the ``logFormat`` flag in your :ref:`legacy config file`. + +For effective :ref:`health_monitoring`, it's essential that logs are saved in ``JSON`` format, as Health Monitoring does not support ``CSV`` files. If your current logs are in ``CSV`` format and you require root cause analysis (RCA), it's advisable to configure your logs to be saved in both ``CSV`` and ``JSON`` formats as outlined above. .. list-table:: Log fields :widths: auto @@ -84,6 +90,10 @@ The log is a CSV, with several fields. - Warnings * - ``INFO`` - Information and statistics + * - ``DEBUG`` + - Information helpful for debugging + * - ``TRACE`` + - In-depth information helpful for debugging, such as tracing system function executions and identifying specific error conditions or performance issues. .. _message_type: @@ -105,7 +115,7 @@ The log is a CSV, with several fields. - ``INFO`` - Statement passed to another worker for execution - - * ``""Reconstruct query before parsing"`` + * ``"Reconstruct query before parsing"`` * ``"SELECT * FROM nba WHERE ""Team"" NOT LIKE ""Portland%%"""`` (statement preparing on node) * - ``4`` - ``INFO`` @@ -192,7 +202,7 @@ The log is a CSV, with several fields. - ``"Server shutdown"`` Log-Naming ---------------------------- +---------- Log file name syntax @@ -211,10 +221,10 @@ See the :ref:`log_rotation` below for information about controlling this setting Log Control and Maintenance ====================================== +=========================== Changing Log Verbosity --------------------------- +---------------------- A few configuration settings alter the verbosity of the logs: @@ -240,7 +250,7 @@ A few configuration settings alter the verbosity of the logs: .. _log_rotation: Changing Log Rotation ------------------------ +--------------------- A few configuration settings alter the log rotation policy: @@ -252,23 +262,19 @@ A few configuration settings alter the log rotation policy: - Description - Default - Values * - ``useLogMaxFileSize`` - - Rotate log files once they reach a certain file size. When ``true``, set the ``logMaxFileSizeMB`` accordingly. - - ``false`` - - ``false`` or ``true``. * - ``logMaxFileSizeMB`` - Sets the size threshold in megabytes after which a new log file will be opened. - - ``20`` + - ``100`` - ``1`` to ``1024`` (1MB to 1GB) * - ``logFileRotateTimeFrequency`` - Frequency of log rotation - - ``never`` - - ``daily``, ``weekly``, ``monthly``, ``never`` + - ``daily`` + - ``daily``, ``weekly``, or ``monthly``
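+ +For example, a minimal sketch of how these rotation flags might appear in a legacy configuration file; the ``runtimeGlobalFlags`` wrapper shown here is an assumption, so place the flags wherever your deployment's configuration file keeps its runtime flags: + +.. code-block:: json + + { + "runtimeGlobalFlags": { + "logMaxFileSizeMB": 100, + "logFileRotateTimeFrequency": "daily" + } + } ..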
_collecting_logs2: Collecting Logs from Your Cluster -==================================== +================================= Collecting logs from your cluster can be as simple as creating an archive from the ``logs`` subdirectory: ``tar -czvf logs.tgz *.log``. @@ -290,9 +296,9 @@ SQL Syntax Command Line Utility --------------------------- +-------------------- -If you cannot access SQream DB for any reason, you can also use a command line toolto collect the same information: +If you cannot access SQream DB for any reason, you can also use a command line tool to collect the same information: .. code-block:: console @@ -300,7 +306,7 @@ If you cannot access SQream DB for any reason, you can also use a command line t Parameters ---------------- +---------- .. list-table:: :widths: auto @@ -318,7 +324,7 @@ Parameters * ``'db_and_log'`` - Collect both log files and metadata database Example ------------------ +------- Write an archive to ``/home/rhendricks``, containing log files: @@ -343,18 +349,18 @@ Using the command line utility: Troubleshooting with Logs -=============================== +========================= Loading Logs with Foreign Tables ---------------------------------------- +-------------------------------- -Assuming logs are stored at ``/home/rhendricks/sqream_storage/logs/``, a database administrator can access the logs using the :ref:`external_tables` concept through SQream DB. +Assuming logs are stored at ``/home/rhendricks/sqream_storage/logs/``, a database administrator can access the logs using the :ref:`foreign_tables` concept through SQreamDB. .. code-block:: postgres CREATE FOREIGN TABLE logs ( - start_marker VARCHAR(4), + start_marker TEXT(4), row_id BIGINT, timestamp DATETIME, message_level TEXT, @@ -368,24 +374,24 @@ Assuming logs are stored at ``/home/rhendricks/sqream_storage/logs/``, a databas service_name TEXT, message_type_id INT, message TEXT, - end_message VARCHAR(5) + end_message TEXT(5) ) WRAPPER csv_fdw OPTIONS ( LOCATION = '/home/rhendricks/sqream_storage/logs/**/sqream*.log', - DELIMITER = '|' + DELIMITER = '|', CONTINUE_ON_ERROR = true ) ; -For more information, see `Loading Logs with Foreign Tables `_. +For more information, see :ref:`Loading Logs with Foreign Tables `. Counting Message Types ------------------------------- +---------------------- .. code-block:: psql @@ -411,19 +417,19 @@ Counting Message Types 1010 | 5 Finding Fatal Errors ----------------------- +-------------------- .. 
code-block:: psql t=> SELECT message FROM logs WHERE message_type_id=1010; - Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/leveldb/LOCK: Resource temporarily unavailable - Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/leveldb/LOCK: Resource temporarily unavailable + Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/rocksdb/LOCK: Resource temporarily unavailable + Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/rocksdb/LOCK: Resource temporarily unavailable Mismatch in storage version, upgrade is needed,Storage version: 25, Server version is: 26 Mismatch in storage version, upgrade is needed,Storage version: 25, Server version is: 26 Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/LOCK: Resource temporarily unavailable Counting Error Events Within a Certain Timeframe ---------------------------------------------------- +------------------------------------------------ .. code-block:: psql @@ -442,7 +448,7 @@ Counting Error Events Within a Certain Timeframe .. _tracing_errors: Tracing Errors to Find Offending Statements -------------------------------------------------- +------------------------------------------- If we know an error occurred, but don't know which statement caused it, we can find it using the connection ID and statement ID. diff --git a/operational_guides/monitoring_query_performance.rst b/operational_guides/monitoring_query_performance.rst index a542f61e6..81050e03b 100644 --- a/operational_guides/monitoring_query_performance.rst +++ b/operational_guides/monitoring_query_performance.rst @@ -1,27 +1,26 @@ .. _monitoring_query_performance: -********************************* +**************************** Monitoring Query Performance -********************************* -When analyzing options for query tuning, the first step is to analyze the query plan and execution. -The query plan and execution details explains how SQream DB processes a query and where time is spent. -This document details how to analyze query performance with execution plans. -This guide focuses specifically on identifying bottlenecks and possible optimization techniques to improve query performance. -Performance tuning options for each query are different. You should adapt the recommendations and tips for your own workloads. -See also our :ref:`sql_best_practices` guide for more information about data loading considerations and other best practices. - -.. contents:: In this section: +**************************** + +The initial step in query tuning involves a thorough analysis of the query plan and its execution. The query plan and execution details illuminate how SQreamDB handles a query and pinpoint where time resources are consumed. This document offers a comprehensive guide on analyzing query performance through execution plans, with a specific emphasis on recognizing bottlenecks and exploring potential optimization strategies to enhance query efficiency. + +It's important to note that performance tuning approaches can vary for each query, necessitating adaptation of recommendations and tips to suit specific workloads. Additionally, for further insights into data loading considerations and other best practices, refer to our :ref:`sql_best_practices` guide. + +..
contents:: :local: + :depth: 1 + +Setting Up System Monitoring Preferences +======================================== -Setting Up the System for Monitoring -================================================= -By default, SQream DB logs execution details for every statement that runs for more than 60 seconds. -If you want to see the execution details for a currently running statement, see :ref:`using_show_node_info` below. +By default, SQreamDB automatically logs execution details for any query that runs longer than 60 seconds. This means that by default, queries shorter than 60 seconds are not logged. You can adjust this parameter to your own preference. Adjusting the Logging Frequency ---------------------------------------- -To adjust the frequency of logging for statements, you may want to reduce the interval from 60 seconds down to, -say, 5 or 10 seconds. Modify the configuration files and set the ``nodeInfoLoggingSec`` parameter as you see fit: +------------------------------- + +To customize statement logging frequency to be more frequent, consider reducing the interval from the default 60 seconds to a shorter duration like 5 or 10 seconds. This adjustment can be made by modifying the ``nodeInfoLoggingSec`` in your SQreamDB :ref:`configuration files` and setting the parameter to your preferred value. .. code-block:: json :emphasize-lines: 7 @@ -37,106 +36,122 @@ say, 5 or 10 seconds. Modify the configuration files and set the ``nodeInfoLoggi "server":{ } } -After restarting the SQream DB cluster, the execution plan details will be logged to the :ref:`standard SQream DB logs directory`, as a message of type ``200``. -You can see these messages with a text viewer or with queries on the log :ref:`external_tables`. -Reading Execution Plans with a Foreign Table ------------------------------------------------------ -First, create a foreign table for the logs +After customizing the frequency, please restart your SQreamDB cluster. Execution plan details are logged to the default SQreamDB :ref:`log directory` as :ref:`message type` ``200``. + +You can access these log details by using a text viewer or by creating a dedicated :ref:`foreign table` to store the logs in a SQreamDB table. + +Creating a Dedicated Foreign Table to Store Log Details +------------------------------------------------------- + +Utilizing a SQreamDB table for storing and accessing log details helps simplify log management by avoiding direct handling of raw logs. + +To create a foreign table for storing your log details, use the following table DDL: + +.. code-block:: postgres + + CREATE FOREIGN TABLE logs ( + start_marker TEXT, + row_id BIGINT, + timestamp DATETIME, + message_level TEXT, + thread_id TEXT, + worker_hostname TEXT, + worker_port INT, + connection_id INT, + database_name TEXT, + user_name TEXT, + statement_id INT, + service_name TEXT, + message_type_id INT, + message TEXT, + end_message TEXT + ) + WRAPPER + csv_fdw + OPTIONS + ( + LOCATION = '/home/rhendricks/sqream_storage/logs/**/sqream*.log', + DELIMITER = '|' + ); + +Use the following query structure as an example to view previously logged execution plans: .. 
code-block:: postgres - CREATE FOREIGN TABLE logs - ( - start_marker VARCHAR(4), - row_id BIGINT, - timestamp DATETIME, - message_level TEXT, - thread_id TEXT, - worker_hostname TEXT, - worker_port INT, - connection_id INT, - database_name TEXT, - user_name TEXT, - statement_id INT, - service_name TEXT, - message_type_id INT, - message TEXT, - end_message VARCHAR(5) - ) - WRAPPER cdv_fdw - OPTIONS - ( - LOCATION = '/home/rhendricks/sqream_storage/logs/**/sqream*.log', - DELIMITER = '|' - ) - ; -Once you've defined the foreign table, you can run queries to observe the previously logged execution plans. -This is recommended over looking at the raw logs. - -.. code-block:: psql - t=> SELECT message - . FROM logs - . WHERE message_type_id = 200 - . AND timestamp BETWEEN '2020-06-11' AND '2020-06-13'; - message - --------------------------------------------------------------------------------------------------------------------------------- - SELECT *,coalesce((depdelay > 15),false) AS isdepdelayed FROM ontime WHERE year IN (2005, 2006, 2007, 2008, 2009, 2010) - : - : 1,PushToNetworkQueue ,10354468,10,1035446,2020-06-12 20:41:42,-1,,,,13.55 - : 2,Rechunk ,10354468,10,1035446,2020-06-12 20:41:42,1,,,,0.10 - : 3,ReorderInput ,10354468,10,1035446,2020-06-12 20:41:42,2,,,,0.00 - : 4,DeferredGather ,10354468,10,1035446,2020-06-12 20:41:42,3,,,,1.23 - : 5,ReorderInput ,10354468,10,1035446,2020-06-12 20:41:41,4,,,,0.01 - : 6,GpuToCpu ,10354468,10,1035446,2020-06-12 20:41:41,5,,,,0.07 - : 7,GpuTransform ,10354468,10,1035446,2020-06-12 20:41:41,6,,,,0.02 - : 8,ReorderInput ,10354468,10,1035446,2020-06-12 20:41:41,7,,,,0.00 - : 9,Filter ,10354468,10,1035446,2020-06-12 20:41:41,8,,,,0.07 - : 10,GpuTransform ,10485760,10,1048576,2020-06-12 20:41:41,9,,,,0.07 - : 11,GpuDecompress ,10485760,10,1048576,2020-06-12 20:41:41,10,,,,0.03 - : 12,GpuTransform ,10485760,10,1048576,2020-06-12 20:41:41,11,,,,0.22 - : 13,CpuToGpu ,10485760,10,1048576,2020-06-12 20:41:41,12,,,,0.76 - : 14,ReorderInput ,10485760,10,1048576,2020-06-12 20:41:40,13,,,,0.11 - : 15,Rechunk ,10485760,10,1048576,2020-06-12 20:41:40,14,,,,5.58 - : 16,CpuDecompress ,10485760,10,1048576,2020-06-12 20:41:34,15,,,,0.04 - : 17,ReadTable ,10485760,10,1048576,2020-06-12 20:41:34,16,832MB,,public.ontime,0.55 + + SELECT + message + FROM + logs + WHERE + message_type_id = 200 + AND timestamp BETWEEN '2020-06-11' AND '2020-06-13'; + + message + --------------------------------------------------------------------------------------------------------------------------------- + SELECT *,coalesce((depdelay > 15),false) AS isdepdelayed FROM ontime WHERE year IN (2005, 2006, 2007, 2008, 2009, 2010) + + 1,PushToNetworkQueue ,10354468,10,1035446,2020-06-12 20:41:42,-1,,,,13.55 + 2,Rechunk ,10354468,10,1035446,2020-06-12 20:41:42,1,,,,0.10 + 3,ReorderInput ,10354468,10,1035446,2020-06-12 20:41:42,2,,,,0.00 + 4,DeferredGather ,10354468,10,1035446,2020-06-12 20:41:42,3,,,,1.23 + 5,ReorderInput ,10354468,10,1035446,2020-06-12 20:41:41,4,,,,0.01 + 6,GpuToCpu ,10354468,10,1035446,2020-06-12 20:41:41,5,,,,0.07 + 7,GpuTransform ,10354468,10,1035446,2020-06-12 20:41:41,6,,,,0.02 + 8,ReorderInput ,10354468,10,1035446,2020-06-12 20:41:41,7,,,,0.00 + 9,Filter ,10354468,10,1035446,2020-06-12 20:41:41,8,,,,0.07 + 10,GpuTransform ,10485760,10,1048576,2020-06-12 20:41:41,9,,,,0.07 + 11,GpuDecompress ,10485760,10,1048576,2020-06-12 20:41:41,10,,,,0.03 + 12,GpuTransform ,10485760,10,1048576,2020-06-12 20:41:41,11,,,,0.22 + 13,CpuToGpu ,10485760,10,1048576,2020-06-12 
20:41:41,12,,,,0.76 + 14,ReorderInput ,10485760,10,1048576,2020-06-12 20:41:40,13,,,,0.11 + 15,Rechunk ,10485760,10,1048576,2020-06-12 20:41:40,14,,,,5.58 + 16,CpuDecompress ,10485760,10,1048576,2020-06-12 20:41:34,15,,,,0.04 + 17,ReadTable ,10485760,10,1048576,2020-06-12 20:41:34,16,832MB,,public.ontime,0.55 .. _using_show_node_info: Using the ``SHOW_NODE_INFO`` Command -===================================== -The :ref:`show_node_info` command returns a snapshot of the current query plan, similar to ``EXPLAIN ANALYZE`` from other databases. -The :ref:`show_node_info` result, just like the periodically-logged execution plans described above, are an at-the-moment -view of the compiler's execution plan and runtime statistics for the specified statement. -To inspect a currently running statement, execute the ``show_node_info`` utility function in a SQL client like - :ref:`sqream sql`, the :ref:`SQream Studio Editor`, or any other :ref:`third party SQL terminal`. -In this example, we inspect a statement with statement ID of 176. The command looks like this: - -.. code-block:: psql - - t=> SELECT SHOW_NODE_INFO(176); - stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum - --------+---------+--------------------+------+--------+-------------------+---------------------+----------------+------+-------+------------+-------- - 176 | 1 | PushToNetworkQueue | 1 | 1 | 1 | 2019-12-25 23:53:13 | -1 | | | | 0.0025 - 176 | 2 | Rechunk | 1 | 1 | 1 | 2019-12-25 23:53:13 | 1 | | | | 0 - 176 | 3 | GpuToCpu | 1 | 1 | 1 | 2019-12-25 23:53:13 | 2 | | | | 0 - 176 | 4 | ReorderInput | 1 | 1 | 1 | 2019-12-25 23:53:13 | 3 | | | | 0 - 176 | 5 | Filter | 1 | 1 | 1 | 2019-12-25 23:53:13 | 4 | | | | 0.0002 - 176 | 6 | GpuTransform | 457 | 1 | 457 | 2019-12-25 23:53:13 | 5 | | | | 0.0002 - 176 | 7 | GpuDecompress | 457 | 1 | 457 | 2019-12-25 23:53:13 | 6 | | | | 0 - 176 | 8 | CpuToGpu | 457 | 1 | 457 | 2019-12-25 23:53:13 | 7 | | | | 0.0003 - 176 | 9 | Rechunk | 457 | 1 | 457 | 2019-12-25 23:53:13 | 8 | | | | 0 - 176 | 10 | CpuDecompress | 457 | 1 | 457 | 2019-12-25 23:53:13 | 9 | | | | 0 - 176 | 11 | ReadTable | 457 | 1 | 457 | 2019-12-25 23:53:13 | 10 | 4MB | | public.nba | 0.0004 +==================================== + +The :ref:`show_node_info` command provides a snapshot of the current query plan. Similar to periodically-logged execution plans, ``SHOW_NODE_INFO`` displays the compiler's execution plan and runtime statistics for a specified statement at the moment of execution. + +You can execute the ``SHOW_NODE_INFO`` utility function using :ref:`sqream sql`, :ref:`SQream Studio Editor`, or other :ref:`third party tool`. + +In this example, we inspect a statement with statement ID of 176: + +.. 
code-block:: postgres + + SELECT + SHOW_NODE_INFO(176); + + stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum + --------+---------+--------------------+------+--------+-------------------+---------------------+----------------+------+-------+------------+-------- + 176 | 1 | PushToNetworkQueue | 1 | 1 | 1 | 2019-12-25 23:53:13 | -1 | | | | 0.0025 + 176 | 2 | Rechunk | 1 | 1 | 1 | 2019-12-25 23:53:13 | 1 | | | | 0 + 176 | 3 | GpuToCpu | 1 | 1 | 1 | 2019-12-25 23:53:13 | 2 | | | | 0 + 176 | 4 | ReorderInput | 1 | 1 | 1 | 2019-12-25 23:53:13 | 3 | | | | 0 + 176 | 5 | Filter | 1 | 1 | 1 | 2019-12-25 23:53:13 | 4 | | | | 0.0002 + 176 | 6 | GpuTransform | 457 | 1 | 457 | 2019-12-25 23:53:13 | 5 | | | | 0.0002 + 176 | 7 | GpuDecompress | 457 | 1 | 457 | 2019-12-25 23:53:13 | 6 | | | | 0 + 176 | 8 | CpuToGpu | 457 | 1 | 457 | 2019-12-25 23:53:13 | 7 | | | | 0.0003 + 176 | 9 | Rechunk | 457 | 1 | 457 | 2019-12-25 23:53:13 | 8 | | | | 0 + 176 | 10 | CpuDecompress | 457 | 1 | 457 | 2019-12-25 23:53:13 | 9 | | | | 0 + 176 | 11 | ReadTable | 457 | 1 | 457 | 2019-12-25 23:53:13 | 10 | 4MB | | public.nba | 0.0004 + +You may also :ref:`download the query execution plan` to a CSV file using the **Execution Details View** feature. Understanding the Query Execution Plan Output -================================================== +============================================= + Both :ref:`show_node_info` and the logged execution plans represents the query plan as a graph hierarchy, with data separated into different columns. Each row represents a single logical database operation, which is also called a **node** or **chunk producer**. A node reports several metrics during query execution, such as how much data it has read and written, how many chunks and rows, and how much time has elapsed. -Consider the example show_node_info presented above. The source node with ID #11 (``ReadTable``), has a parent node ID #10 +Consider the example SHOW_NODE_INFO presented above. The source node with ID #11 (``ReadTable``), has a parent node ID #10 (``CpuDecompress``). If we were to draw this out in a graph, it'd look like this: + .. figure:: /_static/images/show_node_info_graph.png - :height: 70em + :scale: 60 % :align: center This graph explains how the query execution details are arranged in a logical order, from the bottom up. @@ -184,13 +199,15 @@ When using :ref:`show_node_info`, a tabular representation of the currently runn See the examples below to understand how the query execution plan is instrumental in identifying bottlenecks and optimizing long-running statements. Information Presented in the Execution Plan ----------------------------------------------------- -.. include:: /reference/sql/sql_statements/monitoring_commands/show_node_info.rst +------------------------------------------- + +.. include:: /reference/sql/sql_statements/utility_commands/show_node_info.rst :start-line: 47 :end-line: 78 Commonly Seen Nodes ----------------------- +------------------- + .. 
list-table:: Node types :widths: auto :header-rows: 1 @@ -200,7 +217,7 @@ Commonly Seen Nodes - Description * - ``CpuDecompress`` - CPU - - Decompression operation, common for longer ``VARCHAR`` types + - Decompression operation, common for longer ``TEXT`` types * - ``CpuLoopJoin`` - CPU - A non-indexed nested loop join, performed on the CPU @@ -254,7 +271,7 @@ Commonly Seen Nodes - Reads data from a standard table stored on disk * - ``Rechunk`` - - - Reorganize multiple small :ref:`chunks` into a full chunk. Commonly found after joins and when :ref:`HIGH_SELECTIVITY` is used + - Reorganize multiple small chunks into a full chunk. Commonly found after joins and when :ref:`HIGH_SELECTIVITY` is used * - ``Reduce`` - GPU - A reduction operation, such as a ``GROUP BY`` @@ -295,60 +312,73 @@ Commonly Seen Nodes .. tip:: The full list of nodes appears in the :ref:`Node types table`, as part of the :ref:`show_node_info` reference. Examples -================== -In general, looking at the top three longest running nodes (as is detailed in the ``timeSum`` column) can indicate the biggest bottlenecks. -In the following examples you will learn how to identify and solve some common issues. +======== -.. contents:: In this section: +Typically, examining the top three longest running nodes (detailed in the ``timeSum`` column) can highlight major bottlenecks. The following examples will demonstrate how to identify and address common issues. + +.. contents:: :local: + :depth: 1 + +Spooling to Disk +---------------- -1. Spooling to Disk ------------------------ -When there is not enough RAM to process a statement, SQream DB will spill over data to the ``temp`` folder in the storage disk. -While this ensures that a statement can always finish processing, it can slow down the processing significantly. -It's worth identifying these statements, to figure out if the cluster is configured correctly, as well as potentially reduce -the statement size. -You can identify a statement that spools to disk by looking at the ``write`` column in the execution details. -A node that spools will have a value, shown in megabytes in the ``write`` column. -Common nodes that write spools include ``Join`` or ``LoopJoin``. +When SQreamDB doesn't have enough RAM to process a statement, it will temporarily store overflow data in the ``temp`` folder on the storage disk. While this ensures that statements complete processing, it can significantly slow down performance. It's important to identify these statements to assess cluster configuration and potentially optimize statement size. + +To identify statements that spill data to disk, check the ``write`` column in the execution details. Nodes that write to disk will display a value (in megabytes) in this column. Common nodes that may write spillover data include ``Join`` and ``LoopJoin``. Identifying the Offending Nodes -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + #. Run a query. - For example, a query from the TPC-H benchmark: + This example is from the TPC-H benchmark: .. 
code-block:: postgres - SELECT o_year, - SUM(CASE WHEN nation = 'BRAZIL' THEN volume ELSE 0 END) / SUM(volume) AS mkt_share - FROM (SELECT datepart(YEAR,o_orderdate) AS o_year, - l_extendedprice*(1 - l_discount / 100.0) AS volume, - n2.n_name AS nation - FROM lineitem - JOIN part ON p_partkey = CAST (l_partkey AS INT) - JOIN orders ON l_orderkey = o_orderkey - JOIN customer ON o_custkey = c_custkey - JOIN nation n1 ON c_nationkey = n1.n_nationkey - JOIN region ON n1.n_regionkey = r_regionkey - JOIN supplier ON s_suppkey = l_suppkey - JOIN nation n2 ON s_nationkey = n2.n_nationkey - WHERE o_orderdate BETWEEN '1995-01-01' AND '1996-12-31') AS all_nations - GROUP BY o_year - ORDER BY o_year; + SELECT + o_year, + SUM( + CASE + WHEN nation = 'BRAZIL' THEN volume + ELSE 0 + END + ) / SUM(volume) AS mkt_share + FROM + ( + SELECT + datepart(YEAR, o_orderdate) AS o_year, + l_extendedprice * (1 - l_discount / 100.0) AS volume, + n2.n_name AS nation + FROM + lineitem + JOIN part ON p_partkey = CAST (l_partkey AS INT) + JOIN orders ON l_orderkey = o_orderkey + JOIN customer ON o_custkey = c_custkey + JOIN nation n1 ON c_nationkey = n1.n_nationkey + JOIN region ON n1.n_regionkey = r_regionkey + JOIN supplier ON s_suppkey = l_suppkey + JOIN nation n2 ON s_nationkey = n2.n_nationkey + WHERE + o_orderdate BETWEEN '1995-01-01' AND '1996-12-31' + ) AS all_nations + GROUP BY + o_year + ORDER BY + o_year; #. - Observe the execution information by using the foreign table, or use ``show_node_info`` + Use a foreign table or ``SHOW_NODE_INFO`` to view the execution information. This statement is made up of 199 nodes, starting from a ``ReadTable``, and finishes by returning only 2 results to the client. The execution below has been shortened, but note the highlighted rows for ``LoopJoin``: - .. code-block:: psql - :emphasize-lines: 33,35,37,39 + .. code-block:: postgres + :emphasize-lines: 33,35,37,39 - t=> SELECT message FROM logs WHERE message_type_id = 200 LIMIT 1; + SELECT message FROM logs WHERE message_type_id = 200 LIMIT 1; message ----------------------------------------------------------------------------------------- SELECT o_year, @@ -389,146 +419,178 @@ Identifying the Offending Nodes : 150,LoopJoin ,182369485,10,18236948,2020-09-04 18:31:47,149,12860MB,12860MB,inner,23.62 [...] : 199,ReadTable ,20000000,1,20000000,2020-09-04 18:30:33,198,0MB,,public.part,0.83 - Because of the relatively low amount of RAM in the machine and because the data set is rather large at around 10TB, SQream DB needs to spool. - The total spool used by this query is around 20GB (1915MB + 2191MB + 3064MB + 12860MB). + Due to the machine's limited RAM and the large dataset of approximately 10TB, SQreamDB requires spooling. + + The total spool used by this query amounts to approximately 20GB (1915MB + 2191MB + 3064MB + 12860MB). Common Solutions for Reducing Spool -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* - Increase the amount of spool memory available for the workers, as a proportion of the maximum statement memory. - When the amount of spool memory is increased, SQream DB may not need to write to disk. - - This setting is called ``spoolMemoryGB``. Refer to the :ref:`configuration` guide. -* - Reduce the amount of **workers** per host, and increase the amount of spool available to the (now reduced amount of) active workers. - This may reduce the amount of concurrent statements, but will improve performance for heavy statements. -2. 
Queries with Large Result Sets ------------------------------------- -When queries have large result sets, you may see a node called ``DeferredGather``. -This gathering occurs when the result set is assembled, in preparation for sending it to the client. +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Solution + - Description + * - Increasing Spool Memory Amount + - Increase the amount of spool memory available for the Workers relative to the maximum statement memory. By increasing spool memory, SQreamDB may avoid the need to write to disk. This setting is known as ``spoolMemoryGB``. Refer to the :ref:`concurrency_and_scaling_in_sqream` guide for details. + * - Reducing Workers Per Host + - Reduce the number of Workers per host and allocate more spool memory to the reduced number of active Workers. This approach may decrease concurrent statements but can enhance performance for resource-intensive queries. + +Queries with Large Result Sets +------------------------------ + +When queries produce large result sets, you may encounter a node called ``DeferredGather``. This node is responsible for assembling the result set in preparation for sending it to the client. Identifying the Offending Nodes -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + #. Run a query. - For example, a modified query from the TPC-H benchmark: + This example is from the TPC-H benchmark: .. code-block:: postgres - SELECT s.*, - l.*, - r.*, - n1.*, - n2.*, - p.*, - o.*, - c.* - FROM lineitem l - JOIN part p ON p_partkey = CAST (l_partkey AS INT) - JOIN orders o ON l_orderkey = o_orderkey - JOIN customer c ON o_custkey = c_custkey - JOIN nation n1 ON c_nationkey = n1.n_nationkey - JOIN region r ON n1.n_regionkey = r_regionkey - JOIN supplier s ON s_suppkey = l_suppkey - JOIN nation n2 ON s_nationkey = n2.n_nationkey - WHERE r_name = 'AMERICA' - AND o_orderdate BETWEEN '1995-01-01' AND '1996-12-31' - AND high_selectivity(p_type = 'ECONOMY BURNISHED NICKEL'); + SELECT + s.*, + l.*, + r.*, + n1.*, + n2.*, + p.*, + o.*, + c.* + FROM + lineitem l + JOIN part p ON p_partkey = CAST (l_partkey AS INT) + JOIN orders o ON l_orderkey = o_orderkey + JOIN customer c ON o_custkey = c_custkey + JOIN nation n1 ON c_nationkey = n1.n_nationkey + JOIN region r ON n1.n_regionkey = r_regionkey + JOIN supplier s ON s_suppkey = l_suppkey + JOIN nation n2 ON s_nationkey = n2.n_nationkey + WHERE + r_name = 'AMERICA' + AND o_orderdate BETWEEN '1995-01-01' AND '1996-12-31' + AND high_selectivity(p_type = 'ECONOMY BURNISHED NICKEL'); #. - Observe the execution information by using the foreign table, or use ``show_node_info`` + Use a foreign table or ``SHOW_NODE_INFO`` to view the execution information. This statement is made up of 221 nodes, containing 8 ``ReadTable`` nodes, and finishes by returning billions of results to the client. The execution below has been shortened, but note the highlighted rows for ``DeferredGather``: - .. code-block:: psql + .. 
code-block:: postgres :emphasize-lines: 7,9,11 - t=> SELECT show_node_info(494); - stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum - --------+---------+----------------------+-----------+--------+-------------------+---------------------+----------------+---------+-------+-----------------+-------- - 494 | 1 | PushToNetworkQueue | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | -1 | | | | 0.36 - 494 | 2 | Rechunk | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | 1 | | | | 0 - 494 | 3 | ReorderInput | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | 2 | | | | 0 - 494 | 4 | DeferredGather | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | 3 | | | | 0.16 - [...] - 494 | 166 | DeferredGather | 3998730 | 39 | 102531 | 2020-09-04 19:07:47 | 165 | | | | 21.75 - [...] - 494 | 194 | DeferredGather | 133241 | 20 | 6662 | 2020-09-04 19:07:03 | 193 | | | | 0.41 - [...] - 494 | 221 | ReadTable | 20000000 | 20 | 1000000 | 2020-09-04 19:07:01 | 220 | 20MB | | public.part | 0.1 + SELECT SHOW_NODE_INFO(494); + stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum + --------+---------+----------------------+-----------+--------+-------------------+---------------------+----------------+---------+-------+-----------------+-------- + 494 | 1 | PushToNetworkQueue | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | -1 | | | | 0.36 + 494 | 2 | Rechunk | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | 1 | | | | 0 + 494 | 3 | ReorderInput | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | 2 | | | | 0 + 494 | 4 | DeferredGather | 242615 | 1 | 242615 | 2020-09-04 19:07:55 | 3 | | | | 0.16 + [...] + 494 | 166 | DeferredGather | 3998730 | 39 | 102531 | 2020-09-04 19:07:47 | 165 | | | | 21.75 + [...] + 494 | 194 | DeferredGather | 133241 | 20 | 6662 | 2020-09-04 19:07:03 | 193 | | | | 0.41 + [...] + 494 | 221 | ReadTable | 20000000 | 20 | 1000000 | 2020-09-04 19:07:01 | 220 | 20MB | | public.part | 0.1 - When you see ``DeferredGather`` operations taking more than a few seconds, that's a sign that you're selecting too much data. - In this case, the DeferredGather with node ID 166 took over 21 seconds. + If you notice that ``DeferredGather`` operations are taking more than a few seconds, it could indicate that you're selecting a large amount of data. For example, in this case, the ``DeferredGather`` with node ID 166 took over 21 seconds. -#. Modify the statement to see the difference - Altering the select clause to be more restrictive will reduce the deferred gather time back to a few milliseconds. +#. + + Modify the statement by making the ``SELECT`` clause more restrictive. + + This adjustment will reduce the ``DeferredGather`` time from several seconds to just a few milliseconds. .. code-block:: postgres - SELECT DATEPART(year, o_orderdate) AS o_year, - l_extendedprice * (1 - l_discount / 100.0) as volume, - n2.n_name as nation - FROM ... + SELECT + DATEPART(year, o_orderdate) AS o_year, + l_extendedprice * (1 - l_discount / 100.0) as volume, + n2.n_name as nation + FROM ... Common Solutions for Reducing Gather Time -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* Reduce the effect of the preparation time. Avoid selecting unnecessary columns (``SELECT * FROM...``), or reduce the result set size by using more filters. -.. `` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
list-table:: + :widths: auto + :header-rows: 1 + + * - Solution + - Description + * - minimizing preparation time + - To minimize preparation time, avoid selecting unnecessary columns (e.g., ``SELECT * FROM`` ...) or reduce the result set size by applying more filters. -3. Inefficient Filtering --------------------------------- -When running statements, SQream DB tries to avoid reading data that is not needed for the statement by :ref:`skipping chunks`. -If statements do not include efficient filtering, SQream DB will read a lot of data off disk. -In some cases, you need the data and there's nothing to do about it. However, if most of it gets pruned further down the line, -it may be efficient to skip reading the data altogether by using the :ref:`metadata`. +Inefficient Filtering +--------------------- + +When executing statements, SQreamDB optimizes data retrieval by :ref:`skipping unnecessary chunks`. However, if statements lack efficient filtering, SQreamDB may end up reading excessive data from disk. Identifying the Situation -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -We consider the filtering to be inefficient when the ``Filter`` node shows that the number of rows processed is less -than a third of the rows passed into it by the ``ReadTable`` node. -For example: +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Filtering is considered inefficient when the ``Filter`` node processes less than one-third of the rows passed into it by the ``ReadTable`` node. + #. Run a query. In this example, we execute a modified query from the TPC-H benchmark. + Our ``lineitem`` table contains 600,037,902 rows. .. code-block:: postgres - SELECT o_year, - SUM(CASE WHEN nation = 'BRAZIL' THEN volume ELSE 0 END) / SUM(volume) AS mkt_share - FROM (SELECT datepart(YEAR,o_orderdate) AS o_year, - l_extendedprice*(1 - l_discount / 100.0) AS volume, - n2.n_name AS nation - FROM lineitem - JOIN part ON p_partkey = CAST (l_partkey AS INT) - JOIN orders ON l_orderkey = o_orderkey - JOIN customer ON o_custkey = c_custkey - JOIN nation n1 ON c_nationkey = n1.n_nationkey - JOIN region ON n1.n_regionkey = r_regionkey - JOIN supplier ON s_suppkey = l_suppkey - JOIN nation n2 ON s_nationkey = n2.n_nationkey - WHERE r_name = 'AMERICA' - AND lineitem.l_quantity = 3 - AND o_orderdate BETWEEN '1995-01-01' AND '1996-12-31' - AND high_selectivity(p_type = 'ECONOMY BURNISHED NICKEL')) AS all_nations - GROUP BY o_year - ORDER BY o_year; + SELECT + o_year, + SUM( + CASE + WHEN nation = 'BRAZIL' THEN volume + ELSE 0 + END + ) / SUM(volume) AS mkt_share + FROM + ( + SELECT + datepart(YEAR, o_orderdate) AS o_year, + l_extendedprice * (1 - l_discount / 100.0) AS volume, + n2.n_name AS nation + FROM + lineitem + JOIN part ON p_partkey = CAST (l_partkey AS INT) + JOIN orders ON l_orderkey = o_orderkey + JOIN customer ON o_custkey = c_custkey + JOIN nation n1 ON c_nationkey = n1.n_nationkey + JOIN region ON n1.n_regionkey = r_regionkey + JOIN supplier ON s_suppkey = l_suppkey + JOIN nation n2 ON s_nationkey = n2.n_nationkey + WHERE + r_name = 'AMERICA' + AND lineitem.l_quantity = 3 + AND o_orderdate BETWEEN '1995-01-01' AND '1996-12-31' + AND high_selectivity(p_type = 'ECONOMY BURNISHED NICKEL') + ) AS all_nations + GROUP BY + o_year + ORDER BY + o_year; #. - Observe the execution information by using the foreign table, or use ``show_node_info`` + Use a foreign table or ``SHOW_NODE_INFO`` to view the execution information. The execution below has been shortened, but note the highlighted rows for ``ReadTable`` and ``Filter``: - .. code-block:: psql + .. 
code-block:: postgres
      :linenos:
      :emphasize-lines: 9,17,19,27

-      t=> SELECT show_node_info(559);
+      SELECT SHOW_NODE_INFO(559);
       stmt_id | node_id | node_type            | rows      | chunks | avg_rows_in_chunk | time                | parent_node_id | read   | write | comment         | timeSum
       --------+---------+----------------------+-----------+--------+-------------------+---------------------+----------------+--------+-------+-----------------+--------
       559     | 1       | PushToNetworkQueue   | 2         | 1      | 2                 | 2020-09-07 11:12:01 | -1             |        |       |                 | 0.28
@@ -555,19 +617,20 @@ For example:
       559     | 214     | Rechunk              | 20000000  | 20     | 1000000           | 2020-09-07 11:11:57 | 213            |        |       |                 | 0
       559     | 215     | CpuDecompress        | 20000000  | 20     | 1000000           | 2020-09-07 11:11:57 | 214            |        |       |                 | 0
       559     | 216     | ReadTable            | 20000000  | 20     | 1000000           | 2020-09-07 11:11:57 | 215            | 20MB   |       | public.part     | 0
+
+   Note the following:
+
+   * The ``Filter`` on line 9 has processed 12,007,447 rows, but the output of ``ReadTable`` on ``public.lineitem`` on line 17 was 600,037,902 rows.
+
+     This means that it has filtered out 98% (:math:`1 - \dfrac{12007447}{600037902} = 98\%`) of the data, but the entire table was read.

-   *
-     The ``Filter`` on line 9 has processed 12,007,447 rows, but the output of ``ReadTable`` on ``public.lineitem``
-     on line 17 was 600,037,902 rows. This means that it has filtered out 98% (:math:`1 - \dfrac{600037902}{12007447} = 98\%`)
-     of the data, but the entire table was read.
-
-   *
-     The ``Filter`` on line 19 has processed 133,000 rows, but the output of ``ReadTable`` on ``public.part``
-     on line 27 was 20,000,000 rows. This means that it has filtered out >99% (:math:`1 - \dfrac{133241}{20000000} = 99.4\%`)
-     of the data, but the entire table was read. However, this table is small enough that we can ignore it.
+   * The ``Filter`` on line 19 has processed 133,241 rows, but the output of ``ReadTable`` on ``public.part`` on line 27 was 20,000,000 rows.

-#. Modify the statement to see the difference
-   Altering the statement to have a ``WHERE`` condition on the clustered ``l_orderkey`` column of the ``lineitem`` table will help SQream DB skip reading the data.
+     This means that it has filtered out >99% (:math:`1 - \dfrac{133241}{20000000} = 99.3\%`) of the data, but the entire table was read. However, this table is small enough that we can ignore it.
+
+#. Modify the statement by adding a ``WHERE`` condition on the clustered ``l_orderkey`` column of the ``lineitem`` table.
+
+   This adjustment will enable SQreamDB to skip reading unnecessary data.

    .. code-block:: postgres
       :emphasize-lines: 15
@@ -592,11 +655,11 @@ For example:
          GROUP BY o_year
          ORDER BY o_year;

-   .. code-block:: psql
+   .. code-block:: postgres
       :linenos:
       :emphasize-lines: 5,13

-      t=> SELECT show_node_info(586);
+      SELECT SHOW_NODE_INFO(586);
       stmt_id | node_id | node_type            | rows      | chunks | avg_rows_in_chunk | time                | parent_node_id | read   | write | comment         | timeSum
       --------+---------+----------------------+-----------+--------+-------------------+---------------------+----------------+--------+-------+-----------------+--------
       [...]
@@ -610,67 +673,86 @@ For example:
       586     | 197     | CpuDecompress        | 494927872 | 8      | 61865984          | 2020-09-07 13:20:44 | 196            |        |       |                 | 0
       586     | 198     | ReadTable            | 494927872 | 8      | 61865984          | 2020-09-07 13:20:44 | 197            | 6595MB |       | public.lineitem | 0.09
       [...]
-   In this example, the filter processed 494,621,593 rows, while the output of ``ReadTable`` on ``public.lineitem``
-   was 494,927,872 rows. This means that it has filtered out all but 0.01% (:math:`1 - \dfrac{494621593}{494927872} = 0.01\%`)
-   of the data that was read.
-   The metadata skipping has performed very well, and has pre-filtered the data for us by pruning unnecessary chunks.
+   Note the following:
+
+   * The filter processed 494,621,593 rows, while the output of ``ReadTable`` on ``public.lineitem`` was 494,927,872 rows.
+
+     This means that the filter removed only 0.06% (:math:`1 - \dfrac{494621593}{494927872} = 0.06\%`) of the data that was read.
+
+   * The metadata skipping has performed very well, and has pre-filtered the data for us by pruning unnecessary chunks.

Common Solutions for Improving Filtering
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-* Use :ref:`clustering keys and naturally ordered data` in your filters.
-* Avoid full table scans when possible
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-4. Joins with ``varchar`` Keys
------------------------------------
-Joins on long text keys, such as ``varchar(100)`` do not perform as well as numeric data types or very short text keys.
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Solution
+     - Description
+   * - :ref:`Clustering keys` and ordering data
+     - Utilize clustering keys and naturally ordered data to enhance filtering efficiency.
+   * - Avoiding full table scans
+     - Minimize full table scans by applying targeted filtering conditions.
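+
+As a rough sketch of the clustering-key solution above (the ``lineitem_clustered`` name is hypothetical; see the clustering keys guide for the full syntax), the table could be recreated so that filters on ``l_orderkey`` can prune chunks through metadata:
+
+.. code-block:: postgres
+
+   -- Hypothetical sketch: recreate the table with a clustering key on the
+   -- column that WHERE conditions usually filter on
+   CREATE TABLE lineitem_clustered (
+      l_orderkey BIGINT NOT NULL
+      -- ... remaining lineitem columns ...
+   ) CLUSTER BY l_orderkey;
+
+   -- One-time reload; metadata can then skip chunks for l_orderkey filters
+   INSERT INTO lineitem_clustered SELECT * FROM lineitem;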
+
+Joins with ``TEXT`` Keys
+------------------------
+
+Joins on long ``TEXT`` keys perform worse than joins on numeric data types or on very short ``TEXT`` keys.

Identifying the Situation
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
-When a join is inefficient, you may note that a query spends a lot of time on the ``Join`` node.
-For example, consider these two table structures:
+When a join is inefficient, you may observe that a query spends a significant amount of time on the ``Join`` node.
+
+Consider these two table structures:

 .. code-block:: postgres

-   CREATE TABLE t_a
-   (
-     amt FLOAT NOT NULL,
-     i INT NOT NULL,
-     ts DATETIME NOT NULL,
-     country_code VARCHAR(3) NOT NULL,
-     flag VARCHAR(10) NOT NULL,
-     fk VARCHAR(50) NOT NULL
-   );
-   CREATE TABLE t_b
-   (
-     id VARCHAR(50) NOT NULL
-     prob FLOAT NOT NULL,
-     j INT NOT NULL,
-   );
+
+   CREATE TABLE
+     t_a (
+       amt FLOAT NOT NULL,
+       i INT NOT NULL,
+       ts DATETIME NOT NULL,
+       country_code TEXT NOT NULL,
+       flag TEXT NOT NULL,
+       fk TEXT NOT NULL
+     );
+
+   CREATE TABLE
+     t_b (
+       id TEXT NOT NULL,
+       prob FLOAT NOT NULL,
+       j INT NOT NULL
+     );

#. Run a query.

-   In this example, we will join ``t_a.fk`` with ``t_b.id``, both of which are ``VARCHAR(50)``.
+   In this example, we join ``t_a.fk`` with ``t_b.id``, both of which are ``TEXT``.

   .. code-block:: postgres

-      SELECT AVG(t_b.j :: BIGINT),
-             t_a.country_code
-      FROM t_a
-      JOIN t_b ON (t_a.fk = t_b.id)
-      GROUP BY t_a.country_code
+      SELECT
+        AVG(t_b.j :: BIGINT),
+        t_a.country_code
+      FROM
+        t_a
+        JOIN t_b ON (t_a.fk = t_b.id)
+      GROUP BY
+        t_a.country_code;
+
#.
-   Observe the execution information by using the foreign table, or use ``show_node_info``
+   Use a foreign table or ``SHOW_NODE_INFO`` to view the execution information.

   The execution below has been shortened, but note the highlighted rows for ``Join``.
-   The ``Join`` node is by far the most time-consuming part of this statement - clocking in at 69.7 seconds - joining 1.5 billion records.

-   .. code-block:: psql
+   .. 
code-block:: postgres :linenos: :emphasize-lines: 8 - t=> SELECT show_node_info(5); + SELECT SHOW_NODE_INFO(5); stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum --------+---------+----------------------+------------+--------+-------------------+---------------------+----------------+-------+-------+------------+-------- [...] @@ -686,62 +768,70 @@ For example, consider these two table structures: 5 | 41 | CpuDecompress | 10000000 | 2 | 5000000 | 2020-09-08 18:26:09 | 40 | | | | 0 5 | 42 | ReadTable | 10000000 | 2 | 5000000 | 2020-09-08 18:26:09 | 41 | 14MB | | public.t_a | 0 -Improving Query Performance -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* In general, try to avoid ``VARCHAR`` as a join key. As a rule of thumb, ``BIGINT`` works best as a join key. -* - Convert text values on-the-fly before running the query. For example, the :ref:`crc64` function takes a text - input and returns a ``BIGINT`` hash. + Note the following: - For example: + * The ``Join`` node is the most time-consuming part of this statement, taking 69.7 seconds to join 1.5 billion records. - .. code-block:: postgres +Common Solutions for Improving Query Performance +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In general, try to avoid ``TEXT`` as a join key. As a rule of thumb, ``BIGINT`` works best as a join key. + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Solution + - Description + * - Mapping + - Use a dimension table to map ``TEXT`` values to ``NUMERIC`` types, and then reconcile these values as needed by joining the dimension table. + * - Conversion + - Use functions like :ref:`crc64` to convert ``TEXT`` values into BIGINT hashes directly before running the query. + + For example: + + .. code-block:: postgres - SELECT AVG(t_b.j :: BIGINT), - t_a.country_code - FROM t_a - JOIN t_b ON (crc64_join(t_a.fk) = crc64_join(t_b.id)) - GROUP BY t_a.country_code - The execution below has been shortened, but note the highlighted rows for ``Join``. - The ``Join`` node went from taking nearly 70 seconds, to just 6.67 seconds for joining 1.5 billion records. + SELECT AVG(t_b.j::BIGINT), t_a.country_code + FROM "public"."t_a" + JOIN "public"."t_b" ON (CRC64(t_a.fk::TEXT) = CRC64(t_b.id::TEXT)) + GROUP BY t_a.country_code; + + The execution below has been shortened, but note the highlighted rows for ``Join``. + The ``Join`` node went from taking nearly 70 seconds, to just 6.67 seconds for joining 1.5 billion records. - .. code-block:: psql - :linenos: - :emphasize-lines: 8 + .. code-block:: postgres + :linenos: + :emphasize-lines: 8 - t=> SELECT show_node_info(6); - stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum - --------+---------+----------------------+------------+--------+-------------------+---------------------+----------------+-------+-------+------------+-------- - [...] - 6 | 19 | GpuTransform | 1497366528 | 85 | 17825792 | 2020-09-08 18:57:04 | 18 | | | | 1.48 - 6 | 20 | ReorderInput | 1497366528 | 85 | 17825792 | 2020-09-08 18:57:04 | 19 | | | | 0 - 6 | 21 | ReorderInput | 1497366528 | 85 | 17825792 | 2020-09-08 18:57:04 | 20 | | | | 0 - 6 | 22 | Join | 1497366528 | 85 | 17825792 | 2020-09-08 18:57:04 | 21 | | | inner | 6.67 - 6 | 24 | AddSortedMinMaxMet.. | 6291456 | 1 | 6291456 | 2020-09-08 18:55:12 | 22 | | | | 0 - [...] - 6 | 32 | ReadTable | 6291456 | 1 | 6291456 | 2020-09-08 18:55:12 | 31 | 235MB | | public.t_b | 0.02 - [...] 
-      6       | 43      | CpuDecompress        | 10000000   | 2      | 5000000           | 2020-09-08 18:55:13 | 42             |       |       |            | 0
-      6       | 44      | ReadTable            | 10000000   | 2      | 5000000           | 2020-09-08 18:55:13 | 43             | 14MB  |       | public.t_a | 0
-
-* You can map some text values to numeric types by using a dimension table. Then, reconcile the values when you need them by joining the dimension table.
-
-5. Sorting on big ``VARCHAR`` fields
---------------------------------------
-In general, SQream DB automatically inserts a ``Sort`` node which arranges the data prior to reductions and aggregations.
-When running a ``GROUP BY`` on large ``VARCHAR`` fields, you may see nodes for ``Sort`` and ``Reduce`` taking a long time.
+          SELECT SHOW_NODE_INFO(6);
+          stmt_id | node_id | node_type            | rows       | chunks | avg_rows_in_chunk | time                | parent_node_id | read  | write | comment    | timeSum
+          --------+---------+----------------------+------------+--------+-------------------+---------------------+----------------+-------+-------+------------+--------
+          [...]
+          6       | 19      | GpuTransform         | 1497366528 | 85     | 17825792          | 2020-09-08 18:57:04 | 18             |       |       |            | 1.48
+          6       | 20      | ReorderInput         | 1497366528 | 85     | 17825792          | 2020-09-08 18:57:04 | 19             |       |       |            | 0
+          6       | 21      | ReorderInput         | 1497366528 | 85     | 17825792          | 2020-09-08 18:57:04 | 20             |       |       |            | 0
+          6       | 22      | Join                 | 1497366528 | 85     | 17825792          | 2020-09-08 18:57:04 | 21             |       |       | inner      | 6.67
+          6       | 24      | AddSortedMinMaxMet.. | 6291456    | 1      | 6291456           | 2020-09-08 18:55:12 | 22             |       |       |            | 0
+          [...]
+          6       | 32      | ReadTable            | 6291456    | 1      | 6291456           | 2020-09-08 18:55:12 | 31             | 235MB |       | public.t_b | 0.02
+          [...]
+          6       | 43      | CpuDecompress        | 10000000   | 2      | 5000000           | 2020-09-08 18:55:13 | 42             |       |       |            | 0
+          6       | 44      | ReadTable            | 10000000   | 2      | 5000000           | 2020-09-08 18:55:13 | 43             | 14MB  |       | public.t_a | 0
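+
+As a rough sketch of the mapping approach from the table above (the ``id_map``, ``t_a_num``, and ``t_b_num`` table names are hypothetical), the ``TEXT`` keys can be replaced once with ``BIGINT`` surrogate keys so that all subsequent joins are numeric:
+
+.. code-block:: postgres
+
+   -- Hypothetical one-time mapping of each distinct TEXT key to a BIGINT
+   CREATE TABLE id_map AS
+      SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS id_num
+      FROM (SELECT DISTINCT id FROM t_b) AS d;
+
+   -- Materialize the numeric keys; later joins are then BIGINT = BIGINT
+   CREATE TABLE t_b_num AS
+      SELECT m.id_num, b.prob, b.j
+      FROM t_b AS b JOIN id_map AS m ON (b.id = m.id);
+
+   CREATE TABLE t_a_num AS
+      SELECT a.amt, a.i, a.ts, a.country_code, a.flag, m.id_num AS fk_num
+      FROM t_a AS a JOIN id_map AS m ON (a.fk = m.id);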
+
+Sorting on Big ``TEXT`` Fields
+------------------------------
+
+In SQreamDB, a ``Sort`` node is automatically added to organize data prior to reductions and aggregations. When executing a ``GROUP BY`` operation on extensive ``TEXT`` fields, you might observe that the ``Sort`` and subsequent ``Reduce`` nodes require a considerable amount of time to finish.

Identifying the Situation
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
-When running a statement, inspect it with :ref:`show_node_info`. If you see ``Sort`` and ``Reduce`` among
+If you see ``Sort`` and ``Reduce`` among
 your top five longest running nodes, there is a potential issue.
-For example:
-#.
-   Run a query to test it out.
-
-
-   Our ``t_inefficient`` table contains 60,000,000 rows, and the structure is simple, but with an oversized ``country_code`` column:
-
+
+Consider this ``t_inefficient`` table, which contains 60,000,000 rows; its structure is simple, but its ``country_code`` column is oversized:
+
 .. code-block:: postgres
    :emphasize-lines: 5

@@ -749,21 +839,22 @@ For example:
       i INT NOT NULL,
       amt DOUBLE NOT NULL,
       ts DATETIME NOT NULL,
-      country_code VARCHAR(100) NOT NULL,
-      flag VARCHAR(10) NOT NULL,
-      string_fk VARCHAR(50) NOT NULL
+      country_code TEXT(100) NOT NULL,
+      flag TEXT NOT NULL,
+      string_fk TEXT NOT NULL
    );
+
+#.
+   Run a query.

-   We will run a query, and inspect it's execution details:
-
-   .. code-block:: psql
+   .. code-block:: postgres

-      t=> SELECT country_code,
-      .          SUM(amt)
-      .   FROM t_inefficient
-      .   GROUP BY country_code;
-      executed
-      time: 47.55s
+      SELECT
+        country_code,
+        SUM(amt)
+      FROM t_inefficient
+      GROUP BY country_code;
+
      country_code | sum
      -------------+-----------
@@ -772,11 +863,13 @@ For example:
      TUR          | 1195946178
      [...]

+#.
+
+   Use a foreign table or ``SHOW_NODE_INFO`` to view the execution information.

-   .. code-block:: psql
+   .. code-block:: postgres
      :emphasize-lines: 8,9

-      t=> select show_node_info(30);
+      SELECT SHOW_NODE_INFO(30);
       stmt_id | node_id | node_type          | rows     | chunks | avg_rows_in_chunk | time                | parent_node_id | read  | write | comment              | timeSum
       --------+---------+--------------------+----------+--------+-------------------+---------------------+----------------+-------+-------+----------------------+--------
       30      | 1       | PushToNetworkQueue | 249      | 1      | 249               | 2020-09-10 16:17:10 | -1             |       |       |                      | 0.25
@@ -792,104 +885,126 @@ For example:
       30      | 11      | CpuDecompress      | 60000000 | 15     | 4000000           | 2020-09-10 16:17:10 | 10             |       |       |                      | 0
       30      | 12      | ReadTable          | 60000000 | 15     | 4000000           | 2020-09-10 16:17:10 | 11             | 520MB |       | public.t_inefficient | 0.05

-#. We can look to see if there's any shrinking we can do on the ``GROUP BY`` key
+#. Look to see if there's any shrinking that can be done on the ``GROUP BY`` key:

-   .. code-block:: psql
+   .. code-block:: postgres

-      t=> SELECT MAX(LEN(country_code)) FROM t_inefficient;
+      SELECT MAX(LEN(country_code)) FROM t_inefficient;
      max
      ---
      3

-   With a maximum string length of just 3 characters, our ``VARCHAR(100)`` is way oversized.
+   With a maximum string length of just 3 characters, our ``TEXT(100)`` is way oversized.

#.
-   We can recreate the table with a more restrictive ``VARCHAR(3)``, and can examine the difference in performance:
-
-   .. code-block:: psql
-      t=> CREATE TABLE t_efficient
-      .    AS SELECT i,
-      .              amt,
-      .              ts,
-      .              country_code::VARCHAR(3) AS country_code,
-      .              flag
-      .    FROM t_inefficient;
-      executed
-      time: 16.03s
-
-      t=> SELECT country_code,
-      .          SUM(amt::bigint)
-      .   FROM t_efficient
-      .   GROUP BY country_code;
-      executed
-      time: 4.75s
-      country_code | sum
-      -------------+-----------
-      VUT          | 1195416012
-      GIB          | 1195710372
-      TUR          | 1195946178
-      [...]
+   Recreate the table with a more restrictive ``TEXT(3)``, and examine the difference in performance:

-   This time, the entire query took just 4.75 seconds, or just about 91% faster.
+   .. code-block:: postgres
+
+      CREATE TABLE
+        t_efficient AS
+      SELECT
+        i,
+        amt,
+        ts,
+        country_code :: TEXT(3) AS country_code,
+        flag
+      FROM
+        t_inefficient;
+
+      SELECT
+        country_code,
+        SUM(amt :: bigint)
+      FROM
+        t_efficient
+      GROUP BY
+        country_code;

-Improving Sort Performance on Text Keys
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-When using VARCHAR, ensure that the maximum length defined in the table structure is as small as necessary.
-For example, if you're storing phone numbers, don't define the field as ``VARCHAR(255)``, as that affects sort performance.
+      country_code | sum
+      -------------+-----------
+      VUT          | 1195416012
+      GIB          | 1195710372
+      TUR          | 1195946178
+      [...]

-You can run a query to get the maximum column length (e.g. ``MAX(LEN(a_column))``), and potentially modify the table structure.
+   This time, the query took just 4.75 seconds, about 91% faster.
+
+Common Solutions for Improving Sort Performance on ``TEXT`` Keys
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Solution
+     - Description
+   * - Using Appropriate Text Length
+     - Define the maximum length of ``TEXT`` fields in your table structure as small as necessary. For example, if you're storing phone numbers, avoid defining the field as ``TEXT(255)`` to optimize sort performance. 
+ * - Optimize Column Length + - Execute a query to determine the maximum length of data in the column (e.g., ``MAX(LEN(a_column))``) and consider modifying the table structure based on this analysis. .. _high_selectivity_data_opt: -6. High Selectivity Data --------------------------- -Selectivity is the ratio of cardinality to the number of records of a chunk. We define selectivity as :math:`\frac{\text{Distinct values}}{\text{Total number of records in a chunk}}` -SQream DB has a hint called ``HIGH_SELECTIVITY``, which is a function you can wrap a condition in. -The hint signals to SQream DB that the result of the condition will be very sparse, and that it should attempt to rechunk -the results into fewer, fuller chunks. +High Selectivity Data +--------------------- + +In SQreamDB, selectivity refers to the ratio of distinct values to the total number of records within a chunk. It is defined by the formula: :math:`\frac{\text{Distinct values}}{\text{Total number of records in a chunk}}` + +SQreamDB provides a hint called ``HIGH_SELECTIVITY`` that can be used to optimize queries. When you wrap a condition with this hint, it signals to SQreamDB that the result of the condition will yield a sparse output. As a result, SQreamDB attempts to rechunk the results into fewer, fuller chunks for improved performance. + .. note:: - SQream DB doesn't do this automatically because it adds a significant overhead on naturally ordered and - well-clustered data, which is the more common scenario. + SQreamDB does not apply this optimization automatically because it introduces significant overhead for naturally ordered and well-clustered data, which is the more common scenario. Identifying the Situation -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This is easily identifiable - when the amount of average of rows in a chunk is small, following a ``Filter`` operation. -Consider this execution plan: +^^^^^^^^^^^^^^^^^^^^^^^^^ + +This condition is easily identifiable when the average number of rows in a chunk is small, particularly after a Filter operation. + +Consider the following execution plan: -.. code-block:: psql +.. code-block:: postgres - t=> select show_node_info(30); + SELECT SHOW_NODE_INFO(30); stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum --------+---------+-------------------+-----------+--------+-------------------+---------------------+----------------+-------+-------+------------+-------- [...] 30 | 38 | Filter | 18160 | 74 | 245 | 2020-09-10 12:17:09 | 37 | | | | 0.012 [...] 30 | 44 | ReadTable | 77000000 | 74 | 1040540 | 2020-09-10 12:17:09 | 43 | 277MB | | public.dim | 0.058 -The table was read entirely - 77 million rows into 74 chunks. -The filter node reduced the output to just 18,160 relevant rows, but they're distributed across the original 74 chunks. -All of these rows could fit in one single chunk, instead of spanning 74 rather sparse chunks. - -Improving Performance with High Selectivity Hints -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* - Use when there's a ``WHERE`` condition on an :ref:`unclustered column`, and when you expect the filter - to cut out more than 60% of the result set. -* Use when the data is uniformly distributed or random - -7. Performance of unsorted data in joins ------------------------------------------- + +The table was initially read entirely, containing 77 million rows divided into 74 chunks. 
After applying a filter node, the output was reduced to just 18,160 relevant rows, which are still distributed across the original 74 chunks. However, all these rows could fit into a single chunk instead of spanning across 74 sparsely populated chunks. + +Common Solutions for Improving Performance with High Selectivity Hints +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Solution + - Description + * - Using ``HIGH_SELECTIVITY`` hint + - + * When a ``WHERE`` condition is used on an :ref:`unclustered column`, especially if you anticipate the filter to reduce more than 60% of the result set + + * When the data is uniformly distributed or random + + + +Performance of Unsorted Data in Joins +------------------------------------- + When data is not well-clustered or naturally ordered, a join operation can take a long time. Identifying the Situation -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -When running a statement, inspect it with :ref:`show_node_info`. If you see ``Join`` and ``DeferredGather`` among your -top five longest running nodes, there is a potential issue. -In this case, we're also interested in the number of chunks produced by these nodes. +^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you identify ``Join`` and ``DeferredGather`` as two of the top five longest running nodes, this could indicate a potential issue. Additionally, it's important to consider the number of chunks generated by these nodes in such cases. Consider this execution plan: -.. code-block:: psql +.. code-block:: postgres :emphasize-lines: 6,11 - t=> select show_node_info(30); + SELECT SHOW_NODE_INFO(30); stmt_id | node_id | node_type | rows | chunks | avg_rows_in_chunk | time | parent_node_id | read | write | comment | timeSum --------+---------+-------------------+-----------+--------+-------------------+---------------------+----------------+-------+-------+------------+-------- [...] @@ -904,19 +1019,17 @@ Consider this execution plan: 30 | 38 | Filter | 18160 | 74 | 245 | 2020-09-10 12:17:09 | 37 | | | | 0.012 [...] 30 | 44 | ReadTable | 77000000 | 74 | 1040540 | 2020-09-10 12:17:09 | 43 | 277MB | | public.dim | 0.058 -* ``Join`` is the node that matches rows from both table relations. -* ``DeferredGather`` gathers the required column chunks to decompress -Pay special attention to the volume of data removed by the ``Filter`` node. -The table was read entirely - 77 million rows into 74 chunks. -The filter node reduced the output to just 18,160 relevant rows, but they're distributed across the original 74 chunks. -All of these rows could fit in one single chunk, instead of spanning 74 rather sparse chunks. + +The ``Join`` node performs row matching between table relations, while ``DeferredGather`` is responsible for gathering necessary column chunks for decompression. Notably, closely monitor the data volume filtered out by the ``Filter`` node. + +The table of 77 million rows was read into 74 chunks. After applying a filter, only 18,160 relevant rows remained, dispersed across these 74 chunks. Ideally, these rows could be consolidated into a single chunk rather than spanning multiple sparse chunks. Improving Join Performance when Data is Sparse -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -You can tell SQream DB to reduce the amount of chunks involved, if you know that the filter is going to be quite -agressive by using the :ref:`HIGH_SELECTIVITY` hint described :ref:`above`. 
-This forces the compiler to rechunk the data into fewer chunks.
-To tell SQream DB to rechunk the data, wrap a condition (or several) in the ``HIGH_SELECTIVITY`` hint:
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To optimize performance in SQreamDB, especially when dealing with aggressive filtering, you can use the :ref:`HIGH_SELECTIVITY` hint as described :ref:`above`. This hint instructs the compiler to rechunk the data into fewer chunks.
+
+To apply this optimization, wrap your filtering condition (or conditions) with the ``HIGH_SELECTIVITY`` hint like this:

 .. code-block:: postgres
    :emphasize-lines: 13
@@ -925,34 +1038,32 @@ To tell SQream DB to rechunk the data, wrap a condition (or several) in the ``HI
    SELECT *
    FROM cdrs
    WHERE
-        RequestReceiveTime BETWEEN '2018-01-01 00:00:00.000' AND '2018-08-31 23:59:59.999'
-        AND EnterpriseID=1150
-        AND MSISDN='9724871140341';
+     RequestReceiveTime BETWEEN '2018-01-01 00:00:00.000' AND '2018-08-31 23:59:59.999'
+     AND EnterpriseID=1150
+     AND MSISDN='9724871140341';

    -- With the hint
    SELECT *
    FROM cdrs
    WHERE
-        HIGH_SELECTIVITY(RequestReceiveTime BETWEEN '2018-01-01 00:00:00.000' AND '2018-08-31 23:59:59.999')
-        AND EnterpriseID=1150
-        AND MSISDN='9724871140341';
+     HIGH_SELECTIVITY(RequestReceiveTime BETWEEN '2018-01-01 00:00:00.000' AND '2018-08-31 23:59:59.999')
+     AND EnterpriseID=1150
+     AND MSISDN='9724871140341';

-8. Manual Join Reordering
--------------------------------
-When joining multiple tables, you may wish to change the join order to join the smallest tables first.
+Manual Join Reordering
+----------------------
+
+When performing joins involving multiple tables, consider changing the join order to start with the smallest tables first.

 Identifying the situation
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-When joining more than two tables, the ``Join`` nodes will be the most time-consuming nodes.
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When joining more than two tables, the ``Join`` nodes typically represent the most time-consuming operations.

 Changing the Join Order
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Always prefer to join the smallest tables first.
-.. note::
-   We consider small tables to be tables that only retain a small amount of rows after conditions
-   are applied. This bears no direct relation to the amount of total rows in the table.
-Changing the join order can reduce the query runtime significantly. In the examples below, we reduce the time
-from 27.3 seconds to just 6.4 seconds.
+^^^^^^^^^^^^^^^^^^^^^^^
+
+It's advisable to prioritize joining the smallest tables first. By small tables, we mean tables that retain a relatively low number of rows after applying filtering conditions, regardless of the total row count in the table.
+
+Changing the join order in this way can reduce query runtime significantly. In the examples below, it cuts the runtime from 27.3 seconds to just 6.4 seconds, a reduction of about 77%.

 .. code-block:: postgres
    :caption: Original query
@@ -993,5 +1104,6 @@ from 27.3 seconds to just 6.4 seconds.
    GROUP BY c_nationkey

 Further Reading
-==================
+===============
+
 See our :ref:`sql_best_practices` guide for more information about query optimization and data loading considerations. 
\ No newline at end of file diff --git a/operational_guides/nba-t10.csv b/operational_guides/nba-t10.csv new file mode 100644 index 000000000..024530355 --- /dev/null +++ b/operational_guides/nba-t10.csv @@ -0,0 +1,10 @@ +Name,Team,Number,Position,Age,Height,Weight,College,Salary +Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0 +Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0 +John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University, +R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0 +Jonas Jerebko,Boston Celtics,8.0,PF,29.0,6-10,231.0,,5000000.0 +Amir Johnson,Boston Celtics,90.0,PF,29.0,6-9,240.0,,12000000.0 +Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0 +Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0 +Terry Rozier,Boston Celtics,12.0,PG,22.0,6-2,190.0,Louisville,1824360.0 diff --git a/operational_guides/optimization_best_practices.rst b/operational_guides/optimization_best_practices.rst index 1cc0ca01e..0ce1b1add 100644 --- a/operational_guides/optimization_best_practices.rst +++ b/operational_guides/optimization_best_practices.rst @@ -1,108 +1,88 @@ .. _sql_best_practices: -********************************** +******************************* Optimization and Best Practices -********************************** +******************************* -This topic explains some best practices of working with SQream DB. +This topic explains some best practices of working with SQreamDB. See also our :ref:`monitoring_query_performance` guide for more information. -.. contents:: In this topic: +.. contents:: :local: + :depth: 1 .. _table_design_best_practices: Table design -============== -This section describes best practices and guidelines for designing tables. +============ -Use date and datetime types for columns ------------------------------------------ -When creating tables with dates or timestamps, using the purpose-built ``DATE`` and ``DATETIME`` types over integer types or ``VARCHAR`` will bring performance and storage footprint improvements, and in many cases huge performance improvements (as well as data integrity benefits). SQream DB stores dates and datetimes very efficiently and can strongly optimize queries using these specific types. +Using ``DATE`` and ``DATETIME`` Data Types +------------------------------------------ -Reduce varchar length to a minimum --------------------------------------- - -With the ``VARCHAR`` type, the length has a direct effect on query performance. - -If the size of your column is predictable, by defining an appropriate column length (no longer than the maximum actual value) you will get the following benefits: - -* Data loading issues can be identified more quickly +When creating tables with dates or timestamps, using the purpose-built ``DATE`` and ``DATETIME`` types over integer types or ``TEXT`` will bring performance and storage footprint improvements, and in many cases huge performance improvements (as well as data integrity benefits). SQreamDB stores dates and datetimes very efficiently and can strongly optimize queries using these specific types. -* SQream DB can reserve less memory for decompression operations - -* Third-party tools that expect a data size are less likely to over-allocate memory - -Don't flatten or denormalize data ------------------------------------ +Avoiding Data flattening and Denormalization +-------------------------------------------- -SQream DB executes JOIN operations very effectively. 
It is almost always better to JOIN tables at query-time rather than flatten/denormalize your tables. +SQreamDB executes ``JOIN`` operations very effectively. It is almost always better to ``JOIN`` tables at query-time rather than flatten/denormalize your tables. This will also reduce storage size and reduce row-lengths. -We highly suggest using ``INT`` or ``BIGINT`` as join keys, rather than a text/string type. +We highly suggest using ``INT`` or ``BIGINT`` as join keys, rather than a ``TEXT`` or ``STRING`` type. -Convert foreign tables to native tables -------------------------------------------- +Converting Foreign Tables to Native Tables +------------------------------------------ -SQream DB's native storage is heavily optimized for analytic workloads. It is always faster for querying than other formats, even columnar ones such as Parquet. It also enables the use of additional metadata to help speed up queries, in some cases by many orders of magnitude. +SQreamDB's native storage is heavily optimized for analytic workloads. It is always faster for querying than other formats, even columnar ones such as Parquet. It also enables the use of additional metadata to help speed up queries, in some cases by many orders of magnitude. -You can improve the performance of all operations by converting :ref:`foreign tables` into native tables by using the :ref:`create_table_as` syntax. +You can improve the performance of all operations by converting :ref:`foreign_tables` into native tables by using the :ref:`create_table_as` syntax. For example, .. code-block:: postgres - CREATE TABLE native_table AS SELECT * FROM external_table + CREATE TABLE native_table AS SELECT * FROM foreign_table; The one situation when this wouldn't be as useful is when data will be only queried once. -Use information about the column data to your advantage -------------------------------------------------------------- +Leveraging Column Data Information +---------------------------------- Knowing the data types and their ranges can help design a better table. -Set ``NULL`` or ``NOT NULL`` when relevant -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Appropriately Using ``NULL`` and ``NOT NULL`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -For example, if a value can't be missing (or ``NULL``), specify a ``NOT NULL`` constraint on the columns. +For example, if a value cannot be missing (or ``NULL``), specify a ``NOT NULL`` constraint on the columns. Not only does specifying ``NOT NULL`` save on data storage, it lets the query compiler know that a column cannot have a ``NULL`` value, which can improve query performance. -Keep VARCHAR lengths to a minimum -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -While it won't make a big difference in storage, large strings allocate a lot of memory at query time. - -If a column's string length never exceeds 50 characters, specify ``VARCHAR(50)`` rather than an arbitrarily large number. - - Sorting -============== +======= Data sorting is an important factor in minimizing storage size and improving query performance. -* Minimizing storage saves on physical resources and increases performance by reducing overall disk I/O. Prioritize the sorting of low-cardinality columns. This reduces the number of chunks and extents that SQream DB reads during query execution. +* Minimizing storage saves on physical resources and increases performance by reducing overall disk I/O. Prioritize the sorting of low-cardinality columns. 
This reduces the number of chunks and extents that SQreamDB reads during query execution. -* Where possible, sort columns with the lowest cardinality first. Avoid sorting ``VARCHAR`` and ``TEXT/NVARCHAR`` columns with lengths exceeding 50 characters. +* Where possible, sort columns with the lowest cardinality first. Avoid sorting ``TEXT`` columns with lengths exceeding 50 characters. -* For longer-running queries that run on a regular basis, performance can be improved by sorting data based on the ``WHERE`` and ``GROUP BY`` parameters. Data can be sorted during insert by using :ref:`external_tables` or by using :ref:`create_table_as`. +* For longer-running queries that run on a regular basis, performance can be improved by sorting data based on the ``WHERE`` and ``GROUP BY`` parameters. Data can be sorted during insert by using :ref:`foreign_tables` or by using :ref:`create_table_as`. .. _query_best_practices: -Query best practices -===================== +Query Best Practices +==================== This section describes best practices for writing SQL queries. -Reduce data sets before joining tables ------------------------------------------ +Reducing Datasets Before Joining Tables +--------------------------------------- Reducing the input to a ``JOIN`` clause can increase performance. -Some queries benefit from retreiving a reduced dataset as a subquery prior to a join. +Some queries benefit from retrieving a reduced dataset as a subquery prior to a join. For example, @@ -125,11 +105,11 @@ Can be rewritten as group by 2) AS fact ON dim.store_id=fact.store_id; -Prefer the ANSI JOIN ----------------------------- +Using ANSI ``JOIN`` +------------------- -SQream DB prefers the ANSI JOIN syntax. -In some cases, the ANSI JOIN performs better than the non-ANSI variety. +SQreamDB prefers the ANSI ``JOIN`` syntax. +In some cases, the ANSI ``JOIN`` performs better than the non-ANSI variety. For example, this ANSI JOIN example will perform better: @@ -143,7 +123,7 @@ For example, this ANSI JOIN example will perform better: JOIN "Customers" as c ON s.c_id = c.id AND c.id = 20301125; -This non-ANSI JOIN is supported, but not recommended: +This non-ANSI ``JOIN`` is supported, but not recommended: .. code-block:: postgres :caption: Non-ANSI JOIN may not perform well @@ -158,17 +138,17 @@ This non-ANSI JOIN is supported, but not recommended: .. _high_selectivity: -Use the high selectivity hint --------------------------------- +Using High-Selectivity hint +--------------------------- Selectivity is the ratio of cardinality to the number of records of a chunk. We define selectivity as :math:`\frac{\text{Distinct values}}{\text{Total number of records in a chunk}}` -SQream DB has a hint function called ``HIGH_SELECTIVITY``, which is a function you can wrap a condition in. +SQreamDB has a hint function called ``HIGH_SELECTIVITY``, which is a function you can wrap a condition in. -The hint signals to SQream DB that the result of the condition will be very sparse, and that it should attempt to rechunk +The hint signals to SQreamDB that the result of the condition will be very sparse, and that it should attempt to rechunk the results into fewer, fuller chunks. -Use the high selectivity hint when you expect a predicate to filter out most values. For example, when the data is dispersed over lots of chunks (meaning that the data is :ref:`not well-clustered`). +Use the high selectivity hint when you expect a predicate to filter out most values. 
For example, when the data is dispersed over lots of chunks (meaning that the data is :ref:`not well-clustered`).

For example,

@@ -184,8 +164,8 @@ This hint tells the query compiler that the ``WHERE`` condition is expected to f

Read more about identifying the scenarios for the high selectivity hint in our :ref:`Monitoring query performance guide`.

-Cast smaller types to avoid overflow in aggregates
------------------------------------------------------
+Avoiding Aggregation Overflow
+-----------------------------

 When using an ``INT`` or smaller type, the ``SUM`` and ``COUNT`` operations return a value of the same type. To avoid overflow on large results, cast the column up to a larger type.

 For example

@@ -198,33 +178,32 @@ For example

    GROUP BY 1;

-Prefer ``COUNT(*)`` and ``COUNT`` on non-nullable columns
------------------------------------------------------------
+Prefer ``COUNT(*)`` and ``COUNT`` on Non-nullable Columns
+---------------------------------------------------------

-SQream DB optimizes ``COUNT(*)`` queries very strongly. This also applies to ``COUNT(column_name)`` on non-nullable columns. Using ``COUNT(column_name)`` on a nullable column will operate quickly, but much slower than the previous variations.
+SQreamDB optimizes ``COUNT(*)`` queries very strongly. This also applies to ``COUNT(column_name)`` on non-nullable columns. Using ``COUNT(column_name)`` on a nullable column will operate quickly, but much slower than the previous variations.

-Return only required columns
-------------------------------
+Returning Only Required Columns
+-------------------------------

 Returning only the columns you need to client programs can improve overall query performance. This also reduces the overall result set, which can improve performance in third-party tools.

-SQream is able to optimize out unneeded columns very strongly due to its columnar storage.
+SQreamDB is able to optimize out unneeded columns very strongly due to its columnar storage.

-Use saved queries to reduce recurring compilation time
--------------------------------------------------------
+Reducing Recurring Compilation Time
+-----------------------------------

-:ref:`saved_queries` are compiled when they are created. The query plan is saved in SQream DB's metadata for later re-use.
+:ref:`saved_queries` are compiled when they are created. The query plan is saved in SQreamDB's metadata for later re-use.

-Because the query plan is saved, they can be used to reduce compilation overhead, especially with very complex queries, such as queries with lots of values in an :ref:`IN` predicate.
+Saved query plans reduce compilation overhead, especially with very complex queries, such as queries with lots of values in an :ref:`IN` predicate.

 When executed, the saved query plan is recalled and executed on the up-to-date data stored on disk.

-See how to use saved queries in the :ref:`saved queries guide`. 
-Pre-filter to reduce :ref:`JOIN` complexity --------------------------------------------------------- +Reducing :ref:`JOIN` Complexity +-------------------------------------- Filter and reduce table sizes prior to joining on them @@ -232,7 +211,7 @@ Filter and reduce table sizes prior to joining on them SELECT store_name, SUM(amount) - FROM dimention dim + FROM dimension dim JOIN fact ON dim.store_id = fact.store_id WHERE p_date BETWEEN '2019-07-01' AND '2019-07-31' GROUP BY store_name; @@ -243,7 +222,7 @@ Can be rewritten as: SELECT store_name, sum_amount - FROM dimention AS dim + FROM dimension AS dim INNER JOIN (SELECT SUM(amount) AS sum_amount, store_id FROM fact @@ -253,23 +232,20 @@ Can be rewritten as: .. _data_loading_considerations: -Data loading considerations -================================= +Data Loading Considerations +=========================== -Allow and use natural sorting on data ----------------------------------------- +Using Natural Data Sorting +-------------------------- Very often, tabular data is already naturally ordered along a dimension such as a timestamp or area. -This natural order is a major factor for query performance later on, as data that is naturally sorted can be more easily compressed and analyzed with SQream DB's metadata collection. +This natural order is a major factor for query performance later on, as data that is naturally sorted can be more easily compressed and analyzed with SQreamDB's metadata collection. For example, when data is sorted by timestamp, filtering on this timestamp is more effective than filtering on an unordered column. Natural ordering can also be used for effective :ref:`delete` operations. -Further reading and monitoring query performance -======================================================= - -Read our :ref:`monitoring_query_performance` guide to learn how to use the built in monitoring utilities. -The guide also gives concerete examples for improving query performance. +Use the :ref:`monitoring_query_performance` guide to learn about built-in monitoring utilities. +The guide also gives concrete examples for improving query performance. diff --git a/operational_guides/oracle_migration.rst b/operational_guides/oracle_migration.rst new file mode 100644 index 000000000..92fbd3a2a --- /dev/null +++ b/operational_guides/oracle_migration.rst @@ -0,0 +1,751 @@ +.. _oracle_migration: + +********************** +Oracle Migration Guide +********************** + +This guide is designed to assist those who wish to migrate their database systems from Oracle to SQreamDB. Use this guide to learn how to use the most commonly used Oracle functions with their equivalents in SQreamDB. For functions that do not have direct equivalents in SQreamDB, we provide :ref:`User-Defined Functions (UDFs)`. If you need further assistance, our `SQream support team `_ is available to help with any custom UDFs or additional migration questions. + +.. contents:: + :local: + :depth: 2 + +Using SQream Commands, Statements, and UDFs +=========================================== + +Operation Functions +------------------- + +.. 
list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Oracle
+     - SQream
+     - Description
+   * - ``+`` (unary)
+     - ``+`` (unary)
+     - +a
+   * - ``+``
+     - ``+``
+     - a + b
+   * - ``-`` (unary)
+     - ``-`` (unary)
+     - -a
+   * - ``-``
+     - ``-``
+     - a - b
+   * - ``*``
+     - ``*``
+     - a * b
+   * - ``/``
+     - ``/``
+     - a / b
+   * - ``%``
+     - ``%``
+     - a % b
+   * - ``&``
+     - ``&``
+     - AND
+   * - ``~``
+     - ``~``
+     - NOT
+   * - ``|``
+     - ``|``
+     - OR
+   * - ``<<``
+     - ``<<``
+     - Shift left
+   * - ``>>``
+     - ``>>``
+     - Shift right
+   * - ``XOR``
+     - ``XOR``
+     - XOR
+
+Conditional Functions
+---------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Oracle
+     - SQream
+     - Description
+   * - ``BETWEEN``
+     - ``BETWEEN``
+     - Value is in [ or not within ] the range
+   * - ``CASE``
+     - ``CASE``
+     - Tests a conditional expression and, depending on the result, evaluates additional expressions
+   * - ``COALESCE``
+     - ``COALESCE``
+     - Evaluates the first non-NULL expression
+   * - ``IN``
+     - ``IN``
+     - Value is in [ or not within ] a set of values
+   * - ``ISNULL``
+     - ``ISNULL``
+     - Alias for COALESCE with two expressions
+   * - ``IS_ASCII``
+     - ``IS_ASCII``
+     - Tests a TEXT value for ASCII-only characters
+   * - ``IS_NULL``
+     - ``IS NULL``
+     - Checks for NULL [ or non-NULL ] values
+   * - ``DECODE``
+     - ``DECODE``
+     - Compares an expression to a set of search values and returns the corresponding result (if-then-else logic)
+
+Conversion Functions
+--------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Oracle
+     - SQream
+     - Description
+   * - ``TO_DATE``
+     - ``CAST``
+     - Converts a string to a date
+   * - ``TO_NUMBER``
+     - .. code-block:: postgres
+
+          CREATE OR REPLACE FUNCTION to_number(t text)
+            RETURNS numeric
+            AS $$
+            SELECT CAST(t AS NUMERIC)
+            $$ LANGUAGE SQL
+            ;
+     - Converts a string to a number
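+
+As a short usage sketch of the conversions above (the literals are illustrative, and the ``NUMERIC`` precision is an assumption):
+
+.. code-block:: postgres
+
+   -- Oracle: TO_DATE('2023-06-01', 'YYYY-MM-DD')
+   SELECT CAST('2023-06-01' AS DATE);
+
+   -- Oracle: TO_NUMBER('123.45')
+   SELECT CAST('123.45' AS NUMERIC(12,2));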
+
+Numeric Functions
+-----------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Oracle
+     - SQream
+     - Description
+   * - ``ABS``
+     - ``ABS``
+     - Calculates the absolute value of an argument
+   * - ``ACOS``
+     - ``ACOS``
+     - Calculates the inverse cosine of an argument
+   * - ``ASIN``
+     - ``ASIN``
+     - Calculates the inverse sine of an argument
+   * - ``ATAN``
+     - ``ATAN``
+     - Calculates the inverse tangent of an argument
+   * - ``ATN2``
+     - ``ATN2``
+     - Calculates the inverse tangent for a point (y, x)
+   * - ``BITAND``
+     - ``&``
+     - Computes an AND operation on the bits of expr1 and expr2
+   * - ``CEIL``
+     - ``CEILING``, ``CEIL``
+     - Calculates the next integer for an argument
+   * - ``COS``
+     - ``COS``
+     - Calculates the cosine of an argument
+   * - ``COSH``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION COSH(x double)
+            RETURNS double
+            AS $$
+            SELECT (exp(x) + exp(-1*x))/2
+            $$ LANGUAGE SQL
+            ;
+     - Returns the hyperbolic cosine of n
+   * - NA
+     - ``COT``
+     - Calculates the cotangent of an argument
+   * - NA
+     - ``CRC64``
+     - Calculates a CRC-64 hash of an argument
+   * - NA
+     - ``DEGREES``
+     - Converts a value from radian values to degrees
+   * - ``EXP``
+     - ``EXP``
+     - Calculates the natural exponent for an argument
+   * - ``FLOOR``
+     - ``FLOOR``
+     - Calculates the largest integer smaller than the argument
+   * - ``LN``
+     - ``LOG``
+     - Returns the natural logarithm of n
+   * - ``LOG(b,n)``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION log(b double, n double)
+            RETURNS double
+            AS $$
+            SELECT (log(n)/log(b))
+            $$ LANGUAGE SQL
+            ;
+     - Calculates the logarithm of n to base b
+   * - ``LOG(10,x)``
+     - ``LOG10``
+     - Calculates the 10-based log for an argument
+   * - ``MOD``
+     - ``MOD``, ``%``
+     - Calculates the modulus (remainder) of two arguments
+   * - NA
+     - ``PI``
+     - Returns the constant value for π
+   * - ``NANVL``
+     - NA
+     - Returns an alternative value when the input is NaN; useful only for floating-point numbers
+   * - ``POWER``
+     - ``POWER``
+     - Calculates x to the power of y (x^y)
+   * - NA
+     - ``SQUARE``
+     - Returns the square value of a numeric expression (x^2)
+   * - NA
+     - ``RADIANS``
+     - Converts a value from degree values to radians
+   * - ``REMAINDER``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION remainder(n1 bigint, n2 bigint)
+            RETURNS bigint
+            AS $$
+            SELECT (n1 - floor(n1/n2)*n2)
+            $$ LANGUAGE SQL
+            ;
+     - Returns the remainder of n1 divided by n2
+   * - ``ROUND (number)``
+     - ``ROUND``
+     - Rounds an argument to the nearest integer
+   * - ``SIGN``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION my_sign(n bigint)
+            RETURNS int
+            AS $$
+            SELECT case when n < 0 then -1 when n = 0 then 0 when n > 0 then 1 end
+            $$ LANGUAGE SQL
+            ;
+     - Returns the sign of the input value
+   * - ``SIN``
+     - ``SIN``
+     - Calculates the sine
+   * - ``SINH``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION SINH(x double)
+            RETURNS double
+            AS $$
+            SELECT (exp(x) - exp(-1*x))/2
+            $$ LANGUAGE SQL
+            ;
+     - Calculates the hyperbolic sine
+   * - ``SQRT``
+     - ``SQRT``
+     - Calculates the square root
+   * - ``TAN``
+     - ``TAN``
+     - Calculates the tangent
+   * - ``TANH``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION TANH(x double)
+            RETURNS double
+            AS $$
+            SELECT (exp(x) - exp(-1*x))/(exp(x) + exp(-1*x))
+            $$ LANGUAGE SQL
+            ;
+     - Calculates the hyperbolic tangent
+   * - ``TRUNC (number)``
+     - ``TRUNC``
+     - Rounds a number to its integer representation towards 0
+   * - ``WIDTH_BUCKET(value, low, high, num_buckets)``
+     - .. code-block:: postgres
+
+          CREATE or replace FUNCTION myWIDTH_BUCKET(value float, low float, high float, num_buckets int )
+            RETURNS INT
+            AS $$
+            select CASE
+            WHEN value < low THEN 0
+            WHEN value >= high THEN num_buckets + 1
+            ELSE (FLOOR((value - low) / ((high - low) / num_buckets)) + 1)::INT END
+            $$ LANGUAGE SQL
+            ;
+     - Returns the ID of the bucket into which the value of a specific expression falls
+   * - NA
+     - ``TO_HEX``
+     - Converts an integer to a hexadecimal representation
+
+Character Functions Returning Character Values
+----------------------------------------------
+
+.. 
list-table:: + :widths: auto + :header-rows: 1 + + * - Oracle + - SQream + - Description + * - ``CHR`` + - ``CHR`` + - Returns the character having the binary equivalent + * - ``CONCAT`` + - ``||`` (Concatenate) + - Concatenates all the specified strings and returns the final string + * - ``INITCAP`` + - NA + - Returns char, with the first letter of each word in uppercase + * - ``LOWER`` + - ``LOWER`` + - Returns char, with all letters lowercase + * - ``LPAD`` + - NA + - Returns expr1, left-padded to length n characters + * - ``LTRIM`` + - ``LTRIM`` + - Removes from the left end of char + * - ``NLS_INITCAP`` + - NA + - Returns char, with the first letter of each word in uppercase + * - ``NLS_LOWER`` + - NA + - Returns char, with all letters lowercase + * - ``NLSSORT`` + - NA + - Returns the string of bytes used to sort char + * - ``NLS_UPPER`` + - NA + - Returns char, with all letters uppercase + * - ``REGEXP_REPLACE`` + - ``REGEXP_REPLACE`` + - Replaces a substring in a string that matches a specified pattern + * - ``REGEXP_SUBSTR`` + - ``REGEXP_SUBSTR`` + - Returns a substring of an argument that matches a regular expression + * - ``REPLACE`` + - ``REPLACE`` + - Replaces characters in a string + * - ``RPAD`` + - NA + - Right pads a string to a specified length + * - ``RTRIM`` + - ``RTRIM`` + - Removes the space from the right side of a string + * - ``SOUNDEX`` + - NA + - Converts a normal string into a string of the SOUNDEX type + * - ``SUBSTR`` + - ``SUBSTRING``, ``SUBSTR`` + - Returns a substring of an argument + * - ``TRANSLATE`` + - NA + - Returns ``expr`` with all occurrences of each character in ``from_string``, replaced by its corresponding character + * - ``TRIM`` + - ``TRIM`` + - Trims whitespaces from an argument + * - ``UPPER`` + - ``UPPER`` + - Converts an argument to an upper-case equivalent + * - NA + - ``REPEAT`` + - Repeats a string as many times as specified + * - NA + - ``REVERSE`` + - Returns a reversed order of a character string + * - NA + - ``LEFT`` + - Returns the left part of a character string with the specified number of characters + * - NA + - ``RIGHT`` + - Returns the right part of a character string with the specified number of characters + * - NA + - ``LIKE`` + - Tests if a string matches a given pattern. SQL patterns + * - NA + - ``RLIKE`` + - Tests if a string matches a given regular expression pattern. POSIX regular expressions + * - NA + - ``ISPREFIXOF`` + - Checks if one string is a prefix of the other + +Character Functions Returning Number Values +------------------------------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Oracle + - SQream + - Description + * - ``ASCII`` + - NA + - Returns the decimal representation in the database character set + * - ``INSTR`` + - ``CHARINDEX`` + - Search string for substring + * - ``LENGTH`` + - ``CHAR_LENGTH`` + - Calculates the length of a string in characters + * - NA + - ``LEN`` + - Calculates the number of characters in a string. 
(This function is provided for SQL Server compatibility)
+   * - NA
+     - ``OCTET_LENGTH``
+     - Calculates the number of bytes in a string
+   * - NA
+     - ``CHARINDEX``
+     - Returns the starting position of a string inside another string
+   * - NA
+     - ``PATINDEX``
+     - Returns the starting position of a string inside another string
+   * - ``REGEXP_COUNT``
+     - ``REGEXP_COUNT``
+     - Calculates the number of matches of a regular expression
+   * - ``REGEXP_INSTR``
+     - ``REGEXP_INSTR``
+     - Returns the start position of a regular expression match in an argument
+
+Datetime Functions
+------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Oracle
+     - SQream
+     - Description
+   * - ``ADD_MONTHS``
+     - NA
+     - Returns the date resulting from adding a number of months to a specified date
+   * - NA
+     - ``CURDATE``
+     - This function is equivalent to CURRENT_DATE
+   * - ``CURRENT_DATE``
+     - ``CURRENT_DATE``
+     - Returns the current date as DATE
+   * - ``CURRENT_TIMESTAMP``
+     - ``CURRENT_TIMESTAMP``
+     - Equivalent to ``GETDATE``
+   * - ``DBTIMEZONE``
+     - NA
+     - Returns the value of the database time zone
+   * - ``EXTRACT`` (datetime)
+     - ``EXTRACT``
+     - ANSI syntax for extracting date or time element from a date expression
+   * - ``FROM_TZ``
+     - NA
+     - Converts a timestamp value and a time zone to a TIMESTAMP WITH TIME ZONE value
+   * - ``LAST_DAY``
+     - ``EOMONTH``
+     - Returns the last day of the month in which the specified date value falls
+   * - NA
+     - ``CURRENT_TIMESTAMP``
+     - Returns the current date and time in the session time zone
+   * - ``MONTHS_BETWEEN``
+     - ``DATEDIFF``
+     - Returns the number of months between specified date values
+   * - ``NEW_TIME``
+     - NA
+     - Returns the date and time in a specified time zone
+   * - ``NEXT_DAY``
+     - NA
+     - Returns the date of the first weekday that is later than a specified date
+   * - ``NUMTODSINTERVAL``
+     - NA
+     - Converts n to an INTERVAL DAY TO SECOND literal
+   * - ``NUMTOYMINTERVAL``
+     - NA
+     - Converts number n to an INTERVAL YEAR TO MONTH literal
+   * - ``ORA_DST_AFFECTED``
+     - NA
+     - Determines whether a datetime value is affected by a change of the time zone data file
+   * - ``ORA_DST_CONVERT``
+     - NA
+     - Converts a datetime value, with error handling, when the time zone data file changes
+   * - ``ORA_DST_ERROR``
+     - NA
+     - Returns an error indicator for a datetime value affected by a change of the time zone data file
+   * - ``ROUND`` (date)
+     - ``ROUND``
+     - Rounds a date to a specified precision
+   * - ``SESSIONTIMEZONE``
+     - NA
+     - Returns the time zone of the current session
+   * - ``SYS_EXTRACT_UTC``
+     - NA
+     - Extracts the UTC from a datetime value with time zone offset
+   * - ``SYSDATE``
+     - ``SYSDATE``
+     - Equivalent to ``GETDATE``
+   * - ``SYSTIMESTAMP``
+     - ``CURRENT_TIMESTAMP``
+     - Returns the current timestamp
+   * - ``TO_CHAR`` (datetime)
+     - NA
+     - Converts a date value to a string in a specified format
+   * - ``TO_TIMESTAMP``
+     - NA
+     - Converts a character string to a value of TIMESTAMP datatype
+   * - ``TO_TIMESTAMP_TZ``
+     - NA
+     - Converts a character string to a value of TIMESTAMP WITH TIME ZONE datatype
+   * - ``TO_DSINTERVAL``
+     - NA
+     - Converts a character string of CHAR datatype to an INTERVAL DAY TO SECOND value
+   * - ``TO_YMINTERVAL``
+     - NA
+     - Converts a character string of CHAR datatype to an INTERVAL YEAR TO MONTH value
+   * - ``TRUNC`` (date)
+     - ``TRUNC``
+     - Truncates a date element down to a specified date or time element
+   * - ``TZ_OFFSET``
+     - NA
+     - Returns the time zone offset
+   * - NA
+     - ``DATEADD``
+     - Adds or subtracts an interval to a ``DATE`` or ``DATETIME`` value
+   * - NA
+     - ``DATEDIFF``
+     - Calculates the difference between two DATE or DATETIME expressions, in terms of a specific date part
+   * - NA
+     - ``DATEPART``
+     - Extracts a date or time part from a ``DATE`` or ``DATETIME`` value
+   * - NA
+     - ``GETDATE``
+     - Returns the current date and time of the system
+   * - NA
+     - ``TO_UNIXTS``, ``TO_UNIXTSMS``
+     - Converts a ``DATETIME`` value to a ``BIGINT`` representing a ``UNIX`` timestamp
+   * - NA
+     - ``FROM_UNIXTS``, ``FROM_UNIXTSMS``
+     - Converts a ``BIGINT`` representing a ``UNIX`` timestamp to a ``DATETIME`` value
+
+
+General Comparison Functions
+----------------------------
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Oracle
+     - SQream
+     - Description
+   * - ``GREATEST``
+     - NA
+     - Returns the greatest of a list of one or more expressions
+   * - ``LEAST``
+     - NA
+     - Returns the least of a list of one or more expressions
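+
+Since ``GREATEST`` and ``LEAST`` have no direct SQreamDB equivalents, two-argument UDFs in the same style as the examples above can serve as a starting point (the names ``greatest2`` and ``least2`` are hypothetical):
+
+.. code-block:: postgres
+
+   CREATE OR REPLACE FUNCTION greatest2(a double, b double)
+     RETURNS double
+     AS $$
+     SELECT CASE WHEN a >= b THEN a ELSE b END
+     $$ LANGUAGE SQL
+     ;
+
+   CREATE OR REPLACE FUNCTION least2(a double, b double)
+     RETURNS double
+     AS $$
+     SELECT CASE WHEN a <= b THEN a ELSE b END
+     $$ LANGUAGE SQL
+     ;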
+ * - NA + - ``DATEDIFF`` + - Calculates the difference between two DATE or DATETIME expressions, in terms of a specific date part + * - NA + - ``DATEPART`` + - Extracts a date or time part from a ``DATE`` or ``DATETIME`` value + * - NA + - ``GETDATE`` + - Returns the current date and time of the system + * - NA + - ``TO_UNIXTS``, ``TO_UNIXTSMS`` + - Converts a ``DATETIME`` value to a ``BIGINT`` representing a ``UNIX`` timestamp + * - NA + - ``FROM_UNIXTS``, ``FROM_UNIXTSMS`` + - Converts a ``BIGINT`` representing a ``UNIX`` timestamp to a ``DATETIME`` value + + +General Comparison Functions +---------------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Oracle + - SQream + - Description + * - ``GREATEST`` + - NA + - Returns the greatest of a list of one or more expressions + * - ``LEAST`` + - NA + - Returns the least of a list of one or more expressions + +NULL-Related Functions +---------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Oracle + - SQream + - Description + * - ``COALESCE`` + - ``COALESCE`` + - Returns the first non-null expression + * - ``LNNVL`` + - NA + - Provides a concise way to evaluate a condition when one or both operands of the condition may be null + * - ``NANVL`` + - NA + - Returns an alternative value when the input value is NaN (not a number) + * - ``NULLIF`` + - ``IS NULL`` + - Compares two expressions and returns null if they are equal + * - ``NVL`` + - ``ISNULL`` + - Replaces null (returned as a blank) with a specified value in the results of a query + * - ``NVL2`` + - NA + - Returns one of two values depending on whether a specified expression is null or not null + +Aggregate Functions +------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Oracle + - SQream + - Description + * - ``AVG`` + - ``AVG`` + - Calculates the average of all of the values + * - ``CHECKSUM`` + - NA + - Detects changes in a table + * - ``COLLECT`` + - NA + - Takes as its argument a column of any type and creates a nested table + * - ``CORR`` + - ``CORR`` + - Calculates the Pearson correlation coefficient + * - ``COUNT`` + - ``COUNT`` + - Calculates the count of all of the values or only distinct values + * - ``COVAR_POP`` + - ``COVAR_POP`` + - Calculates population covariance of values + * - ``COVAR_SAMP`` + - ``COVAR_SAMP`` + - Calculates sample covariance of values + * - ``CUME_DIST`` + - ``CUME_DIST`` + - Calculates the cumulative distribution of a value in a group of values + * - ``FIRST`` + - ``FIRST_VALUE`` + - Returns the value located in the selected column of the first row of a segment + * - ``GROUP_ID`` + - NA + - Distinguishes duplicate groups resulting from a GROUP BY specification + * - ``GROUPING`` + - NA + - Distinguishes superaggregate rows from regular grouped rows + * - ``GROUPING_ID`` + - NA + - Returns a number corresponding to the GROUPING bit vector associated with a row + * - ``LAST`` + - ``LAST_VALUE`` + - Returns the value located in the selected column of the last row of a segment + * - NA + - ``NTH_VALUE`` + - Returns the value located in the selected column of a specified row of a segment + * - ``MAX`` + - ``MAX`` + - Returns the maximum of all values + * - ``MEDIAN`` + - NA + - Calculates the median value of a column + * - ``MIN`` + - ``MIN`` + - Returns the minimum of all values + * - NA + - ``NTILE`` + - Divides an ordered data set into a number of buckets + * - ``PERCENTILE_CONT`` + - ``PERCENTILE_CONT`` + - Inverse distribution function that assumes
a continuous distribution model + * - ``PERCENTILE_DISC`` + - ``PERCENTILE_DISC`` + - Inverse distribution function that assumes a discrete distribution model + * - ``PERCENT_RANK`` + - ``PERCENT_RANK`` + - The range of values returned by ``PERCENT_RANK`` is 0 to 1, inclusive + * - ``RANK`` + - ``RANK`` + - Calculates the rank of a value in a group of values + * - ``DENSE_RANK`` + - ``DENSE_RANK`` + - Computes the rank of a row in an ordered group of rows + * - ``STATS_BINOMIAL_TEST`` + - NA + - Exact probability test used for dichotomous variables + * - ``STATS_CROSSTAB`` + - NA + - Method used to analyze two nominal variables + * - ``STATS_F_TEST`` + - NA + - Tests whether two variances are significantly different + * - ``STATS_KS_TEST`` + - NA + - Kolmogorov-Smirnov function that compares two samples to test whether they come from the same population + * - ``STATS_MODE`` + - NA + - Takes as its argument a set of values and returns the value that occurs with the greatest frequency + * - ``STDDEV`` + - ``STDDEV`` + - Returns the population standard deviation of all input values + * - ``STDDEV_POP`` + - ``STDDEV_POP`` + - Calculates population standard deviation of values + * - ``STDDEV_SAMP`` + - ``STDDEV_SAMP`` + - Calculates sample standard deviation of values + * - ``SUM`` + - ``SUM`` + - Calculates the sum of all of the values or only distinct values + * - ``VAR_POP`` + - ``VAR_POP`` + - Calculates population variance of values + * - ``VAR_SAMP`` + - ``VAR_SAMP`` + - Calculates sample variance of values + * - ``VARIANCE`` + - ``VAR``, ``VARIANCE`` + - Returns the variance of an expression + +Analytic Functions +------------------ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Oracle + - SQream + - Description + * - NA + - ``MODE`` + - The ``MODE`` function returns the most common value in the selected column. If there are no repeating values, or if there is the same frequency of multiple values, this function returns the top value based on the ``ORDER BY`` clause + * - ``FEATURE_DETAILS`` + - NA + - Returns feature details for each row in the selection + * - ``FEATURE_ID`` + - NA + - Returns the identifier of the highest value feature for each row + * - ``FEATURE_SET`` + - NA + - Returns a set of feature ID and feature value pairs for each row + * - ``FEATURE_VALUE`` + - NA + - Returns a feature value for each row in the selection + * - ``LEAD`` + - ``LEAD`` + - Returns a value from a subsequent row within the partition of a result set + * - ``LAG`` + - ``LAG`` + - Returns a value from a previous row within the partition of a result set + * - ``PREDICTION`` + - NA + - Returns a prediction for each row in the selection + * - ``PREDICTION_COST`` + - NA + - Returns a measure of cost for each row in the selection + * - ``PREDICTION_DETAILS`` + - NA + - Returns prediction details for each row in the selection + * - ``PREDICTION_PROBABILITY`` + - NA + - Returns a probability for each row in the selection + * - ``PREDICTION_SET`` + - NA + - Returns a set of predictions with either probabilities or costs for each row + * - ``ROW_NUMBER`` + - ``ROW_NUMBER`` + - Assigns a unique number to each row to which it is applied \ No newline at end of file diff --git a/operational_guides/query_split.rst b/operational_guides/query_split.rst new file mode 100644 index 000000000..239f9e7e7 --- /dev/null +++ b/operational_guides/query_split.rst @@ -0,0 +1,375 @@ +.. _query_split: + +************ +Query Split +************ + +The split query operation optimizes long-running queries by executing them in parallel on different GPUs and/or Workers, reducing overall runtime.
This involves breaking down a complex query into parallel executions on small data subsets. To ensure an ordered result set aligned with the original complex query, two prerequisites are essential. First, create an empty table mirroring the original result set's structure. Second, define the ``@@SetResult`` operator to split the query using an ``INTEGER``, ``DATE``, or ``DATETIME`` column, as these types are compatible with the operator's ``min`` and ``max`` variables. + +Query splitting is available only in the UI, which provides a unique meta-scripting capability. Keep in mind that not all queries benefit from splitting, because the method itself introduces some runtime overhead. + +.. contents:: + :local: + :depth: 1 + +Syntax +======== + +Creating an empty table mirroring the original query result set's structure using the same DDL: + +.. code-block:: sql + + CREATE TABLE <table_name> + AS + ( + SELECT + -- Original query.. + WHERE + <false_filter> + ) + -- A false_filter example: 1=2 + +Defining the ``@@SetResult`` operator to split the original query using an ``INTEGER``, ``DATE``, or ``DATETIME`` column with ``min`` and ``max`` variables. If the column you're splitting by is used in a ``WHERE`` clause in the original query, use a ``WHERE`` clause when setting the ``@@SetResult`` operator as well. + +.. code-block:: sql + + @@SetResult minMax + SELECT + MIN(<column_name>) AS min, + MAX(<column_name>) AS max + FROM + <table_name> + [WHERE + <column_name> BETWEEN + -- Integer Range: + 1 AND 100 + | -- Date Range: + 'yyyy-mm-dd' AND 'yyyy-mm-dd' + | -- DateTime Range: + 'yyyy-mm-dd hh:mm:ss:SSS' AND 'yyyy-mm-dd hh:mm:ss:SSS'] + +Defining the operator that determines the number of instances (splits) based on the data type of the column by which the query is split: + +* **INTEGER column:** use the ``@@SplitQueryByNumber`` operator + +.. code-block:: sql + + @@SplitQueryByNumber instances = <number_of_instances>, from = minMax[0].min, to = minMax[0].max + INSERT INTO <table_name> + ( + SELECT + -- Original query.. + WHERE + <column_name> BETWEEN '${from}' and '${to}' + ) + +* **DATE column:** use the ``@@SplitQueryByDate`` operator + +.. code-block:: sql + + @@SplitQueryByDate instances = <number_of_instances>, from = minMax[0].min, to = minMax[0].max + INSERT INTO <table_name> + ( + SELECT + -- Original query.. + WHERE + <column_name> BETWEEN '${from}' and '${to}' + ) + +* **DATETIME column:** use the ``@@SplitQueryByDateTime`` operator + +.. code-block:: sql + + @@SplitQueryByDateTime instances = <number_of_instances>, from = minMax[0].min, to = minMax[0].max + INSERT INTO <table_name> + ( + SELECT + -- Original query.. + WHERE <column_name> BETWEEN '${from}' and '${to}' + ) + +Gathering results: + +.. code-block:: sql + + -- Gathering results for queries without aggregations: + + SELECT * + FROM + <table_name> + ; + + -- Gathering results for queries with aggregations and/or an AVERAGE: + + SELECT + <column_name> [,...], + [SUM([DISTINCT] expr) AS <sum_column>], + [SUM(count_column) AS <count_column>], + [SUM(avg_column1) / SUM(avg_column2) AS <avg_column>] + FROM + <table_name> + GROUP BY + <column_name> [,...] + ORDER BY + <column_name> + + -- Do not use a WHERE clause + +Example +======== + +.. contents:: + :local: + :depth: 1 + +Creating a Sample Table and Query +---------------------------------- + +To split your first query, create the following table and insert data into it: + +..
code-block:: sql + + CREATE TABLE MyTable ( + id INT, + name TEXT NOT NULL, + age INT, + salary INT, + quantity INT + ); + + -- Inserting data into the table + INSERT INTO MyTable (id, name, age, salary, quantity) + VALUES + (1, 'John', 25, 50000, 10), + (2, 'Jane', 30, 60000, 20), + (3, 'Bob', 28, 55000, 15), + (4, 'Emily', 35, 70000, 18), + (5, 'David', 32, 62000, 22), + (6, 'Sarah', 27, 52000, 12), + (7, 'Michael', 40, 75000, 17), + (8, 'Olivia', 22, 48000, 25), + (9, 'William', 31, 58000, 14), + (10, 'Sophia', 29, 56000, 19), + (11, 'Liam', 26, 51000, 13), + (12, 'Emma', 33, 64000, 16), + (13, 'Daniel', 24, 49000, 23), + (14, 'Ava', 37, 69000, 21), + (15, 'Matthew', 23, 47000, 28), + (16, 'Ella', 34, 67000, 24), + (17, 'James', 28, 55000, 11), + (18, 'Grace', 39, 72000, 26), + (19, 'Benjamin', 30, 60000, 18), + (20, 'Chloe', 25, 50000, 14), + (21, 'Logan', 38, 71000, 20), + (22, 'Mia', 27, 52000, 16), + (23, 'Christopher', 32, 62000, 22), + (24, 'Aiden', 29, 56000, 19), + (25, 'Lily', 36, 68000, 15), + (26, 'Jackson', 31, 58000, 23), + (27, 'Harper', 24, 49000, 12), + (28, 'Ethan', 35, 70000, 17), + (29, 'Isabella', 22, 48000, 25), + (30, 'Carter', 37, 69000, 14), + (31, 'Amelia', 26, 51000, 21), + (32, 'Lucas', 33, 64000, 19), + (33, 'Abigail', 28, 55000, 16), + (34, 'Mason', 39, 72000, 18), + (35, 'Evelyn', 30, 60000, 25), + (36, 'Alexander', 23, 47000, 13), + (37, 'Addison', 34, 67000, 22), + (38, 'Henry', 25, 50000, 20), + (39, 'Avery', 36, 68000, 15), + (40, 'Sebastian', 29, 56000, 24), + (41, 'Layla', 31, 58000, 11), + (42, 'Wyatt', 38, 71000, 26), + (43, 'Nora', 27, 52000, 19), + (44, 'Grayson', 32, 62000, 17), + (45, 'Scarlett', 24, 49000, 14), + (46, 'Gabriel', 35, 70000, 23), + (47, 'Hannah', 22, 48000, 16), + (48, 'Eli', 37, 69000, 25), + (49, 'Paisley', 28, 55000, 18), + (50, 'Owen', 33, 64000, 12); + +Next, we'll split the following query: + +.. code-block:: sql + + SELECT + age, + COUNT(*) AS total_people, + AVG(salary) AS avg_salary, + SUM(quantity) AS total_quantity, + SUM(CASE WHEN quantity > 20 THEN 1 ELSE 0 END) AS high_quantity_count, + SUM(CASE WHEN age BETWEEN 25 AND 30 THEN salary ELSE 0 END) AS total_salary_age_25_30 + FROM + MyTable + WHERE + salary > 55000 + GROUP BY + age + ORDER BY + age; + +Splitting the Query +-------------------- + +1. Prepare an empty table mirroring the original query result set’s structure with the same DDL, using a false filter under the ``WHERE`` clause. + + An **empty** table named ``FinalResult`` is created. + +.. code-block:: sql + + CREATE OR REPLACE TABLE FinalResult + AS + ( + SELECT + age, + COUNT(*) AS total_people, + SUM(salary) AS avg_salary, + COUNT(salary) AS avg_salary2, + SUM(quantity) AS total_quantity, + SUM(CASE WHEN quantity > 20 THEN 1 ELSE 0 END) AS high_quantity_count, + SUM(CASE WHEN age BETWEEN 25 AND 30 THEN salary ELSE 0 END) AS total_salary_age_25_30 + FROM + MyTable + WHERE + 1=0 + AND salary > 55000 + GROUP BY + age + ORDER BY + age + ); + +2. Set the ``@@SetResult`` operator to split the original query using ``min`` and ``max`` variables. + +.. code-block:: sql + + @@SetResult minMax + SELECT min(age) as min, max(age) as max + FROM MyTable + ; + +3. Set the ``@@SplitQueryByNumber`` operator with the number of instances (splits) of your query (here based on an ``INTEGER`` column), and set the ``between ${from} and ${to}`` clause with the name of the column by which you wish to split your query (here, the query is split by the ``age`` column). + +..
code-block:: sql + + @@SplitQueryByNumber instances = 4, from = minMax[0].min, to = minMax[0].max + INSERT INTO FinalResult + ( + SELECT + age, + COUNT(*) AS total_people, + SUM(salary) AS avg_salary, + COUNT(salary) AS avg_salary2, + SUM(quantity) AS total_quantity, + SUM(CASE WHEN quantity > 20 THEN 1 ELSE 0 END) AS high_quantity_count, + SUM(CASE WHEN age BETWEEN 25 AND 30 THEN salary ELSE 0 END) AS total_salary_age_25_30 + FROM + MyTable + WHERE + age between '${from}' and '${to}' + AND salary > 55000 + GROUP BY + age + ORDER BY + age + ); + +4. Gather the results of your query. + + For a query without aggregations, gathering the results would be a simple ``SELECT``: + +.. code-block:: sql + + SELECT * FROM FinalResult; + + Because the original query contains aggregations and an average, create a query that gathers the results of all instances (splits) from the table you created in step 1: + +.. code-block:: sql + + SELECT + age, + SUM(total_people) AS total_people, + SUM(avg_salary) / SUM(avg_salary2) AS avg_salary, + SUM(total_quantity) AS total_quantity, + SUM(high_quantity_count) AS high_quantity_count, + SUM(total_salary_age_25_30) AS total_salary_age_25_30 + FROM + FinalResult + GROUP BY + age + ORDER BY + age + ; + +5. Arrange all of the sequential scripts on one Editor tab. + +6. Ensure that each script ends with a ``;``. + +7. Ensure that the **Execute** button is set to **All** so that all queries are executed consecutively. + +8. Select the **Execute** button. + + All scripts are executed, splitting the initial query and producing a table that contains the final result set. + +Best Practices +================ + +General +-------- + +* When incorporating the ``LIMIT`` clause or any aggregate function in your query, split the query based only on a ``GROUP BY`` column. If no relevant columns are present in the ``GROUP BY`` clause, the query might not be suitable for splitting. + +* If you are not using aggregations, it's best to split the query using a column that appears in a ``WHERE`` or ``JOIN`` clause. + +* When splitting by a ``JOIN`` key, it is usually better to use the key of the smaller table. + +Choosing a Column to Split by +------------------------------ + +The column you split by must be sorted or mostly sorted, meaning that even if the column values are not perfectly ordered, they still follow a general sequence or trend. + +Aggregation Best Practices +-------------------------- + +Aggregate functions and special functions require adjustments in the query that gathers the results of all instances (splits) from the empty table: + +* ``COUNT`` becomes ``SUM`` + +* The following statement and functions are split into two columns in the split query and then merged back into a single statement or function in the final query: + + * ``AVERAGE`` + * User defined functions + * Variance functions + * Standard deviation functions + +Date as Number Best Practices +------------------------------- + +When a date is stored as a number, using the number of Workers as the number of instances may not produce the expected result, because SQream only checks the ``min`` and ``max`` values and splits the range evenly between them. For example, if the dates run from 20210101 to 20210630 and the query is split into 8 instances, each instance covers a numeric range of (20210630 - 20210101) / 8, so some instances cover ranges such as 20210432 to 20210499, which are plain numbers rather than real dates; only about 6 of the 8 splits contain relevant data. In such cases, adjust the number of instances to get splits of the right size. In this example, splitting into 64 instances means that each of the 8 Workers runs about 3 splits that contain actual data.
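+ +A minimal sketch of this sizing check, using hypothetical table and column names (``my_table``, ``date_as_number``): + +.. code-block:: sql + + -- Compare the raw numeric span with the count of values that are real dates; + -- their ratio suggests how much to scale up the instance count. + SELECT + MIN(date_as_number) AS min_val, + MAX(date_as_number) AS max_val, + COUNT(DISTINCT date_as_number) AS real_date_values + FROM + my_table; + -- For 20210101 through 20210630, the span is 529 but only about 181 values + -- are real dates, which is why far more instances than Workers are needed.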
\ No newline at end of file diff --git a/operational_guides/saved_queries.rst b/operational_guides/saved_queries.rst index d554b4dc8..2aade02e9 100644 --- a/operational_guides/saved_queries.rst +++ b/operational_guides/saved_queries.rst @@ -4,119 +4,72 @@ Saved Queries *********************** -Saved queries can be used to reuse a query plan for a query to eliminate compilation times for repeated queries. They also provide a way to implement 'parameterized views'. +The ``save_query`` command generates and stores an execution plan, saving compilation time for frequently used complex queries. Note that the saved execution plan is closely tied to the structure of its underlying tables; if any of the objects referenced in the query are modified, the saved query must be recreated. -How saved queries work -========================== +Saved queries undergo compilation during their creation. When executed, these queries utilize the precompiled query plan instead of compiling a new plan at query runtime. -Saved queries are compiled when they are created. When a saved query is run, this query plan is used instead of compiling a query plan at query time. +Syntax +====== -Parameters support -=========================== +Saved query syntax: -Query parameters can be used as substitutes for literal expressions in queries. +.. code-block:: sql -* Parameters cannot be used to substitute things like column names and table names. + -- Saving a query + SELECT SAVE_QUERY(saved_query_name, parameterized_query_string) + + -- Showing a saved query + SELECT SHOW_SAVED_QUERY(saved_query_name) -* Query parameters of a string datatype (like ``VARCHAR``) must be of a fixed length, and can be used in equality checks, but not patterns (e.g. :ref:`like`, :ref:`rlike`, etc.) + -- Listing saved queries + SELECT LIST_SAVED_QUERIES() + + -- Executing a saved query + SELECT EXECUTE_SAVED_QUERY(saved_query_name [, argument [, ...]]) + + -- Dropping a saved query + SELECT DROP_SAVED_QUERY(saved_query_name) -Creating a saved query -====================== + saved_query_name ::= string_literal + parameterized_query_string ::= string_literal + argument ::= string_literal | number_literal -A saved query is created using the :ref:`save_query` utility command. +Parameter Support +------------------ -Saving a simple query --------------------------- +Query parameters can be used as substitutes for constant expressions in queries. -.. code-block:: psql +* Parameters cannot be used to substitute identifiers like column names and table names. - t=> SELECT SAVE_QUERY('select_all','SELECT * FROM nba'); - executed +* Query parameters of a string datatype must be of a fixed length and may be used in equality checks but not with patterns such as :ref:`like` and :ref:`rlike`. -Saving a parametrized query ------------------------------------------- +Permissions +============ -Use parameters to replace them later at execution time. - - - -.. code-block:: psql - - t=> SELECT SAVE_QUERY('select_by_weight_and_team','SELECT * FROM nba WHERE Weight > ? AND Team = ?'); - executed - -.. TODO tip Use dollar quoting (`$$`) to avoid escaping strings. -.. this makes no sense unless you have a query which would otherwise need escaping -.. t=> SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? AND Team = ?$$); -..
executed - - -Listing and executing saved queries -====================================== - -Saved queries are saved as a database objects. They can be listed in one of two ways: - -Using the :ref:`catalog`: - -.. code-block:: psql - - t=> SELECT * FROM sqream_catalog.savedqueries; - name | num_parameters - --------------------------+--------------- - select_all | 0 - select_by_weight | 1 - select_by_weight_and_team | 2 - -Using the :ref:`list_saved_queries` utility function: - -.. code-block:: psql - - t=> SELECT LIST_SAVED_QUERIES(); - saved_query - ------------------------- - select_all - select_by_weight - select_by_weight_and_team - -Executing a saved query requires calling it by it's name in a :ref:`execute_saved_query` statement. A saved query with no parameter is called without parameters. - -.. code-block:: psql - - t=> SELECT EXECUTE_SAVED_QUERY('select_all'); - Name | Team | Number | Position | Age | Height | Weight | College | Salary - -------------------------+------------------------+--------+----------+-----+--------+--------+-----------------------+--------- - Avery Bradley | Boston Celtics | 0 | PG | 25 | 6-2 | 180 | Texas | 7730337 - Jae Crowder | Boston Celtics | 99 | SF | 25 | 6-6 | 235 | Marquette | 6796117 - John Holland | Boston Celtics | 30 | SG | 27 | 6-5 | 205 | Boston University | - R.J. Hunter | Boston Celtics | 28 | SG | 22 | 6-5 | 185 | Georgia State | 1148640 - [...] - -Executing a saved query with parameters requires specifying the parameters in the order they appear in the query: - -.. code-block:: psql +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Statement / Function + - Permission + * - :ref:`save_query` + - Saving a query requires no special permissions per se; however, the user must have permissions to access the tables referenced in the query, as well as permissions for any other elements the query uses. The user who saved the query is granted all permissions on the saved query. + * - :ref:`show_saved_query` + - Showing a saved query requires ``SELECT`` permissions on the saved query. + * - :ref:`list_saved_queries` + - Listing saved queries requires no special permissions. + * - :ref:`execute_saved_query` + - Executing a saved query requires ``USAGE`` permissions on the saved query and ``SELECT`` permissions to access the tables referenced in the query. + * - :ref:`drop_saved_query` + - Dropping a saved query requires ``DDL`` permissions on the saved query and ``SELECT`` permissions to access the tables referenced in the query. - t=> SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors'); - Name | Team | Number | Position | Age | Height | Weight | College | Salary - ------------------+-----------------+--------+----------+-----+--------+--------+-------------+-------- - Bismack Biyombo | Toronto Raptors | 8 | C | 23 | 6-9 | 245 | | 2814000 - James Johnson | Toronto Raptors | 3 | PF | 29 | 6-9 | 250 | Wake Forest | 2500000 - Jason Thompson | Toronto Raptors | 1 | PF | 29 | 6-11 | 250 | Rider | 245177 - Jonas Valanciunas | Toronto Raptors | 17 | C | 24 | 7-0 | 255 | | 4660482 +Parameterized Query +==================== +Parameterized queries, also known as prepared statements, allow parameters in a query to be replaced by actual values at execution time. They are created and managed in application code, primarily to optimize query execution, enhance security, and allow the reuse of query templates with different parameter values. -Dropping a saved query -============================= +..
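code-block:: sql + + -- A hypothetical usage example: executing the parameterized query that is + -- saved below, with one argument per ? placeholder, in order (this assumes + -- the nba table referenced by the saved query exists): + SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors'); + +The parameterized query itself is saved as follows: + +..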
code-block:: sql -When you're done with a saved query, or would like to replace it with another, you can drop it with :ref:`drop_saved_query`: + SELECT SAVE_QUERY('select_by_weight_and_team','SELECT * FROM nba WHERE Weight > ? AND Team = ?'); -.. code-block:: psql - t=> SELECT DROP_SAVED_QUERY('select_all'); - executed - t=> SELECT DROP_SAVED_QUERY('select_by_weight_and_team'); - executed - - t=> SELECT LIST_SAVED_QUERIES(); - saved_query - ------------------------- - select_by_weight diff --git a/operational_guides/seeing_system_objects_as_ddl.rst b/operational_guides/seeing_system_objects_as_ddl.rst deleted file mode 100644 index 4f9f596dd..000000000 --- a/operational_guides/seeing_system_objects_as_ddl.rst +++ /dev/null @@ -1,171 +0,0 @@ -.. _seeing_system_objects_as_ddl: - -******************************** -Seeing System Objects as DDL -******************************** - -Dump specific objects -=========================== - -Tables ----------- - -See :ref:`get_ddl` for more information. - -.. rubric:: Examples - -Getting the DDL for a table -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - farm=> SELECT GET_DDL('cool_animals'); - create table "public"."cool_animals" ( - "id" int not null, - "name" varchar(30) not null, - "weight" double null, - "is_agressive" bool default false not null ) - ; - -Exporting table DDL to a file -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: postgres - - COPY (SELECT GET_DDL('cool_animals')) TO '/home/rhendricks/animals.ddl'; - -Views ----------- - -See :ref:`get_view_ddl` for more information. - -.. rubric:: Examples - -Listing all views -^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - farm=> SELECT view_name FROM sqream_catalog.views; - view_name - ---------------------- - angry_animals - only_agressive_animals - - -Getting the DDL for a view -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - farm=> SELECT GET_VIEW_DDL('angry_animals'); - create view "public".angry_animals as - select - "cool_animals"."id" as "id", - "cool_animals"."name" as "name", - "cool_animals"."weight" as "weight", - "cool_animals"."is_agressive" as "is_agressive" - from - "public".cool_animals as cool_animals - where - "cool_animals"."is_agressive" = false; - -Exporting view DDL to a file -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: postgres - - COPY (SELECT GET_VIEW_DDL('angry_animals')) TO '/home/rhendricks/angry_animals.sql'; - -User defined functions -------------------------- - -See :ref:`get_function_ddl` for more information. - -.. rubric:: Examples - -Listing all UDFs -^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - master=> SELECT * FROM sqream_catalog.user_defined_functions; - database_name | function_id | function_name - --------------+-------------+-------------- - master | 1 | my_distance - -Getting the DDL for a function -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: psql - - master=> SELECT GET_FUNCTION_DDL('my_distance'); - create function "my_distance" (x1 float, - y1 float, - x2 float, - y2 float) returns float as - $$ - import math - if y1 < x1: - return 0.0 - else: - return math.sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2) - $$ - language python volatile; - -Exporting function DDL to a file ------------------------------------- - -.. code-block:: postgres - - COPY (SELECT GET_FUNCTION_DDL('my_distance')) TO '/home/rhendricks/my_distance.sql'; - -Saved queries ------------------ - -See :ref:`list_saved_queries`, :ref:`show_saved_query` for more information. 
- -Dump entire database DDLs -================================== - -Dumping the database DDL includes tables and views, but not UDFs and saved queries. - -See :ref:`dump_database_ddl` for more information. - -.. rubric:: Examples - -Exporting database DDL to a client --------------------------------------- - -.. code-block:: psql - - farm=> SELECT DUMP_DATABASE_DDL(); - create table "public"."cool_animals" ( - "id" int not null, - "name" varchar(30) not null, - "weight" double null, - "is_agressive" bool default false not null - ) - ; - - create view "public".angry_animals as - select - "cool_animals"."id" as "id", - "cool_animals"."name" as "name", - "cool_animals"."weight" as "weight", - "cool_animals"."is_agressive" as "is_agressive" - from - "public".cool_animals as cool_animals - where - "cool_animals"."is_agressive" = false; - -Exporting database DDL to a file --------------------------------------- - -.. code-block:: postgres - - COPY (SELECT DUMP_DATABASE_DDL()) TO '/home/rhendricks/database.ddl'; - - - -.. note:: To export data in tables, see :ref:`copy_to`. diff --git a/reference/.DS_Store b/reference/.DS_Store new file mode 100644 index 000000000..bed65e952 Binary files /dev/null and b/reference/.DS_Store differ diff --git a/reference/catalog_reference.rst b/reference/catalog_reference.rst index 8cfa8e832..14b1f2a2d 100644 --- a/reference/catalog_reference.rst +++ b/reference/catalog_reference.rst @@ -1,606 +1,88 @@ .. _catalog_reference: -************************************* -Catalog reference -************************************* +*********************** +Catalog Reference +*********************** -SQream DB contains a schema called ``sqream_catalog`` that contains information about your database's objects - tables, columns, views, permissions, and more. +The SQreamDB database uses a schema called ``sqream_catalog`` that contains information about database objects such as tables, columns, views, and permissions. Some additional catalog tables are used primarily for internal analysis and may differ across SQreamDB versions. -Some additional catalog tables are used primarily for internal introspection, which could change across SQream DB versions. +What Information Does the Schema Contain? +========================================== -.. contents:: In this topic: - :local: +The schema contains database management tables with information about the structure and management of database elements, such as tables, schemas, queries, and permissions, as well as tables that describe the physical storage and organization of data: extents, chunk columns, chunks, and delete predicates. + +How to Get Table Information? +============================= + +To get the information stored in a catalog table, use the following syntax, shown here for the ``parameters`` table: + +.. code-block:: sql -Types of data exposed by ``sqream_catalog`` -============================================== + SELECT * FROM sqream_catalog.parameters; + +To get the table DDL, use the following syntax, shown here again for the ``parameters`` table: + +.. code-block:: sql + + SELECT get_ddl('sqream_catalog.parameters'); + +Database Management Tables +--------------------------- -.. list-table:: Database objects +..
list-table:: + :widths: auto + :header-rows: 1 + + * - Database Object + - Table + * - :ref:`Clustering Keys` + - ``clustering_keys`` + * - :ref:`Columns` + - ``columns``, ``external_table_columns`` + * - :ref:`Databases` + - ``databases`` - * - Permissions - - ``table_permissions``, ``database_permissions``, ``schema_permissions``, ``permission_types``, ``udf_permissions`` - * - Roles - - ``roles``, ``roles_memeberships`` - * - Schemas + * - :ref:`Parameters` + - ``parameters`` + * - :ref:`Permissions` + - ``table_permissions``, ``database_permissions``, ``schema_permissions``, ``permission_types``, ``udf_permissions``, ``table_default_permissions``, ``schema_default_permissions`` + * - :ref:`Queries` + - ``savedqueries`` + * - :ref:`Roles` + - ``roles``, ``roles_memberships`` + * - :ref:`Schemas` + - ``schemas`` - * - Sequences - - ``identity_key`` - * - Tables + * - :ref:`Tables` + - ``tables``, ``external_tables`` + * - :ref:`Views` + - ``views`` - * - UDFs + * - :ref:`User Defined Functions` + - ``user_defined_functions`` -The catalog contains a few more tables which contain storage details for internal use +Data Storage and Organization Tables +--------------------------------------- -.. list-table:: Storage objects +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Database Object + - Table - * - Extents - - ``extents`` - * - Chunks - - ``chunks`` - * - Delete predicates - - ``delete_predicates`` - -Tables in the catalog -======================== - -clustering_keys ----------------------- - -Explicit clustering keys for tables. - -When more than one clustering key is defined, each key is listed in a separate row. - - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the table - * - ``table_id`` - - ID of the table containing the column - * - ``schema_name`` - - Name of the schema containing the table - * - ``table_name`` - - Name of the table containing the column - * - ``clustering_key`` - - Name of the column that is a clustering key for this table - -columns -------- - -Column objects for standard tables - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the table - * - ``schema_name`` - - Name of the schema containing the table - * - ``table_id`` - - ID of the table containing the column - * - ``table_name`` - - Name of the table containing the column - * - ``column_id`` - - Ordinal of the column in the table (begins at 0) - * - ``column_name`` - - Name of the column - * - ``type_name`` - - :ref:`Data type ` of the column - * - ``column_size`` - - The maximum length in bytes. - * - ``has_default`` - - ``NULL`` if the column has no default value. ``1`` if the default is a fixed value, or ``2`` if the default is an :ref:`identity` - * - ``default_value`` - - :ref:`Default value` for the column - * - ``compression_strategy`` - - User-overridden compression strategy - * - ``created`` - - Timestamp when the column was created - * - ``altered`` - - Timestamp when the column was last altered - - -.. _external_tables_table: - -external_tables ---------------- - -``external_tables`` identifies external tables in the database. - -For ``TABLES`` see :ref:`tables ` -..
list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the table - * - ``table_id`` - - Database-unique ID for the table - * - ``schema_name`` - - Name of the schema containing the table - * - ``table_name`` - - Name of the table - * - ``format`` - - - Identifies the foreign data wrapper used. - - ``0`` for csv_fdw, ``1`` for parquet_fdw, ``2`` for orc_fdw. - - * - ``created`` - - Identifies the clause used to create the table - -external_table_columns ------------------------- - -Column objects for external tables - -databases ------------ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_Id`` - - Unique ID of the database - * - ``database_name`` - - Name of the database - * - ``default_disk_chunk_size`` - - Internal use - * - ``default_process_chunk_size`` - - Internal use - * - ``rechunk_size`` - - Internal use - * - ``storage_subchunk_size`` - - Internal use - * - ``compression_chunk_size_threshold`` - - Internal use - -database_permissions ----------------------- - -``database_permissions`` identifies all permissions granted to databases. - -There is one row for each combination of role (grantee) and permission granted to a database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database the permission applies to - * - ``role_id`` - - ID of the role granted permissions (grantee) - * - ``permission_type`` - - Identifies the permission type - - -identity_key --------------- - - -permission_types ------------------- - -``permission_types`` Identifies the permission names that exist in the database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``permission_type_id`` - - ID of the permission type - * - ``name`` - - Name of the permission type - -roles ------- - -``roles`` identifies the roles in the database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``role_id`` - - Database-unique ID of the role - * - ``name`` - - Name of the role - * - ``superuser`` - - Identifies if this role is a superuser. ``1`` for superuser or ``0`` otherwise. - * - ``login`` - - Identifies if this role can be used to log in to SQream DB. ``1`` for yes or ``0`` otherwise. - * - ``has_password`` - - Identifies if this role has a password. ``1`` for yes or ``0`` otherwise. - * - ``can_create_function`` - - Identifies if this role can create UDFs. ``1`` for yes, ``0`` otherwise. - -roles_memberships -------------------- - -``roles_memberships`` identifies the role memberships in the database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``role_id`` - - Role ID - * - ``member_role_id`` - - ID of the parent role from which this role will inherit - * - ``inherit`` - - Identifies if permissions are inherited. ``1`` for yes or ``0`` otherwise. - -savedqueries ----------------- - -``savedqueries`` identifies the :ref:`saved_queries` in the database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``name`` - - Saved query name - * - ``num_parameters`` - - Number of parameters to be replaced at run-time - -schemas ----------- - -``schemas`` identifies all the database's schemas. - -.. 
list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``schema_id`` - - Unique ID of the schema - * - ``schema_name`` - - Name of the schema - * - ``schema_owner`` - - Name of the role who owns this schema - * - ``rechunker_ignore`` - - Internal use - - -schema_permissions --------------------- - -``schema_permissions`` identifies all permissions granted to schemas. - -There is one row for each combination of role (grantee) and permission granted to a schema. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the schema - * - ``schema_id`` - - ID of the schema the permission applies to - * - ``role_id`` - - ID of the role granted permissions (grantee) - * - ``permission_type`` - - Identifies the permission type - - -.. _tables_table: - -tables ----------- - -``tables`` identifies proper SQream tables in the database. - -For ``EXTERNAL TABLES`` see :ref:`external_tables ` - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the table - * - ``table_id`` - - Database-unique ID for the table - * - ``schema_name`` - - Name of the schema containing the table - * - ``table_name`` - - Name of the table - * - ``row_count_valid`` - - Identifies if the ``row_count`` can be used - * - ``row_count`` - - Number of rows in the table - * - ``rechunker_ignore`` - - Internal use - - -table_permissions ------------------- - -``table_permissions`` identifies all permissions granted to tables. - -There is one row for each combination of role (grantee) and permission granted to a table. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the table - * - ``table_id`` - - ID of the table the permission applies to - * - ``role_id`` - - ID of the role granted permissions (grantee) - * - ``permission_type`` - - Identifies the permission type - - -udf_permissions ------------------- - -user_defined_functions -------------------------- - -``user_defined_functions`` identifies UDFs in the database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the database containing the view - * - ``function_id`` - - Database-unique ID for the UDF - * - ``function_name`` - - Name of the UDF - -views -------- - -``views`` identifies views in the database. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``view_id`` - - Database-unique ID for the view - * - ``view_schema`` - - Name of the schema containing the view - * - ``view_name`` - - Name of the view - * - ``view_data`` - - Internal use - * - ``view_query_text`` - - Identifies the ``AS`` clause used to create the view - - -Additional tables -====================== - -There are additional tables in the catalog that can be used for performance monitoring and inspection. - -The definition for these tables is provided below could change across SQream DB versions. - -extents ----------- - -``extents`` identifies storage extents. - -Each storage extents can contain several chunks. - -.. note:: This is an internal table designed for low-level performance troubleshooting. - -.. 
list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the databse containing the extent - * - ``table_id`` - - ID of the table containing the extent - * - ``column_id`` - - ID of the column containing the extent - * - ``extent_id`` - - ID for the extent - * - ``size`` - - Extent size in megabytes - * - ``path`` - - Full path to the extent on the file system - -chunk_columns -------------------- - -``chunk_columns`` lists chunk information by column. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the databse containing the extent - * - ``table_id`` - - ID of the table containing the extent - * - ``column_id`` - - ID of the column containing the extent - * - ``chunk_id`` - - ID for the chunk - * - ``extent_id`` - - ID for the extent - * - ``compressed_size`` - - Actual chunk size in bytes - * - ``uncompressed_size`` - - Uncompressed chunk size in bytes - * - ``compression_type`` - - Actual compression scheme for this chunk - * - ``long_min`` - - Minimum numeric value in this chunk (if exists) - * - ``long_max`` - - Maximum numeric value in this chunk (if exists) - * - ``string_min`` - - Minimum text value in this chunk (if exists) - * - ``string_max`` - - Maximum text value in this chunk (if exists) - * - ``offset_in_file`` - - Internal use - -.. note:: This is an internal table designed for low-level performance troubleshooting. - -chunks -------- - -``chunks`` identifies storage chunks. - -.. note:: This is an internal table designed for low-level performance troubleshooting. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the databse containing the chunk - * - ``table_id`` - - ID of the table containing the chunk - * - ``column_id`` - - ID of the column containing the chunk - * - ``rows_num`` - - Amount of rows contained in the chunk - * - ``deletion_status`` - - When data is deleted from the table, it is first deleted logically. This value identifies how much data is deleted from the chunk. ``0`` for no data, ``1`` for some data, ``2`` to specify the entire chunk is deleted. - -delete_predicates -------------------- - -``delete_predicates`` identifies the existing delete predicates that have not been cleaned up. - -Each :ref:`DELETE ` command may result in several entries in this table. - -.. note:: This is an internal table designed for low-level performance troubleshooting. - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Column - - Description - * - ``database_name`` - - Name of the databse containing the predicate - * - ``table_id`` - - ID of the table containing the predicate - * - ``max_chunk_id`` - - Internal use. Placeholder marker for the highest ``chunk_id`` logged during the DELETE operation. - * - ``delete_predicate`` - - Identifies the DELETE predicate - - -Examples -=========== - -List all tables in the database ----------------------------------- - -.. 
code-block:: psql - - master=> SELECT * FROM sqream_catalog.tables; - database_name | table_id | schema_name | table_name | row_count_valid | row_count | rechunker_ignore - --------------+----------+-------------+----------------+-----------------+-----------+----------------- - master | 1 | public | nba | true | 457 | 0 - master | 12 | public | cool_dates | true | 5 | 0 - master | 13 | public | cool_numbers | true | 9 | 0 - master | 27 | public | jabberwocky | true | 8 | 0 - -List all schemas in the database ------------------------------------- - -.. code-block:: psql - - master=> SELECT * FROM sqream_catalog.schemas; - schema_id | schema_name | schema_owner | rechunker_ignore - ----------+---------------+--------------+----------------- - 0 | public | sqream | false - 1 | secret_schema | mjordan | false - - -List columns and their types for a specific table --------------------------------------------------- - -.. code-block:: postgres - - SELECT column_name, type_name - FROM sqream_catalog.columns - WHERE table_name='cool_animals'; - -List delete predicates ------------------------- - -.. code-block:: postgres - - SELECT t.table_name, d.* FROM - sqream_catalog.delete_predicates AS d - INNER JOIN sqream_catalog.tables AS t - ON d.table_id=t.table_id; - - -List :ref:`saved_queries` ----------------------------- - -.. code-block:: postgres - - SELECT * FROM sqream_catalog.savedqueries; + * - :ref:`Extents` + - ``extents`` + * - :ref:`Chunk columns` + - ``chunk_columns`` + * - :ref:`Chunks` + - ``chunks`` + * - :ref:`Delete predicates` + - ``delete_predicates``. For more information, see :ref:`Deleting Data` + +.. toctree:: + :maxdepth: 1 + :glob: + :hidden: + + + catalog_reference_catalog_tables + catalog_reference_additonal_tables + catalog_reference_examples \ No newline at end of file diff --git a/reference/catalog_reference_additonal_tables.rst b/reference/catalog_reference_additonal_tables.rst new file mode 100644 index 000000000..37979b216 --- /dev/null +++ b/reference/catalog_reference_additonal_tables.rst @@ -0,0 +1,129 @@ +.. _catalog_reference_additonal_tables: + +************************************* +Additional Tables +************************************* + +The Reference Catalog includes additional tables that can be used for performance monitoring and inspection. The definitions of the tables described on this page may change across SQream versions. + +.. contents:: + :local: + :depth: 1 + +.. _extents: + +Extents +---------- +The ``extents`` storage object identifies storage extents; each storage extent can contain several chunks. + +.. note:: This is an internal table designed for low-level performance troubleshooting. + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the extent. + * - ``table_id`` + - Shows the ID of the table containing the extent. + * - ``column_id`` + - Shows the ID of the column containing the extent. + * - ``extent_id`` + - Shows the ID for the extent. + * - ``size`` + - Shows the extent size in megabytes. + * - ``path`` + - Shows the full path to the extent on the file system. + +.. _chunk_columns: + +Chunk Columns +------------------- +The ``chunk_columns`` storage object lists chunk information by column. + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the chunk.
+ * - ``table_id`` + - Shows the ID of the table containing the chunk. + * - ``column_id`` + - Shows the ID of the column containing the chunk. + * - ``chunk_id`` + - Shows the chunk ID. + * - ``extent_id`` + - Shows the extent ID. + * - ``compressed_size`` + - Shows the compressed chunk size in bytes. + * - ``uncompressed_size`` + - Shows the uncompressed chunk size in bytes. + * - ``compression_type`` + - Shows the chunk's actual compression scheme. + * - ``long_min`` + - Shows the minimum numeric value in the chunk (if one exists). + * - ``long_max`` + - Shows the maximum numeric value in the chunk (if one exists). + * - ``string_min`` + - Shows the minimum text value in the chunk (if one exists). + * - ``string_max`` + - Shows the maximum text value in the chunk (if one exists). + * - ``offset_in_file`` + - Reserved for internal use. + +.. note:: This is an internal table designed for low-level performance troubleshooting. + +.. _chunks: + +Chunks +------- +The ``chunks`` storage object identifies storage chunks. + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the chunk. + * - ``table_id`` + - Shows the ID of the table containing the chunk. + * - ``column_id`` + - Shows the ID of the column containing the chunk. + * - ``rows_num`` + - Shows the number of rows in the chunk. + * - ``deletion_status`` + - When data is deleted from the table, it is first deleted logically. This value identifies how much of the chunk's data has been deleted: ``0`` for no data, ``1`` for some data, and ``2`` for the entire chunk. + +.. note:: This is an internal table designed for low-level performance troubleshooting. + +.. _delete_predicates: + +Delete Predicates +------------------- +The ``delete_predicates`` storage object identifies the existing delete predicates that have not been cleaned up. + +Each :ref:`DELETE ` command may result in several entries in this table. + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the predicate. + * - ``table_id`` + - Shows the ID of the table containing the predicate. + * - ``max_chunk_id`` + - Reserved for internal use, this is a placeholder marker for the highest ``chunk_id`` logged during the ``DELETE`` operation. + * - ``delete_predicate`` + - Identifies the ``DELETE`` predicate. + +.. note:: This is an internal table designed for low-level performance troubleshooting. \ No newline at end of file diff --git a/reference/catalog_reference_catalog_tables.rst b/reference/catalog_reference_catalog_tables.rst new file mode 100644 index 000000000..ae631d7d7 --- /dev/null +++ b/reference/catalog_reference_catalog_tables.rst @@ -0,0 +1,492 @@ +.. _catalog_reference_catalog_tables: + +************** +Catalog Tables +************** + +The ``sqream_catalog`` schema includes the following tables: + +.. contents:: + :local: + :depth: 1 + +.. _clustering_keys: + +Clustering Keys +---------------- + +The ``clustering_keys`` data object is used for explicit clustering keys for tables. If you define more than one clustering key, each key is listed in a separate row, as described in the following table: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the table. + * - ``table_id`` + - Shows the ID of the table containing the column.
+ * - ``schema_name`` + - Shows the name of the schema containing the table. + * - ``table_name`` + - Shows the name of the table containing the column. + * - ``clustering_key`` + - Shows the name of the column used as a clustering key for this table. + +.. _columns: + +Columns +---------------- + +The **Columns** database object shows the following tables: + +.. contents:: + :local: + :depth: 1 + +Columns +*********** + +The ``columns`` data object is used with standard tables and is described in the following table: + +.. list-table:: + :widths: 20 150 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the table. + * - ``schema_name`` + - Shows the name of the schema containing the table. + * - ``table_id`` + - Shows the ID of the table containing the column. + * - ``table_name`` + - Shows the name of the table containing the column. + * - ``column_id`` + - Shows the ordinal number of the column in the table (begins at **0**). + * - ``column_name`` + - Shows the column's name. + * - ``type_name`` + - Shows the column's data type. For more information, see :ref:`Supported Data Types `. + * - ``column_size`` + - Shows the maximum length in bytes. + * - ``has_default`` + - Shows ``NULL`` if the column has no default value, ``1`` if the default is a fixed value, or ``2`` if the default is an identity. For more information, see :ref:`identity`. + * - ``default_value`` + - Shows the column's default value. For more information, see :ref:`Default Value Constraints`. + * - ``compression_strategy`` + - Shows the user-overridden compression strategy. + * - ``created`` + - Shows the timestamp of when the column was created. + * - ``altered`` + - Shows the timestamp of when the column was last altered. + +External Table Columns +************************ + +The ``external_table_columns`` data object lists the columns of foreign tables. + +For more information on foreign tables, see :ref:`CREATE FOREIGN TABLE`. + +.. _databases: + +Databases +---------------- + +The ``databases`` data object is used for displaying database information, and is described in the following table: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_Id`` + - Shows the database's unique ID. + * - ``database_name`` + - Shows the database's name. + * - ``default_disk_chunk_size`` + - Reserved for internal use. + * - ``default_process_chunk_size`` + - Reserved for internal use. + * - ``rechunk_size`` + - Reserved for internal use. + * - ``storage_subchunk_size`` + - Reserved for internal use. + * - ``compression_chunk_size_threshold`` + - Reserved for internal use. + +.. _parameters: + +Parameters +------------- + +The ``parameters`` data object is used for displaying all flags, including their scope (default, cluster, or session), description, default value, and actual value. + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``flag_name`` + - Shows the flag name. + * - ``value`` + - Shows the flag's currently configured value. + * - ``default_value`` + - Shows the flag's default value. + * - ``scope`` + - Shows whether the flag configuration is session-based or cluster-based. + * - ``description`` + - Describes the purpose of the flag. + + + +.. _permissions: + +Permissions +---------------- + +The ``permissions`` data object is used for displaying permission information, such as roles (also known as **grantees**), and is described in the following tables: + +..
contents:: + :local: + :depth: 1 + +Permission Types +***************** + +The ``permission_types`` object identifies the permission names that exist in the database. + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``permission_type_id`` + - Shows the permission type's ID. + * - ``name`` + - Shows the name of the permission type. + +Default Permissions +******************** + +This section describes how to check the following default permissions: + +.. contents:: + :local: + :depth: 1 + +Default Table Permissions +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``sqream_catalog.table_default_permissions`` catalog table contains the columns described below: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the database that the default permission rule applies to. + * - ``schema_id`` + - Shows the schema that the rule applies to, or ``NULL`` if the ``ALTER`` statement does not specify a schema. + * - ``modifier_role_id`` + - Shows the role to apply the rule to. + * - ``getter_role_id`` + - Shows the role that the permission is granted to. + * - ``permission_type`` + - Shows the type of permission granted. + +Default Schema Permissions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``sqream_catalog.schema_default_permissions`` catalog table contains the columns described below: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the database that the default permission rule applies to. + * - ``modifier_role_id`` + - Shows the role to apply the rule to. + * - ``getter_role_id`` + - Shows the role that the permission is granted to. + * - ``permission_type`` + - Shows the type of permission granted. + * - ``getter_role_type`` + - Shows the type of role that is granted permissions. + +For an example of using the ``sqream_catalog.table_default_permissions`` catalog table, see :ref:`Granting Default Table Permissions `. + +Table Permissions +****************** + +The ``table_permissions`` data object identifies all permissions granted to tables. Each role-permission combination is displayed in a separate row. + +The following table describes the ``table_permissions`` data object: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the table. + * - ``table_id`` + - Shows the ID of the table the permission applies to. + * - ``role_id`` + - Shows the ID of the role granted permissions. + * - ``permission_type`` + - Identifies the permission type. + +Database Permissions +********************* + +The ``database_permissions`` data object identifies all permissions granted to databases. Each role-permission combination is displayed in a separate row. + +The following table describes the ``database_permissions`` data object: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database the permission applies to. + * - ``role_id`` + - Shows the ID of the role granted permissions. + * - ``permission_type`` + - Identifies the permission type. + +Schema Permissions +******************** + +The ``schema_permissions`` data object identifies all permissions granted to schemas. Each role-permission combination is displayed in a separate row. + +The following table describes the ``schema_permissions`` data object: + +..
list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the schema. + * - ``schema_id`` + - Shows the ID of the schema the permission applies to. + * - ``role_id`` + - Shows the ID of the role granted permissions. + * - ``permission_type`` + - Identifies the permission type. + + +.. _queries: + +Queries +---------------- + +The ``savedqueries`` data object identifies the saved queries in the database, as shown in the following table: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``name`` + - Shows the saved query name. + * - ``num_parameters`` + - Shows the number of parameters to be replaced at run-time. + +For more information, see :ref:`Saved Queries`. + +.. _roles: + +Roles +---------------- + +The ``roles`` data object is used for displaying role information, and is described in the following tables: + +.. contents:: + :local: + :depth: 1 + +Roles +*********** + +The ``roles`` data object identifies the roles in the database, as shown in the following table: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``role_id`` + - Shows the role's database-unique ID. + * - ``name`` + - Shows the role's name. + * - ``superuser`` + - Identifies whether the role is a superuser (``1`` - superuser, ``0`` - regular user). + * - ``login`` + - Identifies whether the role can be used to log in to SQreamDB (``1`` - yes, ``0`` - no). + * - ``has_password`` + - Identifies whether the role has a password (``1`` - yes, ``0`` - no). + +Role Memberships +******************* + +The ``roles_memberships`` data object identifies the role memberships in the database, as shown below: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``role_id`` + - Shows the role ID. + * - ``member_role_id`` + - Shows the ID of the parent role that this role inherits from. + * - ``inherit`` + - Identifies whether permissions are inherited (``1`` - yes, ``0`` - no). + * - ``admin`` + - Identifies whether the role is an admin (``1`` - yes, ``0`` - no). + +.. _schemas: + +Schemas +---------------- + +The ``schemas`` data object identifies all the database's schemas, as shown below: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``schema_id`` + - Shows the schema's unique ID. + * - ``schema_name`` + - Shows the schema's name. + * - ``schema_owner`` + - Shows the name of the role that owns the schema. + * - ``rechunker_ignore`` + - Reserved for internal use. + +.. _tables: + +Tables +---------------- + +The ``tables`` data object is used for displaying table information, and is described in the following tables: + +.. contents:: + :local: + :depth: 1 + +Tables +*********** + +The ``tables`` data object identifies standard (non-foreign) SQreamDB tables in the database, as shown in the following table: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the table. + * - ``table_id`` + - Shows the table's database-unique ID. + * - ``schema_name`` + - Shows the name of the schema containing the table. + * - ``table_name`` + - Shows the name of the table. + * - ``row_count_valid`` + - Identifies whether the ``row_count`` can be used. + * - ``row_count`` + - Shows the number of rows in the table. + * - ``rechunker_ignore`` + - Reserved for internal use. 
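+ +For example, the following query (a sketch built only from the columns documented above) lists each table together with its cached row count, skipping counts that cannot be used: + +.. code-block:: postgres + + -- row_count is reliable only where row_count_valid is true + SELECT table_name, row_count + FROM sqream_catalog.tables + WHERE row_count_valid = true;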
+ +Foreign Tables +**************** + +The ``external_tables`` data object identifies foreign tables in the database, as shown below: + +.. list-table:: + :widths: 20 200 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the table. + * - ``table_id`` + - Shows the table's database-unique ID. + * - ``schema_name`` + - Shows the name of the schema containing the table. + * - ``table_name`` + - Shows the name of the table. + * - ``format`` + - Identifies the foreign data wrapper used. ``0`` for ``csv_fdw``, ``1`` for ``parquet_fdw``, ``2`` for ``orc_fdw``. + * - ``created`` + - Identifies the clause used to create the table. + +.. _views: + +Views +---------------- + +The ``views`` data object is used for displaying views in the database, as shown below: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``view_id`` + - Shows the view's database-unique ID. + * - ``view_schema`` + - Shows the name of the schema containing the view. + * - ``view_name`` + - Shows the name of the view. + * - ``view_data`` + - Reserved for internal use. + * - ``view_query_text`` + - Identifies the ``AS`` clause used to create the view. + +.. _udfs: + +User Defined Functions +----------------------- + +The ``udf`` data object is used for displaying UDFs in the database, as shown below: + +.. list-table:: + :widths: 20 180 + :header-rows: 1 + + * - Column + - Description + * - ``database_name`` + - Shows the name of the database containing the UDF. + * - ``function_id`` + - Shows the UDF's database-unique ID. + * - ``function_name`` + - Shows the name of the UDF. diff --git a/reference/catalog_reference_examples.rst b/reference/catalog_reference_examples.rst new file mode 100644 index 000000000..f43f1b3fd --- /dev/null +++ b/reference/catalog_reference_examples.rst @@ -0,0 +1,62 @@ +.. _catalog_reference_examples: + +******** +Examples +******** + +.. contents:: + :local: + :depth: 1 + +Listing All Tables in a Database +-------------------------------- + +.. code-block:: psql + + master=> SELECT * FROM sqream_catalog.tables; + database_name | table_id | schema_name | table_name | row_count_valid | row_count | rechunker_ignore + --------------+----------+-------------+----------------+-----------------+-----------+----------------- + master | 1 | public | nba | true | 457 | 0 + master | 12 | public | cool_dates | true | 5 | 0 + master | 13 | public | cool_numbers | true | 9 | 0 + master | 27 | public | jabberwocky | true | 8 | 0 + +Listing All Schemas in a Database +--------------------------------- + +.. code-block:: psql + + master=> SELECT * FROM sqream_catalog.schemas; + schema_id | schema_name | rechunker_ignore + ----------+---------------+----------------- + 0 | public | false + 1 | secret_schema | false + + +Listing Columns and Their Types for a Specific Table +---------------------------------------------------- + +.. code-block:: postgres + + SELECT column_name, type_name + FROM sqream_catalog.columns + WHERE table_name='cool_animals'; + +Listing Delete Predicates +------------------------- + +.. code-block:: postgres + + SELECT t.table_name, d.* FROM + sqream_catalog.delete_predicates AS d + INNER JOIN sqream_catalog.tables AS t + ON d.table_id=t.table_id; + + +Listing Saved Queries +--------------------- + +.. 
code-block:: postgres + + SELECT * FROM sqream_catalog.savedqueries; + diff --git a/reference/catalog_reference_overview.rst b/reference/catalog_reference_overview.rst new file mode 100644 index 000000000..1df2a9af7 --- /dev/null +++ b/reference/catalog_reference_overview.rst @@ -0,0 +1,15 @@ +.. _catalog_reference_overview: + +********* +Overview +********* + +The SQreamDB database uses a schema called ``sqream_catalog`` that contains information about database objects such as tables, columns, views, and permissions. Some additional catalog tables are used primarily for internal analysis and may differ across SQreamDB versions. + +:ref:`catalog_reference_schema_information` + +:ref:`catalog_reference_catalog_tables` + +:ref:`catalog_reference_additonal_tables` + +:ref:`catalog_reference_examples` diff --git a/reference/catalog_reference_schema_information.rst b/reference/catalog_reference_schema_information.rst new file mode 100644 index 000000000..c706f3c07 --- /dev/null +++ b/reference/catalog_reference_schema_information.rst @@ -0,0 +1,55 @@ +.. _catalog_reference_schema_information: + +***************************************** +What Information Does the Schema Contain? +***************************************** + +The schema contains two groups of tables: database management tables, which describe the structure and management of database elements such as tables, schemas, queries, and permissions; and data storage and organization tables, which describe the physical storage and organization of data in extents, chunk columns, chunks, and delete predicates. + +Database Management Tables +--------------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Database Object + - Table + * - :ref:`Clustering Keys` + - ``clustering_keys`` + * - :ref:`Columns` + - ``columns``, ``external_table_columns`` + * - :ref:`Databases` + - ``databases`` + * - :ref:`Permissions` + - ``table_permissions``, ``database_permissions``, ``schema_permissions``, ``permission_types``, ``udf_permissions``, ``sqream_catalog.table_default_permissions`` + * - :ref:`Queries` + - ``savedqueries`` + * - :ref:`Roles` + - ``roles``, ``roles_memberships`` + * - :ref:`Schemas` + - ``schemas`` + * - :ref:`Tables` + - ``tables``, ``external_tables`` + * - :ref:`Views` + - ``views`` + * - :ref:`User Defined Functions` + - ``user_defined_functions`` + +Data Storage and Organization Tables +--------------------------------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Database Object + - Table + * - Extents + - ``extents`` + * - Chunk columns + - ``chunks_columns`` + * - Chunks + - ``chunks`` + * - Delete predicates + - ``delete_predicates``. For more information, see :ref:`Deleting Data`. \ No newline at end of file diff --git a/reference/cli/index.rst b/reference/cli/index.rst index d892ffea6..1000154dd 100644 --- a/reference/cli/index.rst +++ b/reference/cli/index.rst @@ -41,16 +41,7 @@ This topic contains the reference for these programs, as well as flags and confi * - :ref:`upgrade_storage ` - Upgrade metadata schemas when upgrading between major versions -.. list-table:: Docker utilities - :widths: auto - :header-rows: 1 - - * - Command - - Usage - * - :ref:`sqream_console ` - - Dockerized convenience wrapper for operations - * - :ref:`sqream_installer ` - - Dockerized installer + .. 
toctree:: :maxdepth: 1 @@ -58,8 +49,8 @@ This topic contains the reference for these programs, as well as flags and confi metadata_server sqreamd + multi_platform_cli sqream_console - sqream_installer server_picker sqream_storage sqream sql diff --git a/reference/cli/metadata_server.rst b/reference/cli/metadata_server.rst index 716237c11..89bf3cae7 100644 --- a/reference/cli/metadata_server.rst +++ b/reference/cli/metadata_server.rst @@ -1,46 +1,57 @@ .. _metadata_server_cli_reference: -************************* +*************** metadata_server -************************* +*************** SQream DB's cluster manager/coordinator is called ``metadata_server``. In general, you should not need to run ``metadata_server`` manually, but it is sometimes useful for testing. -This page serves as a reference for the options and parameters. - -Positional command line arguments -=================================== - -.. code-block:: console - - $ metadata_server [ [ ] ] +Command Line Arguments +====================== .. list-table:: - :widths: auto + :widths: 2 3 5 :header-rows: 1 * - Argument - Default - Description - * - Logging path - - Current directory - - Path to store metadata logs into - * - Listen port + * - ``--config`` + - ``~/.sqream/metadata_server_config.json`` + - The configuration file to use + * - ``--port`` - ``3105`` - - TCP listen port. If used, log path must be specified beforehand. + - The metadata server listening port + * - ``--log_path`` + - ``./metadata_server_logs`` + - The path for the ``metadata_server`` log output, which records the metadata server's activities and events. + * - ``--log4_config`` + - None + - Specifies the location of the configuration file for the ``Log4cxx`` logging library. + * - ``--num_deleters`` + - 1 + - Specifies the number of threads used by the file reaper. + * - ``--metadata_path`` + - ``<...sqreamd/leveldb>`` + - Specifies the path to the directory where metadata files are stored. + * - ``--help`` + - None + - Displays a help message. + + Starting metadata server -============================ +======================== Starting temporarily ------------------------------ +-------------------- .. code-block:: console - $ nohup metadata_server & - $ MS_PID=$! + nohup metadata_server --config ~/.sqream/metadata_server_config.json & + MS_PID=$! Using ``nohup`` and ``&`` sends metadata server to run in the background. @@ -49,28 +60,28 @@ Using ``nohup`` and ``&`` sends metadata server to run in the background. * The default listening port is 3105 Starting temporarily with non-default port ------------------------------------------------- +------------------------------------------ To use a non-default port, set it with the ``--port`` flag; a custom logging path may be set with ``--log_path``. .. code-block:: console - $ nohup metadata_server /home/rhendricks/metadata_logs 9241 & - $ MS_PID=$! + nohup metadata_server --log_path=/home/rhendricks/metadata_logs --port=9241 & + MS_PID=$! Using ``nohup`` and ``&`` sends metadata server to run in the background. .. note:: * Logs are saved to the ``/home/rhendricks/metadata_logs`` directory. * The listening port is 9241 - + Stopping metadata server ----------------------------- +------------------------ To stop metadata server: .. code-block:: console - $ kill -9 $MS_PID + kill -9 $MS_PID .. tip:: It is safe to stop any SQream DB component at any time using ``kill``. 
No partial data or data corruption should occur when using this method to stop the process. diff --git a/reference/cli/multi_platform_cli.rst b/reference/cli/multi_platform_cli.rst new file mode 100644 index 000000000..69b71c893 --- /dev/null +++ b/reference/cli/multi_platform_cli.rst @@ -0,0 +1,486 @@ +.. _multi_platform_cli: + +************************* +Multi-Platform Sqream SQL +************************* + +SQreamDB comes with a built-in client for executing SQL statements either interactively or from the command-line. + +.. contents:: + :local: + :depth: 1 + +Before You Begin +================ + +Sqream SQL requires Java 17. + +Installing Sqream SQL +===================== + +If you have a SQreamDB installation on your server, ``sqream sql`` can be found in the ``bin`` directory of your SQreamDB installation, under the name ``sqream``. + + +To run ``sqream sql`` on any other Linux host: + +#. Download the ``sqream sql`` tarball package from the :ref:`client_drivers` page. +#. Untar the package: ``tar xf sqream-sql-v2020.1.1_stable.x86_64.tar.gz`` +#. Start the client: + + .. code-block:: psql + + $ cd sqream-sql-v2020.1.1_stable.x86_64 + $ ./sqream sql --port=5000 --username=jdoe --databasename=master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> _ + + +Using Sqream SQL +================ + +By default, sqream sql runs in interactive mode. You can issue commands or SQL statements. + +Running Commands Interactively (SQL shell) +------------------------------------------ + +When starting sqream sql, after entering your password, you are presented with the SQL shell. + +To exit the shell, type ``\q`` or :kbd:`Ctrl-d`. + +.. code-block:: psql + + $ sqream sql --port=5000 --username=jdoe --databasename=master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> _ + +The database name shown means you are now ready to run statements and queries. + +Statements and queries are standard SQL, followed by a semicolon (``;``). Statement results are usually formatted as a valid CSV, +followed by the number of rows and the elapsed time for that statement. + +.. code-block:: psql + + master=> SELECT TOP 5 * FROM nba; + Avery Bradley ,Boston Celtics ,0,PG,25,6-2 ,180,Texas ,7730337 + Jae Crowder ,Boston Celtics ,99,SF,25,6-6 ,235,Marquette ,6796117 + John Holland ,Boston Celtics ,30,SG,27,6-5 ,205,Boston University ,\N + R.J. Hunter ,Boston Celtics ,28,SG,22,6-5 ,185,Georgia State ,1148640 + Jonas Jerebko ,Boston Celtics ,8,PF,29,6-10,231,\N,5000000 + 5 rows + time: 0.001185s + +.. note:: Null values are represented as \\N. + +When writing long statements and queries, it may be beneficial to use line-breaks. +The prompt for a multi-line statement will change from ``=>`` to ``.``, to alert users to the change. The statement will not execute until a semicolon is used. + + +.. code-block:: psql + :emphasize-lines: 13 + + $ sqream sql --port=5000 --username=mjordan -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> SELECT "Age", + . AVG("Salary") + . FROM NBA + . GROUP BY 1 + . ORDER BY 2 ASC + . LIMIT 5 + . ; + 38,1840041 + 19,1930440 + 23,2034746 + 21,2067379 + 36,2238119 + 5 rows + time: 0.009320s + + +Executing Batch Scripts (``-f``) +-------------------------------- + +To run an SQL script, use the ``-f <filename>`` argument. + +For example, + +.. code-block:: console + + $ sqream sql --port=5000 --username=jdoe -d master -f sql_script.sql --results-only + +.. 
tip:: Output can be saved to a file by using redirection (``>``). + +Executing Commands Immediately (``-c``) +--------------------------------------- + +To run a statement from the console, use the ``-c <statement>`` argument. + +For example, + +.. code-block:: console + + $ sqream sql --port=5000 --username=jdoe -d nba -c "SELECT TOP 5 * FROM nba" + Avery Bradley ,Boston Celtics ,0,PG,25,6-2 ,180,Texas ,7730337 + Jae Crowder ,Boston Celtics ,99,SF,25,6-6 ,235,Marquette ,6796117 + John Holland ,Boston Celtics ,30,SG,27,6-5 ,205,Boston University ,\N + R.J. Hunter ,Boston Celtics ,28,SG,22,6-5 ,185,Georgia State ,1148640 + Jonas Jerebko ,Boston Celtics ,8,PF,29,6-10,231,\N,5000000 + 5 rows + time: 0.202618s + +.. tip:: Remove the timing and row count by passing the ``--results-only`` parameter. + + +Examples +======== + +Starting a Regular Interactive Shell +------------------------------------ + +Connect to local server 127.0.0.1 on port 5000, to the default built-in database, `master`: + +.. code-block:: psql + + $ sqream sql --port=5000 --username=mjordan -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=>_ + +Connect to local server 127.0.0.1 via the built-in load balancer on port 3108, to the default built-in database, `master`: + +.. code-block:: psql + + $ sqream sql --port=3108 --clustered --username=mjordan -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=>_ + +Executing Statements in an Interactive Shell +-------------------------------------------- + +Note that all SQL commands end with a semicolon. + +Creating a new database and switching over to it without reconnecting: + +.. code-block:: psql + + $ sqream sql --port=3105 --clustered --username=oldmcd -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> create database farm; + executed + time: 0.003811s + master=> \c farm + farm=> + +.. code-block:: psql + + farm=> create table animals(id int not null, name text(30) not null, is_angry bool not null); + executed + time: 0.011940s + + farm=> insert into animals values(1,'goat',false); + executed + time: 0.000405s + + farm=> insert into animals values(4,'bull',true) ; + executed + time: 0.049338s + + farm=> select * from animals; + 1,goat ,0 + 4,bull ,1 + 2 rows + time: 0.029299s + +Executing SQL Statements from the Command Line +---------------------------------------------- + +.. code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -c "SELECT * FROM animals WHERE is_angry = true" + 4,bull ,1 + 1 row + time: 0.095941s + +.. _controlling_output: + +Controlling the Client Output +----------------------------- + +Two parameters control the display of results from the client: + +* ``--results-only`` - removes row counts and timing information +* ``--delimiter`` - changes the record delimiter + +Exporting SQL Query Results to CSV +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Using the ``--results-only`` flag removes the row counts and timing. + +.. code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -c "SELECT * FROM animals" --results-only > file.csv + $ cat file.csv + 1,goat ,0 + 2,sow ,0 + 3,chicken ,0 + 4,bull ,1 + +Changing a CSV to a TSV +^^^^^^^^^^^^^^^^^^^^^^^ + +The ``--delimiter`` parameter accepts any printable character. + +.. tip:: To insert a tab, use :kbd:`Ctrl-V` followed by :kbd:`Tab ↹` in Bash. + +.. 
code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -c "SELECT * FROM animals" --delimiter ' ' > file.tsv + $ cat file.tsv + 1 goat 0 + 2 sow 0 + 3 chicken 0 + 4 bull 1 + + +Executing a Series of Statements From a File +-------------------------------------------- + +Assuming a file containing SQL statements (separated by semicolons): + +.. code-block:: console + + $ cat some_queries.sql + CREATE TABLE calm_farm_animals + ( id INT IDENTITY(0, 1), name TEXT(30) + ); + + INSERT INTO calm_farm_animals (name) + SELECT name FROM animals WHERE is_angry = false; + +.. code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -f some_queries.sql + executed + time: 0.018289s + executed + time: 0.090697s + +Connecting Using Environment Variables +-------------------------------------- + +You can save connection parameters as environment variables: + +.. code-block:: console + + $ export SQREAM_USER=sqream; + $ export SQREAM_DATABASE=farm; + $ sqream sql --port=3105 --clustered --username=$SQREAM_USER -d $SQREAM_DATABASE + +Connecting to a Specific Queue +------------------------------ + +When using the :ref:`dynamic workload manager` - connect to ``etl`` queue instead of using the default ``sqream`` queue. + +.. code-block:: psql + + $ sqream sql --port=3105 --clustered --username=mjordan -d master --service=etl + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=>_ + + +Operations and Flag References +============================== + +Command Line Arguments +---------------------- + +**Sqream SQL** supports the following command line arguments: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Argument + - Default + - Description + * - ``-c`` or ``--command`` + - None + - Changes the mode of operation to single-command, non-interactive. Use this argument to run a statement and immediately exit. + * - ``-f`` or ``--file`` + - None + - Changes the mode of operation to multi-command, non-interactive. Use this argument to run a sequence of statements from an external file and immediately exit. + * - ``--host`` + - ``127.0.0.1`` + - Address of the SQreamDB worker. + * - ``--port`` + - ``5000`` + - Sets the connection port. + * - ``--databasename`` or ``-d`` + - None + - Specifies the database name for queries and statements in this session. + * - ``--username`` + - None + - Username to connect to the specified database. + * - ``--password`` + - None + - Specify the password using the command line argument. If not specified, the client will prompt the user for the password. + * - ``--clustered`` + - False + - When used, the client connects to the load balancer, usually on port ``3108``. If not set, the client assumes the connection is to a standalone SQreamDB worker. + * - ``--service`` + - ``sqream`` + - :ref:`Service name (queue)` that statements will file into. + * - ``--results-only`` + - False + - Outputs results only, without timing information and row counts + * - ``--no-history`` + - False + - When set, prevents command history from being saved in ``~/.sqream/clientcmdhist`` + * - ``--delimiter`` + - ``,`` + - Specifies the field separator. By default, ``sqream sql`` outputs valid CSVs. Change the delimiter to modify the output to another delimited format (e.g. TSV, PSV). See the section :ref:`supported record delimiters` below for more information. + +.. tip:: Run ``$ sqream sql --help`` to see a full list of arguments + +.. 
_supported_record_delimiters: + +Supported Record Delimiters +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The supported record delimiters are printable ASCII values (32-126). + +* Recommended delimiters for use are: ``,``, ``|``, tab character. + +* The following characters are **not supported**: ``\``, ``N``, ``-``, ``:``, ``"``, ``\n``, ``\r``, ``.``, lower-case latin letters, digits (0-9) + +Meta-Commands +------------- + +* Meta-commands in Sqream SQL start with a backslash (``\``). + +.. note:: Meta-commands do not end with a semicolon. + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Command + - Example + - Description + * - ``\q`` or ``\quit`` + - .. code-block:: psql + + master=> \q + - Quit the client. (Same as :kbd:`Ctrl-d`) + * - ``\c <database>`` or ``\connect <database>`` + - .. code-block:: psql + + master=> \c fox + fox=> + - Changes the current connection to an alternate database + +Basic Commands +-------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Command + - Description + * - :kbd:`Ctrl-l` + - Clear the screen. + * - :kbd:`Ctrl-c` + - Terminate the current command. + * - :kbd:`Ctrl-z` + - Suspend/stop the command. + * - :kbd:`Ctrl-d` + - Quit Sqream SQL. + + + +Moving Around the Command Line +------------------------------ + +.. list-table:: + :widths: 17 83 + :header-rows: 1 + + * - Command + - Description + * - :kbd:`Ctrl-a` + - Goes to the beginning of the command line. + * - :kbd:`Ctrl-e` + - Goes to the end of the command line. + * - :kbd:`Ctrl-u` + - Deletes from cursor to the beginning of the command line. + * - :kbd:`Ctrl-k` + - Deletes from the cursor to the end of the command line. + * - :kbd:`Ctrl-w` + - Deletes from the cursor to the beginning of a word. + * - :kbd:`Ctrl-y` + - Pastes a word or text that was cut using one of the deletion shortcuts (such as the one above) after the cursor. + * - :kbd:`Alt-b` + - Moves back one word (or goes to the beginning of the word where the cursor is). + * - :kbd:`Alt-f` + - Moves forward one word (or goes to the end of the word the cursor is on). + * - :kbd:`Alt-d` + - Deletes to the end of a word starting at the cursor. Deletes the whole word if the cursor is at the beginning of that word. + * - :kbd:`Alt-c` + - Capitalizes letters in a word starting at the cursor. Capitalizes the whole word if the cursor is at the beginning of that word. + * - :kbd:`Alt-u` + - Capitalizes from the cursor to the end of the word. + * - :kbd:`Alt-l` + - Makes lowercase from the cursor to the end of the word. + * - :kbd:`Ctrl-f` + - Moves forward one character. + * - :kbd:`Ctrl-b` + - Moves backward one character. + * - :kbd:`Ctrl-h` + - Deletes characters located before the cursor. + * - :kbd:`Ctrl-t` + - Swaps a character at the cursor with the previous character. + +Searching +--------- + +.. list-table:: + :widths: 17 83 + :header-rows: 1 + + * - Command + - Description + * - :kbd:`Ctrl-r` + - Searches the history backward. + * - :kbd:`Ctrl-g` + - Escapes from history-searching mode. + * - :kbd:`Ctrl-p` + - Searches the previous command in history. + * - :kbd:`Ctrl-n` + - Searches the next command in history. diff --git a/reference/cli/server_picker.rst b/reference/cli/server_picker.rst index b4869c181..ee3ad40fe 100644 --- a/reference/cli/server_picker.rst +++ b/reference/cli/server_picker.rst @@ -1,19 +1,16 @@ .. _server_picker_cli_reference: ************************* -server_picker +Server Picker ************************* -SQream DB's load balancer is called ``server_picker``. +SQreamDB's load balancer is called ``server_picker``. 
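+ +Clients do not usually connect to ``server_picker`` directly; instead, a client such as :ref:`sqream sql` reaches it with the ``--clustered`` flag. For example (a sketch, assuming the default listen port ``3108`` and an existing ``jdoe`` role): + +.. code-block:: console + + sqream sql --port=3108 --clustered --username=jdoe -d master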
-This page serves as a reference for the options and parameters. +Command Line Arguments +======================== -Positional command line arguments -=================================== - -.. code-block:: console - - $ server_picker [ [ [ ] ] +Parameters +------------ .. list-table:: :widths: auto @@ -22,18 +19,40 @@ Positional command line arguments * - Argument - Default - Description - * - ``Metadata server address`` - - - - IP or hostname to an active :ref:`metadata server` - * - ``Metadata server port`` - - - - TCP port to an active :ref:`metadata server` - * - ``TCP listen port`` + * - ``--metadata_server_port`` + - ``3105`` + - The metadata server listening port + * - ``--metadata_server_ip`` + - ``127.0.0.1`` + - The metadata server IP + * - ``--port`` - ``3108`` - - TCP port for server picker to listen on - * - ``Metadata server port`` + - The server picker port + * - ``--ssl_port`` - ``3109`` - - SSL port for server picker to listen on + - The server picker SSL port + * - ``--log4_config`` + - ``/home/sqream/sqream3/etc/server_picker_log_properties`` + - The server picker log4 configuration file to use + * - ``--refresh_interval`` + - ``15`` + - The refresh interval for checking available nodes + * - ``--services`` + - None + - A comma-separated list of services + * - ``--help`` + - None + - Displays a help message + * - ``--log_path`` + - ``./server_picker_logs`` + - Configures the default location for the log file + +Example +--------- + +.. code-block:: console + + server_picker --metadata_server_ip=127.0.0.1 --metadata_server_port=3105 --port=3118 --ssl_port=3119 --services=sqream23,sqream0 --log4_config=/home/sqream/metadata_log_properties --refresh_interval=10 Starting server picker ============================ @@ -47,8 +66,8 @@ Assuming we have a :ref:`metadata server` listeni .. code-block:: console - $ nohup server_picker 127.0.0.1 3105 & - $ SP_PID=$! + nohup server_picker --metadata_server_ip=127.0.0.1 --metadata_server_port=3105 & + SP_PID=$! Using ``nohup`` and ``&`` sends server picker to run in the background. @@ -59,8 +78,8 @@ Tell server picker to listen on port 2255 for unsecured connections, and port 22 .. code-block:: console - $ nohup server_picker 127.0.0.1 3105 2255 2266 & - $ SP_PID=$! + nohup server_picker --metadata_server_ip=127.0.0.1 --metadata_server_port=3105 --port=2255 --ssl_port=2266 & + SP_PID=$! Using ``nohup`` and ``&`` sends server picker to run in the background. @@ -69,6 +88,6 @@ Stopping server picker .. code-block:: console - $ kill -9 $SP_PID + kill -9 $SP_PID .. tip:: It is safe to stop any SQream DB component at any time using ``kill``. No partial data or data corruption should occur when using this method to stop the process. diff --git a/reference/cli/sqream_console.rst b/reference/cli/sqream_console.rst deleted file mode 100644 index 0fda5cfdc..000000000 --- a/reference/cli/sqream_console.rst +++ /dev/null @@ -1,451 +0,0 @@ -.. _sqream_console_cli_reference: - -********************************* -sqream-console -********************************* - -``sqream-console`` is an interactive shell designed to help manage a dockerized SQream DB installation. - -The console itself is a dockerized application. - -This page serves as a reference for the options and parameters. - -.. contents:: In this topic: - :local: - -Starting the console -====================== - -``sqream-console`` can be found in your SQream DB installation, under the name ``sqream-console``. 
- -Start the console by executing it from the shell - -.. code-block:: console - - $ ./sqream-console - .................................................................................................................... - - ███████╗ ██████╗ ██████╗ ███████╗ █████╗ ███╗ ███╗ ██████╗ ██████╗ ███╗ ██╗███████╗ ██████╗ ██╗ ███████╗ - ██╔════╝██╔═══██╗██╔══██╗██╔════╝██╔══██╗████╗ ████║ ██╔════╝██╔═══██╗████╗ ██║██╔════╝██╔═══██╗██║ ██╔════╝ - ███████╗██║ ██║██████╔╝█████╗ ███████║██╔████╔██║ ██║ ██║ ██║██╔██╗ ██║███████╗██║ ██║██║ █████╗ - ╚════██║██║▄▄ ██║██╔══██╗██╔══╝ ██╔══██║██║╚██╔╝██║ ██║ ██║ ██║██║╚██╗██║╚════██║██║ ██║██║ ██╔══╝ - ███████║╚██████╔╝██║ ██║███████╗██║ ██║██║ ╚═╝ ██║ ╚██████╗╚██████╔╝██║ ╚████║███████║╚██████╔╝███████╗███████╗ - ╚══════╝ ╚══▀▀═╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═══╝╚══════╝ ╚═════╝ ╚══════╝╚══════╝ - - .................................................................................................................... - - - Welcome to SQream Console ver 1.7.6, type exit to log-out - - usage: sqream [-h] [--settings] {master,worker,client,editor} ... - - Run SQream Cluster - - optional arguments: - -h, --help show this help message and exit - --settings sqream environment variables settings - - subcommands: - sqream services - - {master,worker,client,editor} - sub-command help - master start sqream master - worker start sqream worker - client operating sqream client - editor operating sqream statement editor - sqream-console> - -The console is now waiting for commands. - -The console is a wrapper around a standard linux shell. It supports commands like ``ls``, ``cp``, etc. - -All SQream DB-specific commands start with the keyword ``sqream``. - - -Operations and flag reference -=============================== - -Commands ------------------------ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Command - - Description - * - ``sqream --help`` - - Shows the initial usage information - * - ``sqream master`` - - Controls the master node's operations - * - ``sqream worker`` - - Controls workers' operations - * - ``sqream client`` - - Access to :ref:`sqream sql` - * - ``sqream editor`` - - Controls the statement editor's operations (web UI) - -.. _master_node: - -Master ------------- - -The master node contains the :ref:`metadata server` and the :ref:`load balancer`. - -Syntax -^^^^^^^^^^ - -.. code-block:: console - - sqream master - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Flag/command - - Description - * - ``--start [ --single-host ]`` - - - Starts the master node. - The ``--single-host`` modifier sets the mode to allow all containers to run on the same server. - - * - ``--stop [ --all ]`` - - - Stops the master node and all connected :ref:`workers`. - The ``--all`` modifier instructs the ``--stop`` command to stop all running services related to SQream DB - * - ``--list`` - - Shows a list of all active master nodes and their workers - * - ``-p `` - - Sets the port for the load balancer. Defaults to ``3108`` - * - ``-m `` - - Sets the port for the metadata server. Defaults to ``3105`` - -Common usage -^^^^^^^^^^^^^^^ - -Start master node -******************** - -.. code-block:: console - - sqream-console> sqream master --start - starting master server in single_host mode ... - sqream_single_host_master is up and listening on ports: 3105,3108 - -Start master node on different ports -******************************************* - -.. 
code-block:: console - - sqream-console> sqream master --start -p 4105 -m 4108 - starting master server in single_host mode ... - sqream_single_host_master is up and listening on ports: 4105,4108 - -Listing active master nodes and workers -*************************************************** - -.. code-block:: console - - sqream-console> sqream master --list - container name: sqream_single_host_worker_1, container id: de9b8aff0a9c - container name: sqream_single_host_worker_0, container id: c919e8fb78c8 - container name: sqream_single_host_master, container id: ea7eef80e038 - -Stopping all SQream DB workers and master -********************************************* - -.. code-block:: console - - sqream-console> sqream master --stop --all - shutting down 2 sqream services ... - sqream_editor stopped - sqream_single_host_worker_1 stopped - sqream_single_host_worker_0 stopped - sqream_single_host_master stopped - -.. _workers: - -Workers ------------- - -Workers are :ref:`SQream DB daemons`, that connect to the master node. - -Syntax -^^^^^^^^^^ - -.. code-block:: console - - sqream worker - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Flag/command - - Description - * - ``--start [ options [ ...] ]`` - - Starts worker nodes. See options table below. - * - ``--stop [ | --all ]`` - - - Stops the specified worker name. - The ``--all`` modifier instructs the ``--stop`` command to stop all running workers. - -Start options are specified consecutively, separated by spaces. - -.. list-table:: Start options - :widths: auto - :header-rows: 1 - - * - Option - - Description - * - ```` - - Specifies the number of workers to start - * - ``-j [ ...]`` - - Specifies configuration files to apply to each worker. When launching multiple workers, specify one file per worker, separated by spaces. - * - ``-p [ ...]`` - - Sets the ports to listen on. When launching multiple workers, specify one port per worker, separated by spaces. Defaults to 5000 - 5000+n. - * - ``-g [ ...]`` - - Sets the GPU ordinal to assign to each worker. When launching multiple workers, specify one GPU ordinal per worker, separated by spaces. Defaults to automatic allocation. - * - ``-m `` - - Sets the spool memory per node in gigabytes. - * - ``--master-host`` - - Sets the hostname for the master node. Defaults to ``localhost``. - * - ``--master-port`` - - Sets the port for the master node. Defaults to ``3105``. - * - ``--stand-alone`` - - For testing only: Starts a worker without connecting to the master node. - -Common usage -^^^^^^^^^^^^^^^ - -Start 2 workers -******************** - -After starting the master node, start workers: - -.. code-block:: console - - sqream-console> sqream worker --start 2 - started sqream_single_host_worker_0 on port 5000, allocated gpu: 0 - started sqream_single_host_worker_1 on port 5001, allocated gpu: 1 - -Stop a single worker -******************************************* - -To stop a single worker, find its name first: - -.. code-block:: console - - sqream-console> sqream master --list - container name: sqream_single_host_worker_1, container id: de9b8aff0a9c - container name: sqream_single_host_worker_0, container id: c919e8fb78c8 - container name: sqream_single_host_master, container id: ea7eef80e038 - -Then, issue a stop command: - -.. 
code-block:: console - - sqream-console> sqream worker --stop sqream_single_host_worker_1 - stopped sqream_single_host_worker_1 - -Start workers with a different spool size -********************************************** - -If no spool size is specified, the RAM is equally distributed among workers. -Sometimes a system engineer may wish to specify the spool size manually. - -This example starts two workers, with a spool size of 50GB per node: - -.. code-block:: console - - sqream-console> sqream worker --start 2 -m 50 - -Starting multiple workers on non-dedicated GPUs -**************************************************** - -By default, SQream DB workers assign one worker per GPU. However, a system engineer may wish to assign multiple workers per GPU, if the workload permits it. - -This example starts 4 workers on 2 GPUs, with 50GB spool each: - -.. code-block:: console - - sqream-console> sqream worker --start 2 -g 0 -m 50 - started sqream_single_host_worker_0 on port 5000, allocated gpu: 0 - started sqream_single_host_worker_1 on port 5001, allocated gpu: 0 - sqream-console> sqream worker --start 2 -g 1 -m 50 - started sqream_single_host_worker_2 on port 5002, allocated gpu: 1 - started sqream_single_host_worker_3 on port 5003, allocated gpu: 1 - -Overriding default configuration files -******************************************* - -It is possible to override default configuration settings by listing a configuration file for every worker. - -This example starts 2 workers on the same GPU, with modified configuration files: - -.. code-block:: console - - sqream-console> sqream worker --start 2 -g 0 -j /etc/sqream/configfile.json /etc/sqream/configfile2.json - -Client ------------- - -The client operation runs :ref:`sqream sql` in interactive mode. - -.. note:: The dockerized client is useful for testing and experimentation. It is not the recommended method for executing analytic queries. See more about connecting a :ref:`third party tool to SQream DB ` for data analysis. - -Syntax -^^^^^^^^^^ - -.. code-block:: console - - sqream client - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Flag/command - - Description - * - ``--master`` - - Connects to the master node via the load balancer - * - ``--worker`` - - Connects to a worker directly - * - ``--host `` - - Specifies the hostname to connect to. Defaults to ``localhost``. - * - ``--port ``, ``-p `` - - Specifies the port to connect to. Defaults to ``3108`` when used with ``-master``. - * - ``--user ``, ``-u `` - - Specifies the role's username to use - * - ``--password ``, ``-w `` - - Specifies the password to use for the role - * - ``--database ``, ``-d `` - - Specifies the database name for the connection. Defaults to ``master``. - -Common usage -^^^^^^^^^^^^^^^ - -Start a client -******************** - -Connect to default ``master`` database through the load balancer: - -.. code-block:: console - - sqream-console> sqream client --master -u sqream -w sqream - Interactive client mode - To quit, use ^D or \q. - - master=> _ - -Start a client to a specific worker -************************************** - -Connect to database ``raviga`` directly to a worker on port 5000: - -.. code-block:: console - - sqream-console> sqream client --worker -u sqream -w sqream -p 5000 -d raviga - Interactive client mode - To quit, use ^D or \q. - - raviga=> _ - -Start master node on different ports -******************************************* - -.. 
code-block:: console - - sqream-console> sqream master --start -p 4105 -m 4108 - starting master server in single_host mode ... - sqream_single_host_master is up and listening on ports: 4105,4108 - -Listing active master nodes and worker nodes -*************************************************** - -.. code-block:: console - - sqream-console> sqream master --list - container name: sqream_single_host_worker_1, container id: de9b8aff0a9c - container name: sqream_single_host_worker_0, container id: c919e8fb78c8 - container name: sqream_single_host_master, container id: ea7eef80e038 - -.. _start_editor: - -Editor ------------- - -The editor operation runs the web UI for the :ref:`SQream DB Statement Editor`. - -The editor can be used to run queries from a browser. - -Syntax -^^^^^^^^^^ - -.. code-block:: console - - sqream editor - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Flag/command - - Description - * - ``--start`` - - Start the statement editor - * - ``--stop`` - - Shut down the statement editor - * - ``--port ``, ``-p `` - - Specify a different port for the editor. Defaults to ``3000``. - -Common usage -^^^^^^^^^^^^^^^ - -Start the editor UI -********************** - -.. code-block:: console - - sqream-console> sqream editor --start - access sqream statement editor through Chrome http://192.168.0.100:3000 - -Stop the editor UI -********************** - -.. code-block:: console - - sqream-console> sqream editor --stop - sqream_editor stopped - - -Using the console to start SQream DB -============================================ - -The console is used to start and stop SQream DB components in a dockerized environment. - -Starting a SQream DB cluster for the first time -------------------------------------------------------- - -To start a SQream DB cluster, start the master node, followed by workers. - -The example below starts 2 workers, running on 2 dedicated GPUs. - -.. code-block:: console - - sqream-console> sqream master --start - starting master server in single_host mode ... - sqream_single_host_master is up and listening on ports: 3105,3108 - - sqream-console> sqream worker --start 2 - started sqream_single_host_worker_0 on port 5000, allocated gpu: 0 - started sqream_single_host_worker_1 on port 5001, allocated gpu: 1 - - sqream-console> sqream editor --start - access sqream statement editor through Chrome http://192.168.0.100:3000 - -SQream DB is now listening on port 3108 for any incoming statements. - -A user can also access the web editor (running on port ``3000`` on the SQream DB machine) to connect and run queries. \ No newline at end of file diff --git a/reference/cli/sqream_installer.rst b/reference/cli/sqream_installer.rst deleted file mode 100644 index cdd9e801a..000000000 --- a/reference/cli/sqream_installer.rst +++ /dev/null @@ -1,144 +0,0 @@ -.. _sqream_installer_cli_reference: - -********************************* -sqream-installer -********************************* - -``sqream-installer`` is an application that prepares and configures a dockerized SQream DB installation. - - -This page serves as a reference for the options and parameters. - -.. contents:: In this topic: - :local: - - -Operations and flag reference -=============================== - -Command line flags ------------------------ - -.. 
list-table:: - :widths: auto - :header-rows: 1 - - * - Flag - - Description - * - ``-i`` - - Loads the docker images for installation - * - ``-k`` - - Load new licenses from the ``license`` subdirectory - * - ``-K`` - - Validate licenses - * - ``-f`` - - Force overwrite any existing installation **and data directories currently in use** - * - ``-c `` - - Specifies a path to read and store configuration files in. Defaults to ``/etc/sqream``. - * - ``-v `` - - Specifies a path to the storage cluster. The path is created if it does not exist. - * - ``-l `` - - Specifies a path to store system startup logs. Defaults to ``/var/log/sqream`` - * - ``-d `` - - Specifies a path to expose to SQream DB workers. To expose several paths, repeat the usage of this flag. - * - ``-s`` - - Shows system settings - * - ``-r`` - - Reset the system configuration. This flag can't be combined with other flags. - -Usage -============= - -Install SQream DB for the first time ----------------------------------------- - -Assuming license package tarball has been placed in the ``license`` subfolder. - -* The path where SQream DB will store data is ``/home/rhendricks/sqream_storage``. - -* Logs will be stored in /var/log/sqream - -* Source CSV, Parquet, and ORC files can be accessed from ``/home/rhendricks/source_data``. All other directory paths are hidden from the Docker container. - -.. code-block:: console - - # ./sqream-install -i -k -v /home/rhendricks/sqream_storage -l /var/log/sqream -c /etc/sqream -d /home/rhendricks/source_data - -.. note:: Installation commands should be run with ``sudo`` or root access. - -Modify exposed directories -------------------------------- - -To expose more directory paths for SQream DB to read and write data from, re-run the installer with additional directory flags. - -.. code-block:: console - - # ./sqream-install -d /home/rhendricks/more_source_data - -There is no need to specify the initial installation flags - only the modified exposed directory paths flag. - - -Install a new license package ----------------------------------- - -Assuming license package tarball has been placed in the ``license`` subfolder. - -.. code-block:: console - - # ./sqream-install -k - -View system settings ----------------------------- - -This information may be useful to identify problems accessing directory paths, or locating where data is stored. - -.. code-block:: console - - # ./sqream-install -s - SQREAM_CONSOLE_TAG=1.7.4 - SQREAM_TAG=2020.1 - SQREAM_EDITOR_TAG=3.1.0 - license_worker_0=[...] - license_worker_1=[...] - license_worker_2=[...] - license_worker_3=[...] - SQREAM_VOLUME=/home/rhendricks/sqream_storage - SQREAM_DATA_INGEST=/home/rhendricks/source_data - SQREAM_CONFIG_DIR=/etc/sqream/ - LICENSE_VALID=true - SQREAM_LOG_DIR=/var/log/sqream/ - SQREAM_USER=sqream - SQREAM_HOME=/home/sqream - SQREAM_ENV_PATH=/home/sqream/.sqream/env_file - PROCESSOR=x86_64 - METADATA_PORT=3105 - PICKER_PORT=3108 - NUM_OF_GPUS=8 - CUDA_VERSION=10.1 - NVIDIA_SMI_PATH=/usr/bin/nvidia-smi - DOCKER_PATH=/usr/bin/docker - NVIDIA_DRIVER=418 - SQREAM_MODE=single_host - - -.. _upgrade_with_docker: - -Upgrading to a new version of SQream DB ----------------------------------------------- - -When upgrading to a new version with Docker, most settings don't need to be modified. - -The upgrade process replaces the existing docker images with new ones. - -#. Obtain the new tarball, and untar it to an accessible location. Enter the newly extracted directory. - -#. - Install the new images - - .. 
code-block:: console - - # ./sqream-install -i - -#. The upgrade process will check for running SQream DB processes. If any are found running, the installer will ask to stop them in order to continue the upgrade process. Once all services are stopped, the new version will be loaded. - -#. After the upgrade, open :ref:`sqream_console_cli_reference` and restart the desired services. \ No newline at end of file diff --git a/reference/cli/sqream_sql.rst b/reference/cli/sqream_sql.rst index 54a38300d..ef865032b 100644 --- a/reference/cli/sqream_sql.rst +++ b/reference/cli/sqream_sql.rst @@ -1,81 +1,45 @@ .. _sqream_sql_cli_reference: -********************************* -Sqream SQL CLI Reference -********************************* +************** +Sqream SQL CLI +************** -SQream DB comes with a built-in client for executing SQL statements either interactively or from the command-line. +The SQreamDB SQL Java-based CLI allows SQL statements to be executed interactively or from shell scripts. The CLI is cross-platform, meaning it can run on any operating system that supports Java. If you are not using Bash to manage and run your Java applications, use the ``java -jar`` command to run this CLI. -This page serves as a reference for the options and parameters. Learn more about using SQream DB SQL with the CLI by visiting the :ref:`first_steps` tutorial. +.. note:: + For the old version of the SQream SQL (Haskell-based) CLI, see :ref:`Haskell CLI documentation` -.. contents:: In this topic: +.. contents:: :local: + :depth: 1 -Installing Sqream SQL -========================= +Before You Begin +================ -If you have a SQream DB installation on your server, ``sqream sql`` can be found in the ``bin`` directory of your SQream DB installation, under the name ``sqream``. +* It is essential that Java 8 is installed. -.. note:: If you installed SQream DB via Docker, the command is named ``sqream-client sql``, and can be found in the same location as the console. +* Download the latest CLI from the :ref:`Client Driver Downloads page `. +* It is essential that you have the Java home path configured in your ``sqream`` file: -.. versionchanged:: 2020.1 - As of version 2020.1, ``ClientCmd`` has been renamed to ``sqream sql``. - + #. Open the ``sqream`` file using any text editor. -To run ``sqream sql`` on any other Linux host: + #. Replace the default path ``/usr/lib/jvm/jdk-8.0.0/bin/java`` with the local Java 8 path on your machine: -#. Download the ``sqream sql`` tarball package from the :ref:`client_drivers` page. -#. Untar the package: ``tar xf sqream-sql-v2020.1.1_stable.x86_64.tar.gz`` -#. Start the client: - - .. code-block:: psql - - $ cd sqream-sql-v2020.1.1_stable.x86_64 - $ ./sqream sql --port=5000 --username=jdoe --databasename=master - Password: - - Interactive client mode - To quit, use ^D or \q. - - master=> _ - -Troubleshooting Sqream SQL Installation ------------------------------------------- - -Upon running sqream sql for the first time, you may get an error ``error while loading shared libraries: libtinfo.so.5: cannot open shared object file: No such file or directory``. - -Solving this error requires installing the ncruses or libtinfo libraries, depending on your operating system. - -* Ubuntu: - - #. Install ``libtinfo``: - - ``$ sudo apt-get install -y libtinfo`` - #. Depending on your Ubuntu version, you may need to create a symbolic link to the newer libtinfo that was installed. 
- - For example, if ``libtinfo`` was installed as ``/lib/x86_64-linux-gnu/libtinfo.so.6.2``: - - ``$ sudo ln -s /lib/x86_64-linux-gnu/libtinfo.so.6.2 /lib/x86_64-linux-gnu/libtinfo.so.5`` - -* CentOS / RHEL: + .. code-block:: none - #. Install ``ncurses``: - - ``$ sudo yum install -y ncurses-libs`` - #. Depending on your RHEL version, you may need to create a symbolic link to the newer libtinfo that was installed. - - For example, if ``libtinfo`` was installed as ``/usr/lib64/libtinfo.so.6``: - - ``$ sudo ln -s /usr/lib64/libtinfo.so.6 /usr/lib64/libtinfo.so.5`` + if [[ "$@" =~ "access-token" ]]; then + JAVA_CMD="/usr/lib/jvm/jdk-8.0.0/bin/java" + else + JAVA_CMD="/usr/lib/jvm/java-1.8.0/bin/java" + fi -Using Sqream SQL -================= +Using SQreamDB SQL +================== By default, sqream sql runs in interactive mode. You can issue commands or SQL statements. Running Commands Interactively (SQL shell) --------------------------------------------- +------------------------------------------ When starting sqream sql, after entering your password, you are presented with the SQL shell. @@ -139,7 +103,7 @@ The prompt for a multi-line statement will change from ``=>`` to ``.``, to alert Executing Batch Scripts (``-f``) ---------------------------------- +-------------------------------- To run an SQL script, use the ``-f `` argument. @@ -152,7 +116,7 @@ For example, .. tip:: Output can be saved to a file by using redirection (``>``). Executing Commands Immediately (``-c``) ------------------------------------------- +--------------------------------------- To run a statement from the console, use the ``-c `` argument. @@ -173,10 +137,10 @@ For example, Examples -=========== +======== Starting a Regular Interactive Shell ------------------------------------ +------------------------------------ Connect to local server 127.0.0.1 on port 5000, to the default built-in database, `master`: @@ -203,7 +167,7 @@ Connect to local server 127.0.0.1 via the built-in load balancer on port 3108, t master=>_ Executing Statements in an Interactive Shell ------------------------------------------------ +-------------------------------------------- Note that all SQL commands end with a semicolon. @@ -225,7 +189,7 @@ Creating a new database and switching over to it without reconnecting: .. code-block:: psql - farm=> create table animals(id int not null, name varchar(30) not null, is_angry bool not null); + farm=> create table animals(id int not null, name text(30) not null, is_angry bool not null); executed time: 0.011940s @@ -256,7 +220,7 @@ Executing SQL Statements from the Command Line .. _controlling_output: Controlling the Client Output ----------------------------------------- +----------------------------- Two parameters control the dispay of results from the client: @@ -264,7 +228,7 @@ Two parameters control the dispay of results from the client: * ``--delimiter`` - changes the record delimiter Exporting SQL Query Results to CSV -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Using the ``--results-only`` flag removes the row counts and timing. @@ -278,7 +242,7 @@ Using the ``--results-only`` flag removes the row counts and timing. 4,bull ,1 Changing a CSV to a TSV -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^ The ``--delimiter`` parameter accepts any printable character. 
@@ -303,7 +267,7 @@ Assuming a file containing SQL statements (separated by semicolons): $ cat some_queries.sql CREATE TABLE calm_farm_animals - ( id INT IDENTITY(0, 1), name VARCHAR(30) + ( id INT IDENTITY(0, 1), name TEXT(30) ); INSERT INTO calm_farm_animals (name) @@ -318,7 +282,7 @@ Assuming a file containing SQL statements (separated by semicolons): time: 0.090697s Connecting Using Environment Variables -------------------------------------- +-------------------------------------- You can save connection parameters as environment variables: @@ -329,7 +293,7 @@ You can save connection parameters as environment variables: $ sqream sql --port=3105 --clustered --username=$SQREAM_USER -d $SQREAM_DATABASE Connecting to a Specific Queue ------------------------------------ +------------------------------ When using the :ref:`dynamic workload manager` - connect to ``etl`` queue instead of using the default ``sqream`` queue. @@ -345,10 +309,10 @@ When using the :ref:`dynamic workload manager` - connect to `` Operations and Flag References -=============================== +============================== Command Line Arguments ------------------------ +---------------------- **Sqream SQL** supports the following command line arguments: @@ -361,19 +325,19 @@ Command Line Arguments - Description * - ``-c`` or ``--command`` - None - - Changes the mode of operation to single-command, non-interactive. Use this argument to run a statement and immediately exit. + - Changes the mode of operation to single-command, non-interactive. Use this argument to run a statement and immediately exit * - ``-f`` or ``--file`` - None - - Changes the mode of operation to multi-command, non-interactive. Use this argument to run a sequence of statements from an external file and immediately exit. - * - ``--host`` + - Changes the mode of operation to multi-command, non-interactive. Use this argument to run a sequence of statements from an external file and immediately exit + * - ``-h``, or``--host`` - ``127.0.0.1`` - - Address of the SQream DB worker. - * - ``--port`` + - Address of the SQreamDB worker + * - ``-p`` or ``--port`` - ``5000`` - Sets the connection port. - * - ``--databasename`` or ``-d`` + * - ``--databasename``, ``-d``, or ``database`` - None - - Specifies the database name for queries and statements in this session. + - Specifies the database name for queries and statements in this session * - ``--username`` - None - Username to connect to the specified database. @@ -382,10 +346,10 @@ Command Line Arguments - Specify the password using the command line argument. If not specified, the client will prompt the user for the password. * - ``--clustered`` - False - - When used, the client connects to the load balancer, usually on port ``3108``. If not set, the client assumes the connection is to a standalone SQream DB worker. - * - ``--service`` + - When used, the client connects to the load balancer, usually on port ``3108``. If not set, the client assumes the connection is to a standalone SQreamDB worker + * - ``-s`` or ``--service`` - ``sqream`` - - :ref:`Service name (queue)` that statements will file into. + - :ref:`Service name (queue)` that statements will file into * - ``--results-only`` - False - Outputs results only, without timing information and row counts @@ -394,14 +358,30 @@ Command Line Arguments - When set, prevents command history from being saved in ``~/.sqream/clientcmdhist`` * - ``--delimiter`` - ``,`` - - Specifies the field separator. By default, ``sqream sql`` outputs valid CSVs. 
Change the delimiter to modify the output to another delimited format (e.g. TSV, PSV). See the section :ref:`supported record delimiters` below for more information. + - Specifies the field separator. By default, ``sqream sql`` outputs valid CSVs. Change the delimiter to modify the output to another delimited format (e.g. TSV, PSV). See the section :ref:`supported record delimiters` below for more information + * - ``--chunksize`` + - 128 * 1024 (128 KB) + - Network chunk size + * - ``--log`` or ``--log-file`` + - False + - When set, a log file is created + * - ``--show-results`` + - True + - Determines whether or not results are shown + * - ``--ssl`` + - False + - Determines whether the connection uses SSL + * - ``--table-view`` + - ``true`` + - Displays query results in a table view format with column headers. The display limit is set to 10,000 rows + .. tip:: Run ``$ sqream sql --help`` to see a full list of arguments .. _supported_record_delimiters: Supported Record Delimiters -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^ The supported record delimiters are printable ASCII values (32-126). @@ -410,7 +390,7 @@ The supported record delimiters are printable ASCII values (32-126). * The following characters are **not supported**: ``\``, ``N``, ``-``, ``:``, ``"``, ``\n``, ``\r``, ``.``, lower-case latin letters, digits (0-9) Meta-Commands ----------------- +------------- * Meta-commands in Sqream SQL start with a backslash (``\``) @@ -436,10 +416,10 @@ Meta-Commands - Changes the current connection to an alternate database Basic Commands ------------------------ +-------------- .. list-table:: - :widths: 20 30 50 + :widths: auto :header-rows: 1 * - Command @@ -456,7 +436,7 @@ Basic Commands Moving Around the Command Line ---------------------------------- +------------------------------ .. list-table:: :widths: 17 83 @@ -498,7 +478,7 @@ Moving Around the Command Line - Swaps a character at the cursor with the previous character. Searching ------------- +--------- .. list-table:: :widths: 17 83 diff --git a/reference/cli/sqream_sql_haskell_cli.rst b/reference/cli/sqream_sql_haskell_cli.rst new file mode 100644 index 000000000..3f910801a --- /dev/null +++ b/reference/cli/sqream_sql_haskell_cli.rst @@ -0,0 +1,521 @@ +:orphan: + +.. _sqream_sql_haskell_cli: + +************************ +Sqream SQL Haskell CLI +************************ + +SQreamDB comes with a built-in client for executing SQL statements either interactively or from the command-line. + +.. contents:: + :local: + :depth: 1 + +Installing Sqream SQL +===================== + +If you have a SQreamDB installation on your server, ``sqream sql`` can be found in the ``bin`` directory of your SQreamDB installation, under the name ``sqream``. + + + +.. versionchanged:: 2020.1 + As of version 2020.1, ``ClientCmd`` has been renamed to ``sqream sql``. + + +To run ``sqream sql`` on any other Linux host: + +#. Download the ``sqream sql`` tarball package from the :ref:`client_drivers` page. +#. Untar the package: ``tar xf sqream-sql-v2020.1.1_stable.x86_64.tar.gz`` +#. Start the client: + + .. code-block:: psql + + $ cd sqream-sql-v2020.1.1_stable.x86_64 + $ ./sqream sql --port=5000 --username=jdoe --databasename=master + Password: + + Interactive client mode + To quit, use ^D or \q.
+ + master=> _ + +Troubleshooting Sqream SQL Installation +--------------------------------------- + +Upon running sqream sql for the first time, you may get an error ``error while loading shared libraries: libtinfo.so.5: cannot open shared object file: No such file or directory``. + +Solving this error requires installing the ncurses or libtinfo libraries, depending on your operating system. + +* RHEL: + + #. Install ``ncurses``: + + ``$ sudo yum install -y ncurses-libs`` + #. Depending on your RHEL version, you may need to create a symbolic link to the newer libtinfo that was installed. + + For example, if ``libtinfo`` was installed as ``/usr/lib64/libtinfo.so.6``: + + ``$ sudo ln -s /usr/lib64/libtinfo.so.6 /usr/lib64/libtinfo.so.5``
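+
+ To confirm which ``libtinfo`` versions are actually present before creating the link, one quick check (a sketch only, assuming ``ldconfig`` is available; paths and output vary by distribution) is:
+
+ .. code-block:: console
+
+ $ ldconfig -p | grep libtinfo
+ libtinfo.so.6 (libc6,x86-64) => /usr/lib64/libtinfo.so.6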
+ +Using SQreamDB SQL +================== + +By default, sqream sql runs in interactive mode. You can issue commands or SQL statements. + +Running Commands Interactively (SQL shell) +------------------------------------------ + +When starting sqream sql, after entering your password, you are presented with the SQL shell. + +To exit the shell, type ``\q`` or :kbd:`Ctrl-d`. + +.. code-block:: psql + + $ sqream sql --port=5000 --username=jdoe --databasename=master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> _ + +The database name shown means you are now ready to run statements and queries. + +Statements and queries are standard SQL, followed by a semicolon (``;``). Statement results are usually formatted as a valid CSV, +followed by the number of rows and the elapsed time for that statement. + +.. code-block:: psql + + master=> SELECT TOP 5 * FROM nba; + Avery Bradley ,Boston Celtics ,0,PG,25,6-2 ,180,Texas ,7730337 + Jae Crowder ,Boston Celtics ,99,SF,25,6-6 ,235,Marquette ,6796117 + John Holland ,Boston Celtics ,30,SG,27,6-5 ,205,Boston University ,\N + R.J. Hunter ,Boston Celtics ,28,SG,22,6-5 ,185,Georgia State ,1148640 + Jonas Jerebko ,Boston Celtics ,8,PF,29,6-10,231,\N,5000000 + 5 rows + time: 0.001185s + +.. note:: Null values are represented as \\N. + +When writing long statements and queries, it may be beneficial to use line-breaks. +The prompt for a multi-line statement will change from ``=>`` to ``.``, to alert users to the change. The statement will not execute until a semicolon is used. + + +.. code-block:: psql + :emphasize-lines: 13 + + $ sqream sql --port=5000 --username=mjordan -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> SELECT "Age", + . AVG("Salary") + . FROM NBA + . GROUP BY 1 + . ORDER BY 2 ASC + . LIMIT 5 + . ; + 38,1840041 + 19,1930440 + 23,2034746 + 21,2067379 + 36,2238119 + 5 rows + time: 0.009320s + + +Executing Batch Scripts (``-f``) +-------------------------------- + +To run an SQL script, use the ``-f <filename>`` argument. + +For example, + +.. code-block:: console + + $ sqream sql --port=5000 --username=jdoe -d master -f sql_script.sql --results-only + +.. tip:: Output can be saved to a file by using redirection (``>``). + +Executing Commands Immediately (``-c``) +--------------------------------------- + +To run a statement from the console, use the ``-c <statement>`` argument. + +For example, + +.. code-block:: console + + $ sqream sql --port=5000 --username=jdoe -d nba -c "SELECT TOP 5 * FROM nba" + Avery Bradley ,Boston Celtics ,0,PG,25,6-2 ,180,Texas ,7730337 + Jae Crowder ,Boston Celtics ,99,SF,25,6-6 ,235,Marquette ,6796117 + John Holland ,Boston Celtics ,30,SG,27,6-5 ,205,Boston University ,\N + R.J. Hunter ,Boston Celtics ,28,SG,22,6-5 ,185,Georgia State ,1148640 + Jonas Jerebko ,Boston Celtics ,8,PF,29,6-10,231,\N,5000000 + 5 rows + time: 0.202618s + +.. tip:: Remove the timing and row count by passing the ``--results-only`` parameter + + +Examples +======== + +Starting a Regular Interactive Shell +------------------------------------ + +Connect to local server 127.0.0.1 on port 5000, to the default built-in database, `master`: + +.. code-block:: psql + + $ sqream sql --port=5000 --username=mjordan -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=>_ + +Connect to local server 127.0.0.1 via the built-in load balancer on port 3108, to the default built-in database, `master`: + +.. code-block:: psql + + $ sqream sql --port=3105 --clustered --username=mjordan -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=>_ + +Executing Statements in an Interactive Shell +-------------------------------------------- + +Note that all SQL commands end with a semicolon. + +Creating a new database and switching over to it without reconnecting: + +.. code-block:: psql + + $ sqream sql --port=3105 --clustered --username=oldmcd -d master + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=> create database farm; + executed + time: 0.003811s + master=> \c farm + farm=> + +.. code-block:: psql + + farm=> create table animals(id int not null, name text(30) not null, is_angry bool not null); + executed + time: 0.011940s + + farm=> insert into animals values(1,'goat',false); + executed + time: 0.000405s + + farm=> insert into animals values(4,'bull',true) ; + executed + time: 0.049338s + + farm=> select * from animals; + 1,goat ,0 + 4,bull ,1 + 2 rows + time: 0.029299s + +Executing SQL Statements from the Command Line +---------------------------------------------- + +.. code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -c "SELECT * FROM animals WHERE is_angry = true" + 4,bull ,1 + 1 row + time: 0.095941s + +.. _controlling_output: + +Controlling the Client Output +----------------------------- + +Two parameters control the display of results from the client: + +* ``--results-only`` - removes row counts and timing information +* ``--delimiter`` - changes the record delimiter + +Exporting SQL Query Results to CSV +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Using the ``--results-only`` flag removes the row counts and timing. + +.. code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -c "SELECT * FROM animals" --results-only > file.csv + $ cat file.csv + 1,goat ,0 + 2,sow ,0 + 3,chicken ,0 + 4,bull ,1 + +Changing a CSV to a TSV +^^^^^^^^^^^^^^^^^^^^^^^ + +The ``--delimiter`` parameter accepts any printable character. + +.. tip:: To insert a tab, use :kbd:`Ctrl-V` followed by :kbd:`Tab ↹` in Bash. + +.. code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -c "SELECT * FROM animals" --delimiter ' ' > file.tsv + $ cat file.tsv + 1 goat 0 + 2 sow 0 + 3 chicken 0 + 4 bull 1 + + +Executing a Series of Statements From a File +-------------------------------------------- + +Assuming a file containing SQL statements (separated by semicolons): + +.. code-block:: console + + $ cat some_queries.sql + CREATE TABLE calm_farm_animals + ( id INT IDENTITY(0, 1), name TEXT(30) + ); + + INSERT INTO calm_farm_animals (name) + SELECT name FROM animals WHERE is_angry = false; + +..
code-block:: console + + $ sqream sql --port=3105 --clustered --username=oldmcd -d farm -f some_queries.sql + executed + time: 0.018289s + executed + time: 0.090697s + +Connecting Using Environment Variables +-------------------------------------- + +You can save connection parameters as environment variables: + +.. code-block:: console + + $ export SQREAM_USER=sqream; + $ export SQREAM_DATABASE=farm; + $ sqream sql --port=3105 --clustered --username=$SQREAM_USER -d $SQREAM_DATABASE + +Connecting to a Specific Queue +------------------------------ + +When using the :ref:`dynamic workload manager` - connect to ``etl`` queue instead of using the default ``sqream`` queue. + +.. code-block:: psql + + $ sqream sql --port=3105 --clustered --username=mjordan -d master --service=etl + Password: + + Interactive client mode + To quit, use ^D or \q. + + master=>_ + + +Operations and Flag References +============================== + +Command Line Arguments +---------------------- + +**Sqream SQL** supports the following command line arguments: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Argument + - Default + - Description + * - ``-c`` or ``--command`` + - None + - Changes the mode of operation to single-command, non-interactive. Use this argument to run a statement and immediately exit + * - ``-f`` or ``--file`` + - None + - Changes the mode of operation to multi-command, non-interactive. Use this argument to run a sequence of statements from an external file and immediately exit + * - ``-h`` or ``--host`` + - ``127.0.0.1`` + - Address of the SQreamDB worker + * - ``-p`` or ``--port`` + - ``5000`` + - Sets the connection port. + * - ``--databasename``, ``-d``, or ``database`` + - None + - Specifies the database name for queries and statements in this session + * - ``--username`` + - None + - Username to connect to the specified database. + * - ``--password`` + - None + - Specify the password using the command line argument. If not specified, the client will prompt the user for the password + * - ``--clustered`` + - False + - When used, the client connects to the load balancer, usually on port ``3108``. If not set, the client assumes the connection is to a standalone SQreamDB worker + * - ``-s`` or ``--service`` + - ``sqream`` + - :ref:`Service name (queue)` that statements will file into + * - ``--results-only`` + - False + - Outputs results only, without timing information and row counts + * - ``--no-history`` + - False + - When set, prevents command history from being saved in ``~/.sqream/clientcmdhist`` + * - ``--delimiter`` + - ``,`` + - Specifies the field separator. By default, ``sqream sql`` outputs valid CSVs. Change the delimiter to modify the output to another delimited format (e.g. TSV, PSV). See the section :ref:`supported record delimiters` below for more information + * - ``--chunksize`` + - 128 * 1024 (128 KB) + - Network chunk size + * - ``--log`` or ``--log-file`` + - False + - When set, a log file is created + * - ``--show-results`` + - True + - Determines whether or not results are shown + * - ``--ssl`` + - False + - Determines whether the connection uses SSL + * - ``--table-view`` + - ``true`` + - Displays query results in a table view format with column headers. The display limit is set to 10,000 rows + + +.. tip:: Run ``$ sqream sql --help`` to see a full list of arguments + +.. _supported_record_delimiters: + +Supported Record Delimiters +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The supported record delimiters are printable ASCII values (32-126).
+ +* Recommended delimiters for use are: ``,``, ``|``, tab character. + +* The following characters are **not supported**: ``\``, ``N``, ``-``, ``:``, ``"``, ``\n``, ``\r``, ``.``, lower-case latin letters, digits (0-9) + +Meta-Commands +------------- + +* Meta-commands in Sqream SQL start with a backslash (``\``) + +.. note:: Meta commands do not end with a semicolon + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Command + - Example + - Description + * - ``\q`` or ``\quit`` + - .. code-block:: psql + + master=> \q + - Quit the client. (Same as :kbd:`Ctrl-d`) + * - ``\c `` or ``\connect `` + - .. code-block:: psql + + master=> \c fox + fox=> + - Changes the current connection to an alternate database + +Basic Commands +-------------- + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Command + - Description + * - :kbd:`Ctrl-l` + - Clear the screen. + * - :kbd:`Ctrl-c` + - Terminate the current command. + * - :kbd:`Ctrl-z` + - Suspend/stop the command. + * - :kbd:`Ctrl-d` + - Quit SQream SQL + + + +Moving Around the Command Line +------------------------------ + +.. list-table:: + :widths: 17 83 + :header-rows: 1 + + * - Command + - Description + * - :kbd:`Ctrl-a` + - Goes to the beginning of the command line. + * - :kbd:`Ctrl-e` + - Goes to the end of the command line. + * - :kbd:`Ctrl-u` + - Deletes from cursor to the beginning of the command line. + * - :kbd:`Ctrl-k` + - Deletes from the cursor to the end of the command line. + * - :kbd:`Ctrl-w` + - Delete from cursor to beginning of a word. + * - :kbd:`Ctrl-y` + - Pastes a word or text that was cut using one of the deletion shortcuts (such as the one above) after the cursor. + * - :kbd:`Alt-b` + - Moves back one word (or goes to the beginning of the word where the cursor is). + * - :kbd:`Alt-f` + - Moves forward one word (or goes to the end of word the cursor is). + * - :kbd:`Alt-d` + - Deletes to the end of a word starting at the cursor. Deletes the whole word if the cursor is at the beginning of that word. + * - :kbd:`Alt-c` + - Capitalizes letters in a word starting at the cursor. Capitalizes the whole word if the cursor is at the beginning of that word. + * - :kbd:`Alt-u` + - Capitalizes from the cursor to the end of the word. + * - :kbd:`Alt-l` + - Makes lowercase from the cursor to the end of the word. + * - :kbd:`Ctrl-f` + - Moves forward one character. + * - :kbd:`Ctrl-b` + - Moves backward one character. + * - :kbd:`Ctrl-h` + - Deletes characters located before the cursor. + * - :kbd:`Ctrl-t` + - Swaps a character at the cursor with the previous character. + +Searching +--------- + +.. list-table:: + :widths: 17 83 + :header-rows: 1 + + * - Command + - Description + * - :kbd:`Ctrl-r` + - Searches the history backward. + * - :kbd:`Ctrl-g` + - Escapes from history-searching mode. + * - :kbd:`Ctrl-p` + - Searches the previous command in history. + * - :kbd:`Ctrl-n` + - Searches the next command in history. \ No newline at end of file diff --git a/reference/cli/upgrade_storage.rst b/reference/cli/upgrade_storage.rst index c95ef25f3..bf9e8554f 100644 --- a/reference/cli/upgrade_storage.rst +++ b/reference/cli/upgrade_storage.rst @@ -14,25 +14,40 @@ Running upgrade_storage ``upgrade_storage`` can be found in the ``bin`` directory of your SQream DB installation. -Command line arguments -========================== - -``upgrade_storage`` contains one positional argument: - -.. code-block:: console - - $ upgrade_storage +Command line arguments and options +---------------------------------- .. 
list-table:: :widths: auto :header-rows: 1 - * - Argument - - Required + * - Parameter + - Parameter Type - Description - * - Storage path - - ✓ - - Full path to a valid storage cluster + * - ``storage_path`` + - Argument + - Full path to a valid storage cluster. + * - ``--storage_version`` + - Option + - Displays your current storage version. + * - ``--check_predicates=0`` + - Option + - Allows the upgrade process to proceed even if there are predicates marked for deletion. + + +Syntax +------ + +.. code-block:: console + + $ upgrade_storage <storage_path> [--check_predicates=0] + + +.. code-block:: console + + $ upgrade_storage <storage_path> [--storage_version] + + Results and error codes ======================== @@ -51,7 +66,7 @@ Results and error codes - ``no need to upgrade`` - Storage doesn't need an upgrade * - Failure: can't read storage - - ``levelDB is in use by another application`` + - ``RocksDB is in use by another application`` - Check permissions, and ensure no SQream DB workers or :ref:`metadata_server ` are running when performing this operation. @@ -64,7 +79,7 @@ Upgrade SQream DB's storage cluster .. code-block:: console $ ./upgrade_storage /home/rhendricks/raviga_database - get_leveldb_version path{/home/rhendricks/raviga_database} + get_rocksdb_version path{/home/rhendricks/raviga_database} current storage version 23 upgrade_v24 upgrade_storage to 24 @@ -75,7 +90,7 @@ Upgrade SQream DB's storage cluster upgrade_v26 upgrade_storage to 26 upgrade_storage to 26 - Done - validate_leveldb + validate_rocksdb storage has been upgraded successfully to version 26 This message confirms that the cluster has already been upgraded correctly. diff --git a/reference/configuration.rst b/reference/configuration.rst deleted file mode 100644 index bf487496e..000000000 --- a/reference/configuration.rst +++ /dev/null @@ -1,5 +0,0 @@ -.. _configuration_reference: - -************************* -Configuration -************************* diff --git a/reference/index.rst b/reference/index.rst index f1a4a600e..2dffa1025 100644 --- a/reference/index.rst +++ b/reference/index.rst @@ -1,13 +1,13 @@ .. _reference: -************************* -Reference Guides -************************* +********** +References +********** The **Reference Guides** section provides reference for using SQream DB's interfaces and SQL features. .. toctree:: - :maxdepth: 5 + :maxdepth: 1 :caption: In this section: :glob: @@ -15,4 +15,3 @@ The **Reference Guides** section provides reference for using SQream DB's interf catalog_reference cli/index sql_feature_support - xxconfiguration diff --git a/reference/sql/sql_functions/aggregate_functions/avg.rst b/reference/sql/sql_functions/aggregate_functions/avg.rst index be0294d46..9176b33b3 100644 --- a/reference/sql/sql_functions/aggregate_functions/avg.rst +++ b/reference/sql/sql_functions/aggregate_functions/avg.rst @@ -36,13 +36,13 @@ Arguments Returns ============ -Return type is dependant on the argument. +The return type depends on the argument. * For ``TINYINT``, ``SMALLINT`` and ``INT``, the return type is ``INT``. * For ``BIGINT``, the return type is ``BIGINT``.
-* For ``REAL``, the return type is ``REAL`` +* For ``REAL``, the return type is ``DOUBLE`` * For ``DOUBLE``, the return type is ``DOUBLE`` @@ -62,14 +62,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/corr.rst b/reference/sql/sql_functions/aggregate_functions/corr.rst index 212ab89d2..5963c835f 100644 --- a/reference/sql/sql_functions/aggregate_functions/corr.rst +++ b/reference/sql/sql_functions/aggregate_functions/corr.rst @@ -51,14 +51,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/count.rst b/reference/sql/sql_functions/aggregate_functions/count.rst index 803e529c0..15e4de46a 100644 --- a/reference/sql/sql_functions/aggregate_functions/count.rst +++ b/reference/sql/sql_functions/aggregate_functions/count.rst @@ -1,13 +1,14 @@ .. _count: -************************** +***** COUNT -************************** +***** The ``COUNT`` function returns the count of numeric values, or only the distinct values. Syntax -========== +====== + The following is the correct syntax for using the ``COUNT`` function as an **aggregate**: .. code-block:: postgres @@ -25,7 +26,8 @@ The following is the correct syntax for using the ``COUNT`` function as a **window function**: ) Arguments -============ +========= + The following table describes the ``COUNT`` arguments: .. list-table:: @@ -42,12 +44,14 @@ The following table describes the ``COUNT`` arguments: - Specifies that the operation should operate only on unique values Returns -============ +======= + * The ``COUNT`` function returns ``BIGINT``. Notes -======= +===== + The following notes apply to the ``COUNT`` function: * When all rows contain ``NULL`` values, the function returns ``NULL``. @@ -60,21 +64,22 @@ The following notes apply to the ``COUNT`` function: Examples -=========== +======== + The examples in this section are based on a table named ``nba``, structured as follows: .. code-block:: postgres CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); @@ -92,7 +97,8 @@ This section includes the following examples: :depth: 1 Counting Rows in a Table ---------------------------- +------------------------ + This example shows how to count rows in a table: ..
code-block:: psql @@ -103,7 +109,8 @@ This example shows how to count rows in a table: 457 Counting Distinct Values in a Table ----------------------------------- +----------------------------------- + This example shows how to count distinct values in a table: The following structures generate the same result: @@ -125,6 +132,7 @@ The following structures generate the same result: Combining COUNT with Other Aggregates ------------------------------------- + This example shows how to combine the ``COUNT`` function with other aggregates: .. code-block:: psql diff --git a/reference/sql/sql_functions/aggregate_functions/covar_pop.rst b/reference/sql/sql_functions/aggregate_functions/covar_pop.rst index c0d7cd35d..24300b5d7 100644 --- a/reference/sql/sql_functions/aggregate_functions/covar_pop.rst +++ b/reference/sql/sql_functions/aggregate_functions/covar_pop.rst @@ -55,14 +55,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/covar_samp.rst b/reference/sql/sql_functions/aggregate_functions/covar_samp.rst index 29d7b7493..2c8451023 100644 --- a/reference/sql/sql_functions/aggregate_functions/covar_samp.rst +++ b/reference/sql/sql_functions/aggregate_functions/covar_samp.rst @@ -56,14 +56,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/index.rst b/reference/sql/sql_functions/aggregate_functions/index.rst index 9bf0527d6..ffd41b538 100644 --- a/reference/sql/sql_functions/aggregate_functions/index.rst +++ b/reference/sql/sql_functions/aggregate_functions/index.rst @@ -1,36 +1,34 @@ .. _aggregate_functions: -******************** +******************* Aggregate Functions -******************** +******************* Overview -=========== +======== Aggregate functions perform calculations based on a set of values and return a single value. Most aggregate functions ignore null values. Aggregate functions are often used with the ``GROUP BY`` clause of the :ref:`select` statement. Available Aggregate Functions -=============== -The following list shows the available aggregate functions: +============================= +The following list shows the available aggregate functions: -.. toctree:: - :maxdepth: 1 - :glob: +.. 
hlist:: + :columns: 2 - - avg - corr - count - covar_pop - covar_samp - max - min - mode - percentile_cont - percentile_disc - stddev_pop - stddev_samp - sum - var_pop - var_samp + * :ref:`AVG` + * :ref:`CORR` + * :ref:`COUNT` + * :ref:`COVAR_POP` + * :ref:`COVAR_SAMP` + * :ref:`MAX` + * :ref:`MIN` + * :ref:`MODE` + * :ref:`PERCENTILE_CONT` + * :ref:`PERCENTILE_DISC` + * :ref:`STDDEV_POP` + * :ref:`STDDEV_SAMP` + * :ref:`SUM` + * :ref:`VAR_POP` + * :ref:`VAR_SAMP` \ No newline at end of file diff --git a/reference/sql/sql_functions/aggregate_functions/max.rst b/reference/sql/sql_functions/aggregate_functions/max.rst index 529e2230d..994a3aaca 100644 --- a/reference/sql/sql_functions/aggregate_functions/max.rst +++ b/reference/sql/sql_functions/aggregate_functions/max.rst @@ -53,14 +53,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/min.rst b/reference/sql/sql_functions/aggregate_functions/min.rst index d488d87fc..dd4d39177 100644 --- a/reference/sql/sql_functions/aggregate_functions/min.rst +++ b/reference/sql/sql_functions/aggregate_functions/min.rst @@ -53,14 +53,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/mode.rst b/reference/sql/sql_functions/aggregate_functions/mode.rst index a4675b659..b57186a41 100644 --- a/reference/sql/sql_functions/aggregate_functions/mode.rst +++ b/reference/sql/sql_functions/aggregate_functions/mode.rst @@ -1,8 +1,9 @@ .. _mode: -************************** +**** MODE -************************** +**** + The **MODE** function returns the most common value in the selected column. If there are no repeating values, or if there is the same frequency of multiple values, this function returns the top value based on the ``ORDER BY`` clause. The **MODE** function is commonly used with the following functions: @@ -11,7 +12,8 @@ The **MODE** function is commonly used with the following functions: * `PERCENTILE_DISC `_ function Syntax -======== +====== + The following is the correct syntax for the ``MODE`` function: .. code-block:: postgres @@ -19,38 +21,50 @@ The following is the correct syntax for the ``MODE`` function: MODE() WITHIN GROUP (ORDER BY column) Example -======== +======= + The example in this section is based on the ``players`` table below: .. 
list-table:: - :widths: 33 33 33 + :widths: auto :header-rows: 1 - -+-----------------+----------+-----------+ -| **Player_Name** | **Team** | **Score** | -+-----------------+----------+-----------+ -| T_Tock | Blue | 13 | -+-----------------+----------+-----------+ -| N_Stein | Blue | 20 | -+-----------------+----------+-----------+ -| F_Dirk | Blue | 20 | -+-----------------+----------+-----------+ -| Y_Hyung | Blue | 10 | -+-----------------+----------+-----------+ -| A_Rodrick | Blue | 13 | -+-----------------+----------+-----------+ -| R_Evans | Red | 55 | -+-----------------+----------+-----------+ -| C_Johnston | Red | 20 | -+-----------------+----------+-----------+ -| K_Stoll | Red | 25 | -+-----------------+----------+-----------+ -| J_Loftus | Red | 22 | -+-----------------+----------+-----------+ -| L_Ellis | Red | 7 | -+-----------------+----------+-----------+ -| G_Elroy | Red | 23 | -+-----------------+----------+-----------+ + + * - Player_Name + - Team + - Score + * - T_Tock + - Blue + - 13 + * - N_Stein + - Blue + - 20 + * - F_Dirk + - Blue + - 20 + * - Y_Hyung + - Blue + - 10 + * - A_Rodrick + - Blue + - 13 + * - R_Evans + - Red + - 55 + * - C_Johnston + - Red + - 20 + * - K_Stoll + - Red + - 25 + * - J_Loftus + - Red + - 22 + * - L_Ellis + - Red + - 7 + * - G_Elroy + - Red + - 23 The following is an example of the ``MODE`` function: diff --git a/reference/sql/sql_functions/aggregate_functions/stddev_pop.rst b/reference/sql/sql_functions/aggregate_functions/stddev_pop.rst index 5a8a7e677..8687c0e76 100644 --- a/reference/sql/sql_functions/aggregate_functions/stddev_pop.rst +++ b/reference/sql/sql_functions/aggregate_functions/stddev_pop.rst @@ -58,14 +58,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/stddev_samp.rst b/reference/sql/sql_functions/aggregate_functions/stddev_samp.rst index 0328e2241..81c7a1f51 100644 --- a/reference/sql/sql_functions/aggregate_functions/stddev_samp.rst +++ b/reference/sql/sql_functions/aggregate_functions/stddev_samp.rst @@ -62,14 +62,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/aggregate_functions/sum.rst b/reference/sql/sql_functions/aggregate_functions/sum.rst index e8f648894..f33f0521e 100644 --- a/reference/sql/sql_functions/aggregate_functions/sum.rst +++ b/reference/sql/sql_functions/aggregate_functions/sum.rst @@ -1,14 +1,13 @@ .. _sum: -************************** +*** SUM -************************** +*** Returns the sum of numeric values, or only the distinct values. Syntax -========== - +====== .. code-block:: postgres @@ -23,7 +22,7 @@ Syntax ) Arguments -============ +========= .. 
list-table:: :widths: auto @@ -37,27 +36,27 @@ Arguments - Specifies that the operation should operate only on unique values Returns -============ +======= -Return type is dependant on the argument. +Return type is dependent on the argument. -* For ``TINYINT``, ``SMALLINT`` and ``INT``, the return type is ``INT``. +* For ``TINYINT``, ``SMALLINT`` and ``INT``, the return type is ``INT`` -* For ``BIGINT``, the return type is ``BIGINT``. +* For ``BIGINT``, the return type is ``BIGINT`` -* For ``REAL``, the return type is ``REAL`` +* For ``REAL``, the return type is ``DOUBLE`` * For ``DOUBLE``, the return type is ``DOUBLE`` Notes -======= +===== * ``NULL`` values are ignored * Because ``SUM`` returns the same data type, it can very quickly overflow on large data sets. If the SUM is over 2\ :sup:`31` for example, up-cast to a larger type like ``BIGINT``: ``SUM(expr :: BIGINT)``
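As a sketch of that up-cast (reusing the ``nba`` table described in the examples below), casting the column before aggregating keeps the running total in ``BIGINT`` range:

.. code-block:: postgres

   SELECT SUM("Age" :: BIGINT) AS total_age FROM nba;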
Examples -=========== +======== For these examples, assume a table named ``nba``, with the following structure: @@ -65,14 +64,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); @@ -85,7 +84,7 @@ Here's a peek at the table contents (:download:`Download nba.csv CREATE TABLE bit(b1 int, b2 int, b3 int); - executed - - master=> INSERT INTO bit VALUES (1,2,3), (2, 4, 6), (4, 2, 6), (2, 8, 16), (null, null, 64), (5, 3, 1), (6, 1, 0); - executed + + master=> CREATE OR REPLACE TABLE t (xtinyInt tinyInt, xsmallInt smallInt, xint int, xbigInt bigInt); + + master=> INSERT INTO t VALUES (1,1,1,1); + + master=> SELECT ~xtinyInt, ~xsmallInt, ~xint, ~xbigInt from t; - SELECT b1, b2, b3, ~b1, ~b2, ~b3 FROM bit; - b1 | b2 | b3 | ?column? | ?column?0 | ?column?1 - ---+----+----+----------+-----------+---------- - 1 | 2 | 3 | -2 | -3 | -4 - 2 | 4 | 6 | -3 | -5 | -7 - 4 | 2 | 6 | -5 | -3 | -7 - 2 | 8 | 16 | -3 | -9 | -17 - | | 64 | | | -65 - 5 | 3 | 1 | -6 | -4 | -2 - 6 | 1 | 0 | -7 | -2 | -1 + ?column? | ?column?0 | ?column?1 | ?column?2 | + ---------+-----------+-----------+-----------+ + 254 | -2 | -2 | -2 | diff --git a/reference/sql/sql_functions/scalar_functions/conditionals/is_ascii.rst b/reference/sql/sql_functions/scalar_functions/conditionals/is_ascii.rst index bb9e3b2f9..3f2b5fb86 100644 --- a/reference/sql/sql_functions/scalar_functions/conditionals/is_ascii.rst +++ b/reference/sql/sql_functions/scalar_functions/conditionals/is_ascii.rst @@ -45,20 +45,19 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE dictionary (id INT NOT NULL, fw TEXT(30), en VARCHAR(30)); - - INSERT INTO dictionary VALUES (1, '行こう', 'Let''s go'), (2, '乾杯', 'Cheers'), (3, 'L''chaim', 'Cheers'); + CREATE TABLE dictionary (id INT NOT NULL, text TEXT); + INSERT INTO dictionary VALUES (1, '行こう'), (2, '乾杯'), (3, 'L''chaim'); + SELECT id, text, IS_ASCII(text) FROM dictionary; IS NULL ----------- .. code-block:: psql - m=> SELECT id, en, fw, IS_ASCII(fw) FROM dictionary; - id | en | fw | is_ascii - ---+----------+----------+--------- - 1 | Let's go | 行こう | false - 2 | Cheers | 乾杯 | false - 3 | Cheers | L'chaim | true + id | text | is_ascii + ---+----------+---------- + 1 | 行こう | false + 2 | 乾杯 | false + 3 | L'chaim | true diff --git a/reference/sql/sql_functions/scalar_functions/conditionals/is_null.rst b/reference/sql/sql_functions/scalar_functions/conditionals/is_null.rst index c99f4e7d1..94f8605f7 100644 --- a/reference/sql/sql_functions/scalar_functions/conditionals/is_null.rst +++ b/reference/sql/sql_functions/scalar_functions/conditionals/is_null.rst @@ -40,7 +40,7 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE t (id INT NOT NULL, name VARCHAR(30), weight INT); + CREATE TABLE t (id INT NOT NULL, name TEXT(30), weight INT); INSERT INTO t VALUES (1, 'Kangaroo', 120), (2, 'Koala', 20), (3, 'Wombat', 60) ,(4, 'Kappa', NULL),(5, 'Echidna', 8),(6, 'Chupacabra', NULL) diff --git a/reference/sql/sql_functions/scalar_functions/conditionals/is_table_exists.rst b/reference/sql/sql_functions/scalar_functions/conditionals/is_table_exists.rst new file mode 100644 index 000000000..9531f8e56 --- /dev/null +++ b/reference/sql/sql_functions/scalar_functions/conditionals/is_table_exists.rst @@ -0,0 +1,50 @@ +.. _is_table_exists: + +************************** +IS TABLE EXISTS +************************** + +The ``IS TABLE EXISTS`` function checks whether a table exists in a specified schema within the database. + +Syntax +========== + +.. code-block:: postgres + + SELECT is_table_exists(<'schema_name'>, <'table_name'>) + + +Arguments +============ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``schema_name`` + - The schema to search for the table within + * - ``table_name`` + - The name of the table to check for existence + +Returns +======= + +* ``1`` if table exists +* ``0`` if table does not exist + + +Example +======== + +.. code-block:: psql + + SELECT is_table_exists('public', 'my_table'); + + ----- + 1 + + + + diff --git a/reference/sql/sql_functions/scalar_functions/conditionals/is_view_exists.rst b/reference/sql/sql_functions/scalar_functions/conditionals/is_view_exists.rst new file mode 100644 index 000000000..d98408683 --- /dev/null +++ b/reference/sql/sql_functions/scalar_functions/conditionals/is_view_exists.rst @@ -0,0 +1,50 @@ +.. _is_view_exists: + +************************** +IS VIEW EXISTS +************************** + +The ``IS VIEW EXISTS`` function checks whether a view exists in a specified schema within the database. + +Syntax +========== + +.. code-block:: postgres + + SELECT is_view_exists(<'schema_name'>, <'view_name'>) + + +Arguments +============ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``schema_name`` + - The schema to search for the view within + * - ``view_name`` + - The name of the view to check for existence + +Returns +======= + +* ``1`` if view exists +* ``0`` if view does not exist + + +Example +======== + +.. code-block:: psql + + SELECT is_view_exists('public', 'my_view'); + + ----- + 1
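+
+The return value can also drive conditional logic; a hypothetical sketch (``public`` and ``my_view`` are placeholder names):
+
+.. code-block:: psql
+
+   SELECT CASE WHEN is_view_exists('public', 'my_view') = 1 THEN 'view exists' ELSE 'view missing' END;
+
diff --git a/reference/sql/sql_functions/scalar_functions/conversion/chr.rst b/reference/sql/sql_functions/scalar_functions/conversion/chr.rst new file mode 100644 index 000000000..b98ea5c45 --- /dev/null +++ b/reference/sql/sql_functions/scalar_functions/conversion/chr.rst @@ -0,0 +1,63 @@ +..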
_chr: + +*** +CHR +*** + +The ``CHR`` function takes an integer parameter representing the ASCII code and returns the corresponding character. + +Syntax +====== + +.. code-block:: postgres + + CHR(int) + +Argument +======== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Argument + - Description + * - ``int`` + - Integer argument that represents the ASCII code of the character you want to retrieve + + +Returns +======= + +Returns the ASCII character representation of the supplied integer. + + +Example +======= + +Create the following table: + +.. code-block:: postgres + + CREATE OR REPLACE TABLE t(x INT NOT NULL); + + INSERT INTO t (x) + VALUES (72), (101), (108), (108), (111); + +Execute the ``CHR`` function: + +.. code-block:: postgres + + SELECT CHR(x) FROM t; + +Output: + +.. code-block:: postgres + + CHR | + ------+ + H | + e | + l | + l | + o | \ No newline at end of file diff --git a/reference/sql/sql_functions/scalar_functions/conversion/is_castable.rst b/reference/sql/sql_functions/scalar_functions/conversion/is_castable.rst new file mode 100644 index 000000000..8e54c4c0a --- /dev/null +++ b/reference/sql/sql_functions/scalar_functions/conversion/is_castable.rst @@ -0,0 +1,130 @@ +.. _is_castable: + +************ +IS CASTABLE +************ + +The ``IsCastable`` function checks whether a data type cast operation is supported for any given row. If the cast is not supported, the ``CASE`` statement handles the exception by providing an alternative. + +.. tip:: + + See :ref:`supported casts table ` + +Syntax +====== + +.. code-block:: sql + + -- Checking if a cast is supported for literal value: + + SELECT + IsCastable( + BOOL + | TINYINT + | SMALLINT + | INT + | BIGINT + | REAL + | DOUBLE + | FLOAT + | TEXT + | NUMERIC + | DATE + | DATETIME + | ARRAY + , BOOL + | TINYINT + | SMALLINT + | INT + | BIGINT + | REAL + | DOUBLE + | FLOAT + | TEXT + | NUMERIC + | DATE + | DATETIME + | ARRAY + ) + + -- Checking if cast is supported for columns: + + SELECT + IsCastable( + <column_name>, + BOOL + | TINYINT + | SMALLINT + | INT + | BIGINT + | REAL + | DOUBLE + | FLOAT + | TEXT + | NUMERIC + | DATE + | DATETIME + | ARRAY + ) + FROM + <table_name>; + + -- Returns query result set + + SELECT + <column_name>, + CASE + WHEN IsCastable( + <column_name>, + BOOL + | TINYINT + | SMALLINT + | INT + | BIGINT + | REAL + | DOUBLE + | TEXT + | NUMERIC + | DATE + | DATETIME + ) + THEN + <column_name> :: + BOOL + | TINYINT + | SMALLINT + | INT + | BIGINT + | REAL + | DOUBLE + | TEXT + | NUMERIC + | DATE + | DATETIME + ELSE <alternative> + END + FROM + <table_name>; + +Return +======= + +``IsCastable`` returns: + +* 1 when the cast is supported +* 0 if the cast is not supported +* Your query result set if used within a ``CASE`` statement + +Example +======= + +.. code-block:: sql + + SELECT number, + CASE + WHEN IsCastable(number, DOUBLE) THEN number :: DOUBLE + ELSE NULL + END + FROM + my_numbers; + diff --git a/reference/sql/sql_functions/scalar_functions/conversion/to_hex.rst b/reference/sql/sql_functions/scalar_functions/conversion/to_hex.rst index f3cf6fb82..68c695f09 100644 --- a/reference/sql/sql_functions/scalar_functions/conversion/to_hex.rst +++ b/reference/sql/sql_functions/scalar_functions/conversion/to_hex.rst @@ -7,14 +7,14 @@ TO_HEX Converts an integer to a hexadecimal representation. Syntax -========== +====== .. code-block:: postgres - TO_HEX( expr ) --> VARCHAR + TO_HEX( expr ) --> TEXT Arguments -============ +=========
list-table:: :widths: auto @@ -23,36 +23,67 @@ Arguments * - Parameter - Description * - ``expr`` - - An integer expression + - This function accepts ``INT`` and ``BIGINT`` expressions Returns -============ - -* Representation of the hexadecimal number of type ``VARCHAR``. +======= +If the input number is of type ``INT``, the return string will be 10 characters long (8 characters for the digits and 2 characters for the "0x" prefix). If the input number is of type ``BIGINT``, the return string will be 18 characters long (16 characters for the digits and 2 characters for the "0x" prefix). Examples -=========== +======== -For these examples, consider the following table and contents: +``BIGINT`` data type +-------------------- .. code-block:: postgres CREATE TABLE cool_numbers(number BIGINT NOT NULL); + +.. code-block:: postgres + + INSERT INTO cool_numbers VALUES (-42), (3735928559), (666), (3135097598), (3221229823); + +.. code-block:: postgres + + SELECT TO_HEX(number) FROM cool_numbers; + +Output: - INSERT INTO cool_numbers VALUES (42), (3735928559), (666), (3135097598), (3221229823); +.. code-block:: none + to_hex + ------------------ + 0xffffffffffffffd6 + 0x00000000deadbeef + 0x000000000000029a + 0x00000000baddcafe + 0x00000000c00010ff + + +``INT`` data type +----------------- + +.. code-block:: postgres + + CREATE TABLE cool_numbers(number INT NOT NULL); + +.. code-block:: postgres + + INSERT INTO cool_numbers VALUES (-42), (373592855), (666), (313509759), (322122982); + +.. code-block:: postgres -Convert numbers to hexadecimal -------------------------------------- + SELECT TO_HEX(number) FROM cool_numbers; + +Output: -.. code-block:: psql +.. code-block:: none - master=> SELECT TO_HEX(number) FROM cool_numbers; - to_hex - ------------------ - 0x000000000000002a - 0x00000000deadbeef - 0x000000000000029a - 0x00000000baddcafe - 0x00000000c00010ff + to_hex + ---------- + 0xffffffd6 + 0x16449317 + 0x0000029a + 0x12afc77f + 0x133334e6 \ No newline at end of file diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/current_timestamp2.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/current_timestamp2.rst new file mode 100644 index 000000000..c47947ca6 --- /dev/null +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/current_timestamp2.rst @@ -0,0 +1,63 @@ +.. _current_timestamp2: + +************************** +CURRENT_TIMESTAMP2 +************************** + +Returns the current date and time of the system. + +.. note:: This function has a special ANSI SQL form and can be called without parentheses. + +Syntax +========== + +.. code-block:: postgres + + CURRENT_TIMESTAMP2() --> DATETIME2 + + CURRENT_TIMESTAMP2 --> DATETIME2 + +Arguments +============ + +None + +Returns +============ + +The current system date and time, with type ``DATETIME2``. + +Notes +======== + +* This function has a special ANSI SQL form and can be called without parentheses. + +* Aliases to this function include :ref:`SYSDATE2` and :ref:`GETDATE2`. + +* To get the date only, see :ref:`CURRENT_DATE`. + +Examples +=========== + +Get the current system date and time +---------------------------------------- + +.. 
code-block:: psql + + master=> SELECT CURRENT_TIMESTAMP2, CURRENT_TIMESTAMP2(), SYSDATE2, GETDATE2(); + getdate0 | getdate1 | getdate2 | getdate3 + -------------------------------------+--------------------------------------+--------------------------------------+------------------------------------- + 2019-12-07 23:04:26.300032671 +02:00 | 2019-12-07 23:04:26.300032671 +02:00 | 2019-12-07 23:04:26.300032671 +02:00 | 2019-12-07 23:04:26.300032671 +02:00 + + +Find events that happen before this month +-------------------------------------------- + +We will use :ref:`TRUNC` to get the date at the beginning of this month, and then filter. + +.. code-block:: psql + + master=> SELECT COUNT(*) FROM cool_dates WHERE dt <= TRUNC(CURRENT_TIMESTAMP2, month); + count + ----- + 5 diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/dateadd.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/dateadd.rst index cff5268e6..25c8ccaab 100644 --- a/reference/sql/sql_functions/scalar_functions/date_and_time/dateadd.rst +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/dateadd.rst @@ -4,7 +4,7 @@ DATEADD ************************** -Adds or subtracts an interval to ``DATE`` or ``DATETIME`` value. +Adds or subtracts an interval to a ``DATE``, ``DATETIME``, or ``DATETIME2`` value. .. note:: SQream DB does not support the ``INTERVAL`` ANSI syntax. Use ``DATEADD`` to add or subtract date intervals. @@ -26,6 +26,8 @@ Syntax | MINUTE | MI | N | SECOND | SS | S | MILLISECOND | MS + | MICROSECOND | MU + | NANOSECOND | NS Arguments ============ @@ -41,7 +43,7 @@ Arguments * - ``number`` - An integer expression * - ``date_expr`` - - A ``DATE`` or ``DATETIME`` expression + - A ``DATE``, ``DATETIME``, or ``DATETIME2`` expression Valid date parts @@ -81,6 +83,12 @@ Valid date parts * - ``MILLISECOND`` - ``MS`` - Milliseconds (0-999) + * - ``MICROSECOND`` + - ``MU`` + - Microseconds (0-999) + * - ``NANOSECOND`` + - ``NS`` + - Nanoseconds (0-999) .. note:: * The first day of the week is Sunday, when used with ``weekday``. @@ -90,6 +98,8 @@ Returns * If ``HOUR``, ``MINUTE``, ``SECOND``, or ``MILLISECOND`` are added to a ``DATE``, the return type will be ``DATETIME``. +* If ``MICROSECOND`` or ``NANOSECOND`` are added to a ``DATE`` or ``DATETIME``, the return type will be ``DATETIME2``. + * For all other date parts, the return type is the same as the argument supplied. Notes @@ -106,7 +116,7 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE cool_dates(name VARCHAR(40), d DATE, dt DATETIME); + CREATE TABLE cool_dates(name TEXT(40), d DATE, dt DATETIME); INSERT INTO cool_dates VALUES ('Marty McFly goes back to this time','1955-11-05','1955-11-05 01:21:00.000') , ('Marty McFly came from this time', '1985-10-26', '1985-10-26 01:22:00.000') diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/datediff.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/datediff.rst index 5c91a88d9..573312e89 100644 --- a/reference/sql/sql_functions/scalar_functions/date_and_time/datediff.rst +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/datediff.rst @@ -4,7 +4,7 @@ DATEDIFF ************************** -Calculates the difference between to ``DATE`` or ``DATETIME`` expressions, in terms of a specific date part. +Calculates the difference between two ``DATE``, ``DATETIME``, or ``DATETIME2`` expressions, in terms of a specific date part. ..
note:: Results are given in integers, rather than ``INTERVAL``, which SQream DB does not support. @@ -26,6 +26,8 @@ Syntax | MINUTE | MI | N | SECOND | SS | S | MILLISECOND | MS + | MICROSECOND | MU + | NANOSECOND | NS Arguments ============ @@ -39,7 +41,7 @@ Arguments * - ``interval`` - An interval representing a date part. See the table below or the syntax reference above for valid date parts * - ``date_expr1``, ``date_expr2`` - - A ``DATE`` or ``DATETIME`` expression. The function calculates ``date_expr2 - date_expr1``. + - A ``DATE``, ``DATETIME`` or ``DATETIME2`` expression. The function calculates ``date_expr2 - date_expr1``. Valid date parts @@ -79,6 +81,12 @@ Valid date parts * - ``MILLISECOND`` - ``MS`` - Milliseconds (0-999) + * - ``MICROSECOND`` + - ``MU`` + - Microseconds (0-999) + * - ``NANOSECOND`` + - ``NS`` + - Nanoseconds (0-999) Returns @@ -100,11 +108,10 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE cool_dates(name VARCHAR(40), d DATE, dt DATETIME); + CREATE TABLE cool_dates(name TEXT(40), d DATE, dt DATETIME); INSERT INTO cool_dates VALUES ('Marty McFly goes back to this time','1955-11-05','1955-11-05 01:21:00.000') , ('Marty McFly came from this time', '1985-10-26', '1985-10-26 01:22:00.000') - , ('Vesuvius erupts', '79-08-24', '79-08-24 13:00:00.000') , ('1997 begins', '1997-01-01', '1997-01-01') , ('1997 ends', '1997-12-31','1997-12-31 23:59:59.999'); @@ -120,11 +127,10 @@ In years master=> SELECT d AS original_date, DATEDIFF(YEAR, CURRENT_DATE, d) AS "was ... years ago" FROM cool_dates; original_date | was ... years ago --------------+------------------ - 1955-11-05 | -64 - 1985-10-26 | -34 - 0079-08-24 | -1940 - 1997-01-01 | -22 - 1997-12-31 | -22 + 1955-11-05 | -70 + 1985-10-26 | -40 + 1997-01-01 | -28 + 1997-12-31 | -28 In days ^^^^^^^^^^^^^ @@ -136,7 +142,6 @@ In days --------------+----------------- 1955-11-05 | -23408 1985-10-26 | -12460 - 0079-08-24 | -708675 1997-01-01 | -8375 1997-12-31 | -8011 @@ -155,6 +160,5 @@ In hours --------------------+---------------------+------------------ 2019-12-07 22:35:50 | 1955-11-05 01:21:00 | -561813 2019-12-07 22:35:50 | 1985-10-26 01:22:00 | -299061 - 2019-12-07 22:35:50 | 0079-08-24 13:00:00 | -17008209 2019-12-07 22:35:50 | 1997-01-01 00:00:00 | -201022 2019-12-07 22:35:50 | 1997-12-31 23:59:59 | -192263 diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/datepart.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/datepart.rst index 8a43a1472..8c3ccb0f3 100644 --- a/reference/sql/sql_functions/scalar_functions/date_and_time/datepart.rst +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/datepart.rst @@ -4,7 +4,7 @@ DATEPART ************************** -Extracts a date or time part from a ``DATE`` or ``DATETIME`` value. +Extracts a date or time part from a ``DATE``, ``DATETIME`` or ``DATETIME2`` value. .. note:: SQream DB also supports the ANSI :ref:`EXTRACT` syntax. @@ -27,6 +27,8 @@ Syntax | MINUTE | MI | N | SECOND | SS | S | MILLISECOND | MS + | MICROSECOND | MU + | NANOSECOND | NS Arguments ============ @@ -40,7 +42,7 @@ Arguments * - ``interval`` - An interval representing a date part. 
See the table below or the syntax reference above for valid date parts * - ``date_expr`` - - A ``DATE`` or ``DATETIME`` expression + - A ``DATE``, ``DATETIME`` or ``DATETIME2`` expression Valid date parts @@ -86,6 +88,12 @@ Valid date parts * - ``MILLISECOND`` - ``MS`` - Milliseconds (0-999) + * - ``MICROSECOND`` + - ``MU`` + - Microseconds (0-999) + * - ``NANOSECOND`` + - ``NS`` + - Nanoseconds (0-999) .. note:: * The first day of the week is Sunday, when used with ``WEEKDAY``. @@ -95,12 +103,6 @@ Returns * An integer representing the date part value -Notes -======== - -* All date parts work on a ``DATETIME``. - -* The ``HOUR``, ``MINUTE``, ``SECOND``, and ``MILLISECOND`` date parts work only on ``DATETIME``. Using them on ``DATE`` will result in an error. Examples =========== For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE cool_dates(name VARCHAR(40), d DATE, dt DATETIME); + CREATE TABLE cool_dates(name TEXT(40), d DATE, dt DATETIME); INSERT INTO cool_dates VALUES ('Marty McFly goes back to this time','1955-11-05','1955-11-05 01:21:00.000') , ('Marty McFly came from this time', '1985-10-26', '1985-10-26 01:22:00.000') diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/eomonth.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/eomonth.rst index 92e3f7940..04861df7e 100644 --- a/reference/sql/sql_functions/scalar_functions/date_and_time/eomonth.rst +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/eomonth.rst @@ -4,9 +4,9 @@ EOMONTH ************************** -Returns a ``DATE`` or ``DATETIME`` value, reset to midnight on the last day of the month. +Returns a ``DATE``, ``DATETIME``, or ``DATETIME2`` value, reset to midnight on the last day of the month. -.. note:: This function is provided for SQL Server compatability. +.. note:: This function is provided for SQL Server compatibility. Syntax ========== @@ -26,7 +26,7 @@ Arguments * - Parameter - Description * - ``date_expr`` - - A ``DATE`` or ``DATETIME`` expression + - A ``DATE``, ``DATETIME`` or ``DATETIME2`` expression Returns @@ -48,7 +48,7 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE cool_dates(name VARCHAR(40), d DATE, dt DATETIME); + CREATE TABLE cool_dates(name TEXT(40), d DATE, dt DATETIME); INSERT INTO cool_dates VALUES ('Marty McFly goes back to this time','1955-11-05','1955-11-05 01:21:00.000') , ('Marty McFly came from this time', '1985-10-26', '1985-10-26 01:22:00.000') diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/extract.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/extract.rst index 2fd79ca86..c3eb31412 100644 --- a/reference/sql/sql_functions/scalar_functions/date_and_time/extract.rst +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/extract.rst @@ -4,7 +4,7 @@ EXTRACT ************************** -Extracts a date or time part from a ``DATE`` or ``DATETIME`` value. +Extracts a date or time part from a ``DATE``, ``DATETIME``, or ``DATETIME2`` value. .. note:: SQream DB also supports the SQL Server :ref:`DATEPART` syntax, which contains more date parts for use. @@ -25,6 +25,8 @@ Syntax | MINUTE | SECOND | MILLISECONDS + | MICROSECOND + | NANOSECOND Arguments ============ @@ -38,7 +40,7 @@ Arguments * - ``interval`` - An interval representing a date part. See the table below or the syntax reference above for valid date parts * - ``date_expr`` - - A ``DATE`` or ``DATETIME`` expression + - A ``DATE``, ``DATETIME`` or ``DATETIME2`` expression Valid date parts @@ -68,6 +70,12 @@ Valid date parts - Seconds (0.0-59.0) * - ``MILLISECONDS`` - Milliseconds (0.0-999.0) + * - ``MICROSECOND`` + - Microseconds (0-999) + * - ``NANOSECOND`` + - Nanoseconds (0-999) Returns ============ @@ -77,7 +85,9 @@ Returns Notes ======== -* The ``HOUR``, ``MINUTE``, ``SECOND``, and ``MILLISECOND`` date parts work only on ``DATETIME``. Using them on ``DATE`` will result in an error. +* The ``HOUR``, ``MINUTE``, ``SECOND``, and ``MILLISECOND`` date parts work on ``DATETIME`` or ``DATETIME2``. Using them on ``DATE`` will result in an error. +* The ``MICROSECOND`` and ``NANOSECOND`` date parts work only on ``DATETIME2``. Using them on ``DATE`` or ``DATETIME`` will result in an error. +
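A brief illustration of the new date parts; this is a sketch only, assuming a hypothetical table ``events`` with a ``DATETIME2`` column named ``ts2``:

.. code-block:: postgres

   SELECT EXTRACT(MICROSECOND FROM ts2), EXTRACT(NANOSECOND FROM ts2) FROM events;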
Examples =========== For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE cool_dates(name VARCHAR(40), d DATE, dt DATETIME); + CREATE TABLE cool_dates(name TEXT(40), d DATE, dt DATETIME); INSERT INTO cool_dates VALUES ('Marty McFly goes back to this time','1955-11-05','1955-11-05 01:21:00.000') , ('Marty McFly came from this time', '1985-10-26', '1985-10-26 01:22:00.000') diff --git a/reference/sql/sql_functions/scalar_functions/date_and_time/trunc.rst b/reference/sql/sql_functions/scalar_functions/date_and_time/trunc.rst index d9d791cc3..628dc9ea0 100644 --- a/reference/sql/sql_functions/scalar_functions/date_and_time/trunc.rst +++ b/reference/sql/sql_functions/scalar_functions/date_and_time/trunc.rst @@ -1,14 +1,15 @@ .. _date_trunc: ************************** -TRUNC +Date and Time TRUNC ************************** -Truncates a ``DATE`` or ``DATETIME`` value to a specified resolution. +Truncates a ``DATE``, ``DATETIME``, or ``DATETIME2`` value to a specified resolution. For example, truncating a ``DATE`` down to the nearest month returns the date of the first day of the month. -.. note:: This function is overloaded. The function :ref:`TRUNC` can also round numbers towards zero. +.. note:: * This function is overloaded. The function :ref:`TRUNC` can also round numbers towards zero. + * Specifying the ``NANOSECOND`` interval with the ``TRUNC`` function is redundant, as there is nothing smaller than nanoseconds. Syntax ========== @@ -27,6 +28,8 @@ Syntax | MINUTE | MI | N | SECOND | SS | S | MILLISECOND | MS + | MICROSECOND | MU + | NANOSECOND | NS Arguments ============ @@ -38,7 +41,7 @@ Arguments * - Parameter - Description * - ``date_expr`` - - A ``DATE`` or ``DATETIME`` expression + - A ``DATE``, ``DATETIME``, or ``DATETIME2`` expression * - ``interval`` - An interval representing a date part. See the table below or the syntax reference above for valid date parts. If not specified, sets the value to midnight and returns a ``DATETIME``. @@ -80,6 +83,12 @@ Valid date parts * - ``MILLISECOND`` - ``MS`` - Milliseconds (0-999) + * - ``MICROSECOND`` + - ``MU`` + - Microseconds (0-999) + * - ``NANOSECOND`` + - ``NS`` + - Nanoseconds (0-999) Returns ============ @@ -89,11 +98,13 @@ If no date part is specified, the return type is ``DATETIME``. Otherwise, the return type is the same as the argument supplied. Notes ======== -* All date parts work on a ``DATETIME``. -* The ``HOUR``, ``MINUTE``, ``SECOND``, and ``MILLISECOND`` date parts work only on ``DATETIME``. Using them on ``DATE`` will result in an error.
+* The ``HOUR``, ``MINUTE``, ``SECOND``, and ``MILLISECOND`` date parts work on ``DATETIME`` or ``DATETIME2``. Using them on ``DATE`` will result in an error. -* If no date part is specified, the ``DATE`` or ``DATETIME`` value will be set to midnight on the date value. See examples below +* The ``MICROSECOND`` and ``NANOSECOND`` date parts work only on ``DATETIME2``. Using them on ``DATE`` or ``DATETIME`` will result in an error. + +* If no date part is specified, the ``DATE``, ``DATETIME`` or ``DATETIME2`` value will be set to midnight on the date value. See examples below * See also :ref:`EOMONTH` to find the last day of the month. @@ -104,7 +115,7 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE cool_dates(name VARCHAR(40), d DATE, dt DATETIME); + CREATE TABLE cool_dates(name TEXT(40), d DATE, dt DATETIME); INSERT INTO cool_dates VALUES ('Marty McFly goes back to this time','1955-11-05','1955-11-05 01:21:00.000') , ('Marty McFly came from this time', '1985-10-26', '1985-10-26 01:22:00.000') diff --git a/reference/sql/sql_functions/scalar_functions/index.rst b/reference/sql/sql_functions/scalar_functions/index.rst index 1a7d639b3..faeee6bd1 100644 --- a/reference/sql/sql_functions/scalar_functions/index.rst +++ b/reference/sql/sql_functions/scalar_functions/index.rst @@ -1,20 +1,91 @@ .. _scalar_functions: -**************** -Built-In Scalar functions -**************** +*************************** +Built-In Scalar Functions +*************************** -Built-in scalar functions return one value per call. +The **Built-In Scalar Functions** page describes functions that return one value per call: - -.. toctree:: - :maxdepth: 1 - :caption: Built-in scalar functions - :glob: - - bitwise/* - conditionals/* - conversion/* - date_and_time/* - numeric/* - string/* \ No newline at end of file +.. 
hlist::
+   :columns: 5
+
+   * :ref:`bitwise_and`
+   * :ref:`bitwise_not`
+   * :ref:`bitwise_or`
+   * :ref:`bitwise_shift_left`
+   * :ref:`bitwise_shift_right`
+   * :ref:`bitwise_xor`
+   * :ref:`between`
+   * :ref:`case`
+   * :ref:`coalesce`
+   * :ref:`decode`
+   * :ref:`in`
+   * :ref:`is_ascii`
+   * :ref:`is_null`
+   * :ref:`isnull`
+   * :ref:`from_unixts`
+   * :ref:`to_hex`
+   * :ref:`to_unixts`
+   * :ref:`curdate`
+   * :ref:`current_date`
+   * :ref:`current_timestamp`
+   * :ref:`current_timestamp2`
+   * :ref:`dateadd`
+   * :ref:`datediff`
+   * :ref:`datepart`
+   * :ref:`eomonth`
+   * :ref:`extract`
+   * :ref:`getdate`
+   * :ref:`sysdate`
+   * :ref:`date_trunc`
+   * :ref:`abs`
+   * :ref:`acos`
+   * :ref:`asin`
+   * :ref:`atan`
+   * :ref:`atn2`
+   * :ref:`ceiling`
+   * :ref:`cos`
+   * :ref:`cot`
+   * :ref:`crc64`
+   * :ref:`degrees`
+   * :ref:`exp`
+   * :ref:`floor`
+   * :ref:`log`
+   * :ref:`log10`
+   * :ref:`mod`
+   * :ref:`pi`
+   * :ref:`power`
+   * :ref:`radians`
+   * :ref:`round`
+   * :ref:`sin`
+   * :ref:`sqrt`
+   * :ref:`square`
+   * :ref:`tan`
+   * :ref:`trunc`
+   * :ref:`char_length`
+   * :ref:`charindex`
+   * :ref:`concat`
+   * :ref:`isprefixof`
+   * :ref:`left`
+   * :ref:`len`
+   * :ref:`like`
+   * :ref:`lower`
+   * :ref:`ltrim`
+   * :ref:`octet_length`
+   * :ref:`patindex`
+   * :ref:`regexp_count`
+   * :ref:`regexp_instr`
+   * :ref:`regexp_replace`
+   * :ref:`regexp_substr`
+   * :ref:`repeat`
+   * :ref:`replace`
+   * :ref:`reverse`
+   * :ref:`right`
+   * :ref:`rlike`
+   * :ref:`rtrim`
+   * :ref:`substring`
+   * :ref:`trim`
+   * :ref:`upper`
+   * :ref:`select_ascii`
+   * :ref:`sign`
+   * :ref:`chr`
\ No newline at end of file
diff --git a/reference/sql/sql_functions/scalar_functions/numeric/atan.rst b/reference/sql/sql_functions/scalar_functions/numeric/atan.rst
index d730e8dfc..6a500383a 100644
--- a/reference/sql/sql_functions/scalar_functions/numeric/atan.rst
+++ b/reference/sql/sql_functions/scalar_functions/numeric/atan.rst
@@ -29,7 +29,7 @@ Arguments
 Returns
 ============
 
-Always returns a floating point result of the inverse tangent, in radians.
+When using the ``ATAN`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision.
 
 Notes
 =======
diff --git a/reference/sql/sql_functions/scalar_functions/numeric/atn2.rst b/reference/sql/sql_functions/scalar_functions/numeric/atn2.rst
index f727a9abe..e7a356338 100644
--- a/reference/sql/sql_functions/scalar_functions/numeric/atn2.rst
+++ b/reference/sql/sql_functions/scalar_functions/numeric/atn2.rst
@@ -35,7 +35,7 @@ Arguments
 Returns
 ============
 
-Always returns a floating point result of the inverse tangent, in radians.
+When using the ``ATN2`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision.
 
 Notes
 =======
diff --git a/reference/sql/sql_functions/scalar_functions/numeric/ceiling.rst b/reference/sql/sql_functions/scalar_functions/numeric/ceiling.rst
index 2f4e2d988..8c48420fd 100644
--- a/reference/sql/sql_functions/scalar_functions/numeric/ceiling.rst
+++ b/reference/sql/sql_functions/scalar_functions/numeric/ceiling.rst
@@ -15,7 +15,7 @@ Syntax
 
    CEILING( expr )
 
-   CEIL ( expr ) --> DOUBLE
+   CEIL ( expr )
 
 Arguments
 ============
@@ -32,9 +32,8 @@ Arguments
 Returns
 ============
 
-* ``CEIL`` Always returns a floating point result.
+``Real`` arguments are automatically cast to ``double`` precision.
 
-* ``CEILING`` returns the same type as the argument supplied. 
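+
+For illustration, ``CEILING`` always rounds towards positive infinity, so positive and negative inputs round in opposite directions (output formatting may vary by client):
+
+.. code-block:: sql
+
+   -- CEILING(2.1) --> 3, CEILING(-2.1) --> -2
+   SELECT CEILING(2.1), CEILING(-2.1);
+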
Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/cos.rst b/reference/sql/sql_functions/scalar_functions/numeric/cos.rst index 5861b7bc8..dbd322488 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/cos.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/cos.rst @@ -4,7 +4,7 @@ COS ************************** -Returns the cosine value of a numeric expression +Returns the cosine value of a numeric expression. Syntax ========== @@ -29,7 +29,7 @@ Arguments Returns ============ -Always returns a floating point result of the cosine. +When using the ``COS`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/crc64.rst b/reference/sql/sql_functions/scalar_functions/numeric/crc64.rst index 7dbd6ddf1..de3693aa2 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/crc64.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/crc64.rst @@ -1,22 +1,20 @@ .. _crc64: -************************** +***** CRC64 -************************** +***** -Calculates the CRC-64 hash of a text expression +The ``CRC64`` function calculates the CRC-64 hash of a text expression. Syntax -========== +====== .. code-block:: postgres CRC64( expr ) --> BIGINT - - CRC64_JOIN( expr ) --> BIGINT Arguments -============ +========= .. list-table:: :widths: auto @@ -25,34 +23,22 @@ Arguments * - Parameter - Description * - ``expr`` - - Text expression (``VARCHAR``, ``TEXT``) + - Text expression (``TEXT``) Returns -============ - -Returns a CRC-64 hash of the text input, of type ``BIGINT``. - -Notes ======= -* If the input value is NULL, the result is NULL. - -* The ``CRC64_JOIN`` can be used with ``VARCHAR`` only. It can not be used with ``TEXT``. +Returns a CRC-64 hash of the text input, of type ``BIGINT``. -* The ``CRC64_JOIN`` variant ignores leading whitespace when used as a ``JOIN`` key. +If the input value is ``NULL``, the result is ``NULL``. Examples -=========== - -Calculate a CRC-64 hash of a string ---------------------------------------- +======== -.. code-block:: psql +.. code-block:: sql - numbers=> SELECT CRC64(x) FROM - . (VALUES ('This is a relatively long text string, that can be converted to a shorter hash' :: varchar(80))) - . as t(x); - crc64 - -------------------- - -9085161068710498500 + SELECT CRC64(x) FROM (VALUES ('This is a relatively long text string, that can be converted to a shorter hash' :: text)) as t(x); + +.. code-block:: none + -8397827068206190216 \ No newline at end of file diff --git a/reference/sql/sql_functions/scalar_functions/numeric/degrees.rst b/reference/sql/sql_functions/scalar_functions/numeric/degrees.rst index d1d806792..7ef7b3c2c 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/degrees.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/degrees.rst @@ -30,7 +30,7 @@ Arguments Returns ============ -Always returns a floating point result of the value in degrees. +When using the ``DEGREES`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. 
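+
+For illustration, converting π radians yields 180 degrees (``PI()`` is used the same way in the :ref:`round` examples):
+
+.. code-block:: sql
+
+   -- DEGREES converts radians to degrees: PI() radians --> 180.0
+   SELECT DEGREES(PI());
+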
Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/exp.rst b/reference/sql/sql_functions/scalar_functions/numeric/exp.rst index 16116614e..abebebbe0 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/exp.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/exp.rst @@ -4,7 +4,7 @@ EXP ************************** -Returns the natural exponent value of a numeric expression (*e*\ :sup:`x`) +Returns the natural exponent value of a numeric expression (*e*\ :sup:`x`). See also :ref:`log`. @@ -30,7 +30,7 @@ Arguments Returns ============ -Always returns a floating point result. +When using the ``EXP`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/floor.rst b/reference/sql/sql_functions/scalar_functions/numeric/floor.rst index 56dcac0a2..e528dcc5d 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/floor.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/floor.rst @@ -32,6 +32,8 @@ Returns Returns the same type as the argument supplied. +When using the ``FLOOR`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. + Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/log.rst b/reference/sql/sql_functions/scalar_functions/numeric/log.rst index 3a1b9156d..4baa0f7a7 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/log.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/log.rst @@ -32,7 +32,7 @@ Arguments Returns ============ -Always returns a floating point result. +When using the ``LOG`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/log10.rst b/reference/sql/sql_functions/scalar_functions/numeric/log10.rst index 99b8311e3..844373aca 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/log10.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/log10.rst @@ -30,7 +30,7 @@ Arguments Returns ============ -Always returns a floating point result. +When using the ``LOG10`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/power.rst b/reference/sql/sql_functions/scalar_functions/numeric/power.rst index 2653f5510..66d379c62 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/power.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/power.rst @@ -66,7 +66,7 @@ On floating point .. code-block:: psql - numbers=> SELECT POWER(3.0,x) FROM (VALUES (1), (2), (3), (4), (5)) AS t(x); + numbers=> SELECT POWER(3.0::double,x) FROM (VALUES (1), (2), (3), (4), (5)) AS t(x); power ----- 3.0 diff --git a/reference/sql/sql_functions/scalar_functions/numeric/radians.rst b/reference/sql/sql_functions/scalar_functions/numeric/radians.rst index 4d6859de8..e79b6b20b 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/radians.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/radians.rst @@ -30,7 +30,7 @@ Arguments Returns ============ -Always returns a floating point result of the value in radians. +When using the ``RADIANS`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. 
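+
+A matching illustrative sketch for the inverse conversion (see also :ref:`degrees`):
+
+.. code-block:: sql
+
+   -- RADIANS converts degrees to radians: 180 degrees --> ~3.141592653589793
+   SELECT RADIANS(180.0);
+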
Notes
=======
diff --git a/reference/sql/sql_functions/scalar_functions/numeric/round.rst b/reference/sql/sql_functions/scalar_functions/numeric/round.rst
index 842224edd..b45549309 100644
--- a/reference/sql/sql_functions/scalar_functions/numeric/round.rst
+++ b/reference/sql/sql_functions/scalar_functions/numeric/round.rst
@@ -1,10 +1,12 @@
 .. _round:
 
-**************************
+**********
 ROUND
-**************************
+**********
 
-Rounds a numeric expression to the nearest precision.
+Rounds a numeric expression to the nearest precision.
+
+Supported data types: ``INT``, ``TINYINT``, ``SMALLINT``, ``BIGINT``, ``REAL``, ``DOUBLE``, ``NUMERIC``.
 
 See also :ref:`ceiling`, :ref:`floor`.
 
@@ -32,12 +34,12 @@ Arguments
 Returns
 ============
 
-Always returns a floating point result.
+``REAL`` arguments are automatically cast to double precision, while all other supported data types retain the supplied data type.
 
 Notes
 =======
 
-* If the input value is NULL, the result is NULL.
+If the input value is ``NULL``, the result is ``NULL``.
 
 Examples
 ===========
@@ -47,7 +49,7 @@ Rounding to the nearest integer
 
 .. code-block:: psql
 
-   numbers=> SELECT ROUND(x) FROM (VALUES (0.0001), (PI()), (-2.718281), (500.1234), (0.5), (1.5)) as t(x);
+   SELECT ROUND(x) FROM (VALUES (0.0001), (PI()), (-2.718281), (500.1234), (0.5), (1.5)) as t(x);
   round
   ------
   0
@@ -62,7 +64,7 @@ Rounding to 2 digits after the decimal point
 
 .. code-block:: psql
 
-   numbers=> SELECT ROUND(x,2) FROM (VALUES (0.0001), (PI()), (-2.718281), (500.1234)) as t(x);
+   SELECT ROUND(x,2) FROM (VALUES (0.0001), (PI()), (-2.718281), (500.1234)) as t(x);
   round
   -------
   0
@@ -75,7 +77,7 @@ Rounding to 2 digits after the decimal point
 
 .. code-block:: psql
 
-   numbers=> SELECT FLOOR(x), CEIL(x), ROUND(x)
+   SELECT FLOOR(x), CEIL(x), ROUND(x)
   .         FROM (VALUES (0.0001), (-0.0001)
   .         , (PI()), (-2.718281), (500.1234)) as t(x);
   floor | ceil | round
diff --git a/reference/sql/sql_functions/scalar_functions/numeric/sign.rst b/reference/sql/sql_functions/scalar_functions/numeric/sign.rst
new file mode 100644
index 000000000..a1f835c4b
--- /dev/null
+++ b/reference/sql/sql_functions/scalar_functions/numeric/sign.rst
@@ -0,0 +1,58 @@
+.. _sign:
+
+****
+SIGN
+****
+
+The ``SIGN`` function takes a single argument, which can be any numeric data type such as INTEGER, FLOAT, or DECIMAL, and returns a value of -1, 0, or 1, depending on the sign of the input argument.
+
+
+
+Syntax
+======
+
+.. code-block:: postgres
+
+   SELECT SIGN(expr);
+
+Arguments
+=========
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Parameter
+     - Description
+   * - ``expr``
+     - A numeric expression
+
+Return
+======
+
+The ``SIGN`` function returns the same data type as inserted, with the exception of ``REAL``, which is converted to ``DOUBLE``.
+
+Depending on the sign of the input argument, the return is:
+
+* -1, if the input expression is negative
+
+* 0, if the input expression is zero
+
+* 1, if the input expression is positive
+
+
+
+Example
+=======
+
+.. code-block:: postgres
+
+   SELECT SIGN(-10), SIGN(0), SIGN(5);
+
+Output:
+
+.. 
code-block:: none + + sign | sign0 | sign1 + -----+------+------- + -1 | 0 | 1 diff --git a/reference/sql/sql_functions/scalar_functions/numeric/sqrt.rst b/reference/sql/sql_functions/scalar_functions/numeric/sqrt.rst index cda10c4e9..41654db67 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/sqrt.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/sqrt.rst @@ -60,7 +60,7 @@ For these examples, consider the following table and contents: 1.4142135623730951 .. note:: - * ``SQRT`` always returns a floating point result. + * Some clients may show fewer digits. See your client settings to change the precision shown. Replacing negative values with ``NULL`` diff --git a/reference/sql/sql_functions/scalar_functions/numeric/square.rst b/reference/sql/sql_functions/scalar_functions/numeric/square.rst index edf7b11ab..d8b7cd585 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/square.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/square.rst @@ -30,7 +30,7 @@ Arguments Returns ============ -Always returns a floating point result +When using the ``SQUARE`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/tan.rst b/reference/sql/sql_functions/scalar_functions/numeric/tan.rst index 7a6915d3b..88e332d5a 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/tan.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/tan.rst @@ -29,7 +29,7 @@ Arguments Returns ============ -Always returns a floating point result of the tangent. +When using the ``TAN`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/numeric/trunc.rst b/reference/sql/sql_functions/scalar_functions/numeric/trunc.rst index 322b1274b..75dc684b7 100644 --- a/reference/sql/sql_functions/scalar_functions/numeric/trunc.rst +++ b/reference/sql/sql_functions/scalar_functions/numeric/trunc.rst @@ -6,7 +6,7 @@ TRUNC Rounds a number to its integer representation towards 0. -.. note:: This function is overloaded. The function :ref:`TRUNC` can also modify the precision of ``DATE`` and ``DATETIME`` values. +.. note:: This function is overloaded. The function :ref:`TRUNC` can also modify the precision of ``DATE``, ``DATETIME`` and ``DATETIME2`` values. See also :ref:`ROUND`. @@ -34,6 +34,8 @@ Returns Returns the same type as the argument supplied. +When using the ``TRUNC`` floating point number scalar function, ``real`` arguments are automatically cast to ``double`` precision. + Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/string/char_length.rst b/reference/sql/sql_functions/scalar_functions/string/char_length.rst index f89c397ab..25e684c82 100644 --- a/reference/sql/sql_functions/scalar_functions/string/char_length.rst +++ b/reference/sql/sql_functions/scalar_functions/string/char_length.rst @@ -1,28 +1,25 @@ .. _char_length: -************************** -CHAR_LENGTH -************************** +****************************** +CHARACTER_LENGTH / CHAR_LENGTH +****************************** Calculates the number of characters in a string. .. note:: - - * This function is supported on ``TEXT`` only. - + * To get the length in bytes, see :ref:`octet_length`. - * For ``VARCHAR`` strings, the octet length is the number of characters. Use :ref:`len` instead. - Syntax -========== +====== .. 
code-block:: postgres
 
-   CHAR_LEN( text_expr ) --> INT
+   CHAR_LENGTH( text_expr ) --> INT
+   CHARACTER_LENGTH( text_expr ) --> INT
 
 Arguments
-============
+=========
 
 .. list-table::
    :widths: auto
@@ -34,19 +31,19 @@ Arguments
      - ``TEXT`` expression
 
 Returns
-============
+=======
 
-Returns an integer containing the number of characters in the string.
+Returns an integer containing the number of characters in the string.
 
 Notes
-=======
+=====
 
 * To get the length in bytes, see :ref:`octet_length`
 
 * If the value is NULL, the result is NULL.
 
 Examples
-===========
+========
 
 For these examples, consider the following table and contents:
 
@@ -59,11 +56,11 @@ For these examples, consider the following table and contents:
    , ('אבגדהוזחטיכלמנסעפצקרשת');
 
 Length in characters and bytes of strings
---------------------------------------------------
+-----------------------------------------
 
 ASCII characters take up 1 byte per character, while Thai takes up 3 bytes and Hebrew takes up 2 bytes.
 
-Unlike :ref:`len`, ``CHAR_LENGTH`` preserves the trailing whitespaces.
+Unlike :ref:`len`, ``CHARACTER_LENGTH`` and ``CHAR_LENGTH`` preserve the trailing whitespace.
 
 .. code-block:: psql
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/charindex.rst b/reference/sql/sql_functions/scalar_functions/string/charindex.rst
index fa9c89027..893332d7d 100644
--- a/reference/sql/sql_functions/scalar_functions/string/charindex.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/charindex.rst
@@ -1,22 +1,23 @@
 .. _charindex:
 
-**************************
+*********
 CHARINDEX
-**************************
+*********
+
+``CHARINDEX`` is a 1-based indexing function (both input and output) that returns the starting position of a specified substring within a given string.
 
-Returns the starting position of a string inside another string.
 
 See also :ref:`patindex`, :ref:`regexp_instr`.
 
 Syntax
-==========
+======
 
-.. code-block:: postgres
+.. code-block:: sql
 
-	CHARINDEX ( needle_string_expr , haystack_string_expr [ , start_location ] )
+	CHARINDEX ( needle_string_expr , haystack_string_expr [ , start_location ] )
 
-Arguments
-============
+Parameters
+==========
 
 .. list-table::
    :widths: auto
@@ -29,48 +30,74 @@ Arguments
    * - ``haystack_string_expr``
      - String to search within
    * - ``start_location``
-     - An integer at which the search starts. This value is optinoal and when not supplied, the search starts at the beggining of ``needle_string_expr``
+     - An integer at which the search starts. This value is optional; when not supplied, the search starts at the beginning of ``haystack_string_expr``
 
 Returns
-============
+=======
 
 Integer start position of a match, or 0 if no match was found.
 
-Notes
-=======
-
-* If the value is NULL, the result is NULL.
+If one of the parameters is NULL, then the return value is NULL.
+Searching for an empty string returns 0.
 
 Examples
-===========
+========
+
+For these examples, consider the following table:
+
+.. code-block:: none
 
-For these examples, consider the following table and contents:
+    id | username    | email                   | password
+   ----+-------------+-------------------------+-----------
+    1  | john_doe    | john.doe@example.com    | password1
+   ----+-------------+-------------------------+-----------
+    2  | jane_doe    | jane.doe@example.com    | password2
+   ----+-------------+-------------------------+-----------
+    3  | bob_smith   | bob.smith@example.com   | password3
+   ----+-------------+-------------------------+-----------
+    4  | susan_jones | susan.jones@example.com | password4
+
+
+.. 
code-block:: sql
+
+	SELECT CHARINDEX('doe', username) FROM users;
+
+Output:
+
+.. code-block:: none
+
+	charindex|
+	---------+
+	6        |
+	6        |
+	0        |
+	0        |
+
+.. code-block:: sql
 
-.. code-block:: postgres
+	SELECT CHARINDEX('doe', username, 10) FROM users;
 
-   CREATE TABLE jabberwocky(line VARCHAR(50));
+Output:
 
-   INSERT INTO jabberwocky VALUES
-   ('''Twas brillig, and the slithy toves '), ('  Did gyre and gimble in the wabe: ')
-   ,('All mimsy were the borogoves, '), ('  And the mome raths outgrabe. ')
-   ,('"Beware the Jabberwock, my son! '), ('  The jaws that bite, the claws that catch! ')
-   ,('Beware the Jubjub bird, and shun '), ('  The frumious Bandersnatch!" ');
+.. code-block:: none
 
+	charindex|
+	---------+
+	0        |
+	0        |
+	0        |
+	0        |
 
-Using ``CHARINDEX``
------------------------------------------
+.. code-block:: sql
 
-.. code-block:: psql
+	SELECT CHARINDEX('jane_doe', username, -10) FROM users;
+
+.. code-block:: none
 
-   t=> SELECT line, CHARINDEX('the', line) FROM jabberwocky
-   line                                             | charindex
-   ------------------------------------------------+----------
-   'Twas brillig, and the slithy toves              |        20
-   Did gyre and gimble in the wabe:                 |        30
-   All mimsy were the borogoves,                    |        16
-   And the mome raths outgrabe.                     |        11
-   "Beware the Jabberwock, my son!                  |         9
-   The jaws that bite, the claws that catch!        |        27
-   Beware the Jubjub bird, and shun                 |         8
-   The frumious Bandersnatch!"                      |         0
+	charindex|
+	---------+
+	0        |
+	1        |
+	0        |
+	0        |
\ No newline at end of file
diff --git a/reference/sql/sql_functions/scalar_functions/string/concat.rst b/reference/sql/sql_functions/scalar_functions/string/concat.rst
index 0409216cd..d984d1cdf 100644
--- a/reference/sql/sql_functions/scalar_functions/string/concat.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/concat.rst
@@ -4,7 +4,8 @@
 ``||`` (Concatenate)
 **************************
 
-Concatenate two strings to create a longer string
+Concatenate two strings or string arrays to create a longer string.
+
 
 Syntax
 ==========
@@ -38,6 +39,8 @@ Notes
 
 * SQream DB removes the trailing spaces from strings by default, which may lead to unexpected results. See the examples for more information.
 
+* The :ref:`concat_function` provides alternative syntax for the ``||`` operator and accepts one or more arguments.
+
 Examples
 ===========
 
@@ -48,14 +51,14 @@ For these examples, assume a table named ``nba``, with the following structure:
 
    CREATE TABLE nba
   (
-     Name varchar(40),
-     Team varchar(40),
+     Name text(40),
+     Team text(40),
      Number tinyint,
-     Position varchar(2),
+     Position text(2),
      Age tinyint,
-     Height varchar(4),
+     Height text(4),
      Weight real,
-     College varchar(40),
+     College text(40),
      Salary float
   );
 
@@ -76,7 +79,7 @@ Convert values to string types before concatenation
 
 .. code-block:: psql
 
-   nba=> SELECT ("Age" :: VARCHAR(2)) || "Name" FROM nba ORDER BY 1 DESC LIMIT 5;
+   nba=> SELECT ("Age" :: TEXT(2)) || "Name" FROM nba ORDER BY 1 DESC LIMIT 5;
   ?column?
   ----------------
   40Tim Duncan
@@ -116,12 +119,11 @@ Add a space and concatenate it first to bypass the space trimming issue
 
 .. code-block:: psql
 
-   nba=> SELECT ("Age" :: VARCHAR(2) || (' ' || "Name")) FROM nba ORDER BY 1 DESC LIMIT 5;
+   nba=> SELECT ("Age" :: TEXT(2) || (' ' || "Name")) FROM nba ORDER BY 1 DESC LIMIT 5;
   ?column? 
-----------------
   40 Tim Duncan
   40 Kevin Garnett
   40 Andre Miller
   39 Vince Carter
-   39 Pablo Prigioni
-
+   39 Pablo Prigioni
\ No newline at end of file
diff --git a/reference/sql/sql_functions/scalar_functions/string/concat_function.rst b/reference/sql/sql_functions/scalar_functions/string/concat_function.rst
new file mode 100644
index 000000000..24e397e7b
--- /dev/null
+++ b/reference/sql/sql_functions/scalar_functions/string/concat_function.rst
@@ -0,0 +1,86 @@
+.. _concat_function:
+
+**************************
+CONCAT function
+**************************
+
+Concatenates one or more strings, string arrays, or binary values.
+
+Syntax
+==========
+
+
+.. code-block:: postgres
+
+   CONCAT( expr1 [ , expr2 ... ] )
+
+
+Arguments
+============
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Parameter
+     - Description
+   * - ``expr1``, ``expr2``
+     - String expressions
+
+Returns
+============
+
+The data type of the returned value is ``TEXT``. If any input value is NULL, the function returns NULL.
+
+Notes
+=======
+
+* The :ref:`concat` operator provides alternative syntax for CONCAT and requires at least two arguments.
+* SQream DB removes the trailing spaces from strings by default, which may lead to unexpected results. See the examples for more information.
+
+Examples
+===========
+
+
+For these examples, assume a table named ``nba``, with the following structure:
+
+.. code-block:: postgres
+
+   CREATE TABLE nba
+   (
+      Name text(40),
+      Team text(40),
+      Number tinyint,
+      Position text(2),
+      Age tinyint,
+      Height text(4),
+      Weight real,
+      College text(40),
+      Salary float
+   );
+
+
+Here's a peek at the table contents (:download:`Download nba.csv `):
+
+.. csv-table:: nba.csv
+   :file: nba-t10.csv
+   :widths: auto
+   :header-rows: 1
+
+
+Concatenate two values
+--------------------------------------
+
+Convert values to string types before concatenation
+
+.. code-block:: psql
+
+
+   nba=> SELECT CONCAT(Age ,Name) FROM nba ORDER BY 1 DESC LIMIT 5;
+   ?column?
+   ----------------
+   40Tim Duncan
+   40Kevin Garnett
+   40Andre Miller
+   39Vince Carter
+   39Pablo Prigioni
diff --git a/reference/sql/sql_functions/scalar_functions/string/decode.rst b/reference/sql/sql_functions/scalar_functions/string/decode.rst
new file mode 100644
index 000000000..6376bc3e3
--- /dev/null
+++ b/reference/sql/sql_functions/scalar_functions/string/decode.rst
@@ -0,0 +1,62 @@
+.. _decode:
+
+******
+DECODE
+******
+
+The ``DECODE`` function takes an expression or column and compares it to a series of search values. It returns the result value that corresponds to the first matching search value or, if no matches are found, the optional default value (``NULL`` when no default is supplied).
+
+Syntax
+======
+
+.. code-block:: postgres
+
+   DECODE( expr , search , result [ , search , result ... ] [ , default ] )
+
+Parameters
+==========
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Parameter
+     - Description
+   * - ``expr``
+     - The expression to be evaluated
+   * - ``search``
+     - A value that ``expr`` is compared against
+   * - ``result``
+     - A value that is returned if ``expr`` matches ``search``
+   * - ``default``
+     - (Optional) A value that is returned if no matches are found
+
+Return
+======
+
+Returns the same type as the argument supplied.
+
+Example
+=======
+
+.. code-block:: sql
+
+   CREATE TABLE test1 (european_size int not null);
+   INSERT INTO test1 values (8),(9),(10),(11);
+	
+   SELECT european_size,DECODE(european_size,8,40,9,41,10,42,99) from test1;
+
+.. 
code-block:: none + + +---------------+---------+ + |european_size |decode | + +---------------+---------+ + |8 |40 | + +---------------+---------+ + |9 |41 | + +---------------+---------+ + |10 |42 | + +---------------+---------+ + |11 |99 | + +---------------+---------+ + + + diff --git a/reference/sql/sql_functions/scalar_functions/string/isprefixof.rst b/reference/sql/sql_functions/scalar_functions/string/isprefixof.rst index 4a978b1ff..3da356969 100644 --- a/reference/sql/sql_functions/scalar_functions/string/isprefixof.rst +++ b/reference/sql/sql_functions/scalar_functions/string/isprefixof.rst @@ -50,7 +50,7 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE jabberwocky(line VARCHAR(50)); + CREATE TABLE jabberwocky(line TEXT(50)); INSERT INTO jabberwocky VALUES ('''Twas brillig, and the slithy toves '), (' Did gyre and gimble in the wabe: ') diff --git a/reference/sql/sql_functions/scalar_functions/string/left.rst b/reference/sql/sql_functions/scalar_functions/string/left.rst index 77c99b0f1..ce76244e9 100644 --- a/reference/sql/sql_functions/scalar_functions/string/left.rst +++ b/reference/sql/sql_functions/scalar_functions/string/left.rst @@ -27,13 +27,13 @@ Arguments * - ``expr`` - String expression * - ``character_count`` - - A positive integer that specifies how many characters to return. - + - The number of characters to be returned. If ``character_count <= 0``, an empty string is returned. Returns ============ Returns the same type as the argument supplied. + Notes ======= diff --git a/reference/sql/sql_functions/scalar_functions/string/len.rst b/reference/sql/sql_functions/scalar_functions/string/len.rst index d3cba24b2..276ad62be 100644 --- a/reference/sql/sql_functions/scalar_functions/string/len.rst +++ b/reference/sql/sql_functions/scalar_functions/string/len.rst @@ -4,13 +4,13 @@ LEN ************************** -Calculates the number of characters in a string. +The LEN function calculates the number of characters in a string. Keep in mind that SQream DB does not count trailing spaces, but does count leading spaces. For UTF-8 encoded ``TEXT`` strings, multi-byte characters are counted as a single character. To get the length in bytes, see :ref:`octet_length`. To get the length in characters, see :ref:`char_length`. + .. note:: - - * This function is provided for SQL Server compatability. - - * For UTF-8 encoded ``TEXT`` strings, multi-byte characters are counted as a single character. To get the length in bytes, see :ref:`octet_length`. To get the length in characters, see :ref:`char_length`. + + This function is provided for SQL Server compatibility. + Syntax ========== @@ -18,7 +18,8 @@ Syntax .. code-block:: postgres - LEN( expr ) --> INT + LEN( expr ) + LENGTH( expr ) Arguments ============ @@ -35,12 +36,12 @@ Arguments Returns ============ -Returns an integer containing the number of characters in the string. +The LEN function returns an integer value representing the length of the input string, which includes any leading spaces but excludes any trailing spaces. Notes ======= -* If the value is NULL, the result is NULL. +* If the value is ``NULL``, the result is ``NULL``. Examples =========== @@ -49,7 +50,7 @@ For these examples, consider the following table and contents: .. 
code-block:: postgres
 
-   CREATE TABLE jabberwocky(line VARCHAR(50));
+   CREATE TABLE jabberwocky(line TEXT(50));
 
    INSERT INTO jabberwocky VALUES
    ($$'Twas brillig, and the slithy toves$$), ('  Did gyre and gimble in the wabe:')
@@ -81,8 +82,6 @@ ASCII characters take up 1 byte per character, while Thai takes up 3 bytes and H
 Length of an ASCII string
 ----------------------------
 
-.. note:: SQream DB does not count trailing spaces, but does keep leading spaces.
-
 .. code-block:: psql
 
    t=> SELECT LEN('Trailing spaces are not counted    ');
diff --git a/reference/sql/sql_functions/scalar_functions/string/like.rst b/reference/sql/sql_functions/scalar_functions/string/like.rst
index 4640ba9be..ce5ca4942 100644
--- a/reference/sql/sql_functions/scalar_functions/string/like.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/like.rst
@@ -83,14 +83,14 @@ For these examples, assume a table named ``nba``, with the following structure:
 
    CREATE TABLE nba
   (
-     Name varchar(40),
-     Team varchar(40),
+     Name text(40),
+     Team text(40),
      Number tinyint,
-     Position varchar(2),
+     Position text(2),
      Age tinyint,
-     Height varchar(4),
+     Height text(4),
      Weight real,
-     College varchar(40),
+     College text(40),
      Salary float
   );
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/lower.rst b/reference/sql/sql_functions/scalar_functions/string/lower.rst
index 69dfd4f1a..4318015dc 100644
--- a/reference/sql/sql_functions/scalar_functions/string/lower.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/lower.rst
@@ -45,7 +45,7 @@ For these examples, consider the following table and contents:
 
 .. code-block:: postgres
 
-   CREATE TABLE jabberwocky(line VARCHAR(50));
+   CREATE TABLE jabberwocky(line TEXT(50));
 
    INSERT INTO jabberwocky VALUES
    ('''Twas brillig, and the slithy toves'), ('  Did gyre and gimble in the wabe:')
diff --git a/reference/sql/sql_functions/scalar_functions/string/octet_length.rst b/reference/sql/sql_functions/scalar_functions/string/octet_length.rst
index 8bb1e3daf..061b008c4 100644
--- a/reference/sql/sql_functions/scalar_functions/string/octet_length.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/octet_length.rst
@@ -1,21 +1,18 @@
 .. _octet_length:
 
-**************************
+************
 OCTET_LENGTH
-**************************
+************
 
 Calculates the number of bytes in a string.
 
-.. note::
+.. note::
+
+   * To get the length in characters, see :ref:`char_length`.
 
-   * This function is supported on ``TEXT`` strings only.
-
-   * To get the length in characters, see :ref:`char_length`.
-
-   * For ``VARCHAR`` strings, the octet length is the number of characters. Use :ref:`len` instead.
-
 Syntax
-==========
+======
+
 The following is the correct syntax for the ``OCTET_LENGTH`` function:
 
 .. code-block:: postgres
@@ -23,7 +20,8 @@ The following is the correct syntax for the ``OCTET_LENGTH`` function:
 
    OCTET_LEN( text_expr ) --> INT
 
 Arguments
-============
+=========
+
 The following table describes the ``OCTET_LENGTH`` arguments:
 
 .. list-table::
@@ -36,11 +34,13 @@ The following table describes the ``OCTET_LENGTH`` arguments:
      - ``TEXT`` expression
 
 Returns
-============
+=======
+
 The ``OCTET_LENGTH`` function returns an integer containing the number of bytes in the string.
 
 Notes
-=======
+=====
+
 The following notes are applicable to the ``OCTET_LENGTH`` function:
 
 * To get the length in characters, see :ref:`char_length`
 
 * If the value is NULL, the result is NULL. 
Length in Characters and Bytes of Strings
-===========
+=========================================
+
 The **Length in characters and bytes of strings** example is based on the following table and contents:
 
 .. code-block:: postgres
diff --git a/reference/sql/sql_functions/scalar_functions/string/patindex.rst b/reference/sql/sql_functions/scalar_functions/string/patindex.rst
index 063fe6d5c..b1ffcda9b 100644
--- a/reference/sql/sql_functions/scalar_functions/string/patindex.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/patindex.rst
@@ -69,7 +69,7 @@ Notes
 
 * If the value is NULL, the result is NULL.
 
-* PATINDEX works on ``VARCHAR`` text types only.
+* PATINDEX works on ``TEXT`` types only.
 
 * PATINDEX does not work on all literal values - only on column values.
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/regexp_count.rst b/reference/sql/sql_functions/scalar_functions/string/regexp_count.rst
index 5f3bd75a0..cce99e044 100644
--- a/reference/sql/sql_functions/scalar_functions/string/regexp_count.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/regexp_count.rst
@@ -31,8 +31,8 @@ Arguments
    * - ``start_index``
      - The character index offset to start counting from. Defaults to 1
 
-Test patterns
-==============
+Supported RegEx Patterns
+========================
 
 .. list-table::
    :widths: auto
@@ -41,9 +41,13 @@ Test patterns
 
    * - Pattern
      - Description
+
    * - ``^``
      - Match the beginning of a string
 
+   * - ``[^]``
+     - Characters that do not match the specified string
+
    * - ``$``
      - Match the end of a string
 
@@ -57,27 +61,38 @@ Test patterns
      - Match the preceding pattern at least once
 
    * - ``?``
-     - Match the preceding pattern once at most
-
-   * - ``de|abc``
-     - Match either ``de`` or ``abc``
+     - Match the preceding pattern once at most (``0`` or ``1`` time)
 
    * - ``(abc)*``
-     - Match zero or more instances of the sequence ``abc``
+     - Match zero or more instances of the sequence ``abc``, treating the expression within the parentheses as a single unit
 
-   * - ``{2}``
-     - Match the preceding pattern exactly two times
+   * - ``{m}``
+     - Match the preceding pattern exactly ``m`` times
 
-   * - ``{2,4}``
-     - Match the preceding pattern between two and four times
+   * - ``{m,n}``
+     - Match the preceding pattern at least ``m`` times but no more than ``n`` times
 
-   * - ``[a-dX]``, ``[^a-dX]``
-     -
-       Matches any character that is (or is not when negated with ``^``) either ``a``, ``b``, ``c``, ``d``, or ``X``.
-       The ``-`` character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit.
-       To include a literal ``]`` character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last.
-       Any character that does not have a defined special meaning inside a [] pair matches only itself. 
+   * - ``[...]``
+     - Match any single character from the list within the brackets
+
+   * - ``|``
+     - ``OR`` clause
+
+   * - ``\``
+     - Treating the subsequent characters in the expression as ordinary characters rather than metacharacters
+
+   * - ``\n``
+     - Matching the nth (``1``-``9``) preceding subexpression grouped within parentheses
+
+   * - ``*?``
+     - Occurs zero or more times
+
+   * - ``+?``
+     - Occurs one or more times
+
+   * - ``??``
+     - Occurs zero or one time
+
 Returns
 ============
@@ -99,14 +114,14 @@ For these examples, assume a table named ``nba``, with the following structure:
 
    CREATE TABLE nba
   (
-     "Name" varchar(40),
-     "Team" varchar(40),
+     "Name" text(40),
+     "Team" text(40),
      "Number" tinyint,
-     "Position" varchar(2),
+     "Position" text(2),
      "Age" tinyint,
-     "Height" varchar(4),
+     "Height" text(4),
      "Weight" real,
-     "College" varchar(40),
+     "College" text(40),
      "Salary" float
   );
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/regexp_instr.rst b/reference/sql/sql_functions/scalar_functions/string/regexp_instr.rst
index f401fdfff..07b3ec9a8 100644
--- a/reference/sql/sql_functions/scalar_functions/string/regexp_instr.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/regexp_instr.rst
@@ -36,8 +36,8 @@ Arguments
      - Specifies the location within the string to return. Using 0, the function returns the string position of the first character of the substring that matches the pattern. A value greater than 0 will return the position of the first character following the end of the pattern. Defaults to 0
 
-Test patterns
-==============
+Supported RegEx Patterns
+========================
 
 .. list-table::
    :widths: auto
@@ -46,9 +46,13 @@ Test patterns
 
   * - Pattern
     - Description
+
   * - ``^``
     - Match the beginning of a string
 
+  * - ``[^]``
+    - Characters that do not match the specified string
+
   * - ``$``
    - Match the end of a string
 
@@ -62,26 +66,37 @@ Test patterns
     - Match the preceding pattern at least once
 
   * - ``?``
-    - Match the preceding pattern once at most
-
-  * - ``de|abc``
-    - Match either ``de`` or ``abc``
+    - Match the preceding pattern once at most (``0`` or ``1`` time)
 
   * - ``(abc)*``
-    - Match zero or more instances of the sequence ``abc``
+    - Match zero or more instances of the sequence ``abc``, treating the expression within the parentheses as a single unit
 
-  * - ``{2}``
-    - Match the preceding pattern exactly two times
+  * - ``{m}``
+    - Match the preceding pattern exactly ``m`` times
 
-  * - ``{2,4}``
-    - Match the preceding pattern between two and four times
+  * - ``{m,n}``
+    - Match the preceding pattern at least ``m`` times but no more than ``n`` times
 
-  * - ``[a-dX]``, ``[^a-dX]``
-    -
-      Matches any character that is (or is not when negated with ``^``) either ``a``, ``b``, ``c``, ``d``, or ``X``.
-      The ``-`` character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit.
-      To include a literal ``]`` character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last.
-      Any character that does not have a defined special meaning inside a [] pair matches only itself. 
+  * - ``[...]``
+    - Match any single character from the list within the brackets
+
+  * - ``|``
+    - ``OR`` clause
+
+  * - ``\``
+    - Treating the subsequent characters in the expression as ordinary characters rather than metacharacters
+
+  * - ``\n``
+    - Matching the nth (``1``-``9``) preceding subexpression grouped within parentheses
+
+  * - ``*?``
+    - Occurs zero or more times
+
+  * - ``+?``
+    - Occurs one or more times
+
+  * - ``??``
+    - Occurs zero or one time
 
 Returns
 ============
@@ -104,14 +119,14 @@ For these examples, assume a table named ``nba``, with the following structure:
 
  CREATE TABLE nba
  (
-    "Name" varchar(40),
-    "Team" varchar(40),
+    "Name" text(40),
+    "Team" text(40),
     "Number" tinyint,
-    "Position" varchar(2),
+    "Position" text(2),
     "Age" tinyint,
-    "Height" varchar(4),
+    "Height" text(4),
     "Weight" real,
-    "College" varchar(40),
+    "College" text(40),
     "Salary" float
  );
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/regexp_replace.rst b/reference/sql/sql_functions/scalar_functions/string/regexp_replace.rst
index ba3424190..49426f254 100644
--- a/reference/sql/sql_functions/scalar_functions/string/regexp_replace.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/regexp_replace.rst
@@ -1,29 +1,27 @@
 .. _regexp_replace:
 
-**************************
+**************
 REGEXP_REPLACE
-**************************
+**************
+
 The ``REGEXP_REPLACE`` function finds and replaces text column substrings using constant regexp-based patterns with constant replacement strings.
 
 For related information, see the following:
 
-* `REGEXP_COUNT `_
-* `REGEXP_INSTR `_
-* `REGEXP_SUBSTR `_
-
-
-
+* :ref:`REGEXP_COUNT`
+* :ref:`REGEXP_INSTR`
+* :ref:`REGEXP_SUBSTR`
 
 Syntax
---------------
-The following is a syntactic example of the ``REGEXP_REPLACE`` function:
+======
 
 .. code-block:: postgres
 
    REGEXP_REPLACE(input, pattern [, replacement [, position [, occurrence]]])
 
 Arguments
---------------
+=========
+
 The following table shows the ``REGEXP_REPLACE`` arguments:
 
 .. list-table::
@@ -43,20 +41,23 @@ The following table shows the ``REGEXP_REPLACE`` arguments:
    * - ``occurrence``
      - (Optional) Sets a specific occurrence to replace. Using ``0`` replaces all occurrences.
 
-Test Patterns
--------------
-The following table shows the ``REGEXP_REPLACE`` test patterns:
+Supported RegEx Patterns
+========================
 
.. 
list-table::
   :widths: auto
   :header-rows: 1
 
   * - Test Pattern
     - Description
+
   * - ``^``
     - Match the beginning of a string
+
+  * - ``[^]``
+    - Characters that do not match the specified string
+
   * - ``$``
    - Match the end of a string
 
@@ -70,38 +71,50 @@ The following table shows the ``REGEXP_REPLACE`` test patterns:
    - Match the preceding pattern at least once
 
   * - ``?``
-    - Match the preceding pattern once at most
+    - Match the preceding pattern once at most (``0`` or ``1`` time)
 
-  * - ``de|abc``
-    - Match either ``de`` or ``abc``
+  * - ``(abc)*``
+    - Match zero or more instances of the sequence ``abc``, treating the expression within the parentheses as a single unit
 
-  * - ``(abc)*``
-    - Match zero or more instances of the sequence ``abc``
+  * - ``{m}``
+    - Match the preceding pattern exactly ``m`` times
 
-  * - ``{2}``
-    - Match the preceding pattern exactly two times
+  * - ``{m,n}``
+    - Match the preceding pattern at least ``m`` times but no more than ``n`` times
 
-  * - ``{2,4}``
-    - Match the preceding pattern between two and four times
+  * - ``[...]``
+    - Match any single character from the list within the brackets
+
+  * - ``|``
+    - ``OR`` clause
 
-  * - ``[a-dX]``, ``[^a-dX]``
-    -
-      Matches any character that is (or is not when negated with ``^``) either ``a``, ``b``, ``c``, ``d``, or ``X``.
-      The ``-`` character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit.
-      To include a literal ``]`` character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last.
-      Any character that does not have a defined special meaning inside a [] pair matches only itself.
+  * - ``\``
+    - Treating the subsequent characters in the expression as ordinary characters rather than metacharacters
+
+  * - ``\n``
+    - Matching the nth (``1``-``9``) preceding subexpression grouped within parentheses
+
+  * - ``*?``
+    - Occurs zero or more times
+
+  * - ``+?``
+    - Occurs one or more times
+
+  * - ``??``
+    - Occurs zero or one time
 
 Returns
-------------
+=======
+
 The ``REGEXP_REPLACE`` function returns the replaced input value.
 
 Notes
---------------
+=====
+
 The test pattern must be a literal string.
 
 Example
---------------
-The following is an example of the ``REGEXP_REPLACE`` function:
+=======
 
 .. code-block::
 
@@ -109,8 +122,11 @@ The following is an example of the ``REGEXP_REPLACE`` function:
 
   CREATE TABLE test (country_name text);
  INSERT INTO test values('SWEDEN');
 
  SELECT REGEXP_REPLACE(country_name, 'WEDE', 'PAI') FROM test;
-  SELECT * FROM test;
 
+Output:
 
-The output generated from the example above is ``SPAIN``.
+.. code-block:: none
 
+   country_name|
+   ------------+
+   SPAIN       |
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/regexp_substr.rst b/reference/sql/sql_functions/scalar_functions/string/regexp_substr.rst
index 1730d6ebf..3af784070 100644
--- a/reference/sql/sql_functions/scalar_functions/string/regexp_substr.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/regexp_substr.rst
@@ -4,7 +4,7 @@
 REGEXP_SUBSTR
 **************************
 
-Returns the occurence of a regex match.
+Returns the occurrence of a regex match.
 
 See also: :ref:`regexp_instr`, :ref:`regexp_count`.
 
@@ -14,7 +14,7 @@ Syntax
 
.. 
code-block:: postgres - REGEXP_SUBSTR( string_expr, string_test_expr [ , start_index [ , occurence ] ] ) --> INT + REGEXP_SUBSTR( string_expr, string_test_expr [ , start_index [ , occurrence ] ] ) --> TEXT Arguments ============ @@ -31,13 +31,13 @@ Arguments - Test pattern * - ``start_index`` - The character index offset to start counting from. Defaults to 1 - * - ``occurence`` - - Which occurence to search for. Defaults to 1 + * - ``occurrence`` + - Which occurrence to search for. Defaults to 1 * - ``return_position`` - - Setes the position within the string to return. Using 0, the function returns the string position of the first character of the substring that matches the pattern. Defaults to 0 + - Sets the position within the string to return. Using 0, the function returns the string position of the first character of the substring that matches the pattern. Defaults to 0 -Test patterns -============== +Supported RegEx Patterns +======================== .. list-table:: :widths: auto @@ -46,14 +46,18 @@ Test patterns * - Pattern - Description + * - ``^`` - Match the beginning of a string + * - ``[^]`` + - Characters that do not match the specified string + * - ``$`` - Match the end of a string * - ``.`` - - Match any character (including whitespace such as carriage return and newline) + - Match any character (including white-space such as carriage return and newline) * - ``*`` - Match the preceding pattern zero or more times @@ -62,26 +66,37 @@ Test patterns - Match the preceding pattern at least once * - ``?`` - - Match the preceding pattern once at most - - * - ``de|abc`` - - Match either ``de`` or ``abc`` + - Match the preceding pattern once at most (``0`` or ``1`` time) * - ``(abc)*`` - - Match zero or more instances of the sequence ``abc`` + - Match zero or more instances of the sequence ``abc``, treating the expression within the parentheses as a single unit - * - ``{2}`` - - Match the preceding pattern exactly two times + * - ``{m}`` + - Match the preceding pattern exactly ``m`` times - * - ``{2,4}`` - - Match the preceding pattern between two and four times + * - ``{m,n}`` + - Match the preceding pattern at least ``m`` times but no more than ``n`` times - * - ``[a-dX]``, ``[^a-dX]`` - - - Matches any character that is (or is not when negated with ``^``) either ``a``, ``b``, ``c``, ``d``, or ``X``. - The ``-`` character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. - To include a literal ``]`` character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last. - Any character that does not have a defined special meaning inside a [] pair matches only itself. 
+  * - ``[...]``
+    - Match any single character from the list within the brackets
+
+  * - ``|``
+    - ``OR`` clause
+
+  * - ``\``
+    - Treating the subsequent characters in the expression as ordinary characters rather than meta-characters
+
+  * - ``\n``
+    - Matching the nth (``1``-``9``) preceding sub-expression grouped within parentheses
+
+  * - ``*?``
+    - Occurs zero or more times
+
+  * - ``+?``
+    - Occurs one or more times
+
+  * - ``??``
+    - Occurs zero or one time
 
 Returns
 ============
@@ -104,14 +119,14 @@ For these examples, assume a table named ``nba``, with the following structure:
 
  CREATE TABLE nba
  (
-    "Name" varchar(40),
-    "Team" varchar(40),
+    "Name" text(40),
+    "Team" text(40),
     "Number" tinyint,
-    "Position" varchar(2),
+    "Position" text(2),
     "Age" tinyint,
-    "Height" varchar(4),
+    "Height" text(4),
     "Weight" real,
-    "College" varchar(40),
+    "College" text(40),
     "Salary" float
  );
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/repeat.rst b/reference/sql/sql_functions/scalar_functions/string/repeat.rst
index 4e202b5b2..df9c2fe6f 100644
--- a/reference/sql/sql_functions/scalar_functions/string/repeat.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/repeat.rst
@@ -1,20 +1,17 @@
 .. _repeat:
 
-**************************
+******
 REPEAT
-**************************
-
-Repeats a string as many times as specified.
-
-.. warning:: This function works ONLY with ``TEXT`` data type.
-
+******
+
+The ``REPEAT`` function repeats an input string expression as many times as specified.
 
 Syntax
 ==========
 
-.. code-block:: postgres
+.. code-block:: sql
 
-   REPEAT(expr, character_count)
+   REPEAT('expr', n)
 
 Arguments
 ============
@@ -25,105 +22,77 @@ Arguments
 
    * - Parameter
      - Description
-   * - ``expr``
-     - String expression
-   * - ``character_count``
+   * - ``'expr'``
+     - A ``TEXT`` expression
+   * - ``n``
+     - An ``INTEGER`` expression specifying the number of repetitions for the string expression
 
-Returns
-============
-
-Returns the same type as the argument supplied.
+Return
+======
 
-Notes
-=======
-
-* When ``character_count`` <= 0, and empty string is returned.
+* Returns a ``TEXT`` string.
+* When ``n`` = 0, an empty string is returned.
+* When ``n`` < 0, an error is thrown.
 
 Examples
-===========
-
-For these examples, consider the following table and contents:
-
-.. code-block:: postgres
-
-   CREATE TABLE customer(customername TEXT));
-
-   INSERT INTO customer VALUES
-   ('Alfreds Futterkiste'),
-   ('Ana Trujillo Emparedados y helados'),
-   ('Antonio Moreno Taquería'),
-   ('Around the Horn');
+========
 
-Repeat the text in customername 2 times:
------------------------------------------
+For these examples, consider the following table:
 
-.. code-block:: psql
+.. code-block:: sql
 
-   t=> SELECT REPEAT(customername, 2) FROM customers;
-
-   repeat
-   --------------------------
-   Alfreds FutterkisteAlfreds Futterkiste
-   Ana Trujillo Emparedados y heladosAna Trujillo Emparedados y helados
-   Antonio Moreno TaqueríaAntonio Moreno Taquería
-   Around the HornAround the Horn
+   CREATE TABLE customer(customername TEXT);
+   INSERT INTO customer VALUES
+   ('Alfreds Futterkiste'),
+   ('Ana Trujillo Emparedados y helados'),
+   ('Antonio Moreno Taquería'),
+   ('Around the Horn');
 
-Repeat the string 0 times:
-----------------------------
+Repeating Content of a Table Column
+-----------------------------------
 
-.. code-block:: psql
+
+.. 
code-block:: sql - t=> SELECT REPEAT('abc', 0); + SELECT REPEAT(customername, 2) FROM customer; - repeat - ----------------------------------------------- - '' - -Repeat the string 1 times: ----------------------------- - -.. code-block:: psql + repeat + -------------------------------------------------------------------- + Alfreds FutterkisteAlfreds Futterkiste + Ana Trujillo Emparedados y heladosAna Trujillo Emparedados y helados + Antonio Moreno TaqueríaAntonio Moreno Taquería + Around the HornAround the Horn - t=> SELECT REPEAT('abc', 1); - - repeat - ----------------------------------------------- - 'abc' - +Repeating a String +------------------ -Repeat the string 3 times: ----------------------------- +.. code-block:: sql -.. code-block:: psql + SELECT REPEAT('abc', 3); + + repeat + --------- + abcabcabc - t=> SELECT REPEAT('a', 3); - - repeat - ----------------------------------------------- - 'aaa' +Repeating a String 0 Times +-------------------------- -Repeat an empty string 10 times: ----------------------------- +.. code-block:: sql -.. code-block:: psql + SELECT REPEAT('abc', 0); + + repeat + ------ - t=> SELECT REPEAT('', 10); - - repeat - ----------------------------------------------- - '' - - -Repeat a string -3 times: ----------------------------- +Repeating an Empty String +------------------------- -.. code-block:: psql +.. code-block:: sql - t=> SELECT REPEAT('abc', -3); - - repeat - ----------------------------------------------- - '' \ No newline at end of file + SELECT REPEAT('', 3); + + repeat + ------ + diff --git a/reference/sql/sql_functions/scalar_functions/string/replace.rst b/reference/sql/sql_functions/scalar_functions/string/replace.rst index 5552be269..d2fab561a 100644 --- a/reference/sql/sql_functions/scalar_functions/string/replace.rst +++ b/reference/sql/sql_functions/scalar_functions/string/replace.rst @@ -6,9 +6,6 @@ REPLACE Replaces all occurrences of a specified string value with another string value. -.. warning:: With ``VARCHAR``, a substring can only be replaced with another substring of equal **byte length**. See :ref:`octet_length`. - - Syntax ========== @@ -37,12 +34,7 @@ Returns Returns the same type as the argument supplied. -Notes -======= - -* In ``VARCHAR`` strings, the ``source_expr`` and ``replacement_expr`` must be the same **byte length**. See :ref:`octet_length`. - -* If the value is NULL, the result is NULL. +.. note:: If the value is NULL, the result is NULL. Examples =========== diff --git a/reference/sql/sql_functions/scalar_functions/string/right.rst b/reference/sql/sql_functions/scalar_functions/string/right.rst index 158de7da0..9c3d89838 100644 --- a/reference/sql/sql_functions/scalar_functions/string/right.rst +++ b/reference/sql/sql_functions/scalar_functions/string/right.rst @@ -27,7 +27,7 @@ Arguments * - ``expr`` - String expression * - ``character_count`` - - A positive integer that specifies how many characters to return. + - The number of characters to be returned. If ``character_count <= 0``, an empty string is returned. 
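+
+As a brief illustrative sketch of the edge case described above (using a literal string input):
+
+.. code-block:: sql
+
+   -- RIGHT('SQream', 3) --> 'eam'; a non-positive count returns ''
+   SELECT RIGHT('SQream', 3), RIGHT('SQream', 0);
+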
Returns
============
diff --git a/reference/sql/sql_functions/scalar_functions/string/rlike.rst b/reference/sql/sql_functions/scalar_functions/string/rlike.rst
index 324a6e525..3c61cb959 100644
--- a/reference/sql/sql_functions/scalar_functions/string/rlike.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/rlike.rst
@@ -99,14 +99,14 @@ For these examples, assume a table named ``nba``, with the following structure:
 
    CREATE TABLE nba
   (
-     Name varchar(40),
-     Team varchar(40),
+     Name text(40),
+     Team text(40),
      Number tinyint,
-     Position varchar(2),
+     Position text(2),
      Age tinyint,
-     Height varchar(4),
+     Height text(4),
      Weight real,
-     College varchar(40),
+     College text(40),
      Salary float
   );
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/rtrim.rst b/reference/sql/sql_functions/scalar_functions/string/rtrim.rst
index 2bd5bbc38..61fdc877d 100644
--- a/reference/sql/sql_functions/scalar_functions/string/rtrim.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/rtrim.rst
@@ -35,7 +35,7 @@ Returns the same type as the argument supplied.
 Notes
 =======
 
-* When using ``VARCHAR`` values, SQream DB automatically trims the trailing whitespace. Using ``RTRIM`` on ``VARCHAR`` does not affect the result.
+* When using ``TEXT`` values, SQream DB automatically trims the trailing whitespace. Using ``RTRIM`` on ``TEXT`` does not affect the result.
 
 * This function is equivalent to the ANSI form ``TRIM( TRAILING FROM expr )``
 
diff --git a/reference/sql/sql_functions/scalar_functions/string/select_ascii.rst b/reference/sql/sql_functions/scalar_functions/string/select_ascii.rst
new file mode 100644
index 000000000..2792e4b67
--- /dev/null
+++ b/reference/sql/sql_functions/scalar_functions/string/select_ascii.rst
@@ -0,0 +1,25 @@
+.. _select_ascii:
+
+*****
+ASCII
+*****
+
+The **ASCII** function returns the ASCII code of the leftmost character of a string expression. It is commonly used in combination with other SQL functions for operations such as data transformation, validation, and sorting based on ASCII values.
+
+Syntax
+======
+
+The following shows the syntax for the ``ASCII`` function:
+
+.. code-block:: postgres
+
+   ASCII( expr )
+
+Return
+======
+
+The function returns an ``INT`` value representing the ASCII code of the leftmost character in a string.
+
+Example
+===========
+
+.. code-block:: postgres
+
+   SELECT ASCII('hello');
diff --git a/reference/sql/sql_functions/scalar_functions/string/substring.rst b/reference/sql/sql_functions/scalar_functions/string/substring.rst
index b07d951fb..578a0e892 100644
--- a/reference/sql/sql_functions/scalar_functions/string/substring.rst
+++ b/reference/sql/sql_functions/scalar_functions/string/substring.rst
@@ -1,24 +1,24 @@
 .. _substring:
 
-**************************
+*********
 SUBSTRING
-**************************
+*********
 
-Returns a substring of the input starting at ``start_pos``.
+The ``SUBSTRING`` function extracts a portion of a string based on a specified starting position and length.
 
 .. note:: Some systems call this function ``SUBSTR``.
 
 See also :ref:`regexp_substr`.
 
 Syntax
-==========
+======
 
 .. code-block:: postgres
 
    SUBSTRING( expr, start_pos, length )
 
 Arguments
-============
+=========
 
 .. list-table::
    :widths: auto
@@ -27,26 +27,27 @@ Arguments
    * - Parameter
      - Description
    * - ``expr``
-     - String expression
+     - Original string expression from which you want to extract the substring
    * - ``start_pos``
-     - Starting position (starts at 1)
+     - Accepts an integer or bigint expression that specifies the position within the string where the extraction should begin. 
diff --git a/reference/sql/sql_functions/scalar_functions/string/substring.rst b/reference/sql/sql_functions/scalar_functions/string/substring.rst index b07d951fb..578a0e892 100644 --- a/reference/sql/sql_functions/scalar_functions/string/substring.rst +++ b/reference/sql/sql_functions/scalar_functions/string/substring.rst @@ -1,24 +1,24 @@ .. _substring: -************************** +********* SUBSTRING -************************** +********* -Returns a substring of the input starting at ``start_pos``. +The ``SUBSTRING`` function is used to extract a portion of a string based on a specified starting position and length. .. note:: Some systems call this function ``SUBSTR``. See also :ref:`regexp_substr`. Syntax -========== +====== .. code-block:: postgres SUBSTRING( expr, start_pos, length ) Arguments -============ +========= .. list-table:: :widths: auto :header-rows: 1 * - Parameter - Description * - ``expr`` - - String expression + - Original string expression from which you want to extract the substring * - ``start_pos`` - - Starting position (starts at 1) + - Accepts an integer or bigint expression that specifies the position within the string where the extraction should begin. If start exceeds the number of characters in the expression, an empty string is returned. If start is less than 1, extraction starts from the first character * - ``length`` - - Number of characters to extract + - Accepts an integer or bigint expression that specifies the number of characters to be returned from the expression. If the sum of start and length exceeds the total number of characters in the expression, the entire value starting from the position specified by start is returned. If length is negative or zero, the function returns an empty string Returns -============ +======= + +* Returns the same type as the argument supplied -Returns the same type as the argument supplied. +* If any of the arguments is NULL, the result is NULL Notes -======= +===== * Character count starts at 1. -* If the value is NULL, the result is NULL. Examples -=========== +======== For these examples, assume a table named ``nba``, with the following structure: @@ -54,14 +55,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - Name varchar(40), - Team varchar(40), + Name text(40), + Team text(40), Number tinyint, - Position varchar(2), + Position text(2), Age tinyint, - Height varchar(4), + Height text(4), Weight real, - College varchar(40), + College text(40), Salary float );
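+ +Applying the boundary rules above, the following is a hedged, illustrative sketch (the literal ``'SQreamDB'`` is ours, not from the original page): when ``start_pos + length`` passes the end of the string the remainder is returned, and a non-positive ``length`` yields an empty string: + +.. code-block:: postgres + + SELECT SUBSTRING('SQreamDB', 7, 100); -- returns 'DB' + SELECT SUBSTRING('SQreamDB', 1, 0); -- returns ''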
diff --git a/reference/sql/sql_functions/scalar_functions/string/trim.rst b/reference/sql/sql_functions/scalar_functions/string/trim.rst index d6e90c2f8..d249c8952 100644 --- a/reference/sql/sql_functions/scalar_functions/string/trim.rst +++ b/reference/sql/sql_functions/scalar_functions/string/trim.rst @@ -35,7 +35,7 @@ Returns the same type as the argument supplied. Notes ======= -* When using ``VARCHAR`` values, SQream DB automatically trims the trailing whitespace. +* When using ``TEXT`` values, SQream DB automatically trims the trailing whitespace. * This function is equivalent to the ANSI form ``TRIM( BOTH FROM expr )`` diff --git a/reference/sql/sql_functions/scalar_functions/string/upper.rst b/reference/sql/sql_functions/scalar_functions/string/upper.rst index 219bc854e..1f9cc1b96 100644 --- a/reference/sql/sql_functions/scalar_functions/string/upper.rst +++ b/reference/sql/sql_functions/scalar_functions/string/upper.rst @@ -45,7 +45,7 @@ For these examples, consider the following table and contents: .. code-block:: postgres - CREATE TABLE jabberwocky(line VARCHAR(50)); + CREATE TABLE jabberwocky(line TEXT(50)); INSERT INTO jabberwocky VALUES ('''Twas brillig, and the slithy toves'), (' Did gyre and gimble in the wabe:') diff --git a/reference/sql/sql_functions/system_functions/index.rst b/reference/sql/sql_functions/system_functions/index.rst deleted file mode 100644 index 734a9133a..000000000 --- a/reference/sql/sql_functions/system_functions/index.rst +++ /dev/null @@ -1,16 +0,0 @@ -.. _system_functions_functions: - -******************** -System Functions -******************** - -System functions are used for working with database objects, settings, and values. - - -.. toctree:: - :maxdepth: 1 - :glob: - :hidden: - - show_version - diff --git a/reference/sql/sql_functions/user_defined_functions/index.rst b/reference/sql/sql_functions/user_defined_functions/index.rst index 225c9614e..af3e5c281 100644 --- a/reference/sql/sql_functions/user_defined_functions/index.rst +++ b/reference/sql/sql_functions/user_defined_functions/index.rst @@ -1,25 +1,13 @@ .. _user_defined_functions_index: -******************** +********************** User-Defined Functions -******************** +********************** -The following user-defined functions are functions that can be defined and configured by users: +The following user-defined functions can be defined and configured by users. +The **User-Defined Functions** page describes the following: - -* `Python user-defined functions `_. -* `Scalar SQL user-defined functions `_. - - - - -.. toctree:: - :maxdepth: 8 - :glob: - :hidden: - - - - python_functions - scalar_sql_udf +* :ref:`Python user-defined functions` +* :ref:`Scalar SQL user-defined functions` +* :ref:`Simple Scalar SQL UDF's` \ No newline at end of file diff --git a/reference/sql/sql_functions/user_defined_functions/scalar_sql_udf.rst b/reference/sql/sql_functions/user_defined_functions/scalar_sql_udf.rst index de399c3ff..84f48f971 100644 --- a/reference/sql/sql_functions/user_defined_functions/scalar_sql_udf.rst +++ b/reference/sql/sql_functions/user_defined_functions/scalar_sql_udf.rst @@ -1,13 +1,12 @@ .. _scalar_sql_udf: Scalar SQL UDF ------------------------ -A **scalar SQL UDF** is a user-defined function that returns a single value, such as the sum of a group of values. Scalar UDFs are different than table-value functions, which return a result set in the form of a table. +============== -This page describes the correct syntax when building simple scalar UDFs and provides three examples. +A **scalar SQL UDF** is a user-defined function that returns a single value, such as the sum of a group of values. Scalar UDFs are different from table-valued functions, which return a result set in the form of a table. Syntax -~~~~~~~~~~~~ +------ The following example shows the correct syntax for simple scalar SQL UDF's returning the type name: @@ -28,9 +27,11 @@ The following example shows the correct syntax for simple scalar SQL UDF's retur $ function_body ::= A valid SQL statement Examples -~~~~~~~~~~ -Example 1 – Support for Different Syntax -############ +-------- + +Support for Different Syntax +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Scalar SQL UDF supports standard functionality even when different syntax is used. In the example below, the syntax ``dateadd`` is used instead of ``add_months``, although the function of each is identical. In addition, the operation works correctly even though the order of the expressions in ``add_months`` (``dt``, ``datetime``, and ``n int``) is different from ``MONTH``, ``n``, and ``dt`` in ``dateadd``. @@ -43,8 +44,9 @@ In the example below, the syntax ``dateadd`` is used instead of ``add_months``, $ SELECT dateadd(MONTH ,n,dt) $ $$ LANGUAGE SQL; -Example 2 – Manipulating Strings -############ +Manipulating Strings +~~~~~~~~~~~~~~~~~~~~ + The Scalar SQL UDF can be used to manipulate strings. The following example shows the correct syntax for converting a TEXT date to the DATE type: @@ -57,8 +59,9 @@ The following example shows the correct syntax for converting a TEXT date to the $ select (substring(f,1,4)||'-'||substring(f,5,2)||'-'||substring(f,7,2))::date $ $$ LANGUAGE SQL; -Example 3 – Manually Building Functionality -############ +Manually Building Functionality +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + You can use the Scalar SQL UDF to manually build functionality for otherwise unsupported operations. ..
code-block:: console @@ -76,26 +79,22 @@ You can use the Scalar SQL UDF to manually build functionality for otherwise uns $ language sql; Usage Notes -~~~~~~~~~~~~~~ -The following usage notes apply when using simple scalar SQL UDF's: +~~~~~~~~~~~ * During this stage, the SQL embedded in the function body must be of the type ``SELECT expr;``. Creating a UDF with invalid SQL, or with valid SQL of any other type, results in an error. * As with Python UDFs, the argument list can be left empty. -* SQL UDFs can reference other UDF's, including Python UDF's. +* A function cannot (directly or indirectly) reference itself (such as by referencing another function that references it). -**NOTICE:** A function cannot (directly or indirectly) reference itself (such as by referencing another function that references it). +Since SQL UDF's are one type of supported UDFs, the following Python UDF characteristics apply: -Because SQL UDF's are one type of supported UDFs, the following Python UDF characteristics apply: - -* UDF permission rules - see `Access Control `_. +* UDF permission rules - see :ref:`Access Control`. * The ``get_function_ddl`` utility function works on these functions - see `Getting the DDL for a Function `_. * SQL UDF's should appear in the catalog with Python UDF's - see `Finding Existing UDFs in the Catalog `_. Restrictions -~~~~~~~~~~~~~~~~ -The following restrictions apply to simple scalar SQL UDFs: +~~~~~~~~~~~~ * Simple scalar SQL UDF's cannot currently reference other UDFs. * As with Python UDFs, SQreamDB does not support overloading. diff --git a/reference/sql/sql_functions/user_defined_functions/simple_scalar_sql_udf.rst b/reference/sql/sql_functions/user_defined_functions/simple_scalar_sql_udf.rst index 1a78e2e8c..f47983c4a 100644 --- a/reference/sql/sql_functions/user_defined_functions/simple_scalar_sql_udf.rst +++ b/reference/sql/sql_functions/user_defined_functions/simple_scalar_sql_udf.rst @@ -6,6 +6,7 @@ Simple Scalar SQL UDF's Syntax ~~~~~~~~~~~~ + The following example shows the correct syntax for simple scalar SQL UDF's: @@ -27,6 +28,7 @@ The following example shows the correct syntax for simple scalar SQL UDF's: Usage Notes ~~~~~~~~~~~~~~ + The following usage notes apply when using simple scalar SQL UDF's: * During this stage, the SQL embedded in the function body must be of the type ``SELECT expr;``. Creating a UDF with invalid SQL, or with valid SQL of any other type, results in an error. @@ -45,6 +47,7 @@ Because SQL UDF's are one type of supported UDFs, the following Python UDF chara Restrictions ~~~~~~~~~~~~~~~~~~~~~ + The following restrictions apply to simple scalar SQL UDF's: * Simple scalar SQL UDF's cannot currently reference other UDF's. diff --git a/reference/sql/sql_functions/window_functions/first_value.rst b/reference/sql/sql_functions/window_functions/first_value.rst index 07708872d..78896746a 100644 --- a/reference/sql/sql_functions/window_functions/first_value.rst +++ b/reference/sql/sql_functions/window_functions/first_value.rst @@ -5,7 +5,7 @@ FIRST_VALUE ************************** The **FIRST_VALUE** function returns the value located in the selected column of the first row of a segment. If the table is not segmented, the FIRST_VALUE function returns the value from the first row of the whole table. -This function returns the same type of variable that you input for your requested value. For example, requesting the value for the first employee in a list using an ``int`` type output returns an ``int`` type ID column.
If you use a ``varchar`` type, the function returns a ``varchar`` type name column. +This function returns the same type of variable that you input for your requested value. For example, requesting the value for the first employee in a list using an ``int`` type output returns an ``int`` type ID column. If you use a ``text`` type, the function returns a ``text`` type name column. Syntax ------- @@ -21,4 +21,4 @@ None Returns --------- -Returns the value located in the selected column of the first row of a segment. +Returns the value located in the selected column of the first row of a segment. \ No newline at end of file diff --git a/reference/sql/sql_functions/window_functions/index.rst b/reference/sql/sql_functions/window_functions/index.rst index 4061ad239..e949e0e3c 100644 --- a/reference/sql/sql_functions/window_functions/index.rst +++ b/reference/sql/sql_functions/window_functions/index.rst @@ -4,26 +4,21 @@ Window Functions ******************** -Window functions are functions applied over a subset (known as a window) of the rows returned by a :ref:`select` query. +Window functions are functions applied over a subset (known as a window) of the rows returned by a :ref:`select` query. This page describes the following functions: -Read more about :ref:`window_functions` in the :ref:`sql_syntax` section. - -.. toctree:: - :maxdepth: 1 - :caption: Window Functions: - :glob: - :hidden: +.. hlist:: + :columns: 1 - lag - lead - row_number - rank - first_value - last_value - nth_value - dense_rank - percent_rank - cume_dist - ntile - + * :ref:`lag` + * :ref:`lead` + * :ref:`row_number` + * :ref:`rank` + * :ref:`first_value` + * :ref:`last_value` + * :ref:`nth_value` + * :ref:`dense_rank` + * :ref:`percent_rank` + * :ref:`cume_dist` + * :ref:`ntile` + +For more information, see :ref:`window_functions` in the :ref:`sql_syntax` section. \ No newline at end of file diff --git a/reference/sql/sql_functions/window_functions/lag.rst b/reference/sql/sql_functions/window_functions/lag.rst index e93be2821..96ea55bed 100644 --- a/reference/sql/sql_functions/window_functions/lag.rst +++ b/reference/sql/sql_functions/window_functions/lag.rst @@ -59,14 +59,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/window_functions/last_value.rst b/reference/sql/sql_functions/window_functions/last_value.rst index 3cadaa9d1..ae1276e79 100644 --- a/reference/sql/sql_functions/window_functions/last_value.rst +++ b/reference/sql/sql_functions/window_functions/last_value.rst @@ -5,7 +5,7 @@ LAST_VALUE ************************** The **LAST_VALUE** function returns the value located in the selected column of the last row of a segment. If the table is not segmented, the LAST_VALUE function returns the value from the last row of the whole table. -This function returns the same type of variable that you input for your requested value.
For example, requesting the value for the last employee in a list using an ``int`` type output returns an ``int`` type ID column. If you use a ``text`` type, the function returns a ``text`` type name column. Syntax ------- @@ -21,4 +21,4 @@ None Returns --------- -Returns the value located in the selected column of the last row of a segment. +Returns the value located in the selected column of the last row of a segment. \ No newline at end of file diff --git a/reference/sql/sql_functions/window_functions/lead.rst b/reference/sql/sql_functions/window_functions/lead.rst index bc311689f..f1c52e1ec 100644 --- a/reference/sql/sql_functions/window_functions/lead.rst +++ b/reference/sql/sql_functions/window_functions/lead.rst @@ -59,14 +59,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/window_functions/nth_value.rst b/reference/sql/sql_functions/window_functions/nth_value.rst index a2c1dd9a6..75e97a939 100644 --- a/reference/sql/sql_functions/window_functions/nth_value.rst +++ b/reference/sql/sql_functions/window_functions/nth_value.rst @@ -22,8 +22,8 @@ The following example shows the syntax for a table named ``superstore`` used for CREATE TABLE superstore ( - "Section" varchar(40), - "Product_Name" varchar(40), + "Section" text(40), + "Product_Name" text(40), "Sales_In_K" int ); diff --git a/reference/sql/sql_functions/window_functions/rank.rst b/reference/sql/sql_functions/window_functions/rank.rst index 28856bd04..7699fd399 100644 --- a/reference/sql/sql_functions/window_functions/rank.rst +++ b/reference/sql/sql_functions/window_functions/rank.rst @@ -48,14 +48,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_functions/window_functions/row_number.rst b/reference/sql/sql_functions/window_functions/row_number.rst index ea5786aef..cfcc14b7b 100644 --- a/reference/sql/sql_functions/window_functions/row_number.rst +++ b/reference/sql/sql_functions/window_functions/row_number.rst @@ -48,14 +48,14 @@ For these examples, assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - "Name" varchar(40), - "Team" varchar(40), + "Name" text(40), + "Team" text(40), "Number" tinyint, - "Position" varchar(2), + "Position" text(2), "Age" tinyint, - "Height" varchar(4), + "Height" text(4), "Weight" real, - "College" varchar(40), + "College" text(40), "Salary" float ); diff --git a/reference/sql/sql_statements/access_control_commands/alter_default_permissions.rst b/reference/sql/sql_statements/access_control_commands/alter_default_permissions.rst index 220e05eb3..212eb662b 100644 --- a/reference/sql/sql_statements/access_control_commands/alter_default_permissions.rst +++ b/reference/sql/sql_statements/access_control_commands/alter_default_permissions.rst @@ -1,86 +1,166 @@ ..
_alter_default_permissions: -***************************** +************************* ALTER DEFAULT PERMISSIONS -***************************** +************************* -``ALTER DEFAULT PERMISSIONS`` allows granting automatic permissions to future objects. +The ``ALTER DEFAULT PERMISSIONS`` command lets you grant automatic permissions to future objects. -By default, if one user creates a table, another user will not have ``SELECT`` permissions on it. -By modifying the target role's default permissions, a database administrator can ensure that -all objects created by that role will be accessible to others. +By default, users do not have ``SELECT`` permissions on tables created by other users. Database administrators can grant access to other users by modifying the target role's default permissions. -Learn more about the permission system in the :ref:`access control guide`. +For more information about access control, see :ref:`Access Control`. + +.. contents:: + :local: + :depth: 1 Permissions -============= +=========== -To alter default permissions, the current role must have the ``SUPERUSER`` permission. +The ``SUPERUSER`` permission is required to alter default permissions. Syntax -========== +====== .. code-block:: postgres - alter_default_permissions_statement ::= - ALTER DEFAULT PERMISSIONS FOR target_role_name - [IN schema_name, ...] - FOR { TABLES | SCHEMAS } - { grant_clause | DROP grant_clause} - TO ROLE { role_name | public }; - - grant_clause ::= - GRANT - { CREATE FUNCTION + ALTER DEFAULT PERMISSIONS FOR modifying_role_name + [IN schema_name, ...] + FOR { + SCHEMAS + | TABLES + | FOREIGN TABLES + | VIEWS + | COLUMNS + | SAVED_QUERIES + | FUNCTIONS + } + { grant_clause + | DROP grant_clause } + TO { modified_role_name | public + } + + grant_clause ::= + GRANT + { CREATE FUNCTION | SUPERUSER | CONNECT - | CREATE | USAGE | SELECT | INSERT | DELETE | DDL + | UPDATE | EXECUTE | ALL - } - - target_role_name ::= identifier - - role_name ::= identifier + } - schema_name ::= identifier + +Supported Permissions +===================== + +The following table describes the supported permissions: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Permission + - Object + - Description + * - ``SUPERUSER`` + - Schema + - The most privileged role, with full control over a cluster, database, or schema + * - ``USAGE`` + - Schema + - For a role to see tables in a schema, it needs the ``USAGE`` permission + * - ``SELECT`` + - Table + - Allows a user to run :ref:`select` queries on table contents + * - ``INSERT`` + - Table + - Allows a user to run :ref:`copy_from` and :ref:`insert` statements to load data into a table + * - ``UPDATE`` + - Table + - Allows a user to modify the value of certain columns in existing rows without creating a table + * - ``DELETE`` + - Table + - Allows a user to run :ref:`delete`, :ref:`truncate` statements to delete data from a table + * - ``DDL`` + - Schema, Table + - Allows a user to :ref:`alter tables`, rename columns and tables, etc. + -.. include:: grant.rst - :start-line: 127 - :end-line: 180 Examples -============ +======== -Automatic permissions for newly created schemas ------------------------------------------------- +.. contents:: + :local: + :depth: 1 + +Granting Default Table Permissions +---------------------------------- -When role ``demo`` creates a new schema, roles u1,u2 will get USAGE and CREATE permissions in the new schema: +Altering the default permissions of **r1** so that **r2** is able to execute ``SELECT`` on tables created by **r1**: ..
code-block:: postgres - ALTER DEFAULT PERMISSIONS FOR demo FOR SCHEMAS GRANT USAGE, CREATE TO u1,u2; + CREATE ROLE r1; + CREATE ROLE r2; + ALTER DEFAULT PERMISSIONS FOR r1 FOR TABLES GRANT SELECT TO r2; +Once created, you can run the following catalog query to inspect the resulting default permission: -Automatic permissions for newly created tables in a schema ---------------------------------------------------------------- +.. code-block:: postgres -When role ``demo`` creates a new table in schema ``s1``, roles u1,u2 wil be granted with SELECT on it: + SELECT + tdp.database_name as "database_name", + ss.schema_name as "schema_name", + rs1.name as "table_creator", + rs2.name as "grant_to", + pts.name as "permission_type" + FROM sqream_catalog.table_default_permissions tdp + INNER JOIN sqream_catalog.roles rs1 on tdp.modifier_role_id = rs1.role_id + INNER JOIN sqream_catalog.roles rs2 on tdp.getter_role_id = rs2.role_id + LEFT JOIN sqream_catalog.schemas ss on tdp.schema_id = ss.schema_id + INNER JOIN sqream_catalog.permission_types pts on pts.permission_type_id=tdp.permission_type + ; + +The following is an example of the output generated from the above query: + +-----------------------+----------------------+-------------------+--------------+------------------------------+ | **database_name** | **schema_name** | **table_creator** | **grant_to** | **permission_type** | +-----------------------+----------------------+-------------------+--------------+------------------------------+ | master | NULL | public | public | select | +-----------------------+----------------------+-------------------+--------------+------------------------------+ +For more information about default permissions, see `Default Permissions `_. + +Granting Automatic Permissions for Newly Created Schemas +-------------------------------------------------------- + +When the role ``demo`` creates a new schema, roles **u1,u2** are granted ``USAGE`` permission in the new schema, as shown below: + +.. code-block:: postgres + + ALTER DEFAULT PERMISSIONS FOR demo FOR SCHEMAS GRANT USAGE TO u1,u2; + +Granting Automatic Permissions for Newly Created Tables in a Schema +------------------------------------------------------------------- + +When the role ``demo`` creates a new table in schema ``s1``, roles **u1,u2** are granted ``SELECT`` permissions, as shown below: .. code-block:: postgres ALTER DEFAULT PERMISSIONS FOR demo IN s1 FOR TABLES GRANT SELECT TO u1,u2; -Revoke (``DROP GRANT``) permissions for newly created tables --------------------------------------------------------------- +Revoking Permissions from Newly Created Tables +---------------------------------------------- + +Permissions are revoked using the ``DROP GRANT`` clause, as shown below: .. code-block:: postgres - ALTER DEFAULT PERMISSIONS FOR public FOR TABLES DROP GRANT SELECT,DDL,INSERT,DELETE TO public; + ALTER DEFAULT PERMISSIONS FOR public FOR TABLES DROP GRANT SELECT,DDL,INSERT,DELETE TO public; \ No newline at end of file diff --git a/reference/sql/sql_statements/access_control_commands/create_role.rst b/reference/sql/sql_statements/access_control_commands/create_role.rst index 1eb2bba47..630cf6714 100644 --- a/reference/sql/sql_statements/access_control_commands/create_role.rst +++ b/reference/sql/sql_statements/access_control_commands/create_role.rst @@ -76,11 +76,11 @@ Creating a user role A user role has permissions to login, and has a password. -.. tip:: Some DBMSs call this *CREATE USER* or *ADD USER* - ..
code-block:: postgres CREATE ROLE new_role; GRANT LOGIN to new_role; - GRANT PASSWORD 'Tr0ub4dor&3' to new_role; - GRANT CONNECT ON DATABASE master to new_role; -- Repeat for other desired databases \ No newline at end of file + GRANT PASSWORD 'passw0rd' to new_role; + GRANT CONNECT ON DATABASE master to new_role; -- Repeat for all desired databases + GRANT USAGE ON SERVICE sqream TO new_role; + GRANT ALL ON SCHEMA public TO new_role; -- It is advisable to grant permissions on at least one schema diff --git a/reference/sql/sql_statements/access_control_commands/drop_role.rst b/reference/sql/sql_statements/access_control_commands/drop_role.rst index f519870cd..fe46890ba 100644 --- a/reference/sql/sql_statements/access_control_commands/drop_role.rst +++ b/reference/sql/sql_statements/access_control_commands/drop_role.rst @@ -4,29 +4,22 @@ DROP ROLE ***************** -``DROP ROLE`` remove roles. - -Learn more about the permission system in the :ref:`access control guide`. +The ``DROP ROLE`` command is used for removing roles from the database. The optional ``IF EXISTS`` clause can be included to prevent an error if the specified role does not exist. If the ``IF EXISTS`` clause is omitted and the role does not exist, an error will be raised. See also :ref:`create_role`. -Permissions -============= - -To drop a role, the current role must have the ``SUPERUSER`` permission. - Syntax -========== +====== .. code-block:: postgres drop_role_statement ::= - DROP ROLE role_name ; + DROP ROLE [IF EXISTS] role_name + + role_name ::= identifier - + Parameters -============ +========== .. list-table:: :widths: auto @@ -35,15 +28,18 @@ Parameters * - Parameter - Description * - ``role_name`` - - The name of the role to drop. - + - Role name to be removed Examples -=========== - -Dropping a role ------------------------------------------ +======== .. code-block:: postgres DROP ROLE new_role; + +Permissions +=========== + +To drop a role, the current role must have the ``SUPERUSER`` permission. + +You can learn more about system permissions in the :ref:`access control guide`. \ No newline at end of file diff --git a/reference/sql/sql_statements/access_control_commands/get_all_roles_database_ddl.rst b/reference/sql/sql_statements/access_control_commands/get_all_roles_database_ddl.rst new file mode 100644 index 000000000..13543d585 --- /dev/null +++ b/reference/sql/sql_statements/access_control_commands/get_all_roles_database_ddl.rst @@ -0,0 +1,45 @@ +.. _get_all_roles_database_ddl: + +************************** +GET ALL ROLES DATABASE DDL +************************** + +The ``GET ALL ROLES DATABASE DDL`` statement returns the definitions of all roles in the database, in DDL format. + +.. contents:: + :local: + :depth: 1 + +Syntax +========== + +.. code-block:: postgres + + select get_all_roles_database_ddl() + +Example +=========== + +.. code-block:: psql + + select get_all_roles_database_ddl(); + +Output +========== +The following is an example of the output of the ``GET_ALL_ROLES_DATABASE_DDL`` statement: + +..
code-block:: postgres + + grant create, usage on schema "public" to "public" ; alter default schema for "public" to "public"; alter default permissions for "public" for schemas grant superuser to creator_role ; alter default permissions for "public" for tables grant select, insert, delete, ddl, update to creator_role ; grant select, insert, delete, ddl, update on table "public"."customer" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."d_customer" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."demo_customer" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."demo_lineitem" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."lineitem" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."nation" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."orders" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."part" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."partsupp" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."region" to "sqream" ; grant select, insert, delete, ddl, update on table "public"."supplier" to "sqream" ; alter default schema for "sqream" to "public"; + +Permissions +============= +Using the ``GET ALL ROLES DATABASE DDL`` statement requires no special permissions. + +For more information, see the following: + +* :ref:`get_all_roles_global_ddl` + +* :ref:`get_role_permissions` \ No newline at end of file diff --git a/reference/sql/sql_statements/access_control_commands/get_statement_permissions.rst b/reference/sql/sql_statements/access_control_commands/get_statement_permissions.rst index 0846c5cc4..cee5a3d38 100644 --- a/reference/sql/sql_statements/access_control_commands/get_statement_permissions.rst +++ b/reference/sql/sql_statements/access_control_commands/get_statement_permissions.rst @@ -1,10 +1,10 @@ .. _get_statement_permissions: **************************** -GET_STATEMENT_PERMISSIONS +GET STATEMENT PERMISSIONS **************************** -``GET_STATEMENT_PERMISSIONS`` analyzes an SQL statement and returns a list of permissions required to execute it. +``GET STATEMENT PERMISSIONS`` analyzes an SQL statement and returns a list of permissions required to execute it. Use this function to understand the permissions required, before :ref:`granting` them to a specific role. @@ -15,7 +15,7 @@ See also :ref:`grant`, :ref:`create_role`. Permissions ============= -No special permissions are required to run ``GET_STATEMENT_PERMISSIONS``. +No special permissions are required. Syntax ========== @@ -58,9 +58,58 @@ If the statement requires no special permissions, the utility returns an empty r * - ``object_name`` - The object name -.. include:: grant.rst - :start-line: 127 - :end-line: 180 +The following table describes the supported permissions: + +..
list-table:: + :widths: auto + :header-rows: 1 + + * - Permission + - Object + - Description + * - ``LOGIN`` + - Cluster + - Login permissions (with a password) allow a role to be a user and log in to a database + * - ``PASSWORD`` + - Cluster + - Sets the password for a user role + * - ``CREATE FUNCTION`` + - Database + - Allows a user to :ref:`create a Python UDF` + * - ``SUPERUSER`` + - Cluster, Database, Schema + - The most privileged role, with full control over a cluster, database, or schema + * - ``CONNECT`` + - Database + - Allows a user to connect and use a database + * - ``CREATE`` + - Database, Schema + - For a role to create and manage objects, it needs the ``CREATE`` and ``USAGE`` permissions at the respective level + * - ``USAGE`` + - Schema + - For a role to see tables in a schema, it needs the ``USAGE`` permission + * - ``SELECT`` + - Table + - Allows a user to run :ref:`select` queries on table contents + * - ``INSERT`` + - Table + - Allows a user to run :ref:`copy_from` and :ref:`insert` statements to load data into a table + * - ``UPDATE`` + - Table + - Allows a user to modify the value of certain columns in existing rows without creating a table + * - ``DELETE`` + - Table + - Allows a user to run :ref:`delete`, :ref:`truncate` statements to delete data from a table + * - ``DDL`` + - Database, Schema, Table, Function + - Allows a user to :ref:`alter tables`, rename columns and tables, etc. + * - ``EXECUTE`` + - Function + - Allows a user to execute UDFs + * - ``ALL`` + - Cluster, Database, Schema, Table, Function + - All of the above permissions at the respective level + Examples =========== diff --git a/reference/sql/sql_statements/access_control_commands/grant.rst b/reference/sql/sql_statements/access_control_commands/grant.rst index dfb48c212..758bf7c84 100644 --- a/reference/sql/sql_statements/access_control_commands/grant.rst +++ b/reference/sql/sql_statements/access_control_commands/grant.rst @@ -15,97 +15,130 @@ Learn more about the permission system in the :ref:`access control guide`. Syntax ====== .. code-block:: postgres + + -- Grant permissions at the cluster level: + GRANT { SUPERUSER | LOGIN | PASSWORD '<password>' } + TO <role_name> [, ...] + + -- Grant permissions at the database level: + GRANT { + CREATE + | CONNECT + | DDL + | SUPERUSER + | CREATE FUNCTION } [, ...] + | ALL [PERMISSIONS] + ON DATABASE <database_name> [, ...] + TO <role_name> [, ...] + + -- Grant permissions at the schema level: + GRANT { + CREATE + | DDL + | USAGE + | SUPERUSER } [, ...] + | ALL [PERMISSIONS] + ON SCHEMA <schema_name> [, ...] + TO <role_name> [, ...] + + -- Grant permissions at the object level: + GRANT { + SELECT + | INSERT + | DELETE + | DDL + | UPDATE } [, ...] + | ALL [PERMISSIONS] + ON {TABLE <table_name> [, ...] + | ALL TABLES IN SCHEMA <schema_name> [, ...]} + TO <role_name> [, ...] + + -- Grant permissions at the catalog level: + GRANT SELECT + ON { CATALOG <catalog_name> [, ...] } + TO <role_name> [, ...] + + -- Grant permissions on the foreign table level: + + GRANT { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { FOREIGN TABLE <table_name> [, ...] + | ALL FOREIGN TABLE IN SCHEMA <schema_name> [, ...]} + TO <role_name> [, ...] + + -- Grant function execution permission: + GRANT { + ALL + | EXECUTE + | DDL } + ON FUNCTION <function_name> + TO <role_name> + + -- Grant permissions at the column level: + GRANT + { + { SELECT + | DDL + | INSERT + | UPDATE } [, ...] + | ALL [PERMISSIONS] + } + ON + { + COLUMN <column_name> [,...] IN TABLE <table_name> + | COLUMN <column_name> [,...] IN FOREIGN TABLE <table_name> + } + TO <role_name> [, ...] + + -- Grant permissions on the view level + GRANT { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { VIEW <view_name> [, ...] + | ALL VIEWS IN SCHEMA <schema_name> [, ...]} + TO <role_name> [, ...] + + -- Grant permissions at the Service level: + GRANT { + {USAGE} [, ...] + | ALL [PERMISSIONS] } + ON { SERVICE <service_name> [, ...] + | ALL SERVICES IN SYSTEM } + TO <role_name> [, ...] + + -- Grant saved query permissions + GRANT + SELECT + | DDL + | USAGE + | ALL + ON SAVED QUERY <saved_query_name> [,...] + TO <role_name> [,...] + + -- Allows role2 to use permissions granted to role1 + GRANT <role1_name> [, ...] + TO <role2_name> + + -- Also allows role2 to grant role1 to other roles: + GRANT <role1_name> [, ...] + TO <role2_name> [,...] [WITH ADMIN OPTION] + Parameters ============ +The following table describes the ``GRANT`` parameters: + .. list-table:: :widths: auto :header-rows: 1 @@ -114,7 +147,7 @@ Parameters - Description * - ``role_name`` - The name of the role to grant permissions to - * - ``table_name``, ``database_name``, ``schema_name``, ``function_name`` + * - ``table_name``, ``database_name``, ``schema_name``, ``function_name``, ``catalog_name``, ``column_name``, ``service_name``, ``saved_query_name`` - Object to grant permissions on. * - ``WITH ADMIN OPTION`` - @@ -126,9 +159,11 @@ Parameters .. include from here -Supported permissions +Supported Permissions ======================= +The following table describes the supported permissions: + .. list-table:: :widths: auto :header-rows: 1 @@ -155,38 +190,46 @@ Supported permissions - Database, Schema, Table - For a role to create and manage objects, it needs the ``CREATE`` and ``USAGE`` permissions at the respective level * - ``USAGE`` - - Schema + - Schema, Saved Query, Services - For a role to see tables in a schema, it needs the ``USAGE`` permission * - ``SELECT`` - - Table + - Table, Saved Query, View, Catalog, Foreign Table - Allows a user to run :ref:`select` queries on table contents * - ``INSERT`` - Table - Allows a user to run :ref:`copy_from` and :ref:`insert` statements to load data into a table + * - ``UPDATE`` + - Table + - Allows a user to modify the value of certain columns in existing rows without creating a table * - ``DELETE`` - Table - Allows a user to run :ref:`delete`, :ref:`truncate` statements to delete data from a table * - ``DDL`` - - Database, Schema, Table, Function + - Database, Schema, Table, Function, Saved Query, View, Foreign Table - Allows a user to :ref:`alter tables`, rename columns and tables, etc. * - ``EXECUTE`` - Function - Allows a user to execute UDFs * - ``ALL`` - - Cluster, Database, Schema, Table, Function + - Cluster, Database, Schema, Table, Function, Saved Query, Services, Foreign Table - All of the above permissions at the respective level - .. end include Examples =========== -Creating a user role with login permissions +This section includes the following examples: + +.. contents:: + :local: + :depth: 1 + +Creating a User Role with Log-in Permissions ---------------------------------------------- -Convert a role to a user by granting a password and login permissions +The following example shows how to convert a role to a user by granting password and log-in permissions: .. code-block:: postgres @@ -195,9 +238,11 @@ Convert a role to a user by granting a password and login permissions GRANT PASSWORD 'Tr0ub4dor&3' to new_role; GRANT CONNECT ON DATABASE master to new_role; -- Repeat for other desired databases
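+ +Granting Column- and Service-Level Permissions +---------------------------------------------------- + +The following sketch illustrates the column- and service-level forms from the syntax above; the role ``analyst``, table ``employees``, and column ``salary`` are illustrative names rather than objects from this documentation set: + +.. code-block:: postgres + + -- Column level + GRANT SELECT ON COLUMN salary IN TABLE employees TO analyst; + + -- Service level + GRANT USAGE ON SERVICE sqream TO analyst;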
-Promoting a user to a superuser +Promoting a User to a Superuser ------------------------------------- + +The following example shows how to promote a user to a superuser: + ..
code-block:: postgres -- On the entire cluster @@ -206,10 +251,11 @@ Promoting a user to a superuser -- For a specific database GRANT SUPERUSER ON DATABASE my_database TO new_role; - -Creating a new role for a group of users +Creating a New Role for a Group of Users -------------------------------------------- +The following example shows how to create a new role for a group of users: + .. code-block:: postgres -- Create new users (we will grant them passwords and logins later) @@ -222,10 +268,10 @@ Creating a new role for a group of users GRANT r_database_architect TO dba_user2; GRANT r_database_architect TO dba_user3; -Granting with admin option +Granting with Admin Option ------------------------------ -If ``WITH ADMIN OPTION`` is specified, the role that has the admin option can in turn grant membership in the role to others, and revoke membership in the role as well. +If ``WITH ADMIN OPTION`` is specified, the role with the **admin** option can grant membership in the role to others and revoke membership, as shown below: .. code-block:: postgres @@ -234,13 +280,18 @@ If ``WITH ADMIN OPTION`` is specified, the role that has the admin option can in GRANT r_database_architect TO dba_user1 WITH ADMIN OPTION; -Change password for user role +Changing Password for User Role -------------------------------------- -To change a user role's password, grant the user a new password. +The following is an example of changing a password for a user role. This is done by granting the user a new password: .. code-block:: postgres - GRANT PASSWORD 'new_password' TO rhendricks; + GRANT PASSWORD 'Passw0rd!' TO rhendricks; .. note:: Granting a new password overrides any previous password. Changing the password while the role has an active running statement does not affect that statement, but will affect subsequent statements. + +Permissions +============= + +To grant permissions, the current role must have the ``SUPERUSER`` permission, or have the ``ADMIN OPTION``. \ No newline at end of file diff --git a/reference/sql/sql_statements/access_control_commands/grant_usage_on_service_to_all_roles.rst b/reference/sql/sql_statements/access_control_commands/grant_usage_on_service_to_all_roles.rst new file mode 100644 index 000000000..083dcd2d6 --- /dev/null +++ b/reference/sql/sql_statements/access_control_commands/grant_usage_on_service_to_all_roles.rst @@ -0,0 +1,74 @@ +.. _grant_usage_on_service_to_all_roles: + +*********************************** +GRANT USAGE ON SERVICE TO ALL ROLES +*********************************** + +The ``GRANT USAGE ON SERVICE TO ALL ROLES`` utility function enables a ``SUPERUSER`` to grant access to services for other system roles. + +You may use it to: + +* Grant access to all services for all roles +* Grant access to a specific service for all roles + + +This utility function is particularly beneficial during the upgrade process from SQreamDB version 4.2 or earlier to version 4.3 or later. In previous versions, service access permissions were not required. In this scenario, you can easily grant access to all services for all roles immediately after the upgrade. If you are already using SQreamDB version 4.3 or later, you can grant or revoke access to services by following the :ref:`access permission guide`. + +.. note:: + + When you create a new role, it automatically inherits access permissions to all services from the ``PUBLIC`` role.
If you prefer to create new roles without automatically granting them access permissions to all services, you will need to follow the :ref:`ALTER DEFAULT PERMISSIONS` guide to revoke the access permissions of the ``PUBLIC`` role. + +Syntax +====== + +.. code-block:: psql + + SELECT grant_usage_on_service_to_all_roles( [ '<service_name>' ] ) + + +Examples +======== +Granting access to all services for all roles: + +.. code-block:: psql + + SELECT grant_usage_on_service_to_all_roles(); + +Output: + +.. code-block:: + + role_name | service_name | status + ----------+---------------+-------------------- + role1 | service1 | Permission Granted + role1 | service2 | Permission Granted + role1 | service3 | Permission Granted + role2 | service1 | Permission Granted + role2 | service2 | Permission Granted + role2 | service3 | Permission Granted + role3 | service1 | Permission Exists + role3 | service2 | Permission Exists + role3 | service3 | Permission Exists + +Granting access to one specific service for all roles: + +.. code-block:: psql + + SELECT grant_usage_on_service_to_all_roles('service1'); + +Output: + +.. code-block:: + + role_name | service_name | status + ----------+---------------+-------------------- + role1 | service1 | Permission Granted + role2 | service1 | Permission Granted + role3 | service1 | Permission Exists + + +Permissions +=========== + +Using the ``grant_usage_on_service_to_all_roles`` requires the ``SUPERUSER`` permission. + diff --git a/reference/sql/sql_statements/access_control_commands/revoke.rst b/reference/sql/sql_statements/access_control_commands/revoke.rst index fbeee701d..786d55836 100644 --- a/reference/sql/sql_statements/access_control_commands/revoke.rst +++ b/reference/sql/sql_statements/access_control_commands/revoke.rst @@ -1,8 +1,8 @@ .. _revoke: -***************** +****** REVOKE -***************** +****** The ``REVOKE`` statement removes permissions from a role. It allows for removing permissions on specific objects. @@ -11,95 +11,128 @@ Learn more about the permission system in the :ref:`access control guide`. Syntax ====== .. code-block:: postgres + + -- Revoke permissions at the cluster level: + REVOKE { SUPERUSER | LOGIN | PASSWORD '<password>' } + FROM <role_name> [, ...] + + -- Revoke permissions at the database level: + REVOKE { + CREATE + | CONNECT + | DDL + | SUPERUSER + | CREATE FUNCTION } [, ...] + | ALL [PERMISSIONS] + ON DATABASE <database_name> [, ...] + FROM <role_name> [, ...] + + -- Revoke permissions at the schema level: + REVOKE { + CREATE + | DDL + | USAGE + | SUPERUSER } [, ...] + | ALL [PERMISSIONS] + ON SCHEMA <schema_name> [, ...] + FROM <role_name> [, ...] + + -- Revoke permissions at the object level: + REVOKE { + SELECT + | INSERT + | DELETE + | DDL + | UPDATE } [, ...] + | ALL [PERMISSIONS] + ON {TABLE <table_name> [, ...] + | ALL TABLES IN SCHEMA <schema_name> [, ...]} + FROM <role_name> [, ...] + + -- Revoke permissions at the catalog level: + REVOKE SELECT + ON { CATALOG <catalog_name> [, ...] } + FROM <role_name> [, ...] + + -- Revoke permissions on the foreign table level: + + REVOKE { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { FOREIGN TABLE <table_name> [, ...] + | ALL FOREIGN TABLE IN SCHEMA <schema_name> [, ...]} + FROM <role_name> [, ...] + + -- Revoke function execution permission: + REVOKE { + ALL + | EXECUTE + | DDL } + ON FUNCTION <function_name> + FROM <role_name> + + -- Revoke permissions at the column level: + REVOKE + { + { SELECT + | DDL + | INSERT + | UPDATE } [, ...] + | ALL [PERMISSIONS] + } + ON + { + COLUMN <column_name> [,...] IN TABLE <table_name> + | COLUMN <column_name> [,...] IN FOREIGN TABLE <table_name> + } + FROM <role_name> [, ...] + + -- Revoke permissions on the view level + REVOKE { + {SELECT + | DDL } [, ...] + | ALL [PERMISSIONS] } + ON { VIEW <view_name> [, ...] + | ALL VIEWS IN SCHEMA <schema_name> [, ...]} + FROM <role_name> [, ...] + + -- Revoke permissions at the Service level: + REVOKE { + {USAGE} [, ...] + | ALL [PERMISSIONS] } + ON { SERVICE <service_name> [, ...] + | ALL SERVICES IN SYSTEM } + FROM <role_name> [, ...] + + -- Revoke saved query permissions + REVOKE + SELECT + | DDL + | USAGE + | ALL + ON SAVED QUERY <saved_query_name> [,...] + FROM <role_name> [,...] + + -- Removes access to permissions in role1 by role2 + REVOKE [ADMIN OPTION FOR] <role1_name> [, ...] + FROM <role2_name> [, ...] + + -- Removes permissions to grant role1 to additional roles from role2 + REVOKE [ADMIN OPTION FOR] <role1_name> [, ...] + FROM <role2_name> [, ...] Parameters -============ +========== .. list-table:: :widths: auto @@ -109,26 +142,20 @@ Parameters - Description * - ``role_name`` - The name of the role to revoke permissions from - * - ``table_name``, ``database_name``, ``schema_name``, ``function_name`` - - Object to revoke permissions on. + * - ``table_name``, ``database_name``, ``schema_name``, ``function_name``, ``catalog_name``, ``column_name``, ``service_name``, ``saved_query_name`` + - Object to revoke permissions from * - ``WITH ADMIN OPTION`` - If ``WITH ADMIN OPTION`` is specified, the role that has the admin option can in turn grant membership in the role to others, and revoke membership in the role as well. Specifying ``WITH ADMIN OPTION`` for revocation will return the role to an ordinary role. An ordinary role cannot grant or revoke membership. Examples -=========== +========
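+ +Revoking service access +------------------------------ + +A brief sketch mirroring the service-level form above (``new_role`` and the ``sqream`` service are illustrative): + +.. code-block:: postgres + + REVOKE USAGE ON SERVICE sqream FROM new_role;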
Prevent a role from modifying table contents ----------------------------------------------- +-------------------------------------------- If you don't trust user ``shifty``, revoke DDL and INSERT permissions. @@ -138,7 +165,7 @@ If you don't trust user ``shifty``, revoke DDL and INSERT permissions. REVOKE DDL ON TABLE important_table FROM shifty; Demoting a user from superuser -------------------------------------- +------------------------------ .. code-block:: postgres @@ -146,7 +173,7 @@ Demoting a user from superuser REVOKE SUPERUSER FROM new_role; Revoking admin option ------------------------------- +--------------------- If ``WITH ADMIN OPTION`` is specified, the role that has the admin option can in turn grant membership in the role to others, and revoke membership in the role as well. diff --git a/reference/sql/sql_statements/ddl_commands/add_column.rst b/reference/sql/sql_statements/ddl_commands/add_column.rst index d532c956f..0fb006226 100644 --- a/reference/sql/sql_statements/ddl_commands/add_column.rst +++ b/reference/sql/sql_statements/ddl_commands/add_column.rst @@ -1,42 +1,35 @@ .. _add_column: -********************** +********** ADD COLUMN -********************** +********** The ``ADD COLUMN`` command is used to add columns to an existing table. - - Syntax -========== -The following is the correct syntax for adding a table: +====== .. code-block:: postgres - alter_table_add_column_statement ::= - ALTER TABLE [schema_name.]table_name { ADD COLUMN column_def [, ...] } - ; - - table_name ::= identifier - - schema_name ::= identifier - - column_def :: = { column_name type_name [ default ] [ column_constraint ] } - - column_name ::= identifier - - column_constraint ::= - { NOT NULL | NULL } - - default ::= - DEFAULT default_value + ALTER TABLE [schema_name.]table_name { ADD COLUMN column_def [, ...]
} + schema_name ::= identifier + + table_name ::= identifier + + column_def ::= + { column_name type_name [ default ] [ column_constraint ] [ CHECK('CS "compression_type"') ] } + column_name ::= identifier + + column_constraint ::= + { NOT NULL | NULL } + + default ::= + DEFAULT default_value Parameters -============ -The following parameters can be used for adding a table: +========== .. list-table:: :widths: auto @@ -45,47 +38,57 @@ The following parameters can be used for adding a table: * - Parameter - Description * - ``schema_name`` - - The schema name for the table. Defaults to ``public`` if not specified. + - The schema name for the table. Defaults to ``public`` if not specified * - ``table_name`` - - The table name to apply the change to. + - The table name to apply the change to * - ``ADD COLUMN column_def`` - A comma separated list of ADD COLUMN commands * - ``column_def`` - - A column definition. A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally. - -.. note:: - * When adding a new column to an existing table, a default (or null constraint) has to be specified, even if the table is empty. - * A new column added to the table can not contain an IDENTITY or be of the TEXT type. + - A column definition. A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally Usage Notes =========== -Permissions -============= -The role must have the ``DDL`` permission at the database or table level. +When adding an empty column, the default values for that column will be set to ``NULL``. Examples -=========== -This section includes the following examples: - -.. contents:: - :local: - :depth: 1 +======== Adding a Simple Column with a Default Value ------------------------------------------ -This example shows how to add a simple column with a default value: +------------------------------------------- .. code-block:: postgres - ALTER TABLE cool_animals - ADD COLUMN number_of_eyes INT DEFAULT 2 NOT NULL; + ALTER TABLE + cool_animals + ADD + COLUMN number_of_eyes INT DEFAULT 2 NOT NULL; Adding Several Columns in One Command -------------------------------------------- -This example shows how to add several columns in one command: +------------------------------------- .. code-block:: postgres - ALTER TABLE cool_animals - ADD COLUMN number_of_eyes INT DEFAULT 2 NOT NULL, - ADD COLUMN date_seen DATE DEFAULT '2019-08-01'; + ALTER TABLE + cool_animals + ADD + COLUMN number_of_eyes INT DEFAULT 2 NOT NULL, + ADD + COLUMN date_seen DATE DEFAULT '2019-08-01'; + +Adding a Compressed Column +-------------------------- + +.. code-block:: postgres + + ALTER TABLE + cool_animals + ADD + COLUMN animal_salary INT CHECK('CS "dict"'); + +See the SQreamDB :ref:`compression guide` for supported compression types and methods. + +Permissions +=========== + +The role must have the ``DDL`` permission at the database or table level. \ No newline at end of file diff --git a/reference/sql/sql_statements/ddl_commands/alter_default_schema.rst b/reference/sql/sql_statements/ddl_commands/alter_default_schema.rst index e40a8af7c..8c5722f48 100644 --- a/reference/sql/sql_statements/ddl_commands/alter_default_schema.rst +++ b/reference/sql/sql_statements/ddl_commands/alter_default_schema.rst @@ -6,7 +6,7 @@ ALTER DEFAULT SCHEMA The ``ALTER DEFAULT SCHEMA`` command can be used to change a role's default schema. The default schema in SQream is ``public``.
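+ +For example, the following hedged sketch changes the default schema for a role (``new_role`` is an illustrative name; the statement form matches the ``alter default schema for "public" to "public"`` output shown in :ref:`get_all_roles_database_ddl`): + +.. code-block:: postgres + + ALTER DEFAULT SCHEMA FOR new_role TO public;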
-For more information, see :ref:`create_schema` and :ref:`drop_schema`. +For more information, see :ref:`create_schema`, :ref:`drop_schema`, and :ref:`rename_schema`. diff --git a/reference/sql/sql_statements/ddl_commands/alter_table.rst b/reference/sql/sql_statements/ddl_commands/alter_table.rst index 6afb84be1..752125ad2 100644 --- a/reference/sql/sql_statements/ddl_commands/alter_table.rst +++ b/reference/sql/sql_statements/ddl_commands/alter_table.rst @@ -1,17 +1,23 @@ .. _alter_table: -********************** +*********** ALTER TABLE -********************** -You can use the ``ALTER TABLE`` command to make schema changes to a table, and can be used in conjunction with several sub-commands. +*********** -Locks -======= -Making changes to a schema makes an exclusive lock on tables. While these operations do not typically take much time, other statements may have to wait until the schema changes are completed. +You can use the ``ALTER TABLE`` command to: + +* Add, drop, and rename table columns +* Rename tables +* Add and reorder table clustering keys +* Drop table clustering keys + +Usage Note +========== + +Making changes to a schema places an :ref:`exclusive lock` on tables. While these operations do not typically take much time, other statements may have to wait until the schema changes are completed. Sub-Commands -============== -The following table shows the sub-commands that can be used with the ``ALTER TABLE`` command: +============ .. list-table:: :widths: auto diff --git a/reference/sql/sql_statements/ddl_commands/cluster_by.rst b/reference/sql/sql_statements/ddl_commands/cluster_by.rst index 1a6972b1e..1d1aa4ae2 100644 --- a/reference/sql/sql_statements/ddl_commands/cluster_by.rst +++ b/reference/sql/sql_statements/ddl_commands/cluster_by.rst @@ -1,38 +1,35 @@ .. _cluster_by: -********************** +********** CLUSTER BY -********************** +********** -``CLUSTER BY`` can be used to change clustering keys in a table. +The ``CLUSTER BY`` command is used for changing clustering keys in a table. - -Read our :ref:`data_clustering` guide for more information. - -See also: :ref:`drop_clustering_key`, :ref:`create_table`. - - -Permissions -============= - -The role must have the ``DDL`` permission at the database or table level. +For more information, see :ref:`drop_clustering_key` and :ref:`create_table`. Syntax -========== +====== .. code-block:: postgres - alter_table_rename_table_statement ::= - ALTER TABLE [schema_name.]table_name CLUSTER BY column_name [, ...] - ; + ALTER TABLE [schema_name.]table_name CLUSTER BY column_name [, ...] - table_name ::= identifier + + create_table_statement ::= + CREATE [ OR REPLACE ] TABLE [schema_name.]table_name ( + { column_def [, ...] } + ) + [ CLUSTER BY { column_name [, ...] } ] - column_name ::= identifier + column_def :: = { column_name type_name [ default ] [ column_constraint ] } + table_name ::= identifier + + column_name ::= identifier Parameters -============ +========== .. list-table:: :widths: auto @@ -41,29 +38,38 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The schema name for the table. Defaults to ``public`` if not specified. + - The schema name for the table. Defaults to ``public`` if not specified + * - ``OR REPLACE`` + - Creates a new table and overwrites any existing table by the same name. Does not return an error if the table already exists. ``CREATE OR REPLACE`` does not check the table contents or structure, only the table name + * - ``column_def`` + - A comma separated list of column definitions.
A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally * - ``table_name`` - - The table name to apply the change to. + - The table name to apply the change to * - ``column_name [, ... ]`` - Comma separated list of columns to create clustering keys for +Usage Notes +=========== -Usage notes -================= +* Clustering by ``TEXT`` columns is not supported -Removing clustering keys does not affect existing data. +* Removing clustering keys does not affect existing data -To force data to re-cluster, the table has to be recreated (i.e. with :ref:`create_table_as`). +* To force data to re-cluster, the table has to be recreated (i.e. with :ref:`create_table`). -Examples -=========== +Example +======= -Reclustering a table ------------------------------------------ +Reclustering a Table +-------------------- .. code-block:: postgres - ALTER TABLE public.users CLUSTER BY start_date; + ALTER TABLE + public.users CLUSTER BY start_date; +Permissions +=========== +The role must have the ``DDL`` permission at the database or table level. \ No newline at end of file diff --git a/reference/sql/sql_statements/ddl_commands/create_database.rst b/reference/sql/sql_statements/ddl_commands/create_database.rst index db7ffda3f..a0496feb2 100644 --- a/reference/sql/sql_statements/ddl_commands/create_database.rst +++ b/reference/sql/sql_statements/ddl_commands/create_database.rst @@ -4,7 +4,7 @@ CREATE DATABASE ***************** -``CREATE DATABASE`` creates a new database in SQream DB +``CREATE DATABASE`` creates a new database in SQream. Permissions ============= diff --git a/reference/sql/sql_statements/ddl_commands/create_external_table.rst b/reference/sql/sql_statements/ddl_commands/create_external_table.rst deleted file mode 100644 index fc05ca71e..000000000 --- a/reference/sql/sql_statements/ddl_commands/create_external_table.rst +++ /dev/null @@ -1,156 +0,0 @@ -.. _create_external_table: - -*********************** -CREATE EXTERNAL TABLE -*********************** - -.. warning:: - - The ``CREATE EXTERNAL TABLE`` syntax is deprecated, and will be removed in future versions. - - Starting with SQream DB v2020.2, external tables have been renamed to :ref:`foreign tables`, and use a more flexible foreign data wrapper concept. See :ref:`create_foreign_table` instead. - - Upgrading to a new version of SQream DB converts existing tables automatically. When creating a new external tables, use the new foreign table syntax. - - -``CREATE TABLE`` creates a new external table in an existing database. - -See more in the :ref:`External tables guide`. - -.. tip:: - - * Data in an external table can change if the sources change, and frequent access to remote files may harm performance. - - * To create a regular table, see :ref:`CREATE TABLE ` - -Permissions -============= - -The role must have the ``CREATE`` permission at the database level. - -Syntax -========== - -.. code-block:: postgres - - create_table_statement ::= - CREATE [ OR REPLACE ] EXTERNAL TABLE [schema_name].table_name ( - { column_def [, ...] } - ) - USING FORMAT format_def - WITH { external_table_option [ ...] 
} - ; - - schema_name ::= identifier - - table_name ::= identifier - - format_def ::= { PARQUET | ORC | CSV } - - external_table_option ::= { - PATH '{ path_spec }' - | FIELD DELIMITER '{ field_delimiter }' - | RECORD DELIMITER '{ record_delimiter }' - | AWS_ID '{ AWS ID }' - | AWS_SECRET '{ AWS SECRET }' - } - - path_spec ::= { local filepath | S3 URI | HDFS URI } - - field_delimiter ::= delimiter_character - - record_delimiter ::= delimiter_character - - column_def ::= { column_name type_name [ default ] [ column_constraint ] } - - column_name ::= identifier - - column_constraint ::= - { NOT NULL | NULL } - - default ::= - - DEFAULT default_value - | IDENTITY [ ( start_with [ , increment_by ] ) ] - -.. _cet_parameters: - -Parameters -============ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Parameter - - Description - * - ``OR REPLACE`` - - Create a new table, and overwrite any existing table by the same name. Does not return an error if the table already exists. ``CREATE OR REPLACE`` does not check the table contents or structure, only the table name. - * - ``schema_name`` - - The name of the schema in which to create the table. - * - ``table_name`` - - The name of the table to create, which must be unique inside the schema. - * - ``column_def`` - - A comma separated list of column definitions. A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally. - * - ``USING FORMAT ...`` - - Specifies the format of the source files, such as ``PARQUET``, ``ORC``, or ``CSV``. - * - ``WITH PATH ...`` - - Specifies a path or URI of the source files, such as ``/path/to/*.parquet``. - * - ``FIELD DELIMITER`` - - Specifies the field delimiter for CSV files. Defaults to ``,``. - * - ``RECORD DELIMITER`` - - Specifies the record delimiter for CSV files. Defaults to a newline, ``\n`` - * - ``AWS_ID``, ``AWS_SECRET`` - - Credentials for authenticated S3 access - - -Examples -=========== - -A simple table from Tab-delimited file (TSV) ----------------------------------------------- - -.. code-block:: postgres - - CREATE OR REPLACE EXTERNAL TABLE cool_animals - (id INT NOT NULL, name VARCHAR(30) NOT NULL, weight FLOAT NOT NULL) - USING FORMAT csv - WITH PATH '/home/rhendricks/cool_animals.csv' - FIELD DELIMITER '\t'; - - -A table from a directory of Parquet files on HDFS ------------------------------------------------------ - -.. code-block:: postgres - - CREATE EXTERNAL TABLE users - (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL) - USING FORMAT Parquet - WITH PATH 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.parquet'; - -A table from a bucket of files on S3 --------------------------------------- - -.. code-block:: postgres - - CREATE EXTERNAL TABLE users - (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL) - USING FORMAT Parquet - WITH PATH 's3://pp-secret-bucket/users/*.parquet' - AWS_ID 'our_aws_id' - AWS_SECRET 'our_aws_secret'; - - -Changing an external table to a regular table ------------------------------------------------- - -Materializes an external table into a regular table. - -.. tip: Using an external table allows you to perform ETL-like operations in SQream DB by applying SQL functions and operations to raw files - -.. 
code-block:: postgres - - CREATE TABLE real_table - AS SELECT * FROM external_table; - diff --git a/reference/sql/sql_statements/ddl_commands/create_foreign_table.rst b/reference/sql/sql_statements/ddl_commands/create_foreign_table.rst index d50e13380..d5aa2c4d1 100644 --- a/reference/sql/sql_statements/ddl_commands/create_foreign_table.rst +++ b/reference/sql/sql_statements/ddl_commands/create_foreign_table.rst @@ -1,82 +1,60 @@ .. _create_foreign_table: -*********************** +******************** CREATE FOREIGN TABLE -*********************** +******************** -.. note:: - - Starting with SQream DB v2020.2, external tables have been renamed to foreign tables, and use a more flexible foreign data wrapper concept. - - Upgrading to a new version of SQream DB converts existing external tables automatically. - - -``CREATE FOREIGN TABLE`` creates a new foreign table in an existing database. - -See more in the :ref:`Foreign tables guide`. - -.. tip:: - - * Data in a foreign table can change if the sources change, and frequent access to remote files may harm performance. +The ``CREATE FOREIGN TABLE`` command creates a foreign table that references data stored externally to SQreamDB. This allows for querying data located in files on a file system, :ref:`external storage platforms`, or other databases. - * To create a regular table, see :ref:`CREATE TABLE ` -Permissions -============= - -The role must have the ``CREATE`` permission at the database level. +When querying data stored in file formats that support metadata, such as Parquet, ORC, JSON, and Avro, it is possible to omit the DDL when creating a foreign table. SQreamDB can read the file metadata, enabling the automatic inference of column structure and data types.
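For example, a foreign table over Parquet files may be declared without listing any columns, letting SQreamDB infer them from the file metadata. The following is a minimal sketch based on the syntax below, in which the column list is simply left empty; the table name and bucket path are illustrative:

.. code-block:: postgres

   CREATE FOREIGN TABLE inferred_users ()
     WRAPPER parquet_fdw
     OPTIONS (LOCATION = 's3://pp-secret-bucket/users/*.parquet');
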
Syntax ====== .. code-block:: postgres - create_table_statement ::= - CREATE [ OR REPLACE ] FOREIGN TABLE [schema_name].table_name ( - { column_def [, ...] } - ) - [ FOREIGN DATA ] WRAPPER fdw_name - [ OPTIONS ( option_def [, ... ] ) ] - ; + CREATE [ OR REPLACE ] FOREIGN TABLE [ "<schema_name>" ]."<table_name>" ( + [ column_def [, ...] ] -- When creating foreign tables using CSV source files, it is mandatory to provide the complete table DDL + ) + [ FOREIGN DATA ] WRAPPER fdw_name + [ OPTIONS ( option_def [, ... ] ) ] - schema_name ::= identifier - - table_name ::= identifier - - fdw_name ::= - { csv_fdw | orc_fdw | parquet_fdw } - - option_def ::= - { - LOCATION = '{ path_spec }' - | DELIMITER = '{ field_delimiter }' -- for CSV only - | RECORD_DELIMITER = '{ record_delimiter }' -- for CSV only - | AWS_ID '{ AWS ID }' - | AWS_SECRET '{ AWS SECRET }' - } + fdw_name ::= + { csv_fdw | orc_fdw | parquet_fdw } - path_spec ::= { local filepath | S3 URI | HDFS URI } + option_def ::= + LOCATION = '{ path_spec }' + [ + | DELIMITER = '{ field_delimiter }' -- for CSV only + | RECORD_DELIMITER = '{ record_delimiter }' -- for CSV only + | AWS_ID '{ AWS ID }' + | AWS_SECRET '{ AWS SECRET }' + | QUOTE = { 'C' | E'\ooo' } -- for CSV only + ] - field_delimiter ::= delimiter_character + path_spec ::= { GS URI | S3 URI | HDFS URI } - record_delimiter ::= delimiter_character + field_delimiter ::= delimiter_character + + record_delimiter ::= delimiter_character - column_def ::= - { column_name type_name [ default ] [ column_constraint ] } + column_def ::= + { column_name type_name [ default ] [ column_constraint ] } - column_name ::= identifier + column_name ::= identifier - column_constraint ::= - { NOT NULL | NULL } + column_constraint ::= + { NOT NULL | NULL } - default ::= - DEFAULT default_value - | IDENTITY [ ( start_with [ , increment_by ] ) ] + default ::= + DEFAULT default_value + | IDENTITY [ ( start_with [ , increment_by ] ) ] .. _cft_parameters: Parameters -============ +========== .. list-table:: :widths: auto @@ -85,81 +63,182 @@ Parameters * - Parameter - Description * - ``OR REPLACE`` - - Create a new table, and overwrite any existing table by the same name. Does not return an error if the table already exists. ``CREATE OR REPLACE`` does not check the table contents or structure, only the table name. + - Create a new table, and overwrite any existing table by the same name. Does not return an error if the table already exists. ``CREATE OR REPLACE`` does not check the table contents or structure, only the table name * - ``schema_name`` - - The name of the schema in which to create the table. + - The name of the schema in which to create the table * - ``table_name`` - - The name of the table to create, which must be unique inside the schema. + - The name of the table to create, which must be unique inside the schema * - ``column_def`` - - A comma separated list of column definitions. A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally. + - A comma separated list of column definitions. A minimal column definition includes a name and datatype. Other column constraints and default values may optionally be added. When creating foreign tables using CSV source files, it is mandatory to provide the complete table DDL * - ``WRAPPER ...`` - - Specifies the format of the source files, such as ``parquet_fdw``, ``orc_fdw``, or ``csv_fdw``. + - Specifies the format of the source files, such as ``parquet_fdw``, ``orc_fdw``, ``json_fdw``, or ``csv_fdw`` * - ``LOCATION = ...`` - - Specifies a path or URI of the source files, such as ``/path/to/*.parquet``. + - Specifies a path or URI of the source files, such as ``/path/to/*.parquet`` * - ``DELIMITER = ...`` - - Specifies the field delimiter for CSV files. Defaults to ``,``. + - Specifies the field delimiter for CSV files. Defaults to ``,`` * - ``RECORD_DELIMITER = ...`` - Specifies the record delimiter for CSV files. 
Defaults to a newline, ``\n`` * - ``AWS_ID``, ``AWS_SECRET`` - Credentials for authenticated S3 access + * - ``OFFSET`` + - Used to specify the number of rows to skip from the beginning of the result set + * - ``CONTINUE_ON_ERROR`` + - Specifies if errors should be ignored or skipped. When set to ``true``, the transaction continues despite rejected data and rows containing partially faulty data are skipped entirely. This parameter should be set together with ``ERROR_COUNT``. When reading multiple files, if an entire file can't be opened it will be skipped. Default value: ``false``. Value range: ``true`` or ``false`` + * - ``ERROR_COUNT`` + - Specifies the threshold for the maximum number of faulty records that will be ignored. This setting must be used in conjunction with ``CONTINUE_ON_ERROR``. Default value: ``unlimited``. Value range: 1 to 2147483647 + * - ``QUOTE`` + - Specifies an alternative quote character. The quote character must be a single, 1-byte printable ASCII character, and the equivalent octal syntax of the copy command can be used. The quote character cannot be contained in the field delimiter, the record delimiter, or the null marker. QUOTE can be used with ``csv_fdw`` in ``COPY FROM`` and foreign tables. The following characters cannot be an alternative quote character: ``"-.:\\0123456789abcdefghijklmnopqrstuvwxyzN"`` + +Usage Notes +=========== +* When creating foreign tables from CSV files, it is required to provide a table DDL. + +* When creating a foreign table using the ``*`` wildcard, SQreamDB assumes that all files in the path use the same schema. Examples -=========== +======== -A simple table from Tab-delimited file (TSV) ----------------------------------------------- +Creating a Tab-Delimited Table +------------------------------ .. code-block:: postgres - CREATE OR REPLACE FOREIGN TABLE cool_animals - (id INT NOT NULL, name VARCHAR(30) NOT NULL, weight FLOAT NOT NULL) - USING FORMAT csv - WRAPPER csv_fdw - OPTIONS - ( LOCATION = '/home/rhendricks/cool_animals.csv', - DELIMITER = '\t' - ) - ; - - -A table from a directory of Parquet files on HDFS ------------------------------------------------------ + CREATE + OR REPLACE FOREIGN TABLE nba_new( + "player_name" text null, + "team_name" text null, + "jersey_number" int null, + "position" text null, + "age" int null, + "height" text null, + "weight" int null, + "college" text null, + "salary" int null + ) + WRAPPER + csv_fdw + OPTIONS + (LOCATION = 'gs://blue_docs/nba.csv', + DELIMITER = '\t' + ); + + +Creating a Table Located in an HDFS Directory +--------------------------------------------- .. code-block:: postgres - CREATE FOREIGN TABLE users - (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL) - WRAPPER parquet_fdw - OPTIONS - ( - LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.parquet' - ); -A table from a bucket of ORC files on S3 ------------------------------------------ + CREATE FOREIGN TABLE users ( + id INT NOT NULL, + name TEXT(30) NOT NULL, + email TEXT(50) NOT NULL + ) + WRAPPER + parquet_fdw + OPTIONS + ( + LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.parquet' + ); + +Creating a Table Located Within an S3 Bucket of ORC Files +--------------------------------------------------------- .. 
code-block:: postgres - CREATE FOREIGN TABLE users - (id INT NOT NULL, name VARCHAR(30) NOT NULL, email VARCHAR(50) NOT NULL) - WRAPPER orc_fdw - OPTIONS - ( - LOCATION = 's3://pp-secret-bucket/users/*.orc', - AWS_ID = 'our_aws_id', - AWS_SECRET = 'our_aws_secret' - ); + CREATE FOREIGN TABLE users ( + id INT NOT NULL, + name TEXT(30) NOT NULL, + email TEXT(50) NOT NULL + ) + WRAPPER + orc_fdw + OPTIONS + ( + LOCATION = 's3://pp-secret-bucket/users/*.orc', + AWS_ID = 'our_aws_id', + AWS_SECRET = 'our_aws_secret' + ); -Changing a foreign table to a regular table ------------------------------------------------- +Converting a Foreign Table to an Internal Table +----------------------------------------------- -Materializes a foreign table into a regular table. - -.. tip: Using a foreign table allows you to perform ETL-like operations in SQream DB by applying SQL functions and operations to raw files +Using a foreign table allows you to perform ETL-like operations by applying SQL functions and operations to raw files. .. code-block:: postgres - CREATE TABLE real_table - AS SELECT * FROM some_foreign_table; + CREATE TABLE + real_table AS + SELECT + * + FROM + some_foreign_table; + +Using the ``OFFSET`` Parameter +------------------------------ + +The ``OFFSET`` parameter may be used with Parquet and CSV textual formats. + +.. code-block:: + + CREATE FOREIGN TABLE users7 ( + id INT NOT NULL, + name TEXT NOT NULL, + email TEXT NOT NULL + ) + WRAPPER + parquet_fdw + OPTIONS + ( + LOCATION = 'hdfs://hadoop-nn.piedpiper.com/rhendricks/users/*.parquet', + OFFSET = 2 + ); + +Using the ``CONTINUE_ON_ERROR`` and ``ERROR_COUNT`` Parameters +---------------------------------------------------------------- + +.. code-block:: + + CREATE + OR REPLACE FOREIGN TABLE cool_animalz ( + id INT NOT NULL, + name TEXT NOT NULL, + weight FLOAT NOT NULL + ) + WRAPPER + csv_fdw + OPTIONS + ( + LOCATION = '/home/rhendricks/cool_animals.csv', + DELIMITER = '\t', + CONTINUE_ON_ERROR = true, + ERROR_COUNT = 3 + ); + +Customizing Quotations Using Alternative Characters +--------------------------------------------------- + +.. code-block:: + + CREATE + OR REPLACE FOREIGN TABLE cool_animalz ( + id INT NOT NULL, + name text(30) NOT NULL, + weight FLOAT NOT NULL + ) + WRAPPER + csv_fdw + OPTIONS + ( + LOCATION = '/home/rhendricks/cool_animals.csv', + DELIMITER = '\t', + QUOTE = '@' + ); + +Permissions +=========== + +The role must have the ``CREATE`` permission at the database level. +The automatic foreign table DDL resolution feature requires **Read** permissions. \ No newline at end of file diff --git a/reference/sql/sql_statements/ddl_commands/create_function.rst b/reference/sql/sql_statements/ddl_commands/create_function.rst index 339543a0a..d28da9784 100644 --- a/reference/sql/sql_statements/ddl_commands/create_function.rst +++ b/reference/sql/sql_statements/ddl_commands/create_function.rst @@ -4,7 +4,7 @@ CREATE FUNCTION ***************** -``CREATE FUNCTION`` creates a new user-defined function (UDF) in an existing database. +``CREATE FUNCTION`` creates a new user-defined function (UDF) in an existing database. See more in our :ref:`Python UDF (user-defined functions)` guide. @@ -52,7 +52,7 @@ Parameters * - ``argument_list`` - A comma separated list of column definitions. A column definition includes a name identifier and a datatype. * - ``return_type`` - - The SQL datatype of the return value, such as ``INT``, ``VARCHAR``, etc. + - The SQL datatype of the return value, such as ``INT``, ``TEXT``, etc. 
* - ``function_body`` - Python code, dollar-quoted (``$$``). diff --git a/reference/sql/sql_statements/ddl_commands/create_schema.rst b/reference/sql/sql_statements/ddl_commands/create_schema.rst index e85f328a9..dfa471dd3 100644 --- a/reference/sql/sql_statements/ddl_commands/create_schema.rst +++ b/reference/sql/sql_statements/ddl_commands/create_schema.rst @@ -1,9 +1,10 @@ .. _create_schema: -***************** +************* CREATE SCHEMA -***************** -The **CREATE SCHEMA** page describes the following: +************* + +The **CREATE SCHEMA** page describes the following: .. contents:: @@ -11,7 +12,7 @@ The **CREATE SCHEMA** page describes the following: :depth: 2 Overview -============ +======== ``CREATE SCHEMA`` creates a new schema in an existing database. A schema is a virtual space for storing tables. @@ -23,30 +24,33 @@ The **CREATE SCHEMA** statement can be used to query tables from different schem .. code-block:: postgres - select <schema_name>.table_name.column_name from <schema_name>.table_name + SELECT <schema_name>.table_name.column_name + FROM <schema_name>.table_name -See also: :ref:`drop_schema`, :ref:`alter_default_schema`. +See also: :ref:`drop_schema`, :ref:`alter_default_schema`, :ref:`rename_schema`. Permissions -============= +=========== The role must have the ``CREATE`` permission at the database level. Syntax -========== +====== + The following example shows the correct syntax for creating a schema: .. code-block:: postgres - create_schema_statement ::= - CREATE SCHEMA schema_name - ; + CREATE SCHEMA [database_name.]schema_name + - schema_name ::= identifier + schema_name ::= identifier + database_name ::= identifier Parameters -============ +========== + The following table shows the ``schema_name`` parameters: .. list-table:: @@ -59,7 +63,8 @@ The following table shows the ``schema_name`` parameters: - The name of the schema to create. Examples -=========== +======== + This section includes the following examples: .. contents:: @@ -68,7 +73,8 @@ This section includes the following examples: Creating a Schema --------------------- +----------------- + The following example shows the syntax for creating a schema: .. code-block:: postgres @@ -80,7 +86,8 @@ The following example shows the syntax for creating a schema: SELECT * FROM staging.users; Altering the Default Schema for a Role ------------------------------------------ +-------------------------------------- + The following example shows the syntax for altering the default schema for a role: .. code-block:: postgres diff --git a/reference/sql/sql_statements/ddl_commands/create_table.rst b/reference/sql/sql_statements/ddl_commands/create_table.rst index b660e442c..111e15229 100644 --- a/reference/sql/sql_statements/ddl_commands/create_table.rst +++ b/reference/sql/sql_statements/ddl_commands/create_table.rst @@ -1,29 +1,29 @@ .. _create_table: -***************** +************ CREATE TABLE -***************** +************ The ``CREATE TABLE`` statement is used to create a new table in an existing database. -.. tip:: * To create a table based on the result of a select query, see :ref:`CREATE TABLE AS `. * To create a table based on files like Parquet and ORC, see :ref:`CREATE FOREIGN TABLE ` - +See also: :ref:`CREATE TABLE AS `, :ref:`CREATE FOREIGN TABLE ` +.. contents:: :local: :depth: 1 Syntax -========== -The following is the correct syntax for creating a table: +====== .. code-block:: postgres create_table_statement ::= - CREATE [ OR REPLACE ] TABLE [schema_name.]table_name ( - { column_def [, ...] 
} - ) - [ CLUSTER BY { column_name [, ...] } ] - ; + CREATE [ OR REPLACE ] TABLE [.] + { + ( [, ...] [{NULL | NOT NULL}] + | LIKE [INCLUDE PERMISSIONS] + } + [ CLUSTER BY [, ...] ] schema_name ::= identifier @@ -32,16 +32,15 @@ The following is the correct syntax for creating a table: column_def :: = { column_name type_name [ default ] [ column_constraint ] } column_name ::= identifier - - column_constraint ::= - { NOT NULL | NULL } + default ::= DEFAULT default_value | IDENTITY [ ( start_with [ , increment_by ] ) ] Parameters -============ +========== + The following parameters can be used when creating a table: .. list-table:: @@ -51,37 +50,47 @@ The following parameters can be used when creating a table: * - Parameter - Description * - ``OR REPLACE`` - - Creates a new tables and overwrites any existing table by the same name. Does not return an error if the table already exists. ``CREATE OR REPLACE`` does not check the table contents or structure, only the table name. + - Creates a new table and overwrites any existing table by the same name. Does not return an error if the table already exists. ``CREATE OR REPLACE`` does not check table contents or structure, only the table name * - ``schema_name`` - - The name of the schema in which to create the table. + - The name of the schema in which to create the table * - ``table_name`` - - The name of the table to create, which must be unique inside the schema. + - The name of the table to create, which must be unique inside the schema * - ``column_def`` - - A comma separated list of column definitions. A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally. + - A comma separated list of column definitions. A minimal column definition includes a name identifier and a datatype. Other column constraints and default values can be added optionally + * - ``LIKE`` + - Duplicates the column structure of an existing table. The newly created table is granted default ``CREATE TABLE`` permissions: ``SELECT``, ``INSERT``, ``DELETE``, ``DDL``, and ``UPDATE`` + * - ``INCLUDE PERMISSIONS`` + - In addition to the default ``CREATE TABLE`` permissions (``SELECT``, ``INSERT``, ``DELETE``, ``DDL``, and ``UPDATE``), the newly created table is granted the source table existing permissions * - ``CLUSTER BY column_name1 ...`` - - A commma separated list of clustering column keys. + A comma separated list of clustering column keys - See :ref:`data_clustering` for more information. - * - ``LIKE`` - - Duplicates the column structure of an existing table. - - + See :ref:`cluster_by` for more information + + +Usage Notes +=========== + +When using ``CREATE TABLE... LIKE``, the permissions from the source table are inherited by the newly created table. To add extra permissions to the new table, you can utilize the ``INCLUDE PERMISSIONS`` clause. + .. _default_values: Default Value Constraints -=============== +========================= -The ``DEFAULT`` value constraint specifies a value to use if one is not defined in an :ref:`insert` or :ref:`copy_from` statement. +The ``DEFAULT`` value constraint specifies a default value to use if none is provided in an :ref:`insert` or :ref:`copy_from` statement. This value can be a literal or ``NULL``. 
.. _default_values: Default Value Constraints -=============== +========================= -The ``DEFAULT`` value constraint specifies a value to use if one is not defined in an :ref:`insert` or :ref:`copy_from` statement. +The ``DEFAULT`` value constraint specifies a default value to use if none is provided in an :ref:`insert` or :ref:`copy_from` statement. This value can be a literal or ``NULL``. It's worth noting that even for nullable columns, you can still explicitly insert a ``NULL`` value using the ``NULL`` keyword, as demonstrated in the example: -The value may be either a literal or **GETDATE()**, which is evaluated at the time the row is created. +.. code-block:: postgres -.. note:: The ``DEFAULT`` constraint only applies if the column does not have a value specified in the :ref:`insert` or :ref:`copy_from` statement. You can still insert a ``NULL`` into an nullable column by explicitly inserting ``NULL``. For example, ``INSERT INTO cool_animals VALUES (1, 'Gnu', NULL)``. + INSERT INTO + cool_animals + VALUES + (1, 'Gnu', NULL); Syntax ---------- -The following is the correct syntax for using the **DEFAULT** value constraints: +------ +The following is the correct syntax for using the **DEFAULT** value constraints: .. code-block:: postgres @@ -92,7 +101,9 @@ The following is the correct syntax for using the **DEFAULT** value constraints: default ::= DEFAULT default_value - | IDENTITY [ ( start_with [ , increment_by ] ) ] + | IDENTITY [ ( start_with [ , increment_by ] ) ] [ check_specification ] + | check_specification [ IDENTITY [ ( start_with [ , increment_by ] ) ] ] + check_specification ::= CHECK( 'CS compression_spec' ) @@ -104,7 +115,8 @@ The following is the correct syntax for using the **DEFAULT** value constraints: .. _identity: Identity ------------------------ +-------- + The ``Identity`` (or sequence) columns can be used for generating key values. Some databases call this ``AUTOINCREMENT``. The **identity** property on a column guarantees that each new row inserted is generated based on the current seed & increment. @@ -126,36 +138,32 @@ The following table describes the identity parameters: - Incremental value that is added to the identity value of the previous row that was loaded. Examples -=========== -This section includes the following examples: +======== .. contents:: :local: :depth: 1 Creating a Standard Table ------------------ -The following is an example of the syntax used to create a standard table: +-------------------------- .. code-block:: postgres CREATE TABLE cool_animals ( id INT NOT NULL, - name varchar(30) NOT NULL, + name text(30) NOT NULL, weight FLOAT, is_aggressive BOOL ); Creating a Table with Default Value Constraints for Some Columns ---------------------------------------------------- -The following is an example of the syntax used to create a table with default value constraints for some columns: - +---------------------------------------------------------------- .. code-block:: postgres CREATE TABLE cool_animals ( id INT NOT NULL, - name varchar(30) NOT NULL, + name text(30) NOT NULL, weight FLOAT, is_aggressive BOOL DEFAULT false NOT NULL ); @@ -163,72 +171,73 @@ .. note:: The nullable/non-nullable constraint appears at the end, after the default option Creating a Table with an Identity Column ---------------------------------------------------- -The following is an example of the syntax used to create a table with an identity (auto-increment) column: - +---------------------------------------- .. code-block:: postgres CREATE TABLE users ( id BIGINT IDENTITY(0,1) NOT NULL , -- Start with 0, increment by 1 - name VARCHAR(30) NOT NULL, - country VARCHAR(30) DEFAULT 'Unknown' NOT NULL + name TEXT(30) NOT NULL, + country TEXT(30) DEFAULT 'Unknown' NOT NULL ); -.. note:: - * Identity columns are supported on ``BIGINT`` columns. - - * Identity does not enforce the uniqueness of values. The identity value can be bypassed by specifying it in an :ref:`insert` command. +.. note:: Identity does not enforce the uniqueness of values. The identity value can be bypassed by specifying it in an :ref:`insert` command.
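For instance, with the ``users`` table above, omitting ``id`` lets the identity generate it, while naming ``id`` explicitly bypasses it (a hedged sketch; the inserted values are illustrative):

.. code-block:: postgres

   INSERT INTO users (name, country) VALUES ('Richard', 'United Kingdom'); -- id is generated automatically
   INSERT INTO users (id, name, country) VALUES (1000, 'Erlich', 'Canada'); -- bypasses the identity
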
-Creating a Table from a SELECT Query ----------------------------------------- -The following is an example of the syntax used to create a table from a SELECT query: +Creating a Table from a ``SELECT`` Query +---------------------------------------- .. code-block:: postgres - CREATE TABLE users_uk AS SELECT * FROM users WHERE country = 'United Kingdom'; + CREATE TABLE + users_uk AS + SELECT + * + FROM + users + WHERE + country = 'United Kingdom'; -For more information on creating a new table from the results of a SELECT query, see :ref:`CREATE TABLE AS `. +For more information on creating a new table from the results of a ``SELECT`` query, see :ref:`CREATE TABLE AS `. Creating a Table with a Clustering Key ----------------------------------------------- -When data in a table is stored in a sorted order, the sorted columns are considered clustered. Good clustering can have a significant positive impact on performance. +-------------------------------------- -In the following example, we expect the ``start_date`` column to be naturally clustered, as new users sign up and get a newer start date. -When the clustering key is set, if the incoming data isn't naturally clustered, it will be clustered by SQream DB during insert or bulk load. +When data within a table is organized in a sorted manner, the columns responsible for this sorting are termed as clustered. Effective clustering can greatly enhance performance. For instance, in the scenario provided, the ``start_date`` column is anticipated to naturally cluster due to the continuous influx of new users and their corresponding start dates. However, in cases where the clustering of incoming data isn't inherent, SQreamDB will automatically cluster it during insertion or bulk loading processes once the clustering key is set. The following is an example of the syntax used to create a table with a clustering key: .. code-block:: postgres CREATE TABLE users ( - name VARCHAR(30) NOT NULL, + name TEXT(30) NOT NULL, start_date datetime not null, - country VARCHAR(30) DEFAULT 'Unknown' NOT NULL + country TEXT(30) DEFAULT 'Unknown' NOT NULL ) CLUSTER BY start_date; -For more information on data clustering, see :ref:`data_clustering`. +For more information on data clustering, see :ref:`cluster_by`. Duplicating the Column Structure of an Existing Table ------------------ +----------------------------------------------------- Syntax -************ +****** + The following is the correct syntax for duplicating the column structure of an existing table: .. code-block:: postgres - CREATE [OR REPLACE] TABLE table_name + CREATE [OR REPLACE] TABLE <table_name> { - (column_name column_type [{NULL | NOT NULL}] [,...]) - | LIKE source_table_name + (<column_name> <column_type> [{NULL | NOT NULL}] [,...]) + | LIKE <source_table_name> [INCLUDE PERMISSIONS] } [CLUSTER BY ...] ; Examples -************** +******** + This section includes the following examples of duplicating the column structure of an existing table using the ``LIKE`` clause: .. 
contents:: @@ -236,15 +245,17 @@ This section includes the following examples of duplicating the column structure :depth: 3 Creating a Table Using an Explicit Column List -~~~~~~~~~~~~ -The following is an example of creating a table using an explict column list: +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following is an example of creating a table using an explicit column list: .. code-block:: postgres CREATE TABLE t1(x int default 0 not null, y text(10) null); Creating a Second Table Based on the Structure of Another Table -~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Either of the following examples can be used to create a second table based on the structure of another table. **Example 1** @@ -261,9 +272,10 @@ Either of the following examples can be used to create a second table based on t The generated output of both of the statements above is identical. -Creating a Table based on External Tables and Views -~~~~~~~~~~~~ -The following is example of creating a table based on external tables and views: +Creating a Table Based on Foreign Tables and Views +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following is an example of creating a table based on foreign tables and views: .. code-block:: postgres @@ -271,8 +283,27 @@ The following is an example of creating a table based on foreign tables and views: CREATE VIEW v as SELECT x+1,y,y || 'abc' from t1; CREATE TABLE t3 LIKE v; -When duplicating the column structure of an existing table, the target table of the ``LIKE`` clause can be a regular or an external table, or a view. +When duplicating the column structure of an existing table, the target of the ``LIKE`` clause can be a native table, an external table, or a view. + +The following table describes which properties are copied from the target table to the newly created table: + ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| **Property** | **Native Table** | **External Table** | **View** | ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| Column names | Copied | Copied | Copied | ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| Column types | Copied | Copied | Copied | ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| ``NULL``/``NOT NULL`` | Copied | Copied | Copied | ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| ``text`` length constraints | Copied | Copied | Does not exist in source object | ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| Compression specification | Copied | Does not exist in source object | Does not exist in source object | ++-----------------------------+------------------+---------------------------------+---------------------------------+ +| Default/identity | Copied | Does not exist in source object | Does not exist in source object | ++-----------------------------+------------------+---------------------------------+---------------------------------+ Permissions ============= -The role must have the ``CREATE`` permission at the schema level. + +``CREATE TABLE`` requires ``CREATE`` permission at the schema level. 
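For reference, a minimal sketch of granting this permission, assuming SQreamDB's ``GRANT ... ON SCHEMA`` form; the role and schema names are illustrative:

.. code-block:: postgres

   GRANT CREATE ON SCHEMA public TO etl_role;
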
diff --git a/reference/sql/sql_statements/ddl_commands/create_table_as.rst b/reference/sql/sql_statements/ddl_commands/create_table_as.rst index a7f9dd4d4..7ffd565df 100644 --- a/reference/sql/sql_statements/ddl_commands/create_table_as.rst +++ b/reference/sql/sql_statements/ddl_commands/create_table_as.rst @@ -3,7 +3,7 @@ ***************** CREATE TABLE AS ***************** - + The ``CREATE TABLE AS`` commands creates a new table from the result of a select query. @@ -64,6 +64,8 @@ This section includes the following examples: :local: :depth: 1 +.. warning:: The ``SELECT`` statement decrypts information by default. When executing ``CREATE TABLE AS SELECT``, encrypted information will appear as clear text in the newly created table. + Creating a Copy of a Foreign Table or View --------------------------------------------------------------------------- diff --git a/reference/sql/sql_statements/ddl_commands/create_view.rst b/reference/sql/sql_statements/ddl_commands/create_view.rst index 9812ddeec..4c6a98427 100644 --- a/reference/sql/sql_statements/ddl_commands/create_view.rst +++ b/reference/sql/sql_statements/ddl_commands/create_view.rst @@ -3,7 +3,7 @@ ***************** CREATE VIEW ***************** - + ``CREATE VIEW`` creates a new view in an existing database. A view is a virtual table. .. tip:: diff --git a/reference/sql/sql_statements/ddl_commands/drop_clustering_key.rst b/reference/sql/sql_statements/ddl_commands/drop_clustering_key.rst index 41b10bdfa..42c457c3f 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_clustering_key.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_clustering_key.rst @@ -1,34 +1,26 @@ .. _drop_clustering_key: -********************** +******************* DROP CLUSTERING KEY -********************** - +******************* + ``DROP CLUSTERING KEY`` drops all clustering keys in a table. -Read our :ref:`data_clustering` guide for more information. - -See also: :ref:`cluster_by`, :ref:`create_table`. - - -Permissions -============= +Read our :ref:`cluster_by` guide for more information. -The role must have the ``DDL`` permission at the database or table level. +See also :ref:`create_table` Syntax -========== +====== .. code-block:: postgres - alter_table_rename_table_statement ::= - ALTER TABLE [schema_name.]table_name DROP CLUSTERING KEY - ; + ALTER TABLE [schema_name.]table_name DROP CLUSTERING KEY - table_name ::= identifier + table_name ::= identifier Parameters -============ +========== .. list-table:: :widths: auto @@ -37,28 +29,29 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The schema name for the table. Defaults to ``public`` if not specified. + - The schema name for the table. Defaults to ``public`` if not specified * - ``table_name`` - - The table name to apply the change to. + - The table name to apply the change to Usage notes -================= - -Removing clustering keys does not affect existing data. - -To force data to re-cluster, the table has to be recreated (i.e. with :ref:`create_table_as`). - +=========== +* Removing clustering keys does not affect existing data +* To force data to re-cluster, the table has to be recreated (i.e. with :ref:`create_table_as`) Examples -=========== +======== -Dropping clustering keys in a table ------------------------------------------ +Dropping Clustering Keys in a Table +----------------------------------- .. 
code-block:: postgres - ALTER TABLE public.users DROP CLUSTERING KEY + ALTER TABLE + public.users DROP CLUSTERING KEY +Permissions +=========== +The role must have the ``DDL`` permission at the database or table level. diff --git a/reference/sql/sql_statements/ddl_commands/drop_column.rst b/reference/sql/sql_statements/ddl_commands/drop_column.rst index 391367e16..0fa54c049 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_column.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_column.rst @@ -1,35 +1,26 @@ .. _drop_column: -********************** +*********** DROP COLUMN -********************** - +*********** + ``DROP COLUMN`` can be used to remove columns from a table. -Permissions -============= - -The role must have the ``DDL`` permission at the database or table level. - Syntax -========== +====== .. code-block:: postgres - alter_table_drop_column_statement ::= - ALTER TABLE [schema_name.]table_name DROP COLUMN column_name - ; - - table_name ::= identifier - - schema_name ::= identifier - - column_name ::= identifier + ALTER TABLE [schema_name.]table_name DROP COLUMN column_name + schema_name ::= identifier + + table_name ::= identifier + column_name ::= identifier Parameters -============ +========== .. list-table:: :widths: auto @@ -38,26 +29,32 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The schema name for the table. Defaults to ``public`` if not specified. + - The schema name for the table. Defaults to ``public`` if not specified * - ``table_name`` - - The table name to apply the change to. + - The table name to apply the change to * - ``column_name`` - - The column to remove. + - The column to remove Examples -=========== +======== -Removing a column ------------------------------------------ +Removing a Column +----------------- .. code-block:: postgres - -- Remove the 'weight' column - ALTER TABLE users DROP COLUMN weight; + ALTER TABLE + users DROP COLUMN weight; -Removing a column with a quoted identifier name ----------------------------------------------------- +Removing a Column with a Quoted Identifier Name +----------------------------------------------- .. code-block:: postgres - ALTER TABLE users DROP COLUMN "Weight in kilograms"; \ No newline at end of file + ALTER TABLE + users DROP COLUMN "Weight in kilograms"; + +Permissions +=========== + +The role must have the ``DDL`` permission at the database or table level. \ No newline at end of file diff --git a/reference/sql/sql_statements/ddl_commands/drop_database.rst b/reference/sql/sql_statements/ddl_commands/drop_database.rst index 0cfbbcd30..76b75c5b3 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_database.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_database.rst @@ -3,7 +3,7 @@ ********************** DROP DATABASE ********************** - + ``DROP DATABASE`` can be used to remove a database and all of its objects. Permissions @@ -17,7 +17,7 @@ Syntax .. code-block:: postgres drop_database_statement ::= - DROP DATABASE database_name + DROP DATABASE [ IF EXISTS ] database_name ; database_name ::= identifier @@ -35,7 +35,9 @@ Parameters - Description * - ``database_name`` - The name of the database to drop. This can not be the current database in use. - + * - ``IF EXISTS`` + - Drop the database if it exists. No error if the database does not exist. + Examples =========== @@ -60,4 +62,10 @@ The current database in use can't be dropped. Switch to another database first. 
raviga=> \c master master=> DROP DATABASE raviga; - executed \ No newline at end of file + executed + +.. code-block:: sql + + DROP DATABASE IF EXISTS green_database; + + Status: Ended successfully \ No newline at end of file diff --git a/reference/sql/sql_statements/ddl_commands/drop_function.rst b/reference/sql/sql_statements/ddl_commands/drop_function.rst index 726085f9f..98b957ad8 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_function.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_function.rst @@ -3,7 +3,7 @@ ********************** DROP FUNCTION ********************** - + ``DROP FUNCTION`` can be used to remove a user defined function. Permissions diff --git a/reference/sql/sql_statements/ddl_commands/drop_schema.rst b/reference/sql/sql_statements/ddl_commands/drop_schema.rst index c10ab7f8f..79597b49a 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_schema.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_schema.rst @@ -3,14 +3,14 @@ ********************** DROP SCHEMA ********************** - + ``DROP SCHEMA`` can be used to remove a schema. The schema has to be empty before removal. SQream DB does not support dropping a schema with objects. -See also: :ref:`create_schema`, :ref:`alter_default_schema`. +See also: :ref:`create_schema`, :ref:`alter_default_schema`, and :ref:`rename_schema`. Permissions ============= @@ -74,4 +74,4 @@ To drop the schema, drop the schema's tables first, and then drop the schema: t=> DROP TABLE test.bar; executed t=> DROP SCHEMA test; - executed \ No newline at end of file + executed diff --git a/reference/sql/sql_statements/ddl_commands/drop_table.rst b/reference/sql/sql_statements/ddl_commands/drop_table.rst index e2a704ff8..53fdc8445 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_table.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_table.rst @@ -3,7 +3,7 @@ ********************** DROP TABLE ********************** - + ``DROP TABLE`` can be used to remove a table and all of its contents. Permissions diff --git a/reference/sql/sql_statements/ddl_commands/drop_view.rst b/reference/sql/sql_statements/ddl_commands/drop_view.rst index e93629ab4..6e18254e6 100644 --- a/reference/sql/sql_statements/ddl_commands/drop_view.rst +++ b/reference/sql/sql_statements/ddl_commands/drop_view.rst @@ -3,7 +3,7 @@ ********************** DROP VIEW ********************** - + ``DROP VIEW`` can be used to remove a view. Because a view is logical, this does not affect any data in any of the referenced tables. diff --git a/reference/sql/sql_statements/ddl_commands/rename_column.rst b/reference/sql/sql_statements/ddl_commands/rename_column.rst index f91933f71..51e1e035a 100644 --- a/reference/sql/sql_statements/ddl_commands/rename_column.rst +++ b/reference/sql/sql_statements/ddl_commands/rename_column.rst @@ -1,35 +1,28 @@ .. _rename_column: -********************** +************* RENAME COLUMN -********************** +************* -``RENAME COLUMN`` can be used to rename columns in a table. - -Permissions -============= - -The role must have the ``DDL`` permission at the database or table level. +The ``RENAME COLUMN`` command can be used to rename columns in a table. Syntax -========== +====== .. 
code-block:: postgres - alter_table_rename_column_statement ::= - ALTER TABLE [schema_name.]table_name RENAME COLUMN current_name TO new_name - ; + ALTER TABLE [schema_name.]table_name RENAME COLUMN current_name TO new_name - table_name ::= identifier - - schema_name ::= identifier + schema_name ::= identifier + + table_name ::= identifier - current_name ::= identifier + current_name ::= identifier - new_name ::= identifier + new_name ::= identifier Parameters -============ +========== .. list-table:: :widths: auto @@ -38,28 +31,34 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The schema name for the table. Defaults to ``public`` if not specified. + - The schema name for the table. Defaults to ``public`` if not specified * - ``table_name`` - - The table name to apply the change to. + - The table name to apply the change to * - ``current_name`` - - The column to rename. + - The column to rename * - ``new_name`` - - The new column name. + - The new column name Examples -=========== +======== -Renaming a column ------------------------------------------ +Renaming a Column +----------------- .. code-block:: postgres - -- Remove the 'weight' column - ALTER TABLE users RENAME COLUMN weight TO mass; + ALTER TABLE + users RENAME COLUMN weight TO mass; -Renaming a quoted name -------------------------- +Renaming a Quoted Name +---------------------- .. code-block:: postgres - ALTER TABLE users RENAME COLUMN "mass" TO "Mass (Kilograms); \ No newline at end of file + ALTER TABLE + users RENAME COLUMN "mass" TO "Mass (Kilograms)"; + +Permissions +=========== + +The role must have the ``DDL`` permission at the database or table level. \ No newline at end of file diff --git a/reference/sql/sql_statements/ddl_commands/rename_schema.rst b/reference/sql/sql_statements/ddl_commands/rename_schema.rst new file mode 100644 index 000000000..0024d59c3 --- /dev/null +++ b/reference/sql/sql_statements/ddl_commands/rename_schema.rst @@ -0,0 +1,54 @@ +.. _rename_schema: + +************* +RENAME SCHEMA +************* + +Renaming schemas is mainly used for improving the clarity and organization of a database by giving schemas more meaningful or concise names. + +.. warning:: Renaming a schema can void existing views that use this schema. See more about :ref:`recompiling views `. + +Permissions +=========== + +The role must have the ``DDL`` permission at the database level. + +Syntax +====== + +.. code-block:: postgres + + alter_schema_rename_schema_statement ::= + ALTER SCHEMA [database_name.]current_name RENAME TO new_name + ; + + current_name ::= identifier + + database_name ::= identifier + + new_name ::= identifier + +Parameters +========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``database_name`` + - The database name for the schema. Defaults to ``master`` if not specified. + * - ``current_name`` + - The schema name to apply the change to. + * - ``new_name`` + - The new schema name. + +Examples +======== + +.. code-block:: postgres + + ALTER SCHEMA master.staging RENAME TO staging_new; + + 
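Per the syntax above, the database prefix may be omitted when the schema belongs to the database currently in use:

.. code-block:: postgres

   ALTER SCHEMA staging RENAME TO staging_new;
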
diff --git a/reference/sql/sql_statements/ddl_commands/rename_table.rst b/reference/sql/sql_statements/ddl_commands/rename_table.rst index e24ba6efe..f426dfdc3 100644 --- a/reference/sql/sql_statements/ddl_commands/rename_table.rst +++ b/reference/sql/sql_statements/ddl_commands/rename_table.rst @@ -1,35 +1,28 @@ .. _rename_table: -********************** +************ RENAME TABLE -********************** - -``RENAME TABLE`` can be used to rename a table. +************ + +``RENAME TABLE`` can be used to rename a table. .. warning:: Renaming a table can void existing views that use this table. See more about :ref:`recompiling views `. -Permissions -============= - -The role must have the ``DDL`` permission at the database or table level. - Syntax -========== +====== .. code-block:: postgres - alter_table_rename_table_statement ::= - ALTER TABLE [schema_name.]current_name RENAME TO new_name - ; + ALTER TABLE [schema_name.]current_name RENAME TO new_name - current_name ::= identifier - - schema_name ::= identifier - - new_name ::= identifier + schema_name ::= identifier + + current_name ::= identifier + + new_name ::= identifier Parameters -============ +========== .. list-table:: :widths: auto @@ -38,20 +31,24 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The schema name for the table. Defaults to ``public`` if not specified. + - The schema name for the table. Defaults to ``public`` if not specified * - ``current_name`` - - The table name to apply the change to. + - The table name to apply the change to * - ``new_name`` - - The new table name. + - The new table name Examples -=========== +======== -Renaming a table ------------------------------------------ +Renaming a Table +---------------- .. code-block:: postgres - ALTER TABLE public.users RENAME TO former_users; + ALTER TABLE + public.users RENAME TO former_users; +Permissions +=========== +The role must have the ``DDL`` permission at the database or table level. diff --git a/reference/sql/sql_statements/dml_commands/copy_from.rst b/reference/sql/sql_statements/dml_commands/copy_from.rst index 05d5083aa..713ec6563 100644 --- a/reference/sql/sql_statements/dml_commands/copy_from.rst +++ b/reference/sql/sql_statements/dml_commands/copy_from.rst @@ -1,10 +1,10 @@ .. _copy_from: -********************** +********* COPY FROM -********************** +********* -``COPY ... FROM`` is a statement that allows loading data from files on the filesystem and importing them into SQream tables. This is the recommended way for bulk loading CSV files into SQream DB. In general, ``COPY`` moves data between filesystem files and SQream DB tables. +``COPY ... FROM`` is a statement that allows loading data from files on the filesystem and importing them into SQreamDB tables. This is the recommended way for bulk loading CSV files into SQreamDB. In general, ``COPY`` moves data between filesystem files and SQreamDB tables. .. note:: * Learn how to migrate from CSV files in the :ref:`csv` guide @@ -12,22 +12,22 @@ COPY FROM * To load Parquet or ORC files, see :ref:`CREATE FOREIGN TABLE` Permissions -============= +=========== The role must have the ``INSERT`` permission to the destination table. Syntax -========== +====== .. code-block:: postgres - COPY [schema name.]table_name + COPY [schema_name.]table_name [ (<column_name> [, ...]) ] FROM WRAPPER fdw_name OPTIONS ( [ copy_from_option [, ...] ] ) - ; + schema_name ::= identifier @@ -44,7 +44,7 @@ Syntax | LIMIT = { limit } | DELIMITER = '{ delimiter }' - + | RECORD_DELIMITER = '{ record delimiter }' | ERROR_LOG = '{ local filepath }' @@ -60,13 +60,15 @@ Syntax | AWS_ID = '{ AWS ID }' | AWS_SECRET = '{ AWS Secret }' + + | DELETE_SOURCE_ON_SUCCESS = { true | false } offset ::= positive integer limit ::= positive integer delimiter ::= string - + record delimiter ::= string error count ::= integer @@ -87,7 +89,7 @@ Syntax .. _copy_from_config_options: Elements -============ +======== .. 
list-table:: :widths: auto @@ -102,12 +104,12 @@ Elements - - Table to copy data into * - ``QUOTE`` - - " + - ``"`` - - - Specifies an alternative quote character. The quote character must be a single, 1-byte printable ASCII character, and the equivalent octal syntax of the copy command can be used. The quote character cannot be contained in the field delimiter, the record delimiter, or the null marker. ``QUOTE`` can be used with ``csv_fdw`` in **COPY FROM** and foreign tables. - * - ``name_fdw`` + - Specifies an alternative quote character. The quote character must be a single, 1-byte printable ASCII character, and the equivalent octal syntax of the copy command can be used. The quote character cannot be contained in the field delimiter, the record delimiter, or the null marker. ``QUOTE`` can be used with ``csv_fdw`` in ``COPY FROM`` and foreign tables. The following characters cannot be an alternative quote character: ``"-.:\\0123456789abcdefghijklmnopqrstuvwxyzN"`` + * - ``fdw_name`` - - - ``csv_fdw``, ``orc_fdw``, or ``parquet_fdw`` + - ``csv_fdw``, ``orc_fdw``, ``parquet_fdw``, ``json_fdw``, or ``avro_fdw`` - The name of the Foreign Data Wrapper to use * - ``LOCATION`` - None @@ -133,7 +135,7 @@ Elements - No error log - - - When used, the ``COPY`` process will write error information from unparsable rows to the file specified by this parameter. + When used, the ``COPY`` process will write error information from unparsable rows to the file specified by this parameter. ``ERROR_LOG`` requires ``CONTINUE_ON_ERROR`` to be set to ``true`` * If an existing file path is specified, it will be overwritten. @@ -155,7 +157,7 @@ Elements * - ``CONTINUE_ON_ERROR`` - ``false`` - - true, false + - ``true`` | ``false`` - Specifies if errors should be ignored or skipped. When set to ``true``, the transaction will continue despite rejected data. @@ -176,6 +178,10 @@ Elements - None - - Specifies the authentication details for secured S3 buckets + * - ``DELETE_SOURCE_ON_SUCCESS`` + - ``false`` + - ``true`` | ``false`` + - When set to ``true``, the source file or files associated with the target path will be deleted after a successful completion of the ``COPY FROM`` operation. File deletion will not occur in the case of unsuccessful ``COPY FROM`` operations, such as when a user lacks delete permissions on their operating system. It's important to note that this parameter cannot be used concurrently with the ``OFFSET``, ``ERROR_LOG``, ``REJECTED_DATA``, ``ERROR_COUNT``, and ``LIMIT`` parameters. This parameter is supported for S3, HDFS, and GCP Object Storage. .. _copy_date_parsers: @@ -254,25 +260,21 @@ Supported Date Formats .. _field_delimiters: Supported Field Delimiters -===================================================== +========================== Field delimiters can be one or more characters. Customizing Quotations Using Alternative Characters ----------------------------- - -Syntax Example 1 - Customizing Quotations Using Alternative Characters -************ +----------------------------------------------------- -The following is the correct syntax for customizing quotations using alternative characters: +Syntax: .. 
code-block:: postgres - copy t from wrapper csv_fdw options (location = '/tmp/source_file.csv', quote='@'); - copy t to wrapper csv_fdw options (location = '/tmp/destination_file.csv', quote='@'); + COPY t FROM wrapper csv_fdw OPTIONS (location = '/tmp/source_file.csv', quote='@'); + COPY t TO wrapper csv_fdw OPTIONS (location = '/tmp/destination_file.csv', quote='@'); -Usage Example 1 - Customizing Quotations Using Alternative Characters -************ +Example: The following is an example of a line taken from a CSV file when customizing quotations using an alternative character: @@ -281,18 +283,17 @@ The following is an example of line taken from a CSV when customizing quotations Pepsi-"Cola",@Coca-"Cola"@,Sprite,Fanta -Syntax Example 2 - Customizing Quotations Using ASCII Character Codes -************ +Customizing Quotations Using ASCII Character Codes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The following is the correct syntax for customizing quotations using ASCII character codes: +Syntax: .. code-block:: postgres copy t from wrapper csv_fdw options (location = '/tmp/source_file.csv', quote=E'\064'); copy t to wrapper csv_fdw options (location = '/tmp/destination_file.csv', quote=E'\064'); -Usage Example 2 - Customizing Quotations Using ASCII Character Codes -************ +Example: The following is an example of a line taken from a CSV file when customizing quotations using an ASCII character code: @@ -300,92 +301,328 @@ The following is an example of line taken from a CSV when customizing quotations Pepsi-"Cola",@Coca-"Cola"@,Sprite,Fanta - - Multi-Character Delimiters ----------------------------------- +-------------------------- -SQream DB supports multi-character field delimiters, sometimes found in non-standard files. +SQreamDB supports multi-character field delimiters, sometimes found in non-standard files. A multi-character delimiter can be specified. For example, ``DELIMITER '%%'``, ``DELIMITER '{~}'``, etc. Printable Characters ------------------------ +-------------------- + +All printable ASCII characters (except for ``N``) can be used as a delimiter without special syntax. The default CSV field delimiter is a comma (``,``). + + +The following table shows the supported printable ASCII characters: + ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| **Character** | **Description** | **ASCII** | **Octal** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| (Space) | Space | 32 | 40 | 20 | 100000 | | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ! | Exclamation Mark | 33 | 41 | 21 | 100001 | ! | ! 
| ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| # | Hash or Number | 35 | 43 | 23 | 100011 | # | # | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| $ | Dollar Sign | 36 | 44 | 24 | 100100 | $ | $ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| % | Percentage | 37 | 45 | 25 | 100101 | % | % | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| & | Ampersand | 38 | 46 | 26 | 100110 | & | & | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ( | Left Parenthesis | 40 | 50 | 28 | 101000 | ( | ( | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ) | Right Parenthesis | 41 | 51 | 29 | 101001 | ) | ) | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| \*\ | Asterisk | 42 | 52 | 2A | 101010 | * | * | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| \+\ | Plus Sign | 43 | 53 | 2B | 101011 | + | + | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| , | Comma | 44 | 54 | 2C | 101100 | , | , | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| / | Slash | 47 | 57 | 2F | 101111 | / | / | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ; | Semicolon | 59 | 73 | 3B | 111011 | ; | ; | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| < | Less Than | 60 | 74 | 3C | 111100 | < | < | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| = | Equals Sign | 61 | 75 | 3D | 111101 | = | = | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| > | Greater Than | 62 | 76 | 3E | 111110 | > | > | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ? | Question Mark | 63 | 77 | 3F | 111111 | ? | ? 
| ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| @ | At Sign | 64 | 100 | 40 | 1000000 | @ | @ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| [ | Left Square Bracket | 91 | 133 | 5B | 1011011 | [ | [ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| \\ | Backslash | 92 | 134 | 5C | 1011100 | \&\#92\; | \ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ] | Right Square Bracket | 93 | 135 | 5D | 1011101 | ] | ] | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ^ | Caret or Circumflex | 94 | 136 | 5E | 1011110 | ^ | &hat; | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| _ | Underscore | 95 | 137 | 5F | 1011111 | _ | _ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ` | Grave Accent | 96 | 140 | 60 | 1100000 | ` | ` | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| { | Left Curly Bracket | 123 | 173 | 7B | 1111011 | { | { | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| \|\ | Vertical Bar | 124 | 174 | 7C | 1111100 | | | | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| } | Right Curly Bracket | 125 | 175 | 7D | 1111101 | } | } | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ~ | Tilde | 126 | 176 | 7E | 1111110 | ~ | ˜ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 58 | : | Colon | 72 | 3A | 111010 | : | : | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 65 | A | A | 101 | 41 | 1000001 | A | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 66 | B | B | 102 | 42 | 1000010 | B | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 67 | C | C | 103 | 43 | 1000011 | C | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 68 | D | D | 104 | 44 | 1000100 | D | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 69 | E | E | 105 | 45 | 1000101 | E | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 70 | F | F | 106 | 46 | 1000110 | F | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 71 | G | G | 107 | 47 | 1000111 | G | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 72 | H | H | 110 | 48 | 1001000 | H | | 
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 73 | I | I | 111 | 49 | 1001001 | I | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 74 | J | J | 112 | 4A | 1001010 | J | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 75 | K | K | 113 | 4B | 1001011 | K | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 76 | L | L | 114 | 4C | 1001100 | L | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 77 | M | M | 115 | 4D | 1001101 | M | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 79 | O | O | 117 | 4F | 1001111 | O | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 80 | P | P | 120 | 50 | 1010000 | P | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 81 | Q | Q | 121 | 51 | 1010001 | Q | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 82 | R | R | 122 | 52 | 1010010 | R | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 83 | S | S | 123 | 53 | 1010011 | S | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 84 | T | T | 124 | 54 | 1010100 | T | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 85 | U | U | 125 | 55 | 1010101 | U | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 86 | V | V | 126 | 56 | 1010110 | V | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 87 | W | W | 127 | 57 | 1010111 | W | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 88 | X | X | 130 | 58 | 1011000 | X | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 89 | Y | Y | 131 | 59 | 1011001 | Y | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 90 | Z | Z | 132 | 5A | 1011010 | Z | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 92 | \\ | Backslash | 134 | 5C | 01011100 | \&\#92\; | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 97 | a | a | 141 | 61 | 1100001 | a | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 98 | b | b | 142 | 62 | 1100010 | b | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 99 | c | c | 143 | 63 | 1100011 | c | | 
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 100 | d | d | 144 | 64 | 1100100 | d | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 101 | e | e | 145 | 65 | 1100101 | e | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 102 | f | f | 146 | 66 | 1100110 | f | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 103 | g | g | 147 | 67 | 1100111 | g | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 104 | h | h | 150 | 68 | 1101000 | h | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 105 | i | i | 151 | 69 | 1101001 | i | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 106 | j | j | 152 | 6A | 1101010 | j | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 107 | k | k | 153 | 6B | 1101011 | k | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 108 | l | l | 154 | 6C | 1101100 | l | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 109 | m | m | 155 | 6D | 1101101 | m | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 110 | n | n | 156 | 6E | 1101110 | n | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 111 | o | o | 157 | 6F | 1101111 | o | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 112 | p | p | 160 | 70 | 1110000 | p | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 113 | q | q | 161 | 71 | 1110001 | q | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 114 | r | r | 162 | 72 | 1110010 | r | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 115 | s | s | 163 | 73 | 1110011 | s | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 116 | t | t | 164 | 74 | 1110100 | t | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 117 | u | u | 165 | 75 | 1110101 | u | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 118 | v | v | 166 | 76 | 1110110 | v | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 119 | w | w | 167 | 77 | 1110111 | w | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 120 | x | x | 170 | 78 | 1111000 | x | | 
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 121 | y | y | 171 | 79 | 1111001 | y | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| 122 | z | z | 172 | 7A | 1111010 | z | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ -Any printable ASCII character (or characters) can be used as a delimiter without special syntax. The default CSV field delimiter is a comma (``,``). - -A printable character is any ASCII character in the range 32 - 126. - -:ref:`Literal quoting rules` apply with delimiters. For example, to use ``'`` as a field delimiter, use ``DELIMITER ''''`` Non-Printable Characters ----------------------------- - -A non-printable character (1 - 31, 127) can be used in its octal form. +------------------------ A tab can be specified by escaping it, for example ``\t``. Other non-printable characters can be specified using their octal representations, by using the ``E'\000'`` format, where ``000`` is the octal value of the character. For example, ASCII character ``15``, known as "shift in", can be specified using ``E'\017'``. +The following table shows the supported non-printable ASCII characters: + ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| **Character** | **Description** | **Octal** | **ASCII** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| NUL | Null | 0 | 0 | 0 | 0 | � | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| SOH | Start of Heading | 1 | 1 | 1 | 1 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| STX | Start of Text | 2 | 2 | 2 | 10 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ETX | End of Text | 3 | 3 | 3 | 11 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| EOT | End of Transmission | 4 | 4 | 4 | 100 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ENQ | Enquiry | 5 | 5 | 5 | 101 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ACK | Acknowledge | 6 | 6 | 6 | 110 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| BEL | Bell | 7 | 7 | 7 | 111 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| BS | Backspace | 10 | 8 | 8 | 1000 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| HT | Horizontal Tab | 11 | 9 | 9 | 1001 | | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| VT | Vertical Tab | 13 | 11 | 0B | 1011 | | | 
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| FF | NP Form Feed, New Page | 14 | 12 | 0C | 1100 | | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| SO | Shift Out | 16 | 14 | 0E | 1110 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| SI | Shift In | 17 | 15 | 0F | 1111 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| DLE | Data Link Escape | 20 | 16 | 10 | 10000 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| DC1 | Device Control 1 | 21 | 17 | 11 | 10001 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| DC2 | Device Control 2 | 22 | 18 | 12 | 10010 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| DC3 | Device Control 3 | 23 | 19 | 13 | 10011 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| DC4 | Device Control 4 | 24 | 20 | 14 | 10100 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| NAK | Negative Acknowledge | 25 | 21 | 15 | 10101 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| SYN | Synchronous Idle | 26 | 22 | 16 | 10110 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ETB | End of Transmission Block | 27 | 23 | 17 | 10111 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| CAN | Cancel | 30 | 24 | 18 | 11000 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| EM | End of Medium | 31 | 25 | 19 | 11001 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| SUB | Substitute | 32 | 26 | 1A | 11010 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ESC | Escape | 33 | 27 | 1B | 11011 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| FS | File Separator | 34 | 28 | 1C | 11100 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| GS | Group Separator | 35 | 29 | 1D | 11101 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| RS | Record Separator | 36 | 30 | 1E | 11110 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| US | Unit Separator | 37 | 31 | 1F | 11111 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ +| DEL | 
Delete | 177 | 127 | 7F | 1111111 |  | | ++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+ + .. _capturing_rejected_rows: Unsupported Field Delimiters -========================== +============================ + The following ASCII field delimiters (octal range 001 - 176) are not supported: -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| **Character** | **Decimal** | **Symbol** | **Character** | **Decimal** | **Symbol** | **Character** | **Decimal** | **Symbol** | -+===============+=============+============+===============+=============+============+===============+=============+============+ -| - | 45 | 55 | b | 98 | 142 | q | 113 | 161 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| . | 46 | 56 | c | 99 | 143 | r | 114 | 162 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| : | 58 | 72 | d | 100 | 144 | s | 115 | 163 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| \ | 92 | 134 | e | 101 | 145 | t | 116 | 164 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 0 | 48 | 60 | f | 102 | 146 | u | 117 | 165 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 1 | 49 | 61 | g | 103 | 147 | v | 118 | 166 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 2 | 50 | 62 | h | 104 | 150 | w | 119 | 167 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 3 | 51 | 63 | i | 105 | 151 | x | 120 | 170 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 4 | 52 | 64 | j | 106 | 152 | y | 121 | 171 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 5 | 53 | 65 | k | 107 | 153 | z | 122 | 172 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 6 | 54 | 66 | l | 108 | 154 | N | 78 | 116 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 7 | 55 | 67 | m | 109 | 155 | 10 | 49 | 12 | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ -| 8 | 56 | 70 | n | 110 | 156 | 13 | 49 | 13 | -+---------------+-------------+------------+---------------+-------------+------------+ | | | -| 9 | 57 | 71 | o | 111 | 157 | | | | -+---------------+-------------+------------+---------------+-------------+------------+ | | | -| a | 97 | 141 | p | 112 | 160 | | | | -+---------------+-------------+------------+---------------+-------------+------------+---------------+-------------+------------+ +The following table shows the unsupported ASCII field delimiters: + 
++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| **ASCII** | **Character** | **Description** | **Octal** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 10 | LF | NL Line Feed, New Line | 12 | 0A | 1010 | | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 13 | CR | Carriage Return | 15 | 0D | 1101 | | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 34 | " | Double Quote | 42 | 22 | 100010 | " | " | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 45 | \-\ | Minus Sign | 55 | 2D | 101101 | - | − | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 46 | . | Period | 56 | 2E | 101110 | . | . | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 48 | 0 | Zero | 60 | 30 | 110000 | 0 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 49 | 1 | Number One | 61 | 31 | 110001 | 1 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 50 | 2 | Number Two | 62 | 32 | 110010 | 2 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 51 | 3 | Number Three | 63 | 33 | 110011 | 3 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 52 | 4 | Number Four | 64 | 34 | 110100 | 4 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 53 | 5 | Number Five | 65 | 35 | 110101 | 5 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 54 | 6 | Number Six | 66 | 36 | 110110 | 6 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 55 | 7 | Number Seven | 67 | 37 | 110111 | 7 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 56 | 8 | Number Eight | 70 | 38 | 111000 | 8 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 57 | 9 | Number Nine | 71 | 39 | 111001 | 9 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 58 | : | Colon | 72 | 3A | 111010 | : | : | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 92 | \\ | Backslash | 134 | 5C | 01011100 | \&\#92\; | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 78 | N | N | 116 | 4E | 1001110 | N | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ Capturing Rejected Rows -========================== 
+=======================

 Before the data is processed and stored in columns, the ``COPY`` command parses it.

-Whenever the data can’t be parsed because it is improperly formatted or doesn’t match the data structure, the entire record (or row) will be rejected.
-
-.. image:: /_static/images/copy_from_rejected_rows.png
-
+Whenever the data can’t be parsed because it is improperly formatted or doesn’t match the data structure, the entire record (or row) is rejected.

-#. When ``ERROR_LOG`` is not used, the ``COPY`` command will stop and roll back the transaction upon the first error.
+When ``ERROR_LOG`` is not used, the ``COPY`` command stops and rolls back the transaction upon the first error.

-#. When ``ERROR_LOG`` is set and ``ERROR_VERBOSITY`` is set to ``1`` (default), all errors and rejected rows are saved to the file path specified.
+.. image:: /_static/images/copy_from_rejected_rows.png
+   :width: 50%

-#. When ``ERROR_LOG`` is set and ``ERROR_VERBOSITY`` is set to ``0``, rejected rows are saved to the file path specified, but errors are not logged. This is useful for replaying the file later.
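+
+For instance, a load can be made tolerant of faulty records by logging them instead of failing the transaction. The following is a minimal sketch using the ``continue_on_error`` and ``error_log`` options that appear in the examples below; the table name ``t`` and the file paths are placeholders:
+
+.. code-block:: postgres
+
+   COPY t FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/file.csv', continue_on_error = true, error_log = '/tmp/load_error.log');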

 CSV Support
-================
+===========

 By default, SQream DB's CSV parser can handle `RFC 4180 standard CSVs `_ , but can also be modified to support non-standard CSVs (with multi-character delimiters, unquoted fields, etc).

@@ -411,7 +648,7 @@ All CSV files should be prepared according to these recommendations:

    Other modes of escaping are not supported (e.g. ``1,"What are \"birds\"?"`` is not a valid way of escaping CSV values).

 Marking Null Markers
----------------
+--------------------

 ``NULL`` values can be marked in two ways in the CSV:

@@ -421,10 +658,10 @@ Marking Null Markers

 .. note:: If a text field is quoted but contains no content (``""``) it is considered an empty text field. It is not considered ``NULL``.

 Examples
-===========
+========

 Loading a Standard CSV File
-------------------------------
+---------------------------

 .. code-block:: postgres

@@ -432,7 +669,7 @@ Loading a Standard CSV File

 Skipping Faulty Rows
-------------------------------
+--------------------

 .. code-block:: postgres

@@ -440,7 +677,7 @@ Skipping Faulty Rows

 Skipping a Maximum of 100 Faulty Rows
------------------------------------
+-------------------------------------

 .. code-block:: postgres

@@ -463,7 +700,7 @@ Loading a Tab Separated Value (TSV) File

 Loading an ORC File
--------------------------------------------
+-------------------

 .. code-block:: postgres

@@ -471,15 +708,28 @@ Loading an ORC File

 Loading a Parquet File
--------------------------------------------
+----------------------

 .. code-block:: postgres

    COPY table_name FROM WRAPPER parquet_fdw OPTIONS (location = '/tmp/file.parquet');

+
+Loading a JSON File
+-------------------
+
+.. code-block:: postgres
+
+   COPY t FROM WRAPPER json_fdw OPTIONS (location = 'somefile.json');
+
+Loading an AVRO File
+--------------------
+
+.. code-block:: postgres
+
+   COPY t FROM WRAPPER avro_fdw OPTIONS (location = 'somefile.avro');

 Loading a Text File with Non-Printable Delimiters
------------------------------------------------------
+-------------------------------------------------

 In the file below, the separator is ``DC1``, which is represented by ASCII 17 decimal or 021 octal.

@@ -488,7 +738,7 @@ In the file below, the separator is ``DC1``, which is represented by ASCII 17 de

    COPY table_name FROM WRAPPER psv_fdw OPTIONS (location = '/tmp/file.txt', delimiter = E'\021');

 Loading a Text File with Multi-Character Delimiters
------------------------------------------------------
+---------------------------------------------------

 In the file below, the separator is ``^|``.

@@ -504,7 +754,7 @@ In the file below, the separator is ``'|``. The quote character has to be repeat

 Loading Files with a Header Row
------------------------------------
+-------------------------------

 Use ``OFFSET`` to skip rows.

@@ -514,6 +764,19 @@ Use ``OFFSET`` to skip rows.

    COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/file.psv', delimiter = '|', offset = 2);

+Loading Files Using ``DELETE_SOURCE_ON_SUCCESS``
+------------------------------------------------
+
+.. code-block:: postgres
+
+   -- Single file:
+
+   COPY t FROM WRAPPER json_fdw OPTIONS (location = '/tmp/wrappers/t.json', DELETE_SOURCE_ON_SUCCESS = true);
+
+   -- Multiple files:
+
+   COPY t FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/wrappers/group*.csv', DELETE_SOURCE_ON_SUCCESS = true);
+

 Loading Files Formatted for Windows (``\r\n``)
---------------------------------------------------

@@ -528,7 +791,25 @@ Loading a File from a Public S3 Bucket

 .. code-block:: postgres

-   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = 's3://sqream-demo-data/file.csv', delimiter = '\r\n', offset = 2);
+   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = 's3://sqream-demo-data/file.csv', delimiter = '\r\n', offset = 2);
+
+Loading a File from a Google Cloud Platform Bucket
+--------------------------------------------------
+
+To access a Google Cloud Platform (GCP) bucket, your environment must be authorized.
+
+.. code-block:: postgres
+
+   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = 'gs://<bucket_name>/<folder_name>/*');
+
+Loading a File from Azure
+-------------------------
+
+To access Azure, your environment must be authorized.
+
+.. code-block:: postgres
+
+   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = 'azure://sqreamrole.core.windows.net/sqream-demo-data/file.csv');

 Loading Files from an Authenticated S3 Bucket
---------------------------------------------------

@@ -544,24 +825,30 @@ Saving Rejected Rows to a File

 .. code-block:: postgres

-   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/file.csv',
-   ,continue_on_error = true
-   ,error_log = '/temp/load_error.log'
-   );
+   COPY table_name FROM WRAPPER csv_fdw
+   OPTIONS
+   (
+     location = '/tmp/file.csv'
+     ,continue_on_error = true
+     ,error_log = '/temp/load_error.log'
+   );

 .. code-block:: postgres

-   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/file.psv'
-   ,delimiter '|'
-   ,error_log = '/temp/load_error.log' -- Save error log
-   ,rejected_data = '/temp/load_rejected.log' -- Only save rejected rows
-   ,limit = 100 -- Only load 100 rows
-   ,error_count = 5 -- Stop the load if 5 errors reached
-   );
+   COPY table_name FROM WRAPPER csv_fdw
+   OPTIONS
+   (
+     location = '/tmp/file.psv'
+     ,delimiter = '|'
+     ,error_log = '/temp/load_error.log' -- Save error log
+     ,rejected_data = '/temp/load_rejected.log' -- Only save rejected rows
+     ,limit = 100 -- Only load 100 rows
+     ,error_count = 5 -- Stop the load if 5 errors reached
+   );

 Loading CSV Files from a Set of Directories
--------------------------------------------
+-------------------------------------------

 .. code-block:: postgres
@@ -581,12 +868,31 @@ When the source of the files does not match the table structure, tell the ``COPY

 Loading Non-Standard Dates
----------------------------------

-If files contain dates not formatted as ``ISO8601``, tell ``COPY`` how to parse the column. After parsing, the date will appear as ``ISO8601`` inside SQream DB.
-
-These are called date parsers. You can find the supported dates in the :ref:`'Supported date parsers' table` above
+If your files contain dates in a format other than ``ISO8601``, you can specify a :ref:`parsing` format to convert them during the import process. This ensures the dates are stored internally as ``ISO8601`` within the database.

 In this example, ``date_col1`` and ``date_col2`` in the table are non-standard. ``date_col3`` is mentioned explicitly, but can be left out. Any column that is not specified is assumed to be ``ISO8601``.

 .. code-block:: postgres

-   COPY table_name FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/*.csv', datetime_format = 'DMY');
+   COPY my_table (date_col1, date_col2, date_col3) FROM WRAPPER csv_fdw OPTIONS (location = '/tmp/my_data.csv', offset = 2, datetime_format = 'DMY');
+
+Loading Specific Columns
+------------------------
+
+Loading specific columns using the ``COPY FROM`` command:
+
+* Does not support CSV files
+
+* Requires that the target table columns be nullable
+
+.. code-block:: postgres
+
+   COPY
+      new_nba (name, salary)
+   FROM
+      WRAPPER parquet_fdw
+   OPTIONS
+   (
+      LOCATION = '/tmp/nba.parquet'
+   );
\ No newline at end of file
diff --git a/reference/sql/sql_statements/dml_commands/copy_to.rst b/reference/sql/sql_statements/dml_commands/copy_to.rst
index 61e6b35b2..0ba6ed451 100644
--- a/reference/sql/sql_statements/dml_commands/copy_to.rst
+++ b/reference/sql/sql_statements/dml_commands/copy_to.rst
@@ -4,20 +4,24 @@

 COPY TO
 **********************

-``COPY ... TO`` is a statement that can be used to export data from a SQream database table or query to a file on the filesystem.
+The ``COPY TO`` statement is used for exporting data from a SQream database table, or for exporting query results, to a file on the filesystem.

+You may wish to export data from SQream for any of the following reasons:

-In general, ``COPY`` moves data between filesystem files and SQream DB tables.
+* To use data in external tables. See :ref:`Working with External Data`.
+* To share data with other clients or consumers using different systems.
+* To copy data into another SQream cluster.

-.. note:: To copy data from a file to a table, see :ref:`COPY FROM`.
+In general, ``COPY`` moves data between filesystem files and SQream DB tables. If you wish to copy data from a file to a table, see :ref:`COPY FROM`.

-Permissions
-=============
-
-The role must have the ``SELECT`` permission on every table or schema that is referenced by the statement.
+.. contents::
+   :local:
+   :depth: 1

 Syntax
 ==========

+The following is the syntax for the ``COPY TO`` statement:
+
 .. code-block:: postgres

    COPY { [schema_name].table_name [ ( column_name [, ...] ) ] | query }
      TO WRAPPER fdw_name
      OPTIONS
      (
        [ copy_to_option [, ...] ]
      )
    ;

-   fdw_name ::= csw_fdw | parquet_fdw | orc_fdw
+   fdw_name ::= csv_fdw | parquet_fdw | orc_fdw

    schema_name ::= identifier

@@ -41,25 +45,31 @@ Syntax

    | DELIMITER = '{ delimiter }'

-   | RECORD_DELIMITER = '{ record delimiter }'
-
-   | HEADER = { true | false }

    | AWS_ID = '{ AWS ID }'

    | AWS_SECRET = '{ AWS Secret }'
+
+   | MAX_FILE_SIZE = '{ size_in_bytes }'
+
+   | ENFORCE_SINGLE_FILE = { true | false }

-   delimiter ::= string
-   record delimiter ::= string
+   delimiter ::= string

    AWS ID ::= string

    AWS Secret ::= string

+
+.. note:: In Studio, you must write the parameters using lowercase letters. Using uppercase letters generates an error.
+

 Elements
 ============

+The following table shows the ``COPY TO`` elements:
+
 .. list-table::
    :widths: auto
    :header-rows: 1

@@ -71,50 +81,399 @@ Elements

    * - ``query``
      - An SQL query that returns a table result, or a table name
    * - ``fdw_name``
-     - The name of the Foreign Data Wrapper to use. Supported FDWs are ``csv_fdw``, ``orc_fdw``, or ``parquet_fdw``.
+     - The name of the Foreign Data Wrapper to use. Supported FDWs are ``csv_fdw``, ``orc_fdw``, ``avro_fdw``, or ``parquet_fdw``.
    * - ``LOCATION``
      - A path on the local filesystem, S3, or HDFS URI. For example, ``/tmp/foo.csv``, ``s3://my-bucket/foo.csv``, or ``hdfs://my-namenode:8020/foo.csv``. The local path must be an absolute path that SQream DB can access.
    * - ``HEADER``
      - The CSV file will contain a header line with the names of each column in the file. This option is allowed only when using CSV format.
    * - ``DELIMITER``
-     - Specifies the character that separates fields (columns) within each row of the file. The default is a comma character (``,``).
+     - Specifies the character or string that separates fields (columns) within each row of the file. The default is a comma character (``,``). This option is allowed only when using CSV format.
    * - ``AWS_ID``, ``AWS_SECRET``
      - Specifies the authentication details for secured S3 buckets
+   * - ``MAX_FILE_SIZE``
+     - Sets the maximum size of each output file (in bytes). Default value: 16*2^20 (16MB).
+   * - ``ENFORCE_SINGLE_FILE``
+     - Determines whether the export is written to a single file. When set to ``true``, one file is created and its size is not limited by ``MAX_FILE_SIZE``. When set to ``false``, several files may be created, each limited by ``MAX_FILE_SIZE``. Default value: ``true``.
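+
+The following is a minimal sketch of how ``MAX_FILE_SIZE`` and ``ENFORCE_SINGLE_FILE`` interact; the table name ``t`` and the output path are placeholders, and the options are written in lowercase per the note above:
+
+.. code-block:: postgres
+
+   COPY t TO WRAPPER csv_fdw OPTIONS (location = '/tmp/t.csv', max_file_size = '1000000', enforce_single_file = false);
+
+With ``enforce_single_file = false``, the export may be split across several files; setting it to ``true`` produces one file whose size is not limited by ``max_file_size``.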
+
+Usage Notes
+===========

-Usage notes
-===============

+.. contents::
+   :local:
+   :depth: 1

-Supported field delimiters
+Supported Field Delimiters
------------------------------

-Printable characters
-^^^^^^^^^^^^^^^^^^^^^
+.. contents::
+   :local:
+   :depth: 1
+
+Printable ASCII Characters
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Most printable ASCII characters can be used as a delimiter without special syntax; the exceptions are listed in the unsupported ASCII field delimiters table below. The default CSV field delimiter is a comma (``,``).
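+
+For example, a semicolon-delimited file can be produced by passing the delimiter directly. In this minimal sketch, the table name ``t`` and the output path are placeholders:
+
+.. code-block:: postgres
+
+   COPY t TO WRAPPER csv_fdw OPTIONS (location = '/tmp/t.csv', delimiter = ';');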
+ +The following table shows the supported printable ASCII characters: + ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| **Character** | **Description** | **ASCII** | **Octal** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| (Space) | Space | 32 | 40 | 20 | 100000 | | | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ! | Exclamation Mark | 33 | 41 | 21 | 100001 | ! | ! | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| # | Hash or Number | 35 | 43 | 23 | 100011 | # | # | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| $ | Dollar Sign | 36 | 44 | 24 | 100100 | $ | $ | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| % | Percentage | 37 | 45 | 25 | 100101 | % | % | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| & | Ampersand | 38 | 46 | 26 | 100110 | & | & | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ( | Left Parenthesis | 40 | 50 | 28 | 101000 | ( | ( | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ) | Right Parenthesis | 41 | 51 | 29 | 101001 | ) | ) | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| \*\ | Asterisk | 42 | 52 | 2A | 101010 | * | * | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| \+\ | Plus Sign | 43 | 53 | 2B | 101011 | + | + | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| , | Comma | 44 | 54 | 2C | 101100 | , | , | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| / | Slash | 47 | 57 | 2F | 101111 | / | / | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ; | Semicolon | 59 | 73 | 3B | 111011 | ; | ; | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| < | Less Than | 60 | 74 | 3C | 111100 | < | < | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| = | Equals Sign | 61 | 75 | 3D | 111101 | = | = | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| > | Greater Than | 62 | 76 | 3E | 111110 | > | > | ++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+ +| ? | Question Mark | 63 | 77 | 3F | 111111 | ? | ? 
|
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| @ | At Sign | 64 | 100 | 40 | 1000000 | @ | @ |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| [ | Left Square Bracket | 91 | 133 | 5B | 1011011 | [ | [ |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| \\ | Backslash | 92 | 134 | 5C | 1011100 | \&\#92\; | \ |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ] | Right Square Bracket | 93 | 135 | 5D | 1011101 | ] | ] |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ^ | Caret or Circumflex | 94 | 136 | 5E | 1011110 | ^ | &hat; |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| _ | Underscore | 95 | 137 | 5F | 1011111 | _ | _ |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ` | Grave Accent | 96 | 140 | 60 | 1100000 | ` | ` |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| { | Left Curly Bracket | 123 | 173 | 7B | 1111011 | { | { |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| \|\ | Vertical Bar | 124 | 174 | 7C | 1111100 | | | | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| } | Right Curly Bracket | 125 | 175 | 7D | 1111101 | } | } |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ~ | Tilde | 126 | 176 | 7E | 1111110 | ~ | ˜ |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| : | Colon | 58 | 72 | 3A | 111010 | : | : |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| A | A | 65 | 101 | 41 | 1000001 | A | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| B | B | 66 | 102 | 42 | 1000010 | B | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| C | C | 67 | 103 | 43 | 1000011 | C | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| D | D | 68 | 104 | 44 | 1000100 | D | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| E | E | 69 | 105 | 45 | 1000101 | E | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| F | F | 70 | 106 | 46 | 1000110 | F | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| G | G | 71 | 107 | 47 | 1000111 | G | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| H | H | 72 | 110 | 48 | 1001000 | H | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| I | I | 73 | 111 | 49 | 1001001 | I | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| J | J | 74 | 112 | 4A | 1001010 | J | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| K | K | 75 | 113 | 4B | 1001011 | K | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| L | L | 76 | 114 | 4C | 1001100 | L | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| M | M | 77 | 115 | 4D | 1001101 | M | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| O | O | 79 | 117 | 4F | 1001111 | O | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| P | P | 80 | 120 | 50 | 1010000 | P | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| Q | Q | 81 | 121 | 51 | 1010001 | Q | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| R | R | 82 | 122 | 52 | 1010010 | R | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| S | S | 83 | 123 | 53 | 1010011 | S | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| T | T | 84 | 124 | 54 | 1010100 | T | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| U | U | 85 | 125 | 55 | 1010101 | U | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| V | V | 86 | 126 | 56 | 1010110 | V | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| W | W | 87 | 127 | 57 | 1010111 | W | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| X | X | 88 | 130 | 58 | 1011000 | X | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| Y | Y | 89 | 131 | 59 | 1011001 | Y | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| Z | Z | 90 | 132 | 5A | 1011010 | Z | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| a | a | 97 | 141 | 61 | 1100001 | a | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| b | b | 98 | 142 | 62 | 1100010 | b | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| c | c | 99 | 143 | 63 | 1100011 | c | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| d | d | 100 | 144 | 64 | 1100100 | d | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| e | e | 101 | 145 | 65 | 1100101 | e | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| f | f | 102 | 146 | 66 | 1100110 | f | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| g | g | 103 | 147 | 67 | 1100111 | g | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| h | h | 104 | 150 | 68 | 1101000 | h | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| i | i | 105 | 151 | 69 | 1101001 | i | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| j | j | 106 | 152 | 6A | 1101010 | j | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| k | k | 107 | 153 | 6B | 1101011 | k | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| l | l | 108 | 154 | 6C | 1101100 | l | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| m | m | 109 | 155 | 6D | 1101101 | m | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| n | n | 110 | 156 | 6E | 1101110 | n | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| o | o | 111 | 157 | 6F | 1101111 | o | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| p | p | 112 | 160 | 70 | 1110000 | p | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| q | q | 113 | 161 | 71 | 1110001 | q | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| r | r | 114 | 162 | 72 | 1110010 | r | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| s | s | 115 | 163 | 73 | 1110011 | s | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| t | t | 116 | 164 | 74 | 1110100 | t | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| u | u | 117 | 165 | 75 | 1110101 | u | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| v | v | 118 | 166 | 76 | 1110110 | v | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| w | w | 119 | 167 | 77 | 1110111 | w | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| x | x | 120 | 170 | 78 | 1111000 | x | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| y | y | 121 | 171 | 79 | 1111001 | y | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+| z | z | 122 | 172 | 7A | 1111010 | z | |
++---------------+----------------------+-----------+-----------+---------+------------+---------------+---------------+
+
+Non-Printable ASCII Characters
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following table shows the supported non-printable ASCII characters:
+
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| **Character** | **Description** | **Octal** | **ASCII** | **Hex** | **Binary** | **HTML Code** | **HTML Name** |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| NUL | Null | 0 | 0 | 0 | 0 | \&\#0\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| SOH | Start of Heading | 1 | 1 | 1 | 1 | \&\#1\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| STX | Start of Text | 2 | 2 | 2 | 10 | \&\#2\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ETX | End of Text | 3 | 3 | 3 | 11 | \&\#3\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| EOT | End of Transmission | 4 | 4 | 4 | 100 | \&\#4\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ENQ | Enquiry | 5 | 5 | 5 | 101 | \&\#5\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ACK | Acknowledge | 6 | 6 | 6 | 110 | \&\#6\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| BEL | Bell | 7 | 7 | 7 | 111 | \&\#7\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| BS | Backspace | 10 | 8 | 8 | 1000 | \&\#8\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| HT | Horizontal Tab | 11 | 9 | 9 | 1001 | \&\#9\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| VT | Vertical Tab | 13 | 11 | 0B | 1011 | \&\#11\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| FF | NP Form Feed, New Page | 14 | 12 | 0C | 1100 | \&\#12\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| SO | Shift Out | 16 | 14 | 0E | 1110 | \&\#14\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| SI | Shift In | 17 | 15 | 0F | 1111 | \&\#15\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| DLE | Data Link Escape | 20 | 16 | 10 | 10000 | \&\#16\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| DC1 | Device Control 1 | 21 | 17 | 11 | 10001 | \&\#17\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| DC2 | Device Control 2 | 22 | 18 | 12 | 10010 | \&\#18\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| DC3 | Device Control 3 | 23 | 19 | 13 | 10011 | \&\#19\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| DC4 | Device Control 4 | 24 | 20 | 14 | 10100 | \&\#20\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| NAK | Negative Acknowledge | 25 | 21 | 15 | 10101 | \&\#21\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| SYN | Synchronous Idle | 26 | 22 | 16 | 10110 | \&\#22\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ETB | End of Transmission Block | 27 | 23 | 17 | 10111 | \&\#23\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| CAN | Cancel | 30 | 24 | 18 | 11000 | \&\#24\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| EM | End of Medium | 31 | 25 | 19 | 11001 | \&\#25\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| SUB | Substitute | 32 | 26 | 1A | 11010 | \&\#26\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| ESC | Escape | 33 | 27 | 1B | 11011 | \&\#27\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| FS | File Separator | 34 | 28 | 1C | 11100 | \&\#28\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| GS | Group Separator | 35 | 29 | 1D | 11101 | \&\#29\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| RS | Record Separator | 36 | 30 | 1E | 11110 | \&\#30\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| US | Unit Separator | 37 | 31 | 1F | 11111 | \&\#31\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+| DEL | Delete | 177 | 127 | 7F | 1111111 | \&\#127\; | |
++---------------+---------------------------+-----------+-----------+---------+------------+---------------+---------------+
+
+A tab can be specified by escaping it, for example ``\t``. Other non-printable characters can be specified using their octal representations, by using the ``E'\000'`` format, where ``000`` is the octal value of the character.

-Any printable ASCII character can be used as a delimiter without special syntax. The default CSV field delimiter is a comma (``,``).

+For example, ASCII character ``15``, known as "shift in", can be specified using ``E'\017'``.

-A printable character is any ASCII character in the range 32 - 126.

+..
note:: Delimiters are only applicable to the CSV file format. + +Unsupported ASCII Field Delimiters +----------------------------------- + +The following table shows the unsupported ASCII field delimiters: + ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| **ASCII** | **Character** | **Description** | **Octal** | **Hex** | **Binary** | **HTML Code** | **HTML Name** | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 10 | LF | NL Line Feed, New Line | 12 | 0A | 1010 | | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 13 | CR | Carriage Return | 15 | 0D | 1101 | | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 34 | " | Double Quote | 42 | 22 | 100010 | " | " | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 45 | \-\ | Minus Sign | 55 | 2D | 101101 | - | − | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 46 | . | Period | 56 | 2E | 101110 | . | . | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 48 | 0 | Zero | 60 | 30 | 110000 | 0 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 49 | 1 | Number One | 61 | 31 | 110001 | 1 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 50 | 2 | Number Two | 62 | 32 | 110010 | 2 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 51 | 3 | Number Three | 63 | 33 | 110011 | 3 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 52 | 4 | Number Four | 64 | 34 | 110100 | 4 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 53 | 5 | Number Five | 65 | 35 | 110101 | 5 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 54 | 6 | Number Six | 66 | 36 | 110110 | 6 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 55 | 7 | Number Seven | 67 | 37 | 110111 | 7 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 56 | 8 | Number Eight | 70 | 38 | 111000 | 8 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 57 | 9 | Number Nine | 71 | 39 | 111001 | 9 | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 58 | : | Colon | 72 | 3A | 111010 | : | : | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 92 | \\ | Backslash | 134 | 5C | 01011100 | \&\#92\; | | ++-----------+---------------+------------------------+-----------+---------+------------+---------------+---------------+ +| 78 | N | N | 
+
+Date Format
+-----------
-Non-printable characters
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The date format in the output CSV is formatted as ISO 8601 (``2019-12-31 20:30:55.123``), regardless of how it was parsed initially with :ref:`COPY FROM date parsers`.
-A non-printable character (1 - 31, 127) can be used in its octal form.
+For more information on the ``datetime`` format, see :ref:`sql_data_types_date`.
-A tab can be specified by escaping it, for example ``\t``. Other non-printable characters can be specified using their octal representations, by using the ``E'\000'`` format, where ``000`` is the octal value of the character.
+Examples
+========
-For example, ASCII character ``15``, known as "shift in", can be specified using ``E'\017'``.
+.. contents::
+   :local:
+   :depth: 1
+
+Exporting Data From SQream to External File Tables
+--------------------------------------------------
-Date format
----------------
+Parquet
+^^^^^^^
-The date format in the output CSV is formatted as ISO 8601 (``2019-12-31 20:30:55.123``), regardless of how it was parsed initially with :ref:`COPY FROM date parsers`.
+The compression algorithm used for exporting data from SQream to Parquet files is Snappy.
+
+Exporting tables to Parquet files:
-Examples
-===========
+.. code-block:: psql
+
+   COPY nba TO WRAPPER parquet_fdw OPTIONS (LOCATION = '/tmp/nba_export.parquet');
+
+Exporting query results to Parquet files:
+
+.. code-block:: psql
+
+   COPY (SELECT name FROM nba WHERE salary<1148640) TO WRAPPER parquet_fdw OPTIONS (LOCATION = '/tmp/file.parquet');
+
+ORC
+^^^
+The compression algorithm used for exporting data from SQream to ORC files is ZLIB.
+
+Exporting tables to ORC files:
+
+.. code-block:: psql
+
+   COPY nba TO WRAPPER orc_fdw OPTIONS (LOCATION = '/tmp/nba_export.orc');
+
+Exporting query results to ORC files:
+
+.. code-block:: psql
+
+   COPY (SELECT name FROM nba WHERE salary<1148640) TO WRAPPER orc_fdw OPTIONS (LOCATION = '/tmp/file.orc');
+
+AVRO
+^^^^
+The compression algorithm used for exporting data from SQream to Avro files is Snappy.
+
+Exporting tables to AVRO files:
+
+.. code-block:: psql
+
+   COPY nba TO WRAPPER avro_fdw OPTIONS (LOCATION = '/tmp/nba_export.avro');
+
+Exporting query results to AVRO files:
+
+.. code-block:: psql
+
+   COPY (SELECT name FROM nba WHERE salary<1148640) TO WRAPPER avro_fdw OPTIONS (LOCATION = '/tmp/file.avro');
+
+CSV
+^^^
-Export table to a CSV without HEADER
-------------------------------------
+Exporting a table to a CSV file without a HEADER row:
 
 .. code-block:: psql
 
@@ -130,8 +489,7 @@ Export table to a CSV without HEADER
    Jonas Jerebko,Boston Celtics,8,PF,29,6-10,231,\N,5000000
    Amir Johnson,Boston Celtics,90,PF,29,6-9,240,\N,12000000
 
-Export table to a CSV with a HEADER row
------------------------------------------
+Exporting a table to a CSV file with a HEADER row:
 
 .. code-block:: psql
 
@@ -146,13 +504,30 @@ Export table to a CSV with a HEADER row
    John Holland,Boston Celtics,30,SG,27,6-5,205,Boston University,\N
    R.J. Hunter,Boston Celtics,28,SG,22,6-5,185,Georgia State,1148640
    Jonas Jerebko,Boston Celtics,8,PF,29,6-10,231,\N,5000000
+
+Exporting the result of a query to a CSV file:
 
-Export table to a TSV with a header row
------------------------------------------
+.. code-block:: psql
+
+   COPY (SELECT Team, AVG(Salary) FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv');
+
+.. code-block:: console
+
+   $ head -n5 /tmp/nba_export.csv
+   Atlanta Hawks,4860196
+   Boston Celtics,4181504
+   Brooklyn Nets,3501898
+   Charlotte Hornets,5222728
+   Chicago Bulls,5785558
+
+TSV
+^^^
+
+Exporting a table to a TSV file with a HEADER row:
 
 .. code-block:: psql
 
-   COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = '|', HEADER = true);
+   COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = '\t', HEADER = true);
 
 .. code-block:: console
 
@@ -164,71 +539,51 @@ Export table to a TSV with a header row
    R.J. Hunter	Boston Celtics	28	SG	22	6-5	185	Georgia State	1148640
    Jonas Jerebko	Boston Celtics	8	PF	29	6-10	231	\N	5000000
 
-Use non-printable ASCII characters as delimiter
--------------------------------------------------------
+Exporting Data From SQream to Cloud Storage
+-------------------------------------------
 
-Non-printable characters can be specified using their octal representations, by using the ``E'\000'`` format, where ``000`` is the octal value of the character.
-
-For example, ASCII character ``15``, known as "shift in", can be specified using ``E'\017'``.
-
-.. code-block:: psql
-
-   COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = E'\017');
+The following is an example of saving files to an authenticated S3 bucket:
 
 .. code-block:: psql
 
-   COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = E'\011'); -- 011 is a tab character
+   COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = 's3://my_bucket/salaries/nba_export.csv', AWS_ID = 'my_aws_id', AWS_SECRET = 'my_aws_secret');
 
-Exporting the result of a query to a CSV
--------------------------------------------
+The following is an example of saving files to an HDFS path:
 
 .. code-block:: psql
 
-   COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv');
+   COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = 'hdfs://pp_namenode:8020/nba_export.csv');
 
-.. code-block:: console
-
-   $ head -n5 nba_salaries.csv
-   Atlanta Hawks,4860196
-   Boston Celtics,4181504
-   Brooklyn Nets,3501898
-   Charlotte Hornets,5222728
-   Chicago Bulls,5785558
+Using Non-Printable ASCII Characters as Delimiters
+--------------------------------------------------
 
-Saving files to an authenticated S3 bucket
---------------------------------------------
+The following is an example of using non-printable ASCII characters as delimiters:
 
-.. code-block:: psql
-
-   COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = 's3://my_bucket/salaries/nba_export.csv', AWS_ID = 'my_aws_id', AWS_SECRET = 'my_aws_secret');
+Non-printable characters can be specified using their octal representations, by using the ``E'\000'`` format, where ``000`` is the octal value of the character.
 
-Saving files to an HDFS path
---------------------------------------------
+For example, ASCII character ``15``, known as "shift in", can be specified using ``E'\017'``.
 
 .. code-block:: psql
 
-   COPY (SELECT "Team", AVG("Salary") FROM nba GROUP BY 1) TO WRAPPER csv_fdw OPTIONS (LOCATION = 'hdfs://pp_namenode:8020/nba_export.csv');
-
-
-Export table to a parquet file
-------------------------------
+   COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = E'\017');
 
 .. code-block:: psql
 
-   COPY nba TO WRAPPER parquet_fdw OPTIONS (LOCATION = '/tmp/nba_export.parquet');
-
+   COPY nba TO WRAPPER csv_fdw OPTIONS (LOCATION = '/tmp/nba_export.csv', DELIMITER = E'\011'); -- 011 is a tab character
 
-Export a query to a parquet file
---------------------------------
+Using the ``MAX_FILE_SIZE`` and ``ENFORCE_SINGLE_FILE`` Parameters
+-------------------------------------------------------------------
 
 .. code-block:: psql
 
-   COPY (select x,y from t where z=0) TO WRAPPER parquet_fdw OPTIONS (LOCATION = '/tmp/file.parquet');
+   COPY nba TO WRAPPER csv_fdw OPTIONS (
+      max_file_size = '250000000',
+      enforce_single_file = 'true',
+      location = '/tmp/nba_export.csv'
+   );
 
-Export table to a ORC file
-------------------------------
+Permissions
+===========
 
-.. code-block:: psql
-
-   COPY nba TO WRAPPER orc_fdw OPTIONS (LOCATION = '/tmp/nba_export.orc');
+The role must have the ``SELECT`` permission on every table or schema that is referenced by the statement.
\ No newline at end of file
diff --git a/reference/sql/sql_statements/dml_commands/delete.rst b/reference/sql/sql_statements/dml_commands/delete.rst
index 2aa9c6729..b6e184657 100644
--- a/reference/sql/sql_statements/dml_commands/delete.rst
+++ b/reference/sql/sql_statements/dml_commands/delete.rst
@@ -1,18 +1,17 @@
 .. _delete:
 
-**********************
+******
 DELETE
-**********************
+******
 
 Overview
-==================
+========
+
 The ``DELETE`` statement is used to remove specific rows from a table.
 
 SQream deletes data in the following steps:
 
 1. The designated rows are marked as deleted, but remain on-disk until the user initiates a clean-up process.
-
-   ::
 
 #. The user initiates a clean-up process to delete the rows.
 
@@ -22,8 +21,6 @@ Note the following:
 
 * The :ref:`ALTER TABLE` and other `DDL operations `_ are blocked on tables that require clean-up.
-
-
 * The value expression for deletion cannot be the result of a sub-query or join.
 
 * SQream may abort delete processes exceeding a pre-defined time threshold. If the estimated time exceeds the threshold, an error message is displayed with a description for overriding the threshold and continuing with the delete.
@@ -35,7 +32,7 @@ For more information about SQream's delete methodology, see the :ref:`delete_gui
 
 * To delete columns, see :ref:`DROP COLUMN`.
 
 Permissions
-=============
+===========
 
 To execute the ``DELETE`` statement, the ``DELETE`` and ``SELECT`` permissions must be assigned to the role at the table level.
 
@@ -43,7 +40,8 @@ For more information about assigning permissions to roles, see `Creating, Assign
 
 Syntax
-==========
+======
+
 The following is the correct syntax for executing the ``DELETE`` statement:
 
 .. code-block:: postgres
 
@@ -72,9 +70,17 @@ The following is the correct syntax for triggering a clean-up:
 
    schema_name ::= identifier
 
+For systems with delete parallelism capabilities, use the following syntax to enhance deletion performance and shorten runtime:
+
+.. code-block:: postgres
+
+   SELECT set_parallel_delete_threads(x);
+
+.. note:: You may configure up to 10 threads.
 
 Parameters
-============
+==========
+
 The following table describes the parameters used for executing the ``DELETE`` statement:
 
 .. list-table::
    :widths: auto
@@ -91,10 +97,16 @@ s
    - An expression that returns Boolean values using columns, such as ``<column> = <value>``. Rows that match the expression will be deleted.
+Limitations +=========== + +**Encrypted Columns:** ``CLEANUP_CHUNKS`` does not support tables with encrypted columns. Use :ref:`RECHUNK ` instead. + Examples -=========== +======== + The **Examples** section shows the following examples: * :ref:`Deleting values from a table` @@ -104,7 +116,8 @@ The **Examples** section shows the following examples: .. _deleting_values_from_a_table: Deleting Values from a Table ------------------------------- +---------------------------- + The following shows an example of deleting values from a table: .. code-block:: psql @@ -133,7 +146,8 @@ The following shows an example of deleting values from a table: .. _deleting_values_based_on_more_complex_predicates: Deleting Values Based on More Complex Predicates ---------------------------------------------------- +------------------------------------------------ + The following shows an example of deleting values based on more complex predicates: .. code-block:: psql @@ -160,7 +174,8 @@ The following shows an example of deleting values based on more complex predicat 4 rows Deleting Values that Contain Multi-Table Conditions ------------------ +--------------------------------------------------- + The following shows an example of deleting values that contain multi-table conditions. The example is based on the following tables: .. image:: /_static/images/delete_optimization.png @@ -183,7 +198,8 @@ The statement below uses the ``EXISTS`` subquery to delete all bands based in Sw .. _identifying_and_cleaning_up_tables: Identifying and Cleaning Up Tables ---------------------------------------- +---------------------------------- + The following section shows examples of each phase required for cleaning up tables: * :ref:`Listing tables that require clean-up` @@ -193,7 +209,8 @@ The following section shows examples of each phase required for cleaning up tabl .. _listing_tables_that_require_cleanup: Listing Tables that Require Clean-Up -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + The following shows an example of listing tables that require clean-up: .. code-block:: psql @@ -209,7 +226,8 @@ The following shows an example of listing tables that require clean-up: .. _identifying_cleanup_predicates: Identify Clean-Up Predicates -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + The following shows an example of listing the clean-up predicates: .. code-block:: psql @@ -225,7 +243,8 @@ The following shows an example of listing the clean-up predicates: .. _triggering_a_cleanup: Triggering a Clean-Up -^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^ + The following shows an example of triggering a clean-up: .. code-block:: psql diff --git a/reference/sql/sql_statements/dml_commands/insert.rst b/reference/sql/sql_statements/dml_commands/insert.rst index 2607669bd..4005b471e 100644 --- a/reference/sql/sql_statements/dml_commands/insert.rst +++ b/reference/sql/sql_statements/dml_commands/insert.rst @@ -113,6 +113,9 @@ For example, SELECT name, weight FROM all_animals WHERE region = 'Australia'; + +.. warning:: The ``SELECT`` statement decrypts information by default. When executing ``INSERT INTO TABLE AS SELECT``, encrypted information will appear as clear text in the newly created table. 
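+
+A minimal sketch of the scenario the warning describes, assuming a hypothetical ``employees`` table whose ``ssn`` column is encrypted:
+
+.. code-block:: postgres
+
+   -- The SELECT decrypts ssn, so the newly created table stores it as clear text
+   CREATE TABLE employees_copy AS
+   SELECT id, ssn FROM employees;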
+ Inserting data with positional placeholders --------------------------------------------- diff --git a/reference/sql/sql_statements/dml_commands/select.rst b/reference/sql/sql_statements/dml_commands/select.rst index f47ffaec1..60bf2fa0b 100644 --- a/reference/sql/sql_statements/dml_commands/select.rst +++ b/reference/sql/sql_statements/dml_commands/select.rst @@ -1,8 +1,8 @@ .. _select: -********************** +****** SELECT -********************** +****** ``SELECT`` is the main statement that allows reading and processing of data. It is used to retrieve rows and columns from one or more tables. @@ -12,14 +12,15 @@ When used alone, the statement is known as a "``SELECT`` statement" or "``SELECT .. contents:: In this topic: :local: + :depth: 1 Permissions -============= +=========== The role must have the ``SELECT`` permission on every table or schema that is referenced by the ``SELECT`` query. Syntax -========== +====== .. code-block:: postgres @@ -87,7 +88,7 @@ Syntax Elements -============ +======== .. list-table:: :widths: auto @@ -111,12 +112,14 @@ Elements - Restricts the operation to only retrieve the first ``num_rows`` rows. * - ``UNION ALL`` - Concatenates the results of two queries together. ``UNION ALL`` does not remove duplicates. + * - ``TOP`` + - Limits the number of rows returned by a query. The ``TOP`` clause takes a numeric expression and a **subtraction** :ref:`arithmetic operator` ``TOP(a-b)``. Notes -=========== +===== -Query processing ------------------ +Query Processing +---------------- Queries are processed in a manner equivalent to the following order: @@ -136,8 +139,8 @@ Inside the ``FROM`` clause, the processing occurs in the usual way, from the out .. _select_lists: -Select lists ----------------- +Select Lists +------------ The ``select_list`` is a comma separated list of column names and value expressions. @@ -155,7 +158,7 @@ The ``select_list`` is a comma separated list of column names and value expressi SELECT a, SUM(b) FROM t GROUP BY 1 ORDER BY 2 DESC; Examples -=========== +======== Assume a table named ``nba``, with the following structure: @@ -163,14 +166,14 @@ Assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - Name varchar(40), - Team varchar(40), + Name text(40), + Team text(40), Number tinyint, - Position varchar(2), + Position text(2), Age tinyint, - Height varchar(4), + Height text(4), Weight real, - College varchar(40), + College text(40), Salary float ); @@ -183,8 +186,8 @@ Here's a peek at the table contents (:download:`Download nba.csv SELECT COUNT(*) FROM nba; 457 -Get all columns ------------------ +Get All Columns +--------------- ``*`` is used as shorthand for "all columns". @@ -237,8 +240,8 @@ Get all columns .. _where: -Filter on conditions ------------------------ +Filter On Conditions +-------------------- Use the ``WHERE`` clause to filter results. @@ -259,8 +262,8 @@ Use the ``WHERE`` clause to filter results. Joel Embiid,22,4626960 Nerlens Noel,22,3457800 -Filter based on a list ------------------------- +Filter Based On a List +---------------------- ``WHERE column IN (value_expr in comma separated list)`` matches the column with any value in the list. @@ -277,8 +280,8 @@ Filter based on a list Jeff Withey,26,947276,Utah Jazz -Select only distinct rows ---------------------------- +Select Only Distinct Rows +------------------------- .. 
code-block:: psql @@ -314,16 +317,16 @@ Select only distinct rows Utah Jazz Washington Wizards -Count distinct values ------------------------ +Count Distinct Values +--------------------- .. code-block:: psql nba=> SELECT COUNT(DISTINCT "Team") FROM nba; 30 -Rename columns with aliases ------------------------------ +Rename Columns With Aliases +--------------------------- .. code-block:: psql @@ -341,8 +344,8 @@ Rename columns with aliases R.J. Hunter | Boston Celtics | 1148640 Jonas Jerebko | Boston Celtics | 5000000 -Searching with ``LIKE`` -------------------------- +Searching With ``LIKE`` +----------------------- :ref:`like` allows pattern matching text in the ``WHERE`` clause. @@ -359,8 +362,8 @@ Searching with ``LIKE`` Allen Crabbe,24,947276,Portland Trail Blazers Ed Davis,27,6980802,Portland Trail Blazers -Aggregate functions ----------------------- +Aggregate Functions +------------------- Aggregate functions compute a single result from a column. @@ -390,8 +393,8 @@ Aggregate functions are often combined with ``GROUP BY``. A query like ``SELECT "Team",max("Salary") FROM nba`` is not valid, and will result in an error. -Filtering on aggregates --------------------------- +Filtering on Aggregates +----------------------- Filtering on aggregates is done with the ``HAVING`` clause, rather than the ``WHERE`` clause. @@ -410,8 +413,8 @@ Filtering on aggregates is done with the ``HAVING`` clause, rather than the ``WH .. _order_by: -Sorting results -------------------- +Sorting Results +--------------- ``ORDER BY`` takes a comma separated list of ordering specifications - a column followed by ``ASC`` for ascending or ``DESC`` for descending. @@ -450,8 +453,8 @@ Sorting results Portland Trail Blazers | 3220121 Philadelphia 76ers | 2213778 -Sorting with multiple columns ------------------------------------ +Sorting With Multiple Columns +----------------------------- Order retrieved rows by multiple columns: @@ -471,8 +474,8 @@ Order retrieved rows by multiple columns: Aaron Brooks | PG | 161 | 2250000 -Combining two or more queries ---------------------------------- +Combining Two or More Queries +----------------------------- ``UNION ALL`` can be used to combine the results of two or more queries into one result set. @@ -488,17 +491,15 @@ Combining two or more queries PG PG -Common table expressions (CTE) --------------------------------- - -Common table expressions or CTEs allow a possibly complex subquery to be represented in a short way later on, for improved readability. +Common Table Expression +----------------------- -It does not affect query performance. +A Common Table Expression (CTE) is a temporary named result set that can be referenced within a statement, allowing for more readable and modular queries. CTEs do not affect query performance. .. code-block:: psql - nba=> WITH s AS (SELECT "Name" FROM nba WHERE "Salary" > 20000000) - . SELECT * FROM nba AS n, s WHERE n."Name" = s."Name"; + WITH s AS (SELECT Name FROM nba WHERE Salary > 20000000) + SELECT * FROM nba AS n, s WHERE n.Name = s.Name; Name | Team | Number | Position | Age | Height | Weight | College | Salary | name0 ----------------+-----------------------+--------+----------+-----+--------+--------+--------------+----------+---------------- Carmelo Anthony | New York Knicks | 7 | SF | 32 | 6-8 | 240 | Syracuse | 22875000 | Carmelo Anthony @@ -510,31 +511,31 @@ It does not affect query performance. 
Kobe Bryant | Los Angeles Lakers | 24 | SF | 37 | 6-6 | 212 | | 25000000 | Kobe Bryant LeBron James | Cleveland Cavaliers | 23 | SF | 31 | 6-8 | 250 | | 22970500 | LeBron James -In this example, the ``WITH`` clause defines the temporary name ``r`` for the subquery which finds salaries over $20 million. The result set becomes a valid table reference in any table expression of the subsequent SELECT clause. +In this example, the ``WITH`` clause defines the temporary name ``s`` for the subquery which finds salaries over $20 million. The result set becomes a valid table reference in any table expression of the subsequent ``SELECT`` clause. Nested CTEs -^^^^^^^^^^^^^^ +^^^^^^^^^^^ -SQream DB also supports any amount of nested CTEs, such as this: +SQreamDB also supports any amount of nested CTEs, such as this: .. code-block:: postgres WITH w AS (SELECT * FROM - (WITH x AS (SELECT * FROM nba) SELECT * FROM x ORDER BY "Salary" DESC)) - SELECT * FROM w ORDER BY "Weight" DESC; + (WITH x AS (SELECT * FROM nba) SELECT * FROM x ORDER BY Salary DESC)) + SELECT * FROM w ORDER BY Weight DESC; Reusing CTEs -^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^ -SQream DB supports reusing CTEs several times in a query: +SQreamDB supports reusing CTEs several times in a query: .. code-block:: psql - nba=> WITH - . nba_ct AS (SELECT "Name", "Team" FROM nba WHERE "College"='Connecticut'), - . nba_az AS (SELECT "Name", "Team" FROM nba WHERE "College"='Arizona') - . SELECT * FROM nba_az JOIN nba_ct ON nba_ct."Team" = nba_az."Team"; + WITH + nba_ct AS (SELECT "Name", "Team" FROM nba WHERE "College"='Connecticut'), + nba_az AS (SELECT "Name", "Team" FROM nba WHERE "College"='Arizona') + SELECT * FROM nba_az JOIN nba_ct ON nba_ct."Team" = nba_az."Team"; Name | Team | name0 | team0 ----------------+-----------------+----------------+---------------- Stanley Johnson | Detroit Pistons | Andre Drummond | Detroit Pistons diff --git a/reference/sql/sql_statements/dml_commands/update.rst b/reference/sql/sql_statements/dml_commands/update.rst new file mode 100644 index 000000000..79ffc7bf7 --- /dev/null +++ b/reference/sql/sql_statements/dml_commands/update.rst @@ -0,0 +1,204 @@ +.. _update: + +****** +UPDATE +****** + +The **UPDATE** statement page describes the following: + +.. |icon-new_2022.1| image:: /_static/images/new_2022.1.png + :align: middle + :width: 110 + +.. contents:: + :local: + :depth: 1 + +Overview +======== + +The ``UPDATE`` statement is used to modify the value of certain columns in existing rows without creating a table. + +It can be used to do the following: + +* Performing localized changes in existing data, such as correcting mistakes discovered after ingesting data. + + :: + +* Setting columns based on the values of others. + +.. warning:: Using the ``UPDATE`` command on column clustered using a cluster key can undo your clustering. + +The ``UPDATE`` statement cannot be used to reference other tables in the ``WHERE`` or ``SET`` clauses. + +Syntax +====== + +The following is the correct syntax for the ``UPDATE`` command: + +.. code-block:: postgres + + UPDATE target_table_name [[AS] alias1] + SET column_name = expression [,...] + [FROM additional_table_name [[AS] alias2][,...]] + [WHERE condition] + + + +Parameters +========== + +The following table describes the ``UPDATE`` parameters: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``target_table_name`` + - Specifies the table containing the data to be updated. 
+ * - ``column_name`` + - Specifies the column containing the data to be updated. + * - ``additional_table_name`` + - Additional tables used in the WHERE condition for performing complex joins. + * - ``condition`` + - Specifies the condition for updating the data. + +Examples +======== + +The examples section shows how to modify the value of certain columns in existing rows without creating a table. + +To be able to follow the examples, create these two tables: + +**countries** + ++----+--------+--------------+ +| id | name | records_sold | ++====+========+==============+ +| 1 | Israel | 0 | ++----+--------+--------------+ +| 2 | UK | 0 | ++----+--------+--------------+ +| 3 | USA | 0 | ++----+--------+--------------+ +| 4 | Sweden | 0 | ++----+--------+--------------+ + +**bands** + ++----+-------------+------------+ +| id | name | country_id | ++====+=============+============+ +| 1 | The Beatles | 2 | ++----+-------------+------------+ +| 2 | The Ramones | 3 | ++----+-------------+------------+ +| 3 | ABBA | 4 | ++----+-------------+------------+ +| 4 | Ace of Base | 4 | ++----+-------------+------------+ + +.. code-block:: postgres + + create or replace table countries ( id int, name text, records_sold int); + insert into countries values (1, 'Israel', 0); + insert into countries values (2, 'UK', 0); + insert into countries values (3, 'USA', 0); + insert into countries values (4, 'Sweden', 0); + + create or replace table bands ( id int, name text, country_id int); + insert into bands values (1, 'The Beatles', 2); + insert into bands values (2, 'The Ramones', 3); + insert into bands values (3, 'ABBA', 4); + insert into bands values (4, 'Ace of Base', 4); + + + +.. contents:: + :local: + :depth: 1 + +Updating an Entire Table +------------------------ + +Two different ``UPDATE`` methods for updating an entire table. + +.. code-block:: postgres + + UPDATE countries SET records_sold = 0; + +.. code-block:: postgres + + UPDATE countries SET records_sold = 0 WHERE true; + + +Performing Simple Updates +------------------------- + +The following is an example of performing a simple update: + +.. code-block:: postgres + + UPDATE countries SET records_sold = records_sold + 1 WHERE name = 'Israel'; + +Updating Tables that Contain Multi-Table Conditions +--------------------------------------------------- + +The following shows an example of updating tables that contain multi-table conditions: + +.. code-block:: postgres + + UPDATE countries + SET records_sold = records_sold + 1 + WHERE EXISTS ( + SELECT 1 FROM bands + WHERE bands.country_id = countries.id + AND bands.name = 'ABBA' + ); + + +Updating Tables that Contain Multi-Table Expressions +---------------------------------------------------- + +The following shows an example of updating tables that contain multi-table expressions: + +.. code-block:: postgres + + UPDATE countries + SET records_sold = records_sold + + CASE + WHEN name = 'Israel' THEN 2 + ELSE 1 + END + FROM countries c + ; + +Triggering a Cleanup +-------------------- + +When an ``UPDATE`` statement is executed, it creates new chunks that contain the updated data. As a result, residual data may be left behind, and a cleanup operation is necessary to complete the physical removal of data. +This cleanup is usually done automatically overnight, but you can choose to do so yourself to remove the redundant files immediately. + + +The following is the syntax for triggering a cleanup: + +.. 
code-block:: postgres + + SELECT cleanup_chunks('schema_name','table_name'); + SELECT cleanup_extents('schema_name','table_name'); + + +Permissions +=========== + +Executing an ``UPDATE`` statement requires the following permissions: + +* Both ``UPDATE`` and ``SELECT`` permissions on the target table. +* The ``SELECT`` permission for each additional table you reference in the statement (in either the ``FROM`` clause or ``WHERE`` subquery section). + +Locking and Concurrency +======================= + +Executing the ``UPDATE`` statement obtains an exclusive UPDATE lock on the target table, but does not lock the destination tables. \ No newline at end of file diff --git a/reference/sql/sql_statements/dml_commands/values.rst b/reference/sql/sql_statements/dml_commands/values.rst index 4e8e1f28b..aa8d367e1 100644 --- a/reference/sql/sql_statements/dml_commands/values.rst +++ b/reference/sql/sql_statements/dml_commands/values.rst @@ -1,83 +1,105 @@ .. _values: -********************** +****** VALUES -********************** +****** -``VALUES`` is a table constructor - a clause that can be used to define tabular data. - -.. tip:: - * Use VALUES in conjunction with :ref:`INSERT` statements to insert a set of one or more rows. - - -Permissions -============= - -This clause requires no special permissions. +``VALUES`` is a table constructor used to define tabular data. It’s utilized with :ref:`INSERT` statements to insert one or more rows. Syntax -========== +====== .. code-block:: postgres - values_expr ::= VALUES ( value_expr [, ... ] ) [, ... ] - ; -Notes +Usage Notes =========== -Each set of comma-separated ``value_expr`` in parentheses represents a single row in the result set. +.. glossary:: + + ``value_expr`` + Each set of comma-separated ``value_expr`` in parentheses represents a single row in the result set. -Column names of the result table are auto-generated. To rename the column names, add an ``AS`` clause. + **Column Names** + Column names of the result table are auto-generated. To rename the column names, add an ``AS`` clause. + + **Aggregations** + Aggregations (e.g., ``SUM``, ``COUNT``) cannot be directly used in the ``VALUES`` clause. Examples -=========== +======== -Tabular data with VALUES --------------------------- +Tabular data with ``VALUES`` +---------------------------- .. code-block:: psql - master=> VALUES (1,2,3,4), (5,6,7,8), (9,10,11,12); - 1,2,3,4 - 5,6,7,8 - 9,10,11,12 - 3 rows + VALUES (1,2,3,4), (5,6,7,8), (9,10,11,12); -Using VALUES with a SELECT query ----------------------------------- + clmn1 |clmn2 |clmn3 |clmn4 + ------+------+------+----- + 1 | 2 | 3 | 4 + 5 | 6 | 7 | 8 + 9 | 10 | 11 | 12 -To use VALUES in a select query, assign a :ref:`name` to the ``VALUES`` clause with ``AS`` +Using ``VALUES`` in a ``SELECT`` Query +-------------------------------------- .. code-block:: postgres - master=> SELECT t.* FROM (VALUES (1,2,3,'a'), (5,6,7,'b'), (9,10,11,'c')) AS t; - 1,2,3,a - 5,6,7,b - 9,10,11,c - - 3 rows + SELECT + t.* + FROM + ( + VALUES + (1, 2, 3, 'a'), + (5, 6, 7, 'b'), + (9, 10, 11, 'c') + ) AS t; + + clmn1 |clmn2 |clmn3 |clmn4 + ------+------+------+----- + 1 | 2 | 3 | a + 5 | 6 | 7 | b + 9 | 10 | 11 | c You can also use this to rename the columns .. 
code-block:: postgres
 
-   SELECT t.* FROM (VALUES (1,2,3,'a'), (5,6,7,'b'), (9,10,11,'c')) AS t(a,b,c,d);
-
-
-Creating a table with ``VALUES``
+   SELECT
+     t.*
+   FROM
+     (
+       VALUES
+         (1, 2, 3, 'a'),
+         (5, 6, 7, 'b'),
+         (9, 10, 11, 'c')
+     ) AS t(a, b, c, d);
+
+Creating a Table Using ``VALUES``
 ---------------------------------
 
 Use ``AS`` to assign names to columns.
 
 .. code-block:: postgres
 
-   CREATE TABLE cool_animals AS
-   (SELECT t.*
-    FROM (VALUES (1, 'dog'),
-                 (2, 'cat'),
-                 (3, 'horse'),
-                 (4, 'hippopotamus')
-         ) AS t(id, name)
-   );
+   CREATE TABLE
+     cool_animals AS (
+       SELECT
+         t.*
+       FROM
+         (
+           VALUES
+             (1, 'dog'),
+             (2, 'cat'),
+             (3, 'horse'),
+             (4, 'hippopotamus')
+         )
+         AS t(id, name)
+     );
+
+Permissions
+===========
+
+This clause requires no special permissions.
\ No newline at end of file
diff --git a/reference/sql/sql_statements/index.rst b/reference/sql/sql_statements/index.rst
index 955e1212f..b983876b2 100644
--- a/reference/sql/sql_statements/index.rst
+++ b/reference/sql/sql_statements/index.rst
@@ -4,15 +4,23 @@
 SQL Statements
 ***************
 
-SQream DB supports commands from ANSI SQL.
+The **SQL Statements** page describes the following commands:
+
+.. contents::
+   :local:
+   :depth: 1
+
+SQream supports commands from ANSI SQL.
 
 .. _ddl_commands_list:
 
 Data Definition Commands (DDL)
 ================================
 
-.. list-table:: DDL Commands
-   :widths: auto
+The following table shows the Data Definition commands:
+
+.. list-table::
+   :widths: 30 100
    :header-rows: 1
   :name: ddl_commands
 
@@ -24,13 +32,13 @@ Data Definition Commands (DDL)
     - Change the default schema for a role
   * - :ref:`ALTER TABLE`
     - Change the schema of a table
+  * - :ref:`CLUSTER BY`
+    - Change clustering keys in a table
   * - :ref:`CREATE DATABASE`
     - Create a new database
-  * - :ref:`CREATE EXTERNAL TABLE`
-    - Create a new external table in the database (deprecated)
   * - :ref:`CREATE FOREIGN TABLE`
     - Create a new foreign table in the database
-  * - :ref:`CREATE FUNCTION `
+  * - :ref:`CREATE FUNCTION`
     - Create a new user defined function in the database
   * - :ref:`CREATE SCHEMA`
     - Create a new schema in the database
@@ -40,6 +48,8 @@ Data Definition Commands (DDL)
     - Create a new table in the database using results from a select query
   * - :ref:`CREATE VIEW`
     - Create a new view in the database
+  * - :ref:`DROP CLUSTERING KEY`
+    - Drops all clustering keys in a table
   * - :ref:`DROP COLUMN`
     - Drop a column from a table
   * - :ref:`DROP DATABASE`
@@ -56,15 +66,20 @@ Data Definition Commands (DDL)
     - Rename a column
   * - :ref:`RENAME TABLE`
     - Rename a table
+  * - :ref:`RENAME SCHEMA`
+    - Rename a schema
+
+
 
 Data Manipulation Commands (DML)
 ================================
 
-.. list-table:: DML Commands
-   :widths: auto
+The following table shows the Data Manipulation commands:
+
+.. list-table::
+   :widths: 30 100
    :header-rows: 1
   :name: dml_commands
-
   * - Command
     - Usage
@@ -82,87 +97,101 @@ Data Manipulation Commands (DML)
     - Select rows and columns from a table
   * - :ref:`TRUNCATE`
     - Delete all rows from a table
+  * - :ref:`UPDATE`
+    - Modify the value of certain columns in existing rows without creating a table
   * - :ref:`VALUES`
     - Return rows containing literal values
 
 Utility Commands
 ==================
 
-.. list-table:: Utility Commands
-   :widths: auto
+The following table shows the Utility commands:
+
+.. 
list-table:: + :widths: 30 100 :header-rows: 1 - + * - Command - Usage - * - :ref:`SELECT GET_LICENSE_INFO` - - View a user's license information - * - :ref:`SELECT GET_DDL` + * - :ref:`DROP SAVED QUERY` + - Drops a saved query + * - :ref:`DUMP DATABASE DDL` + - View the ``CREATE TABLE`` statement for an current database + * - :ref:`EXECUTE SAVED QUERY` + - Executes a previously saved query + * - :ref:`EXPLAIN` + - Returns a static query plan, which can be used to debug query plans + * - :ref:`export_open_snapshots` + - Lists and saves information about all currently open snapshots to a specified file + * - :ref:`GET` + - Transfer data files stored within the database's internal staging area to a user's local file system. + * - :ref:`get_chunk_info` + - Retrieves information of specific chunks + * - :ref:`GET DDL` - View the ``CREATE TABLE`` statement for a table - * - :ref:`SELECT GET_FUNCTION_DDL` + * - :ref:`get_extent_info` + - Retrieves information of specific extents + * - :ref:`GET FUNCTION DDL` - View the ``CREATE FUNCTION`` statement for a UDF - * - :ref:`SELECT GET_VIEW_DDL` - - View the ``CREATE VIEW`` statement for a view - * - :ref:`SELECT RECOMPILE_VIEW` + * - :ref:`GET LICENSE INFO` + - View a user's license information + * - :ref:`GLOBAL GRACEFUL SHUTDOWN` + - Graceful shutdown of all servers in the cluster + * - :ref:`GPU METRICS` + - Monitor license quota usage by reviewing monthly or daily GPU usage + * - :ref:`get_open_snapshots` + - Lists information about all currently open snapshots + * - :ref:`GET TOTAL CHUNKS SIZE` + - Returns the total size of all data chunks saved in the system + * - :ref:`GET VIEW DDL` + - View the ``CREATE VIEW`` statement for a view + * - :ref:`HEALTH CHECK MONITORING` + - Returns system health monitoring logs + * - :ref:`LDAP GET ATTR` + - Enables you to specify the LDAP attributes you want the SQreamDB role catalog table to show + * - :ref:`LIST SAVED QUERIES` + - Lists previously saved query names, one per row + * - :ref:`PUT` + - Transfer data files stored within a user's local file system to the database's internal staging area. + * - :ref:`RECHUNK` + - Enables you to merge small data chunks into larger ones + * - :ref:`RECOMPILE SAVED QUERY` + - Recompiles a saved query that has been invalidated due to a schema change + * - :ref:`RECOMPILE VIEW` - Recreate a view after schema changes - * - :ref:`SELECT DUMP_DATABASE_DDL` - - View the ``CREATE TABLE`` statement for an current database - -Saved Queries -=================== + * - :ref:`REMOVE` + - Delete data files stored within the database's internal staging area. 
+ * - :ref:`REMOVE LOCK` + - Clears locks + * - :ref:`REMOVE STATEMENT LOCKS` + - Clears all locks in the system + * - :ref:`SHOW CONNECTIONS` + - Returns a list of active sessions on the current worker + * - :ref:`SHOW LOCKS` + - Returns a list of locks from across the cluster + * - :ref:`SHOW NODE INFO` + - Returns a snapshot of the current query plan, similar to ``EXPLAIN ANALYZE`` from other databases + * - :ref:`SHOW SAVED QUERY` + - Returns a single row result containing the saved query string + * - :ref:`SHOW SERVER STATUS` + - Returns a list of active sessions across the cluster + * - :ref:`SHUTDOWN SERVER` + - Sets your server to finish compiling all active queries before shutting down according to a user-defined time value + * - :ref:`STOP STATEMENT` + - Stops or aborts an active statement + * - :ref:`SHOW VERSION` + - Returns the system version for SQream DB + * - :ref:`swap_table_names` + - Swaps the names of two tables contained within a schema -.. list-table:: Saved Queries - :widths: auto - :header-rows: 1 - - * - Command - - Usage - * - :ref:`SELECT DROP_SAVED_QUERY` - - Drop a saved query - * - :ref:`SELECT EXECUTE_SAVED_QUERY` - - Executes a saved query - * - :ref:`SELECT LIST_SAVED_QUERIES` - - Returns a list of saved queries - * - :ref:`SELECT RECOMPILE_SAVED_QUERY` - - Recompiles a query that has been invalidated by a schema change - * - :ref:`SELECT SAVE_QUERY` - - Compiles and saves a query for re-use and sharing - * - :ref:`SELECT SHOW_SAVED_QUERY` - - Shows query text for a saved query - -For more information, see :ref:`saved_queries` - - -Monitoring -=============== - -Monitoring statements allow a database administrator to execute actions in the system, such as aborting a query or get information about system processes. - -.. list-table:: Monitoring - :widths: auto - :header-rows: 1 - - * - Command - - Usage - * - :ref:`explain` - - Returns a static query plan for a statement - * - :ref:`show_connections` - - Returns a list of jobs and statements on the current worker - * - :ref:`show_locks` - - Returns any existing locks in the database - * - :ref:`show_node_info` - - Returns a query plan for an actively running statement with timing information - * - :ref:`show_server_status` - - Shows running statements across the cluster - * - :ref:`show_version` - - Returns the version of SQream DB - * - :ref:`stop_statement` - - Stops a query (or statement) if it is currently running Workload Management ====================== -.. list-table:: Workload Management - :widths: auto +The following table shows the Workload Management commands: + +.. list-table:: + :widths: 30 100 :header-rows: 1 * - Command @@ -170,16 +199,18 @@ Workload Management * - :ref:`subscribe_service` - Add a SQream DB worker to a service queue * - :ref:`unsubscribe_service` - - Remove a SQream DB worker to a service queue + - Remove a SQream DB worker from a service queue * - :ref:`show_subscribed_instances` - Return a list of service queues and workers Access Control Commands ================================ -.. list-table:: Access Control Commands - :widths: auto - :header-rows: 1 +The following table shows the Access Control commands: + +.. 
list-table:: + :widths: 30 100 + :header-rows: 1 * - Command - Usage @@ -191,25 +222,23 @@ Access Control Commands - Creates a roles, which lets a database administrator control permissions on tables and databases * - :ref:`drop_role` - Removes roles + * - :ref:`get_all_roles_database_ddl` + - Returns the definition of all role databases in DDL format + * - :ref:`get_role_permissions` + - Returns all permissions granted to a role in table format + * - :ref:`get_role_global_ddl` + - Returns the definition of a global role in DDL format + * - :ref:`get_all_roles_global_ddl` + - Returns the definition of all global roles in DDL format + * - :ref:`get_role_database_ddl` + - Returns the definition of a role's database in DDL format * - :ref:`get_statement_permissions` - Returns a list of permissions required to run a statement or query * - :ref:`grant` - Grant permissions to a role + * - :ref:`grant_usage_on_service_to_all_roles` + - Grant service usage permissions * - :ref:`revoke` - Revoke permissions from a role * - :ref:`rename_role` - Rename a role - - -.. toctree:: - :maxdepth: 1 - :titlesonly: - :hidden: - :glob: - - ddl_commands/* - dml_commands/* - utility_commands/* - monitoring_commands/* - wlm_commands/* - access_control_commands/* \ No newline at end of file diff --git a/reference/sql/sql_statements/monitoring_commands/show_locks.rst b/reference/sql/sql_statements/monitoring_commands/show_locks.rst deleted file mode 100644 index d5e7c02ec..000000000 --- a/reference/sql/sql_statements/monitoring_commands/show_locks.rst +++ /dev/null @@ -1,79 +0,0 @@ -.. _show_locks: - -******************** -SHOW_LOCKS -******************** - -``SHOW_LOCKS`` returns a list of locks from across the cluster. - -Read more about locks in :ref:`concurrency_and_locks`. - -Permissions -============= - -The role must have the ``SUPERUSER`` permissions. - -Syntax -========== - -.. code-block:: postgres - - show_locks_statement ::= - SELECT SHOW_LOCKS() - ; - -Parameters -============ - -None - -Returns -========= - -This function returns a list of active locks. If no locks are active in the cluster, the result set will be empty. - -.. list-table:: Result columns - :widths: auto - :header-rows: 1 - - * - ``stmt_id`` - - Statement ID that caused the lock. - * - ``stmt_string`` - - Statement text - * - ``username`` - - The role that executed the statement - * - ``server`` - - The worker node's IP - * - ``port`` - - The worker node's port - * - ``locked_object`` - - The full qualified name of the object being locked, separated with ``$`` (e.g. ``table$t$public$nba2`` for table ``nba2`` in schema ``public``, in database ``t`` - * - ``lockmode`` - - The locking mode (:ref:`inclusive` or :ref:`exclusive`). - * - ``statement_start_time`` - - Timestamp the statement started - * - ``lock_start_time`` - - Timestamp the lock was obtained - - -Examples -=========== - -Using ``SHOW_LOCKS`` to see active locks ---------------------------------------------------- - -In this example, we create a table based on results (:ref:`create_table_as`), but we are also effectively dropping the previous table (by using ``OR REPLACE``). Thus, SQream DB applies locks during the table creation process to prevent the table from being altered during it's creation. - - -.. 
code-block:: psql - - t=> SELECT SHOW_LOCKS(); - statement_id | statement_string | username | server | port | locked_object | lockmode | statement_start_time | lock_start_time - -------------+-------------------------------------------------------------------------------------------------+----------+--------------+------+---------------------------------+-----------+----------------------+-------------------- - 287 | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream | 192.168.1.91 | 5000 | database$t | Inclusive | 2019-12-26 00:03:30 | 2019-12-26 00:03:30 - 287 | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream | 192.168.1.91 | 5000 | globalpermission$ | Exclusive | 2019-12-26 00:03:30 | 2019-12-26 00:03:30 - 287 | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream | 192.168.1.91 | 5000 | schema$t$public | Inclusive | 2019-12-26 00:03:30 | 2019-12-26 00:03:30 - 287 | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream | 192.168.1.91 | 5000 | table$t$public$nba2$Insert | Exclusive | 2019-12-26 00:03:30 | 2019-12-26 00:03:30 - 287 | CREATE OR REPLACE TABLE nba2 AS SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | sqream | 192.168.1.91 | 5000 | table$t$public$nba2$Update | Exclusive | 2019-12-26 00:03:30 | 2019-12-26 00:03:30 - - diff --git a/reference/sql/sql_statements/monitoring_commands/show_server_status.rst b/reference/sql/sql_statements/monitoring_commands/show_server_status.rst deleted file mode 100644 index f59f79ccc..000000000 --- a/reference/sql/sql_statements/monitoring_commands/show_server_status.rst +++ /dev/null @@ -1,108 +0,0 @@ -.. _show_server_status: - -******************** -SHOW_SERVER_STATUS -******************** - -``SHOW_SERVER_STATUS`` returns a list of active sessions across the cluster. - -To list active statements on the current worker only, see :ref:`show_connections`. - -Permissions -============= - -The role must have the ``SUPERUSER`` permissions. - -Syntax -========== - -.. code-block:: postgres - - show_server_status_statement ::= - SELECT SHOW_SERVER_STATUS() - ; - -Parameters -============ - -None - -Returns -========= - -This function returns a list of active sessions. If no sessions are active across the cluster, the result set will be empty. - -.. list-table:: Result columns - :widths: auto - :header-rows: 1 - - * - ``service`` - - The service name for the statement - * - ``instance`` - - The worker ID - * - ``connection_id`` - - Connection ID - * - ``serverip`` - - Worker end-point IP - * - ``serverport`` - - Worker end-point port - * - ``database_name`` - - Database name for the statement - * - ``user_name`` - - Username running the statement - * - ``clientip`` - - Client IP - * - ``statementid`` - - Statement ID - * - ``statement`` - - Statement text - * - ``statementstarttime`` - - Statement start timestamp - * - ``statementstatus`` - - Statement status (see table below) - * - ``statementstatusstart`` - - Last updated timestamp - -.. include from here: 66 - - -.. 
list-table:: Statement status values - :widths: auto - :header-rows: 1 - - * - Status - - Description - * - ``Preparing`` - - Statement is being prepared - * - ``In queue`` - - Statement is waiting for execution - * - ``Initializing`` - - Statement has entered execution checks - * - ``Executing`` - - Statement is executing - * - ``Stopping`` - - Statement is in the process of stopping - - -.. include until here 86 - -Notes -=========== - -* This utility shows the active sessions. Some sessions may be actively connected, but not running any statements. - -Examples -=========== - -Using ``SHOW_SERVER_STATUS`` to get statement IDs ----------------------------------------------------- - - -.. code-block:: psql - - t=> SELECT SHOW_SERVER_STATUS(); - service | instanceid | connection_id | serverip | serverport | database_name | user_name | clientip | statementid | statement | statementstarttime | statementstatus | statementstatusstart - --------+------------+---------------+--------------+------------+---------------+------------+-------------+-------------+-----------------------------+---------------------+-----------------+--------------------- - sqream | | 102 | 192.168.1.91 | 5000 | t | rhendricks | 192.168.0.1 | 128 | SELECT SHOW_SERVER_STATUS() | 24-12-2019 00:14:53 | Executing | 24-12-2019 00:14:53 - -The statement ID is ``128``, running on worker ``192.168.1.91``. diff --git a/reference/sql/sql_statements/monitoring_commands/stop_statement.rst b/reference/sql/sql_statements/monitoring_commands/stop_statement.rst deleted file mode 100644 index 30efc25b5..000000000 --- a/reference/sql/sql_statements/monitoring_commands/stop_statement.rst +++ /dev/null @@ -1,77 +0,0 @@ -.. _stop_statement: - -******************** -STOP_STATEMENT -******************** - -``STOP_STATEMENT`` stops or aborts an active statement. - -To find a statement by ID, see :ref:`show_server_status` and :ref:`show_connections`. - -.. tip:: Some DBMSs call this process killing a session, terminating a job, or kill query - -Permissions -============= - -The role must have the ``SUPERUSER`` permissions. - -Syntax -========== - -.. code-block:: postgres - - stop_statement_statement ::= - SELECT STOP_STATEMENT(stmt_id) - ; - - stmt_id ::= bigint - -Parameters -============ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Parameter - - Description - * - ``stmt_id`` - - The statement ID to stop - -Returns -========= - -This utility does not return any value, and always succeeds even if the statement does not exist, or has already stopped. - - -Notes -=========== - -* This utility always succeeds even if the statement does not exist, or has already stopped. - -Examples -=========== - -Using :ref:`show_connections` to get statement IDs ----------------------------------------------------- - -.. tip:: Use :ref:`show_server_status` to find statments from across the entire cluster, or :ref:`show_connections` to show statements from the current worker the client is connected to. - -.. code-block:: psql - - t=> SELECT SHOW_CONNECTIONS(); - ip | conn_id | conn_start_time | stmt_id | stmt_start_time | stmt - -------------+---------+---------------------+---------+---------------------+-------------------------- - 192.168.1.91 | 103 | 2019-12-24 00:01:27 | 129 | 2019-12-24 00:38:18 | SELECT GET_DATE(), * F... 
- 192.168.1.91 | 23 | 2019-12-24 00:01:27 | -1 | 2019-12-24 00:01:27 | - 192.168.1.91 | 22 | 2019-12-24 00:01:27 | -1 | 2019-12-24 00:01:27 | - 192.168.1.91 | 26 | 2019-12-24 00:01:28 | -1 | 2019-12-24 00:01:28 | - - -The statement ID we're interested in is ``129``. We can now stop this statement: - -.. code-block:: psql - - t=> SELECT STOP_STATEMENT(129) - executed - diff --git a/reference/sql/sql_statements/utility_commands/drop_saved_query.rst b/reference/sql/sql_statements/utility_commands/drop_saved_query.rst index f7faef6c5..af69bf715 100644 --- a/reference/sql/sql_statements/utility_commands/drop_saved_query.rst +++ b/reference/sql/sql_statements/utility_commands/drop_saved_query.rst @@ -1,28 +1,24 @@ +:orphan: + .. _drop_saved_query: ******************** -DROP_SAVED_QUERY +DROP SAVED QUERY ******************** -``DROP_SAVED_QUERY`` drops a :ref:`previously saved query`. - -Read more in the :ref:`saved_queries` guide. - -See also: ref:`save_query`, :ref:`execute_saved_query`, ref:`show_saved_query`, ref:`list_saved_queries`. +``DROP_SAVED_QUERY`` drops a previously :ref:`saved query`. -Permissions -============= +Read more in the :ref:`saved_queries` guide. -Dropping a saved query requires no special permissions. +See also: :ref:`save_query`, :ref:`execute_saved_query`, :ref:`show_saved_query`, :ref:`list_saved_queries`. Syntax ========== -.. code-block:: postgres +.. code-block:: sql drop_saved_query_statement ::= SELECT DROP_SAVED_QUERY(saved_query_name) - ; saved_query_name ::= string_literal @@ -49,7 +45,12 @@ Examples Dropping a previously saved query --------------------------------------- -.. code-block:: psql +.. code-block:: sql + + SELECT DROP_SAVED_QUERY('select_all'); + + +Permissions +============= - t=> SELECT DROP_SAVED_QUERY('select_all'); - executed +Dropping a saved query requires ``DDL`` permissions on the saved query and ``SELECT`` permissions to access the tables referenced in the query. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/dump_database_ddl.rst b/reference/sql/sql_statements/utility_commands/dump_database_ddl.rst index bf246b803..131b65f52 100644 --- a/reference/sql/sql_statements/utility_commands/dump_database_ddl.rst +++ b/reference/sql/sql_statements/utility_commands/dump_database_ddl.rst @@ -1,78 +1,71 @@ +:orphan: + .. _dump_database_ddl: ***************** DUMP_DATABASE_DDL ***************** -``DUMP_DATABASE_DDL()`` is a function that shows the ``CREATE`` statements for database objects including views and tables. Begining with 2020.3.1, DUMP_DATABASE_DDL includes foreign tables in the output. - -.. warning:: - This function does not currently show UDFs. To list available UDFs, use the catalog: - - .. code-block:: psql - - farm=> SELECT * FROM sqream_catalog.user_defined_functions; - farm,1,my_distance - - Then, export UDFs one-by-one using :ref:`GET_FUNCTION_DDL`. - -.. tip:: - * For just tables, see :ref:`GET_DDL`. - * For just views, see :ref:`GET_VIEW_DDL`. - * For UDFs, see :ref:`GET_FUNCTION_DDL`. - -Permissions -============= - -The role must have the ``CONNECT`` permission at the database level. +``DUMP_DATABASE_DDL()`` is a function that shows the ``CREATE`` statements for database objects including views and tables. Syntax -========== +====== .. code-block:: postgres - dump_database_ddl_statement ::= - SELECT DUMP_DATABASE_DDL() - ; + SELECT DUMP_DATABASE_DDL() -Parameters -============ +Examples +======== -This function accepts no parameters. +.. 
code-block:: postgres -Examples -=========== + SELECT + DUMP_DATABASE_DDL(); -Getting the DDL for a database ---------------------------------- +Exporting database DDL to a file +-------------------------------- -.. code-block:: psql +.. code-block:: postgres - farm=> SELECT DUMP_DATABASE_DDL(); - create table "public"."cool_animals" ( - "id" int not null, - "name" varchar(30) not null, - "weight" double null, - "is_agressive" bool default false not null - ) - ; + COPY + ( + SELECT + DUMP_DATABASE_DDL() + ) TO + WRAPPER + csv_fdw + OPTIONS + (LOCATION = 's3://sqream-docs/database.ddl'); + +Showing the ``CREATE`` Statements for UDFs +------------------------------------------ + +``DUMP_DATABASE_DDL`` does not show UDFs. + +To list available UDFs: + +#. Retrieve UDFs from catalog: + + .. code-block:: postgres - create view "public".angry_animals as - select - "cool_animals"."id" as "id", - "cool_animals"."name" as "name", - "cool_animals"."weight" as "weight", - "cool_animals"."is_agressive" as "is_agressive" - from - "public".cool_animals as cool_animals - where - "cool_animals"."is_agressive" = false; + SELECT + * + FROM + sqream_catalog.user_defined_functions; + Output: + .. code-block:: console -Exporting database DDL to a file ------------------------------------- + database_name|function_id|function_name| + -------------+-----------+-------------+ + master | 0|add_months | + master | 2|my_distance | + +#. Export UDFs one-by-one using :ref:`GET_FUNCTION_DDL`. -.. code-block:: postgres +Permissions +=========== - COPY (SELECT DUMP_DATABASE_DDL()) TO '/home/rhendricks/database.ddl'; +The role must have the ``CONNECT`` permission at the database level. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/execute_saved_query.rst b/reference/sql/sql_statements/utility_commands/execute_saved_query.rst index 6fe41fa08..cb3205be1 100644 --- a/reference/sql/sql_statements/utility_commands/execute_saved_query.rst +++ b/reference/sql/sql_statements/utility_commands/execute_saved_query.rst @@ -1,28 +1,24 @@ +:orphan: + .. _execute_saved_query: ******************** -EXECUTE_SAVED_QUERY +EXECUTE SAVED QUERY ******************** -``EXECUTE_SAVED_QUERY`` executes a :ref:`previously saved query`. +``EXECUTE_SAVED_QUERY`` executes a previously :ref:`saved query`. Read more in the :ref:`saved_queries` guide. -See also: ref:`save_query`, :ref:`drop_saved_query`, ref:`show_saved_query`, ref:`list_saved_queries`. - -Permissions -============= - -Executing a saved query requires ``SELECT`` permissions to access the tables referenced in the query. +See also: :ref:`save_query`, :ref:`drop_saved_query`, :ref:`show_saved_query`, :ref:`list_saved_queries`. Syntax ========== -.. code-block:: postgres +.. code-block:: sql execute_saved_query_statement ::= SELECT EXECUTE_SAVED_QUERY(saved_query_name, [ , argument [ , ... ] ] ) - ; saved_query_name ::= string_literal @@ -53,7 +49,7 @@ Notes * Query parameters can be used as substitutes for literal expressions. Parameters cannot be used to substitute identifiers, column names, table names, or other parts of the query. -* Query parameters of a string datatype (like ``VARCHAR``) must be of a fixed length, and can be used in equality checks, but not patterns (e.g. :ref:`like`, :ref:`rlike`, etc) +* Query parameters of a string datatype (like ``text``) must be of a fixed length, and can be used in equality checks, but not patterns (e.g. :ref:`like`, :ref:`rlike`, etc) * Query parameters' types are inferred at compile time. 
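+
+For example (using hypothetical saved-query names), a parameter can stand in for a literal value, but not for an identifier such as a table or column name:
+
+.. code-block:: psql
+
+   -- Valid: ? substitutes a literal in the WHERE clause
+   SELECT SAVE_QUERY('nba_by_age', $$SELECT * FROM nba WHERE Age > ?$$);
+   SELECT EXECUTE_SAVED_QUERY('nba_by_age', 30);
+
+   -- Invalid: ? cannot substitute a table name
+   SELECT SAVE_QUERY('bad_query', $$SELECT * FROM ?$$);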
@@ -66,14 +62,14 @@ Assume a table named ``nba``, with the following structure: CREATE TABLE nba ( - Name varchar(40), - Team varchar(40), + Name text(40), + Team text(40), Number tinyint, - Position varchar(2), + Position text(2), Age tinyint, - Height varchar(4), + Height text(4), Weight real, - College varchar(40), + College text(40), Salary float ); @@ -89,11 +85,11 @@ Here's a peek at the table contents (:download:`Download nba.csv SELECT SAVE_QUERY('select_all','SELECT * FROM nba'); - executed - t=> SELECT EXECUTE_SAVED_QUERY('select_all'); + SELECT SAVE_QUERY('select_all','SELECT * FROM nba'); + + SELECT EXECUTE_SAVED_QUERY('select_all'); Name | Team | Number | Position | Age | Height | Weight | College | Salary -------------------------+------------------------+--------+----------+-----+--------+--------+-----------------------+--------- Avery Bradley | Boston Celtics | 0 | PG | 25 | 6-2 | 180 | Texas | 7730337 @@ -109,11 +105,11 @@ Use parameters to replace them later at execution time. .. tip:: Use dollar quoting (`$$`) to avoid escaping strings. - .. code-block:: psql +.. code-block:: psql - t=> SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? AND Team = ?$$); - executed - t=> SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors'); + SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? AND Team = ?$$); + + SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors'); Name | Team | Number | Position | Age | Height | Weight | College | Salary ------------------+-----------------+--------+----------+-----+--------+--------+-------------+-------- Bismack Biyombo | Toronto Raptors | 8 | C | 23 | 6-9 | 245 | | 2814000 @@ -121,3 +117,7 @@ Use parameters to replace them later at execution time. Jason Thompson | Toronto Raptors | 1 | PF | 29 | 6-11 | 250 | Rider | 245177 Jonas Valanciunas | Toronto Raptors | 17 | C | 24 | 7-0 | 255 | | 4660482 +Permissions +============= + +Executing a saved query requires ``USAGE`` permissions on the saved query and ``SELECT`` permissions to access the tables referenced in the query. \ No newline at end of file diff --git a/reference/sql/sql_statements/monitoring_commands/explain.rst b/reference/sql/sql_statements/utility_commands/explain.rst similarity index 99% rename from reference/sql/sql_statements/monitoring_commands/explain.rst rename to reference/sql/sql_statements/utility_commands/explain.rst index e9b7e7ee9..cbdb1e4a8 100644 --- a/reference/sql/sql_statements/monitoring_commands/explain.rst +++ b/reference/sql/sql_statements/utility_commands/explain.rst @@ -1,3 +1,5 @@ +:orphan: + .. _explain: ***************** diff --git a/reference/sql/sql_statements/utility_commands/export_open_snapshots.rst b/reference/sql/sql_statements/utility_commands/export_open_snapshots.rst new file mode 100644 index 000000000..fefd5dffe --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/export_open_snapshots.rst @@ -0,0 +1,40 @@ +:orphan: + +.. _export_open_snapshots: + +********************* +EXPORT OPEN SNAPSHOTS +********************* + +The ``EXPORT_OPEN_SNAPSHOTS`` utility function lists and saves information about all currently open snapshots to a specified file. + +Syntax +====== + +.. code-block:: postgres + + SELECT EXPORT_OPEN_SNAPSHOTS('') + +Parameter +========= + +.. 
list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``file_path.txt`` + - The path to where you wish to export information about currently open snapshots + +Example +======= + +.. code-block:: postgres + + SELECT EXPORT_OPEN_SNAPSHOTS('./a.txt'); + +Permissions +=========== + +This utility function requires a ``SUPERUSER`` permission. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get.rst b/reference/sql/sql_statements/utility_commands/get.rst new file mode 100644 index 000000000..1ad88b002 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get.rst @@ -0,0 +1,47 @@ +:orphan: + +.. _get: + +*** +GET +*** + +The ``GET`` function addresses the need to transfer data files stored within the database's internal staging area to a user's local file system. + + +Syntax +====== + +.. code-block:: postgres + + GET <'SQDB-cluster-relative-file-path'> TO <'local-file-path'>; + +Parameters +========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``SQDB-cluster-relative-file-path`` + - Source File path - relative to the defined SQDB cluster staging area. + * - ``local-file-path`` + - Destination File path - on the executing client machine. + +Important Considerations +======================== + * File size may be up to 25MB. + * The SQDB cluster staging area's root path is configured using the ``stagingAreaRootPath`` flag. By default, this area is created within the defined ``storageClusterPath`` as a subdirectory named staging_area. This ``staging_area`` directory contains two subdirectories: ``content``, intended for user-uploaded data, and ``temp``, reserved for internal system operations. + * The command is limited to a single file copy per execution. + * File extensions are limited to supported FDWs. + * The command execution is CPU based and does not use GPU Workers. + * Up to 50 concurrent ``PUT`` / ``GET`` / ``REMOVE`` operation are supported per SQDB cluster. + * The feature is supported for the following drivers: PySQream, JDBC, ODBC + + +Permissions +============= + +The role must have the ``SUPERUSER`` privilege. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_all_roles_global_ddl.rst b/reference/sql/sql_statements/utility_commands/get_all_roles_global_ddl.rst new file mode 100644 index 000000000..794e37faa --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_all_roles_global_ddl.rst @@ -0,0 +1,54 @@ +:orphan: + +.. _get_all_roles_global_ddl: + +************************ +GET ALL ROLES GLOBAL DDL +************************ + +The ``GET_ALL_ROLES_GLOBAL_DDL`` statement returns the definition of all global roles in DDL format. + +.. contents:: + :local: + :depth: 1 + +Syntax +====== + +The following is the correct syntax for using the ``GET_ALL_ROLES_GLOBAL_DDL`` statement: + +.. code-block:: postgres + + select get_all_roles_global_ddl() + +Example +======= + +The following is an example of using the ``GET_ALL_ROLES_GLOBAL_DDL`` statement: + +.. code-block:: psql + + select get_all_roles_global_ddl(); + + +Output +====== + +The following is an example of the output of the ``GET_ALL_ROLES_GLOBAL_DDL`` statement: + +.. code-block:: postgres + + create role "public"; create role "sqream"; grant superuser, login to "sqream" ; + +Permissions +=========== + +Using the ``GET_ALL_ROLES_GLOBAL_DDL`` statement requires no special permissions. 
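As with the other DDL utilities in this section, the returned script can be redirected to a file. The following is a minimal sketch using the ``COPY ... TO WRAPPER`` pattern shown for ``DUMP_DATABASE_DDL``; the S3 location is illustrative: .. code-block:: postgres COPY ( SELECT GET_ALL_ROLES_GLOBAL_DDL() ) TO WRAPPER csv_fdw OPTIONS (LOCATION = 's3://sqream-docs/roles_global.ddl');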
+ +For more information, see the following: + +* :ref:`get_all_roles_database_ddl` + + :: + +* :ref:`get_role_permissions` \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_chunk_info.rst b/reference/sql/sql_statements/utility_commands/get_chunk_info.rst new file mode 100644 index 000000000..b01e84814 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_chunk_info.rst @@ -0,0 +1,78 @@ +:orphan: + +.. _get_chunk_info: + +************** +GET CHUNK INFO +************** + +The ``GET CHUNK INFO`` utility command allows you to retrieve information about specific chunks. + +Syntax +====== + +.. code-block:: sql + + SELECT get_chunk_info(<database_name>, <table_id>, [ <chunk_id> ]) + +Parameters +============ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``database_name`` + - The database in which to search for the chunk + * - ``table_id`` + - The id of the table related to the chunk + * - ``chunk_id`` + - The id of a specific chunk to search for + +Returns +======= + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``database_name`` + - The database in which the chunk exists + * - ``table_id`` + - The id of the table related to the chunk + * - ``column_id`` + - The id of the column related to the chunk + * - ``chunk_id`` + - The id of the chunk + * - ``extent_id`` + - The id of the extent + * - ``compressed_size`` + - The size of the chunk's compressed data + * - ``uncompressed_size`` + - The size of the chunk's uncompressed data + +Examples +======== + +.. code-block:: sql + + SELECT get_chunk_info(mfg_ldc_lake, 17271948, 143); + +Output: + +.. code-block:: console + + database_name |table_id |column_id |chunk_id |extent_id |compressed_size |uncompressed_size + --------------+---------+----------+---------+----------+----------------+----------------- + mfg_ldc |17271948 |16 |143 |142 |9892 |9892 + mfg_ldc |17271948 |17 |143 |142 |8 |39568 + + +Permissions +=========== + +This utility function requires a ``SUPERUSER`` permission. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_ddl.rst b/reference/sql/sql_statements/utility_commands/get_ddl.rst index f2566e99a..fc8ce9f1a 100644 --- a/reference/sql/sql_statements/utility_commands/get_ddl.rst +++ b/reference/sql/sql_statements/utility_commands/get_ddl.rst @@ -1,36 +1,25 @@ -.. _get_ddl: - -***************** -GET_DDL -***************** +:orphan: 
-``GET_DDL()`` is a function that shows the :ref:`CREATE TABLE` statement for a table. -.. tip:: - * For views, see :ref:`GET_VIEW_DDL`. - * For the entire database, see :ref:`DUMP_DATABASE_DDL`. - * For UDFs, see :ref:`GET_FUNCTION_DDL`. +******* +GET DDL +******* -Permissions -============= +The ``GET DDL`` function retrieves the Data Definition Language (DDL) statement used to create a table. It may include additional information that was added by SQreamDB (e.g., explicit ``NULL`` constraints). -The role must have the ``CONNECT`` permission at the database level. +See also: :ref:`GET_VIEW_DDL`, :ref:`DUMP_DATABASE_DDL`, :ref:`GET_FUNCTION_DDL` Syntax -========== +====== .. code-block:: postgres - get_ddl_statement ::= - SELECT GET_DDL('[schema_name.]table_name') - ; - - schema_name ::= identifier - - table_name ::= identifier + SELECT + GET_DDL(['<schema_name>'.]'<table_name>') Parameters -============ +========== .. list-table:: :widths: auto @@ -39,39 +28,54 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The name of the schema. + - The name of the schema * - ``table_name`` - - The name of the table. + - The name of the table Examples -=========== +======== -Getting the DDL for a table ------------------------------ -The result of the ``GET_DDL`` function is a verbose version of the :ref:`create_table` syntax, which may include additional information that was added by SQream DB. For example, a ``NULL`` constraint may be specified explicitly. +.. code-block:: postgres -.. code-block:: psql - - farm=> CREATE TABLE cool_animals ( - id INT NOT NULL, - name varchar(30) NOT NULL, - weight FLOAT, - is_agressive BOOL DEFAULT false NOT NULL - ); - executed - - farm=> SELECT GET_DDL('cool_animals'); - create table "public"."cool_animals" ( - "id" int not null, - "name" varchar(30) not null, - "weight" double null, - "is_agressive" bool default false not null ) - ; + -- Create a table: + CREATE TABLE + cool_animals ( + id INT NOT NULL, + name TEXT NOT NULL, + weight FLOAT, + is_agressive BOOL DEFAULT false NOT NULL + ); + + -- Get table ddl: + SELECT + GET_DDL('cool_animals'); + + -- Result: + CREATE TABLE + "public"."cool_animals" ( + "id" INT NOT NULL, + "name" TEXT NOT NULL, + "weight" DOUBLE NULL, + "is_agressive" BOOL DEFAULT FALSE NOT NULL + ); Exporting table DDL to a file -------------------------------- +----------------------------- .. code-block:: postgres - COPY (SELECT GET_DDL('cool_animals')) TO '/home/rhendricks/animals.ddl'; + COPY + ( + SELECT + GET_DDL('cool_animals') + ) TO + WRAPPER + csv_fdw + OPTIONS + (LOCATION = 's3://sqream-docs/cool_animals_ddl.csv'); + +Permissions +============= + +The role must have the ``CONNECT`` permission at the database level. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_extent_info.rst b/reference/sql/sql_statements/utility_commands/get_extent_info.rst new file mode 100644 index 000000000..5fe998b85 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_extent_info.rst @@ -0,0 +1,78 @@ +:orphan: + +.. _get_extent_info: + +*************** +GET EXTENT INFO +*************** + +The ``GET EXTENT INFO`` utility command allows you to retrieve information about specific extents. + +Syntax +====== + +.. code-block:: sql + + SELECT get_extent_info(<database_name>, <table_id>, [ <column_id> ]) + +Parameters +========== + +.. 
list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``database_name`` + - The database in which to search for the extent + * - ``table_id`` + - The id of the table related to the extent + * - ``column_id`` + - The id of a specific column to search for + +Returns +======= + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``database_name`` + - The database in which the extent exists + * - ``table_id`` + - The id of the table related to the chunk + * - ``column_id`` + - The id of the column related to the chunk + * - ``chunk_id`` + - The id of the chunk + * - ``extent_id`` + - The id of the extent + * - ``compressed_size`` + - The size of the extent's compressed data + * - ``uncompressed_size`` + - The size of the extent's uncompressed data + +Examples +======== + +.. code-block:: sql + + SELECT get_extent_info(mfg_ldc_lake, 17271948, 143); + +Output: + +.. code-block:: console + + database_name |table_id |column_id |chunk_id |extent_id |compressed_size |uncompressed_size + --------------+---------+----------+---------+----------+----------------+----------------- + mfg_ldc |17271948 |16 |143 |142 |9892 |9892 + mfg_ldc |17271948 |17 |143 |142 |8 |39568 + + +Permissions +=========== + +This utility function requires a ``SUPERUSER`` permission. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_function_ddl.rst b/reference/sql/sql_statements/utility_commands/get_function_ddl.rst index e456073ae..c3a4ba7f6 100644 --- a/reference/sql/sql_statements/utility_commands/get_function_ddl.rst +++ b/reference/sql/sql_statements/utility_commands/get_function_ddl.rst @@ -1,31 +1,20 @@ +:orphan: + .. _get_function_ddl: ***************** -GET_FUNCTION_DDL +GET FUNCTION DDL ***************** -``GET_FUNCTION_DDL()`` is a function that shows the :ref:`CREATE FUNCTION` statement for a function. - -.. tip:: - * For tables, see :ref:`GET_DDL`. - * For views, see :ref:`GET_VIEW_DDL`. - * For the entire database, see :ref:`DUMP_DATABASE_DDL`. - -Permissions -============= - -The role must have the ``CONNECT`` permission at the database level. +``GET_FUNCTION_DDL`` is a function that shows the :ref:`create_function` statement for a function. Syntax -========== +====== .. code-block:: postgres - get_function_ddl_statement ::= - SELECT GET_FUNCTION_DDL('function_name') - ; - function_name ::= identifier + SELECT GET_FUNCTION_DDL('<function_name>') Parameters ============ @@ -37,46 +26,58 @@ Parameters * - Parameter - Description * - ``function_name`` - - The name of the function. + - The name of the function Examples -=========== +======== -Getting the DDL for a function ---------------------------------- - -The result of the ``GET_FUNCTION_DDL`` function is a verbose version of the CREATE FUNCTION statement, which may include additional information that was added by SQream DB. For example, some type names and identifiers may be quoted or altered. +.. code-block:: postgres -.. 
code-block:: psql + CREATE OR REPLACE FUNCTION my_distance (x1 float, + y1 float, + x2 float, + y2 float) returns float as + $$ + import math + if y1 < x1: + return 0.0 + else: + return math.sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2) + $$ + language python; - master=> CREATE OR REPLACE FUNCTION my_distance (x1 float, y1 float, x2 float, y2 float) RETURNS float as $$ - import math - if y1 < x1: - return 0.0 - else: - return math.sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2) - $$ LANGUAGE PYTHON; - executed - master=> SELECT GET_FUNCTION_DDL('my_distance'); - create function "my_distance" (x1 float, + SELECT + GET_FUNCTION_DDL('my_distance'); + + CREATE FUNCTION "my_distance" (x1 float, y1 float, x2 float, y2 float) returns float as - $$ - import math - if y1 < x1: - return 0.0 - else: - return math.sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2) - $$ - language python volatile; + $$ + import math + if y1 < x1: + return 0.0 + else: + return math.sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2) + $$ + language python volatile; Exporting function DDL to a file ------------------------------------- +-------------------------------- .. code-block:: postgres - COPY (SELECT GET_FUNCTION_DDL('my_distance')) TO '/home/rhendricks/my_distance.sql'; + COPY + ( + SELECT + GET_FUNCTION_DDL('my_distance') + ) TO + WRAPPER + csv_fdw + OPTIONS + (LOCATION = 's3://sqream-docs/my_distance_ddl.csv'); +Permissions +============= + +The role must have the ``CONNECT`` permission at the database level. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_license_info.rst b/reference/sql/sql_statements/utility_commands/get_license_info.rst index 8c28c0ea4..226009edb 100644 --- a/reference/sql/sql_statements/utility_commands/get_license_info.rst +++ b/reference/sql/sql_statements/utility_commands/get_license_info.rst @@ -1,16 +1,21 @@ +:orphan: + .. _get_license_info: ******************** GET_LICENSE_INFO ******************** + ``GET_LICENSE_INFO`` displays information related to data size limitations, expiration date, and type of license currently used by the SQream cluster. Permissions ============= + No special permissions are required. Syntax ========== + The following is the correct syntax for running the ``GET LICENSE INFO`` statement: .. code-block:: postgres @@ -21,6 +26,7 @@ The following is the correct syntax for running the ``GET LICENSE INFO`` stateme Returns ========== + The following table shows the ``GET_LICENSE_INFO`` license information in the order that it is returned: .. list-table:: @@ -59,14 +65,9 @@ Example =========== The following is an example of the returned license information described in the **Returns** section above: -.. code-block:: psql +.. code-block:: console - 10,100,compressed,20,2045-03-18,0,0,10 - -Parameters -============ -The ``GET_LICENSE_INFO`` command has no parameters. + compressed_cluster_size | uncompressed_cluster_size | compress_type | cluster_size_limit | expiration_date | is_date_expired | is_size_exceeded | cluster_size_left + ------------------------+---------------------------+---------------+--------------------+-----------------+-----------------+------------------+------------------ + 10 | 100 | compressed | 20 | 2045-03-18 | 0 | 0 | 10 -Notes -========= -If the license expires or exceeds quotas, contact a SQream representative to extend the license. 
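For completeness, a minimal usage sketch; per the **Returns** section above, the function takes no parameters and returns a single row of license fields: .. code-block:: postgres SELECT GET_LICENSE_INFO();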
diff --git a/reference/sql/sql_statements/utility_commands/get_open_snapshots.rst b/reference/sql/sql_statements/utility_commands/get_open_snapshots.rst new file mode 100644 index 000000000..7cbb01798 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_open_snapshots.rst @@ -0,0 +1,63 @@ +:orphan: + +.. _get_open_snapshots: + +****************** +GET_OPEN_SNAPSHOTS +****************** + +The ``GET_OPEN_SNAPSHOTS`` utility function lists information about all currently open snapshots. + +Syntax +====== + +.. code-block:: postgres + + SELECT GET_OPEN_SNAPSHOTS() + +Output +====== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``database_name`` + - + * - ``reason`` + - + * - ``open_time`` + - + * - ``database_version`` + - + * - ``snapshot_id`` + - + * - ``statement_id`` + - + * - ``current_time`` + - + * - ``is_statement_active`` + - + +Example +======= + +.. code-block:: postgres + + SELECT GET_OPEN_SNAPSHOTS(); + +Output: + +.. code-block:: console + + database_name |reason |open_time |database_version |snapshot_id |statement_id |current_time |is_statement_active + --------------+----------------+-------------------+-----------------+------------+-------------+-------------------+------------------- + master |on_new_statement|2024-07-04 17:16:56|1 |30898 |0 |2024-07-04 17:16:57|1 + + +Permissions +=========== + +This utility function requires a ``SUPERUSER`` permission. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_role_database_ddl.rst b/reference/sql/sql_statements/utility_commands/get_role_database_ddl.rst new file mode 100644 index 000000000..410559d82 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_role_database_ddl.rst @@ -0,0 +1,48 @@ +:orphan: + +.. _get_role_database_ddl: + +********************* +GET ROLE DATABASE DDL +********************* + +The ``GET_ROLE_DATABASE_DDL`` statement returns the definition of a role's database in DDL format. + +Syntax +====== + +.. code-block:: postgres + + SELECT GET_ROLE_DATABASE_DDL('') + +Parameters +========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``role_name`` + - The role for which to get database definition + +Example +======= + +.. code-block:: postgres + + SELECT GET_ROLE_DATABASE_DDL('public'); + +Output: + +.. code-block:: console + + Name|Value | + ----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ + ddl |grant create, usage on schema 'master'.'public' to 'public' ;alter default schema for 'public' to 'master'.'public';alter default permissions for 'public' for schemas grant superuser to creator_role ;alter default permissions for 'public' for tables grant select, insert, delete, update, ddl to creator_role ;alter default permissions for 'public' for external tables grant select, ddl to creator_role ;alter default permissions for 'public' for views grant select, ddl to creator_role ;| + +Permissions +=========== + +Using the ``GET_ROLE_DATABASE_DDL`` statement requires no special permissions. 
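The returned role database DDL can likewise be exported for backup or migration, following the same ``COPY ... TO WRAPPER`` pattern used by the other DDL utilities on these pages; the location below is illustrative: .. code-block:: postgres COPY ( SELECT GET_ROLE_DATABASE_DDL('public') ) TO WRAPPER csv_fdw OPTIONS (LOCATION = 's3://sqream-docs/public_role_database.ddl');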
diff --git a/reference/sql/sql_statements/utility_commands/get_role_global_ddl.rst b/reference/sql/sql_statements/utility_commands/get_role_global_ddl.rst new file mode 100644 index 000000000..0d0e4b5f0 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_role_global_ddl.rst @@ -0,0 +1,69 @@ +:orphan: + +.. _get_role_global_ddl: + +******************* +GET ROLE GLOBAL DDL +******************* + +The ``GET_ROLE_GLOBAL_DDL`` statement returns the definition of a global role in DDL format. + +The ``GET_ROLE_GLOBAL_DDL`` page describes the following: + +.. contents:: + :local: + :depth: 1 + +Syntax +====== + +The following is the correct syntax for using the ``GET_ROLE_GLOBAL_DDL`` statement: + +.. code-block:: postgres + + select get_role_global_ddl(<'role_name'>) + +Example +======= + +The following is an example of using the ``GET_ROLE_GLOBAL_DDL`` statement: + +.. code-block:: psql + + select get_role_global_ddl('public'); + +Parameters +========== + +The following table shows the ``GET_ROLE_GLOBAL_DDL`` parameters: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``role_name`` + - The name of the role whose global definition is returned in DDL format. + +Output +====== + +The following is an example of the output of the ``GET_ROLE_GLOBAL_DDL`` statement: + +.. code-block:: postgres + + create role "public"; + +Permissions +=========== + +Using the ``GET_ROLE_GLOBAL_DDL`` statement requires no special permissions. + +For more information, see the following: + +* :ref:`get_role_database_ddl` + + :: + +* :ref:`get_role_permissions` \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_role_permissions.rst b/reference/sql/sql_statements/utility_commands/get_role_permissions.rst new file mode 100644 index 000000000..94a19e075 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_role_permissions.rst @@ -0,0 +1,82 @@ +:orphan: + +.. _get_role_permissions: + +******************** +GET ROLE PERMISSIONS +******************** + +The ``GET_ROLE_PERMISSIONS`` statement returns all permissions granted to a role in table format. + +The ``GET_ROLE_PERMISSIONS`` page describes the following: + +.. contents:: + :local: + :depth: 1 + +Syntax +====== + +The following is the correct syntax for using the ``GET_ROLE_PERMISSIONS`` statement: + +.. code-block:: postgres + + select get_role_permissions() + +Example +======= + +The following is an example of using the ``GET_ROLE_PERMISSIONS`` statement: + +.. code-block:: psql + + select get_role_permissions(); + +Parameters +========== + +The ``GET_ROLE_PERMISSIONS`` statement accepts no parameters. + +Output +====== + +The following is an example of the output of the ``GET_ROLE_PERMISSIONS`` statement: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + - Example + * - ``permission_type`` + - The permission type granted to the role. + - SUPERUSER + * - ``object_type`` + - The data object type. + - table + * - ``object_name`` + - The name of the object. + - master.public.nba + +Permissions +=========== + +Using the ``GET_ROLE_PERMISSIONS`` statement requires no special permissions. 
+ +For more information, see the following: + +* :ref:`get_role_database_ddl` + + :: + +* :ref:`get_role_global_ddl` \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_total_chunks_size.rst b/reference/sql/sql_statements/utility_commands/get_total_chunks_size.rst new file mode 100644 index 000000000..4398432e8 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/get_total_chunks_size.rst @@ -0,0 +1,64 @@ +:orphan: + +.. _get_total_chunks_size: + +********************** +GET TOTAL CHUNKS SIZE +********************** + +The ``get_total_chunks_size`` function returns the total size of all data chunks saved in the system in both compressed and uncompressed formats. + +Syntax +========== + +.. code-block:: postgres + + SELECT get_total_chunks_size(, [DATABASE_NAME], [SCHEMA_NAME, [TABLE_NAME]]) + +Parameters +============ + +The following table shows the ``SELECT get_total_chunks_size`` parameters: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - State + - Description + * - ``OUTPUT_UNITS`` + - Mandatory + - Specifies the desired unit of measurement for the output size, with valid values of ``BYTE``, ``MB``, ``GB``, ``TB``, or ``PB`` + * - ``DATABASE_NAME`` + - Optional + - Specifies the name of the database to analyze. If not specified, the function will analyze all databases in the cluster. + * - ``SCHEMA_NAME`` + - Optional + - Specifies the name of the schema to analyze. If not specified, the function will analyze all schemas in the specified database. + * - ``TABLE_NAME`` + - Optional + - Specifies the name of a specific table to analyze. If not specified, the function will analyze all tables in the specified schema. + +Example +=========== + +.. code-block:: psql + + SELECT get_total_chunks_size('MB'); + +Output +========== + + +.. code-block:: console + + compression-type | value | size | + -------------------------+--------------------------------+-------+ + Compressed | 0.00036144256591796875 | MB | + Uncompressed | 0.00036144256591796875 | MB | + +Permissions +============= + +Using the ``get_total_chunks_size`` command requires no special permissions. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/get_view_ddl.rst b/reference/sql/sql_statements/utility_commands/get_view_ddl.rst index 5ce56d7af..ac8962edb 100644 --- a/reference/sql/sql_statements/utility_commands/get_view_ddl.rst +++ b/reference/sql/sql_statements/utility_commands/get_view_ddl.rst @@ -1,36 +1,22 @@ -.. _get_view_ddl: - -***************** -GET_VIEW_DDL -***************** - -``GET_VIEW_DDL()`` is a function that shows the :ref:`CREATE VIEW` statement for a view. +:orphan: -.. tip:: - * For tables, see :ref:`GET_DDL`. - * For the entire database, see :ref:`DUMP_DATABASE_DDL`. - * For UDFs, see :ref:`GET_FUNCTION_DDL`. +.. _get_view_ddl: -Permissions -============= +************ +GET VIEW DDL +************ -The role must have the ``CONNECT`` permission at the database level. +``GET_VIEW_DDL`` is a function that shows the :ref:`CREATE VIEW` statement for a view. Syntax -========== +====== .. code-block:: postgres - get_view_ddl_statement ::= - SELECT GET_VIEW_DDL('[schema_name.]view_name') - ; - - schema_name ::= identifier - - view_name ::= identifier + SELECT GET_VIEW_DDL([''.]'') Parameters -============ +========== .. list-table:: :widths: auto @@ -39,40 +25,55 @@ Parameters * - Parameter - Description * - ``schema_name`` - - The name of the schema. 
+ - The name of the schema * - ``view_name`` - - The name of the view. + - The name of the view Examples -=========== - -Getting the DDL for a view ------------------------------ +======== -The result of the ``GET_VIEW_DDL`` function is a verbose version of the CREATE VIEW statement, which may include additional information that was added by SQream DB. For example, schemas and column names will be be specified explicitly. +.. code-block:: postgres + + CREATE VIEW + angry_animals AS + SELECT + * + FROM + cool_animals + WHERE + is_agressive = false; - farm=> CREATE VIEW angry_animals AS SELECT * FROM cool_animals WHERE is_agressive = false; - executed - farm=> SELECT GET_VIEW_DDL('angry_animals'); - create view "public".angry_animals as - select - "cool_animals"."id" as "id", - "cool_animals"."name" as "name", - "cool_animals"."weight" as "weight", - "cool_animals"."is_agressive" as "is_agressive" - from - "public".cool_animals as cool_animals - where - "cool_animals"."is_agressive" = false; - - + SELECT + GET_VIEW_DDL('angry_animals'); + + CREATE VIEW "public".angry_animals AS + SELECT + "cool_animals"."id" as "id", + "cool_animals"."name" as "name", + "cool_animals"."weight" as "weight", + "cool_animals"."is_agressive" as "is_agressive" + FROM + "public".cool_animals as cool_animals + WHERE + "cool_animals"."is_agressive" = false; Exporting view DDL to a file -------------------------------- +---------------------------- .. code-block:: postgres - COPY (SELECT GET_VIEW_DDL('angry_animals')) TO '/home/rhendricks/angry_animals.sql'; + COPY + ( + SELECT + GET_VIEW_DDL('angry_animals') + ) TO + WRAPPER + csv_fdw + OPTIONS + (LOCATION = 's3://sqream-docs/angry_animals_ddl.csv'); + +Permissions +=========== + +The role must have the ``CONNECT`` permission at the database level. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/global_graceful_shutdown.rst b/reference/sql/sql_statements/utility_commands/global_graceful_shutdown.rst new file mode 100644 index 000000000..9e6dd392c --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/global_graceful_shutdown.rst @@ -0,0 +1,55 @@ +:orphan: + +.. _global_graceful_shutdown: + +************************ +GLOBAL GRACEFUL SHUTDOWN +************************ + + +SQreamDB's method for gracefully stopping all SQreamDB servers in the cluster. Once executed, it causes the servers to wait for any queued statements to complete before shutting down. + +.. contents:: + :local: + :depth: 1 + + +How Does it Work? +======================== +Running the ``GLOBAL GRACEFUL SHUTDOWN`` command gives you more control over the following: + +* Preventing new queries from connecting to the server by: + + * Setting the servers as unavailable in the metadata server. + + :: + + * Unsubscribing the servers from their services. + +* Preventing users from opening new connections to the server. Attempting to connect to the server after activating a graceful shutdown displays the following message: + + .. code-block:: postgres + + Server is shutting down, no new connections are possible at the moment. 
+ +Parameters +============ +``GLOBAL GRACEFUL SHUTDOWN`` has no input parameters: + + +Permissions +============= +The ``SUPERUSER`` permission is required to execute ``GLOBAL GRACEFUL SHUTDOWN``. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/ldap_get_attr.rst b/reference/sql/sql_statements/utility_commands/ldap_get_attr.rst new file mode 100644 index 000000000..c33068fc8 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/ldap_get_attr.rst @@ -0,0 +1,44 @@ +:orphan: + +.. _ldap_get_attr: + +************* +LDAP GET ATTR +************* + +The ``ldap_get_attr()`` utility function may be used only after having set :ref:`ldap` as your authentication service. This function enables you to specify LDAP attributes you want your SQreamDB role catalog table to include when executing ``SELECT`` on the ``sqream_catalog.roles`` metadata object. + +Syntax +========== + +.. code-block:: postgres + + SELECT ldap_get_attr() + +Example +======= + +Assume the following LDAP attributes are set to be associated with SQreamDB roles: + +* distinguishedName +* primaryGroupID +* userprincipalname + +.. code-block:: psql + + SELECT ldap_get_attr(); + +Output + +.. code-block:: console + + role_name | distinguishedName | primaryGroupID| userprincipalname + ------------------+-------------------------------------------+---------------+--------------------- + public | | | + sqream | CN=sqream,OU=Sqream Users,DC=sqream,DC=loc| 513 | sqream@sqream.loc + test_user | CN=test_user,OU=test,DC=sqream,DC=loc | 513 | test_user@sqream.loc + +Permissions +=========== + +Using the ``ldap_get_attr`` command requires no special permissions. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/list_saved_queries.rst b/reference/sql/sql_statements/utility_commands/list_saved_queries.rst index bb1781840..46ef1f8f3 100644 --- a/reference/sql/sql_statements/utility_commands/list_saved_queries.rst +++ b/reference/sql/sql_statements/utility_commands/list_saved_queries.rst @@ -1,35 +1,31 @@ +:orphan: + .. _list_saved_queries: ******************** -LIST_SAVED_QUERIES +LIST SAVED QUERIES ******************** -``LIST_SAVED_QUERIES`` lists the available :ref:`previously saved queries`. +``LIST_SAVED_QUERIES`` lists the available previously :ref:`saved queries`. This is an alternative way to using the ``savedqueries`` catalog view. Read more in the :ref:`saved_queries` guide. -See also: ref:`save_query`, :ref:`execute_saved_query`, ref:`drop_saved_query`, ref:`show_saved_query`. - -Permissions -============= - -Listing the saved queries requires no special permissions. +See also: :ref:`save_query`, :ref:`execute_saved_query`, :ref:`drop_saved_query`, :ref:`show_saved_query` Syntax ========== -.. code-block:: postgres +.. code-block:: sql list_saved_queries_statement ::= SELECT LIST_SAVED_QUERIES() - ; Returns ========== -List of saved query names, one per row. +A list of saved queries the user has ``SELECT`` permissions on. Parameters ============ @@ -47,16 +43,16 @@ Examples Listing previously saved queries --------------------------------------- -.. code-block:: psql +.. code-block:: sql - t=> SELECT LIST_SAVED_QUERIES(); + SELECT LIST_SAVED_QUERIES(); saved_query ------------------------- select_all select_by_weight select_by_weight_and_team - t=> SELECT SHOW_SAVED_QUERY('select_by_weight_and_team'); + SELECT SHOW_SAVED_QUERY('select_by_weight_and_team'); saved_query ----------------------------------------------- SELECT * FROM nba WHERE Weight > ? 
AND Team = ? @@ -67,11 +63,16 @@ Listing saved queries with the catalog Using the :ref:`catalog` is also possible: -.. code-block:: psql +.. code-block:: sql - t=> SELECT * FROM sqream_catalog.savedqueries; + SELECT * FROM sqream_catalog.savedqueries; name | num_parameters --------------------------+--------------- select_all | 0 select_by_weight | 1 select_by_weight_and_team | 2 + +Permissions +============= + +Listing saved queries requires no special permissions. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/put.rst b/reference/sql/sql_statements/utility_commands/put.rst new file mode 100644 index 000000000..d42d0bba8 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/put.rst @@ -0,0 +1,51 @@ +:orphan: + +.. _put: + +*** +PUT +*** + +The ``PUT`` function addresses the need to transfer data files stored in a user's local file system to the database's internal staging area. + + +Syntax +====== + +.. code-block:: postgres + + PUT <'local-file-path'> INTO <'SQDB-cluster-relative-file-path'> [OVERWRITE]; + +Parameters +========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``local-file-path`` + - Source file path - on the executing client machine. + * - ``SQDB-cluster-relative-file-path`` + - Destination file path - relative to the defined SQDB cluster staging area. + * - ``OVERWRITE`` + - When used, existing files will be overwritten. + + +Important Considerations +======================== + * File size may be up to 25MB. + * The SQDB cluster staging area's root path is configured using the ``stagingAreaRootPath`` flag. By default, this area is created within the defined ``storageClusterPath`` as a subdirectory named staging_area. This ``staging_area`` directory contains two subdirectories: ``content``, intended for user-uploaded data, and ``temp``, reserved for internal system operations. + * The command is limited to a single file copy per execution. + * File extensions are limited to supported FDWs. + * The command execution is CPU based and does not use GPU Workers. + * Up to 50 concurrent ``PUT`` / ``GET`` / ``REMOVE`` operations are supported per SQDB cluster. + * The feature is supported for the following drivers: PySQream, JDBC, ODBC + + + +Permissions +============= + +The role must have the ``SUPERUSER`` privilege. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/rechunk.rst b/reference/sql/sql_statements/utility_commands/rechunk.rst new file mode 100644 index 000000000..083c9f645 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/rechunk.rst @@ -0,0 +1,69 @@ +:orphan: + +.. _rechunk: + +******* +RECHUNK +******* + +SQreamDB is most efficient when processing large data chunks. The ``rechunk`` function improves performance when handling tables with small data chunks by allowing you to consolidate these small chunks into larger ones. This function also handles mixed chunks, which include one or more deleted records and/or records marked for deletion but not yet purged (i.e., awaiting the removal of deleted data). When applied to mixed chunks, the function performs a :ref:`cleanup operation`, resulting in clean, large data chunks. + +Syntax +========== + +.. code-block:: postgres + + SELECT rechunk('<schema_name>', '<table_name>') + +Parameters +========== + +.. 
list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``schema_name`` + - The name of the schema containing the table to rechunk + * - ``table_name`` + - The name of the table to rechunk + +Example +======= + +.. code-block:: postgres + + SELECT rechunk('public', 't'); + + +Rechunk Encrypted Columns +========================= + +For tables with encrypted columns, RECHUNK requires the encryption keys for each encrypted column. + +Syntax +========== + +.. code-block:: postgres + + RECHUNK('<schema_name>', '
<table_name>', '<column_name_1>', '<key_1>', '<column_name_2>', '<key_2>', ...); +Example +========== + +.. code-block:: postgres + + CREATE TABLE sc.tbl ( + x TEXT ENCRYPT, + y TEXT, + z TEXT ENCRYPT + ); + + RECHUNK('sc', 'tbl', 'x', '[key-for-x]', 'z', '[key-for-z]'); + + +Permissions +============= + +Using the ``rechunk`` command requires no special permissions. diff --git a/reference/sql/sql_statements/utility_commands/recompile_saved_query.rst b/reference/sql/sql_statements/utility_commands/recompile_saved_query.rst index d6b63e30e..524f892b3 100644 --- a/reference/sql/sql_statements/utility_commands/recompile_saved_query.rst +++ b/reference/sql/sql_statements/utility_commands/recompile_saved_query.rst @@ -1,3 +1,5 @@ +:orphan: + .. _recompile_saved_query: ************************** @@ -6,6 +8,8 @@ RECOMPILE_SAVED_QUERY ``RECOMPILE_SAVED_QUERY`` recompiles a saved query that has been invalidated due to a schema change. +Read more in the :ref:`saved_queries` guide. + Permissions ============= diff --git a/reference/sql/sql_statements/utility_commands/recompile_view.rst b/reference/sql/sql_statements/utility_commands/recompile_view.rst index 1abea0d8c..769d27daf 100644 --- a/reference/sql/sql_statements/utility_commands/recompile_view.rst +++ b/reference/sql/sql_statements/utility_commands/recompile_view.rst @@ -1,3 +1,5 @@ +:orphan: + .. _recompile_view: ***************** diff --git a/reference/sql/sql_statements/utility_commands/remove.rst b/reference/sql/sql_statements/utility_commands/remove.rst new file mode 100644 index 000000000..75406c0b3 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/remove.rst @@ -0,0 +1,45 @@ +:orphan: + +.. _remove: + +****** +REMOVE +****** + +The ``REMOVE`` function addresses the need to remove files from the database's internal staging area. + + +Syntax +====== + +.. code-block:: postgres + + REMOVE <'SQDB-cluster-relative-file-path'>; + +Parameters +========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``SQDB-cluster-relative-file-path`` + - File path to remove - relative to the defined SQDB cluster staging area. + + +Important Considerations +======================== + * The SQDB cluster staging area's root path is configured using the ``stagingAreaRootPath`` flag. By default, this area is created within the defined ``storageClusterPath`` as a subdirectory named staging_area. This ``staging_area`` directory contains two subdirectories: ``content``, intended for user-uploaded data, and ``temp``, reserved for internal system operations. + * The command is limited to a single file deletion per execution. + * File extensions are limited to supported FDWs. + * The command execution is CPU based and does not use GPU Workers. + * Up to 50 concurrent ``PUT`` / ``GET`` / ``REMOVE`` operations are supported per SQDB cluster. + * The feature is supported for the following drivers: PySQream, JDBC, ODBC + + +Permissions +============= + +The role must have the ``SUPERUSER`` privilege. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/remove__statement_locks.rst b/reference/sql/sql_statements/utility_commands/remove__statement_locks.rst new file mode 100644 index 000000000..69c8988d7 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/remove__statement_locks.rst @@ -0,0 +1,32 @@ +:orphan: + +.. 
_remove_statement_locks: + +********************** +REMOVE STATEMENT LOCKS +********************** + +The ``REMOVE STATEMENT LOCKS`` utility function clears all orphaned locks that block file cleanup and prevent operations on locked objects within the system. + +To remove specific locks, see :ref:`remove_lock` + +Read more about locks in :ref:`concurrency_and_locks`. + +Syntax +====== + +.. code-block:: postgres + + SELECT REMOVE_STATEMENT_LOCKS(<statement_id>) + +Example +======= + +.. code-block:: postgres + + SELECT REMOVE_STATEMENT_LOCKS (0); + +Permissions +=========== + +This utility function requires a ``SUPERUSER`` permission on the database level. diff --git a/reference/sql/sql_statements/utility_commands/remove_lock.rst b/reference/sql/sql_statements/utility_commands/remove_lock.rst new file mode 100644 index 000000000..5acf3b210 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/remove_lock.rst @@ -0,0 +1,76 @@ +:orphan: + +.. _remove_lock: + +*********** +REMOVE LOCK +*********** + +The ``REMOVE LOCK`` utility function clears orphaned locks that block file cleanup and prevent operations on locked objects within the system. + +To remove all existing locks, see :ref:`remove_statement_locks` + +Read more about locks in :ref:`concurrency_and_locks`. + +Syntax +====== + +.. code-block:: postgres + + SELECT REMOVE_LOCK('<locked_object>', <statement_id>) + +Example +======= + +#. Get locked object names: + + .. code-block:: postgres + + SELECT SHOW_LOCKS(); + + Output: + + .. code-block:: console + + statement id |statement string |username |server |port |locked object |lock mode |statement start time |lock start time |is_statement_active |is_snapshot_active + -------------+----------------------------------+---------+-------------+-----+--------------------------+----------+---------------------+--------------------+--------------------+------------------ + 0 |COPY schema.table FROM WRAPPER .. |sqream |192.168.4.35 |5000 |database$master |Inclusive |29-10-2023 14:20:08 |2023-10-29 14:20:08 |1 |1 + 0 |COPY schema.table FROM WRAPPER .. |sqream |192.168.4.35 |5000 |schema$master$schema |Inclusive |29-10-2023 14:20:08 |2023-10-29 14:20:08 |1 |1 + 0 |COPY schema.table FROM WRAPPER .. |sqream |192.168.4.35 |5000 |table$master$schema$table |Inclusive |29-10-2023 14:20:08 |2023-10-29 14:20:08 |1 |1 + +#. Show server status: + + .. code-block:: postgres + + SELECT SHOW_SERVER_STATUS(); + + Output: + + .. code-block:: console + + service |instanceid |connection_id |serverip |serverport |database_name |user_name |clientip |statementid |statement |statementstarttime |statementstatus |statementstatusstart + --------+-----------+--------------+-------------+-----------+--------------+----------+-------------+------------+-------------------------------------------------------------------------------------------------------------------------------+--------------------+----------------+-------------------- + sqream |node_9383 |1 |192.168.4.35 |5000 |master |sqream |192.168.4.35 |0 |COPY schema.table FROM WRAPPER parquet_fdw OPTIONS (location='/abc/*.c000', CONTINUE_ON_ERROR=true, ERROR_LOG='/abc/log_out'); |29-10-2023 14:20:08 |Executing |29-10-2023 14:20:08 + +#. Remove a specific lock: + + .. code-block:: postgres + + SELECT REMOVE_LOCK ('database$master', 0); + + .. code-block:: postgres + + SELECT REMOVE_LOCK ('schema$master$schema', 0); + + .. 
code-block:: postgres + + SELECT REMOVE_LOCK ('table$master$schema$table', 0); + + + + + +Permissions +=========== + +This utility function requires a ``SUPERUSER`` permission on the database level. diff --git a/reference/sql/sql_statements/utility_commands/save_query.rst b/reference/sql/sql_statements/utility_commands/save_query.rst index be34c33ed..8bc9507cb 100644 --- a/reference/sql/sql_statements/utility_commands/save_query.rst +++ b/reference/sql/sql_statements/utility_commands/save_query.rst @@ -1,28 +1,24 @@ +:orphan: + .. _save_query: ***************** -SAVE_QUERY +SAVE QUERY ***************** -``SAVE_QUERY`` saves a query execution plan. +``SAVE QUERY`` saves a query execution plan. Read more in the :ref:`saved_queries` guide. See also: :ref:`execute_saved_query`, :ref:`drop_saved_query`, :ref:`show_saved_query`, :ref:`list_saved_queries`. -Permissions -============= - -No special permissions are needed to save a query. - Syntax ========== -.. code-block:: postgres +.. code-block:: sql save_query_statement ::= SELECT SAVE_QUERY(saved_query_name, parameterized_query_string) - ; saved_query_name ::= string_literal @@ -56,7 +52,7 @@ Notes * Query parameters can be used as substitutes for literal expressions. Parameters cannot be used to substitute identifiers, column names, table names, or other parts of the query. -* Query parameters of a string datatype (like ``VARCHAR``) must be of a fixed length, and can be used in equality checks, but not patterns (e.g. :ref:`like`, :ref:`rlike`, etc) +* Query parameters of a string datatype (like ``TEXT``) must be of a fixed length, and can be used in equality checks, but not patterns (e.g. :ref:`like`, :ref:`rlike`, etc) * Query parameters' types are inferred at compile time. @@ -66,18 +62,18 @@ Examples Assume a table named ``nba``, with the following structure: -.. code-block:: postgres +.. code-block:: sql CREATE TABLE nba ( - Name varchar(40), - Team varchar(40), + Name text(40), + Team text(40), Number tinyint, - Position varchar(2), + Position text(2), Age tinyint, - Height varchar(4), + Height text(4), Weight real, - College varchar(40), + College text(40), Salary float ); @@ -93,11 +89,11 @@ Here's a peek at the table contents (:download:`Download nba.csv SELECT SAVE_QUERY('select_all','SELECT * FROM nba'); - executed - t=> SELECT EXECUTE_SAVED_QUERY('select_all'); + SELECT SAVE_QUERY('select_all','SELECT * FROM nba'); + + SELECT EXECUTE_SAVED_QUERY('select_all'); Name | Team | Number | Position | Age | Height | Weight | College | Salary -------------------------+------------------------+--------+----------+-----+--------+--------+-----------------------+--------- Avery Bradley | Boston Celtics | 0 | PG | 25 | 6-2 | 180 | Texas | 7730337 @@ -113,15 +109,19 @@ Use parameters to replace them later at execution time. .. tip:: Use dollar quoting (`$$`) to avoid escaping strings. -.. code-block:: psql +.. code-block:: sql - t=> SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? AND Team = ?$$); - executed - t=> SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors'); + SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? 
AND Team = ?$$); + + SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors'); Name | Team | Number | Position | Age | Height | Weight | College | Salary ------------------+-----------------+--------+----------+-----+--------+--------+-------------+-------- Bismack Biyombo | Toronto Raptors | 8 | C | 23 | 6-9 | 245 | | 2814000 James Johnson | Toronto Raptors | 3 | PF | 29 | 6-9 | 250 | Wake Forest | 2500000 Jason Thompson | Toronto Raptors | 1 | PF | 29 | 6-11 | 250 | Rider | 245177 Jonas Valanciunas | Toronto Raptors | 17 | C | 24 | 7-0 | 255 | | 4660482 + +Permissions +============= +Saving a query requires no special permissions per se; however, the user must have permissions to access the tables referenced in the query, as well as any other objects the query uses. The user who saved the query is granted all permissions on the saved query. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/select_gpu_metrics.rst b/reference/sql/sql_statements/utility_commands/select_gpu_metrics.rst new file mode 100644 index 000000000..474b05ad8 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/select_gpu_metrics.rst @@ -0,0 +1,94 @@ +:orphan: + +.. _select_gpu_metrics: + +************************* +SELECT GPU METRICS +************************* + +The ``SELECT gpu_metrics`` utility function allows you to check your cluster's GPU usage within a specific time frame. This is useful to ensure that GPU usage stays within the allowed license limits. + +An empty result indicates no usage deviation during the specified time. If the GPU usage exceeds the quota, the result will show the highest deviation in this format: Year-Month-Day, GPU deviated usage, and GPU quota limit as per your license plan. + +Syntax +========== + +.. code-block:: sql + + SELECT gpu_metrics(['monthly'] | ['daily'], <'start-date'>, <'end-date'>) + +Parameters +============ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - State + - Description + * - ``monthly`` or ``daily`` + - Mandatory + - Specifies the time frame within which data was read + * - ``start-date`` + - Mandatory + - The starting date for the data retrieval period + * - ``end-date`` + - Mandatory + - The ending date for the data retrieval period + +Output +============ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``date`` + - Date and time of highest GPU usage deviation + * - ``actual_number_of_gpus`` + - GPU usage deviation + * - ``data_read_limit_license_value`` + - GPU quota limit as per license plan + +Examples +=========== + +Daily GPU usage: + +.. code-block:: postgres + + SELECT gpu_metrics('daily','2023-05-01', '2023-05-05'); + +Output + +.. code-block:: console + + date | actual_number_of_gpus | data_read_limit_license_value + --------------+-------------------------+--------------------------------- + 2023-May-01 | 2 | 1 + 2023-May-02 | 3 | 1 + 2023-May-03 | 3 | 1 + + +Monthly GPU usage: + +.. code-block:: sql + + SELECT gpu_metrics('monthly', '2023-04-01', '2023-06-05'); + +Output + +.. code-block:: console + + date | actual_number_of_gpus | data_read_limit_license_value + --------------+-------------------------+--------------------------------- + 2023 Apr | 2 | 1 + + +Permissions +============= + +Using the ``SELECT gpu_metrics`` command requires ``SUPERUSER`` permissions. 
\ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/select_health_check_monitoring.rst b/reference/sql/sql_statements/utility_commands/select_health_check_monitoring.rst new file mode 100644 index 000000000..c5e3bf773 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/select_health_check_monitoring.rst @@ -0,0 +1,347 @@ +:orphan: + +.. _select_health_check_monitoring: + +.. role:: red + :class: red-text + +.. role:: green + :class: green-text + +.. raw:: html + + + +******************************* +HEALTH-CHECK MONITORING +******************************* + +The ``SELECT health_check_monitoring`` command empowers system administrators to comprehensively monitor the database's health across multiple *categories*. + +In the ``storage`` domain, it provides insights into cluster storage chunks and their fragmentation, helping to prevent table reading bottlenecks by alerting in the case of a fragmentation scenario. Additionally, it gives indications per table on when to trigger cleanup executions (to free up storage and improve reading performance). The ``metadata_stats`` category offers information on Worker and metadata reactivity, enabling the identification of system performance during peak loads and the revelation of potential concurrent issues. Addressing licensing concerns, the command gives details on the customer's ``license``, including storage capacity and restrictions, and proactively alerts administrators before reaching limitations. Lastly, under ``self_healing``, it supplies essential details on ETL and load processes, monitors query execution flow, tracks Workers per service, identifies idle Workers, and detects issues like stuck snapshots—crucial for regular monitoring and offering clear insights during the Root Cause Analysis (RCA) process for optimal resource allocation. + +Here, you can discover details on configuring the monitoring for each of the four categories, along with instructions on how to access and interpret the log files for each category. + +.. contents:: + :local: + :depth: 2 + +Before You Begin +================== + +* Download the Health-Check Monitor :download:`input.json ` configuration file and save it anywhere you want. +* The ``SELECT health_check_monitoring`` command requires ``SUPERUSER`` permissions. + +Configuration +-------------- + +There are two types of Health-Check Monitor metrics: one is configurable, and the second is non-configurable. Non-configurable metrics provide information about the system, such as total storage capacity measured in Gigabytes. Configurable metrics are set with low, high, or both thresholds within a valid range to be reported in case of deviation. For example, this could include the number of days remaining on your SQreamDB license. + +Default Metric Values +---------------------- + +The Health-Check Monitor configuration file comes pre-configured with best practices. However, as mentioned before, you have the flexibility to customize any default metric values based on your preferences. All metrics presented below are defined with valid ranges, so any value outside the range triggers a warning. It's important to note that configuring only one threshold will make the Health-Check Monitor assume the ignored threshold is set to *infinity*. + +.. 
code-block:: json + + { + "totalNumberOfFragmentedChunks":{ + "from":0, + "to":100 + }, + "percentageStorageCapacity":{ + "from":0, + "to":0.9 + }, + "daysForLicenseExpire":{ + "from":60 + }, + "stuckSnapshots":{ + "from":0, + "to":2 + }, + "queriesInQueue":{ + "from":0, + "to":100 + }, + "availableWorkers":{ + "from":0, + "to":5 + }, + "nodeHeartbeatMsgMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "checkLocksMsgMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "keysAndValuesNMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "keysWithPrefixMsgMaxResponseTimeMS":{ + "from":0, + "to":1000 + }, + "nodeHeartbeatMsgVariance":{ + "from":0, + "to":1000 + }, + "checkLocksMsgVariance":{ + "from":0, + "to":1000 + }, + "keysAndValuesNVariance":{ + "from":0, + "to":1000 + }, + "keysWithPrefixMsgVariance":{ + "from":0, + "to":1000 + } + } + +General Syntax +=============== + +.. code-block:: postgres + + SELECT health_check_monitoring('<category>', '<input_file>', '<export_path>') + + category ::= { storage | metadata_stats | license | self_healing } + +.. list-table:: Parameters + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``category`` + - Specifies the system domain for which health information is to be retrieved. + * - ``input_file`` + - Specifies the path to the configuration file of the designated *category* for which you want to obtain information. + * - ``export_path`` + - Specifies the directory path where you want the monitoring log file to be extracted. + + +Health-Check Log Structure +============================= + +After executing the ``SELECT health_check_monitoring`` command, a health-check log file and a CLI result set are generated. In addition to the metric values, the CLI result set shows your initially configured metric ranges and the location of your exported log file. Note that a separate log is generated for each of the four Health-Check Monitor *categories*. + +The log file and the result set both output the following information: + +.. list-table:: Log Output + :widths: auto + :header-rows: 1 + + * - Log Column Name + - Description + * - ``metric_time`` + - The time when the specific metric was checked + * - ``metric_category`` + - The system domain for which health information is retrieved; either ``storage``, ``metadata_stats``, ``license``, or ``self_healing`` + * - ``metric_name`` + - The specific metric that is being evaluated + * - ``metric_description`` + - For metrics that need a detailed analysis breakdown, this column shows the breakdown alongside any additional information + * - ``metric_value`` + - The value of the specific metric + * - ``metric_validation_status`` + - One of three statuses: + * :green:`info`, metric value is within its defined valid range + * none, the metric provides information about the system and has no valid range + * :red:`warn`, metric deviates from its defined valid range + * - ``response_time_sec`` + - Indicates the time taken to gather information for a specific metric. This is helpful for timing health-check executions + +Handling Warnings +------------------- + +Upon reviewing your log output, you'll observe that the ``metric_validation_status`` column reflects one of three potential statuses: :green:`info`, none, or :red:`warn`. This section offers guidance on effectively managing warnings whenever a :red:`warn` status is encountered. + +..
list-table:: + :widths: auto + :header-rows: 1 + + * - Health-Check Category + - Metric Name + - How to Handle :red:`warn` + * - Storage + - ``No. fragmented chunks`` + - Recreate the table to trigger defragmentation + * - Metadata Statistics + - ``NodeHeartbeatMsg``, ``CheckLocksMsg``, ``KeysAndValuesNMsg``, ``KeysWithPrefixMsg`` + - Gather your metadata statistics by executing the following commands and send the information to `SQreamDB Support `_: + + .. code-block:: sql + + SELECT export_leveldb_stats('<path>'); + SELECT export_statement_queue_stats('<path>'); + SELECT export_conn_stats('<path>'); + * - License + - ``% of used storage capacity``, ``License expiration date`` + - Contact `SQreamDB Support `_ for license expansion + * - Self Healing + - ``Queries in queue`` + - To prevent bottlenecks in the service, reallocate service Workers; distributing them strategically helps optimize performance and mitigate potential bottlenecks. Learn more about :ref:`Worker allocation` + * - Self Healing + - ``Available workers per service`` + - Efficiently utilize resources by reallocating idle Workers to a busier service. This approach optimizes resource consumption and helps balance the workload across services. Learn more about :ref:`Worker allocation` + * - Self Healing + - ``Stuck snapshots`` + - The Healer is designed to autonomously address stuck snapshots based on its configured timeout. The session flag, :ref:`healerDetectionFrequencySeconds`, determines when the Healer detects and takes corrective actions for stuck snapshots. To manually address a situation, execute a :ref:`graceful shutdown` of the statement's Worker + +Health-Check Category Specifications +======================================== + +Storage +-------- + +Provides insights into cluster storage chunks and their fragmentation. It also gives an early indication of irrelevant storage files in the cluster, helping prevent bottlenecks in chunk iteration when tables are read. + +Because ``storage`` monitoring has a lengthy execution time, run it at a low frequency to prevent undue strain on your environment. + +.. code-block:: sql + + SELECT health_check_monitoring('storage', 'path/to/my/input.json', 'directory/where/i/save/logs') + +When monitoring your storage health, you may also filter information retrieval by database, schema, table, or all three. + +.. code-block:: sql + + SELECT health_check_monitoring('storage', 'master', 'path/to/my/input.json', 'path/to/where/i/save/logs') + + SELECT health_check_monitoring('storage', 'master', 'schema1', 'path/to/my/input.json', 'path/to/where/i/save/logs') + + SELECT health_check_monitoring('storage', 'master', 'schema1', 'table1', 'path/to/my/input.json', 'path/to/where/i/save/logs') + +.. list-table:: Storage Metrics + :widths: auto + :header-rows: 1 + + * - Metric + - Configuration Flag + - Default Value + - Description + * - ``No. storage chunks`` + - NA + - NA + - Chunk status; categorized as either ``clean``, ``mixed``, or ``deleted``. This classification aids in comprehending potential slowdowns when reading from a table. ``Clean`` indicates that your table is free of physically lingering deleted data. ``Mixed`` suggests that your table contains data marked for deletion but not yet purged (awaiting the removal of deleted data). Meanwhile, ``deleted`` signifies that the table has undergone the cleanup process.
This categorization proves valuable for scrutinizing deletion and clean-up practices, particularly when visualizing data through dedicated tools + * - ``No. fragmented chunks`` + - ``totalNumberOfFragmentedChunks`` + - ``"from":0, "to":100`` + - Defines the number of fragmented chunks + +Metadata Statistics +-------------------- + +Provides information on Worker and metadata reactivity. Regular monitoring allows for the identification of the system's performance during peak loads, indicating periods of heavy system load. This insight can be invaluable for uncovering potential concurrent issues. + +.. code-block:: sql + + SELECT health_check_monitoring('metadata_stats', 'path/to/my/input.json', 'directory/where/i/save/logs') + +.. list-table:: Metadata Statistics Metrics + :widths: auto + :header-rows: 1 + + * - Metric + - Configuration Flag + - Default Value + - Description + * - ``NodeHeartbeatMsg`` + - ``nodeHeartbeatMsgMaxResponseTimeMS``, ``nodeHeartbeatMsgVariance`` + - ``"from":0, "to":1000`` + - Ensures worker vitality through metadata pings. ``max response time`` indicates the peak time for the monitored *category*, while ``variance`` represents the standard deviation between the peak time and the monitoring time. + * - ``CheckLocksMsg`` + - ``checkLocksMsgMaxResponseTimeMS``, ``checkLocksMsgVariance`` + - ``"from":0, "to":1000`` + - Provides details on current locks at the metadata to determine the feasibility of executing the statement. ``max response time`` indicates the peak time for the monitored *category*, while ``variance`` represents the standard deviation between the peak time and the monitoring time. + * - ``KeysAndValuesNMsg`` + - ``keysAndValuesNMaxResponseTimeMS``, ``keysAndValuesNVariance`` + - ``"from":0, "to":1000`` + - Iterates through metadata keys and values. ``max response time`` indicates the peak time for the monitored *category*, while ``variance`` represents the standard deviation between the peak time and the monitoring time. + * - ``KeysWithPrefixMsg`` + - ``keysWithPrefixMsgMaxResponseTimeMS``, ``keysWithPrefixMsgVariance`` + - ``"from":0, "to":1000`` + - Iterates through metadata keys and values with a specific prefix. ``max response time`` indicates the peak time for the monitored *category*, while ``variance`` represents the standard deviation between the peak time and the monitoring time. + + +License +-------- + +Provides details about the customer's license, including database storage capacity and licensing restrictions. Proactively alerts the customer before reaching license limitations, ensuring awareness and timely action. + +.. code-block:: sql + + SELECT health_check_monitoring('license', 'path/to/my/input.json', 'directory/where/i/save/logs') + +.. 
list-table:: License Metrics + :widths: auto + :header-rows: 1 + + * - Metric + - Configuration Flag + - Default Value + - Description + * - ``Total storage capacity`` + - NA + - NA + - Indicates your licensed storage capacity, outlining the permissible limit for your usage + * - ``Used storage capacity`` + - NA + - NA + - Indicates current storage utilization + * - ``% of used storage capacity`` + - ``percentageStorageCapacity`` + - ``"from":0, "to":0.9`` + - Indicates current storage utilization percentage + * - ``License expiration date`` + - ``daysForLicenseExpire`` + - ``"from":60`` + - Indicates how many days until your license expires + +self_healing +-------------- + +Supplies details on customer ETLs and loads, monitors the execution flow of queries over time, tracks the number of Workers per service, identifies idle Workers, and detects potential issues such as stuck snapshots. It is imperative to regularly monitor this data. During the Root Cause Analysis (RCA) process, it provides a clear understanding of executed operations at specific times, offering customers guidance on optimal resource allocation, particularly in terms of Workers per service. + +Monitoring ``self_healing`` frequently is a best practice to maximize its value. + +.. code-block:: sql + + SELECT health_check_monitoring('self_healing', 'path/to/my/input.json', 'directory/where/i/save/logs') + + +.. list-table:: self_healing Metrics + :widths: auto + :header-rows: 1 + + * - Metric + - Configuration Flag + - Default Value + - Description + * - ``Queries in queue`` + - ``queriesInQueue`` + - ``"from":0, "to":100`` + - Indicates the number of currently queued queries + * - ``Available workers per service`` + - ``availableWorkers`` + - ``"from":0, "to":5`` + - Indicates the number of unused Workers per service + * - ``Stuck snapshots`` + - ``stuckSnapshots`` + - ``"from":0, "to":2`` + - Indicates the number of currently stuck snapshots + + diff --git a/reference/sql/sql_statements/monitoring_commands/show_connections.rst b/reference/sql/sql_statements/utility_commands/show_connections.rst similarity index 99% rename from reference/sql/sql_statements/monitoring_commands/show_connections.rst rename to reference/sql/sql_statements/utility_commands/show_connections.rst index 1bd320a4c..3a498caeb 100644 --- a/reference/sql/sql_statements/monitoring_commands/show_connections.rst +++ b/reference/sql/sql_statements/utility_commands/show_connections.rst @@ -1,3 +1,5 @@ +:orphan: + .. _show_connections: ******************** diff --git a/reference/sql/sql_statements/utility_commands/show_locks.rst b/reference/sql/sql_statements/utility_commands/show_locks.rst new file mode 100644 index 000000000..30e5c3589 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/show_locks.rst @@ -0,0 +1,74 @@ +:orphan: + +.. _show_locks: + +********** +SHOW LOCKS +********** + +``SHOW_LOCKS`` returns a list of locks from across the cluster. + +Read more about locks in :ref:`concurrency_and_locks`. + +Syntax +====== + +.. code-block:: postgres + + SELECT SHOW_LOCKS() + +Output +====== + +This function returns a list of active locks. If no locks are active in the cluster, the result set will be empty. + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``stmt_id`` + - Statement ID that caused the lock. 
+ * - ``stmt_string`` + - Statement text + * - ``username`` + - The role that executed the statement + * - ``server`` + - The worker node's IP + * - ``port`` + - The worker node's port + * - ``locked_object`` + - The fully qualified name of the object being locked, separated with ``$`` (e.g. ``table$t$public$nba2`` for table ``nba2`` in schema ``public``, in database ``t``) + * - ``lockmode`` + - The locking mode (:ref:`inclusive` or :ref:`exclusive`). + * - ``statement_start_time`` + - Timestamp the statement started + * - ``lock_start_time`` + - Timestamp the lock was obtained + * - ``is_statement_active`` + - Whether the statement that caused the lock is still running + * - ``is_snapshot_active`` + - Whether the snapshot of the metadata keys created by the statement is still active + + +Examples +======== + +.. code-block:: postgres + + SELECT SHOW_LOCKS(); + +Output: + +.. code-block:: console + + statement id |statement string |username |server |port |locked object |lock mode |statement start time |lock start time |is_statement_active |is_snapshot_active + -------------+----------------------------------+---------+---------+-----+---------------+----------+---------------------+-------------------+--------------------+------------------ + 2 |create or replace table t (x int);|sqream |127.0.0.1|5000 |database$master|Inclusive |2024-07-04 15:07:02 |2024-07-04 15:07:02|1 |1 + + +Permissions +=========== + +This utility function requires a ``CONNECT`` permission at the database level. diff --git a/reference/sql/sql_statements/monitoring_commands/show_node_info.rst b/reference/sql/sql_statements/utility_commands/show_node_info.rst similarity index 97% rename from reference/sql/sql_statements/monitoring_commands/show_node_info.rst rename to reference/sql/sql_statements/utility_commands/show_node_info.rst index 345d16440..2672153af 100644 --- a/reference/sql/sql_statements/monitoring_commands/show_node_info.rst +++ b/reference/sql/sql_statements/utility_commands/show_node_info.rst @@ -1,10 +1,12 @@ +:orphan: + .. _show_node_info: ******************** SHOW_NODE_INFO ******************** -``SHOW_NODE_INFO`` returns a snapshot of the current query plan, similar to ``EXPLAIN ANALYZE`` from other databases. +``SHOW_NODE_INFO`` returns a snapshot of the current query plan, similarly to the ``EXPLAIN ANALYZE`` function used in other databases. The snapshot provides information about execution which can be used for monitoring and troubleshooting slow-running statements by helping identify long-running execution nodes (components that process data), etc.
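+For example, once a statement ID is known (the walkthrough below researches statement ``176``), you can take a snapshot of its plan with the following call; the ID shown here is illustrative: + +.. code-block:: sql + + -- Take a plan snapshot of a running statement by its statement ID + SELECT SHOW_NODE_INFO(176);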
@@ -108,7 +110,7 @@ This is a full list of node types: - Compress data with both CPU and GPU schemes * - ``CpuDecompress`` - CPU - - Decompression operation, common for longer ``VARCHAR`` types + - Decompression operation, common for longer ``TEXT`` types * - ``CpuLoopJoin`` - CPU - A non-indexed nested loop join, performed on the CPU @@ -283,8 +285,8 @@ Getting execution details for a statement t=> SELECT SHOW_SERVER_STATUS(); service | instanceid | connection_id | serverip | serverport | database_name | user_name | clientip | statementid | statement | statementstarttime | statementstatus | statementstatusstart --------+------------+---------------+--------------+------------+---------------+-----------+--------------+-------------+-----------------------------------------------------------------+---------------------+-----------------+--------------------- - sqream | | 152 | 192.168.1.91 | 5000 | t | sqream | 192.168.1.91 | 176 | SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | 25-12-2019 23:53:13 | Executing | 25-12-2019 23:53:13 - sqream | | 151 | 192.168.1.91 | 5000 | t | sqream | 192.168.0.1 | 177 | SELECT show_server_status() | 25-12-2019 23:51:31 | Executing | 25-12-2019 23:53:13 + sqream | | 152 | 192.168.1.91 | 5000 | t | sqream | 192.168.1.91 | 176 | SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1; | 2019-12-25 23:53:13 | Executing | 2019-12-25 23:53:13 + sqream | | 151 | 192.168.1.91 | 5000 | t | sqream | 192.168.0.1 | 177 | SELECT show_server_status() | 2019-12-25 23:51:31 | Executing | 2019-12-25 23:53:13 The statement ID we want to research is ``176``, running on worker ``192.168.1.91``. diff --git a/reference/sql/sql_statements/utility_commands/show_saved_query.rst b/reference/sql/sql_statements/utility_commands/show_saved_query.rst index 15ac4c1bd..2062dcb3a 100644 --- a/reference/sql/sql_statements/utility_commands/show_saved_query.rst +++ b/reference/sql/sql_statements/utility_commands/show_saved_query.rst @@ -1,28 +1,24 @@ +:orphan: + .. _show_saved_query: ******************** -SHOW_SAVED_QUERY +SHOW SAVED QUERY ******************** -``SHOW_SAVED_QUERY`` shows the query text for a :ref:`previously saved query`. +``SHOW_SAVED_QUERY`` shows the query text for a previously :ref:`saved query`. Read more in the :ref:`saved_queries` guide. -See also: ref:`save_query`, :ref:`execute_saved_query`, ref:`drop_saved_query`, ref:`list_saved_queries`. - -Permissions -============= - -Showing a saved query requires no special permissions. +See also: :ref:`save_query`, :ref:`execute_saved_query`, :ref:`drop_saved_query`, :ref:`list_saved_queries`. Syntax ========== -.. code-block:: postgres +.. code-block:: sql show_saved_query_statement ::= SELECT SHOW_SAVED_QUERY(saved_query_name) - ; saved_query_name ::= string_literal @@ -50,12 +46,16 @@ Examples Showing a previously saved query --------------------------------------- -.. code-block:: psql +.. code-block:: sql - t=> SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? AND Team = ?$$); - executed - t=> SELECT SHOW_SAVED_QUERY('select_by_weight_and_team'); + SELECT SAVE_QUERY('select_by_weight_and_team',$$SELECT * FROM nba WHERE Weight > ? AND Team = ?$$); + + SELECT SHOW_SAVED_QUERY('select_by_weight_and_team'); saved_query ----------------------------------------------- SELECT * FROM nba WHERE Weight > ? AND Team = ? +Permissions +============= + +Showing a saved query requires ``SELECT`` permissions on the saved query.
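+A typical flow is to inspect the saved text before executing it with concrete parameter values; a short sketch (the parameter values are illustrative, taken from the :ref:`execute_saved_query` example): + +.. code-block:: sql + + -- Review the saved query text, then execute it with parameters + SELECT SHOW_SAVED_QUERY('select_by_weight_and_team'); + SELECT EXECUTE_SAVED_QUERY('select_by_weight_and_team', 240, 'Toronto Raptors');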
diff --git a/reference/sql/sql_statements/utility_commands/show_server_status.rst b/reference/sql/sql_statements/utility_commands/show_server_status.rst new file mode 100644 index 000000000..c5090765c --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/show_server_status.rst @@ -0,0 +1,114 @@ +:orphan: + +.. _show_server_status: + +******************** +SHOW_SERVER_STATUS +******************** + +``SHOW_SERVER_STATUS`` returns a list of active sessions across the cluster. + +To list active statements on the current worker only, see :ref:`show_connections`. + +Syntax +========== + +The following is the correct syntax when showing your server status: + +.. code-block:: postgres + + SELECT SHOW_SERVER_STATUS() + +Parameters +============ + +The ``SHOW_SERVER_STATUS`` statement takes no parameters. + +Returns +========= + +The ``SHOW_SERVER_STATUS`` function returns a list of active sessions. If no sessions are active across the cluster, the result set will be empty. + +The following table shows the ``SHOW_SERVER_STATUS`` result columns: + +.. list-table:: Result Columns + :widths: auto + :header-rows: 1 + + * - Column + - Description + * - ``service`` + - Shows the statement's service name. + * - ``instanceid`` + - Shows the worker ID. + * - ``connection_id`` + - Shows the connection ID. + * - ``serverip`` + - Shows the worker end-point IP. + * - ``serverport`` + - Shows the worker end-point port. + * - ``database_name`` + - Shows the statement's database name. + * - ``user_name`` + - Shows the username running the statement. + * - ``clientip`` + - Shows the client IP. + * - ``statementid`` + - Shows the statement ID. + * - ``statement`` + - Shows the statement text. + * - ``statementstarttime`` + - Shows the statement start timestamp. + * - ``statementstatus`` + - Shows the statement status (see table below). + * - ``statementstatusstart`` + - Shows the most recently updated timestamp. + +.. include from here: 66 + +The following table shows the statement status values: + +.. list-table:: Statement Status Values + :widths: auto + :header-rows: 1 + + * - Status + - Description + * - ``Preparing`` + - The statement is being prepared. + * - ``In queue`` + - The statement is waiting for execution. + * - ``Initializing`` + - The statement has entered execution checks. + * - ``Executing`` + - The statement is executing. + * - ``Stopping`` + - The statement is in the process of stopping. + +.. include until here 86 + +Notes +=========== + +This utility shows the active sessions. Some sessions may be actively connected but not running any statements. + +Example +=========== + +Using SHOW_SERVER_STATUS to Get Statement IDs +---------------------------------------------------- +The following example shows how to use the ``SHOW_SERVER_STATUS`` statement to get statement IDs: + +..
code-block:: psql + + SELECT SHOW_SERVER_STATUS(); service | instanceid | connection_id | serverip | serverport | database_name | user_name | clientip | statementid | statement | statementstarttime | statementstatus | statementstatusstart + --------+------------+---------------+---------------+------------+---------------+------------------+---------------+-------------+-------------------------------------------------------------------------------------------------------+---------------------+-----------------+--------------------- + sqream | sqream_2 | 19 | 192.168.0.111 | 5000 | master | etl | 192.168.0.011 |2484923 | SELECT t1.account, t1.msisd from table a t1 join table b t2 on t1.id = t2.id where t1.msid='123123'; | 2022-01-17 16:19:31 | Executing | 2022-01-17 16:19:32 + sqream | sqream_1 | 2 | 192.168.1.112 | 5000 | master | etl | 192.168.1.112 |2484924 | select show_server_status(); | 2022-01-17 16:19:39 | Executing | 2022-01-17 16:19:39 + sqream | None | 248 | 192.168.1.112 | 5007 | master | maintenance_user | 192.168.1.112 |2484665 | select * from sqream_catalog.tables; | 2022-01-17 15:55:01 | In Queue | 2022-01-17 15:55:02 + +In this example, the ETL statement of interest has statement ID ``2484923`` and is running on worker ``192.168.0.111``. + +Permissions +============= + +The role must have ``SUPERUSER`` permissions. diff --git a/reference/sql/sql_statements/utility_commands/shutdown_server_command.rst b/reference/sql/sql_statements/utility_commands/shutdown_server_command.rst new file mode 100644 index 000000000..d46cd592d --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/shutdown_server_command.rst @@ -0,0 +1,113 @@ +:orphan: + +.. _shutdown_server_command: + +******************** +SHUTDOWN SERVER +******************** + + +The ``shutdown_server()`` utility command stops the SQream server. Called without arguments, it abruptly shuts the server down even while operations are executing; calling ``select shutdown_server([is_graceful, [timeout]]);`` instead performs a graceful shutdown, causing the server to wait for any queued statements to complete before shutting down. + +.. contents:: + :local: + :depth: 1 + + +How Does It Work? +======================== +Running the ``SHUTDOWN_SERVER`` command gives you more control over the following: + +* Preventing new queries from connecting to the server by: + + * Setting the server as unavailable in the metadata server. + * Unsubscribing the server from its service. + +* Stopping users from making new connections to the server. Attempting to connect to the server after activating a graceful shutdown displays the following message: + + .. code-block:: postgres + + Server is shutting down, no new connections are possible at the moment. + +* The amount of time to wait before shutting down the server. + +* Configurations related to shutting down the server. + +Syntax +========== +The following is the syntax for using the ``SHUTDOWN_SERVER`` command: + ..
code-block:: postgres + + select shutdown_server([true/false, [timeout]]); + +Returns +========== +Running the ``shutdown_server`` command returns no output. + +Parameters +============ +The following table shows the ``shutdown_server`` parameters: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + - Example + - Default + * - ``is_graceful`` + - Determines the method used to shut down the server. + - Selecting ``false`` shuts down the server while queries are running. Selecting ``true`` uses the graceful shutdown method. + - NA + * - ``timeout`` + - Sets the maximum number of minutes for the graceful shutdown method to run before the server is shut down using the standard method. + - ``([is_graceful, [30]]);`` + - Five minutes + +.. note:: Setting ``is_graceful`` to ``false`` and defining the ``timeout`` value shuts the server down mid-query after the defined time. + +You can define the ``timeout`` argument as the number of minutes after which a forceful shutdown will run, even if a graceful shutdown is in progress. + +Note that activating a forced shutdown with a timeout, such as ``select shutdown_server(false, 30)``, outputs the following error message: + +.. code-block:: postgres + + forced shutdown has no timeout timer + +.. note:: You can set the timeout value using the ``defaultGracefulShutdownTimeoutMinutes`` flag in the Acceleration Studio. + + +Examples +=================== +This section shows the following examples: + +**Example 1 - Activating a Forceful Shutdown** + +.. code-block:: postgres + + select shutdown_server(); + +**Example 2 - Activating a Graceful Shutdown** + +.. code-block:: postgres + + select shutdown_server(true); + +**Example 3 - Overriding the timeout Default with Another Value** + +.. code-block:: postgres + + select shutdown_server(true, 500); + +The ``timeout`` unit is minutes. + +Permissions +============= +The ``SUPERUSER`` permission is required to execute ``shutdown_server``. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/stop_statement.rst b/reference/sql/sql_statements/utility_commands/stop_statement.rst new file mode 100644 index 000000000..a49865065 --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/stop_statement.rst @@ -0,0 +1,74 @@ +:orphan: + +.. _stop_statement: + +******************** +STOP_STATEMENT +******************** + +``STOP_STATEMENT`` stops or aborts an active statement. + +Syntax +========== + +.. code-block:: sql + + stop_statement_statement ::= + SELECT STOP_STATEMENT(stmt_id) + + stmt_id ::= bigint + +Parameters +============ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``stmt_id`` + - The statement ID to stop + + +Notes +===== + +This utility always succeeds even if the statement does not exist or has already stopped. + +Example +======= + +1. Check your server status: + +..
code-block:: psql + + SELECT SHOW_SERVER_STATUS(); service | instanceid | connection_id | serverip | serverport | database_name | user_name | clientip | statementid | statement | statementstarttime | statementstatus | statementstatusstart + --------+------------+---------------+---------------+------------+---------------+------------------+---------------+-------------+-------------------------------------------------------------------------------------------------------+---------------------+-----------------+--------------------- + sqream | sqream_2 | 19 | 192.168.0.111 | 5000 | master | etl | 192.168.0.011 |2484923 | SELECT t1.account, t1.msisd from table a t1 join table b t2 on t1.id = t2.id where t1.msid='123123'; | 2022-01-17 16:19:31 | Executing | 2022-01-17 16:19:32 + sqream | sqream_1 | 2 | 192.168.1.112 | 5000 | master | etl | 192.168.1.112 |2484924 | select show_server_status(); | 2022-01-17 16:19:39 | Executing | 2022-01-17 16:19:39 + sqream | None | 248 | 192.168.1.112 | 5007 | master | maintenance_user | 192.168.1.112 |2484665 | select * from sqream_catalog.tables; | 2022-01-17 15:55:01 | In Queue | 2022-01-17 15:55:02 + +2. Retrieve stuck statement ID: + +.. code-block:: psql + + SELECT SHOW_CONNECTIONS(); + + ip | conn_id | conn_start_time | stmt_id | stmt_start_time | stmt + --------------+----------+---------------------+---------+---------------------+----------------------------------------------------------------------------------------------------- + 192.168.0.111 | 19 | 2022-01-17 15:50:05 | 2484923 | 2022-01-17 16:19:31 | SELECT t1.account, t1.msisd from table a t1 join table b t2 on t1.id = t2.id where t1.msid='123123'; + 192.168.1.112 | 2 | 2022-01-17 15:50:05 | 2484924 | 2022-01-17 16:19:39 | select show_server_status(); + 192.168.1.112 | 248 | 2022-01-17 15:50:05 | 2484665 | 2022-01-17 15:55:01 | select * from sqream_catalog.tables; + +3. Stop stuck query: + +.. code-block:: sql + + SELECT STOP_STATEMENT(2484923); + +Permissions +============= + +The role must have ``SUPERUSER`` permissions. \ No newline at end of file diff --git a/reference/sql/sql_statements/utility_commands/swap_table_names.rst b/reference/sql/sql_statements/utility_commands/swap_table_names.rst new file mode 100644 index 000000000..5aba6933a --- /dev/null +++ b/reference/sql/sql_statements/utility_commands/swap_table_names.rst @@ -0,0 +1,47 @@ +:orphan: + +.. _swap_table_names: + +**************** +SWAP TABLE NAMES +**************** + +The ``SWAP_TABLE_NAMES`` command enables you to swap the names of two tables within a schema. + +Syntax +====== + +.. code-block:: postgres + + SELECT SWAP_TABLE_NAMES ('<schema_name>', '<table_name_1>', '<table_name_2>') + +Parameters +========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``schema_name`` + - The name of the schema both tables are contained within + * - ``table_name_1``, ``table_name_2`` + - The names of the two tables to swap + +Notes +===== + +Views reference tables by name, so after a swap a view resolves to whichever table currently holds the name it references. A view that does not reference specific columns continues to work, but a view that references specific columns fails if the table now holding that name contains different columns. + +Examples +======== + +.. code-block:: postgres + + SELECT SWAP_TABLE_NAMES ('public', 'table1', 'table2'); + +Permissions +=========== + +This utility command requires permission to execute utility functions.
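+A common pattern is loading fresh data into a staging table and then swapping names, so that readers switch over in a single operation; a sketch assuming hypothetical tables ``public.sales`` (live) and ``public.sales_staging`` (freshly loaded): + +.. code-block:: sql + + -- After loading, swap names so public.sales points at the new data + SELECT SWAP_TABLE_NAMES ('public', 'sales', 'sales_staging'); + -- The previous live data is now available under public.sales_staging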
\ No newline at end of file diff --git a/reference/sql/sql_statements/wlm_commands/show_subscribed_instances.rst b/reference/sql/sql_statements/wlm_commands/show_subscribed_instances.rst index 95fd55022..301bb55b5 100644 --- a/reference/sql/sql_statements/wlm_commands/show_subscribed_instances.rst +++ b/reference/sql/sql_statements/wlm_commands/show_subscribed_instances.rst @@ -1,10 +1,10 @@ .. _show_subscribed_instances : *************************** -SHOW_SUBSCRIBED_INSTANCES +SHOW SUBSCRIBED INSTANCES *************************** -``SHOW_SUBSCRIBED_INSTANCES`` lists the cluster workers and their service queues. +``SHOW SUBSCRIBED INSTANCES`` lists the cluster workers and their service queues. .. note:: If you haven't already, read the :ref:`Workload manager guide`. diff --git a/reference/sql/sql_statements/wlm_commands/subscribe_service.rst b/reference/sql/sql_statements/wlm_commands/subscribe_service.rst index a36fcf3a4..143ef4acc 100644 --- a/reference/sql/sql_statements/wlm_commands/subscribe_service.rst +++ b/reference/sql/sql_statements/wlm_commands/subscribe_service.rst @@ -1,10 +1,10 @@ .. _subscribe_service : ******************* -SUBSCRIBE_SERVICE +SUBSCRIBE SERVICE ******************* -``SUBSCRIBE_SERVICE`` subscribes a worker to a service queue for the duration of the connected session. +``SUBSCRIBE SERVICE`` subscribes a worker to a service queue for the duration of the connected session. .. note:: If you haven't already, read the :ref:`Workload manager guide`. diff --git a/reference/sql/sql_statements/wlm_commands/unsubscribe_service.rst b/reference/sql/sql_statements/wlm_commands/unsubscribe_service.rst index a26df0554..91019889d 100644 --- a/reference/sql/sql_statements/wlm_commands/unsubscribe_service.rst +++ b/reference/sql/sql_statements/wlm_commands/unsubscribe_service.rst @@ -1,10 +1,10 @@ .. _unsubscribe_service : ******************** -UNSUBSCRIBE_SERVICE +UNSUBSCRIBE SERVICE ******************** -``UNSUBSCRIBE_SERVICE`` unsubscribes a worker from a service queue for the duration of the connected session. +``UNSUBSCRIBE SERVICE`` unsubscribes a worker from a service queue for the duration of the connected session. .. note:: If you haven't already, read the :ref:`Workload manager guide`. @@ -47,7 +47,7 @@ Notes * If the service name does not currently exist, it will be created -.. warning:: ``UNSUBSCRIBE_SERVICE`` applies the service subscription immediately, but the setting applies for the duration of the session. To apply a persistent setting, use the ``initialSubscribedServices`` configuration setting. Read the :ref:`Workload manager guide` for more information. +.. warning:: ``UNSUBSCRIBE_SERVICE`` removes the service subscription immediately, but the setting applies for the duration of the session. To apply a persistent setting, use the ``initialSubscribedServices`` configuration setting. Read the :ref:`Workload manager guide` for more information. Examples =========== diff --git a/reference/sql/sql_syntax/common_table_expressions.rst b/reference/sql/sql_syntax/common_table_expressions.rst index 8ea74dba7..63b2a1b2b 100644 --- a/reference/sql/sql_syntax/common_table_expressions.rst +++ b/reference/sql/sql_syntax/common_table_expressions.rst @@ -1,15 +1,15 @@ -.. _common_table_expressions: +:orphan: -********************************* -Common table expressions (CTEs) -********************************* +.. 
_common_table_expressions: +************************ +Common Table Expressions +************************ -Common table expressions or CTEs allow a complex subquery to be represented in a short way later on for improved readability, and reuse multiple times in a query. +A Common Table Expression (CTE) is a temporary named result set that can be referenced within a statement, allowing for more readable and modular queries. CTEs do not affect query performance. -CTEs do not affect query performance. Syntax -========== +====== .. code-block:: postgres @@ -39,62 +39,103 @@ Syntax | ( VALUES ( value_expr [, ... ] ) [, ... ] ) Examples -========== +======== + +Create the following ``nba`` table: + +.. code-block:: postgres + + CREATE OR REPLACE TABLE nba ( + Name TEXT, + Team TEXT, + Number INTEGER, + Position TEXT, + Age INTEGER, + Height TEXT, + Weight INTEGER, + College TEXT, + Salary INTEGER, + name0 TEXT + ); + + INSERT INTO nba (Name, Team, Number, Position, Age, Height, Weight, College, Salary, name0) + VALUES + ('Carmelo Anthony', 'New York Knicks', 7, 'SF', 32, '6-8', 240, 'Syracuse', 22875000, 'Carmelo Anthony'), + ('Chris Bosh', 'Miami Heat', 1, 'PF', 32, '6-11', 235, 'Georgia Tech', 22192730, 'Chris Bosh'), + ('Chris Paul', 'Los Angeles Clippers', 3, 'PG', 31, '6-0', 175, 'Wake Forest', 21468695, 'Chris Paul'), + ('Derrick Rose', 'Chicago Bulls', 1, 'PG', 27, '6-3', 190, 'Memphis', 20093064, 'Derrick Rose'), + ('Dwight Howard', 'Houston Rockets', 12, 'C', 30, '6-11', 265, NULL, 22359364, 'Dwight Howard'), + ('Kevin Durant', 'Oklahoma City Thunder', 35, 'SF', 27, '6-9', 240, 'Texas', 20158622, 'Kevin Durant'), + ('Kobe Bryant', 'Los Angeles Lakers', 24, 'SF', 37, '6-6', 212, NULL, 25000000, 'Kobe Bryant'), + ('LeBron James', 'Cleveland Cavaliers', 23, 'SF', 31, '6-8', 250, NULL, 22970500, 'LeBron James'), + ('Stanley Johnson', 'Detroit Pistons', 3, 'SF', 26, '6-7', 245, 'Connecticut', 3120360, 'Stanley Johnson'), + ('Andre Drummond', 'Detroit Pistons', 0, 'C', 28, '6-10', 279, 'Connecticut', 27093019, 'Andre Drummond'), + ('Aaron Gordon', 'Orlando Magic', 0, 'PF', 26, '6-8', 235, 'Arizona', 18136364, 'Aaron Gordon'), + ('Shabazz Napier', 'Orlando Magic', 13, 'PG', 31, '6-1', 175, 'Connecticut', 1378242, 'Shabazz Napier'); Simple CTE --------------- +---------- .. code-block:: psql - nba=> WITH s AS (SELECT "Name" FROM nba WHERE "Salary" > 20000000) - . SELECT * FROM nba AS n, s WHERE n."Name" = s."Name"; Name | Team | Number | Position | Age | Height | Weight | College | Salary | name0 - ----------------+-----------------------+--------+----------+-----+--------+--------+--------------+----------+---------------- - Carmelo Anthony | New York Knicks | 7 | SF | 32 | 6-8 | 240 | Syracuse | 22875000 | Carmelo Anthony - Chris Bosh | Miami Heat | 1 | PF | 32 | 6-11 | 235 | Georgia Tech | 22192730 | Chris Bosh - Chris Paul | Los Angeles Clippers | 3 | PG | 31 | 6-0 | 175 | Wake Forest | 21468695 | Chris Paul - Derrick Rose | Chicago Bulls | 1 | PG | 27 | 6-3 | 190 | Memphis | 20093064 | Derrick Rose - Dwight Howard | Houston Rockets | 12 | C | 30 | 6-11 | 265 | | 22359364 | Dwight Howard - Kevin Durant | Oklahoma City Thunder | 35 | SF | 27 | 6-9 | 240 | Texas | 20158622 | Kevin Durant - Kobe Bryant | Los Angeles Lakers | 24 | SF | 37 | 6-6 | 212 | | 25000000 | Kobe Bryant - LeBron James | Cleveland Cavaliers | 23 | SF | 31 | 6-8 | 250 | | 22970500 | LeBron James - -In this example, the ``WITH`` clause defines the temporary name ``r`` for the subquery which finds salaries over $20 million.
The result set becomes a valid table reference in any table expression of the subsequent SELECT clause. + WITH s AS (SELECT Name FROM nba WHERE Salary > 20000000) + SELECT * FROM nba AS n, s WHERE n.Name = s.Name; + name |team |number|position|age|height|weight|college |salary |name0 |name1 | + ---------------+---------------------+------+--------+---+------+------+------------+--------+---------------+---------------+ + Kobe Bryant |Los Angeles Lakers | 24|SF | 37|6-6 | 212| |25000000|Kobe Bryant |Kobe Bryant | + LeBron James |Cleveland Cavaliers | 23|SF | 31|6-8 | 250| |22970500|LeBron James |LeBron James | + Dwight Howard |Houston Rockets | 12|C | 30|6-11 | 265| |22359364|Dwight Howard |Dwight Howard | + Carmelo Anthony|New York Knicks | 7|SF | 32|6-8 | 240|Syracuse |22875000|Carmelo Anthony|Carmelo Anthony| + Chris Bosh |Miami Heat | 1|PF | 32|6-11 | 235|Georgia Tech|22192730|Chris Bosh |Chris Bosh | + Chris Paul |Los Angeles Clippers | 3|PG | 31|6-0 | 175|Wake Forest |21468695|Chris Paul |Chris Paul | + Kevin Durant |Oklahoma City Thunder| 35|SF | 27|6-9 | 240|Texas |20158622|Kevin Durant |Kevin Durant | + Derrick Rose |Chicago Bulls | 1|PG | 27|6-3 | 190|Memphis |20093064|Derrick Rose |Derrick Rose | + +In this example, the ``WITH`` clause defines the temporary name ``s`` for the subquery which finds salaries over $20 million. The result set becomes a valid table reference in any table expression of the subsequent ``SELECT`` clause. Nested CTEs ---------------- +----------- -SQream DB also supports any amount of nested CTEs, such as this: +SQreamDB also supports any amount of nested CTEs, such as this: .. code-block:: postgres WITH w AS (SELECT * FROM - (WITH x AS (SELECT * FROM nba) SELECT * FROM x ORDER BY "Salary" DESC)) - SELECT * FROM w ORDER BY "Weight" DESC; + (WITH x AS (SELECT * FROM nba) SELECT * FROM x ORDER BY Salary DESC)) + SELECT * FROM w ORDER BY Weight DESC; + name |team |number|position|age|height|weight|college |salary |name0 | + ---------------+---------------------+------+--------+---+------+------+------------+--------+---------------+ + Dwight Howard |Houston Rockets | 12|C | 30|6-11 | 265| |22359364|Dwight Howard | + LeBron James |Cleveland Cavaliers | 23|SF | 31|6-8 | 250| |22970500|LeBron James | + Carmelo Anthony|New York Knicks | 7|SF | 32|6-8 | 240|Syracuse |22875000|Carmelo Anthony| + Kevin Durant |Oklahoma City Thunder| 35|SF | 27|6-9 | 240|Texas |20158622|Kevin Durant | + Chris Bosh |Miami Heat | 1|PF | 32|6-11 | 235|Georgia Tech|22192730|Chris Bosh | + Kobe Bryant |Los Angeles Lakers | 24|SF | 37|6-6 | 212| |25000000|Kobe Bryant | + Derrick Rose |Chicago Bulls | 1|PG | 27|6-3 | 190|Memphis |20093064|Derrick Rose | + Chris Paul |Los Angeles Clippers | 3|PG | 31|6-0 | 175|Wake Forest |21468695|Chris Paul | Reusing CTEs ----------------- +------------ -SQream DB supports reusing CTEs several times in a query. +SQreamDB supports reusing CTEs several times in a query. CTEs are separated with commas. .. code-block:: psql - nba=> WITH - . nba_ct AS (SELECT "Name", "Team" FROM nba WHERE "College"='Connecticut'), - . nba_az AS (SELECT "Name", "Team" FROM nba WHERE "College"='Arizona') - . 
SELECT * FROM nba_az JOIN nba_ct ON nba_ct."Team" = nba_az."Team"; - Name | Team | name0 | team0 - ----------------+-----------------+----------------+---------------- - Stanley Johnson | Detroit Pistons | Andre Drummond | Detroit Pistons - Aaron Gordon | Orlando Magic | Shabazz Napier | Orlando Magic + WITH + nba_ct AS (SELECT "Name", "Team" FROM nba WHERE "College"='Connecticut'), + nba_az AS (SELECT "Name", "Team" FROM nba WHERE "College"='Arizona') + SELECT * FROM nba_az JOIN nba_ct ON nba_ct."Team" = nba_az."Team"; + name |team |name0 |team0 | + ------------+-------------+--------------+-------------+ + Aaron Gordon|Orlando Magic|Shabazz Napier|Orlando Magic| -Using CTEs with :ref:`create_table_as` ----------------------------------------- +Using CTEs with ``CREATE TABLE AS`` +----------------------------------- When used with :ref:`create_table_as`, the ``CREATE TABLE`` statement should appear before ``WITH``. @@ -103,6 +144,55 @@ When used with :ref:`create_table_as`, the ``CREATE TABLE`` statement should app CREATE TABLE weights AS WITH w AS - (SELECT * FROM - (WITH x AS (SELECT * FROM nba) SELECT * FROM x ORDER BY "Salary" DESC)) - SELECT * FROM w ORDER BY "Weight" DESC; + (SELECT * FROM + (WITH x AS (SELECT * FROM nba) SELECT * FROM x ORDER BY Salary DESC)) + SELECT * FROM w ORDER BY Weight DESC; + + SELECT * FROM weights; + + name |team |number|position|age|height|weight|college |salary |name0 | + ---------------+---------------------+------+--------+---+------+------+------------+--------+---------------+ + Andre Drummond |Detroit Pistons | 0|C | 28|6-10 | 279|Connecticut |27093019|Andre Drummond | + Dwight Howard |Houston Rockets | 12|C | 30|6-11 | 265| |22359364|Dwight Howard | + LeBron James |Cleveland Cavaliers | 23|SF | 31|6-8 | 250| |22970500|LeBron James | + Stanley Johnson|Detroit Pistons | 3|SF | 26|6-7 | 245|Connecticut | 3120360|Stanley Johnson| + Carmelo Anthony|New York Knicks | 7|SF | 32|6-8 | 240|Syracuse |22875000|Carmelo Anthony| + Kevin Durant |Oklahoma City Thunder| 35|SF | 27|6-9 | 240|Texas |20158622|Kevin Durant | + Chris Bosh |Miami Heat | 1|PF | 32|6-11 | 235|Georgia Tech|22192730|Chris Bosh | + Aaron Gordon |Orlando Magic | 0|PF | 26|6-8 | 235|Arizona |18136364|Aaron Gordon | + Kobe Bryant |Los Angeles Lakers | 24|SF | 37|6-6 | 212| |25000000|Kobe Bryant | + Derrick Rose |Chicago Bulls | 1|PG | 27|6-3 | 190|Memphis |20093064|Derrick Rose | + Chris Paul |Los Angeles Clippers | 3|PG | 31|6-0 | 175|Wake Forest |21468695|Chris Paul | + Shabazz Napier |Orlando Magic | 13|PG | 31|6-1 | 175|Connecticut | 1378242|Shabazz Napier | + +Using CTEs with ``INSERT`` +-------------------------- + +The :ref:`insert` statement should appear before ``WITH``. + +.. code-block:: postgres + + CREATE OR REPLACE TABLE nba_archive ( + Name TEXT, + Team TEXT, + Number INTEGER, + Position TEXT, + Age INTEGER, + Height TEXT, + Weight INTEGER, + College TEXT, + Salary INTEGER, + name0 TEXT + ); + + INSERT INTO nba_archive + WITH nba_info AS( + SELECT * + FROM nba + ) + SELECT * + FROM nba_info; + + SELECT * FROM nba_archive ; + + \ No newline at end of file diff --git a/reference/sql/sql_syntax/cross_database_query.rst b/reference/sql/sql_syntax/cross_database_query.rst new file mode 100644 index 000000000..822cf4920 --- /dev/null +++ b/reference/sql/sql_syntax/cross_database_query.rst @@ -0,0 +1,173 @@ +:orphan: + +.. 
_cross_database_query: + +*************************** +Cross-Database Query +*************************** + +Cross-database queries allow the retrieval and manipulation of data from different databases within a single SQreamDB cluster, through the execution of a single SQL statement or transaction. This capability is crucial when information relevant to a query spans multiple databases. By specifying the database context and employing fully qualified object names, such as ``database.schema.table``, it becomes possible to seamlessly integrate and analyze data distributed across diverse databases. + +For optimal performance, avoid querying more than 10 databases in a single query. + + +Syntax +========== + +.. code-block:: sql + + -- SELECT statement + + SELECT + <column_1>, + <column_2>, + ... + FROM + <database_name>.<schema_name>.<table_name> AS <alias_1> + JOIN + <database_name>.<schema_name>.<table_name> AS <alias_2> + ON + <alias_1>.<column_name> = <alias_2>.<column_name> + WHERE + <condition> + AND + <condition> + + -- CREATE TABLE statement + + CREATE TABLE + <database_name>.<schema_name>.<table_name> ( + <column_name> <data_type>, + <column_name> <data_type>, + ... + ) + + -- CREATE FOREIGN TABLE statement + + CREATE FOREIGN TABLE + <database_name>.<schema_name>.<table_name> ( + <column_name> <data_type>, + <column_name> <data_type>, + ... + ) + + -- ALTER TABLE statement + + ALTER TABLE + <database_name>.<schema_name>.<table_name> + ADD COLUMN + <column_name> <data_type> + + -- CREATE VIEW statement + + CREATE VIEW + <database_name>.<schema_name>.<view_name> (<column_1>, <column_2>, ...) + AS + SELECT + <alias_1>.<column_name>, + <alias_2>.<column_name>, + ... + FROM + <database_name>.<schema_name>.<table_name> AS <alias_1> + JOIN + <database_name>.<schema_name>.<table_name> AS <alias_2> + ON + <alias_1>.<column_name> = <alias_2>.<column_name> + WHERE + <condition> + AND + <condition> + + -- INSERT INTO statement + + INSERT INTO + <database_name>.<schema_name>.<table_name> (<column_1>, <column_2>, ...) + VALUES + (<value_1>, <value_2>, ...) + + -- UPDATE statement + + UPDATE + <database_name>.<schema_name>.<table_name> + SET + <column_1> = <value_1>, + <column_2> = <value_2> + WHERE + <condition> + + -- DELETE statement + + DELETE FROM + <database_name>.<schema_name>.<table_name> + WHERE + <condition> + + -- TRUNCATE TABLE statement + + TRUNCATE TABLE + <database_name>.<schema_name>.<table_name> + + -- DROP TABLE statement + + DROP TABLE + <database_name>.<schema_name>.<table_name> + + +Parameters +=========== + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``column_name`` + - The name of a specific column to read or write data from + * - ``database_name`` + - The name of a specific database to read or write data from + * - ``schema_name`` + - The name of a specific schema to read or write data within + * - ``table_name`` + - The name of a specific table to read or write data from + * - ``condition`` + - The condition for performing a specific operation + +Examples +========= + +Querying data from two tables in different databases: + +.. code-block:: sql + + SELECT * + FROM database1.schema1.table1 t1 + JOIN database2.schema2.table2 t2 + ON t1.id = t2.id + WHERE t1.date >= '2022-01-01' AND t2.status = 'active'; + +Querying data from two tables in different schemas and databases: + +.. code-block:: sql + + SELECT * + FROM database1.schema1.table1 t1 + JOIN database2.schema2.table2 t2 + ON t1.id = t2.id + WHERE t1.date >= '2022-01-01' AND t2.status = 'active'; + +Querying data from three tables in different databases: + +.. code-block:: sql + + SELECT t1.*, t2.*, t3.* + FROM database1.schema1.table1 t1 + JOIN database2.schema2.table2 t2 + ON t1.id = t2.id + JOIN database3.schema3.table3 t3 + ON t2.id = t3.id + WHERE t1.date >= '2022-01-01' AND t2.status = 'active' AND t3.quantity > 10; + +Limitation +========== + +The cross-database syntax is not supported for querying SQreamDB's logical schema, ``sqream_catalog``. + diff --git a/reference/sql/sql_syntax/index.rst b/reference/sql/sql_syntax/index.rst index 90caf01e5..498fec6c3 100644 --- a/reference/sql/sql_syntax/index.rst +++ b/reference/sql/sql_syntax/index.rst @@ -4,18 +4,70 @@ SQL Syntax Features ********************** -SQream DB supports SQL from the ANSI 92 syntax. +SQreamDB supports the ANSI-92 SQL syntax. + +..
list-table:: + :widths: auto + :header-rows: 1 + + * - Features + - Description + * - :ref:`keywords_and_identifiers` + - Keywords are reserved words with specific meanings, while identifiers are used to name database objects like tables and columns. + * - :ref:`literals` + - Literals are fixed values representing specific data types, such as numbers or strings, used directly in SQL statements. + * - :ref:`scalar_expressions` + - Scalar expressions are single-value computations that operate on one or more values to produce a single result. + * - :ref:`cross_database_query` + - Cross-database queries involve accessing and manipulating data from multiple databases within a single SQL statement or operation. + * - :ref:`joins` + - Joins combine rows from two or more tables based on a related column to retrieve data from multiple sources in a single result set. + * - :ref:`common_table_expressions` + - Common Table Expressions (CTEs) are named temporary result sets that simplify complex queries by allowing the definition of subqueries for better readability and reusability. + * - :ref:`window_functions` + - Window Functions perform calculations across a specified range of rows related to the current row, offering advanced analytics and aggregation within result sets. + * - :ref:`subqueries` + - Subqueries are nested queries that are embedded within a larger query to retrieve data, perform calculations, or filter results based on the outcome of the inner query. + * - :ref:`null_handling` + - Null handling involves managing and evaluating the presence of null values, representing unknown or undefined data, to avoid unexpected results in queries and expressions. + * - :ref:`sqream_scripting` + - Metalanguage scripting enhances your interaction with SQL by providing conventions which allow dynamic generation, management, and automation of SQL code. + * - :ref:`pivot_unpivot` + - convert row-level data into columnar representation. + + .. toctree:: - :maxdepth: 2 - :caption: SQL Syntax Topics + :caption: :glob: - + :maxdepth: 6 + :titlesonly: + :hidden: + keywords_and_identifiers literals scalar_expressions + cross_database_query joins common_table_expressions window_functions subqueries null_handling + sqream_scripting + pivot_unpivot + + + + + + + + + + + + + + + + diff --git a/reference/sql/sql_syntax/joins.rst b/reference/sql/sql_syntax/joins.rst index 2563e7a5d..e14b69495 100644 --- a/reference/sql/sql_syntax/joins.rst +++ b/reference/sql/sql_syntax/joins.rst @@ -1,8 +1,8 @@ -.. _joins: +:orphan: -*************************** +***** Joins -*************************** +***** The ``JOIN`` clause combines results from two or more table expressions (tables, external tables, views) based on a related column or other condition. Performing a join outputs a new result set. For example, two tables containing one or more columns in common can be joined to match or correlate with rows from another table. @@ -10,7 +10,8 @@ The ``JOIN`` clause combines results from two or more table expressions (tables, Syntax -========== +====== + The following shows the correct syntax for creating a **join**: .. code-block:: postgres @@ -30,7 +31,8 @@ The following shows the correct syntax for creating a **join**: MERGE | LOOP Join Types -------------- +---------- + The **Join Types** section describes the following join types: * :ref:`Inner joins` @@ -41,13 +43,13 @@ The **Join Types** section describes the following join types: .. 
_inner_joins: Inner Joins -^^^^^^^^^^^^ +^^^^^^^^^^^ + The following shows the correct syntax for creating an **inner join**: .. code-block:: postgres - left_side [ INNER ] JOIN right_side ON value_expr - left_side [ INNER ] JOIN right_side USING ( join_column [, ... ] ) + left_side [ INNER ] JOIN right_side ON value_expr Inner joins are the default join type and return rows from the ``left_side`` and ``right_side`` based on a matching condition. @@ -60,7 +62,6 @@ An inner join can also be specified by listing several tables in the ``FROM`` cl [ { INNER JOIN | LEFT [OUTER] JOIN | RIGHT [OUTER] JOIN - | FULL [OUTER] JOIN } table2 ON table1.column1 = table2.column1 ] Omitting the ``ON`` or ``WHERE`` clause creates a ``CROSS JOIN``, where every ``left_side`` row is matched with every ``right_side`` row. @@ -72,13 +73,13 @@ For an inner join example, see :ref:`Inner Join Example`. .. _left_outer_joins: Left Outer Joins -^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^ + The following shows the correct syntax for creating a **left outer join**: .. code-block:: postgres left_side LEFT [ OUTER ] JOIN right_side ON value_expr - left_side LEFT [ OUTER ] JOIN right_side USING ( join_column [, ... ] ) Left outer joins are similar to inner joins, except that for every ``left_side`` row without a matching condition, a ``NULL`` value is returned for the corresponding ``right_side`` column. For a left outer join example, see :ref:`Left Join Example`. .. _right_outer_joins: Right Outer Joins -^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^ + The following shows the correct syntax for creating a **right outer join**: .. code-block:: postgres left_side RIGHT [ OUTER ] JOIN right_side ON value_expr - left_side RIGHT [ OUTER ] JOIN right_side USING ( join_column [, ... ] ) Right outer joins are similar to inner joins, except that for every ``right_side`` row without a matching condition, a ``NULL`` value is returned for the corresponding ``left_side`` column. @@ -104,7 +105,8 @@ For a right outer join example, see :ref:`Right Join Example .. _cross_joins: Cross Joins -^^^^^^^^^^^^^ +^^^^^^^^^^^ + The following shows the correct syntax for creating a **cross join**: .. code-block:: postgres @@ -117,7 +119,7 @@ The ``CROSS JOIN`` clause cannot have an ``ON`` clause, but the ``WHERE`` clause The following is an example of two tables that will be used as the basis for a cross join: -.. image:: /_static/images/joins/color_table.png +.. image:: /_static/images/color_table.png The following is the output result of the cross join: @@ -149,7 +151,7 @@ For a cross join example, see :ref:`Cross Join Example`. The ON Condition -------------- +---------------- The ``ON`` condition is a value expression that generates a Boolean output to identify whether rows match. For example, the following is displayed when two name columns match: The ``ON`` clause is optional for ``LEFT`` and ``RIGHT`` joins. However, excluding it results in a computationally intensive cross join. -.. tip:: SQream DB does not support the ``USING`` syntax. However, queries can be easily rewritten. ``left_side JOIN right_side using (name)`` is equivalent to ``ON left_side.name = right_side.name`` Join Type Examples -============= +================== + The examples in this section are based on a pair of tables with the following structure and content: .. code-block:: postgres @@ -181,7 +182,8 @@ The examples in this section are based on a pair of tables with the following st ..
_inner_join_example: Inner Join Example ------------- +------------------ + The following is an example of an inner join. .. code-block:: psql @@ -199,7 +201,8 @@ Notice in the example above that values with no matching conditions do not appea .. _left_join_example: Left Join Example ------------- +----------------- + The following is an example of a left join: .. code-block:: psql @@ -218,7 +221,8 @@ The following is an example of a left join: .. _right_join_example: Right Join Example ------------- +------------------ + The following is an example of a right join: .. code-block:: psql @@ -238,7 +242,8 @@ The following is an example of a right join: .. _cross_join_example: Cross Join Example -------------- +------------------ + The following is an example of a cross join: .. code-block:: psql @@ -303,7 +308,7 @@ Specifying multiple comma-separated tables is equivalent to a cross join, which 5 | 5 Join Hints -------------- +---------- **Join hints** can be used to override the query compiler and choose a particular join algorithm. The available algorithms are ``LOOP`` (corresponding to non-indexed nested loop join algorithm), and ``MERGE`` (corresponding to sort merge join algorithm). If no algorithm is specified, a loop join is performed by default. @@ -323,4 +328,4 @@ The following is an example of using a join hint: --+--- 2 | 2 4 | 4 - 5 | 5 + 5 | 5 \ No newline at end of file diff --git a/reference/sql/sql_syntax/keywords_and_identifiers.rst b/reference/sql/sql_syntax/keywords_and_identifiers.rst index e5d1a6fcf..8d35e7bea 100644 --- a/reference/sql/sql_syntax/keywords_and_identifiers.rst +++ b/reference/sql/sql_syntax/keywords_and_identifiers.rst @@ -1,70 +1,154 @@ +:orphan: + .. _keywords_and_identifiers: -*************************** +************************ +Identifiers and Keywords +************************ + Identifiers -*************************** +=========== + **Identifiers** are names given to SQL entities, such as tables, columns, databases, and variables. Identifiers must be unique so that entities can be correctly identified during the execution of a program. Identifiers can also be used to change a column name in the result (column alias) in a ``SELECT`` statement. Identifiers can be either quoted or unquoted and a maximum 128 characters long. Identifiers are sometimes referred to as "names". Regular identifiers must follow these rules: -* Must not contain any special characters except for underscores (``_``). -* Must be case-insensitive. SQream converts all identifiers to lowercase unless quoted. -* Does not equal any keywords, such as ``SELECT``, ``OR``, or ``AND``, etc. +* Must not contain a whitespace character or any special characters except for underscores (``_``) +* Must be case-insensitive. SQream converts all identifiers to lowercase unless quoted +* Does not equal any keywords, such as ``SELECT``, ``OR``, or ``AND``, etc To bypass the rules above you can surround an identifier with double quotes (``"``). Quoted identifiers must follow these rules: -* Must be surrounded with double quotes (``"``). +* Must be surrounded with double quotes (``"``) * May contain any ASCII character except ``@``, ``$`` or ``"``. -* Must be case-sensitive and referenced with double quotes. +* Must be case-sensitive and referenced with double quotes (``"``) + +Examples +-------- + +Creating quoted and unquoted identifiers: + +.. 
+.. code-block:: postgres
+
+   CREATE ROLE "developer"; -- quoted identifiers preserve case - will create "developer"
+   CREATE ROLE "Developer"; -- quoted identifiers preserve case - will create "Developer"
+   CREATE ROLE Developer;   -- unquoted identifiers ignore case - will create "developer"
+
+These are all valid examples when quoted, but are invalid when unquoted:
+
+.. code-block:: postgres
+
+   CREATE SCHEMA "my schema";
+
+   CREATE SCHEMA "123schema";
+
+   CREATE SCHEMA my schema; -- invalid
+
+   CREATE SCHEMA 123schema; -- invalid
+
+Using an invalid character, such as ``@``:
+
+.. code-block:: postgres
+
+   CREATE SCHEMA "my schema@master";
+
+produces the following error message:
+
+.. code-block:: console
+
+   Status: Ended with error
+   Error preparing statement: Unsupported character '@' in identifier: "my schema@master"
+   Quoted identifiers cannot contain the character '@'.
+   Quoted identifiers may contain any ASCII character with code between 32 and 126 except for:
+   - @
+   - $
+   - "
+
+
+Keywords
+========

 Identifiers are different from **keywords**, which are predefined words reserved with specific meanings in a statement. Some examples of keywords are ``SELECT``, ``CREATE``, and ``WHERE``. Note that keywords **cannot** be used as identifiers.

-The following table shows a full list of the reserved keywords:
-
-+-------------------------------------------------------------------------------------------------+
-| **Keywords**                                                                                     |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``ALL``           | ``CURRENT_CATALOG`` | ``HASH``           | ``NOT``          | ``SIMILAR``   |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``ANALYSE``       | ``CURRENT_ROLE``    | ``HAVING``         | ``NOTNULL``      | ``SOME``      |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``ANALYZE``       | ``CURRENT_TIME``    | ``ILIKE``          | ``NULL``         | ``SYMMETRIC`` |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``AND``           | ``CURRENT_USER``    | ``IN``             | ``OFFSET``       | ``SYMMETRIC`` |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``ANY``           | ``DEFAULT``         | ``INITIALLY``      | ``ON``           | ``TABLE``     |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``ARRAY``         | ``DEFERRABLE``      | ``INNER``          | ``ONLY``         | ``THEN``      |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``AS``            | ``DESC``            | ``INTERSECT``      | ``OPTION``       | ``TO``        |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``ASC``           | ``DISTINCT``        | ``INTO``           | ``OR``           | ``TRAILING``  |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``AUTHORIZATION`` | ``DO``              | ``IS``             | ``ORDER``        | ``TRUE``      |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``BINARY``        | ``ELSE``            | ``ISNULL``         | ``OUTER``        | ``UNION``     |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``BOTH``          | ``END``             | ``JOIN``           | ``OVER``         | ``UNIQUE``    |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``CASE``          | ``EXCEPT``          | ``LEADING``        | ``OVERLAPS``     | ``USER``      |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``CAST``          | ``FALSE``           | ``LEFT``           | ``PLACING``      | ``USING``     |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``CHECK``         | ``FETCH``           | ``LIKE``           | ``PRIMARY``      | ``VARIADIC``  |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``COLLATE``       | ``FOR``             | ``LIMIT``          | ``REFERENCES``   | ``VERBOSE``   |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``COLUMN``        | ``FREEZE``          | ``LOCALTIME``      | ``RETURNING``    | ``WHEN``      |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``CONCURRENTLY``  | ``FROM``            | ``LOCALTIMESTAMP`` | ``RIGHT``        | ``WHERE``     |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``CONSTRAINT``    | ``FULL``            | ``LOOP``           | ``RLIKE``        | ``WINDOW``    |
-+-------------------+---------------------+--------------------+------------------+---------------+
-| ``CREATE``        | ``GRANT``           | ``MERGE``          | ``SELECT``       | ``WITH``      |
-+-------------------+---------------------+--------------------+------------------+               |
-| ``CROSS``         | ``GROUP``           | ``NATURAL``        | ``SESSION_USER`` |               |
-+-------------------+---------------------+--------------------+------------------+---------------+
+SQreamDB reserved keywords:
+
+.. glossary::
+
+   A
+      ``ABORT``, ``ADD``, ``ALL``, ``ALTER``, ``ANALYSE``, ``ANALYZE``, ``AND``, ``ANY``, ``ARRAY``, ``AS``, ``ASC``, ``AUDITLOG``, ``AUTHORIZATION``
+
+   B
+      ``BACKUP``, ``BEGIN``, ``BETWEEN``, ``BIGINT``, ``BINARY``, ``BOTH``, ``BREAK``, ``BROWSE``, ``BULK``, ``BY``
+
+   C
+      ``CASE``, ``CAST``, ``CASCADE``, ``CHECK``, ``CHECKPOINT``, ``CLOSE``, ``CLUSTERED``, ``COLLATE``, ``COLUMN``, ``COMMENT``, ``COMPUTE``, ``CONCURRENTLY``, ``CONSTRAINT``, ``CONTAINSTABLE``, ``CONTINUE``, ``CONVERT``, ``CREATE``, ``CROSS``, ``CURRENT``, ``CURRENT_CATALOG``, ``CURRENT_ROLE``, ``CURRENT_TIME``, ``CURRENT_USER``, ``CURSOR``
+
+   D
+      ``DATABASE``, ``DBCC``, ``DEALLOCATE``, ``DECLARE``, ``DEFAULT``, ``DEFERRABLE``, ``DELETE``, ``DENY``, ``DESC``, ``DISTINCT``, ``DISTRIBUTED``, ``DO``, ``DROP``, ``DUMP``
+
+   E
+      ``ELSE``, ``END``, ``ERRLVL``, ``ESCAPE``, ``EXEC``, ``EXECUTE``, ``EXCEPT``, ``EXISTS``, ``EXIT``, ``EXTERNAL``
+
+   F
+      ``FALSE``, ``FETCH``, ``FILLFACTOR``, ``FILE``, ``FOR``, ``FOREIGN``, ``FREEZE``, ``FREETEXT``, ``FREETEXTTABLE``, ``FROM``, ``FULL``, ``FUNCTION``
+
+   G
+      ``GOTO``, ``GRANT``, ``GROUP``
+
+   H
+      ``HASH``, ``HAVING``, ``HOLDLOCK``
+
+   I
+      ``IDENTITY``, ``IDENTITYCOL``, ``IDENTITY_INSERT``, ``IF``, ``ILIKE``, ``IN``, ``INITIALLY``, ``INNER``, ``INDEX``, ``INSERT``, ``IS``, ``ISCASTABLE``, ``ISNULL``
+
+   J
+      ``JOIN``
+
+   K
+      ``KEY``, ``KILL``
+
+   L
+      ``LEFT``, ``LEADING``, ``LIKE``, ``LIMIT``, ``LINENO``, ``LOAD``, ``LOCALTIME``, ``LOCALTIMESTAMP``, ``LOOP``
+
+   M
+      ``MERGE``
+
+   N
+      ``NATIONAL``, ``NATURAL``, ``NOCHECK``, ``NONCLUSTERED``, ``NOT``, ``NOTNULL``, ``NULL``, ``NULLIF``
+
+   O
+      ``OFF``, ``OFFSET``, ``OFFSETS``, ``OF``, ``ON``, ``ONLY``, ``OPEN``, ``OPENDATASOURCE``, ``OPENQUERY``, ``OPENROWSET``, ``OPENXML``, ``OPTION``, ``OR``, ``ORDER``, ``OUTER``, ``OVER``, ``OVERLAPS``
+
+   P
+      ``PERCENT``, ``PLACING``, ``PLAIN``, ``PLAINS``, ``PLAINTEXT``, ``PLB``, ``PLI``, ``PLM``, ``PLP``, ``PLSQL``, ``PRECISION``, ``PRIMARY``, ``PRINT``, ``PROC``, ``PROCEDURE``, ``PUBLICATION``, ``PUBLISH``, ``PUBLICIZE``
+
+   R
+      ``RAISEERROR``, ``READ``, ``READTEXT``, ``REFERENCES``, ``RECONFIGURE``, ``REPLICATION``, ``RESTORE``, ``RESTRICT``, ``RETURN``, ``RETURNING``, ``REVERT``, ``REVOKE``, ``RIGHT``, ``RLIKE``, ``ROLLBACK``, ``ROWCOUNT``, ``ROWGUIDCOL``, ``RULE``
+
+   S
+      ``SAVE``, ``SCHEMA``, ``SECURITYAUDIT``, ``SELECT``, ``SESSION_USER``, ``SET``, ``SETUSER``, ``SHUTDOWN``, ``SIMILAR``, ``SOME``, ``STATISTICS``, ``SYMMETRIC``
+
+   T
+      ``TABLE``, ``TABLESAMPLE``, ``TEXTSIZE``, ``THEN``, ``TO``, ``TOP``, ``TRANSACTION``, ``TRAN``, ``TRIGGER``, ``TRUNCATE``, ``TRUE``
+
+   U
+      ``UNION``, ``UNIQUE``, ``UNPIVOT``, ``UPDATE``, ``UPDATETEXT``, ``USE``, ``USER``, ``USING``
+
+   V
+      ``VARIADIC``, ``VERBOSE``, ``VIEW``, ``VALUES``, ``VARYING``
+
+   W
+      ``WAITFOR``, ``WHEN``, ``WHERE``, ``WHILE``, ``WINDOW``, ``WITH``, ``WRITETEXT``
diff --git a/reference/sql/sql_syntax/literals.rst b/reference/sql/sql_syntax/literals.rst
index 906684590..8dff12132 100644
--- a/reference/sql/sql_syntax/literals.rst
+++ b/reference/sql/sql_syntax/literals.rst
@@ -1,8 +1,10 @@
+:orphan:
+
 .. _literals:

-***************************
+********
 Literals
-***************************
+********

 Literals represent constant values.

@@ -55,7 +57,7 @@ Examples

 .. note:: The actual data type of the value changes based on context, the format used, and the value itself. For example, any number containing the decimal point will be considered ``FLOAT`` by default.

-   Any whole number will considered ``INT``, unless the value is larger than the :ref:`maximum value`, in which case the type will become a ``BIGINT``.
+   Any whole number will be considered ``INT``, unless the value is larger than the :ref:`maximum value`, in which case the type will become a ``BIGINT``.

 .. note:: A numeric literal that contains neither a decimal point nor an exponent is considered ``INT`` by default if its value fits in type ``INT`` (32 bits). If not, it is considered ``BIGINT`` by default if its value fits in type ``BIGINT`` (64 bits). If neither are true, it is considered ``FLOAT``. Literals that contain decimal points and/or exponents are always considered ``FLOAT``.

@@ -86,7 +88,7 @@ Examples

    '1997-01-01' -- This is a string

-The actual data type of the value changes based on context, the format used, and the value itself. In the example below, the first value is interpreted as a ``DATE``, while the second is interpreted as a ``VARCHAR``.
+The actual data type of the value changes based on context, the format used, and the value itself. In the example below, the first value is interpreted as a ``DATE``, while the second is interpreted as a ``TEXT``.

 .. code-block:: postgres

@@ -103,6 +105,7 @@ This section describes the following types of literals:

 Regular String Literals
 -----------------------
+
 In SQL, a **regular string literal** is a sequence of zero or more characters bounded by single quotes (``'``):

 .. code-block:: postgres

@@ -135,7 +138,8 @@ The following are some examples of regular string literals:

 .. _dollar_quoted_string_literals:

 Dollar-Quoted String Literals
------------------------
+-----------------------------
+
 **Dollar-quoted string literals** consist of a dollar sign (``$``), an optional "tag" of zero or more characters, another dollar sign, an arbitrary sequence of characters that make up the string content, a dollar sign, the same tag at the beginning of the dollar quote, and another dollar sign.
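+
+For example (a minimal illustrative sketch; the tag is optional and arbitrary), dollar quoting avoids the need to escape single quotes inside the string:
+
+.. code-block:: postgres
+
+   SELECT $$It's easy to include 'quotes' here$$;
+
+   SELECT $tag$Strings may also contain $$nested dollar signs$$$tag$;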
@@ -211,7 +215,7 @@ Typed Literals

    literal :: type_name

-See also :ref:`cast` for more information about supported casts.
+See also :ref:`supported_casts` for more information about supported casts.

 Syntax Reference
-------------------
@@ -239,7 +243,6 @@ The following is a syntax reference for typed literals:

   | REAL
   | DATE
   | DATETIME
-  | VARCHAR ( digits )
   | TEXT ( digits )

 Examples
diff --git a/reference/sql/sql_syntax/null_handling.rst b/reference/sql/sql_syntax/null_handling.rst
index c8ef596a2..83403a980 100644
--- a/reference/sql/sql_syntax/null_handling.rst
+++ b/reference/sql/sql_syntax/null_handling.rst
@@ -1,3 +1,5 @@
+:orphan:
+
 .. _null_handling:

 ***************************
diff --git a/reference/sql/sql_syntax/pivot_unpivot.rst b/reference/sql/sql_syntax/pivot_unpivot.rst
new file mode 100644
index 000000000..b7d44f71b
--- /dev/null
+++ b/reference/sql/sql_syntax/pivot_unpivot.rst
@@ -0,0 +1,142 @@
+:orphan:
+
+.. _pivot_unpivot:
+
+***************
+PIVOT & UNPIVOT
+***************
+
+``PIVOT`` allows you to convert row-level data into a columnar representation. This technique is particularly useful when you need to summarize and visualize data.
+``UNPIVOT`` does the opposite, transforming columnar data into rows. This operation is invaluable for scenarios where you wish to explore data in a more granular manner.
+
+
+Syntax
+======
+
+.. code-block:: postgres
+
+   SELECT <column_list>
+   FROM <table_source>
+   [
+     PIVOT
+     ( <pivot_expression> AS <alias> [, <pivot_expression> AS <alias>, ... , <pivot_expression> AS <alias>]
+       FOR <pivot_column>
+       IN ([<alias> AS] <value>, [<alias> AS] <value>, ... , [<alias> AS] <value>)
+     )
+     [AS <table_alias>]
+   ]
+   [
+     UNPIVOT
+     ( <unpivot_expression>
+       FOR <name_column>
+       IN ([<alias> AS] expression1, [<alias> AS] expression2, ... , [<alias> AS] expressionN)
+     )
+     [AS <table_alias>]
+   ]
+
+   pivot_expression := <aggregate_function> ( <expression> )
+
+   unpivot_expression := <expression>
+
+
+Limitations
+===========
+
+* The number of resulting columns for ``PIVOT`` is limited to 8,000.
+* The number of resulting columns for ``UNPIVOT`` is limited to 2,000.
+
+
+PIVOT Example
+=============
+
+Create a sales table:
+
+.. code-block:: postgres
+
+   CREATE OR REPLACE TABLE Sales (
+     ProductID int,
+     ProductName varchar(50),
+     SalesDate date,
+     Revenue decimal(10, 2)
+   );
+
+Populate data:
+
+.. code-block:: postgres
+
+   INSERT INTO Sales (ProductID, ProductName, SalesDate, Revenue) VALUES
+   (1, 'Product A', '2024-01-01', 100.00),
+   (2, 'Product B', '2024-01-01', 150.00),
+   (3, 'Product C', '2024-01-01', 200.00),
+   (1, 'Product A', '2024-01-02', 120.00),
+   (2, 'Product B', '2024-01-02', 180.00);
+
+The following query pivots the SalesDate column, creating a new column for each specified date.
+``SUM(Revenue)`` aggregates the Revenue for each product and date combination.
+The ``PIVOT`` operation creates a new table with ProductName as the first column and additional columns for each specified SalesDate. The values in these columns are the summed Revenue for each product on that date.
+
+.. code-block:: postgres
+
+   SELECT * FROM (
+     SELECT ProductName, SalesDate, Revenue
+     FROM Sales
+   ) AS SourceTable
+   PIVOT (
+     SUM(Revenue) AS RevenueSum
+     FOR SalesDate IN ("2024-01-01", "2024-01-02")
+   ) AS PivotTable;
+
+   Product A ,100.00,120.00
+   Product B ,150.00,180.00
+   Product C ,200.00,\N
+   3 rows
+
+UNPIVOT Example
+===============
+
+Create a sales table:
+
+.. code-block:: postgres
+
+   CREATE OR REPLACE TABLE Sales (
+     ProductID int,
+     ProductName varchar(50),
+     JanuaryRevenue decimal(10, 2),
+     FebruaryRevenue decimal(10, 2),
+     MarchRevenue decimal(10, 2)
+   );
+
+Populate data:
+
+.. code-block:: postgres
+
+   INSERT INTO Sales (ProductID, ProductName, JanuaryRevenue, FebruaryRevenue, MarchRevenue) VALUES
+   (1, 'Product A', 100.00, 120.00, 150.00),
+   (2, 'Product B', 150.00, 180.00, 200.00),
+   (3, 'Product C', 200.00, 220.00, 250.00);
+
+The following query unpivots the JanuaryRevenue, FebruaryRevenue, and MarchRevenue columns, creating a new column Month and a column Revenue to store the corresponding values. The ``UNPIVOT`` operation creates a new table with ProductID, ProductName, Month, and Revenue columns, effectively transforming the column-based data into a row-based format.
+
+.. code-block:: postgres
+
+   SELECT ProductID, ProductName, Month, Revenue
+   FROM (
+     SELECT ProductID, ProductName, JanuaryRevenue, FebruaryRevenue, MarchRevenue
+     FROM Sales
+   ) AS SourceTable
+   UNPIVOT (
+     Revenue FOR Month IN (JanuaryRevenue, FebruaryRevenue, MarchRevenue)
+   ) AS UnpivotTable;
+
+   1,Product A ,JanuaryRevenue,100.00
+   2,Product B ,JanuaryRevenue,150.00
+   3,Product C ,JanuaryRevenue,200.00
+   1,Product A ,FebruaryRevenue,120.00
+   2,Product B ,FebruaryRevenue,180.00
+   3,Product C ,FebruaryRevenue,220.00
+   1,Product A ,MarchRevenue,150.00
+   2,Product B ,MarchRevenue,200.00
+   3,Product C ,MarchRevenue,250.00
+   9 rows
diff --git a/reference/sql/sql_syntax/scalar_expressions.rst b/reference/sql/sql_syntax/scalar_expressions.rst
index 21cb38ab2..dc86debf6 100644
--- a/reference/sql/sql_syntax/scalar_expressions.rst
+++ b/reference/sql/sql_syntax/scalar_expressions.rst
@@ -1,3 +1,5 @@
+:orphan:
+
 .. _scalar_expressions:

 ***************************
diff --git a/reference/sql/sql_syntax/sqream_scripting.rst b/reference/sql/sql_syntax/sqream_scripting.rst
new file mode 100644
index 000000000..7a14e9aa6
--- /dev/null
+++ b/reference/sql/sql_syntax/sqream_scripting.rst
@@ -0,0 +1,142 @@
+:orphan:
+
+.. _sqream_scripting:
+
+****************
+SQream Scripting
+****************
+
+The JavaScript-based SQreamDB scripting enhances your interaction with SQL by providing conventions that allow dynamic generation, management, and automation of SQL code and database operations.
+
+Syntax
+======
+
+.. code-block:: psql
+
+   -- Double curly brackets
+
+   {{ ... }}
+
+   -- Parallel:
+
+   @@ Parallel $$ ... $$
+
+   -- Declare:
+
+   @@ Declare '<variable_name>' = <value>
+
+   -- SetResults:
+
+   @@ SetResults <variable_name> <query>
+
+   -- SplitQueryByDateTime
+
+   @@ SplitQueryByDateTime instances = <number_of_instances>, from = <datetime_value>, to = <datetime_value>
+
+   -- SplitQueryByDate
+
+   @@ SplitQueryByDate instances = <number_of_instances>, from = <date_value>, to = <date_value>
+
+   -- SplitQueryByNumber
+
+   @@ SplitQueryByNumber instances = <number_of_instances>, from = <numeric_value>, to = <numeric_value>
+
+   -- ${ ... }
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Parameter
+     - Description
+   * - ``{{ … }}``
+     - Double brackets can contain JavaScript code to be executed through the Editor
+   * - ``Parallel``
+     - Runs specified queries in parallel
+   * - ``Declare``
+     - Declares a variable value
+   * - ``SetResults``
+     - Saves specified query results as a variable
+   * - ``SplitQueryByDateTime``
+     - Splits query execution by a predefined number of instances and by specific ``DATETIME`` column values
+   * - ``SplitQueryByDate``
+     - Splits query execution by a predefined number of instances and by specific ``DATE`` column values
+   * - ``SplitQueryByNumber``
+     - Splits query execution by a predefined number of instances and by specific ``NUMERIC`` column values
+   * - ``${ ... }``
+     - References a declared variable, or a value generated by a split directive, within a query
+
+Usage Notes
+===========
+
+.. glossary::
+
+   **Execution**
+      Metalanguage scripting is available only through the SQreamDB web interface and cannot be used via the CLI.
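+
+For instance, the split directives expose the boundaries of each execution slice to the query through ``${from}`` and ``${to}``. The following sketch assumes a table ``my_table`` with a numeric column ``col1``; the exact sub-range boundaries generated per instance may vary:
+
+.. code-block:: psql
+
+   @@ SplitQueryByNumber instances = 4, from = 0, to = 100
+   SELECT COUNT(*) FROM my_table WHERE col1 >= ${from} AND col1 < ${to};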
+
+Examples
+========
+
+Double Curly Brackets
+---------------------
+
+.. code-block:: psql
+
+   {{
+     return 1;
+   }}
+
+``Parallel``
+------------
+
+.. code-block:: psql
+
+   @@ Parallel
+   $$
+   SELECT * FROM my_table;
+   SELECT * FROM our_table;
+   SELECT * FROM that_table;
+   $$;
+
+``Declare``
+-----------
+
+.. code-block:: psql
+
+   @@ Declare myVar = 3;
+   SELECT '${myVar}';
+
+``SetResults``
+--------------
+
+.. code-block:: psql
+
+   @@ SetResults tableAverage
+   SELECT AVG(col1) AS avg_salary FROM my_table;
+
+   SELECT col1 FROM my_table WHERE col1 > ${tableAverage[0].avg_salary};
+
+
+``SplitQueryByDateTime``
+------------------------
+
+.. code-block:: psql
+
+   @@ SplitQueryByDateTime instances = 4, from = '2021-01-01 00:00:00', to = '2022-01-01 00:00:00'
+   SELECT ${from}, ${to};
+
+
+``SplitQueryByDate``
+--------------------
+
+.. code-block:: psql
+
+   @@ SplitQueryByDate instances = 4, from = '2021-01-01', to = '2022-01-01'
+   SELECT ${from}, ${to};
+
+
+``SplitQueryByNumber``
+----------------------
+
+.. code-block:: psql
+
+   @@ SplitQueryByNumber instances = 4, from = 0, to = 100
+   SELECT ${from}, ${to};
diff --git a/reference/sql/sql_syntax/subqueries.rst b/reference/sql/sql_syntax/subqueries.rst
index 4cd995977..7325b0924 100644
--- a/reference/sql/sql_syntax/subqueries.rst
+++ b/reference/sql/sql_syntax/subqueries.rst
@@ -1,26 +1,24 @@
+:orphan:
+
 .. _subqueries:

-***************************
+**********
 Subqueries
-***************************
+**********

-Subqueries allows you to reuse of results from another query.
+Subqueries enable the reuse of results of other queries.

-SQream DB supports relational (also called *derived table*) subqueries, which appear as :ref:`select` queries as part of a table expression.
+SQreamDB supports relational (also called *derived table*) subqueries, which appear as :ref:`select` queries as part of a table expression.

-SQream DB also supports :ref:`common_table_expressions`, which are a form of subquery. With CTEs, a subquery can be named for reuse in a query.
+SQreamDB also supports :ref:`common_table_expressions`, which are a form of subquery. With CTEs, a subquery can be named for reuse in a query.

 .. note::
-   * SQream DB does not currently support correlated subqueries or scalar subqueries.
-   * There is no limit to the number of subqueries or nesting limits in a statement
-
-
-
+   You may include an unlimited number of subqueries within a single SQL statement, and you can also nest subqueries to an unlimited depth

 Table Subqueries
-===========================
+================

 The following is an example of a table named ``nba`` with the following structure:

 .. code-block:: postgres

 CREATE TABLE nba
 (
-  "Name" varchar(40),
-  "Team" varchar(40),
-  "Number" tinyint,
-  "Position" varchar(2),
-  "Age" tinyint,
-  "Height" varchar(4),
-  "Weight" real,
-  "College" varchar(40),
-  "Salary" float
+  "Name" TEXT,
+  "Team" TEXT,
+  "Number" TINYINT,
+  "Position" TEXT,
+  "Age" TINYINT,
+  "Height" TEXT,
+  "Weight" REAL,
+  "College" TEXT,
+  "Salary" FLOAT
 );

@@ -48,29 +46,57 @@ To see the table contents, click :download:`Download nba.csv

-   SELECT AVG("Age") FROM
-   .      (SELECT "Name","Team","Age" FROM nba WHERE "Height" > '7-0');
-   avg
-   ---
-   26
+   SELECT
+     AVG("Age")
+   FROM
+     (
+       SELECT
+         "Name",
+         "Team",
+         "Age"
+       FROM
+         nba
+       WHERE
+         "Height" > '7-0'
+     );
+
+.. code-block:: none
+
+   avg
+   ---
+   26

 Combining a Subquery with a Join
-----------------------------------
-
-.. code-block:: psql
+--------------------------------
+
+.. code-block:: sql
+
+   SELECT
+     *
+   FROM
+     (
+       SELECT
+         "Name"
+       FROM
+         nba
+       WHERE
+         "Height" > '7-0'
+     ) AS t(name),
+     nba AS n
+   WHERE
+     n."Name" = t.name;
+
+.. code-block:: none

-   t=> SELECT * FROM
-   .      (SELECT "Name" FROM nba WHERE "Height" > '7-0') AS t(name)
-   .      , nba AS n
-   .      WHERE n."Name"=t.name;
   name               | Name               | Team                   | Number | Position | Age | Height | Weight | College    | Salary
   -------------------+--------------------+------------------------+--------+----------+-----+--------+--------+------------+---------
   Alex Len           | Alex Len           | Phoenix Suns           | 21     | C        | 22  | 7-1    | 260    | Maryland   | 3807120
@@ -89,17 +115,76 @@ Combining a Subquery with a Join
   Walter Tavares     | Walter Tavares     | Atlanta Hawks          | 22     | C        | 24  | 7-3    | 260    | \N         | 1000000

 ``WITH`` subqueries
----------------------
+-------------------

 See :ref:`common_table_expressions` for more information.

-.. code-block:: psql
+.. code-block:: sql
+
+   WITH nba_ct AS (
+     SELECT
+       "Name",
+       "Team"
+     FROM
+       nba
+     WHERE
+       "College" = 'Connecticut'
+   ),
+   nba_az AS (
+     SELECT
+       "Name",
+       "Team"
+     FROM
+       nba
+     WHERE
+       "College" = 'Arizona'
+   )
+   SELECT
+     *
+   FROM
+     nba_az
+     JOIN nba_ct ON nba_ct."Team" = nba_az."Team";
+
+.. code-block:: none
+
+   Name            | Team            | name0          | team0
+   ----------------+-----------------+----------------+----------------
+   Stanley Johnson | Detroit Pistons | Andre Drummond | Detroit Pistons
+   Aaron Gordon    | Orlando Magic   | Shabazz Napier | Orlando Magic
+
+Correlated Subqueries
+=====================

-   nba=> WITH
-   .   nba_ct AS (SELECT "Name", "Team" FROM nba WHERE "College"='Connecticut'),
-   .   nba_az AS (SELECT "Name", "Team" FROM nba WHERE "College"='Arizona')
-   .   SELECT * FROM nba_az JOIN nba_ct ON nba_ct."Team" = nba_az."Team";
-   Name            | Team            | name0          | team0
-   ----------------+-----------------+----------------+----------------
-   Stanley Johnson | Detroit Pistons | Andre Drummond | Detroit Pistons
-   Aaron Gordon    | Orlando Magic   | Shabazz Napier | Orlando Magic
\ No newline at end of file
+Correlated subqueries are currently not supported. However, you may use the following workaround:
+
+.. code-block:: sql
+
+   -- Unsupported correlated subquery
+
+   SELECT
+     x,
+     y,
+     z
+   FROM
+     t
+   WHERE
+     x in (
+       SELECT
+         x
+       FROM
+         t1
+     );
+
+   -- Correlated subquery workaround
+   SELECT
+     x,
+     y,
+     z
+   FROM
+     t
+   JOIN (
+     SELECT
+       x
+     FROM
+       t1
+   ) t1 ON t.x = t1.x;
\ No newline at end of file
diff --git a/reference/sql/sql_syntax/window_functions.rst b/reference/sql/sql_syntax/window_functions.rst
index cb87e085e..e2f66e829 100644
--- a/reference/sql/sql_syntax/window_functions.rst
+++ b/reference/sql/sql_syntax/window_functions.rst
@@ -1,3 +1,5 @@
+:orphan:
+
 .. _window_functions:

 ********************
@@ -152,6 +154,7 @@ Window frame functions allow a user to perform rolling operations, such as calc

 ``PARTITION BY``
------------------
+
 The ``PARTITION BY`` clause groups the rows of the query into partitions, which are processed separately by the window function. ``PARTITION BY`` works similarly to a query-level ``GROUP BY`` clause, but expressions are always just expressions and cannot be output-column names or numbers.

@@ -161,17 +164,13 @@ Without ``PARTITION BY``, all rows produced by the query are treated as a single

 ``ORDER BY``
----------------------
-The ``ORDER BY`` clause determines the order in which the rows of a partition are processed by the window function. It works similarly to a query-level ``ORDER BY`` clause, but cannot use output-column names or numbers.
+The ``ORDER BY`` clause determines the order in which the rows of a partition are processed by the window function. It works similarly to a query-level ``ORDER BY`` clause, but cannot use output-column names or indexes.

 Without ``ORDER BY``, rows are processed in an unspecified order.

 Frames
 ------
-
-
-.. note:: Frames and frame exclusions have been tested extensively, but are a complex feature. They are released as a preview in v2020.1 pending longer-term testing.
-
 The ``frame_clause`` specifies the set of rows constituting the window frame, which is a subset of the current partition, for those window functions that act on the frame instead of the whole partition. The set of rows in the frame can vary depending on which row is the current row. The frame can be specified in ``RANGE`` or ``ROWS`` mode; in each case, it runs from the ``frame_start`` to the ``frame_end``. If ``frame_end`` is omitted, the end defaults to ``CURRENT ROW``.

@@ -207,10 +206,9 @@ Frame Exclusion

 The ``frame_exclusion`` option allows rows around the current row to be excluded from the frame, even if they would be included according to the frame start and frame end options. ``EXCLUDE CURRENT ROW`` excludes the current row from the frame. ``EXCLUDE GROUP`` excludes the current row and its ordering peers from the frame. ``EXCLUDE TIES`` excludes any peers of the current row from the frame, but not the current row itself. ``EXCLUDE NO OTHERS`` simply specifies explicitly the default behavior of not excluding the current row or its peers.

 Limitations
-==================
+=================

-Window functions do not support the Numeric data type.
-
+Window functions do not support the Numeric data type.

 Examples
 ==========

@@ -221,14 +219,14 @@ For these examples, assume a table named ``nba``, with the following structure:

 .. code-block:: postgres

 CREATE TABLE nba
 (
-  "Name" varchar(40),
-  "Team" varchar(40),
+  "Name" text(40),
+  "Team" text(40),
   "Number" tinyint,
-  "Position" varchar(2),
+  "Position" text(2),
   "Age" tinyint,
-  "Height" varchar(4),
+  "Height" text(4),
   "Weight" real,
-  "College" varchar(40),
+  "College" text(40),
   "Salary" float
 );

@@ -332,3 +330,35 @@ This example calculates the salary between two players, starting from the highes

   Dwyane Wade     | 20000000 | 19689000 | 311000
   Brook Lopez     | 19689000 | 19689000 | 0
   DeAndre Jordan  | 19689000 | 19689000 | 0
+
+Window Function Alias
+=====================
+
+A window function alias lets you define a window once with the ``WINDOW`` clause and reference it by name within the window function definition. This eliminates the need to repeatedly input the same SQL code in queries that use multiple window functions with identical definitions.
+
+.. code-block:: psql
+
+   t=> SELECT SUM("Salary") OVER w
+       FROM nba
+       WINDOW w AS (PARTITION BY "Team" ORDER BY "Age");
+   sum
+   ---------
+   1763400
+   5540289
+   5540289
+   5540289
+   5540289
+   7540289
+   18873622
+   18873622
+   30873622
+   60301531
+   60301531
+   60301531
+   64301531
+   72902950
+   72902950
+   [...]
\ No newline at end of file
diff --git a/reference/sql_feature_support.rst b/reference/sql_feature_support.rst
index ba4ca39e4..75dff3614 100644
--- a/reference/sql_feature_support.rst
+++ b/reference/sql_feature_support.rst
@@ -1,20 +1,17 @@
 .. _sql_feature_support:

-*************************
+*********************
 SQL Feature Checklist
-*************************
+*********************

-
-To understand which ANSI SQL and other SQL features SQream DB supports, use the tables below.
+To understand which ANSI SQL and other SQL features SQreamDB supports, use the tables below.
.. contents:: In this topic:
   :local:

 Data Types and Values
-=========================
-
-Read more about :ref:`supported data types`.
+=====================

 .. list-table:: Data Types and Values
    :widths: auto
    :header-rows: 1

    * - Item
      - Supported
      - Further information
    * - ``BOOL``
-     - ✓
+     - Yes
      - Boolean values
    * - ``TINYINT``
-     - ✓
+     - Yes
      - Unsigned 1 byte integer (0 - 255)
    * - ``SMALLINT``
-     - ✓
+     - Yes
      - 2 byte integer (-32,768 - 32,767)
    * - ``INT``
-     - ✓
+     - Yes
      - 4 byte integer (-2,147,483,648 - 2,147,483,647)
    * - ``BIGINT``
-     - ✓
+     - Yes
      - 8 byte integer (-9,223,372,036,854,775,808 - 9,223,372,036,854,775,807)
    * - ``REAL``
-     - ✓
+     - Yes
      - 4 byte floating point
    * - ``DOUBLE``, ``FLOAT``
-     - ✓
+     - Yes
      - 8 byte floating point
    * - ``DECIMAL``, ``NUMERIC``
-     - ✓
+     - Yes
      - Fixed-point numbers.
-   * - ``VARCHAR``
-     - ✓
-     - Variable length string - ASCII only
    * - ``TEXT``
-     - ✓
+     - Yes
      - Variable length string - UTF-8 encoded
    * - ``DATE``
-     - ✓
+     - Yes
      - Date
    * - ``DATETIME``, ``TIMESTAMP``
-     - ✓
+     - Yes
      - Date and time
    * - ``NULL``
-     - ✓
+     - Yes
      - ``NULL`` values
    * - ``TIME``
-     - ✗
+     - No
      - Can be stored as a text string or as part of a ``DATETIME``

-Contraints
-===============
+Constraints
+===========

-.. list-table:: Contraints
+.. list-table:: Constraints
    :widths: auto
    :header-rows: 1

    * - Item
      - Supported
      - Further information
-   * - Not null
-     - ✓
+   * - ``Not null``
+     - Yes
      - ``NOT NULL``
-   * - Default values
-     - ✓
+   * - ``Default values``
+     - Yes
      - ``DEFAULT``
    * - ``AUTO INCREMENT``
-     - ✓ Different name
+     - Yes (different name)
      - ``IDENTITY``

-Transactions
-================
+.. _transactions:

-SQream DB treats each statement as an auto-commit transaction. Each transaction is isolated from other transactions with serializable isolation.
+Transactions
+============

-If a statement fails, the entire transaction is cancelled and rolled back. The database is unchanged.
+SQreamDB treats each statement as an auto-commit transaction. Each transaction is isolated from other transactions with serializable isolation.

-Read more about :ref:`transactions in SQream DB`.
+If a statement fails, the entire transaction is canceled and rolled back. The database is unchanged.

 Indexes
-============
+=======

-SQream DB has a range-index collected on all columns as part of the metadata collection process.
+SQreamDB has a range-index collected on all columns as part of the metadata collection process.

-SQream DB does not support explicit indexing, but does support clustering keys.
+SQreamDB does not support explicit indexing, but does support clustering keys.

-Read more about :ref:`clustering keys` and our :ref:`metadata system`.
+Read more about :ref:`clustering keys`.
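+
+For example (a minimal sketch; the table and column names are illustrative, and the clustering keys page has the full syntax), a clustering key can be declared when a table is created:
+
+.. code-block:: postgres
+
+   CREATE TABLE fact_sales (
+     sale_date DATE,
+     amount BIGINT
+   )
+   CLUSTER BY sale_date;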
Schema Changes ================ @@ -118,43 +112,43 @@ Schema Changes - Supported - Further information * - ``ALTER TABLE`` - - ✓ + - Yes - :ref:`alter_table` - Add column, alter column, drop column, rename column, rename table, modify clustering keys * - Rename database - - ✗ + - No - * - Rename table - - ✓ + - Yes - :ref:`rename_table` * - Rename column - - ✓ + - Yes - :ref:`rename_column` * - Add column - - ✓ + - Yes - :ref:`add_column` * - Remove column - - ✓ + - Yes - :ref:`drop_column` * - Alter column data type - - ✗ + - No - * - Add / modify clustering keys - - ✓ + - Yes - :ref:`cluster_by` * - Drop clustering keys - - ✓ + - Yes - :ref:`drop_clustering_key` * - Add / Remove constraints - - ✗ + - No - * - Rename schema - - ✗ - - + - Yes + - :ref:`rename_schema` * - Drop schema - - ✓ + - Yes - :ref:`drop_schema` * - Alter default schema per user - - ✓ + - Yes - :ref:`alter_default_schema` @@ -169,28 +163,28 @@ Statements - Supported - Further information * - SELECT - - ✓ + - Yes - :ref:`select` * - CREATE TABLE - - ✓ + - Yes - :ref:`create_table` * - CREATE FOREIGN / EXTERNAL TABLE - - ✓ + - Yes - :ref:`create_foreign_table` * - DELETE - - ✓ + - Yes - :ref:`delete_guide` * - INSERT - - ✓ + - Yes - :ref:`insert`, :ref:`copy_from` * - TRUNCATE - - ✓ + - Yes - :ref:`truncate` * - UPDATE - - ✗ + - Yes - * - VALUES - - ✓ + - Yes - :ref:`values` Clauses @@ -204,19 +198,19 @@ Clauses - Supported - Further information * - ``LIMIT`` / ``TOP`` - - ✓ + - Yes - * - ``LIMIT`` with ``OFFSET`` - - ✗ + - No - * - ``WHERE`` - - ✓ + - Yes - * - ``HAVING`` - - ✓ + - Yes - * - ``OVER`` - - ✓ + - Yes - Table Expressions @@ -230,19 +224,19 @@ Table Expressions - Supported - Further information * - Tables, Views - - ✓ + - Yes - * - Aliases, ``AS`` - - ✓ + - Yes - * - ``JOIN`` - ``INNER``, ``LEFT [ OUTER ]``, ``RIGHT [ OUTER ]``, ``CROSS`` - - ✓ + - Yes - * - Table expression subqueries - - ✓ + - Yes - * - Scalar subqueries - - ✗ + - No - @@ -259,34 +253,34 @@ Read more about :ref:`scalar_expressions`. - Supported - Further information * - Common functions - - ✓ + - Yes - ``CURRENT_TIMESTAMP``, ``SUBSTRING``, ``TRIM``, ``EXTRACT``, etc. * - Comparison operators - - ✓ + - Yes - ``<``, ``<=``, ``>``, ``>=``, ``=``, ``<>, !=``, ``IS``, ``IS NOT`` * - Boolean operators - - ✓ + - Yes - ``AND``, ``NOT``, ``OR`` * - Conditional expressions - - ✓ + - Yes - ``CASE .. WHEN`` * - Conditional functions - - ✓ + - Yes - ``COALESCE`` * - Pattern matching - - ✓ + - Yes - ``LIKE``, ``RLIKE``, ``ISPREFIXOF``, ``CHARINDEX``, ``PATINDEX`` * - REGEX POSIX pattern matching - - ✓ + - Yes - ``RLIKE``, ``REGEXP_COUNT``, ``REGEXP_INSTR``, ``REGEXP_SUBSTR``, * - ``EXISTS`` - - ✗ + - No - * - ``IN``, ``NOT IN`` - Partial - Literal values only * - Bitwise arithmetic - - ✓ + - Yes - ``&``, ``|``, ``XOR``, ``~``, ``>>``, ``<<`` @@ -294,7 +288,7 @@ Read more about :ref:`scalar_expressions`. Permissions =============== -Read more about :ref:`access_control` in SQream DB. +Read more about :ref:`access_control` in SQreamDB. .. list-table:: Permissions :widths: auto @@ -304,16 +298,16 @@ Read more about :ref:`access_control` in SQream DB. 
- Supported - Further information * - Roles as users and groups - - ✓ + - Yes - * - Object default permissions - - ✓ + - Yes - * - Column / Row based permissions - - ✗ + - No - * - Object ownership - - ✗ + - No - @@ -329,20 +323,20 @@ Extra Functionality - Supported - Further information * - Information schema - - ✓ + - Yes - :ref:`catalog_reference` * - Views - - ✓ + - Yes - :ref:`create_view` * - Window functions - - ✓ + - Yes - :ref:`window_functions` * - CTEs - - ✓ + - Yes - :ref:`common_table_expressions` * - Saved queries, Saved queries with parameters - - ✓ + - Yes - :ref:`saved_queries` * - Sequences - - ✓ + - Yes - :ref:`identity` diff --git a/releases/2019.2.1.rst b/releases/2019.2.1.rst deleted file mode 100644 index c9c96b59b..000000000 --- a/releases/2019.2.1.rst +++ /dev/null @@ -1,94 +0,0 @@ -.. _2019.2.1: - -****************************** -Release Notes 2019.2.1 -****************************** - -* 250 bugs fixed. Thanks to all of our customers and an unprecedented number of deployments for helping us find and fix these! -* Improved Unicode text handling on the GPU -* Improved logging and monitoring of statements -* Alibaba DataX connector - - -Improvements -===================== - -* We’ve updated the ``show_server_status()`` function to more accurately reflect the status of statements across the cluster: - - * Preparing – Initial validation - * In queue – Waiting for execution - * Initializing – Pre-execution processing - * Executing – statement is running - -* We’ve improved our log files and have unified them into a single file per worker, per date. Each message type has a unique code which can help identify potential issues. See the documentation for full details on the changes to the log structures. - -* ``WITH ADMIN OPTION`` added in ``GRANT``/``REVOKE`` operations, allowing roles to grant their own permissions to others. - -* HA cluster fully supports qualified hostnames, and no longer requires explicit IP addresses. - -* SQream DB CLI’s history can be disabled, by passing ``./ClientCmd --no-history`` - - -Behaviour Changes -===================== - -* SQream DB no longer applies an implicit cast from a long text column to a shorter text column (``VARCHAR``/``TEXT``). This means some ``INSERT``/``COPY`` operations will now error instead of truncating the text. This is intended to prevent accidental truncation of text columns. If you want the old truncation behaviour, you can use the ``SUBSTRING`` function to truncate the text. - - -Operations -===================== - -* The client-server protocol has been updated to support a wider range of encodings. End users are required to use only the latest ClientCmd, JDBC, and ODBC drivers delivered with this version. 
- -* Clients such as SecureCRT and other shells must have locale set as ``cp874`` or equivalent - -* When upgrading from SQream DB v3.2 or lower, the storage version must be upgraded using the :ref:`upgrade_storage_cli_reference` utility: ``./bin/upgrade_storage /path/to/storage/sqreamdb/`` - - -Known Issues and Limitations -=================================== - -* TEXT columns cannot be used as a ``GROUP BY`` key when there are multiple ``COUNT (DISTINCT …)`` operations in a query - -* TEXT columns cannot be used in a statement containing window functions - -* TEXT is not supported as a join key - -* The following functions are not supported on ``TEXT`` column types: ``chr``, ``min``, ``max``, ``patindex``, ``to_binary``, ``to_hex``, ``rlike``, ``regexp_count``, ``regexp_instr``, ``regexp_substr`` - -* SQream Dashboard: Only works with a HA clustered installation - -* SQream Editor: External tables and UDFs don’t appear in the DB Tree but do appear in the relevant sqream_catalog entries. - - -Fixes -===================== - -250 bugs and issues fixed, including: - -* Variety of performance improvements: - -* Improved performance of ``TEXT`` by up to 315% for a variety of scenarios, including ``COPY FROM``, ``INNER JOIN``, ``LEFT JOIN``. - -* Improved load performance from previous versions - -* Faster compilation times for very complex queries - -* DWLM: - - * Fixed situation where queries were not distributed correctly among all available workers - * Fixed ``cannot execute - reconnectDb error`` error - * Fixed occasional hanging statement - * Fixed occasional ``Connection refused`` - -* Window functions: - - * Fixed window function edge-case ``error WindowA with no functions`` - * Fixed situations where the SUM window function is applied on a column, partitioned by a second, and sorted by a third would return wrong results when scanning very large datasets - -* Other bugs: - - * Fixed situation where many concurrent statements running would result in ``map::at`` appearing - * Fixed situation where SQream DB would restart when force-stopping an ``INSERT`` over the network - * Fixed situation where RAM wasn’t released immediately after statement has been executed - * Fixed Type doesn’t have a fixed size error that appeared when using an external table joined with a standard SQream DB table diff --git a/releases/2020.1.rst b/releases/2020.1.rst deleted file mode 100644 index e4928855e..000000000 --- a/releases/2020.1.rst +++ /dev/null @@ -1,188 +0,0 @@ -.. _2020.1: - -************************** -Release Notes 2020.1 -************************** - -SQream DB v2020.1 contains lots of new features, improved performance, and bug fixes. - -This is the first release of 2020, with a strong focus on integration into existing environments. The release includes connectivity to Hadoop and other legacy data warehouse ecosystems. We’re also bringing lots of new capabilities to our analytics engine, to empower data users to analyze more data with less friction. - -The latest release vastly improves reliability and performance, and makes getting more data into SQream DB easier than ever. - -The core of SQream DB v2020.1 contains new integration features, more analytics capabilities, and better drivers and connectors. - - -New features -================ - -Integrations ------------------ - -* Load files directly from :ref:`S3 buckets`. Customers with columnar data in S3 data lakes can now access the data directly. 
All that is needed is to simply point an external table to an S3 bucket with Parquet, ORC, or CSV objects. This feature is available on all deployments of SQream DB – in the cloud and on-prem. - -* Load files directly from :ref:`HDFS`. SQream DB now comes with built-in, native HDFS support for directly loading data from Hadoop-based data lakes. Our focus on helping Hadoop customers do more with their data led us to develop this feature, which works out of the box. As a result, SQream DB can now not only read but also write data, and intermediate results back to HDFS for HIVE and other data consumers. SQream DB now fits seamlessly into a Hadoop data pipeline. - - -* Import :ref:`ORC files`, through :ref:`external_tables`. ORC files join Parquet as files that can be natively accessed and inserted into SQream DB tables. - -* :ref:`Python driver (pysqream)` is now DB-API v2.0 compliant. Customers can write high-performance Python applications that make full use of SQream DB - connect, query, delete, and insert data. Data scientists can use pysqream with Pandas, Numpy, and AI/ML frameworks like TensorFlow for direct queries of huge datasets. - -* Certified :ref:`Tableau JDBC connector (taco)`, now also :ref:`supported on MacOS`. Users are encouraged to install the new JDBC connector. - -* - All logs are now unified into one log, which can be analyzed with SQream DB directly. - See :ref:`logging` for more information. - - -SQL support ---------------- - -* - Added frames and frame exclusions to :ref:`window_functions`. This is available for preview, with more features coming in the next version. - - The new frames and frame exclusionsfeature adds complex analytics capabilities to the already powerful window functions. - -* - New datatype - ``TEXT``, which replaces ``NVARCHAR`` directly with UTF-8 support and improved performance. - - Unlike ``VARCHAR``, the new ``TEXT`` data type has no restrictions on size, and carries no performance overhead as the text sizes grow. - -* ``TEXT`` join keys are now supported - -* Added lots of new :ref:`aggregate functions`, including ``VAR_SAMP``, ``VAR_POP``, ``COVAR_POP``, etc. - - -Improvements and fixes -======================== - -SQream DB v2020.1 includes hundreds of small new features and tunable parameters that improve performance, reliability, and stability. Existing SQream DB users can expect to see a general speedup of around 10% on most statements and queries! 
- -* 207 bug fixes, including: - - - Improved performance of both inner and outer joins - - Fixed wrong results on STDDEV (0 instead of ``NULL``) - - Fixed wrong results on nested Parquet files - - Fixed failing cast from ``VARCHAR`` to ``FLOAT`` - - Fix ``INSERT`` that would fail on nullable values and non-nullable columns in some scenarios - - Improved memory consumption, so ``Out of GPU memory`` errors should not occur anymore - - Reduced long compilation times for very complex queries - - Improved ODBC reliability - - Fixed situation where some logs would clip very long queries - - Improved error messages when dropping a schema with many objects - - Fixed situation where Spotfire would not show table names - - Fixed situation where some queries with UTF-8 literals wouldn't run through Tableau over ODBC - - Significantly improved cache freeing and memory allocation - - Fixed situation in which a malformed time (``24:00:00``) would get incorrectly inserted from a CSV - - Fixed race condition in which loading thousands of small files from HDFScaused a memory leak - -* The :ref:`saved query` feature can now be used with :ref:`insert` statements - -* Faster "Deferred gather" algorithm for joins with text keys - -* Faster filtering when using :ref:`datepart` - -* Faster metadata tagging during load - -* Fixed situation where some queries would get compiled twice - -* :ref:`saved_queries` now support :ref:`insert` statements - -* ``highCardinalityColumns`` can be configured to tell the system about :ref:`high selectivity` columns - -* :ref:`sqream sql` starts up faster, can run on any Linux machine - -* Additional CSV date formats (date parsers) added for compatibility - -Behaviour changes -======================== - -* ``ClientCmd`` is now known as :ref:`sqream sql` - -* ``NVARCHAR`` columns are now known as ``TEXT`` internally - -* - Deprecated the ability to run ``SELECT`` and ``COPY`` at the same time on the same worker. This change is designed to protect against ``out of GPU memory`` issues. - This comes with a configuration change, namely the ``limitQueryMemoryGB`` setting. See the operations section for more information. - -* All logs are now unified into one log. See :ref:`logging` for more information - -* Compression changes: - - - The latest version of SQream DB could select a different compression scheme if data is reloaded, compared to previous versions of SQream DB. This internal change improves performance. - - - With ``LZ4`` compression, the maximum chunk size is limited to 2.1GB. If the chunk size is bigger, another compression may be selected - primarily ``SNAPPY``. - -* The following configuration flags have been deprecated: - - - ``addStatementRechunkerAfterGpuToHost`` - - ``increasedChunkSizeFactor`` - - ``gpuReduceMergeOutputFactor`` - - ``fullSortInputMemFactor`` - - ``reduceInputMemFactor`` - - ``distinctInputMemFactor`` - - ``useAutoMemFactors`` - - ``autoMemFactorsVramFactor`` - - ``catchNotEnoughVram`` - - ``useNetworkRechunker`` - - ``useMemFactorInJoinOutput`` - -Operations -======================== - -* The client-server protocol has been updated to support faster data flow, and more reliable memory allocations on the client side. End users are required to use only the latest :ref:`sqream sql`, :ref:`java_jdbc`, and :ref:`odbc` drivers delivered with this version. See the :ref:`client driver download page` for the latest drivers and connectors. 
- -* When upgrading from a previous version of SQream DB (for example, v2019.2), the storage version must be upgraded using the :ref:`upgrade_storage_cli_reference` utility: ``./bin/upgrade_storage /path/to/storage/sqreamdb/`` - -* - A change in memory allocation behaviour in this version sees the introduction of a new setting, ``limitQueryMemoryGB``. This is an addition to the previous ``spoolMemoryGB`` setting. - - A good rule-of-thumb is to allow 5% system memory for other processes. The spool memory allocation should be around 90% of the total memory allocated. - - - ``limitQueryMemoryGB`` defines how much total system memory is used by the worker. The recommended setting is (``total host memory`` - 5%) / ``sqreamd workers on host``. - - - ``spoolMemoryGB`` defines how much memory is set aside for spooling, out of the total system memory allocated in ``limitQueryMemoryGB``. The recommended setting is 90% of the ``limitQueryMemoryGB``. - - This setting must be set lower than the ``limitQueryMemoryGB`` setting. - - For example, for a machine with 512GB of RAM and 4 workers, the recommended settings are: - - - ``limitQueryMemoryGB`` - ``⌊(512 * 0.95 / 4)⌋ → ~ 486 / 4 → 121``. - - - ``spoolMemoryGB`` - ``⌊( 0.9 * limitQueryMemoryGB )⌋ → ⌊( 0.9 * 121 )⌋ → 108`` - - Example settings per-worker, for 512GB of RAM and 4 workers: - - .. code-block:: none - - "runtimeGlobalFlags": { - "limitQueryMemoryGB" : 121, - "spoolMemoryGB" : 108 - - - - -Known Issues & Limitations -================================ - -* An invalid formatted CSV can cause an ``insufficient memory`` error on a :ref:`copy_from` statement if a quote isn't closed and the file is much larger than system memory. - -* ``TEXT`` columns cannot be used in a window functions' partition - -* Parsing errors are sometimes hard to read - the location points to the wrong part of the statement - -* LZ4 compression may not be applied correctly on very large ``VARCHAR`` columns, which decreases performance - -* Using ``SUM`` on very large numbers in window functions can error (``overflow``) when not used with an ``ORDER BY`` clause - -* Slight performance decrease with :ref:`dateadd` in this version (<4%) - -* Operations on Snappy-compressed ORC files are slower than their Parquet equivalents. - - -Upgrading to v2020.1 -======================== - -Versions are available for IBM POWER9, RedHat (CentOS) 7, Ubuntu 18.04, and other OSs via Docker. - -Contact your account manager to get the latest release of SQream DB. diff --git a/releases/2020.2.rst b/releases/2020.2.rst deleted file mode 100644 index 3dc25b78a..000000000 --- a/releases/2020.2.rst +++ /dev/null @@ -1,115 +0,0 @@ -.. _2020.2: - -************************** -Release Notes 2020.2 -************************** - -SQream v2020.2 contains some new features, improved performance, and bug fixes. - -This version has new window ranking function and a new editor UI to empower data users to analyze more data with less friction. - -As always, the latest release improves reliability and performance, and makes getting more data into SQream easier than ever. - - -New Features -================ - -UI ----------- - -* New :ref:`sqream_studio` replaces the previous Statement Editor. - -Integrations ------------------ - -* Our :ref:`Python driver (pysqream)` now has an SQLAlchemy dialect. Customers can write high-performance Python applications that make full use of SQream - connect, query, delete, and insert data. 
Data scientists can use pysqream with Pandas, Numpy, and AI/ML frameworks like TensorFlow for direct queries of huge datasets. - -SQL Support ---------------- - -* Added :ref:`lag`/:ref:`lead` ranking functions to our :ref:`window_functions` support. We will have more features coming in the next version. - -* - New syntax preview for :ref:`external_tables`. Foreign tables replace external tables, with improved functionality. - - You can keep using the existing foreign table syntax for now, but it may be deprecated in the future. - - .. code-block:: postgres - - CREATE FOREIGN TABLE orc_example - ( - name varchar(40), - Age tinyint, - Salary float - ) - WRAPPER orc_fdw - OPTIONS - ( LOCATION = 'hdfs://hadoop-nn.piedpiper.com:8020/demo-data/example.orc' ); - - -Improvements and Fixes -======================== - -SQream v2020.2 includes hundreds of small new features and tunable parameters that improve performance, reliability, and stability. - -* ~100 bug fixes, including: - - - Fixed CSV handling for DOS newlines - - Fixed "out of bounds" message when several layers of nested ``substring``, ``cast``, and ``to_hex`` were used to produce one value. - - Fixed "Illegal memory access" that would occur in extremely rare situations on all-text tables - - Window functions can now be used with all aggregations - - Fixed situation where a single worker may use more than one GPU that isn't allocated to it - - Text columns can now be added to existing tables with :ref:`alter_table` - -* New :ref:`data_clustering` syntax that can improve query performance for unsorted data - - -Operations -======================== - -* When upgrading from a previous version of SQream (for example, v2019.2), the storage version must be upgraded using the :ref:`upgrade_storage_cli_reference` utility: ``./bin/upgrade_storage /path/to/storage/sqreamdb/`` - -* - A change in memory allocation behaviour in this version sees the introduction of a new setting, ``limitQueryMemoryGB``. This is an addition to the previous ``spoolMemoryGB`` setting. - - A good rule-of-thumb is to allow 5% system memory for other processes. The spool memory allocation should be around 90% of the total memory allocated. - - - ``limitQueryMemoryGB`` defines how much total system memory is used by the worker. The recommended setting is (``total host memory`` - 5%) / ``sqreamd workers on host``. - - - ``spoolMemoryGB`` defines how much memory is set aside for spooling, out of the total system memory allocated in ``limitQueryMemoryGB``. The recommended setting is 90% of the ``limitQueryMemoryGB``. - - This setting must be set lower than the ``limitQueryMemoryGB`` setting. - - For example, for a machine with 512GB of RAM and 4 workers, the recommended settings are: - - - ``limitQueryMemoryGB`` - ``⌊(512 * 0.95 / 4)⌋ → ~ 486 / 4 → 121``. - - - ``spoolMemoryGB`` - ``⌊( 0.9 * limitQueryMemoryGB )⌋ → ⌊( 0.9 * 121 )⌋ → 108`` - - Example settings per-worker, for 512GB of RAM and 4 workers: - - .. code-block:: none - - "runtimeFlags": { - "limitQueryMemoryGB" : 121, - "spoolMemoryGB" : 108 - - - - -Known Issues and Limitations -================================ - -* An invalid formatted CSV can cause an ``insufficient memory`` error on a :ref:`copy_from` statement if a quote isn't closed and the file is much larger than system memory. - -* Multiple ``COUNT( distinct ... )`` operations within the same query are limited to "developer mode" due to an instability that was identified. 
If you rely on this feature, contact your SQream account manager to enable this feature. - -* ``TEXT`` columns can't be used with an outer join together with an inequality check (``!= , <>``) - - -Upgrading to Version 2020.2 -======================== - -Versions are available for IBM POWER9, RedHat (CentOS) 7, Ubuntu 18.04, and other OSs via Docker. - -Contact your account manager to get the latest release of SQream. diff --git a/releases/2020.3.1.rst b/releases/2020.3.1.rst deleted file mode 100644 index 0667306d7..000000000 --- a/releases/2020.3.1.rst +++ /dev/null @@ -1,72 +0,0 @@ -.. _2020.3.1: - -************************** -Release Notes 2020.3.1 -************************** -The 2020.3.1 release notes were released on October 8, 2020 and describe the following: - -.. contents:: - :local: - :depth: 1 - - - -New Features -------------- -The following list describes the new features: - - -* TEXT data type: - * Full support for ``MIN`` and ``MAX`` aggregate functions on ``TEXT`` columns in ``GROUP BY`` queries. - * Support Text-type as window partition keys (e.g., select distinct name, max(id) over (partition by name) from ``textTable;``). - * Support Text-type fields in windows order by keys. - * Support join on ``TEXT`` columns (such as ``t1.x = t2.y`` where ``x`` and ``y`` are columns of type ``TEXT``). - * Complete the implementation of ``LIKE`` on ``TEXT`` columns (previously limited to prefix and suffix). - * Support for cast fromm ``TEXT`` to ``REAL/FLOAT``. - * New string function - ``REPEAT`` for repeating a string value for a specified number of times. - -* Support mapping ``DECIMAL ORC`` columns to SQream's floating-point types. - -* Support ``LIKE`` on non-literal patterns (such as columns and complex expressions). - -* Catch OS signals and save the signal along with the stack trace in the SQream debug log. - -* Support equijoin conditions on columns with different types (such as ``tinyint``, ``smallint``, ``int`` and ``bigint``). - -* ``DUMP_DATABASE_DDL`` now includes foreign tables in the output. - -* New utility function - ``TRUNCATE_IF_EXISTS``. - - -Performance Enhancements -------------- -The following list describes the performance enhancements: - - -* Introduced the "MetaData on Demand" feature which results in signicant proformance improvements. - -* Implemented regex functions (``RLIKE``, ``REGEXP_COUNT``, ``REGEXP_INSTR``, ``REGEXP_SUBSTR``, ``PATINDEX``) for ``TEXT`` columns on GPU. - - -Resolved Issues -------------- -The following list describes the resolved issues: - - -* Multiple distinct aggregates no longer need to be used with developerMode flag. -* In some scenarios, the ``statement_id`` and ``connection_id values`` are incorrectly recorded as ``-1`` in the log. -* ``NOT RLIKE`` is not supported for ``TEXT`` in the compiler. -* Casting from ``TEXT`` to ``date/datetime`` returns an error when the ``TEXT`` column contains ``NULL``. - - -Known Issues and Limitations -------------- -No known issues and limitations. - - -Upgrading to v2020.3.1 ----------------- - -Versions are available for IBM POWER9, RedHat (CentOS) 7, Ubuntu 18.04, and other OSs via Docker. - -Contact your account manager to get the latest release of SQream DB. \ No newline at end of file diff --git a/releases/2020.3.2.1.rst b/releases/2020.3.2.1.rst deleted file mode 100644 index 3c551b636..000000000 --- a/releases/2020.3.2.1.rst +++ /dev/null @@ -1,31 +0,0 @@ -.. 
_2020.3.2.1: - -************************** -Release Notes 2020.3.2.1 -************************** -The 2020.3.2.1 release notes were released on October 8, 2020 and describe the following: - -.. contents:: - :local: - :depth: 1 - - -Overview ------------------ -SQream DB v2020.3.2.1 contains major performance improvements and some bug fixes. - -Performance Enhancements -------------- -* Metadata on Demand optimization resulting in reduced latency and improved overall performance. - - -Known Issues and Limitations -------------- -* Multiple count distinct operations is enabled for all data types. - -Upgrading to v2020.3.2.1 -------------- - -Versions are available for IBM POWER9, RedHat (CentOS) 7, Ubuntu 18.04, and other OSs via Docker. - -Contact your account manager to get the latest release of SQream DB. \ No newline at end of file diff --git a/releases/2020.3.rst b/releases/2020.3.rst deleted file mode 100644 index d072b15da..000000000 --- a/releases/2020.3.rst +++ /dev/null @@ -1,102 +0,0 @@ -.. _2020.3: - -************************** -Release Notes 2020.3 -************************** -The 2020.3 release notes were released on October 8, 2020 and describes the following: - -.. contents:: - :local: - :depth: 1 - - -Overview ------------- -SQream DB v2020.3 contains new features, performance enhancements, and resolved issues. - - -New Features ----------- -The following list describes the new features: - - -* Parquet and ORC files can now be exported to local storage, S3, and HDFS with :ref:`copy_to` and foreign data wrappers. - -* New error tolerance features when loading data with foreign data wrappers. - -* ``TEXT`` is ramping up with new features (previously only available with VARCHARs): - - * :ref:`substring`, :ref:`lower`, :ref:`ltrim`, :ref:`charindex`, :ref:`replace`, etc. - - * Binary operators - :ref:`concat`, :ref:`like`, etc. - - * Casts to and from ``TEXT`` - -* :ref:`sqream_studio` v5.1 - - * New log viewer helps you track and debug what's going on in SQream DB. - - * Dashboard now also available for non-k8s deployments. - - * The editor contains a new query concurrency tool for date and numeric ranges. - - - -Performance Enhancements ----------- -The following list describes the performance enhancements: - - -* Error handling for CSV FDW. -* Enable logging errors - ORC, Parquet, CSV. -* Add limit and offset options to ``csv_fdw`` import. -* Enable logging errors to an external file when skipping CSV, Parquet, and ORC errors. -* Option to specify date format to the CSV FDW. -* Support all existing ``VARCHAR`` functions with ``TEXT`` on GPU. -* Support ``INSERT INTO`` + ``ORDER BY`` optimization for non-clustered tables. -* Performance improvements with I/O. - -Resolved Issues ---------------- -The following list describes the resolved issues: - - -* Better error message when passing the max errors limit. This was fixed. -* ``showFullExceptionInfo`` is no longer restricted to Developer Mode. This was fixed. -* An ``StreamAggregateA`` reduction error occured when performing aggregation on a ``NULL`` column. This was fixed. -* Insert into query fails with ""Error at Sql phase during Stages ""rewriteSqlQuery"". This was fixed. -* Casting from ``VARCHAR`` to ``TEXT`` does not remove the spaces. This was fixed. -* An ``Internal Runtime Error t1.size() == t2.size()`` occurs when querying the ``sqream_catalog.delete_predicates``. This was fixed. -* ``spoolMemoryGB`` and ``limitQueryMemoryGB`` show incorrectly in the **runtime global** section of ``show_conf.`` This was fixed. 
-* Casting empty text to ``int`` caused illegal memory access. This was fixed. -* Copying from a ``TEXT`` field was 1.5x slower than the ``VARCHAR`` equivalent. This was fixed. -* ``TPCDS 10TB - Internal runtime error (std::bad_alloc: out of memory)`` occurred on 2020.1.0.2. This was fixed. -* An unequal join on non-existing ``TEXT`` caused a system crash. This was fixed. -* An internal runtime error occurred when using ``TEXT`` (TPC-DS). This was fixed. -* Copying a CSV with a quote in the middle of a field to a ``TEXT`` field did not produce the required error. This was fixed. -* Long network insert loads could not be monitored with SQream. This was fixed. -* Poor ``UPPER`` and ``LIKE`` performance on ``TEXT``. This was fixed. -* ``INSERT INTO`` from 4 instances would get stuck (hang). This was fixed. -* An invalidly formatted CSV would cause an insufficient memory error on a ``COPY FROM`` statement if a quote was not closed and the file was much larger than system memory. This was fixed. -* ``TEXT`` columns could not be used with an outer join together with an inequality check (``!=``, ``<>``). This was fixed. - -Known Issues and Limitations ---------------------------- -The following list describes the known issues and limitations: - - -* Casting from ``TEXT`` to a ``DATE`` or ``DATETIME`` returns an error when the ``TEXT`` column contains ``NULL`` - -* Casting an empty ``TEXT`` field to an ``INT`` type returns ``0`` instead of returning an error - -* Multiple ``COUNT( distinct ... )`` operations on the ``TEXT`` data type are currently unsupported - -* Multiple ``COUNT( distinct ... )`` operations within the same query are limited to "developer mode" due to an instability that was identified. If you rely on this feature, contact your SQream account manager to enable it. - - -Upgrading to v2020.3 -------------------- - -Versions are available for IBM POWER9, RedHat (CentOS) 7, Ubuntu 18.04, and other OSs via Docker. - -Contact your account manager to get the latest release of SQream. diff --git a/releases/2020.3_index.rst b/releases/2020.3_index.rst deleted file mode 100644 index b13340b52..000000000 --- a/releases/2020.3_index.rst +++ /dev/null @@ -1,18 +0,0 @@ -.. _2020.3_index: - -************************** -Release Notes 2020.3 -************************** -The 2020.3 Release Notes describe the following releases: - -.. contents:: - :local: - :depth: 1 - -.. toctree:: - :maxdepth: 1 - :glob: - - 2020.3.2.1 - 2020.3.1 - 2020.3 \ No newline at end of file diff --git a/releases/2021.1.1.rst b/releases/2021.1.1.rst deleted file mode 100644 index 8e6417a43..000000000 --- a/releases/2021.1.1.rst +++ /dev/null @@ -1,64 +0,0 @@ -.. _2021.1.1: - -************************** -Release Notes 2021.1.1 -************************** -The 2021.1.1 release notes were released on 7/27/2021 and describe the following: - -.. contents:: - :local: - :depth: 1 - -New Features ------------- -The 2021.1.1 Release Notes include the following new features: - -.. contents:: - :local: - :depth: 1 - -Complete Ranking Function Support -********************************* -SQream now supports the following new ranking functions: - -.. list-table:: - :widths: 1 23 76 - :header-rows: 1 - - * - Function - - Return Type - - Description - * - first_value - - Same type as value - - Returns the value in the first row of a window. - * - last_value - - Same type as value - - Returns the value in the last row of a window. - * - nth_value - - Same type as value - - Returns the value in a specified (``n``) row of a window. If the specified row does not exist, this function returns ``NULL``.
- * - dense_rank - - bigint - - Returns the rank of the current row with no gaps. - * - percent_rank - - double - - Returns the relative rank of the current row. - * - cume_dist - - double - - Returns the cumulative distribution of rows. - * - ntile(buckets) - - integer - - Returns an integer ranging between ``1`` and the argument value, dividing the partitions as equally as possible. - -For more information, navigate to Window Functions and scroll to the `Ranking Functions table `_. - - -Resolved Issues --------------- -The following list describes the resolved issues: - -* SQream did not support exporting and reading **Int64** columns as **bigint** in Parquet. This was fixed. -* ``DECIMAL`` columns were not supported when inserting data from Parquet files. This was fixed. -* Values in Parquet numeric columns were not being converted correctly. This was fixed. -* Converting the ``string`` data type to ``datetime`` was not working correctly. This was fixed. -* Casting ``datetime`` to ``text`` truncated the time. This was fixed. \ No newline at end of file diff --git a/releases/2021.1.2.rst b/releases/2021.1.2.rst deleted file mode 100644 index 43ce6db7d..000000000 --- a/releases/2021.1.2.rst +++ /dev/null @@ -1,61 +0,0 @@ -.. _2021.1.2: - -************************** -Release Notes 2021.1.2 -************************** -The 2021.1.2 release notes were released on 8/9/2021 and describe the following: - -.. contents:: - :local: - :depth: 1 - -New Features ------------- -The 2021.1.2 Release Notes include the following new features: - -.. contents:: - :local: - :depth: 1 - -Aliases Added to SUBSTRING Function and Length Argument -******************************************************* -The following aliases have been added: - -* length - ``len`` -* substring - ``substr`` - -Data Type Aliases Added -*********************** -The following data type aliases have been added: - -* INTEGER - ``int`` -* DECIMAL - ``numeric`` -* DOUBLE PRECISION - ``double`` -* CHARACTER/CHAR - ``text`` -* NATIONAL CHARACTER/NATIONAL CHAR/NCHAR - ``text`` -* CHARACTER VARYING/CHAR VARYING - ``text`` -* NATIONAL CHARACTER VARYING/NATIONAL CHAR VARYING/NCHAR VARYING - ``text`` - -String Literals Containing ASCII Characters Interpreted as TEXT -*************************************************************** -SQream now interprets all string literals, including those containing ASCII characters, as ``text``. - -For more information, see `String Types `_. - -Decimal Literals Interpreted as Numeric Columns -*********************************************** -SQream now interprets literals containing decimal points as ``numeric`` instead of as ``double``. - -For more information, see `Data Types `_. - -Roles Area Added to Studio Version 5.3.3 -**************************************** -The **Roles** area has been added to `Studio version 5.3.3 `_. From the Roles area, users can create and assign roles and manage user permissions. - -Resolved Issues --------------- -The following list describes the resolved issues: - -* In Parquet files, ``float`` columns could not be mapped to SQream ``double`` columns. This was fixed. -* The ``REPLACE`` function only supported constant values as arguments. This was fixed. -* The ``LIKE`` function did not check for incorrect patterns or handle escape characters. This was fixed. \ No newline at end of file diff --git a/releases/2021.1.rst b/releases/2021.1.rst deleted file mode 100644 index b2b0dcfd8..000000000 --- a/releases/2021.1.rst +++ /dev/null @@ -1,213 +0,0 @@ -.. _2021.1: - -************************** -Release Notes 2021.1 -************************** -The 2021.1 release notes were released on 6/13/2021 and describe the following: - -..
contents:: - :local: - :depth: 1 - - -Version Content ---------------- -The 2021.1 Release Notes describe the following: - -* A major feature release targeted for all on-premises customers. -* Basic cloud functionality. - - -New Features ------------- -The 2021.1 Release Notes include the following new features: - - - -.. contents:: - :local: - :depth: 1 - -SQream DB on Cloud -****************** -SQream DB can now be run on AWS, GCP, and Azure. - -Numeric Data Types -****************** -SQream now supports numeric data types for the following operations: - - * All join types. - * All aggregation types (not including window functions). - * Scalar functions (not including some trigonometric and logarithmic functions). - -For more information, see `Numeric Data Types `_. - -Text Data Type -************** -SQream now supports the ``TEXT`` data type in all operations; it is the default string data type for new projects. - - - * SQream still supports ``VARCHAR`` functionality, but recommends using ``TEXT``. - - * ``TEXT`` data enhancements introduced in Release Notes version 2020.3.1: - - * Support for text columns in queries with multiple distinct aggregates. - * Text literal support for all functions. - -For more information, see `String Types `_. - - -Supports Scalar Subqueries -************************** -SQream now supports running initial scalar subqueries. - -For more information, see `Subqueries `_. - -Literal Arguments -***************** - -SQream now supports literal arguments for functions in all cases where column/scalar arguments are supported. - -Simple Scalar SQL UDFs -********************** -SQream now supports simple scalar SQL UDFs. - -For more information, see `Simple Scalar SQL UDFs `_. - -Logging Enhancements -******************** -Log information has been added for the following events: - - * Compilation start time. - * When the first metadata callback occurred in the compiler (if relevant). - * When the last metadata callback occurred in the compiler (if relevant). - * When the statement started attempting to apply locks. - * When a statement entered the queue. - * When a statement exited the queue. - * When a client has connected to an instance of **sqreamd** (if it reconnects). - * When the statement started executing. - -Improved License Information Display -************************************ -SQream now displays information related to data size limitations, expiration date, and license type via a new utility function (UF) named ``get_license_info()``. - -For more information, see `GET_LICENSE_INFO `_. - - - - -Optimized Foreign Data Wrapper Export -************************************* -SQream now supports exporting to multiple files concurrently. This is useful when you need to limit file sizes or split an export across multiple files. - -The following is the correct syntax for exporting multiple files concurrently: - -.. code-block:: none - - COPY table_name TO fdw_name OPTIONS(max_file_size=size_in_bytes,enforce_single_file={TRUE|FALSE}); - -The following is an example of the correct syntax for exporting multiple files concurrently: - -.. code-block:: none - - COPY my_table1 TO my_ext_table OPTIONS(max_file_size=500000,enforce_single_file=TRUE); - -The following apply: - -* Both of the parameters in the above example are optional. - -* The ``max_file_size`` value is specified in bytes and can be any positive value. The default value is ``16*2^20`` (16MB). - -* When the ``enforce_single_file`` value is set to ``TRUE``, only one file is created, and its size is not limited by the ``max_file_size`` value. Its default value is ``TRUE``.
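As a brief illustration (a sketch; the table name, wrapper name, and size are reused from the example above, and the chosen size is hypothetical), setting ``enforce_single_file`` to ``FALSE`` together with a ``max_file_size`` splits the export into several files:

.. code-block:: postgres

   -- Illustrative only: export in chunks of roughly 100 MB each, using the
   -- max_file_size and enforce_single_file options described above.
   COPY my_table1 TO my_ext_table OPTIONS (max_file_size=100000000, enforce_single_file=FALSE);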
- -Main Features -------------- -The following list describes the main features: - -* SQreamDB available on AWS. -* SQreamDB available on GCP. -* SQreamDB available on Azure. -* SQream uses storage located on object store (as opposed to local disks) for the above three cloud providers. -* SQream now supports MicroStrategy. -* Supports the MVP licensing system. -* A new literal syntax containing character escape semantics for string literals has been added. -* Supports optimized foreign data wrapper export. -* Supports truncating numeric values when ingested from ORC and CSV files. -* Supports a catalog utility function that accepts valid SQL patterns and escape characters. -* A new foreign data wrapper, ``random_fdw``, has been introduced for creating basic random data for non-text types. -* Supports simple scalar SQL UDFs. -* SQream parses its own logs as CSVs. - - -Resolved Issues --------------- -The following list describes the resolved issues: - -* Copying text from a CSV file to a ``TEXT`` column without closing quotes caused SQream to crash. This was fixed. -* Using an unsupported function call generated an incorrect insert error. This was fixed. -* Using ``insert into`` from ``table_does_not_exist`` generated an incorrect error. This was fixed. -* SQream treated inserting ``*`` in ``select_distinct`` as one column. This was fixed. -* Using certain encodeKey functions generated errors. This was fixed. -* Compile errors occurred while running decimal datatype sets. This was fixed. -* Running the ``select table_name,row_count from sqream_catalog.tables order by row_count limit 5`` query generated an internal runtime error. This was fixed. -* Using wildcards (such as ``*.x.y``) did not work in Parquet files. This was fixed. -* Executing ``log*(x,y)`` generated an incorrect error message. This was fixed. -* An ``internal runtime error: type doesn't have a fixed size`` occurred when running ``max`` on ``TEXT``. This was fixed. -* ``min`` and ``max`` on ``TEXT`` were significantly slower than on ``varchar``. This was fixed. -* Running ``regexp_instr`` generated an empty regular expression. This was fixed. -* Schemas with foreign tables could be dropped. This was fixed. - - - - - - - - - -Operations and Configuration Changes ------------------------------------- -Recommended SQream Configuration on Cloud -***************************************** - -For more information about AWS, see `Amazon S3 `_. - - - - -Optimized Foreign Data Wrapper Export Configuration Flag -******************************************************** - -SQream now has a new ``runtimeGlobalFlags`` flag called ``WriteToFileThreads``. - -This flag configures the number of threads in the **WriteToFile** function. The default value is ``16``. - -For more information about the ``runtimeGlobalFlags`` flag, see the **Runtime Global Flags** table in `Configuration `_. - - - - -Naming Changes --------------- -No relevant naming changes were made. - -Deprecated Features -------------------- -No features were deprecated. - -Known Issues and Limitations ----------------------------- -The list below describes the known issues and limitations: - -* When selecting top 1 from a foreign table using the Parquet format with an HDFS path, SQream experienced an error. -* An internal runtime error occurred when SQream was unable to find a column in reorder columns. -* Casting ``datetime`` to ``text`` truncates the time segment. -* In the **select** list, the compiler generates an error when a count is used as an alias. -* Performance degradation occurred when joins were made on small tables. -* SQream caused a logging error when using copy from logs.
-* Deploying S3 requires setting the ``ObjectStoreClients`` parameter to ``40``. - -Upgrading to v2021.1 --------------------- -Due to a known limitation on the number of access requests that can be simultaneously sent to AWS, deploying S3 requires setting the ``ObjectStoreClients`` parameter to ``40``. diff --git a/releases/2021.1_index.rst b/releases/2021.1_index.rst deleted file mode 100644 index 64b06e1d1..000000000 --- a/releases/2021.1_index.rst +++ /dev/null @@ -1,18 +0,0 @@ -.. _2021.1_index: - -************************** -Release Notes 2021.1 -************************** -The 2021.1 Release Notes describe the following releases: - -.. contents:: - :local: - :depth: 1 - -.. toctree:: - :maxdepth: 1 - :glob: - - 2021.1.2 - 2021.1.1 - 2021.1 \ No newline at end of file diff --git a/releases/2021.2.1.rst b/releases/2021.2.1.rst deleted file mode 100644 index f17bdd516..000000000 --- a/releases/2021.2.1.rst +++ /dev/null @@ -1,81 +0,0 @@ -.. _2021.2.1: - -************************** -Release Notes 2021.2.1 -************************** -The 2021.2.1 release notes were released on 15/12/2021 and describe the following: - -.. contents:: - :local: - :depth: 1 - -New Features ------------- -The 2021.2.1 Release Notes include the following new features: - -.. contents:: - :local: - :depth: 1 - -CREATE TABLE -************ -SQream now supports duplicating the column structure of an existing table using the ``LIKE`` clause. - -For more information, see `Duplicating the Column Structure of an Existing Table `_. - -PERCENTILE FUNCTIONS -******************** -SQream now supports the following aggregation functions: - -* :ref:`percentile_cont` -* :ref:`percentile_disc` -* :ref:`mode` - -REGEX REPLACE -************* -SQream now supports the ``REGEXP_REPLACE`` function for finding and replacing text column substrings. - -For more information, see :ref:`regexp_replace`. - -Delete Optimization -******************* -The ``DELETE`` statement can now delete values that contain multi-table conditions. - -For more information, see `Deleting Values that Contain Multi-Table Conditions `_. - - -Performance Enhancements ------------------------- -The **Performance Enhancements** section is not relevant to Version 2021.2.1. - -Resolved Issues --------------- -The following table lists the issues that were resolved in Version 2021.2.1: - -.. list-table:: - :widths: 17 200 - :header-rows: 1 - - * - SQ No. - - Description - * - SQ-8267 - - A method has been provided for including the ``GROUP BY`` and ``DISTINCT COUNT`` statements. - - -Known Issues ------------- -The **Known Issues** section is not relevant to Version 2021.2.1. - -Naming Convention Modifications -------------------------------- -The **Naming Convention Modifications** section is not relevant to Version 2021.2.1. - -End of Support --------------- -The **End of Support** section is not relevant to Version 2021.2.1. - -Deprecated Features -------------------- -The **Deprecated Features** section is not relevant to Version 2021.2.1. \ No newline at end of file diff --git a/releases/2021.2.rst b/releases/2021.2.rst deleted file mode 100644 index ec4773669..000000000 --- a/releases/2021.2.rst +++ /dev/null @@ -1,172 +0,0 @@ -.. _2021.2: - -************************** -Release Notes 2021.2 -************************** -The 2021.2 release notes were released on 13/9/2021. - -.. contents:: - :local: - :depth: 1 - -New Features ------------- -The 2021.2 Release Notes include the following new features: - -..
contents:: - :local: - :depth: 1 - -New Driver Compatibility -************************ -The 2021.2 release supports the following drivers: - -* **JDBC** - new driver version (JDBC 4.5) with important bug fixes. -* **ODBC** - ODBC 4.1.1, available on request. -* **NodeJS** - all versions starting with NodeJS 4.0. SQream recommends the latest version (NodeJS 4.2.4). -* **Dot Net** - SQream recommends version 3.0.2 (compatible with DotNet version 4.8). -* **Pysqream** - pysqream 3.1.2 - -Centralized Configuration System -******************************** -SQream now uses a new configuration system based on centralized configuration accessible from SQream Studio. - -For more information, see the following: - -* `Configuration `_ - describes how to configure your instance of SQream from a centralized location. - -* `SQream Studio 5.4.2 `_ - configure your instance of SQream from Studio. - -Qualifying Schemas Without Providing an Alias -********************************************* -When running queries, SQream now supports qualifying schemas without providing an alias. - -For more information, see :ref:`create_schema`. - - - - - -Double-Quotations Supported When Importing and Exporting CSVs -************************************************************* -When importing and exporting CSVs, SQream now supports using quotation characters other than double quotation marks (``"``). - -For more information, see the following: - -* :ref:`copy_from` - -* :ref:`copy_to` - - -Note the following: - -* Leaving the quote character unspecified uses the default value of standard double quotation marks ``"``. - - :: - -* The quotation character must be a single, 1-byte printable ASCII character. The same octal syntax of the copy command can be used. - - :: - -* The quote character cannot be contained in the field delimiter, record delimiter, or null marker. - - :: - -* Double-quotations can be customized when the ``csv_fdw`` value is used with the ``COPY FROM`` and ``CREATE FOREIGN TABLE`` statements. - - :: - -* The default escape character always matches the quote character, and can be overridden by using the ``ESCAPE = {'\\' | E'\XXX'}`` syntax, as shown in the following examples: - - .. code-block:: postgres - - copy t from wrapper csv_fdw options (location = '/tmp/file.csv', escape='\\'); - - .. code-block:: postgres - - copy t from wrapper csv_fdw options (location = '/tmp/file.csv', escape=E'\017'); - - .. code-block:: postgres - - copy t to wrapper csv_fdw options (location = '/tmp/file.csv', escape='\\'); - -For more information, see the following statements: - - -* :ref:`copy_from` - -* :ref:`create_foreign_table` - -Performance Enhancements ------------------------- -In Version 2021.2, an advanced smart spooling mechanism splits spool memory based on required CPU usage. - -Resolved Issues --------------- -The following table lists the issues that were resolved in Version 2021.2: - -.. list-table:: - :widths: 17 200 - :header-rows: 1 - - * - SQ No. - - Description - * - SQ-8294 - - Quote qualifiers were not present in the exported file, preventing it from being reloaded. - * - SQ-8288 - - Saved ``TEXT`` query parameters were not supported. - * - SQ-8266 - - A data loading issue occurred related to column order. - - -Known Issues ------------- -The **Known Issues** section is not relevant to Version 2021.2. - - -Naming Convention Modifications -------------------------------- -The **Naming Convention Modifications** section describes SQream features, such as data types or statements, that have been renamed. - -NVARCHAR Data Type Renamed TEXT -******************************* -The ``NVARCHAR`` data type has been renamed ``TEXT``.
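For example (a minimal sketch; the table and column names are hypothetical), DDL that previously declared an ``NVARCHAR`` column is now written with ``TEXT``:

.. code-block:: postgres

   -- Previously: CREATE TABLE customers (name NVARCHAR(40));
   CREATE TABLE customers (name TEXT(40));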
- - -For more information on the ``TEXT`` data type, see `String (TEXT) `_ - -End of Support --------------- -The **End of Support** section is not relevant to Version 2021.2. - -Deprecated Features -------------------- -The **Deprecated Features** section is not relevant to Version 2021.2. - -Upgrading Your SQream Version ------------------------------ -The **Upgrading Your SQream Version** section describes the following: - -.. contents:: - :local: - :depth: 1 - -Upgrading Your Storage Version -****************************** -When upgrading from a SQream version earlier than 2021.2, you must upgrade your storage version, as shown in the following example: - - .. code-block:: console - - $ cat /etc/sqream/sqream1_config.json |grep cluster - $ ./upgrade_storage - -For more information on upgrading your SQream version, see `Upgrading SQream Version `_. - -Upgrading Your Client Drivers -***************************** -For more information on the client drivers for version 2021.2, see `Client Drivers for 2021.2 `_. - -Configuring Your Instance of SQream -*********************************** -A new configuration method is used starting with Version 2021.2. - -For more information about configuring your instance of SQream, see :ref:`configuration`. \ No newline at end of file diff --git a/releases/2021.2_index.rst b/releases/2021.2_index.rst deleted file mode 100644 index 77a22b0ae..000000000 --- a/releases/2021.2_index.rst +++ /dev/null @@ -1,17 +0,0 @@ -.. _2021.2_index: - -************************** -Release Notes 2021.2 -************************** -The 2021.2 Release Notes describe the following releases: - -.. contents:: - :local: - :depth: 1 - -.. toctree:: - :maxdepth: 1 - :glob: - - 2021.2.1 - 2021.2 \ No newline at end of file diff --git a/releases/4.0.rst b/releases/4.0.rst new file mode 100644 index 000000000..f69d01f5e --- /dev/null +++ b/releases/4.0.rst @@ -0,0 +1,126 @@ +.. _4.0: + +***************** +Release Notes 4.0 +***************** + +SQream is introducing a new version release system that follows the more commonly used Major.Minor versioning schema. The newly released **4.0 version** is a minor version upgrade and does not require considerable preparation. + +The 4.0 release notes were released on 01/25/2023 and describe the following: + +.. contents:: + :local: + :depth: 1 + +New Features +------------ + + * Re-enabling an enhanced version of the :ref:`License Storage Capacity` feature + + * :ref:`Lightweight Directory Access Protocol (LDAP)` may be used to authenticate SQream roles + + * :ref:`Physical deletion performance enhancement` by supporting file systems with parallelism capabilities + +Storage Version +--------------- + +The storage version presently in effect is version 45. + + +SQream Studio Updates and Improvements +-------------------------------------- + + * When creating a **New Role**, you may now create a group role by selecting **Set as a group role**. + + * When editing an **Existing Role**, you are no longer obligated to update the role's password. + +Known Issues +------------ + +:ref:`Percentile` is not supported for Window functions.
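As a minimal sketch of this limitation (the table and column names are hypothetical), the grouped aggregate form of a percentile works, while the windowed form does not:

.. code-block:: postgres

   -- Supported: PERCENTILE_CONT as a grouped aggregate
   SELECT dept, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary)
   FROM emp
   GROUP BY dept;

   -- Not supported in this version: the same percentile as a window function
   -- SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary)
   --        OVER (PARTITION BY dept) FROM emp;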
+ +Version 4.0 Resolved Issues +--------------------------- + ++-----------------+---------------------------------------------------------------------------------------+ +| **SQ No.** | **Description** | ++=================+=======================================================================================+ +| SQ-10544 | SQream Studio dashboard periodic update enhancement | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-11296 | Slow catalog queries | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-11772 | Slow query performance when using ``JOIN`` clause | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12318 | JDBC ``insertBuffer`` parameter issue | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12364 | ``GET DDL`` foreign table output issue | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12446 | SQream Studio group role modification issue | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12468 | Internal compiler error | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12580 | Server Picker GPU dependency | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12598 | Executing ``SELECT`` on a foreign table with no valid path produces no error message | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-12652 | SQream Studio result panel adjustment | ++-----------------+---------------------------------------------------------------------------------------+ +| SQ-13055 | NULL issue when executing query with pysqream | ++-----------------+---------------------------------------------------------------------------------------+ + + + +Configuration Changes +--------------------- + +No configuration changes were made. + +Naming Changes +-------------- + +No relevant naming changes were made. + +Deprecated Features +------------------- + +SQream is declaring the end of support for the ``VARCHAR`` data type. This decision stems from SQream's effort to enhance its core functionalities and to keep pace with ever-changing ecosystem requirements. + +``VARCHAR`` is no longer supported for new customers - effective from Version 2022.1.3 (September 2022). + +The ``TEXT`` data type is replacing ``VARCHAR`` and ``NVARCHAR`` - SQream will maintain ``VARCHAR`` support until 09/30/2023. + + +End of Support +-------------- + +No End of Support changes were made. + +Upgrading to version 4.0 +------------------------ + +1. Generate a back-up of the metadata by running the following command: + + .. code-block:: console + + $ select backup_metadata('out_path'); + + .. tip:: SQream recommends storing the generated back-up locally in case it is needed. + + SQream runs the Garbage Collector and creates a clean backup tarball package. + +2. Shut down all SQream services. + +3. Extract the recently created back-up file. + +4. Replace your current metadata with the metadata you stored in the back-up file. - wait

5. Navigate to the new SQream package bin folder. + +6. Run the following command: + + .. code-block:: console + + $ ./upgrade_storage + + ..
note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQream Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure. + diff --git a/releases/4.0_index.rst b/releases/4.0_index.rst new file mode 100644 index 000000000..de53a91ce --- /dev/null +++ b/releases/4.0_index.rst @@ -0,0 +1,21 @@ +.. _4.x_index: + +***************** +4.x Release Notes +***************** + +.. contents:: + :local: + :depth: 1 + +.. toctree:: + :maxdepth: 1 + :glob: + + 4.3 + 4.5 + 4.8 + 4.9 + 4.10 + 4.11 + 4.12 diff --git a/releases/4.1.rst b/releases/4.1.rst new file mode 100644 index 000000000..0b88617bb --- /dev/null +++ b/releases/4.1.rst @@ -0,0 +1,129 @@ +.. _4.1: + +***************** +Release Notes 4.1 +***************** + +SQream is introducing a new version release system that follows the more commonly used Major.Minor.Patch versioning schema. The newly released **4.1 version** is a minor version upgrade and does not require considerable preparation. + +The 4.1 release notes were released on 03/01/2023 and describe the following: + +.. contents:: + :local: + :depth: 1 + +New Features +------------ + + * :ref:`Lightweight Directory Access Protocol (LDAP)` management enhancement + + * A new brute-force attack protection mechanism locks out user accounts for 15 minutes following 5 consecutive failed login attempts + +Newly Released Connector Drivers +-------------------------------- + +JDBC 4.5.7 `.jar file `_ + +Storage Version +--------------- + +The storage version presently in effect is version 45. + +SQream Studio Updates and Improvements +-------------------------------------- + +SQream Studio v5.5.4 has been released. + +Known Issues +------------ + +:ref:`Percentile` is not supported for Window functions. + + +Version 4.1 Resolved Issues +--------------------------- + ++------------------------+------------------------------------------------------------------------------------------+ +| **SQ No.** | **Description** | ++========================+==========================================================================================+ +| SQ-11287 | Function definition SQL UDF parenthesis issue | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-11296 | Slow catalog queries | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-12255 | Text column additional characters when using ``COPY TO`` statement | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-12510 | Encryption memory issues | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-13219 | JDBC ``supportsSchemasInDataManipulation()`` method issue | ++------------------------+------------------------------------------------------------------------------------------+ + +Configuration Changes +--------------------- + +No configuration changes were made. + + +Naming Changes +-------------- +No naming changes were made. + + +Deprecated Features +------------------- + +► Square Brackets ``[]`` + +The ``[]``, which are frequently used to delimit :ref:`identifiers` such as column names, table names, and other database objects, will soon be deprecated to facilitate the use of the ``ARRAY`` data type.
+ +* Support for ``[]`` as delimiters of database object identifiers ends on June 1st, 2023. + +* To delimit database object identifiers, you will be able to use double quotes ``""``. + + +► ``VARCHAR`` + +The ``VARCHAR`` data type is deprecated to improve the core functionalities of the platform and to align with the constantly evolving ecosystem requirements. + +* Support for the ``VARCHAR`` data type ends on September 30th, 2023. + +* ``VARCHAR`` is no longer supported for new customers, effective from Version 2022.1.3. + +* The ``TEXT`` data type is replacing the ``VARCHAR`` and ``NVARCHAR`` data types. + + + + +End of Support +-------------- + +No End of Support changes were made. + +Upgrading to v4.1 +----------------- + +1. Generate a back-up of the metadata by running the following command: + + .. code-block:: console + + $ select backup_metadata('out_path'); + + .. tip:: SQream recommends storing the generated back-up locally in case it is needed. + + SQream runs the Garbage Collector and creates a clean backup tarball package. + +2. Shut down all SQream services. + +3. Copy the recently created back-up file. + +4. Replace your current metadata with the metadata you stored in the back-up file. + +5. Navigate to the new SQream package bin folder. + +6. Run the following command: + + .. code-block:: console + + $ ./upgrade_storage + + .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQream Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure. + diff --git a/releases/4.10.rst b/releases/4.10.rst new file mode 100644 index 000000000..deeb125b0 --- /dev/null +++ b/releases/4.10.rst @@ -0,0 +1,120 @@ +.. _4.10: + +****************** +Release Notes 4.10 +****************** + +The 4.10 release notes were released on January 20th, 2025. + +.. contents:: + :local: + :depth: 1 + +Compatibility Matrix +-------------------- + ++-------------------------+------------------------------------------------------------------------+ +| System Requirement | Details | ++=========================+========================================================================+ +| Supported OS | RHEL 8.9 | ++-------------------------+------------------------------------------------------------------------+ +| Supported Nvidia driver | CUDA version 12.3.2 | ++-------------------------+------------------------------------------------------------------------+ +| Storage version | 58 | ++-------------------------+------------------------------------------------------------------------+ +| Driver compatibility | * JDBC 5.4.2 | +| | * ODBC 4.4.4 | +| | * NodeJS 4.2.4 | +| | * .NET 5.0.0 | +| | * Pysqream 5.3.0 | +| | * SQLAlchemy 1.4 | +| | * Spark 5.0.0 | +| | * SQLoader As A Service 8.3.1 | +| | * Java CLI 2.2 | ++-------------------------+------------------------------------------------------------------------+ + +New Features and Enhancements +----------------------------- +► :ref:`Alter Default Permissions` is now enhanced to include support for ``USER DEFINED FUNCTIONS``. + +► Enhanced Regular Expression (RegEx) support. + +► The Python version for :ref:`Python User Defined Functions` has been upgraded to Python 3.11.
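As a brief, hedged sketch of a UDF running on the upgraded Python runtime (the function body, names, and table are illustrative; see the Python User Defined Functions documentation for the authoritative syntax):

.. code-block:: postgres

   -- Illustrative only: a scalar Python UDF executed by the Python 3.11 runtime
   CREATE OR REPLACE FUNCTION title_case (s TEXT) RETURNS TEXT AS $$
   return s.title()
   $$ LANGUAGE PYTHON;

   SELECT title_case(name) FROM customers;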
+ +Known Issues +------------ + +:ref:`Percentile` is not supported for :ref:`Window Functions`. + +Version 4.10 Resolved Issues +---------------------------- + ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| **SQ No.** | **Description** | ++==============+=====================================================================================================================+ +| SQ-18800 | Running Create Table As Select from a table with delete predicates causes worker abnormal behavior | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-18797 | Worker failure leaves orphan locks | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-18555 | Enhance set parameter command efficiency | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-18504 | Access Control Permissions - Functions | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-18253 | A network insert query fails | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-16436 | Right function error | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-14398 | Improve orphan locks handling | ++--------------+---------------------------------------------------------------------------------------------------------------------+ + + + + +Deprecations +------------ + +► **Haskell CLI** + +Starting October 2024, support for the Haskell CLI is discontinued; it is replaced by the :ref:`Multi Platform CLI`, which is compatible with the Haskell CLI and adds ``Table-View`` and cross-platform compatibility. + +► **CentOS Linux 7.x** + +* As of June 2024, CentOS Linux 7.x will reach its End of Life and will not be supported by SQreamDB. This announcement provides a one-year advance notice for our users to plan for this change. We recommend that users explore migration or upgrade options to maintain ongoing support and security beyond this date. + +* RHEL 8.x is now officially supported. + +► **DateTime2 Alias** + +Starting April 2025, the alias ``DateTime2`` for the ``DateTime`` data type is being deprecated to simplify our data model and make way for the introduction of a new data type named ``DateTime2``. While ``DateTime2`` will remain functional as an alias until April 2025, we recommend updating your code to use ``DateTime`` directly to ensure compatibility. + +Upgrading to Version 4.10 +------------------------- + +1. Generate a back-up of the metadata by running the following command: + + .. code-block:: console + + select backup_metadata('out_path'); + + .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed. + + SQreamDB runs the Garbage Collector and creates a clean backup tarball package. + +2. Shut down all SQreamDB services. + +3. Copy the recently created back-up file. + +4. Replace your current metadata with the metadata you stored in the back-up file. + +5. Navigate to the new SQreamDB package bin folder. + +6. Run the following command: + + ..
code-block:: console + + ./upgrade_storage + + + + .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/upgrade_guide/version_upgrade.html>`_ procedure. + diff --git a/releases/4.11.rst b/releases/4.11.rst new file mode 100644 index 000000000..b9c6ff319 --- /dev/null +++ b/releases/4.11.rst @@ -0,0 +1,113 @@ +.. _4.11: + +****************** +Release Notes 4.11 +****************** + +The 4.11 release notes were released on April 9th, 2025. + +.. contents:: + :local: + :depth: 1 + +Compatibility Matrix +-------------------- + ++-------------------------+------------------------------------------------------------------------+ +| System Requirement | Details | ++=========================+========================================================================+ +| Supported OS | RHEL 8.9 | ++-------------------------+------------------------------------------------------------------------+ +| Supported Nvidia driver | CUDA version 12.3.2 | ++-------------------------+------------------------------------------------------------------------+ +| Storage version | 59 | ++-------------------------+------------------------------------------------------------------------+ + + +New Features and Enhancements +----------------------------- +► SQreamDB's new ``DATETIME2`` data type delivers nanosecond precision and timezone notation for reliable and accurate time-based data. + +► ``ARRAY`` decompression now runs on the GPU (performance improvement). + + +Known Issues +------------ + +:ref:`Percentile` is not supported for :ref:`Window Functions`. + +Version 4.11 Resolved Issues +---------------------------- + ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| **SQ No.** | **Description** | ++==============+=====================================================================================================================+ +| SQ-18621 | ``DELETE`` statement causes Worker stability issue | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19275 | Compression issue | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19415 | Problem granting column DDL privilege to new roles | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19485 | Issue using ``COPY FROM`` JSON file containing ``ARRAY`` data | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19534 | Issue with ``ARRAY`` containing large ``TEXT`` | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19541 | Issue using ``DISTINCT`` clause on ``ARRAY`` | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19571 | ``LZ4`` is not permitted for ``FLOAT`` cell of ``ARRAY`` | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19635 | ``DICT`` Compression issue |
++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19869 | ``STDDEV`` functions slow compilation time | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19879 | Issue with Saved Queries Default permissions during database creation | ++--------------+---------------------------------------------------------------------------------------------------------------------+ + + + +Deprecation +----------- + +► **Haskell CLI** + +Starting October 2024, support for the Haskell CLI is discontinued; it is replaced by the :ref:`Multi Platform CLI`, which is compatible with the Haskell CLI and adds ``Table-View`` and cross-platform compatibility. + +► **CentOS Linux 7.x** + +* As of June 2024, CentOS Linux 7.x has reached its End of Life and is no longer supported by SQreamDB. + +► **DateTime2 Alias** + +As of April 2025, the alias ``DateTime2`` for the ``DateTime`` data type has been deprecated to simplify our data model and make way for the introduction of a new data type named ``DateTime2``. + +Upgrading to Version 4.11 +------------------------- + +1. Generate a back-up of the metadata by running the following command: + + .. code-block:: console + + select backup_metadata('out_path'); + + .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed. + + SQreamDB runs the Garbage Collector and creates a clean backup tarball package. + +2. Shut down all SQreamDB services. + +3. Copy the recently created back-up file. + +4. Replace your current metadata with the metadata you stored in the back-up file. + +5. Navigate to the new SQreamDB package bin folder. + +6. Run the following command: + + .. code-block:: console + + ./upgrade_storage + + + + .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/upgrade_guide/version_upgrade.html>`_ procedure. + diff --git a/releases/4.12.rst b/releases/4.12.rst new file mode 100644 index 000000000..e5e9bc02c --- /dev/null +++ b/releases/4.12.rst @@ -0,0 +1,125 @@ +.. _4.12: + +****************** +Release Notes 4.12 +****************** + +The 4.12 release notes were released on July 3rd, 2025. + +..
contents:: + :local: + :depth: 1 + +Compatibility Matrix +-------------------- + ++-------------------------+----------------------------------------------------------------------------------------+ +| System Requirement | Details | ++=========================+========================================================================================+ +| Supported OS | RHEL 8.9 / 8.10 | ++-------------------------+----------------------------------------------------------------------------------------+ +| Supported Nvidia driver | CUDA version 12.3.2 / 12.6.1 | ++-------------------------+----------------------------------------------------------------------------------------+ +| Storage version | 62 | ++-------------------------+----------------------------------------------------------------------------------------+ +| Driver compatibility | The new features in this version are coupled with changes in the following components: | +| | | +| | * ODBC 5.0.0 | +| | * JDBC 6.2.0 | +| | * Pysqream 6.2.0 | +| | * Java CLI 2.7 | +| | * SQLoader As A Service 8.5.0 | ++-------------------------+----------------------------------------------------------------------------------------+ + + +New Features and Enhancements +----------------------------- +► :ref:`Metadata partitioning` significantly reduces statement execution time when metadata contains millions of keys by intelligently leveraging previously created metadata partitions for efficient data skipping. + +► New SQL syntax for :ref:`PUT`, :ref:`GET`, and :ref:`REMOVE` statements empowers users to directly write and read files to and from the SQreamDB cluster, leveraging its robust access control system. + +► We've upgraded our platform to Java 17, delivering enhanced performance and the latest security features. + +► The :ref:`PIVOT` functionality has been updated to support multi-column pivoting. + + +Known Issues +------------ + +:ref:`Percentile` is not supported for :ref:`Window Functions`. + +Version 4.12 Resolved Issues +---------------------------- + ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| **SQ No.** | **Description** | ++==============+=====================================================================================================================+ +| SQ-19639 | Consistency check does not recognize arrays. | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19869 | Queries with numerous STDDEV aggregations are experiencing long compilation time. | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-19879 | Default permissions for saved queries are not automatically created upon new database creation. | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-20146 | The ``PIVOT`` function does not allow renaming of pivoted columns using aliases. | ++--------------+---------------------------------------------------------------------------------------------------------------------+ +| SQ-20400 | Compiler throws an error when performing a join on encrypted columns. | ++--------------+---------------------------------------------------------------------------------------------------------------------+ + + + ..
_Deprecation: + +Deprecation +----------- + +► **Column DDL permission** + +Column ``DDL`` permission is now deprecated, as its functionality is fully included within the table ``DDL`` permission. Upon upgrading to this version, all existing column ``DDL`` permissions will be automatically revoked. +If you are upgrading from a release using the old behavior, please contact SQream support prior to performing the upgrade. + +► **Haskell CLI** + +Starting October 2024, support for the Haskell CLI is discontinued; it is replaced by the :ref:`Multi Platform CLI`, which is compatible with the Haskell CLI and adds ``Table-View`` and cross-platform compatibility. + +► **CentOS Linux 7.x** + +* As of June 2024, CentOS Linux 7.x has reached its End of Life and is no longer supported by SQreamDB. + +► **DateTime2 Alias** + +Effective April 2025, we've deprecated the alias ``DateTime2`` for the ``DateTime`` data type. This change streamlines our data model and prepares for the introduction of a new, distinct ``DateTime2`` data type in the future. + +Upgrading to Version 4.12 +------------------------- + +1. Generate a back-up of the metadata by running the following command: + + .. code-block:: console + + select backup_metadata('out_path'); + + .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed. + + SQreamDB runs the Garbage Collector and creates a clean backup tarball package. + +2. Shut down all SQreamDB services. + +3. Copy the recently created back-up file. + +4. Replace your current metadata with the metadata you stored in the back-up file. + +5. Navigate to the new SQreamDB package bin folder. + +6. Run the following command: + + .. code-block:: console + + ./upgrade_storage + + + + .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/upgrade_guide/version_upgrade.html>`_ procedure. + + + .. note:: Column DDL permission is now deprecated, as its functionality is fully included within the table DDL permission. For more information, please refer to `Deprecation`_. + diff --git a/releases/4.2.rst b/releases/4.2.rst new file mode 100644 index 000000000..203afa04a --- /dev/null +++ b/releases/4.2.rst @@ -0,0 +1,163 @@ +.. _4.2: + +***************** +Release Notes 4.2 +***************** + +SQream is introducing a new version release system that follows the more commonly used Major.Minor.Patch versioning schema. The newly released **4.2 version** is a minor version upgrade and does not require considerable preparation. + +The 4.2 release notes were released on 04/23/2023 and describe the following: + +.. contents:: + :local: + :depth: 1 + +New Features +------------ + + +:ref:`Apache Spark` may now be used for large-scale data processing.
+ +:ref:`Physical deletion` performance enhancement by supporting file systems with parallelism capabilities. + + +Newly Released Connector Drivers +-------------------------------- + +► Pysqream 3.2.5 + * Supports Python version 3.9 and newer + * `.tar file `_ + * :ref:`Documentation` + +► ODBC 4.4.4 + * :ref:`Getting the ODBC Driver` + +► JDBC 4.5.8 + * `.jar file `_ + * :ref:`Documentation` + +► Spark 5.0.0 + * `.jar file `_ + * :ref:`Documentation` + +Compatibility Matrix +-------------------- + ++-------------------------+------------------------------------------------------------------------+ +| System Requirement | Details | ++=========================+========================================================================+ +| Supported OS | * CentOS / RHEL - 7.6 - 7.9 | +| | * IBM RedHat 7.6 | ++-------------------------+------------------------------------------------------------------------+ +| Supported Nvidia driver | CUDA version from 10.1 up to 11.4.3 | ++-------------------------+------------------------------------------------------------------------+ +| Storage version | 46 | ++-------------------------+------------------------------------------------------------------------+ +| Driver compatibility | * JDBC 4.5.8 | +| | * ODBC 4.4.4 | +| | * NodeJS | +| | * .NET 3.0.2 | +| | * Pysqream 3.2.5 | +| | * Spark | ++-------------------------+------------------------------------------------------------------------+ + + + +SQream Studio Updates and Improvements +-------------------------------------- + +SQream Studio v5.5.4 has been released. + +Known Issues +------------ + +* :ref:`Percentile` is not supported for :ref:`Window Functions`. + +* Performance degradation when using a ``VARCHAR`` partition key in a :ref:`Window Functions` expression. + + +Version 4.2 Resolved Issues +--------------------------- + ++------------------------+------------------------------------------------------------------------------------------+ +| **SQ No.** | **Description** | ++========================+==========================================================================================+ +| SQ-12598 | Foreign table ``SELECT`` statement issue | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-13018 | ``cleanup_extent`` operation buffer issue | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-13055 | Pysqream ``NULL`` value issue | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-13322 | Clean up process is case sensitive | ++------------------------+------------------------------------------------------------------------------------------+ +| SQ-13450 | Storage upgrade issue | ++------------------------+------------------------------------------------------------------------------------------+ + +Configuration Changes +--------------------- + +No configuration changes were made. + + +Naming Changes +-------------- +No naming changes were made. + + +Deprecated Features +------------------- + +► ``INT96`` + +Due to Parquet's lack of support for the ``INT96`` data type, SQream has decided to deprecate this data type. + + +► Square Brackets ``[]`` + +The ``[]``, which are frequently used to delimit :ref:`identifiers` such as column names, table names, and other database objects, will soon be deprecated to facilitate the use of the ``ARRAY`` data type.
+ +* Support for ``[]`` as delimiters of database object identifiers ends on June 1st, 2023. +* To delimit database object identifiers, you will be able to use double quotes ``""``. + + +► ``VARCHAR`` + +The ``VARCHAR`` data type is deprecated to improve the core functionalities of the platform and to align with the constantly evolving ecosystem requirements. + +* Support for the ``VARCHAR`` data type ends on September 30th, 2023. +* ``VARCHAR`` is no longer supported for new customers, effective from Version 2022.1.3. +* The ``TEXT`` data type is replacing the ``VARCHAR`` and ``NVARCHAR`` data types. + + +End of Support +-------------- +No End of Support changes were made. + +Upgrading to v4.2 +----------------- +1. Generate a back-up of the metadata by running the following command: + + .. code-block:: console + + $ select backup_metadata('out_path'); + + .. tip:: SQream recommends storing the generated back-up locally in case it is needed. + + SQream runs the Garbage Collector and creates a clean backup tarball package. + +2. Shut down all SQream services. + +3. Copy the recently created back-up file. + +4. Replace your current metadata with the metadata you stored in the back-up file. + +5. Navigate to the new SQream package bin folder. + +6. Run the following command: + + .. code-block:: console + + $ ./upgrade_storage + + .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQream Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure. + diff --git a/releases/4.3.rst b/releases/4.3.rst new file mode 100644 index 000000000..e5cdbba1f --- /dev/null +++ b/releases/4.3.rst @@ -0,0 +1,206 @@ +.. _4.3: + +***************** +Release Notes 4.3 +***************** + +The 4.3 release notes were released on 11/06/2023 and describe the following: + +.. contents:: + :local: + :depth: 1 + +Compatibility Matrix +-------------------- + ++---------------------------------+------------------------------------------------------------------------+ +| System Requirement | Details | ++=================================+========================================================================+ +| Supported OS | * CentOS - 7.x | +| | * RHEL - 7.x / 8.x | ++---------------------------------+------------------------------------------------------------------------+ +| Supported Nvidia driver | CUDA version from 10.1 up to 11.4.3 | ++---------------------------------+------------------------------------------------------------------------+ +| Storage version | 49 | ++---------------------------------+------------------------------------------------------------------------+ +| Driver compatibility | * JDBC 4.5.8 | +| | * ODBC 4.4.4 | +| | * NodeJS | +| | * .NET 3.0.2 | +| | * Pysqream 3.2.5 | +| | * Spark | ++---------------------------------+------------------------------------------------------------------------+ +| SQream Acceleration Studio | Version 5.6.0 | ++---------------------------------+------------------------------------------------------------------------+ + +New Features and Enhancements +----------------------------- + +► A new :ref:`SQLoader ` will enable you to load data into SQreamDB from other databases.
+
+► Access control permissions in SQreamDB have been expanded, allowing roles to now grant and revoke access to other roles for the following:
+
+   * VIEWS
+   * FOREIGN TABLE
+   * COLUMN
+   * CATALOG
+   * SERVICE
+
+To learn more about how and when you should use this new capability, visit :ref:`access_control_permissions`.
+
+► RocksDB's metadata scale-up improvements have been implemented.
+
+SQreamDB Studio Updates and Improvements
+----------------------------------------
+
+SQream Studio version 5.6.0 has been released.
+
+Known Issues
+------------
+
+* :ref:`Percentile` is not supported for :ref:`Window Functions`.
+
+* Performance degradation when using ``VARCHAR`` partition key in a :ref:`Window Functions` expression
+
+* In SQreamDB minor versions 4.3.9 and 4.3.10, granting permissions through the Acceleration Studio might result in an error, even though the permission has been successfully granted.
+
+
+Version 4.3 Resolved Issues
+---------------------------
+
++--------------------+------------------------------------------------------------------------------------------------+
+| **SQ No.**         | **Description**                                                                                |
++====================+================================================================================================+
+| SQ-11108           | Slow ``COPY FROM`` statements using ORC files                                                  |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-11804           | Slow metadata optimization                                                                     |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-12721           | ``maxConnectionInactivitySeconds`` flag issue when executing Batch Shell Program ETLs          |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-12799           | Catalog queries may not be terminated                                                          |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13112           | ``GRANT`` query queue issue                                                                    |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13201           | ``INSERT INTO`` statement error while copying data from non-clustered table to clustered table |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13210, SQ-13426 | Slow query execution time                                                                      |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13225           | LoopJoin performance enhancement supports ``=``, ``>``, ``<``, and ``<=`` operators            |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13322           | Cleanup operation case-sensitivity issue                                                       |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13401           | The JDBC driver causes the log summary of ``INSERT`` statements to fail                        |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13453           | Metadata performance issue                                                                     |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13460           | ``GRANT ALL ON ALL TABLES`` statement slow compilation time                                    |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13461           | ``WHERE`` clause filter issue                                                                  |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13467           | Snapshot issue causes metadata failure                                                         |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13529           | Pysqream concurrency issue                                                                     |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13566, SQ-13694 | S3 access to bucket failure when using custom endpoint                                         |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13587           | Large number of worker connections failure                                                     |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13947           | Unicode character issue when using Tableau                                                     |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14094           | Metadata server error stops workers and query queue                                            |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14268           | Internal runtime memory issue                                                                  |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14724           | Alias issue when executing ``DELETE`` statement                                                |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13387           | Simple query slow compilation time due to metadata size                                        |
++--------------------+------------------------------------------------------------------------------------------------+
+
+Configuration Adjustments
+-------------------------
+
+► You may now configure the object access style and your endpoint URL with Virtual Private Cloud (VPC) when using AWS S3.
+
+Visit :ref:`s3` to learn more about how and when you should use these two new parameters:
+
+* ``AwsEndpointOverride``
+* ``AwsObjectAccessStyle``
+
+Deprecations
+------------
+
+► **CentOS Linux 7.x**
+
+* As of June 2024, CentOS Linux 7.x will reach its End of Life and will not be supported by SQreamDB. This announcement provides a one-year advance notice for our users to plan for this change. We recommend that users explore migration or upgrade options to maintain ongoing support and security beyond this date.
+
+* **RHEL 8.x** is now officially supported.
+
+► ``INT96``
+
+Due to Parquet's lack of support for the ``INT96`` data type, SQream has decided to deprecate this data type.
+
+
+► Square Brackets ``[]``
+
+Square brackets ``[]``, which are frequently used to delimit :ref:`identifiers` such as column names, table names, and other database objects, are officially deprecated to facilitate the use of the ``ARRAY`` data type. To delimit database object identifiers, use double quotes ``""``.
+
+
+► ``VARCHAR``
+
+The ``VARCHAR`` data type is deprecated to improve the core functionalities of the platform and to align with the constantly evolving ecosystem requirements.
+
+* Support for the ``VARCHAR`` data type ends on September 30th, 2023.
+* ``VARCHAR`` is no longer supported for new customers, effective from version 2022.1.3.
+* The ``TEXT`` data type is replacing the ``VARCHAR`` and ``NVARCHAR`` data types.
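+
+As a brief illustration of the square bracket deprecation above (the table and column names here are hypothetical, not taken from this documentation), an identifier that needs delimiting would be quoted with double quotes rather than square brackets:
+
+.. code-block:: sql
+
+   -- Deprecated square bracket delimiters:
+   -- SELECT [my column] FROM [my table];
+
+   -- Use double quotes instead:
+   SELECT "my column" FROM "my table";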
+.. _upgrade_to_4.3:
+
+Upgrading to v4.3
+-----------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      $ select backup_metadata('out_path');
+
+   .. tip:: SQream recommends storing the generated back-up locally in case it is needed.
+
+   SQream runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQream services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQream package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      $ ./upgrade_storage
+
+7. Version 4.3 introduces a service permission feature that enables superusers to grant and revoke role access to services. However, when upgrading from version 4.2 or earlier to version 4.3 or later, this feature initializes access to services and to catalog tables, causing existing roles to lose their access to services, to catalog tables, and consequently also to the UI. (Catalog tables may be used to determine user access rights and privileges, and the UI integrates with these permissions to control which actions users may perform in the database.)
+
+There are two methods of granting back access to services:
+
+   * Grant access to all services for all roles using the :ref:`grant_usage_on_service_to_all_roles` utility function
+   * Selectively grant or revoke access to services by following the :ref:`access permission guide`
+
+To grant back access to catalog tables and the UI, you may either grant access to all system roles, using your ``public`` role:
+
+.. code-block:: sql
+
+   GRANT ALL PERMISSIONS ON CATALOG TO public;
+
+Or individually grant access to selected roles:
+
+.. code-block:: sql
+
+   GRANT ALL PERMISSIONS ON CATALOG TO <role_name>;
+
+.. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQream Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure.
+
diff --git a/releases/4.4.rst b/releases/4.4.rst
new file mode 100644
index 000000000..76d008789
--- /dev/null
+++ b/releases/4.4.rst
@@ -0,0 +1,176 @@
+.. _4.4:
+
+*****************
+Release Notes 4.4
+*****************
+
+The 4.4 release notes were released on September 28th, 2023
+
+.. contents::
+   :local:
+   :depth: 1
+
+Compatibility Matrix
+--------------------
+
++---------------------------------+------------------------------------------------------------------------+
+| System Requirement              | Details                                                                |
++=================================+========================================================================+
+| Supported OS                    | * CentOS - 7.x                                                         |
+|                                 | * RHEL - 7.x / 8.x                                                     |
++---------------------------------+------------------------------------------------------------------------+
+| Supported Nvidia driver         | CUDA version from 10.1 up to 11.4.3                                    |
++---------------------------------+------------------------------------------------------------------------+
+| Storage version                 | 50                                                                     |
++---------------------------------+------------------------------------------------------------------------+
+| Driver compatibility            | * JDBC 5.0.0                                                           |
+|                                 | * ODBC 4.4.4                                                           |
+|                                 | * NodeJS                                                               |
+|                                 | * .NET 3.0.2                                                           |
+|                                 | * Pysqream 5.0.0                                                       |
+|                                 | * Spark                                                                |
++---------------------------------+------------------------------------------------------------------------+
+| SQream Acceleration Studio      | Version 5.7.0                                                          |
++---------------------------------+------------------------------------------------------------------------+
+
+New Features and Enhancements
+-----------------------------
+
+► The newly supported :ref:`sql_data_type_array` data type enables you to simplify queries and optimize space utilization.
+
+► :ref:`denodo` may now be used for real-time data visualization of various sources.
+
+► The new :ref:`select_gpu_metrics` utility function is now available, providing insights into cluster GPU usage over defined periods, crucial for maintaining compliance with license limits.
+
+Newly Released Connector Drivers
+--------------------------------
+
+► **Pysqream 5.0.0**
+
+* `.tar file `_
+* :ref:`Documentation`
+
+► **JDBC 5.0.0**
+
+* `.jar file `_
+* :ref:`Documentation`
+
+SQreamDB Studio Updates and Improvements
+----------------------------------------
+
+SQream Studio version 5.7.0 has been released.
+
+Known Issues
+------------
+
+* :ref:`Percentile` is not supported for :ref:`Window Functions`.
+
+
+Version 4.4 Resolved Issues
+---------------------------
+
++--------------------+------------------------------------------------------------------------------------------------+
+| **SQ No.**         | **Description**                                                                                |
++====================+================================================================================================+
+| SQ-12965           | ``ReadParquet`` chunk producer issue                                                           |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13461           | ``LEFT JOIN`` in the ``WHERE`` clause with different date values results in missing filters    |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13772           | Foreign table ``JOIN`` operation issue                                                         |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13805           | Different table structures provide different query times when using Parquet files             |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13947           | Unicode character issue when using Tableau                                                     |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13954           | Runtime error when granting role multiple permissions using the web interface                  |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-13971           | Parquet file data loading issue when columns contain over 100,000 digits                       |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14136           | Query deceleration due to metadata server issue                                                |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14268           | ``TEXT`` column length calculation CUDA memory issue                                           |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14399           | Figment snapshot recognition issue                                                             |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14400           | Healer configuration flag unavailability                                                       |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14556           | Object store path issue when using S3 API                                                      |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14724           | Aliases error when using ``DELETE`` statement                                                  |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-15074           | Web interface login issue for non-``SUPERUSER`` roles                                          |
++--------------------+------------------------------------------------------------------------------------------------+
+
+Configuration Adjustments
+-------------------------
+
+► You may now configure the object access style and your endpoint URL with Virtual Private Cloud (VPC) when using AWS S3.
+
+Visit :ref:`s3` to learn more about how and when you should use these two new parameters:
+
+* ``AwsEndpointOverride``
+* ``AwsObjectAccessStyle``
+
+► New :ref:`server_picker_cli_reference` parameters enable you to direct services to specific Workers and examine Worker availability.
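+
+For illustration only, a minimal sketch of how the two AWS S3 parameters above might be set in a configuration file; the endpoint URL and access-style value below are placeholders rather than values taken from this documentation, so refer to :ref:`s3` for the authoritative syntax:
+
+.. code-block:: json
+
+   {
+     "AwsEndpointOverride": "https://my-private-endpoint.example.com",
+     "AwsObjectAccessStyle": "path"
+   }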
+
+Deprecations
+------------
+
+► **CentOS Linux 7.x**
+
+* As of June 2024, CentOS Linux 7.x will reach its End of Life and will not be supported by SQreamDB. This announcement provides a one-year advance notice for our users to plan for this change. We recommend that users explore migration or upgrade options to maintain ongoing support and security beyond this date.
+
+* **RHEL 8.x** is now officially supported.
+
+► ``INT96``
+
+Due to Parquet's lack of support for the ``INT96`` data type, SQreamDB has decided to deprecate this data type.
+
+
+► Square Brackets ``[]``
+
+Square brackets ``[]``, which are frequently used to delimit :ref:`identifiers` such as column names, table names, and other database objects, are officially deprecated to facilitate the use of the ``ARRAY`` data type. To delimit database object identifiers, use double quotes ``""``.
+
+
+► ``VARCHAR``
+
+With the improvement of the core functionalities of the platform, and to align with the constantly evolving ecosystem requirements, the ``VARCHAR`` data type is deprecated and may no longer be used. The ``TEXT`` data type is replacing the ``VARCHAR`` and ``NVARCHAR`` data types.
+
+Upgrading to Version 4.4
+------------------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      $ select backup_metadata('out_path');
+
+   .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed.
+
+   SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQreamDB services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQreamDB package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      $ ./upgrade_storage
+
+7. Version 4.4 introduces a service permission feature that enables superusers to grant and revoke role access to services. However, when upgrading from version 4.2 or earlier to version 4.4 or later, this feature initializes access to services, causing existing roles to lose their access to services.
+
+There are two methods of granting back access to services:
+
+   * Grant access to all services for all roles using the :ref:`grant_usage_on_service_to_all_roles` utility function
+   * Selectively grant or revoke access to services by following the :ref:`access permission guide`
+
+   .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure.
+
diff --git a/releases/4.5.rst b/releases/4.5.rst
new file mode 100644
index 000000000..5110a77ae
--- /dev/null
+++ b/releases/4.5.rst
@@ -0,0 +1,113 @@
+.. _4.5:
+
+*****************
+Release Notes 4.5
+*****************
+
+The 4.5 release notes were released on December 5th, 2023
+
+.. contents::
+   :local:
+   :depth: 1
+
+Compatibility Matrix
+--------------------
+
++-------------------------+------------------------------------------------------------------------+
+| System Requirement      | Details                                                                |
++=========================+========================================================================+
+| Supported OS            | * CentOS 7.x                                                           |
+|                         | * RHEL 7.x / 8.x                                                       |
++-------------------------+------------------------------------------------------------------------+
+| Supported Nvidia driver | CUDA version from 10.1 up to 11.4.3                                    |
++-------------------------+------------------------------------------------------------------------+
+| Storage version         | 50                                                                     |
++-------------------------+------------------------------------------------------------------------+
+| Driver compatibility    | * JDBC 5.3.1                                                           |
+|                         | * ODBC 4.4.4                                                           |
+|                         | * NodeJS                                                               |
+|                         | * .NET 5.0.0                                                           |
+|                         | * Pysqream 5.1.0 (compatible with v4.5.13 or later)                    |
+|                         | * Spark 5.0.0                                                          |
+|                         | * SQLoader As A Service 8.1 (compatible with v4.6.1 or later)          |
+|                         | * SQLoader As A Process 7.13 (compatible with v4.5.13 or later)        |
++-------------------------+------------------------------------------------------------------------+
+
+New Features and Enhancements
+-----------------------------
+
+► A new :ref:`Health-Check Monitor` utility command empowers administrators to oversee the database's health. This command serves as a valuable monitoring tool, enabling administrators to assess and ensure the optimal health and performance of the database.
+
+► A new :ref:`Query Timeout` session flag is designed to identify queries that have exceeded a specified time limit. Once the flag value is reached, the query automatically stops.
+
+► Optimized ``JOIN`` operation for improved performance with large tables
+
+► The new :ref:`swap_table_names` utility function enables you to swap the names of two tables within a schema, as shown in the sketch below.
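+
+As an illustrative sketch only of the new utility function: the table names here are hypothetical and the exact argument list is an assumption, so refer to :ref:`swap_table_names` for the authoritative syntax:
+
+.. code-block:: sql
+
+   -- Swap the names of two tables within the same schema
+   SELECT swap_table_names('old_sales', 'new_sales');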
+
+Known Issues
+------------
+
+* :ref:`Percentile` is not supported for :ref:`Window Functions`
+
+
+Version 4.5 Resolved Issues
+---------------------------
+
++--------------------+------------------------------------------------------------------------------------------------+
+| **SQ No.**         | **Description**                                                                                |
++====================+================================================================================================+
+| SQ-11523           | Resolved internal runtime issue affecting the ``datetime`` saved query                        |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14292           | Resolved ``maxConnections`` Worker allocation issue                                           |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-14869           | Optimized compilation time for improved performance with large metadata                       |
++--------------------+------------------------------------------------------------------------------------------------+
+| SQ-15074           | Addressed UI access issue for non-``SUPERUSER`` roles                                         |
++--------------------+------------------------------------------------------------------------------------------------+
+
+Deprecations
+------------
+
+► **CentOS Linux 7.x**
+
+* As of June 2024, CentOS Linux 7.x will reach its End of Life and will not be supported by SQreamDB. This announcement provides a one-year advance notice for our users to plan for this change. We recommend that users explore migration or upgrade options to maintain ongoing support and security beyond this date.
+
+* **RHEL 8.x** is now officially supported.
+
+Upgrading to Version 4.5
+------------------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      $ select backup_metadata('out_path');
+
+   .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed.
+
+   SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQreamDB services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQreamDB package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      $ ./upgrade_storage
+
+7. Version 4.4 introduces a service permission feature that enables superusers to grant and revoke role access to services. However, when upgrading from version 4.2 or earlier to version 4.4 or later, this feature initializes access to services, causing existing roles to lose their access to services.
+
+There are two methods of granting back access to services:
+
+   * Grant access to all services for all roles using the :ref:`grant_usage_on_service_to_all_roles` utility function
+   * Selectively grant or revoke access to services by following the :ref:`access permission guide`
+
+   .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure.
+
diff --git a/releases/4.6.rst b/releases/4.6.rst
new file mode 100644
index 000000000..8d3a60f47
--- /dev/null
+++ b/releases/4.6.rst
@@ -0,0 +1,147 @@
+.. _4.6:
+
+*****************
+Release Notes 4.6
+*****************
+
+The 4.6 release notes were released on August 20th, 2024
+
+.. contents::
+   :local:
+   :depth: 1
+
+Compatibility Matrix
+--------------------
+
++-------------------------+------------------------------------------------------------------------+
+| System Requirement      | Details                                                                |
++=========================+========================================================================+
+| Supported OS            | RHEL - 8.x                                                             |
++-------------------------+------------------------------------------------------------------------+
+| Supported Nvidia driver | CUDA version 11.x                                                      |
++-------------------------+------------------------------------------------------------------------+
+| Storage version         | 51                                                                     |
++-------------------------+------------------------------------------------------------------------+
+| Driver compatibility    | * JDBC 5.3.1                                                           |
+|                         | * ODBC 4.4.4                                                           |
+|                         | * NodeJS 4.2.4                                                         |
+|                         | * .NET 5.0.0                                                           |
+|                         | * Pysqream 5.2.0                                                       |
+|                         | * Spark 5.0.0                                                          |
+|                         | * SQLoader As A Service 8.1 (compatible with v4.6.1 or later)          |
+|                         | * SQLoader As A Process 7.13 (compatible with v4.5.13 or later)        |
++-------------------------+------------------------------------------------------------------------+
+
+New Features and Enhancements
+-----------------------------
+
+► Announcing a new :ref:`Activity Report` reflecting your storage and resource usage within a defined time frame. You can export your activity report as a PDF for use in financial records, briefings, or quarterly and yearly reports.
+
+► Announcing a new Java-based, cross-platform :ref:`SQream SQL CLI`. This new CLI is fully compatible with the old, soon-to-be :ref:`deprecated Haskell CLI`, and also supports a neat-looking result ``table view``.
+
+► A new :ref:`ldap` configuration flag allows including LDAP user attributes in your SQreamDB metadata by associating these attributes with SQreamDB roles. This means that you can now search by these attributes using your SQreamDB web interface.
+
+► The ``TOP`` clause can now take a **subtraction** arithmetic operator when used in a :ref:`select` statement.
+
+► You may now set your :ref:`Server Picker` more easily using keyword arguments.
+
+► The ``clientReconnectionTimeout`` configuration flag has been reclassified as a cluster configuration flag. Unlike session flags, cluster flags apply to the entire cluster and persist across system restarts or shutdowns, retaining the configured value. Learn more about :ref:`SQreamDB configuration flags`.
+
+► Two new conditional functions shorten complex query runtime by checking for the existence of tables and views within the specified schema (see the sketch after this list):
+
+* :ref:`is_table_exists`
+* :ref:`is_view_exists`
+
+► We enhanced our :ref:`Saved Query` permissions, ensuring that your saved queries are accessible and can be executed and reviewed exclusively by authorized users.
+
+► For any new SQreamDB installation or upgrade, your default :ref:`legacy configuration file` will include the following cluster flags:
+
+.. code-block:: json
+
+   {
+     "logMaxFileSizeMB": 20,
+     "logFileRotateTimeFrequency": "daily"
+   }
+
+► Sign into SQreamDB Studio using your universal :ref:`Single Sign-On (SSO)` provider authentication.
+
+► Our :ref:`Pysqream` connector now supports SQLAlchemy version 2.0.27.
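+
+As an illustrative sketch only of the two new conditional functions: the schema and object names here are hypothetical and the argument list is an assumption, so refer to :ref:`is_table_exists` and :ref:`is_view_exists` for the authoritative syntax:
+
+.. code-block:: sql
+
+   -- Check whether a table and a view exist in a given schema
+   SELECT is_table_exists('public', 'my_table');
+   SELECT is_view_exists('public', 'my_view');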
+
+Known Issues
+------------
+
+* :ref:`Percentile` is not supported for :ref:`Window Functions`
+
+Version 4.6 Resolved Issues
+---------------------------
+
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| **SQ No.**         | **Description**                                                                                                     |
++====================+=====================================================================================================================+
+| SQ-12872           | Fixed unexpected Worker behavior caused by ``DROP TABLE`` statement                                                 |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-12873           | Improved the time it takes to delete metadata keys                                                                  |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-13057           | Fixed ``DOUBLE`` casting into ``TEXT`` issue                                                                        |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-15828           | Fixed slow query runtime due to ``VIEW`` unexpected behavior                                                        |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-16397           | Fixed database tree UI rendering issue                                                                              |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-16531           | Resolved the error encountered when trying to create a ``VIEW`` using a table that requires a cleanup operation     |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-16592           | Fixed a discrepancy issue following ``OR`` condition execution                                                      |
++--------------------+---------------------------------------------------------------------------------------------------------------------+
+
+.. _deprecations:
+
+Deprecations
+------------
+
+► **Haskell CLI**
+
+Starting February 2025, support for the Haskell CLI will be discontinued, and it will be replaced by a JAVA CLI that is compatible with both SQreamDB and BLUE.
+
+► **CentOS Linux 7.x**
+
+CentOS Linux 7.x has reached its end of life and is not supported by SQreamDB.
+
+Upgrading to Version 4.6
+------------------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      select backup_metadata('out_path');
+
+   .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed.
+
+   SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQreamDB services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQreamDB package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      ./upgrade_storage
+
+7. Version 4.4 introduces a service permission feature that enables superusers to grant and revoke role access to services. However, when upgrading from version 4.2 or earlier to version 4.4 or later, this feature initializes access to services, causing existing roles to lose their access to services.
+
+There are two methods of granting back access to services:
+
+   * Grant access to all services for all roles using the :ref:`grant_usage_on_service_to_all_roles` utility function
+   * Selectively grant or revoke access to services by following the :ref:`access permission guide`
+
+   .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure.
+
diff --git a/releases/4.7.rst b/releases/4.7.rst
new file mode 100644
index 000000000..484216acc
--- /dev/null
+++ b/releases/4.7.rst
@@ -0,0 +1,201 @@
+.. _4.7:
+
+*****************
+Release Notes 4.7
+*****************
+
+The 4.7 release notes were released on September 01, 2024
+
+.. contents::
+   :local:
+   :depth: 1
+
+Compatibility Matrix
+--------------------
+
++-------------------------+------------------------------------------------------------------------+
+| System Requirement      | Details                                                                |
++=========================+========================================================================+
+| Supported OS            | RHEL 8.x                                                               |
++-------------------------+------------------------------------------------------------------------+
+| Supported Nvidia driver | CUDA version 11.x                                                      |
++-------------------------+------------------------------------------------------------------------+
+| Storage version         | 51                                                                     |
++-------------------------+------------------------------------------------------------------------+
+| Driver compatibility    | * JDBC 5.3.1                                                           |
+|                         | * ODBC 4.4.4                                                           |
+|                         | * NodeJS 4.2.4                                                         |
+|                         | * .NET 5.0.0                                                           |
+|                         | * Pysqream 5.2.0                                                       |
+|                         | * Spark 5.0.0                                                          |
+|                         | * SQLoader As A Service 8.1 (compatible with v4.6.1 or later)          |
+|                         | * SQLoader As A Process 7.13 (compatible with v4.5.13 or later)        |
++-------------------------+------------------------------------------------------------------------+
+
+New Features and Enhancements
+-----------------------------
+
+► Enhance observability and enable shorter investigation times with the new :ref:`health_monitoring` SQreamDB service.
+
+► SQreamDB may now be deployed on :ref:`AWS private cloud`.
+
+► A ``SUPERUSER`` may now release a :ref:`specific lock` or :ref:`all locks` blocking file cleanup and preventing operations on locked objects within the system.
+
+► SQreamDB operates with utmost efficiency when processing tables containing large data chunks. Introducing the new :ref:`rechunk` utility function, which simplifies the management of tables with small data chunks. This feature enables users to merge small data chunks into larger ones while simultaneously eliminating any deleted records present.
+
+► Enable automatic termination of queries that exceed a pre-defined time limit in the queue. The introduction of the :ref:`queueTimeoutMinutes` flag empowers you to set time constraints for queries in the queue, ranging from a few minutes to a maximum of 72 hours.
+
+► Safely cast data types with the new :ref:`IsCastable` function. This function allows you to check whether a cast operation is possible or supported for a given column and data type and, when used within a ``CASE`` statement, provides an alternative when an exception occurs.
+
+► JDBC enhancements have been implemented to facilitate the retrieval of the record count for the updated number of rows during ``INSERT`` and ``DELETE`` operations when connecting to a third-party platform via JDBC. Use the SQreamDB JDBC connector as usual; the sole distinction is in the ability to now observe the updated number of rows.
+
+► Enhance your :ref:`COPY FROM` operations with the new ``DELETE_SOURCE_ON_SUCCESS`` parameter, which automatically deletes the source file being copied into SQreamDB. This not only saves time and effort in cleaning storage but also helps conserve storage space.
+
+► You may now retrieve and manipulate data from different databases within a single SQreamDB cluster through the execution of a single SQL statement using the :ref:`Cross-Database` syntax.
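+
+For illustration, a cross-database query of this kind might qualify a table with its database name; the database, schema, and table names here are hypothetical, and the exact qualification syntax is documented under :ref:`Cross-Database`:
+
+.. code-block:: sql
+
+   -- Query a table that lives in a different database within the same cluster
+   SELECT * FROM other_db.public.my_table;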
+
+► You may now use the new ``logFormat`` flag to configure the format of your log files. By default, logs are saved as ``CSV`` files. To configure your log files to be saved as ``JSON`` instead, use the ``logFormat`` flag in your :ref:`legacy config file`. If your current logs are in ``CSV`` format and you require :ref:`Health Monitoring`, it's advisable to configure your logs to be saved in both ``CSV`` and ``JSON`` formats as outlined above.
+
+.. note::
+
+   The ``logFormat`` flag must be configured identically in both your ``legacy_config_file`` and your ``metadata_config_file``.
+
+.. _health_monitoring_release:
+
+► You now have the option to choose the location for your ``metadata_server``, ``server_picker``, and Worker log files. In previous SQreamDB versions, the location of your log files was predetermined and hard-coded.
+
+:ref:`metadata_server_cli_reference`
+
+* Using the ``metadata_server_config.json`` file:
+
+  .. code-block:: json
+
+     {
+       "logPath": ""
+     }
+
+* Using the CLI:
+
+  .. code-block:: console
+
+     ./metadata_server --log_path=
+
+:ref:`server_picker_cli_reference`
+
+  Using the CLI:
+
+  .. code-block:: console
+
+     ./server_picker --log_path=
+
+:ref:`Worker`
+
+  Using the ``sqream_config_legacy.json``:
+
+  .. code-block:: json
+
+     {
+       "DefaultPathToLogs": ""
+     }
+
+► For any new SQreamDB installation or upgrade, your default :ref:`legacy configuration file` will include the following cluster flags:
+
+.. code-block:: json
+
+   {
+     "logMaxFileSizeMB": 20,
+     "logFileRotateTimeFrequency": "daily"
+   }
+
+.. note:: Starting with SQreamDB version 4.6, log file naming conventions have changed. **Ensure that any code referencing log file names is updated accordingly**.
+
+   * When using the ``logFileRotateTimeFrequency`` flag, log file names will follow these patterns:
+
+     ``Daily``: ``sqream_yyyyMMdd_000.log``
+
+     ``Weekly``: ``sqream_yyyyMMWW_000.log`` (WW = week number within the month)
+
+     ``Monthly``: ``sqream_yyyyMM_000.log``
+
+   * When using the ``logMaxFileSizeMB`` flag, log files will follow the pattern:
+
+     ``sqream_N.log`` (N = 1 to 13)
+
+Known Issues
+------------
+
+:ref:`Percentile` is not supported for :ref:`Window Functions`
+
+Version 4.7 Resolved Issues
+---------------------------
+
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| **SQ No.**   | **Description**                                                                                                     |
++==============+=====================================================================================================================+
+| SQ-15691     | Fixed ``TEXT`` casting into ``DOUBLE`` and ``NUMERIC`` issue when using scientific notation                         |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-16038     | Fixed ``CREATE TABLE ... LIKE`` permission inheritance issue                                                        |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-16937     | Fixed schema corruption following default permission altering issue                                                 |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-17149     | Created a new ``SWAP_TABLE_NAMES`` utility function to address issue with views affected by SQLoader loads          |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-17270     | Enhanced orphan snapshot cleaning mechanism                                                                         |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-17520     | Fixed a SQLoader ``cleanup_extents`` related issue                                                                  |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-17944     | Fixed ``UNION`` query result issue                                                                                  |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+
+Deprecations
+------------
+
+► **Haskell CLI**
+
+Starting February 2025, support for the Haskell CLI will be discontinued, and it will be replaced by a JAVA CLI that is compatible with both SQreamDB and BLUE.
+
+► **CentOS Linux 7.x**
+
+CentOS Linux 7.x has reached its end of life and is not supported by SQreamDB.
+
+* RHEL 8.x is now officially supported.
+
+Upgrading to Version 4.7
+------------------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      select backup_metadata('out_path');
+
+   .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed.
+
+   SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQreamDB services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQreamDB package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      ./upgrade_storage
+
+7. Version 4.4 introduces a service permission feature that enables superusers to grant and revoke role access to services. However, when upgrading from version 4.2 or earlier to version 4.4 or later, this feature initializes access to services, causing existing roles to lose their access to services.
+
+There are two methods of granting back access to services:
+
+   * Grant access to all services for all roles using the :ref:`grant_usage_on_service_to_all_roles` utility function
+   * Selectively grant or revoke access to services by following the :ref:`access permission guide`
+
+   .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure.
+
diff --git a/releases/4.8.rst b/releases/4.8.rst
new file mode 100644
index 000000000..c5fd6af43
--- /dev/null
+++ b/releases/4.8.rst
@@ -0,0 +1,111 @@
+.. _4.8:
+
+*****************
+Release Notes 4.8
+*****************
+
+The 4.8 release notes were released on October 6th, 2024
+
+.. contents::
+   :local:
+   :depth: 1
+
+Compatibility Matrix
+--------------------
+
++-------------------------+------------------------------------------------------------------------+
+| System Requirement      | Details                                                                |
++=========================+========================================================================+
+| Supported OS            | RHEL 8.x                                                               |
++-------------------------+------------------------------------------------------------------------+
+| Supported Nvidia driver | CUDA version 12.x                                                      |
++-------------------------+------------------------------------------------------------------------+
+| Storage version         | 51                                                                     |
++-------------------------+------------------------------------------------------------------------+
+| Driver compatibility    | * JDBC 5.4.0                                                           |
+|                         | * ODBC 4.4.4                                                           |
+|                         | * NodeJS 4.2.4                                                         |
+|                         | * .NET 5.0.0                                                           |
+|                         | * Pysqream 5.3.0                                                       |
+|                         | * SQLAlchemy 1.4                                                       |
+|                         | * Spark 5.0.0                                                          |
+|                         | * SQLoader As A Service 8.2                                            |
++-------------------------+------------------------------------------------------------------------+
+
+New Features and Enhancements
+-----------------------------
+
+► Prepared statements, also known as parameterized queries, are a safer and more efficient way to execute SQL statements. They prevent SQL injection attacks by separating SQL code from data, and they can improve performance by reusing prepared statements.
+These are now supported by our `Python <../connecting_to_sqream/client_drivers/python/index.html#prepared-statements>`_ and `JDBC <../connecting_to_sqream/client_drivers/jdbc/index.html#prepared-statements>`_ client drivers.
+
+► `PIVOT <../reference/sql/sql_syntax/pivot_unpivot.html#syntax>`_ allows you to convert row-level data into columnar representation. This technique is particularly useful when you need to summarize and visualize data. `UNPIVOT <../reference/sql/sql_syntax/pivot_unpivot.html#syntax>`_ does the opposite by transforming columnar data into rows. This operation is invaluable for scenarios where you wish to explore data in a more granular manner.
+
+► `Window function alias <../reference/sql/sql_syntax/window_functions.html#window-funtion-alias>`_ allows you to specify a parameter within the window function definition. This eliminates the need to repeatedly input the same SQL code in queries that use multiple window functions with identical definitions.
+
+► The `CONCAT <../reference/sql/sql_functions/scalar_functions/string/concat_function.html#concat-function>`_ function concatenates one or more strings, or one or more binary values.
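+
+As a brief illustration of the new ``CONCAT`` function (the table and column names here are hypothetical, not taken from this documentation):
+
+.. code-block:: sql
+
+   -- Concatenate first and last names into a single string
+   SELECT CONCAT(first_name, ' ', last_name) AS full_name
+   FROM employees;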
+
+Known Issues
+------------
+
+:ref:`Percentile` is not supported for :ref:`Window Functions`
+
+Version 4.8 Resolved Issues
+---------------------------
+
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| **SQ No.**   | **Description**                                                                                                     |
++==============+=====================================================================================================================+
+| SQ-12365     | SQream CLI - Comment is not ignored as expected                                                                     |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-17520     | SQLoader - DELETE issue following CDC process                                                                       |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+
+Deprecations
+------------
+
+► **Haskell CLI**
+
+Starting October 2024, support for the Haskell CLI will be discontinued, and it will be replaced by a JAVA CLI that is compatible with both SQreamDB and BLUE.
+
+► **CentOS Linux 7.x**
+
+* As of June 2024, CentOS Linux 7.x will reach its End of Life and will not be supported by SQreamDB. This announcement provides a one-year advance notice for our users to plan for this change. We recommend that users explore migration or upgrade options to maintain ongoing support and security beyond this date.
+
+* RHEL 8.x is now officially supported.
+
+Upgrading to Version 4.8
+------------------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      select backup_metadata('out_path');
+
+   .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed.
+
+   SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQreamDB services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQreamDB package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      ./upgrade_storage
+
+   .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/installing_sqream_with_binary.html#upgrading-sqream-version>`_ procedure.
+
diff --git a/releases/4.9.rst b/releases/4.9.rst
new file mode 100644
index 000000000..450454939
--- /dev/null
+++ b/releases/4.9.rst
@@ -0,0 +1,105 @@
+.. _4.9:
+
+*****************
+Release Notes 4.9
+*****************
+
+The 4.9 release notes were released on November 28th, 2024
+
+.. contents::
+   :local:
+   :depth: 1
+
+Compatibility Matrix
+--------------------
+
++-------------------------+------------------------------------------------------------------------+
+| System Requirement      | Details                                                                |
++=========================+========================================================================+
+| Supported OS            | RHEL 8.9                                                               |
++-------------------------+------------------------------------------------------------------------+
+| Supported Nvidia driver | CUDA version 12.x                                                      |
++-------------------------+------------------------------------------------------------------------+
+| Storage version         | 57                                                                     |
++-------------------------+------------------------------------------------------------------------+
+| Driver compatibility    | * JDBC 5.4.2                                                           |
+|                         | * ODBC 4.4.4                                                           |
+|                         | * NodeJS 4.2.4                                                         |
+|                         | * .NET 5.0.0                                                           |
+|                         | * Pysqream 5.3.0                                                       |
+|                         | * SQLAlchemy 1.4                                                       |
+|                         | * Spark 5.0.0                                                          |
+|                         | * SQLoader As A Service 8.3                                            |
++-------------------------+------------------------------------------------------------------------+
+
+New Features and Enhancements
+-----------------------------
+
+This release does not include any new features or enhancements.
+
+Known Issues
+------------
+
+:ref:`Percentile` is not supported for :ref:`Window Functions`
+
+Version 4.9 Resolved Issues
+---------------------------
+
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| **SQ No.**   | **Description**                                                                                                     |
++==============+=====================================================================================================================+
+| SQ-19055     | Illegal memory access was encountered error                                                                         |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-19053     | Workers connectivity issues                                                                                         |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-19051     | Compression related metadata issue                                                                                  |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-18745     | Cannot grant select to a role on a table                                                                            |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+| SQ-16877     | Query run time is longer than expected                                                                              |
++--------------+---------------------------------------------------------------------------------------------------------------------+
+
+Deprecations
+------------
+
+► **Haskell CLI**
+
+Starting October 2024, support for the Haskell CLI is discontinued; it is replaced by the :ref:`Multi Platform CLI`, which is compatible with the Haskell CLI and adds ``Table-View`` support and cross-platform compatibility.
+
+► **CentOS Linux 7.x**
+
+* As of June 2024, CentOS Linux 7.x will reach its End of Life and will not be supported by SQreamDB. This announcement provides a one-year advance notice for our users to plan for this change. We recommend that users explore migration or upgrade options to maintain ongoing support and security beyond this date.
+
+* RHEL 8.x is now officially supported.
+
+Upgrading to Version 4.9
+------------------------
+
+1. Generate a back-up of the metadata by running the following command:
+
+   .. code-block:: console
+
+      select backup_metadata('out_path');
+
+   .. tip:: SQreamDB recommends storing the generated back-up locally in case it is needed.
+
+   SQreamDB runs the Garbage Collector and creates a clean backup tarball package.
+
+2. Shut down all SQreamDB services.
+
+3. Copy the recently created back-up file.
+
+4. Replace your current metadata with the metadata you stored in the back-up file.
+
+5. Navigate to the new SQreamDB package bin folder.
+
+6. Run the following command:
+
+   .. code-block:: console
+
+      ./upgrade_storage
+
+   .. note:: Upgrading from a major version to another major version requires you to follow the **Upgrade Storage** step. This is described in Step 7 of the `Upgrading SQreamDB Version <../installation_guides/upgrade_guide/version_upgrade.html>`_ procedure.
+
diff --git a/releases/index.rst b/releases/index.rst
index 472197afc..0f0fbd4f4 100644
--- a/releases/index.rst
+++ b/releases/index.rst
@@ -1,28 +1,75 @@
 .. _releases:
 
-**********
+*************
 Release Notes
-**********
+*************
 
-
-.. list-table::
-   :widths: auto
-   :header-rows: 1
-
-
-   * - Version
-     - Release Date
-   * - :ref:`2021.2`
-     - September 13, 2021
-   * - :ref:`2021.1`
-     - June 13, 2021
-   * - :ref:`2020.3`
-     - October 8, 2020
-   * - :ref:`2020.2`
-     - July 22, 2020
-   * - :ref:`2020.1`
-     - January 15, 2020
+:ref:`Version 4.12 - July 3rd, 2025<4.12>`
+
+:ref:`Version 4.11 - April 9th, 2025<4.11>`
+
+:ref:`Version 4.10 - January 20, 2025<4.10>`
+
+:ref:`Version 4.9 - November 28, 2024<4.9>`
+
+:ref:`Version 4.8 - October 06, 2024<4.8>`
+
+* Prepared statements are now supported by our `Python <../connecting_to_sqream/client_drivers/python/index.html#prepared-statements>`_ and `JDBC <../connecting_to_sqream/client_drivers/jdbc/index.html#prepared-statements>`_ client drivers.
+* `PIVOT <../reference/sql/sql_syntax/pivot_unpivot.html#syntax>`_ and `UNPIVOT <../reference/sql/sql_syntax/pivot_unpivot.html#syntax>`_.
+* `Window function alias <../reference/sql/sql_syntax/window_functions.html#window-funtion-alias>`_ allows you to specify a parameter within the window function definition. This eliminates the need to repeatedly input the same SQL code in queries that use multiple window functions with identical definitions.
+* `CONCAT <../reference/sql/sql_functions/scalar_functions/string/concat_function.html#concat-function>`_ function concatenates one or more strings, or one or more binary values.
+
+:ref:`Version 4.7 - September 01, 2024<4.7>`
+
+* :ref:`AWS private cloud deployment` is now available for SQreamDB on AWS Marketplace.
+* Execute a single SQL statement across your SQreamDB cluster using the new :ref:`Cross-Database` syntax.
+* Safely cast data types with the new :ref:`IsCastable` function.
+* Automatically delete source files being copied into SQreamDB using the :ref:`copy_from` command.
+
+:ref:`Version 4.6 - August 20, 2024<4.6>`
+
+* You can now sign in to SQreamDB Studio using your universal :ref:`Single Sign-On (SSO)` provider authentication
+
+* Announcing a new :ref:`Activity Report` reflecting your storage and resource usage
+
+* Announcing a new Java-based cross-platform :ref:`SQream SQL CLI`
+
+* ``TOP`` clause enhancements
+* :ref:`Saved Query` command permission enhancements
+
+:ref:`Version 4.5 - December 5, 2023<4.5>`
+
+* A new :ref:`Health-Check Monitor` utility command empowers administrators to oversee the database's health. This command serves as a valuable monitoring tool, enabling administrators to assess and ensure the optimal health and performance of the database
+
+* A new :ref:`Query Timeout` session flag is designed to identify queries that have exceeded a specified time limit. Once the flag value is reached, the query automatically stops
+
+:ref:`Version 4.4 - September 28, 2023<4.4>`
+
+* `Enhancing storage efficiency and performance with the newly supported ARRAY data type `_
+* `New integration with Denodo Platform `_
+
+:ref:`Version 4.3 - June 11, 2023<4.3>`
+
+* `Access Control Permission Expansion `_
+* `New AWS S3 Access Configurations `_
+
+:ref:`Version 4.2 - April 23, 2023<4.2>`
+
+* `New Apache Spark Connector `_
+* `Physical Deletion Performance Enhancement `_
+
+:ref:`Version 4.1 - March 01, 2023<4.1>`
+
+* `LDAP Management Enhancements `_
+* `New Trino Connector `_
+* `Brute-Force Attack Protection `_
+
+:ref:`Version 4.0 - January 25, 2023<4.0>`
+
+* `SQreamDB License Storage Capacity `_
+* `LDAP Authentication `_
+* `Physical Deletion Performance Enhancement `_
 
 
 .. toctree::
@@ -30,8 +77,7 @@
    :maxdepth: 1
    :glob:
    :hidden:
 
-   2021.2_index
-   2021.1_index
-   2020.3_index
-   2020.2
-   2020.1
+   releasePolicy
+   4.0_index
+
+
diff --git a/releases/releasePolicy.rst b/releases/releasePolicy.rst
new file mode 100644
index 000000000..1b3d91f0c
--- /dev/null
+++ b/releases/releasePolicy.rst
@@ -0,0 +1,65 @@
+.. _releasePolicy:
+
+*******************
+SQDB Release Policy
+*******************
+
+
+Release Cadence
+===============
+* **Major Versions** Major product versions are released when significant architectural changes or substantial improvements are ready for deployment. There is no fixed release cadence for major versions.
+* **Minor Versions** SQream releases minor product versions once per quarter, ensuring that our customers benefit from new features and improvements.
+* **Patches** Patches and hotfixes will be released as needed to address specific issues.
+
+
+Transition to Maintenance Mode
+==============================
+* Upon release of a new version, the previous version immediately transitions into Maintenance Mode.
+* **Minimum Full Support Period** Each minor version will receive full support for a minimum of three months from its release date before transitioning to maintenance mode. Major versions will also have a minimum of three months of full support.
+* During Maintenance Mode, support will be limited to high-priority issues and showstopper bugs that impact critical functionality.
+
+
+End of Support Timeline
+=======================
+End of Support (EOS) for each version is scheduled one year after the version's release date.
+After the EOS date, no further updates or bug fixes will be provided for that version, and customers will be encouraged to upgrade to a supported version to receive continued updates and support.
+
+SQDB Releases Timeline
+======================
+
+.. list-table::
+   :widths: auto
+   :header-rows: 1
+
+   * - Release
+     - Release Date
+     - Maintenance Mode
+     - End of Support
+   * - ``4.12``
+     - July 3rd, 2025
+     - October 3rd, 2025
+     - July 3rd, 2026
+   * - ``4.11``
+     - April 9th, 2025
+     - May 27th, 2025
+     - April 9th, 2026
+   * - ``4.10``
+     - January 20th, 2025
+     - April 9th, 2025
+     - January 20th, 2026
+   * - ``4.9``
+     - November 29th, 2024
+     - February 28th, 2025
+     - November 29th, 2025
+   * - ``4.8``
+     - October 6th, 2024
+     - January 6th, 2025
+     - October 6th, 2025
+   * - ``4.5``
+     - December 5th, 2023
+     - March 5th, 2024
+     - September 30th, 2025
+   * - ``4.3``
+     - June 11th, 2023
+     - September 11th, 2023
+     - June 11th, 2024
\ No newline at end of file
diff --git a/requirements.txt b/requirements.txt
index 806fe2730..d3137b1cb 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,7 +1,11 @@
 # File: docs/requirements.txt
 
 # Defining the exact version will make sure things don't break
-sphinx==3.5.3
-sphinx_rtd_theme==0.5.2
-sphinx-notfound-page
+sphinx==7.2.6
+sphinx-rtd-theme>=2.1.0rc2
+urllib3<=2.0.0
+openssl-python>=0.1.1
+sphinx-notfound-page>=1.0.4
 Pygments>=2.4.0
+sphinx-favicon>=1.0.1
+pdftex
diff --git a/sqream_studio_5.4.3/configuring_your_instance_of_sqream.rst b/sqream_studio/configuring_your_instance_of_sqream.rst
similarity index 75%
rename from sqream_studio_5.4.3/configuring_your_instance_of_sqream.rst
rename to sqream_studio/configuring_your_instance_of_sqream.rst
index 2a60146e0..17d9bc43c 100644
--- a/sqream_studio_5.4.3/configuring_your_instance_of_sqream.rst
+++ b/sqream_studio/configuring_your_instance_of_sqream.rst
@@ -1,23 +1,24 @@
-.. _configuring_your_instance_of_sqream:
-
-****************************
-Configuring Your Instance of SQream
-****************************
-The **Configuration** section lets you edit parameters from one centralized location. While you can edit these parameters from the **worker configuration file (config.json)** or from your CLI, you can also modify them in Studio in an easy-to-use format.
-
-Configuring your instance of SQream in Studio is session-based, which enables you to edit parameters per session on your own device.
-Because session-based configurations are not persistent and are deleted when your session ends, you can edit your required parameters while avoiding conflicts between parameters edited on different devices at different points in time.
-
-Editing Your Parameters
--------------------------------
-When configuring your instance of SQream in Studio you can edit parameters for the **Generic** and **Admin** parameters only.
-
-Studio includes two types of parameters: toggle switches, such as **flipJoinOrder**, and text fields, such as **logSysLevel**. After editing a parameter, you can reset each one to its previous value or to its default value individually, or revert all parameters to their default setting simultaneously. Note that you must click **Save** to save your configurations.
-
-You can hover over the **information** icon located on each parameter to read a short description of its behavior.
-
-Exporting and Importing Configuration Files
--------------------------
-You can also export and import your configuration settings into a .json file. This allows you to easily edit your parameters and to share this file with other users if required.
-
-For more information about configuring your instance of SQream, see `Configuration Guides `_.
_configuring_your_instance_of_sqream: + +************************************ +Configuring Your Instance of SQream +************************************ + +The **Configuration** section lets you edit parameters from one centralized location. While you can edit these parameters from the **worker configuration file (config.json)** or from your CLI, you can also modify them in Studio in an easy-to-use format. + +Configuring your instance of SQream in Studio is session-based, which enables you to edit parameters per session on your own device. +Because session-based configurations are not persistent and are deleted when your session ends, you can edit your required parameters while avoiding conflicts between parameters edited on different devices at different points in time. + +Editing Your Parameters +----------------------- + +When configuring your instance of SQream in Studio you can edit session and cluster parameters only. + +Studio includes two types of parameters: toggle switches, such as **flipJoinOrder**, and text fields, such as **logSysLevel**. After editing a parameter, you can reset each one to its previous value or to its default value individually, or revert all parameters to their default setting simultaneously. Note that you must click **Save** to save your configurations. + +You can hover over the **information** icon located on each parameter to read a short description of its behavior. + +Exporting and Importing Configuration Files +------------------------------------------- + +You can also export and import your configuration settings into a .json file. This allows you to easily edit your parameters and to share this file with other users if required. \ No newline at end of file diff --git a/sqream_studio_5.4.3/creating_assigning_and_managing_roles_and_permissions.rst b/sqream_studio/creating_assigning_and_managing_roles_and_permissions.rst similarity index 78% rename from sqream_studio_5.4.3/creating_assigning_and_managing_roles_and_permissions.rst rename to sqream_studio/creating_assigning_and_managing_roles_and_permissions.rst index 325563761..8a0fdc006 100644 --- a/sqream_studio_5.4.3/creating_assigning_and_managing_roles_and_permissions.rst +++ b/sqream_studio/creating_assigning_and_managing_roles_and_permissions.rst @@ -1,98 +1,90 @@ -.. _creating_assigning_and_managing_roles_and_permissions: - -.. _roles_5.4.3: - -**************************** -Creating, Assigning, and Managing Roles and Permissions -**************************** -The **Creating, Assigning, and Managing Roles and Permissions** describes the following: - -.. contents:: - :local: - :depth: 1 - -Overview ---------------- -In the **Roles** area you can create and assign roles and manage user permissions. - -The **Type** column displays one of the following assigned role types: - -.. list-table:: - :widths: 15 75 - :header-rows: 1 - - * - Role Type - - Description - * - Groups - - Roles with no users. - * - Enabled users - - Users with log-in permissions and a password. - * - Disabled users - - Users with log-in permissions and with a disabled password. An admin may disable a user's password permissions to temporary disable access to the system. 
- -:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` - - -Viewing Information About a Role --------------------- -Clicking a role in the roles table displays the following information: - - * **Parent Roles** - displays the parent roles of the selected role. Roles inherit all roles assigned to the parent. - - :: - - * **Members** - displays all members that the role has been assigned to. The arrow indicates the roles that the role has inherited. Hovering over a member displays the roles that the role is inherited from. - - :: - - * **Permissions** - displays the role's permissions. The arrow indicates the permissions that the role has inherited. Hovering over a permission displays the roles that the permission is inherited from. - -:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` - - -Creating a New Role --------------------- -You can create a new role by clicking **New Role**. - - - -An admin creates a **user** by granting login permissions and a password to a role. Each role is defined by a set of permissions. An admin can also group several roles together to form a **group** to manage them simultaneously. For example, permissions can be granted to or revoked on a group level. - -Clicking **New Role** lets you do the following: - - * Add and assign a role name (required) - * Enable or disable log-in permissions for the role. - * Set a password. - * Assign or delete parent roles. - * Add or delete permissions. - * Grant the selected user with superuser permissions. - -From the New Role panel you view directly and indirectly (or inherited) granted permissions. Disabled permissions have no connect permissions for the referenced database and are displayed in gray text. You can add or remove permissions from the **Add permissions** field. From the New Role panel you can also search and scroll through the permissions. In the **Search** field you can use the **and** operator to search for strings that fulfill multiple criteria. - -When adding a new role, you must select the **Enable login for this role** and **Has password** check boxes. - -:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` - -Editing a Role --------------------- -Once you've created a role, clicking the **Edit Role** button lets you do the following: - - * Edit the role name. - * Enable or disable log-in permissions. - * Set a password. - * Assign or delete parent roles. - * Assign a role **administrator** permissions. - * Add or delete permissions. - * Grant the selected user with superuser permissions. - -From the Edit Role panel you view directly and indirectly (or inherited) granted permissions. Disabled permissions have no connect permissions for the referenced database and are displayed in gray text. You can add or remove permissions from the **Add permissions** field. From the Edit Role panel you can also search and scroll through the permissions. In the **Search** field you can use the **and** operator to search for strings that fulfill multiple criteria. - -:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` - -Deleting a Role ------------------ -Clicking the **delete** icon displays a confirmation message with the amount of users and groups that will be impacted by deleting the role. - -:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` \ No newline at end of file +.. _creating_assigning_and_managing_roles_and_permissions: + +.. 
_roles_5.4.7: + +******************************************************* +Creating, Assigning, and Managing Roles and Permissions +******************************************************* + +In the **Roles** area you can create and assign roles and manage user permissions. + +The **Type** column displays one of the following assigned role types: + +.. list-table:: + :widths: 15 75 + :header-rows: 1 + + * - Role Type + - Description + * - Groups + - Roles with no users. + * - Enabled users + - Users with log-in permissions and a password. + * - Disabled users + - Users with log-in permissions and with a disabled password. An admin may disable a user's password permissions to temporarily disable access to the system. + +.. note:: If you disable a password, you must create a new one when you re-enable it. + +:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` + + +Viewing Information About a Role +-------------------------------- + +Clicking a role in the roles table displays the following information: + + * **Parent Roles** - displays the parent roles of the selected role. Roles inherit all roles assigned to the parent. + + * **Members** - displays all members that the role has been assigned to. The arrow indicates the roles that the role has inherited. Hovering over a member displays the roles that the role is inherited from. + + * **Permissions** - displays the role's permissions. The arrow indicates the permissions that the role has inherited. Hovering over a permission displays the roles that the permission is inherited from. + +:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` + + +Creating a New Role +------------------- + +You can create a new role by clicking **New Role**. + + +An admin creates a **user** by granting login permissions and a password to a role. Each role is defined by a set of permissions. An admin can also group several roles together to form a **group** to manage them simultaneously. For example, permissions can be granted or revoked at the group level. + +Clicking **New Role** lets you do the following: + + * Add and assign a role name (required) + * Enable or disable log-in permissions for the role + * Set a password + * Assign or delete parent roles + * Add or delete permissions + * Grant the selected user superuser permissions + +From the New Role panel you can view directly and indirectly (inherited) granted permissions. Disabled permissions have no connect permissions for the referenced database and are displayed in gray text. You can add or remove permissions from the **Add permissions** field. From the New Role panel you can also search and scroll through the permissions. In the **Search** field you can use the **and** operator to search for strings that fulfill multiple criteria. + +When adding a new role, you must select the **Enable login for this role** and **Has password** check boxes. + +:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` + +Editing a Role +-------------- + +Once you've created a role, clicking the **Edit Role** button lets you do the following: + + * Edit the role name + * Enable or disable log-in permissions + * Set a password + * Assign or delete parent roles + * Assign a role **administrator** permissions + * Add or delete permissions + * Grant the selected user superuser permissions + +From the Edit Role panel you can view directly and indirectly (inherited) granted permissions. 
Disabled permissions have no connect permissions for the referenced database and are displayed in gray text. You can add or remove permissions from the **Add permissions** field. From the Edit Role panel you can also search and scroll through the permissions. In the **Search** field you can use the **and** operator to search for strings that fulfill multiple criteria. + +:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` + +Deleting a Role +--------------- + +Clicking the **delete** icon displays a confirmation message with the number of users and groups that will be impacted by deleting the role. + +:ref:`Back to Creating, Assigning, and Managing Roles and Permissions` \ No newline at end of file diff --git a/sqream_studio_5.4.3/executing_statements_and_running_queries_from_the_editor.rst b/sqream_studio/executing_statements_and_running_queries_from_the_editor.rst similarity index 76% rename from sqream_studio_5.4.3/executing_statements_and_running_queries_from_the_editor.rst rename to sqream_studio/executing_statements_and_running_queries_from_the_editor.rst index 55369d761..895b05671 100644 --- a/sqream_studio_5.4.3/executing_statements_and_running_queries_from_the_editor.rst +++ b/sqream_studio/executing_statements_and_running_queries_from_the_editor.rst @@ -1,492 +1,470 @@ -.. _executing_statements_and_running_queries_from_the_editor: - -.. _editor_top_5.4.3: - -**************************** -Executing Statements and Running Queries from the Editor -**************************** -The **Editor** is used for the following: - -* Selecting an active database and executing queries. -* Performing statement-related operations and showing metadata. -* Executing pre-defined queries. -* Writing queries and statements and viewing query results. - -The following is a brief description of the Editor panels: - - -.. list-table:: - :widths: 10 34 56 - :header-rows: 1 - - * - No. - - Element - - Description - * - 1 - - :ref:`Toolbar` - - Used to select the active database you want to work on, limit the number of rows, save query, etc. - * - 2 - - :ref:`Database Tree and System Queries panel` - - Shows a hierarchy tree of databases, views, tables, and columns - * - 3 - - :ref:`Statement panel` - - Used for writing queries and statements - * - 4 - - :ref:`Results panel` - - Shows query results and execution information. - - - -.. _top_5.4.3: - -.. _studio_5.4.3_editor_toolbar: - -Executing Statements from the Toolbar -================ -You can access the following from the Toolbar pane: - -* **Database dropdown list** - select a database that you want to run statements on. - - :: - -* **Service dropdown list** - select a service that you want to run statements on. The options in the service dropdown menu depend on the database you select from the **Database** dropdown list. - - :: - -* **Execute** - lets you set which statements to execute. The **Execute** button toggles between **Execute** and **Stop**, and can be used to stop an active statement before it completes: - - * **Statements** - executes the statement at the location of the cursor. - * **Selected** - executes only the highlighted text. This mode should be used when executing subqueries or sections of large queries (as long as they are valid SQLs). - * **All** - executes all statements in a selected tab. - -* **Format SQL** - Lets you reformat and reindent statements. - - :: - -* **Download query** - Lets you download query text to your computer. - - :: - -* **Open query** - Lets you upload query text from your computer. 
- - :: - -* **Max Rows** - By default, the Editor fetches only the first 10,000 rows. You can modify this number by selecting an option from the **Max Rows** dropdown list. Note that setting a higher number may slow down your browser if the result is very large. This number is limited to 100,000 results. To see a higher number, you can save the results in a file or a table using the :ref:`create_table_as` command. - - -For more information on stopping active statements, see the :ref:`STOP_STATEMENT` command. - -:ref:`Back to Executing Statements and Running Queries from the Editor` - - -.. _studio_5.4.3_editor_db_tree: - -Performing Statement-Related Operations from the Database Tree -================ -From the Database Tree you can perform statement-related operations and show metadata (such as a number indicating the amount of rows in the table). - - - - - -The database object functions are used to perform the following: - -* The **SELECT** statement - copies the selected table's **columns** into the Statement panel as ``SELECT`` parameters. - - :: - -* The **copy** feature |icon-copy| - copies the selected table's **name** into the Statement panel. - - :: - -* The **additional operations** |icon-dots| - displays the following additional options: - - -.. |icon-user| image:: /_static/images/studio_icon_user.png - :align: middle - -.. |icon-dots| image:: /_static/images/studio_icon_dots.png - :align: middle - -.. |icon-editor| image:: /_static/images/studio_icon_editor.png - :align: middle - -.. |icon-copy| image:: /_static/images/studio_icon_copy.png - :align: middle - -.. |icon-select| image:: /_static/images/studio_icon_select.png - :align: middle - -.. |icon-dots| image:: /_static/images/studio_icon_dots.png - :align: middle - -.. |icon-filter| image:: /_static/images/studio_icon_filter.png - :align: middle - -.. |icon-ddl-edit| image:: /_static/images/studio_icon_ddl_edit.png - :align: middle - -.. |icon-run-optimizer| image:: /_static/images/studio_icon_run_optimizer.png - :align: middle - -.. |icon-generate-create-statement| image:: /_static/images/studio_icon_generate_create_statement.png - :align: middle - -.. |icon-plus| image:: /_static/images/studio_icon_plus.png - :align: middle - -.. |icon-close| image:: /_static/images/studio_icon_close.png - :align: middle - -.. |icon-left| image:: /_static/images/studio_icon_left.png - :align: middle - -.. |icon-right| image:: /_static/images/studio_icon_right.png - :align: middle - -.. |icon-format-sql| image:: /_static/images/studio_icon_format.png - :align: middle - -.. |icon-download-query| image:: /_static/images/studio_icon_download_query.png - :align: middle - -.. |icon-open-query| image:: /_static/images/studio_icon_open_query.png - :align: middle - -.. |icon-execute| image:: /_static/images/studio_icon_execute.png - :align: middle - -.. |icon-stop| image:: /_static/images/studio_icon_stop.png - :align: middle - -.. |icon-dashboard| image:: /_static/images/studio_icon_dashboard.png - :align: middle - -.. |icon-expand| image:: /_static/images/studio_icon_expand.png - :align: middle - -.. |icon-scale| image:: /_static/images/studio_icon_scale.png - :align: middle - -.. |icon-expand-down| image:: /_static/images/studio_icon_expand_down.png - :align: middle - -.. |icon-add| image:: /_static/images/studio_icon_add.png - :align: middle - -.. |icon-add-worker| image:: /_static/images/studio_icon_add_worker.png - :align: middle - -.. |keep-tabs| image:: /_static/images/studio_keep_tabs.png - :align: middle - - -.. 
list-table:: - :widths: 30 70 - :header-rows: 1 - - * - Function - - Description - * - Insert statement - - Generates an `INSERT `_ statement for the selected table in the editing area. - * - Delete statement - - Generates a `DELETE `_ statement for the selected table in the editing area. - * - Create Table As statement - - Generates a `CREATE TABLE AS `_ statement for the selected table in the editing area. - * - Rename statement - - Generates an `RENAME TABLE AS `_ statement for renaming the selected table in the editing area. - * - Adding column statement - - Generates an `ADD COLUMN `_ statement for adding columns to the selected table in the editing area. - * - Truncate table statement - - Generates a `TRUNCATE_IF_EXISTS `_ statement for the selected table in the editing area. - * - Drop table statement - - Generates a ``DROP`` statement for the selected object in the editing area. - * - Table DDL - - Generates a DDL statement for the selected object in the editing area. To get the entire database DDL, click the |icon-ddl-edit| icon next to the database name in the tree root. See `Seeing System Objects as DDL `_. - * - DDL Optimizer - - The `DDL Optimizer `_ lets you analyze database tables and recommends possible optimizations. - -Optimizing Database Tables Using the DDL Optimizer ------------------------ -The **DDL Optimizer** tab analyzes database tables and recommends possible optimizations according to SQream's best practices. - -As described in the previous table, you can access the DDL Optimizer by clicking the **additional options icon** and selecting **DDL Optimizer**. - -The following table describes the DDL Optimizer screen: - -.. list-table:: - :widths: 15 75 - :header-rows: 1 - - * - Element - - Description - * - Column area - - Shows the column **names** and **column types** from the selected table. You can scroll down or to the right/left for long column lists. - * - Optimization area - - Shows the number of rows to sample as the basis for running an optimization, the default setting (1,000,000) when running an optimization (this is also the overhead threshold used when analyzing ``VARCHAR`` fields), and the default percent buffer to add to ``VARCHAR`` lengths (10%). Attempts to determine field nullability. - * - Run Optimizer - - Starts the optimization process. - -Clicking **Run Optimizer** adds a tab to the Statement panel showing the optimized results of the selected object. - -For more information, see `Optimization and Best Practices `_. - -Executing Pre-Defined Queries from the System Queries Panel ---------------- -The **System Queries** panel lets you execute predefined queries and includes the following system query types: - -* **Catalog queries** - used for analyzing table compression rates, users and permissions, etc. - - :: - -* **Admin queries** - queries related to available (describe the functionality in a general way). Queries useful for SQream database management. - -Clicking an item pastes the query into the Statement pane, and you can undo a previous operation by pressing **Ctrl + Z**. - -.. _studio_5.4.3_editor_statement_area: - -Writing Statements and Queries from the Statement Panel -============== -The multi-tabbed statement area is used for writing queries and statements, and is used in tandem with the toolbar. When writing and executing statements, you must first select a database from the **Database** dropdown menu in the toolbar. When you execute a statement, it passes through a series of statuses until completed. 
Knowing the status helps you with statement maintenance, and the statuses are shown in the **Results panel**. - -The auto-complete feature assists you when writing statements by suggesting statement options. - -The following table shows the statement statuses: - -.. list-table:: - :widths: 45 160 - :header-rows: 1 - - * - Status - - Description - * - Pending - - The statement is pending. - * - In queue - - The statement is waiting for execution. - * - Initializing - - The statement has entered execution checks. - * - Executing - - The statement is executing. - * - Statement stopped - - The statement has been stopped. - -You can add and name new tabs for each statement that you need to execute, and Studio preserves your created tabs when you switch between databases. You can add new tabs by clicking |icon-plus| , which creates a new tab to the right with a default name of SQL and an increasing number. This helps you keep track of your statements. - -You can also rename the default tab name by double-clicking it and typing a new name and write multiple statements in tandem in the same tab by separating them with semicolons (``;``).If too many tabs to fit into the Statement Pane are open at the same time, the tab arrows are displayed. You can scroll through the tabs by clicking |icon-left| or |icon-right|, and close tabs by clicking |icon-close|. You can also close all tabs at once by clicking **Close all** located to the right of the tabs. - -.. tip:: If this is your first time using SQream, see `Getting Started `_. - - -.. Keyboard shortcuts -.. ^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. :kbd:`Ctrl` +: kbd:`Enter` - Execute all queries in the statement area, or just the highlighted part of the query. - -.. :kbd:`Ctrl` + :kbd:`Space` - Auto-complete the current keyword - -.. :kbd:`Ctrl` + :kbd:`↑` - Switch to next tab. - -.. :kbd:`Ctrl` + :kbd:`↓` - Switch to previous tab - -.. _studio_editor_results_5.4.3: - -:ref:`Back to Executing Statements and Running Queries from the Editor` - -.. _studio_5.4.3_editor_results: - -.. _results_panel_5.4.3: - -Viewing Statement and Query Results from the Results Panel -============== -The results panel shows statement and query results. By default, only the first 10,000 results are returned, although you can modify this from the :ref:`studio_editor_toolbar`, as described above. By default, executing several statements together opens a separate results tab for each statement. Executing statements together executes them serially, and any failed statement cancels all subsequent executions. - -.. image:: /_static/images/results_panel.png - -The following is a brief description of the Results panel views highlighted in the figure above: - -.. list-table:: - :widths: 45 160 - :header-rows: 1 - - * - Element - - Description - * - :ref:`Results view` - - Lets you view search query results. - * - :ref:`Execution Details view` - - Lets you analyze your query for troubleshooting and optimization purposes. - * - :ref:`SQL view` - - Lets you see the SQL view. - - -.. _results_view_5.4.3: - -:ref:`Back to Executing Statements and Running Queries from the Editor` - -Searching Query Results in the Results View ----------------- -The **Results view** lets you view search query results. - -From this view you can also do the following: - -* View the amount of time (in seconds) taken for a query to finish executing. -* Switch and scroll between tabs. -* Close all tabs at once. -* Enable keeping tabs by selecting **Keep tabs**. -* Sort column results. 
- -Saving Results to the Clipboard -^^^^^^^^^^^^ -The **Save results to clipboard** function lets you save your results to the clipboard to paste into another text editor or into Excel for further analysis. - -.. _save_results_to_local_file_5.4.3: - -Saving Results to a Local File -^^^^^^^^^^^^ -The **Save results to local file** functions lets you save your search query results to a local file. Clicking **Save results to local file** downloads the contents of the Results panel to an Excel sheet. You can then use copy and paste this content into other editors as needed. - -In the Results view you can also run parallel statements, as described in **Running Parallel Statements** below. - -.. _running_parallel_statements_5.4.3: - -Running Parallel Statements -^^^^^^^^^^^^ -While Studio's default functionality is to open a new tab for each executed statement, Studio supports running parallel statements in one statement tab. Running parallel statements requires using macros and is useful for advanced users. - -The following shows the syntax for running parallel statements: - -.. code-block:: console - - $ @@ parallel - $ $$ - $ select 1; - $ select 2; - $ select 3; - $ $$ - - -:ref:`Back to Viewing Statement and Query Results from the Results Panel` - -.. _execution_details_view_5.4.3: - -.. _execution_tree_5.4.3: - -Execution Details View --------------- -The **Execution Details View** section describes the following: - -.. contents:: - :local: - :depth: 1 - -Overview -^^^^^^^^^^^^ -Clicking **Execution Details View** displays the **Execution Tree**, which is a chronological tree of processes that occurred to execute your queries. The purpose of the Execution Tree is to analyze all aspects of your query for troubleshooting and optimization purposes, such as resolving queries with an exceptionally long runtime. - -.. note:: The **Execution Details View** button is enabled only when a query takes longer than five seconds. - -From this screen you can scroll in, out, and around the execution tree with the mouse to analyze all aspects of your query. You can navigate around the execution tree by dragging or by using the mini-map in the bottom right corner. - -.. image:: /_static/images/execution_tree_1.png - -You can also search for query data by pressing **Ctrl+F** or clicking the search icon |icon-search| in the search field in the top right corner and typing text. - -.. image:: /_static/images/search_field.png - -Pressing **Enter** takes you directly to the next result matching your search criteria, and pressing **Shift + Enter** takes you directly to the previous result. You can also search next and previous results using the up and down arrows. - -.. |icon-search| image:: /_static/images/studio_icon_search.png - :align: middle - -The nodes are color-coded based on the following: - -* **Slow nodes** - red -* **In progress nodes** - yellow -* **Completed nodes** - green -* **Pending nodes** - white -* **Currently selected node** - blue -* **Search result node** - purple (in the mini-map) - -The execution tree displays the same information as shown in the plain view in tree format. - -The Execution Tree tracks each phase of your query in real time as a vertical tree of nodes. Each node refers to an operation that occurred on the GPU or CPU. When a phase is completed, the next branch begins to its right until the entire query is complete. Joins are displayed as two parallel branches merged together in a node called **Join**, as shown in the figure above. 
The nodes are connected by a line indicating the number of rows passed from one node to the next. The width of the line indicates the amount of rows on a logarithmic scale. - -Each node displays a number displaying its **node ID**, its **type**, **table name** (if relevant), **status**, and **runtime**. The nodes are color-coded for easy identification. Green nodes indicate **completed nodes**, yellow indicates **nodes in progress**, and red indicates **slowest nodes**, typically joins, as shown below: - -.. image:: /_static/images/nodes.png - -Viewing Query Statistics -^^^^^^^^^^^^ -The following statistical information is displayed in the top left corner, as shown in the figure above: - -* **Query Statistics**: - - * **Elapsed** - the total time taken for the query to complete. - * **Result rows** - the amount of rows fetched. - * **Running nodes completion** - * **Total query completion** - the amount of the total execution tree that was executed (nodes marked green). - -* **Slowest Nodes** information is displayed in the top right corner in red text. Clicking the slowest node centers automatically on that node in the execution tree. - -You can also view the following **Node Statistics** in the top right corner for each individual node by clicking a node: - -.. list-table:: - :widths: 45 160 - :header-rows: 1 - - * - Element - - Description - * - Node type - - Shows the node type. - * - Status - - Shows the execution status. - * - Time - - The total time taken to execute. - * - Rows - - Shows the number of produced rows passed to the next node. - * - Chunks - - Shows number of produced chunks. - * - Average rows per chunk - - Shows the number of average rows per chunk. - * - Table (for **ReadTable** and joins only) - - Shows the table name. - * - Write (for joins only) - - Shows the total date size written to the disk. - * - Read (for **ReadTable** and joins only) - - Shows the total data size read from the disk. - -Note that you can scroll the Node Statistics table. You can also download the execution plan table in .csv format by clicking the download arrow |icon-download| in the upper-right corner. - -.. |icon-download| image:: /_static/images/studio_icon_download.png - :align: middle - -Using the Plain View -^^^^^^^^^^^^ -You can use the **Plain View** instead of viewing the execution tree by clicking **Plain View** |icon-plain| in the top right corner. The plain view displays the same information as shown in the execution tree in table format. - -.. |icon-plain| image:: /_static/images/studio_icon_plain.png - :align: middle - - - - -The plain view lets you view a query’s execution plan for monitoring purposes and highlights rows based on how long they ran relative to the entire query. - -This can be seen in the **timeSum** column as follows: - -* **Rows highlighted red** - longest runtime -* **Rows highlighted orange** - medium runtime -* **Rows highlighted yellow** - shortest runtime - -:ref:`Back to Viewing Statement and Query Results from the Results Panel` - -.. _sql_view_5.4.3: - -Viewing Wrapped Strings in the SQL View ------------------- -The SQL View panel allows you to more easily view certain queries, such as a long string that appears on one line. The SQL View makes it easier to see by wrapping it so that you can see the entire string at once. It also reformats and organizes query syntax entered in the Statement panel for more easily locating particular segments of your queries. 
The SQL View is identical to the **Format SQL** feature in the Toolbar, allowing you to retain your originally constructed query while viewing a more intuitively structured snapshot of it. - -.. _save_results_to_clipboard_5.4.3: - -:ref:`Back to Viewing Statement and Query Results from the Results Panel` - -:ref:`Back to Executing Statements and Running Queries from the Editor` +.. _executing_statements_and_running_queries_from_the_editor: + +.. _editor_top: + +******************************************************** +Executing Statements and Running Queries from the Editor +******************************************************** + +The **Editor** is used for the following: + +* Selecting an active database and executing queries. +* Performing statement-related operations and showing metadata. +* Executing pre-defined queries. +* Writing queries and statements and viewing query results. + +The following is a brief description of the Editor panels: + + +.. list-table:: + :widths: 10 34 56 + :header-rows: 1 + + * - No. + - Element + - Description + * - 1 + - :ref:`Toolbar` + - Used to select the active database you want to work on, limit the number of rows, save query, etc. + * - 2 + - :ref:`Database Tree and System Queries panel` + - Shows a hierarchy tree of databases, views, tables, and columns. + * - 3 + - :ref:`Statement panel` + - Used for writing queries and statements. + * - 4 + - :ref:`Results panel` + - Shows query results and execution information. + + + +.. _top: + +.. _studio_editor_toolbar: + +Executing Statements from the Toolbar +===================================== + +You can access the following from the Toolbar pane: + +* **Database dropdown list** - select a database that you want to run statements on. + +* **Service dropdown list** - select a service that you want to run statements on. The options in the service dropdown menu depend on the database you select from the **Database** dropdown list. + +* **Execute** - lets you set which statements to execute. The **Execute** button toggles between **Execute** and **Stop**, and can be used to stop an active statement before it completes: + + * **Statements** - executes the statement at the location of the cursor. + * **Selected** - executes only the highlighted text. This mode should be used when executing subqueries or sections of large queries (as long as they are valid SQL). + * **All** - executes all statements in a selected tab. + +* **Format SQL** - Lets you reformat and reindent statements. + +* **Download query** - Lets you download query text to your computer. + +* **Open query** - Lets you upload query text from your computer. + +* **Max Rows** - By default, the Editor fetches only the first 10,000 rows. You can modify this number by selecting an option from the **Max Rows** dropdown list. Note that setting a higher number may slow down your browser if the result is very large. This number is limited to 100,000 results. To see a higher number, you can save the results in a file or a table using the :ref:`create_table_as` command. + + +For more information on stopping active statements, see the :ref:`STOP_STATEMENT` command. + +.. _studio_editor_db_tree: + +Performing Statement-Related Operations from the Database Tree +============================================================== + +From the Database Tree you can perform statement-related operations and show metadata (such as a number indicating the number of rows in the table). 
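+As an illustration, the **SELECT** operation described below, applied to a hypothetical ``customers`` table, produces a statement of roughly the following form in the Statement panel (the table and column names are illustrative only, not actual Studio output): + +.. code-block:: sql + + SELECT customer_id, + customer_name + FROM customers; + 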
+ + +The database object functions are used to perform the following: + +* The **SELECT** statement - copies the selected table's **columns** into the Statement panel as ``SELECT`` parameters. + +* The **copy** feature |icon-copy| - copies the selected table's **name** into the Statement panel. + +* The **additional operations** |icon-dots| - displays the following additional options: + + +.. |icon-user| image:: /_static/images/studio_icon_user.png + :align: middle + +.. |icon-dots| image:: /_static/images/studio_icon_dots.png + :align: middle + +.. |icon-editor| image:: /_static/images/studio_icon_editor.png + :align: middle + +.. |icon-copy| image:: /_static/images/studio_icon_copy.png + :align: middle + +.. |icon-select| image:: /_static/images/studio_icon_select.png + :align: middle + +.. |icon-filter| image:: /_static/images/studio_icon_filter.png + :align: middle + +.. |icon-ddl-edit| image:: /_static/images/studio_icon_ddl_edit.png + :align: middle + +.. |icon-run-optimizer| image:: /_static/images/studio_icon_run_optimizer.png + :align: middle + +.. |icon-plus| image:: /_static/images/studio_icon_plus.png + :align: middle + +.. |icon-close| image:: /_static/images/studio_icon_close.png + :align: middle + +.. |icon-left| image:: /_static/images/studio_icon_left.png + :align: middle + +.. |icon-right| image:: /_static/images/studio_icon_right.png + :align: middle + +.. |icon-format-sql| image:: /_static/images/studio_icon_format.png + :align: middle + +.. |icon-download-query| image:: /_static/images/studio_icon_download_query.png + :align: middle + +.. |icon-open-query| image:: /_static/images/studio_icon_open_query.png + :align: middle + +.. |icon-execute| image:: /_static/images/studio_icon_execute.png + :align: middle + +.. |icon-stop| image:: /_static/images/studio_icon_stop.png + :align: middle + +.. |icon-dashboard| image:: /_static/images/studio_icon_dashboard.png + :align: middle + +.. |icon-expand| image:: /_static/images/studio_icon_expand.png + :align: middle + +.. |icon-scale| image:: /_static/images/studio_icon_scale.png + :align: middle + +.. |icon-expand-down| image:: /_static/images/studio_icon_expand_down.png + :align: middle + +.. |icon-add| image:: /_static/images/studio_icon_add.png + :align: middle + +.. |icon-add-worker| image:: /_static/images/studio_icon_add_worker.png + :align: middle + +.. |keep-tabs| image:: /_static/images/studio_keep_tabs.png + :align: middle + + +.. list-table:: + :widths: 30 70 + :header-rows: 1 + + * - Function + - Description + * - Insert statement + - Generates an :ref:`INSERT` statement for the selected table in the editing area. + * - Delete statement + - Generates a :ref:`DELETE` statement for the selected table in the editing area. + * - Create Table As statement + - Generates a :ref:`CREATE TABLE AS` statement for the selected table in the editing area. + * - Rename statement + - Generates a :ref:`RENAME TABLE AS` statement for renaming the selected table in the editing area. + * - Adding column statement + - Generates an :ref:`ADD COLUMN` statement for adding columns to the selected table in the editing area. + * - Drop table statement + - Generates a ``DROP`` statement for the selected object in the editing area. + * - Table DDL + - Generates a DDL statement for the selected object in the editing area. To get the entire database DDL, click the |icon-ddl-edit| icon next to the database name in the tree root. 
+ * - DDL Optimizer + - The :ref:`DDL Optimizer` analyzes database tables and recommends possible optimizations. + +Optimizing Database Tables Using the DDL Optimizer +-------------------------------------------------- + +The **DDL Optimizer** tab analyzes database tables and recommends possible optimizations according to SQreamDB's best practices. + +As described in the previous table, you can access the DDL Optimizer by clicking the **additional options icon** and selecting **DDL Optimizer**. + +The following table describes the DDL Optimizer screen: + +.. list-table:: + :widths: 15 75 + :header-rows: 1 + + * - Element + - Description + * - Column area + - Shows the column **names** and **column types** from the selected table. You can scroll down or to the right/left for long column lists. + * - Optimization area + - Shows the number of rows to sample as the basis for running an optimization, the default setting (1,000,000) when running an optimization (this is also the overhead threshold used when analyzing ``TEXT`` fields), and the default percent buffer to add to ``TEXT`` lengths (10%). It also attempts to determine field nullability. + * - Run Optimizer + - Starts the optimization process. + +Clicking **Run Optimizer** adds a tab to the Statement panel showing the optimized results of the selected object. + +For more information, see :ref:`Optimization and Best Practices`. + +Executing Pre-Defined Queries from the System Queries Panel +----------------------------------------------------------- + +The **System Queries** panel lets you execute predefined queries and includes the following system query types: + +* **Catalog queries** - Used for analyzing table compression rates, users and permissions, etc. + +* **Admin queries** - Queries useful for SQreamDB database management. + +Clicking an item pastes the query into the Statement pane, and you can undo a previous operation by pressing **Ctrl + Z**. + +.. _studio_editor_statement_area: + +Writing Statements and Queries from the Statement Panel +======================================================= + +The multi-tabbed statement area is used for writing queries and statements, and is used in tandem with the toolbar. When writing and executing statements, you must first select a database from the **Database** dropdown menu in the toolbar. When you execute a statement, it passes through a series of statuses until completed. Knowing the status helps you with statement maintenance, and the statuses are shown in the **Results panel**. + +The auto-complete feature assists you when writing statements by suggesting statement options. + +The following table shows the statement statuses: + +.. list-table:: + :widths: 45 160 + :header-rows: 1 + + * - Status + - Description + * - Pending + - The statement is pending. + * - In queue + - The statement is waiting for execution. + * - Initializing + - The statement has entered execution checks. + * - Executing + - The statement is executing. + * - Statement stopped + - The statement has been stopped. + +You can add and name new tabs for each statement that you need to execute, and Studio preserves your created tabs when you switch between databases. You can add new tabs by clicking |icon-plus|, which creates a new tab to the right with a default name of SQL and an increasing number. This helps you keep track of your statements. 
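+For example, a single tab can hold a short, self-contained batch of statements that you execute together, as described in the next paragraph. A minimal sketch, assuming a hypothetical table ``t``: + +.. code-block:: sql + + CREATE TABLE t (x INT); + INSERT INTO t VALUES (1), (2), (3); + SELECT SUM(x) FROM t; + 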
+ +You can also rename the default tab name by double-clicking it and typing a new name, and write multiple statements in tandem in the same tab by separating them with semicolons (``;``). If more tabs are open than fit into the Statement Pane at the same time, the tab arrows are displayed. You can scroll through the tabs by clicking |icon-left| or |icon-right|, and close tabs by clicking |icon-close|. You can also close all tabs at once by clicking **Close all** located to the right of the tabs. + + + +.. Keyboard shortcuts .. ^^^^^^^^^^^^^^^^^^ + +.. :kbd:`Ctrl` + :kbd:`Enter` - Execute all queries in the statement area, or just the highlighted part of the query. + +.. :kbd:`Ctrl` + :kbd:`Space` - Auto-complete the current keyword + +.. :kbd:`Ctrl` + :kbd:`↑` - Switch to next tab. + +.. :kbd:`Ctrl` + :kbd:`↓` - Switch to previous tab + +.. _studio_editor_results: + + + +.. _studio_5.4.7_editor_results: + +.. _results_panel: + +Viewing Statement and Query Results from the Results Panel +========================================================== + +The results panel shows statement and query results. By default, only the first 10,000 results are returned, although you can modify this from the :ref:`studio_editor_toolbar`, as described above. By default, executing several statements together opens a separate results tab for each statement. Executing statements together executes them serially, and any failed statement cancels all subsequent executions. + +.. image:: /_static/images/results_panel.png + +The following is a brief description of the Results panel views highlighted in the figure above: + +.. list-table:: + :widths: 45 160 + :header-rows: 1 + + * - Element + - Description + * - :ref:`Results view` + - Lets you view search query results. + * - :ref:`Execution Details view` + - Lets you analyze your query for troubleshooting and optimization purposes. + * - :ref:`SQL view` + - Lets you see the SQL view. + + +.. _results_view: + + + +Searching Query Results in the Results View +------------------------------------------- + +The **Results view** lets you view search query results. + +From this view you can also do the following: + +* View the amount of time (in seconds) taken for a query to finish executing. +* Switch and scroll between tabs. +* Close all tabs at once. +* Enable keeping tabs by selecting **Keep tabs**. +* Sort column results. + +Saving Results to the Clipboard +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The **Save results to clipboard** function lets you save your results to the clipboard to paste into another text editor or into Excel for further analysis. + +.. _save_results_to_local_file: + +Saving Results to a Local File +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The **Save results to local file** function lets you save your search query results to a local file. Clicking **Save results to local file** downloads the contents of the Results panel to an Excel sheet. You can then copy and paste this content into other editors as needed. + +In the Results view you can also run parallel statements, as described in **Running Parallel Statements** below. + +.. _running_parallel_statements: + +Running Parallel Statements +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +While Studio's default functionality is to open a new tab for each executed statement, Studio supports running parallel statements in one statement tab. Running parallel statements requires using macros and is useful for advanced users. + +The following shows the syntax for running parallel statements: + +.. 
code-block:: sql + + @@ parallel + $$ + SELECT 1; + SELECT 2; + SELECT 3; + $$ + + + + +.. _execution_details_view: + +.. _execution_tree: + +Execution Details View +---------------------- + +Clicking **Execution Details View** displays the **Execution Tree**, which is a chronological tree of processes that occurred to execute your queries. The purpose of the Execution Tree is to analyze all aspects of your query for troubleshooting and optimization purposes, such as resolving queries with an exceptionally long runtime. + +.. note:: The **Execution Details View** button is enabled only when a query takes longer than five seconds. + + +.. contents:: + :local: + :depth: 1 + + +From this screen you can scroll in, out, and around the execution tree with the mouse to analyze all aspects of your query. You can navigate around the execution tree by dragging or by using the mini-map in the bottom right corner. + +.. image:: /_static/images/execution_tree_1.png + +You can also search for query data by pressing **Ctrl+F** or clicking the search icon |icon-search| in the search field in the top right corner and typing text. + +.. image:: /_static/images/search_field.png + +Pressing **Enter** takes you directly to the next result matching your search criteria, and pressing **Shift + Enter** takes you directly to the previous result. You can also search next and previous results using the up and down arrows. + +.. |icon-search| image:: /_static/images/studio_icon_search.png + :align: middle + +The nodes are color-coded based on the following: + +* **Slow nodes** - red +* **In progress nodes** - yellow +* **Completed nodes** - green +* **Pending nodes** - white +* **Currently selected node** - blue +* **Search result node** - purple (in the mini-map) + +The execution tree displays the same information as shown in the plain view in tree format. + +The Execution Tree tracks each phase of your query in real time as a vertical tree of nodes. Each node refers to an operation that occurred on the GPU or CPU. When a phase is completed, the next branch begins to its right until the entire query is complete. Joins are displayed as two parallel branches merged together in a node called **Join**, as shown in the figure above. The nodes are connected by a line indicating the number of rows passed from one node to the next. The width of the line indicates the number of rows on a logarithmic scale. + +Each node displays its **node ID**, its **type**, **table name** (if relevant), **status**, and **runtime**. The nodes are color-coded for easy identification. Green nodes indicate **completed nodes**, yellow indicates **nodes in progress**, and red indicates **slowest nodes**, typically joins, as shown below: + +.. image:: /_static/images/nodes.png + +Viewing Query Statistics +^^^^^^^^^^^^^^^^^^^^^^^^ + +The following statistical information is displayed in the top left corner, as shown in the figure above: + +* **Query Statistics**: + + * **Elapsed** - the total time taken for the query to complete. + * **Result rows** - the number of rows fetched. + * **Running nodes completion** + * **Total query completion** - the portion of the total execution tree that was executed (nodes marked green). + +* **Slowest Nodes** information is displayed in the top right corner in red text. Clicking the slowest node centers automatically on that node in the execution tree. + +You can also view the following **Node Statistics** in the top right corner for each individual node by clicking a node: + +.. 
list-table:: + :widths: 45 160 + :header-rows: 1 + + * - Element + - Description + * - Node type + - Shows the node type. + * - Status + - Shows the execution status. + * - Time + - The total time taken to execute. + * - Rows + - Shows the number of produced rows passed to the next node. + * - Chunks + - Shows the number of produced chunks. + * - Average rows per chunk + - Shows the average number of rows per chunk. + * - Table (for **ReadTable** and joins only) + - Shows the table name. + * - Write (for joins only) + - Shows the total data size written to the disk. + * - Read (for **ReadTable** and joins only) + - Shows the total data size read from the disk. + +Note that you can scroll the Node Statistics table. You can also download the execution plan table in .csv format by clicking the download arrow |icon-download| in the upper-right corner. + +.. |icon-download| image:: /_static/images/studio_icon_download.png + :align: middle + +Using the Plain View +^^^^^^^^^^^^^^^^^^^^ + +You can use the **Plain View** instead of viewing the execution tree by clicking **Plain View** |icon-plain| in the top right corner. The plain view displays the same information as shown in the execution tree in table format. + +.. |icon-plain| image:: /_static/images/studio_icon_plain.png + :align: middle + + + + +The plain view lets you view a query’s execution plan for monitoring purposes and highlights rows based on how long they ran relative to the entire query. + +This can be seen in the **timeSum** column as follows: + +* **Rows highlighted red** - longest runtime +* **Rows highlighted orange** - medium runtime +* **Rows highlighted yellow** - shortest runtime + + + +.. _sql_view: + +Viewing Wrapped Strings in the SQL View +--------------------------------------- + +The SQL View panel allows you to more easily view certain queries, such as a long string that appears on one line, by wrapping it so that you can see the entire string at once. It also reformats and organizes query syntax entered in the Statement panel for more easily locating particular segments of your queries. The SQL View is identical to the **Format SQL** feature in the Toolbar, allowing you to retain your originally constructed query while viewing a more intuitively structured snapshot of it. + +.. _save_results_to_clipboard: diff --git a/sqream_studio_5.4.3/getting_started.rst b/sqream_studio/getting_started.rst similarity index 64% rename from sqream_studio_5.4.3/getting_started.rst rename to sqream_studio/getting_started.rst index 3b9644cdc..e5f1903c7 100644 --- a/sqream_studio_5.4.3/getting_started.rst +++ b/sqream_studio/getting_started.rst @@ -1,61 +1,67 @@ -.. _getting_started: - -**************************** -Getting Started with SQream Acceleration Studio 5.4.3 -**************************** -Setting Up and Starting Studio ----------------- -Studio is included with all `dockerized installations of SQream DB `_. When starting Studio, it listens on the local machine on port 8080. - -Logging In to Studio ---------------- -**To log in to SQream Studio:** - -1. Open a browser to the host on **port 8080**. - - For example, if your machine IP address is ``192.168.0.100``, insert the IP address into the browser as shown below: - - .. code-block:: console - - $ http://192.168.0.100:8080 - -2. Fill in your SQream DB login credentials. These are the same credentials used for :ref:`sqream sql` or JDBC. - - When you sign in, the License Warning is displayed. 
- -Navigating Studio's Main Features ------------- -When you log in, you are automatically taken to the **Editor** screen. The Studio's main functions are displayed in the **Navigation** pane on the left side of the screen. - -From here you can navigate between the main areas of the Studio: - -.. list-table:: - :widths: 10 90 - :header-rows: 1 - - * - Element - - Description - * - :ref:`Dashboard` - - Lets you monitor system health and manage queues and workers. - * - :ref:`Editor` - - Lets you select databases, perform statement operations, and write and execute queries. - * - :ref:`Logs` - - Lets you view usage logs. - * - :ref:`Roles` - - Lets you create users and manage user permissions. - * - :ref:`Configuration` - - Lets you configure your instance of SQream. - -By clicking the user icon, you can also use it for logging out and viewing the following: - -* User information -* Connection type -* SQream version -* SQream Studio version -* License expiration date -* License storage capacity -* Log out - -.. _back_to_dashboard_5.4.3: - -.. _studio_dashboard_5.4.3: +.. _getting_started: + +*********************************************** +Getting Started with SQream Acceleration Studio +*********************************************** + +Setting Up and Starting Studio +------------------------------ + +When started, Studio listens on port 8080 on the local machine. + +Logging In to Studio +-------------------- + +1. Open a browser to the host on **port 8080**. + + For example, if your machine IP address is ``192.168.0.100``, insert the IP address into the browser as shown below: + + .. code-block:: console + + $ http://192.168.0.100:8080 + +2. Fill in your SQream DB login credentials. These are the same credentials used for :ref:`sqream sql` or JDBC. + + When you sign in, the License Warning is displayed. + +.. _monitoring_workers_and_services_from_the_dashboard: + +Navigating Studio's Main Features +--------------------------------- + +When you log in, you are automatically taken to the **Editor** screen. The Studio's main functions are displayed in the **Navigation** pane on the left side of the screen. + +From here you can navigate between the main areas of the Studio: + +.. list-table:: + :widths: 10 90 + :header-rows: 1 + + * - Element + - Description + * - :ref:`Editor` + - Lets you select databases, perform statement operations, and write and execute queries. + * - :ref:`Logs` + - Lets you view usage logs. + * - :ref:`Roles` + - Lets you create users and manage user permissions. + * - :ref:`Configuration` + - Lets you configure your instance of SQream. + +By clicking the user icon, you can view the following: + +* User information +* Connection type +* SQream version +* SQream Studio version +* License expiration date +* License storage capacity +* :ref:`Activity report` +* Log out + +.. _view_activity_report: + +View Activity Report +-------------------- + +The **View activity report** menu item enables you to monitor storage and resource usage, including GPUs, workers, and machines. You can select different time frames to view cluster activity and export the data as a PDF for use in financial records, briefings, or quarterly and yearly reports. \ No newline at end of file diff --git a/sqream_studio/index.rst b/sqream_studio/index.rst new file mode 100644 index 000000000..ad5e36729 --- /dev/null +++ b/sqream_studio/index.rst @@ -0,0 +1,17 @@ +.. 
_sqream_studio_: + +******************************** +Acceleration Studio +******************************** + +The SQreamDB Acceleration Studio 5.8.0 is a web-based client for use with SQreamDB. Studio provides users with all functionality available from the command line in an intuitive and easy-to-use format. This includes running statements, managing roles and permissions, and managing SQreamDB clusters. + +.. toctree:: + :maxdepth: 1 + :glob: + + getting_started + executing_statements_and_running_queries_from_the_editor + viewing_logs + creating_assigning_and_managing_roles_and_permissions + configuring_your_instance_of_sqream \ No newline at end of file diff --git a/sqream_studio_5.4.3/viewing_logs.rst b/sqream_studio/viewing_logs.rst similarity index 82% rename from sqream_studio_5.4.3/viewing_logs.rst rename to sqream_studio/viewing_logs.rst index 0a8350a45..8ca1a6186 100644 --- a/sqream_studio_5.4.3/viewing_logs.rst +++ b/sqream_studio/viewing_logs.rst @@ -1,122 +1,128 @@ -.. _viewing_logs: - -.. _logs_top_5.4.3: - -**************************** -Viewing Logs -**************************** -The **Logs** screen is used for viewing logs and includes the following elements: - -.. list-table:: - :widths: 15 75 - :header-rows: 1 - - * - Element - - Description - * - :ref:`Filter area` - - Lets you filter the data shown in the table. - * - :ref:`Query tab` - - Shows basic query information logs, such as query number and the time the query was run. - * - :ref:`Session tab` - - Shows basic session information logs, such as session ID and user name. - * - :ref:`System tab` - - Shows all system logs. - * - :ref:`Log lines tab` - - Shows the total amount of log lines. - - -.. _filter_5.4.3: - -Filtering Table Data -------------- -From the Logs tab, from the **FILTERS** area you can also apply the **TIMESPAN**, **ONLY ERRORS**, and additional filters (**Add**). The **Timespan** filter lets you select a timespan. The **Only Errors** toggle button lets you show all queries, or only queries that generated errors. The **Add** button lets you add additional filters to the data shown in the table. The **Filter** button applies the selected filter(s). - -Other filters require you to select an item from a dropdown menu: - -* INFO -* WARNING -* ERROR -* FATAL -* SYSTEM - -You can also export a record of all of your currently filtered logs in Excel format by clicking **Download** located above the Filter area. - -.. _queries_5.4.3: - -:ref:`Back to Viewing Logs` - - -Viewing Query Logs ----------- -The **QUERIES** log area shows basic query information, such as query number and the time the query was run. The number next to the title indicates the amount of queries that have been run. - -From the Queries area you can see and sort by the following: - -* Query ID -* Start time -* Query -* Compilation duration -* Execution duration -* Total duration -* Details (execution details, error details, successful query details) - -In the Queries table, you can click on the **Statement ID** and **Query** items to set them as your filters. In the **Details** column you can also access additional details by clicking one of the **Details** options for a more detailed explanation of the query. - -:ref:`Back to Viewing Logs` - -.. _sessions_5.4.3: - -Viewing Session Logs ----------- -The **SESSIONS** tab shows the sessions log table and is used for viewing activity that has occurred during your sessions. The number at the top indicates the amount of sessions that have occurred. 
- -From here you can see and sort by the following: - -* Timestamp -* Connection ID -* Username -* Client IP -* Login (Success or Failed) -* Duration (of session) -* Configuration Changes - -In the Sessions table, you can click on the **Timestamp**, **Connection ID**, and **Username** items to set them as your filters. - -:ref:`Back to Viewing Logs` - -.. _system_5.4.3: - -Viewing System Logs ---------- -The **SYSTEM** tab shows the system log table and is used for viewing all system logs. The number at the top indicates the amount of sessions that have occurred. Because system logs occur less frequently than queries and sessions, you may need to increase the filter timespan for the table to display any system logs. - -From here you can see and sort by the following: - -* Timestamp -* Log type -* Message - -In the Systems table, you can click on the **Timestamp** and **Log type** items to set them as your filters. In the **Message** column, you can also click on an item to show more information about the message. - -:ref:`Back to Viewing Logs` - -.. _log_lines_5.4.3: - -Viewing All Log Lines ---------- -The **LOG LINES** tab is used for viewing the total amount of log lines in a table. From here users can view a more granular breakdown of log information collected by Studio. The other tabs (QUERIES, SESSIONS, and SYSTEM) show a filtered form of the raw log lines. For example, the QUERIES tab shows an aggregation of several log lines. - -From here you can see and sort by the following: - -* Timestamp -* Message level -* Worker hostname -* Worker port -* Connection ID -* Database name -* User name -* Statement ID - -In the **LOG LINES** table, you can click on any of the items to set them as your filters. - -:ref:`Back to Viewing Logs` \ No newline at end of file +.. _viewing_logs: + +.. _logs_top_5.4.7: + +************ +Viewing Logs +************ + +The **Logs** screen is used for viewing logs and includes the following elements: + +.. list-table:: + :widths: 15 75 + :header-rows: 1 + + * - Element + - Description + * - :ref:`Filter area` + - Lets you filter the data shown in the table. + * - :ref:`Query tab` + - Shows basic query information logs, such as query number and the time the query was run. + * - :ref:`Session tab` + - Shows basic session information logs, such as session ID and user name. + * - :ref:`System tab` + - Shows all system logs. + * - :ref:`Log lines tab` + - Shows the total number of log lines. + + +.. _filter_5.4.7: + +Filtering Table Data +-------------------- + +From the **FILTERS** area of the Logs tab, you can apply the **TIMESPAN** and **ONLY ERRORS** filters, as well as additional filters (**Add**). The **Timespan** filter lets you select a timespan. The **Only Errors** toggle button lets you show all queries, or only queries that generated errors. The **Add** button lets you add additional filters to the data shown in the table. The **Filter** button applies the selected filter(s). + +Other filters require you to select an item from a dropdown menu: + +* INFO +* WARNING +* ERROR +* FATAL +* SYSTEM + +You can also export a record of all of your currently filtered logs in Excel format by clicking **Download** located above the Filter area. + +.. _queries_5.4.7: + +:ref:`Back to Viewing Logs` + + +Viewing Query Logs +------------------ + +The **QUERIES** log area shows basic query information, such as query number and the time the query was run. The number next to the title indicates the number of queries that have been run. 
+ +From the Queries area you can see and sort by the following: + +* Query ID +* Start time +* Query +* Compilation duration +* Execution duration +* Total duration +* Details (execution details, error details, successful query details) + +In the Queries table, you can click on the **Statement ID** and **Query** items to set them as your filters. In the **Details** column you can also access additional details by clicking one of the **Details** options for a more detailed explanation of the query. + +:ref:`Back to Viewing Logs` + +.. _sessions_5.4.7: + +Viewing Session Logs +-------------------- + +The **SESSIONS** tab shows the sessions log table and is used for viewing activity that has occurred during your sessions. The number at the top indicates the number of sessions that have occurred. + +From here you can see and sort by the following: + +* Timestamp +* Connection ID +* Username +* Client IP +* Login (Success or Failed) +* Duration (of session) +* Configuration Changes + +In the Sessions table, you can click on the **Timestamp**, **Connection ID**, and **Username** items to set them as your filters. + +:ref:`Back to Viewing Logs` + +.. _system_5.4.7: + +Viewing System Logs +------------------- + +The **SYSTEM** tab shows the system log table and is used for viewing all system logs. The number at the top indicates the number of system log entries recorded. Because system logs occur less frequently than queries and sessions, you may need to increase the filter timespan for the table to display any system logs. + +From here you can see and sort by the following: + +* Timestamp +* Log type +* Message + +In the Systems table, you can click on the **Timestamp** and **Log type** items to set them as your filters. In the **Message** column, you can also click on an item to show more information about the message. + +:ref:`Back to Viewing Logs` + +.. _log_lines_5.4.7: + +Viewing All Log Lines +--------------------- + +The **LOG LINES** tab is used for viewing the total number of log lines in a table. From here users can view a more granular breakdown of log information collected by Studio. The other tabs (QUERIES, SESSIONS, and SYSTEM) show a filtered form of the raw log lines. For example, the QUERIES tab shows an aggregation of several log lines. + +From here you can see and sort by the following: + +* Timestamp +* Message level +* Worker hostname +* Worker port +* Connection ID +* Database name +* User name +* Statement ID + +In the **LOG LINES** table, you can click on any of the items to set them as your filters. + +:ref:`Back to Viewing Logs` \ No newline at end of file diff --git a/sqream_studio_5.4.3/index.rst b/sqream_studio_5.4.3/index.rst deleted file mode 100644 index ac607b121..000000000 --- a/sqream_studio_5.4.3/index.rst +++ /dev/null @@ -1,19 +0,0 @@ -.. _sqream_studio_5.4.3: - -********************************** -SQream Acceleration Studio 5.4.3 -********************************** -The SQream Acceleration Studio is a web-based client for use with SQream. Studio provides users with all functionality available from the command line in an intuitive and easy-to-use format. This includes running statements, managing roles and permissions, and managing SQream clusters. - -This section describes how to use the SQream Accleration Studio version 5.4.3: - -.. 
toctree:: - :maxdepth: 1 - :glob: - - getting_started - monitoring_workers_and_services_from_the_dashboard - executing_statements_and_running_queries_from_the_editor - viewing_logs - creating_assigning_and_managing_roles_and_permissions - configuring_your_instance_of_sqream \ No newline at end of file diff --git a/sqream_studio_5.4.3/monitoring_workers_and_services_from_the_dashboard.rst b/sqream_studio_5.4.3/monitoring_workers_and_services_from_the_dashboard.rst deleted file mode 100644 index e30962f37..000000000 --- a/sqream_studio_5.4.3/monitoring_workers_and_services_from_the_dashboard.rst +++ /dev/null @@ -1,265 +0,0 @@ -.. _monitoring_workers_and_services_from_the_dashboard: - -.. _back_to_dashboard_5.4.3: - -**************************** -Monitoring Workers and Services from the Dashboard -**************************** -The **Dashboard** is used for the following: - -* Monitoring system health. -* Viewing, monitoring, and adding defined service queues. -* Viewing and managing worker status and add workers. - -The following is an image of the Dashboard: - -.. image:: /_static/images/dashboard.png - -You can only access the Dashboard if you signed in with a ``SUPERUSER`` role. - -The following is a brief description of the Dashboard panels: - -.. list-table:: - :widths: 10 25 65 - :header-rows: 1 - - * - No. - - Element - - Description - * - 1 - - :ref:`Services panel` - - Used for viewing and monitoring the defined service queues. - * - 2 - - :ref:`Workers panel` - - Monitors system health and shows each Sqreamd worker running in the cluster. - * - 3 - - :ref:`License information` - - Shows the remaining amount of days left on your license. - - -.. _data_storage_panel_5.4.3: - - - -:ref:`Back to Monitoring Workers and Services from the Dashboard` - -.. _services_panel_5.4.3: - -Subscribing to Workers from the Services Panel --------------------------- -Services are used to categorize and associate (also known as **subscribing**) workers to particular services. The **Service** panel is used for viewing, monitoring, and adding defined `service queues `_. - - - -The following is a brief description of each pane: - -.. list-table:: - :widths: 10 90 - :header-rows: 1 - - * - No. - - Description - * - 1 - - Adds a worker to the selected service. - * - 2 - - Shows the service name. - * - 3 - - Shows a trend graph of queued statements loaded over time. - * - 4 - - Adds a service. - * - 5 - - Shows the currently processed queries belonging to the service/total queries for that service in the system (including queued queries). - -Adding A Service -^^^^^^^^^^^^^^^^^^^^^ -You can add a service by clicking **+ Add** and defining the service name. - -.. note:: If you do not associate a worker with the new service, it will not be created. - -You can manage workers from the **Workers** panel. For more information about managing workers, see the following: - -* :ref:`Managing Workers from the Workers Panel` -* `Workers `_ - -:ref:`Back to Monitoring Workers and Services from the Dashboard` - -.. _workers_panel_5.4.3: - -Managing Workers from the Workers Panel ------------- -From the **Workers** panel you can do the following: - -* :ref:`View workers ` -* :ref:`Add a worker to a service` -* :ref:`View a worker's active query information` -* :ref:`View a worker's execution plan` - -.. _view_workers_5.4.3: - -Viewing Workers -^^^^^^^^ -The **Worker** panel shows each worker (``sqreamd``) running in the cluster. Each worker has a status bar that represents the status over time. 
The status bar is divided into 20 equal segments, showing the most dominant activity in that segment. - -From the **Scale** dropdown menu you can set the time scale of the displayed information -You can hover over segments in the status bar to see the date and time corresponding to each activity type: - -* **Idle** – the worker is idle and available for statements. -* **Compiling** – the worker is compiling a statement and is preparing for execution. -* **Executing** – the worker is executing a statement after compilation. -* **Stopped** – the worker was stopped (either deliberately or due to an error). -* **Waiting** – the worker was waiting on an object locked by another worker. - -.. _add_worker_to_service_5.4.3: - -Adding A Worker to A Service -^^^^^^^^^^^^^^^^^^^^^ -You can add a worker to a service by clicking the **add** button. - - - -Clicking the **add** button shows the selected service's workers. You can add the selected worker to the service by clicking **Add Worker**. Adding a worker to a service does not break associations already made between that worker and other services. - - -.. _view_worker_query_information_5.4.3: - -Viewing A Worker's Active Query Information -^^^^^^^^^^^^^^^^^^^^^ -You can view a worker's active query information by clicking **Queries**, which displays them in the selected service. - - -Each statement shows the **query ID**, **status**, **service queue**, **elapsed time**, **execution time**, and **estimated completion status**. In addition, each statement can be stopped or expanded to show its execution plan and progress. For more information on viewing a statement's execution plan and progress, see :ref:`Viewing a Worker's Execution Plan ` below. - -Viewing A Worker's Host Utilization -^^^^^^^^^^^^^^^^^^^^^ - -While viewing a worker's query information, clicking the **down arrow** expands to show the host resource utilization. - - - -The graphs show the resource utilization trends over time, and the **CPU memory** and **utilization** and the **GPU utilization** values on the right. You can hover over the graph to see more information about the activity at any point on the graph. - -Error notifications related to statements are displayed, and you can hover over them for more information about the error. - - -.. _view_worker_execution_plan_5.4.3: - -Viewing a Worker's Execution Plan -^^^^^^^^^^^^^^^^^^^^^ - -Clicking the ellipsis in a service shows the following additional options: - -* **Stop Query** - stops the query. -* **Show Execution Plan** - shows the execution plan as a table. The columns in the **Show Execution Plan** table can be sorted. - -For more information on the current query plan, see `SHOW_NODE_INFO `_. For more information on checking active sessions across the cluster, see `SHOW_SERVER_STATUS `_. - -.. include:: /reference/sql/sql_statements/monitoring_commands/show_server_status.rst - :start-line: 67 - :end-line: 84 - -Managing Worker Status -^^^^^^^^^^^^^^^^^^^^^ - -In some cases you may want to stop or restart workers for maintenance purposes. Each Worker line has a :kbd:`⋮` menu used for stopping, starting, or restarting workers. - - -Starting or restarting workers terminates all queries related to that worker. When you stop a worker, its background turns gray. - - - - -.. |icon-user| image:: /_static/images/studio_icon_user.png - :align: middle - -.. |icon-dots| image:: /_static/images/studio_icon_dots.png - :align: middle - -.. |icon-editor| image:: /_static/images/studio_icon_editor.png - :align: middle - -.. 
|icon-copy| image:: /_static/images/studio_icon_copy.png - :align: middle - -.. |icon-select| image:: /_static/images/studio_icon_select.png - :align: middle - -.. |icon-dots| image:: /_static/images/studio_icon_dots.png - :align: middle - -.. |icon-filter| image:: /_static/images/studio_icon_filter.png - :align: middle - -.. |icon-ddl-edit| image:: /_static/images/studio_icon_ddl_edit.png - :align: middle - -.. |icon-run-optimizer| image:: /_static/images/studio_icon_run_optimizer.png - :align: middle - -.. |icon-generate-create-statement| image:: /_static/images/studio_icon_generate_create_statement.png - :align: middle - -.. |icon-plus| image:: /_static/images/studio_icon_plus.png - :align: middle - -.. |icon-close| image:: /_static/images/studio_icon_close.png - :align: middle - -.. |icon-left| image:: /_static/images/studio_icon_left.png - :align: middle - -.. |icon-right| image:: /_static/images/studio_icon_right.png - :align: middle - -.. |icon-format-sql| image:: /_static/images/studio_icon_format.png - :align: middle - -.. |icon-download-query| image:: /_static/images/studio_icon_download_query.png - :align: middle - -.. |icon-open-query| image:: /_static/images/studio_icon_open_query.png - :align: middle - -.. |icon-execute| image:: /_static/images/studio_icon_execute.png - :align: middle - -.. |icon-stop| image:: /_static/images/studio_icon_stop.png - :align: middle - -.. |icon-dashboard| image:: /_static/images/studio_icon_dashboard.png - :align: middle - -.. |icon-expand| image:: /_static/images/studio_icon_expand.png - :align: middle - -.. |icon-scale| image:: /_static/images/studio_icon_scale.png - :align: middle - -.. |icon-expand-down| image:: /_static/images/studio_icon_expand_down.png - :align: middle - -.. |icon-add| image:: /_static/images/studio_icon_add.png - :align: middle - -.. |icon-add-worker| image:: /_static/images/studio_icon_add_worker.png - :align: middle - -.. |keep-tabs| image:: /_static/images/studio_keep_tabs.png - :align: middle - -:ref:`Back to Monitoring Workers and Services from the Dashboard` - - - -.. _license_information_5.4.3: - -License Information ---------------------- -The license information section shows the following: - - * The amount of time in days remaining on the license. - * The license storage capacity. - -.. image:: /_static/images/license_storage_capacity.png - - -:ref:`Back to Monitoring Workers and Services from the Dashboard` diff --git a/sqreamdb_on_aws/index.rst b/sqreamdb_on_aws/index.rst new file mode 100644 index 000000000..724745ec5 --- /dev/null +++ b/sqreamdb_on_aws/index.rst @@ -0,0 +1,134 @@ +.. _sqreamdb_on_aws: + +*************** +SQreamDB on AWS +*************** + +Private cloud deployment on AWS provides AWS's scalable infrastructure, flexible resource management, and cost-efficient services. + +The SQreamDB data processing and analytics acceleration platform on the AWS Marketplace is available `here `_. + +.. contents:: + :local: + :depth: 1 + +Before You Begin +================ + +It is essential that you have the following: + +* An AWS account +* An existing EC2 key pair +* AWS administrator permissions + +Usage Notes +=========== + +If you need to access data from an external bucket (one that is not part of the SQreamDB installation or used for ``tablespaceURL`` or ``tempPath``), you must manually grant access. Alternatively, you can copy data to and from the bucket using the ``AWS_ID`` and ``AWS_SECRET`` parameters, as sketched below. 
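+The statement below is a minimal sketch of the second approach. The table and bucket names are placeholders, and the exact ``COPY`` options may vary between SQreamDB versions: + +.. code-block:: sql + + -- Load CSV files from an external bucket using explicit, per-statement credentials + COPY my_table FROM 's3://my-external-bucket/data/*.csv' WITH AWS_ID '<access-key-id>' AWS_SECRET '<secret-access-key>'; + 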
+ +Configuration on AWS +==================== + +Under the **CloudFormation** > **Stacks** > **Specify stack details** tab, configure the following parameters: + +.. list-table:: + :widths: auto + :header-rows: 1 + + * - Parameter + - Description + * - ``environment`` + - The identifier used for naming all created resources + * - ``region`` + - The AWS region where the machines will be deployed. For optimal performance and cost efficiency, the S3 bucket storing SQreamDB data should be in the same region + * - ``availability_zones`` + - The availability zone within the specified region to place the machines. It should support GPU-enabled instances + * - ``key_name`` + - The name of an existing EC2 key pair in your AWS account, used to log into all created instances + * - ``office_cidrs`` + - A list of IP ranges (CIDRs) that are allowed access to the product and SSH access to the machines for security purposes + * - ``sqream_ami`` + - The Amazon Machine Image (AMI) pre-configured with SQreamDB. For SQreamDB 4.7, use ``ami-07d82637b2dab962e`` + * - ``ui_instance_type`` + - The instance type for the UI server. A machine with 16GB of RAM and moderate CPU resources, such as an ``r6i.2xlarge``, is recommended + * - ``md_instance_type`` + - The instance type for the metadata and server picker machine. A recommended starting point is an ``r6i.2xlarge``, but it may vary depending on your workload + * - ``workers_instance_type`` + - The instance type for the worker machines, which must be GPU-enabled. Recommended options include ``g6.8xlarge`` or ``g5.8xlarge`` + * - ``workers_count`` + - The number of worker machines to be created + * - ``tablespaceURL`` + - The location where the database will be stored, ideally in the same region as the instances to minimize costs. Important: A ``terraform_important`` directory will also be created here and should not be deleted unless the installation is completely removed. Deleting this directory prematurely may cause issues during upgrades or changes, leading to a full reinstall of the environment + * - ``tempPath`` + - The temporary storage path, usually set to ``/mnt/ephemeral``, though it can also point to an S3 bucket. This storage is used for running queries and is automatically cleared once the queries are completed + +License +======= + +#. Get a list of worker machines using your AWS console by filtering EC2 instances by the following (a CLI sketch follows this list): + + * The **worker** keyword + * The environment name you provided + * The `AWS instance ID of each EC2 instance `_ + +#. Send this list of machines to SQreamDB to generate a license. + +#. On each machine, install the license by: + + a. Connecting to the machine (see the Connecting to the Machine section below); access is only available from the IP ranges allowed by the ``office_cidrs`` parameter. + b. Creating a new file at the following path: + + .. code-block:: console + + sudo vi /etc/sqream/license.enc + +#. Place the license provided by SQreamDB in this file. + +#. Wait 1-2 minutes for the Worker to start automatically.
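+ +As referenced in step 1 above, the same list of worker machines can be retrieved with the AWS CLI. The tag filter below is an assumption, so adjust it to match the resource names generated from your ``environment`` value: + +.. code-block:: console + + $ aws ec2 describe-instances --filters "Name=tag:Name,Values=*worker*" --query "Reservations[].Instances[].InstanceId" --output text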
+ +Connecting to the Machine +========================= + +For security purposes, all machines are assigned private IP addresses. To enable connections, an EC2 endpoint is configured during installation. You can connect either via the AWS Console UI or through the CLI. + +Connecting Using the CLI +------------------------ + +You will need the machine ID, the AWS region, and your key file. + +Run the following command, replacing the angle-bracket placeholders: + +.. code-block:: console + + ssh -i <key-file> ec2-user@i-<instance-id> -o ProxyCommand="aws ec2-instance-connect open-tunnel --instance-id i-<instance-id> --region=<region>" + +Connecting to SQreamDB +====================== + +During installation, a Network Load Balancer (NLB) named ``sqream-<environment>-nlb`` (where ``<environment>`` is the identifier set during configuration) is created to route traffic to the various machines. After installation, SQreamDB is accessible via the NLB's DNS name. To reach the SQreamDB UI, use this DNS name as the URL in any browser, or connect to it from third-party software components. + +To get the URL using the AWS Console, copy the DNS name of the Network Load Balancer. + +Connection Troubleshooting +-------------------------- + +If you are unable to connect, ensure the following: + +* The license file has been generated and distributed to all Worker nodes. +* Your IP address is included in the ``office_cidrs`` parameter, as only the specified IPs are allowed access to the cluster. + +Adding a Signed Certificate to the Cluster +========================================== + +To add your signed certificate to the SQreamDB cluster, follow these steps: + +#. `Create a new listener `_ for the Network Load Balancer (``sqream-<environment>-nlb``) using the TLS protocol. + +#. A TLS target group that points to the UI machine has already been created for your convenience. You can use it for the new listener. The group name is ``sqream-<environment>-nlb-ui-443``. + +#. If you require a new DNS record, you can retrieve the public IP of the Network Load Balancer by either: + + * Running the ``host`` CLI command with the NLB's URL + + * Finding it in the AWS console \ No newline at end of file diff --git a/studio_login_5.3.2.png b/studio_login_5.3.2.png deleted file mode 100644 index e888aca13..000000000 Binary files a/studio_login_5.3.2.png and /dev/null differ diff --git a/third_party_tools/client_drivers/cpp/connect_test.cpp b/third_party_tools/client_drivers/cpp/connect_test.cpp deleted file mode 100644 index dc199f06b..000000000 --- a/third_party_tools/client_drivers/cpp/connect_test.cpp +++ /dev/null @@ -1,34 +0,0 @@ -// Trivial example - -#include - -#include "sqream.h" - -int main () { - - sqream::driver sqc; - - // Connection parameters: Hostname, Port, Use SSL, Username, Password, - // Database name, Service name - sqc.connect("127.0.0.1", 5000, false, "rhendricks", "Tr0ub4dor&3", - "raviga", "sqream"); - - // create table with data - run_direct_query(&sqc, "CREATE TABLE test_table (x int)"); - run_direct_query(&sqc, "INSERT INTO test_table VALUES (5), (6), (7), (8)"); - - // query it - sqc.new_query("SELECT * FROM test_table"); - sqc.execute_query(); - - // See the results - while (sqc.next_query_row()) { - std::cout << "Received: " << sqc.get_int(0) << std::endl; - } - - sqc.finish_query(); - - // Close the connection completely - sqc.disconnect(); - -} diff --git a/third_party_tools/client_drivers/cpp/index.rst b/third_party_tools/client_drivers/cpp/index.rst deleted file mode 100644 index fbbf6fb39..000000000 --- a/third_party_tools/client_drivers/cpp/index.rst +++ /dev/null @@ -1,87 +0,0 @@ -.. _cpp_native: - -************************* -C++ Driver -************************* - -The SQream DB C++ driver allows C++ programs and tools to connect to SQream DB. - -This tutorial shows how to write a C++ program that uses this driver. - -.. contents:: In this topic: - :depth: 2 - :local: - - -Installing the C++ driver -================================== - -Prerequisites ----------------- - -The SQream DB C++ driver was built on 64-bit Linux, and is designed to work with RHEL 7 and Ubuntu 16.04 and newer. 
- -Getting the library ---------------------- - -The C++ driver is provided as a tarball containing the compiled ``libsqream.so`` file and a header ``sqream.h``. Get the driver from the `SQream Drivers page `_. The library can be integrated into your C++-based applications or projects. - - -Extract the tarball archive ------------------------------ - -Extract the library files from the tarball - -.. code-block:: console - - $ tar xf libsqream-3.0.tar.gz - -Examples -============================================== - -Assuming there is a SQream DB worker to connect to, we'll connect to it using the application and run some statements. - -Testing the connection to SQream DB --------------------------------------------- - -Download this file by right clicking and saving to your computer :download:`connect_test.cpp `. - -.. literalinclude:: connect_test.cpp - :language: cpp - :caption: Connect to SQream DB - :linenos: - - -Compiling and running the application -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To build this code, place the library and header file in ./libsqream-3.0/ and run - -.. code-block:: console - - $ g++ -Wall -Ilibsqream-3.0 -Llibsqream-3.0 -lsqream connect_test.cpp -o connect_test - $ ./connect_test - -Modify the ``-I`` and ``-L`` arguments to match the ``.so`` library and ``.h`` file if they are in another directory. - -Creating a table and inserting values --------------------------------------------- - -Download this file by right clicking and saving to your computer :download:`insert_test.cpp `. - -.. literalinclude:: insert_test.cpp - :language: cpp - :caption: Inserting data to a SQream DB table - :linenos: - - -Compiling and running the application -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To build this code, use - -.. code-block:: console - - $ g++ -Wall -Ilibsqream-3.0 -Llibsqream-3.0 -lsqream insert_test.cpp -o insert_test - $ ./insert_test - diff --git a/third_party_tools/client_drivers/cpp/insert_test.cpp b/third_party_tools/client_drivers/cpp/insert_test.cpp deleted file mode 100644 index 8a16618a4..000000000 --- a/third_party_tools/client_drivers/cpp/insert_test.cpp +++ /dev/null @@ -1,39 +0,0 @@ -// Insert with parameterized statement example - -#include - -#include "sqream.h" - -int main () { - - sqream::driver sqc; - - // Connection parameters: Hostname, Port, Use SSL, Username, Password, - // Database name, Service name - sqc.connect("127.0.0.1", 5000, false, "rhendricks", "Tr0ub4dor&3", - "raviga", "sqream"); - - run_direct_query(&sqc, - "CREATE TABLE animals (id INT NOT NULL, name VARCHAR(10) NOT NULL)"); - - // prepare the statement - sqc.new_query("INSERT INTO animals VALUES (?, ?)"); - sqc.execute_query(); - - // Data to insert - int row0[] = {1,2,3}; - std::string row1[] = {"Dog","Cat","Possum"}; - int len = sizeof(row0)/sizeof(row0[0]); - - for (int i = 0; i < len; ++i) { - sqc.set_int(0, row0[i]); - sqc.set_varchar(1, row1[i]); - sqc.next_query_row(); - } - - // This commits the insert - sqc.finish_query(); - - sqc.disconnect(); - -} diff --git a/third_party_tools/client_drivers/index.rst b/third_party_tools/client_drivers/index.rst deleted file mode 100644 index 2b486d47f..000000000 --- a/third_party_tools/client_drivers/index.rst +++ /dev/null @@ -1,110 +0,0 @@ -.. _client_drivers: - -************************************ -Client Drivers for |latest_version| -************************************ - -The guides on this page describe how to use the Sqream DB client drivers and client applications with SQream. 
- -Client Driver Downloads -============================= - -All Operating Systems ---------------------------- -The following are applicable to all operating systems: - -.. _jdbc: - -* **JDBC** - recommended installation via ``mvn``: - - * `JDBC .jar file `_ - sqream-jdbc-4.5.3 (.jar) - * `JDBC driver `_ - - -.. _python: - -* **Python** - Recommended installation via ``pip``: - - * `Python .tar file `_ - pysqream v3.1.3 (.tar.gz) - * `Python driver `_ - - -.. _nodejs: - -* **Node.JS** - Recommended installation via ``npm``: - - * `Node.JS `_ - sqream-v4.2.4 (.tar.gz) - * `Node.JS driver `_ - - -.. _tableau_connector: - -* **Tableau**: - - * `Tableau connector `_ - SQream (.taco) - * `Tableau manual installation `_ - - -.. _powerbi_connector: - -* **Power BI**: - - * `Power BI PowerQuery connector `_ - SQream (.mez) - * `Power BI manual installation `_ - - -Windows --------------- -The following are applicable to Windows: - -* **ODBC installer** - SQream Drivers v2020.2.0, with Tableau customizations. Please contact your `Sqream represenative `_ for this installer. - - For more information on installing and configuring ODBC on Windows, see :ref:`Install and configure ODBC on Windows `. - - -* **Net driver** - `SQream .Net driver v3.0.2 `_ - - - -Linux --------------- -The following are applicable to Linux: - -* `SQream SQL (x86_64) `_ - sqream-sql-v2020.1.1_stable.x86_64.tar.gz -* `Sqream SQL CLI Reference `_ - Interactive command-line SQL client for Intel-based machines - - :: - -* `SQream SQL*(IBM POWER9) `_ - sqream-sql-v2020.1.1_stable.ppc64le.tar.gz -* `Sqream SQL CLI Reference `_ - Interactive command-line SQL client for IBM POWER9-based machines - - :: - -* ODBC Installer - Please contact your SQream representative for this installer. - - :: - -* C++ connector - `libsqream-4.0 `_ -* `C++ shared object library `_ - - -.. toctree:: - :maxdepth: 4 - :caption: Client Driver Documentation: - :titlesonly: - - jdbc/index - python/index - nodejs/index - odbc/index - cpp/index - - - -.. rubric:: Need help? - -If you couldn't find what you're looking for, we're always happy to help. Visit `SQream's support portal `_ for additional support. - -.. rubric:: Looking for older drivers? - -If you're looking for an older version of SQream DB drivers, versions 1.10 through 2019.2.1 are available at https://sqream.com/product/client-drivers/. \ No newline at end of file diff --git a/third_party_tools/client_drivers/jdbc/index.rst b/third_party_tools/client_drivers/jdbc/index.rst deleted file mode 100644 index 42a04548f..000000000 --- a/third_party_tools/client_drivers/jdbc/index.rst +++ /dev/null @@ -1,162 +0,0 @@ -.. _java_jdbc: - -************************* -JDBC -************************* - -The SQream DB JDBC driver allows many Java applications and tools connect to SQream DB. -This tutorial shows how to write a Java application using the JDBC interface. - -The JDBC driver requires Java 1.8 or newer. - -.. contents:: In this topic: - :local: - -Installing the JDBC driver -================================== - -Prerequisites ----------------- - -The SQream DB JDBC driver requires Java 1.8 or newer. We recommend either Oracle Java or OpenJDK. - -**Oracle Java** - -Download and install Java 8 from Oracle for your platform - -https://www.java.com/en/download/manual.jsp - -**OpenJDK** - -For Linux and BSD, see https://openjdk.java.net/install/ - -For Windows, SQream recommends Zulu 8 https://www.azul.com/downloads/zulu-community/?&version=java-8-lts&architecture=x86-64-bit&package=jdk - -.. 
_get_jdbc_jar: - -Getting the JAR file ---------------------- - -The JDBC driver is provided as a zipped JAR file, available for download from the :ref:`client drivers download page`. This JAR file can integrate into your Java-based applications or projects. - - -Extract the zip archive -------------------------- - -Extract the JAR file from the zip archive - -.. code-block:: console - - $ unzip sqream-jdbc-4.3.0.zip - -Setting up the Class Path ----------------------------- - -To use the driver, the JAR named ``sqream-jdbc-.jar`` (for example, ``sqream-jdbc-4.3.0.jar``) needs to be included in the class path, either by putting it in the ``CLASSPATH`` environment variable, or by using flags on the relevant Java command line. - -For example, if the JDBC driver has been unzipped to ``/home/sqream/sqream-jdbc-4.3.0.jar``, the application should be run as follows: - -.. code-block:: console - - $ export CLASSPATH=/home/sqream/sqream-jdbc-4.3.0.jar:$CLASSPATH - $ java my_java_app - -An alternative method is to pass ``-classpath`` to the Java executable: - -.. code-block:: console - - $ java -classpath .:/home/sqream/sqream-jdbc-4.3.0.jar my_java_app - - -Connect to SQream DB with a JDBC application -============================================== - -Driver class --------------- - -Use ``com.sqream.jdbc.SQDriver`` as the driver class in the JDBC application. - - -.. _connection_string: - -Connection string --------------------- - -JDBC drivers rely on a connection string. Use the following syntax for SQream DB - -.. code-block:: text - - jdbc:Sqream:///;user=;password=sqream;[; ...] - -Connection parameters -^^^^^^^^^^^^^^^^^^^^^^^^ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Item - - Optional - - Default - - Description - * - ```` - - ✗ - - None - - Hostname and port of the SQream DB worker. For example, ``127.0.0.1:5000``, ``sqream.mynetwork.co:3108`` - * - ```` - - ✗ - - None - - Database name to connect to. For example, ``master`` - * - ``username=`` - - ✗ - - None - - Username of a role to use for connection. For example, ``username=rhendricks`` - * - ``password=`` - - ✗ - - None - - Specifies the password of the selected role. For example, ``password=Tr0ub4dor&3`` - * - ``service=`` - - ✓ - - ``sqream`` - - Specifices service queue to use. For example, ``service=etl`` - * - ```` - - ✓ - - ``false`` - - Specifies SSL for this connection. For example, ``ssl=true`` - * - ```` - - ✓ - - ``true`` - - Connect via load balancer (use only if exists, and check port). - -Connection string examples -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For a SQream DB cluster with load balancer and no service queues, with SSL - -.. code-block:: text - - jdbc:Sqream://sqream.mynetwork.co:3108/master;user=rhendricks;password=Tr0ub4dor&3;ssl=true;cluster=true - -Minimal example for a local, standalone SQream DB - -.. code-block:: text - - jdbc:Sqream://127.0.0.1:5000/master;user=rhendricks;password=Tr0ub4dor&3 - -For a SQream DB cluster with load balancer and a specific service queue named ``etl``, to the database named ``raviga`` - -.. code-block:: text - - jdbc:Sqream://sqream.mynetwork.co:3108/raviga;user=rhendricks;password=Tr0ub4dor&3;cluster=true;service=etl - - -Sample Java program --------------------- - -Download this file by right clicking and saving to your computer :download:`sample.java `. - -.. 
literalinclude:: sample.java - :language: java - :caption: JDBC application sample - :linenos: - diff --git a/third_party_tools/client_drivers/python/api-reference.rst b/third_party_tools/client_drivers/python/api-reference.rst deleted file mode 100644 index 28e1205e6..000000000 --- a/third_party_tools/client_drivers/python/api-reference.rst +++ /dev/null @@ -1,191 +0,0 @@ -.. _pysqream_api_reference: - -************************* -pysqream API reference -************************* - -The SQream Python connector allows Python programs to connect to SQream DB. - -pysqream conforms to Python DB-API specifications `PEP-249 `_ - - -The main module is pysqream, which contains the :py:meth:`Connection` class. - -.. method:: connect(host, port, database, username, password, clustered = False, use_ssl = False, service='sqream', reconnect_attempts=3, reconnect_interval=10) - - Creates a new :py:meth:`Connection` object and connects to SQream DB. - - host - SQream DB hostname or IP - - port - SQream DB port - - database - database name - - username - Username to use for connection - - password - Password for ``username`` - - clustered - Connect through load balancer, or direct to worker (Default: false - direct to worker) - - use_ssl - use SSL connection (default: false) - - service - Optional service queue (default: 'sqream') - - reconnect_attempts - Number of reconnection attempts to attempt before closing the connection - - reconnect_interval - Time in seconds between each reconnection attempt - -.. class:: Connection - - .. attribute:: arraysize - - Specifies the number of rows to fetch at a time with :py:meth:`~Connection.fetchmany`. Defaults to 1 - one row at a time. - - .. attribute:: rowcount - - Unused, always returns -1. - - .. attribute:: description - - Read-only attribute that contains result set metadata. - - This attribute is populated after a statement is executed. - - .. list-table:: - :widths: auto - :header-rows: 1 - - * - Value - - Description - * - ``name`` - - Column name - * - ``type_code`` - - Internal type code - * - ``display_size`` - - Not used - same as ``internal_size`` - * - ``internal_size`` - - Data size in bytes - * - ``precision`` - - Precision of numeric data (not used) - * - ``scale`` - - Scale for numeric data (not used) - * - ``null_ok`` - - Specifies if ``NULL`` values are allowed for this column - - .. method:: execute(self, query, params=None) - - Execute a statement. - - Parameters are not supported - - self - :py:meth:`Connection` - - query - statement or query text - - params - Unused - - .. method:: executemany(self, query, rows_or_cols=None, data_as='rows', amount=None) - - Prepares a statement and executes it against all parameter sequences found in ``rows_or_cols``. - - self - :py:meth:`Connection` - - query - INSERT statement - - rows_or_cols - Data buffer to insert. This should be a sequence of lists or tuples. - - data_as - (Optional) Read data as rows or columns - - amount - (Optional) count of rows to insert - - .. method:: close(self) - - Close a statement and connection. - After a statement is closed, it must be reopened by creating a new cursor. - - self - :py:meth:`Connection` - - .. method:: cursor(self) - - Create a new :py:meth:`Connection` cursor. - - We recommend creating a new cursor for every statement. - - self - :py:meth:`Connection` - - .. method:: fetchall(self, data_as='rows') - - Fetch all remaining records from the result set. - - An empty sequence is returned when no more rows are available. 
- - self - :py:meth:`Connection` - - data_as - (Optional) Read data as rows or columns - - .. method:: fetchone(self, data_as='rows') - - Fetch one record from the result set. - - An empty sequence is returned when no more rows are available. - - self - :py:meth:`Connection` - - data_as - (Optional) Read data as rows or columns - - - .. method:: fetchmany(self, size=[Connection.arraysize], data_as='rows') - - Fetches the next several rows of a query result set. - - An empty sequence is returned when no more rows are available. - - self - :py:meth:`Connection` - - size - Number of records to fetch. If not set, fetches :py:obj:`Connection.arraysize` (1 by default) records - - data_as - (Optional) Read data as rows or columns - - .. method:: __iter__() - - Makes the cursor iterable. - - -.. attribute:: apilevel = '2.0' - - String constant stating the supported API level. The connector supports API "2.0". - -.. attribute:: threadsafety = 1 - - Level of thread safety the interface supports. pysqream currently supports level 1, which states that threads can share the module, but not connections. - -.. attribute:: paramstyle = 'qmark' - - The placeholder marker. Set to ``qmark``, which is a question mark (``?``). diff --git a/third_party_tools/client_drivers/python/index.rst b/third_party_tools/client_drivers/python/index.rst deleted file mode 100644 index 1c69752d7..000000000 --- a/third_party_tools/client_drivers/python/index.rst +++ /dev/null @@ -1,502 +0,0 @@ -.. _pysqream: - -************************* -Python (pysqream) -************************* - -The SQream Python connector is a set of packages that allows Python programs to connect to SQream DB. - -* ``pysqream`` is a pure Python connector. It can be installed with ``pip`` on any operating system, including Linux, Windows, and macOS. - -* ``pysqream-sqlalchemy`` is a SQLAlchemy dialect for ``pysqream`` - -The connector supports Python 3.6.5 and newer. - -The base ``pysqream`` package conforms to Python DB-API specifications `PEP-249 `_. - -.. contents:: In this topic: - :local: - -Installing the Python connector -================================== - -Prerequisites ----------------- - -1. Python -^^^^^^^^^^^^ - -The connector requires Python 3.6.5 or newer. To verify your version of Python: - -.. code-block:: console - - $ python --version - Python 3.7.3 - - -.. note:: If both Python 2.x and 3.x are installed, you can run ``python3`` and ``pip3`` instead of ``python`` and ``pip`` respectively for the rest of this guide - -.. warning:: If you're running on an older version, ``pip`` will fetch an older version of ``pysqream``, with version <3.0.0. This version is currently not supported. - -2. PIP -^^^^^^^^^^^^ -The Python connector is installed via ``pip``, the Python package manager and installer. - -We recommend upgrading to the latest version of ``pip`` before installing. To verify that you are on the latest version, run the following command: - -.. code-block:: console - - $ python -m pip install --upgrade pip - Collecting pip - Downloading https://files.pythonhosted.org/packages/00/b6/9cfa56b4081ad13874b0c6f96af8ce16cfbc1cb06bedf8e9164ce5551ec1/pip-19.3.1-py2.py3-none-any.whl (1.4MB) - |████████████████████████████████| 1.4MB 1.6MB/s - Installing collected packages: pip - Found existing installation: pip 19.1.1 - Uninstalling pip-19.1.1: - Successfully uninstalled pip-19.1.1 - Successfully installed pip-19.3.1 - -.. 
note:: - * On macOS, you may want to use virtualenv to install Python and the connector, to ensure compatibility with the built-in Python environment - * If you encounter an error including ``SSLError`` or ``WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.`` - please be sure to reinstall Python with SSL enabled, or use virtualenv or Anaconda. - -3. OpenSSL for Linux -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Some distributions of Python do not include OpenSSL. The Python connector relies on OpenSSL for secure connections to SQream DB. - -* To install OpenSSL on RHEL/CentOS - - .. code-block:: console - - $ sudo yum install -y libffi-devel openssl-devel - -* To install OpenSSL on Ubuntu - - .. code-block:: console - - $ sudo apt-get install libssl-dev libffi-dev -y - -4. Cython (optional) -^^^^^^^^^^^^^^^^^^^^^^^^ - -Optional but highly recommended is Cython, which improves performance of Python applications. - - .. code-block:: console - - $ pip install cython - -Install via pip ------------------ - -The Python connector is available via `PyPi `_. - -Install the connector with ``pip``: - -.. code-block:: console - - $ pip install pysqream pysqream-sqlalchemy - -``pip`` will automatically install all necessary libraries and modules. - -Upgrading an existing installation --------------------------------------- - -The Python drivers are updated periodically. -To upgrade an existing pysqream installation, use pip's ``-U`` flag. - -.. code-block:: console - - $ pip install pysqream pysqream-sqlalchemy -U - - -Validate the installation ------------------------------ - -Create a file called ``test.py``, containing the following: - -.. literalinclude:: test.py - :language: python - :caption: pysqream Validation Script - :linenos: - -Make sure to replace the parameters in the connection with the respective parameters for your SQream DB installation. - -Run the test file to verify that you can connect to SQream DB: - -.. code-block:: console - - $ python test.py - Version: v2020.1 - -If all went well, you are now ready to build an application using the SQream DB Python connector! - -If any connection error appears, verify that you have access to a running SQream DB and that the connection parameters are correct. - -SQLAlchemy examples -======================== - -SQLAlchemy is an ORM for Python. - -When you install the SQream DB dialect (``pysqream-sqlalchemy``) you can use frameworks like Pandas, TensorFlow, and Alembic to query SQream DB directly. - -A simple connection example ---------------------------------- - -.. code-block:: python - - import sqlalchemy as sa - from sqlalchemy.engine.url import URL - - engine_url = URL('sqream' - , username='rhendricks' - , password='secret_passwor" - , host='localhost' - , port=5000 - , database='raviga' - , query={'use_ssl': False}) - - engine = sa.create_engine(engine_url) - - res = engine.execute('create table test (ints int)') - res = engine.execute('insert into test values (5), (6)') - res = engine.execute('select * from test') - -Pulling a table into Pandas ---------------------------------- - -In this example, we use the URL method to create the connection string. - -.. 
code-block:: python - - import sqlalchemy as sa - import pandas as pd - from sqlalchemy.engine.url import URL - - - engine_url = URL('sqream' - , username='rhendricks' - , password='secret_passwor" - , host='localhost' - , port=5000 - , database='raviga' - , query={'use_ssl': False}) - - engine = sa.create_engine(engine_url) - - table_df = pd.read_sql("select * from nba", con=engine) - - -API Examples -=============== - -Explaining the connection example ---------------------------------------- - -First, import the package and create a connection - -.. code-block:: python - - # Import pysqream package - - import pysqream - - """ - Connection parameters include: - * IP/Hostname - * Port - * database name - * username - * password - * Connect through load balancer, or direct to worker (Default: false - direct to worker) - * use SSL connection (default: false) - * Optional service queue (default: 'sqream') - """ - - # Create a connection object - - con = pysqream.connect(host='127.0.0.1', port=3108, database='raviga' - , username='rhendricks', password='Tr0ub4dor&3' - , clustered=True) - -Then, run a query and fetch the results - -.. code-block:: python - - cur = con.cursor() # Create a new cursor - # Prepare and execute a query - cur.execute('select show_version()') - - result = cur.fetchall() # `fetchall` gets the entire data set - - print (f"Version: {result[0][0]}") - -This should print the SQream DB version. For example ``v2020.1``. - -Finally, we will close the connection - -.. code-block:: python - - con.close() - -Using the cursor --------------------------------------------- - -The DB-API specification includes several methods for fetching results from the cursor. - -We will use the ``nba`` example. Here's a peek at the table contents: - -.. csv-table:: nba - :file: nba-t10.csv - :widths: auto - :header-rows: 1 - -Like before, we will import the library and create a :py:meth:`~Connection`, followed by :py:meth:`~Connection.execute` on a simple ``SELECT *`` query. - -.. code-block:: python - - import pysqream - con = pysqream.connect(host='127.0.0.1', port=3108, database='master' - , username='rhendricks', password='Tr0ub4dor&3' - , clustered=True) - - cur = con.cursor() # Create a new cursor - # The select statement: - statement = 'SELECT * FROM nba' - cur.execute(statement) - -After executing the statement, we have a :py:meth:`Connection` cursor object waiting. A cursor is iterable, meaning that everytime we fetch, it advances the cursor to the next row. - -Use :py:meth:`~Connection.fetchone` to get one record at a time: - -.. code-block:: python - - first_row = cur.fetchone() # Fetch one row at a time (first row) - - second_row = cur.fetchone() # Fetch one row at a time (second row) - -To get several rows at a time, use :py:meth:`~Connection.fetchmany`: - -.. code-block:: python - - # executing `fetchone` twice is equivalent to this form: - third_and_fourth_rows = cur.fetchmany(2) - -To get all rows at once, use :py:meth:`~Connection.fetchall`: - -.. code-block:: python - - # To get all rows at once, use `fetchall` - remaining_rows = cur.fetchall() - - # Close the connection when done - con.close() - -Here are the contents of the row variables we used: - -.. 
code-block:: pycon - - >>> print(first_row) - ('Avery Bradley', 'Boston Celtics', 0, 'PG', 25, '6-2', 180, 'Texas', 7730337) - >>> print(second_row) - ('Jae Crowder', 'Boston Celtics', 99, 'SF', 25, '6-6', 235, 'Marquette', 6796117) - >>> print(third_and_fourth_rows) - [('John Holland', 'Boston Celtics', 30, 'SG', 27, '6-5', 205, 'Boston University', None), ('R.J. Hunter', 'Boston Celtics', 28, 'SG', 22, '6-5', 185, 'Georgia State', 1148640)] - >>> print(remaining_rows) - [('Jonas Jerebko', 'Boston Celtics', 8, 'PF', 29, '6-10', 231, None, 5000000), ('Amir Johnson', 'Boston Celtics', 90, 'PF', 29, '6-9', 240, None, 12000000), ('Jordan Mickey', 'Boston Celtics', 55, 'PF', 21, '6-8', 235, 'LSU', 1170960), ('Kelly Olynyk', 'Boston Celtics', 41, 'C', 25, '7-0', 238, 'Gonzaga', 2165160), - [...] - -.. note:: Calling a fetch command after all rows have been fetched will return an empty array (``[]``). - -Reading result metadata ----------------------------- - -When executing a statement, the connection object also contains metadata about the result set (e.g.column names, types, etc). - -The metadata is stored in the :py:attr:`Connection.description` object of the cursor. - -.. code-block:: pycon - - >>> import pysqream - >>> con = pysqream.connect(host='127.0.0.1', port=3108, database='master' - ... , username='rhendricks', password='Tr0ub4dor&3' - ... , clustered=True) - >>> cur = con.cursor() - >>> statement = 'SELECT * FROM nba' - >>> cur.execute(statement) - - >>> print(cur.description) - [('Name', 'STRING', 24, 24, None, None, True), ('Team', 'STRING', 22, 22, None, None, True), ('Number', 'NUMBER', 1, 1, None, None, True), ('Position', 'STRING', 2, 2, None, None, True), ('Age (as of 2018)', 'NUMBER', 1, 1, None, None, True), ('Height', 'STRING', 4, 4, None, None, True), ('Weight', 'NUMBER', 2, 2, None, None, True), ('College', 'STRING', 21, 21, None, None, True), ('Salary', 'NUMBER', 4, 4, None, None, True)] - -To get a list of column names, iterate over the ``description`` list: - -.. code-block:: pycon - - >>> [ i[0] for i in cur.description ] - ['Name', 'Team', 'Number', 'Position', 'Age (as of 2018)', 'Height', 'Weight', 'College', 'Salary'] - -Loading data into a table ---------------------------- - -This example loads 10,000 rows of dummy data to a SQream DB instance - -.. 
code-block:: python - - import pysqream - from datetime import date, datetime - from time import time - - con = pysqream.connect(host='127.0.0.1', port=3108, database='master' - , username='rhendricks', password='Tr0ub4dor&3' - , clustered=True) - - # Create a table for loading - create = 'create or replace table perf (b bool, t tinyint, sm smallint, i int, bi bigint, f real, d double, s varchar(12), ss text, dt date, dtt datetime)' - con.execute(create) - - # After creating the table, we can load data into it with the INSERT command - - # Create dummy data which matches the table we created - data = (False, 2, 12, 145, 84124234, 3.141, -4.3, "Marty McFly" , u"キウイは楽しい鳥です" , date(2019, 12, 17), datetime(1955, 11, 4, 1, 23, 0, 0)) - - - row_count = 10**4 - - # Get a new cursor - cur = con.cursor() - insert = 'insert into perf values (?,?,?,?,?,?,?,?,?,?,?)' - start = time() - cur.executemany(insert, [data] * row_count) - print (f"Total insert time for {row_count} rows: {time() - start} seconds") - - # Close this cursor - cur.close() - - # Verify that the data was inserted correctly - # Get a new cursor - cur = con.cursor() - cur.execute('select count(*) from perf') - result = cur.fetchall() # `fetchall` collects the entire data set - print (f"Count of inserted rows: {result[0][0]}") - - # When done, close the cursor - cur.close() - - # Close the connection - con.close() - -Reading data from a CSV file for load into a table ----------------------------------------------------------- - -We will write a helper function to create an :ref:`insert` statement, by reading an existing table's metadata. - -.. code-block:: python - - import pysqream - import datetime - - def insert_from_csv(cur, table_name, csv_filename, field_delimiter = ',', null_markers = []): - """ - We will first ask SQream DB for some table information. - This is important for understanding the number of columns, and will help - to create a matching INSERT statement - """ - - column_info = cur.execute(f"SELECT * FROM {table_name} LIMIT 0").description - - - def parse_datetime(v): - try: - return datetime.datetime.strptime(row[i], '%Y-%m-%d %H:%M:%S.%f') - except ValueError: - try: - return datetime.datetime.strptime(row[i], '%Y-%m-%d %H:%M:%S') - except ValueError: - return datetime.datetime.strptime(row[i], '%Y-%m-%d') - - # Create enough placeholders (`?`) for the INSERT query string - qstring = ','.join(['?'] * len(column_info)) - insert_statement = f"insert into {table_name} values ({qstring})" - - # Open the CSV file - with open(csv_filename, mode='r') as csv_file: - csv_reader = csv.reader(csv_file, delimiter=field_delimiter) - - # Execute the INSERT statement with the CSV data - cur.executemany(insert_statement, [row for row in csv_reader]) - - - con = pysqream.connect(host='127.0.0.1', port=3108, database='master' - , username='rhendricks', password='Tr0ub4dor&3' - , clustered=True) - - cur = con.cursor() - insert_from_csv(cur, 'nba', 'nba.csv', field_delimiter = ',', null_markers = []) - - con.close() - - -Using SQLAlchemy ORM to create tables and fill them with data ------------------------------------------------------------------------ - -You can also use the ORM to create tables and insert data to them from Python objects. - -For example: - -.. 
code-block:: python - - import sqlalchemy as sa - import pandas as pd - from sqlalchemy.engine.url import URL - - - engine_url = URL('sqream' - , username='rhendricks' - , password='secret_passwor" - , host='localhost' - , port=5000 - , database='raviga' - , query={'use_ssl': False}) - - engine = sa.create_engine(engine_url) - - # Build a metadata object and bind it - - metadata = sa.MetaData() - metadata.bind = engine - - # Create a table in the local metadata - - employees = sa.Table( - 'employees' - , metadata - , sa.Column('id', sa.Integer) - , sa.Column('name', sa.VARCHAR(32)) - , sa.Column('lastname', sa.VARCHAR(32)) - , sa.Column('salary', sa.Float) - ) - - # The create_all() function uses the SQream DB engine object - # to create all the defined table objects. - - metadata.create_all(engine) - - # Now that the table exists, we can insert data into it. - - # Build the data rows - insert_data = [ {'id': 1, 'name': 'Richard','lastname': 'Hendricks', 'salary': 12000.75} - ,{'id': 3, 'name': 'Bertram', 'lastname': 'Gilfoyle', 'salary': 8400.0} - ,{'id': 8, 'name': 'Donald', 'lastname': 'Dunn', 'salary': 6500.40} - ] - - # Build the insert command - ins = employees.insert(insert_data) - - # Execute the command - result = engine.execute(ins) - -.. toctree:: - :maxdepth: 8 - :caption: Further information - - api-reference diff --git a/third_party_tools/client_platforms/index.rst b/third_party_tools/client_platforms/index.rst deleted file mode 100644 index 30280c788..000000000 --- a/third_party_tools/client_platforms/index.rst +++ /dev/null @@ -1,37 +0,0 @@ -.. _client_platforms: - -************************************ -Client Platforms -************************************ -These topics explain how to install and connect a variety of third party tools. - -Browse the articles below, in the sidebar, or use the search to find the information you need. - -Overview -========== - -SQream DB is designed to work with most common database tools and interfaces, allowing you direct access through a variety of drivers, connectors, tools, vizualisers, and utilities. - -The tools listed have been tested and approved for use with SQream DB. Most 3\ :sup:`rd` party tools that work through JDBC, ODBC, and Python should work. - -If you are looking for a tool that is not listed, SQream and our partners can help. Go to `SQream Support `_ or contact your SQream account manager for more information. - -.. toctree:: - :maxdepth: 4 - :caption: In this section: - :titlesonly: - - power_bi - tibco_spotfire - sas_viya - sql_workbench - tableau - pentaho - microstrategy - informatica - r - php - xxtalend - xxdiagnosing_common_connectivity_issues - -.. image:: /_static/images/connectivity_ecosystem.png \ No newline at end of file diff --git a/third_party_tools/client_platforms/php.rst b/third_party_tools/client_platforms/php.rst deleted file mode 100644 index 599d6a578..000000000 --- a/third_party_tools/client_platforms/php.rst +++ /dev/null @@ -1,46 +0,0 @@ -.. _php: - -***************************** -Connect to SQream Using PHP -***************************** - -You can use PHP to interact with a SQream DB cluster. - -This tutorial is a guide that will show you how to connect a PHP application to SQream DB. - -.. contents:: In this topic: - :local: - -Prerequisites -=============== - -#. Install the :ref:`SQream DB ODBC driver for Linux` and create a DSN. - -#. - Install the `uODBC `_ extension for your PHP installation. 
- To configure PHP to enable uODBC, configure it with ``./configure --with-pdo-odbc=unixODBC,/usr/local`` when compiling PHP, or install ``php-odbc`` and ``php-pdo`` alongside PHP (version 7.1 or newer is recommended) using your distribution's package manager. - -Testing the connection -=========================== - -#. - Create a test connection file. Be sure to use the correct parameters for your SQream DB installation. - - Download this :download:`PHP example connection file ` . - - .. literalinclude:: test.php - :language: php - :emphasize-lines: 4 - :linenos: - - .. tip:: - An example of a valid DSN line is: - - .. code:: php - - $dsn = "odbc:Driver={SqreamODBCDriver};Server=192.168.0.5;Port=5000;Database=master;User=rhendricks;Password=super_secret;Service=sqream"; - - For more information about supported DSN parameters, see :ref:`dsn_params`. - -#. Run the PHP file either directly with PHP (``php test.php``) or through a browser. - diff --git a/third_party_tools/client_platforms/sas_viya.rst b/third_party_tools/client_platforms/sas_viya.rst deleted file mode 100644 index fc0806296..000000000 --- a/third_party_tools/client_platforms/sas_viya.rst +++ /dev/null @@ -1,185 +0,0 @@ -.. _connect_to_sas_viya: - -******************************** -Connect to SQream Using SAS Viya -******************************** - -Overview -========== -SAS Viya is a cloud-enabled analytics engine used for producing useful insights. The **Connect to SQream Using SAS Viya** page describes how to connect to SQream using SAS Viya, and covers the following: - -.. contents:: - :local: - :depth: 1 - -Installing SAS Viya -------------------- -The **Installing SAS Viya** section describes the following: - -.. contents:: - :local: - :depth: 1 - -Downloading SAS Viya -~~~~~~~~~~~~~~~~~~~~ -Integrating with SQream has been tested with SAS Viya v.03.05 and newer. - -To download SAS Viya, see `SAS Viya `_. - -Installing the JDBC Driver -~~~~~~~~~~~~~~~~~~~~~~~~~~ -The SQream JDBC driver is required for establishing a connection between SAS Viya and SQream. - -**To install the JDBC driver:** - -#. Download the `JDBC driver `_. - - :: - -#. Unzip the JDBC driver into a location on the SAS Viya server. - - SQream recommends creating the directory ``/opt/sqream`` on the SAS Viya server. - -Configuring SAS Viya --------------------- -After installing the JDBC driver, you must configure it from the SAS Studio so that it can be used with SQream. - -**To configure the JDBC driver from the SAS Studio:** - -#. Sign in to the SAS Studio. - - :: - -#. From the **New** menu, click **SAS Program**. - - :: - -#. Configure the SQream JDBC connector by adding the following rows: - - .. literalinclude:: connect3.sas - :language: sas - -For more information about writing a connection string, see **Connect to SQream DB with a JDBC Application** and navigate to `Connection String `_. - -Operating SAS Viya -------------------- -The **Operating SAS Viya** section describes the following: - -.. contents:: - :local: - :depth: 1 - -Using SAS Viya Visual Analytics -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This section describes how to use SAS Viya Visual Analytics. - -**To use SAS Viya Visual Analytics:** - -#. Log in to `SAS Viya Visual Analytics `_ using your credentials: - - :: - -2. Click **New Report**. - - :: - -3. Click **Data**. - - :: - -4. Click **Data Sources**. - - :: - -5. Click the **Connect** icon. - - :: - -6. From the **Type** menu, select **Database**. - - :: - -7. Provide the required information and select **Persist this connection beyond the current session**. 
- - :: - -8. Click **Advanced** and provide the required information. - - :: - -9. Add the following additional parameters by clicking **Add Parameters**: - -.. list-table:: - :widths: 10 90 - :header-rows: 1 - - * - Name - - Value - * - class - - com.sqream.jdbc.SQDriver - * - classPath - - ** - * - url - - \jdbc:Sqream://**:**/**;cluster=true - * - username - - - * - password - - - -10. Click **Test Connection**. - - :: - -11. If the connection is successful, click **Save**. - -If your connection is not successful, see :ref:`troubleshooting_sas_viya` below. - -.. _troubleshooting_sas_viya: - -Troubleshooting SAS Viya ------------------------- -The **Troubleshooting SAS Viya** section describes the following best practices and troubleshooting procedures when connecting to SQream using SAS Viya: - -.. contents:: - :local: - :depth: 1 - -Inserting Only Required Data -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When using SAS Viya, SQream recommends using only data that you need, as described below: - -* Insert only the data sources you need into SAS Viya, excluding tables that don’t require analysis. - - :: - -* To increase query performance, add filters before analyzing. Every modification you make while analyzing data queries the SQream database, sometimes several times. Adding filters to the datasource before exploring limits the amount of data analyzed and increases query performance. - -Creating a Separate Service for SAS Viya -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -SQream recommends creating a separate service for SAS Viya with the DWLM. This reduces the impact that SAS Viya has on other applications and processes, such as ETL. In addition, this works in conjunction with the load balancer to ensure good performance. - -Locating the SQream JDBC Driver -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In some cases, SAS Viya cannot locate the SQream JDBC driver, generating the following error message: - -.. code-block:: text - - java.lang.ClassNotFoundException: com.sqream.jdbc.SQDriver - -**To locate the SQream JDBC driver:** - -1. Verify that you have placed the JDBC driver in a directory that SAS Viya can access. - - :: - -2. Verify that the classpath in your SAS program is correct, and that SAS Viya can access the file that it references. - - :: - -3. Restart SAS Viya. - -For more troubleshooting assistance, see the `SQream Support Portal `_. - -Supporting TEXT -~~~~~~~~~~~~~~~~~~ -In SAS Viya versions lower than 4.0, casting ``TEXT`` to ``CHAR`` changes the size to 1,024, such as when creating a table including a ``TEXT`` column. This is resolved by casting ``TEXT`` into ``CHAR`` when using the JDBC driver. diff --git a/third_party_tools/client_platforms/tableau.rst b/third_party_tools/client_platforms/tableau.rst deleted file mode 100644 index 666b2f198..000000000 --- a/third_party_tools/client_platforms/tableau.rst +++ /dev/null @@ -1,453 +0,0 @@ -.. _connect_to_tableau: - -********************************** -Connecting to SQream Using Tableau -********************************** - -Overview -===================== -SQream's Tableau connector plugin, based on standard JDBC, enables storing and fast querying of large volumes of data. - -The **Connecting to SQream Using Tableau** page is a Quick Start Guide that describes how to install Tableau and the JDBC and ODBC drivers, and how to connect to SQream for data analysis. It also describes best practices and how to troubleshoot issues that may occur while installing Tableau. SQream supports both Tableau Desktop and Tableau Server on Windows, MacOS, and Linux distributions. 
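Before installing anything on the Tableau side, it can be useful to confirm that the SQream cluster itself accepts connections. The following is a minimal sketch using pysqream (shown earlier in this documentation); the host, port, and credentials are illustrative placeholders, not values required by Tableau:

.. code-block:: python

   # Illustrative pre-check that the cluster is reachable before configuring Tableau.
   # Host, port, and credentials are placeholders - substitute your own.
   import pysqream

   con = pysqream.connect(host='127.0.0.1', port=3108, database='master',
                          username='rhendricks', password='Tr0ub4dor&3',
                          clustered=True)
   cur = con.cursor()
   cur.execute('SELECT 1')
   print(cur.fetchall())  # expect [(1,)]
   cur.close()
   con.close()

If this check fails, resolve connectivity at the cluster level first, since no Tableau configuration can compensate for an unreachable worker.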
- -For more information on SQream's integration with Tableau, see `Tableau's Extension Gallery `_. - -The Connecting to SQream Using Tableau page describes the following: - -.. contents:: - :local: - -Installing the JDBC Driver and Tableau Connector Plugin -------------------------------------------------------- -This section describes how to install the JDBC driver using the fully-integrated Tableau connector plugin (Tableau Connector, or **.taco** file). SQream has been tested with Tableau versions 9.2 and newer. - -**To connect to SQream using Tableau:** - -#. Install the Tableau Desktop application. - - For more information about installing the Tableau Desktop application, see the `Tableau products page `_ and click **Download Free Trial**. Note that Tableau offers a 14-day trial version. - - :: - -#. Do one of the following: - - * **For Windows** - See :ref:`Installing Tableau Using the Windows Installer `. - * **For MacOS or Linux** - See :ref:`Installing the JDBC Driver Manually `. - -.. note:: For Tableau **2019.4 versions and later**, SQream recommends installing the JDBC driver instead of the previously recommended ODBC driver. - -.. _tableau_windows_installer: - -Installing the JDBC Driver Using the Windows Installer -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If you are using Windows, after installing the Tableau Desktop application you can install the JDBC driver using the Windows installer. The Windows installer is an installation wizard that guides you through the JDBC driver installation steps. When the driver is installed, you can connect to SQream. - -**To install the JDBC driver using the Windows installer**: - -#. Close Tableau Desktop. - - :: - -#. Download the most current version of the `SQream JDBC driver `_. - - :: - -#. Do the following: - - #. Start the installer. - #. Verify that the **Tableau Desktop connector** item is selected. - #. Follow the installation steps. - - :: - -You can now restart Tableau Desktop or Server to begin using the SQream driver by :ref:`connecting to SQream `. - -.. _tableau_jdbc_installer: - -Installing the JDBC Driver Manually -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If you are using MacOS, Linux, or the Tableau server, after installing the Tableau Desktop application you can install the JDBC driver manually. When the driver is installed, you can connect to SQream. - -**To install the JDBC driver manually:** - -1. Download the JDBC installer and SQream Tableau connector (.taco) file from the :ref:`client drivers page`. - - :: - -#. Install the JDBC driver by unzipping the JDBC driver into a Tableau driver directory. - - Based on the installation method that you used, your Tableau driver directory is located in one of the following places: - - * **Tableau Desktop on Windows:** *C:\\Program Files\\Tableau\\Drivers* - * **Tableau Desktop on MacOS:** *~/Library/Tableau/Drivers* - * **Tableau on Linux**: */opt/tableau/tableau_driver/jdbc* - -.. note:: If the driver includes only a single .jar file, copy it to *C:\\Program Files\\Tableau\\Drivers*. If the driver includes multiple files, create a subfolder *A* in *C:\\Program Files\\Tableau\\Drivers* and copy all files to folder *A*. - -Note the following when installing the JDBC driver: - -* You must have read permissions on the .jar file. -* Tableau requires a JDBC 4.0 or later driver. -* Tableau requires a Type 4 JDBC driver. -* The latest 64-bit version of Java 8 must be installed. - -3. Install the **SQreamDB.taco** file by moving the SQreamDB.taco file into the Tableau connectors directory. 
- - Based on the installation method that you used, your Tableau connectors directory is located in one of the following places: - - * **Tableau Desktop on Windows:** *C:\\Users\\\\My Tableau Repository\\Connectors* - * **Tableau Desktop on MacOS:** *~/My Tableau Repository/Connectors* - - :: - -4. *Optional* - If you are using the Tableau Server, do the following: - - 1. Create a directory for Tableau connectors and give it a descriptive name, such as *C:\\tableau_connectors*. - - This directory needs to exist on all Tableau servers. - - :: - - 2. Copy the SQreamDB.taco file into the new directory. - - :: - - 3. Use ``tsm`` to set the **native_api.connect_plugins_path** option, as shown in the following example: - - .. code-block:: console - - $ tsm configuration set -k native_api.connect_plugins_path -v C:/tableau_connectors - - If a configuration error is displayed, add ``--force-keys`` to the end of the command as shown in the following example: - - .. code-block:: console - - $ tsm configuration set -k native_api.connect_plugins_path -v C:/tableau_connectors --force-keys - - 4. To apply the pending configuration changes, run the following command: - - .. code-block:: console - - $ tsm pending-changes apply - - .. warning:: This restarts the server. - -You can now restart Tableau Desktop or Server to begin using the SQream driver by :ref:`connecting to SQream ` as described in the section below. - -.. _tableau_connect_to_sqream: - - -Installing the ODBC Driver for Tableau Versions 2019.3 and Earlier -------------------------------------------------------------------- - - -This section describes the installation method for Tableau version 2019.3 or earlier and describes the following: - -.. contents:: - :local: - -.. note:: SQream recommends installing the JDBC driver to provide improved connectivity. - -Automatically Reconfiguring the ODBC Driver After Initial Installation -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If you've already installed the SQream ODBC driver and installed Tableau, SQream recommends reinstalling the ODBC driver with the **.TDC Tableau Settings for SQream DB** configuration shown in the image below: - -.. image:: /_static/images/odbc_windows_installer_tableau.png - -SQream recommends this configuration because Tableau creates temporary tables and runs several discovery queries that may impact performance. The ODBC driver installer avoids this by automatically reconfiguring Tableau. - -For more information about reinstalling the ODBC driver installer, see :ref:`Install and Configure ODBC on Windows `. - -If you want to manually reconfigure the ODBC driver, see :ref:`Manually Reconfiguring the ODBC Driver After Initial Installation ` below. - -.. _manually_reconfigure_odbc_driver: - -Manually Reconfiguring the ODBC Driver After Initial Installation -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The **Tableau Datasource Customization (TDC)** file lets Tableau make full use of SQream DB's features and capabilities. - -**To manually reconfigure the ODBC driver after initial installation:** - -1. Do one of the following: - - 1. Download the :download:`odbc-sqream.tdc ` file to your machine and open it in a text editor. - - :: - - 2. Copy the text below into a text editor: - - .. literalinclude:: odbc-sqream.tdc - :language: xml - :caption: SQream ODBC TDC File - :emphasize-lines: 2 - -#. Check which version of Tableau you are using. - - :: - -#. In the text of the file shown above, in the highlighted line, replace the version number with the **major** version of Tableau that you are using. 
- - For example, if you are using Tableau version **2019.2.1**, replace it with **2019.2**. - - :: - -#. Do one of the following: - - * If you are using **Tableau Desktop** - save the TDC file to *C:\\Users\\\\Documents\\My Tableau Repository\\Datasources*, where ```` is the Windows username that you have installed Tableau under. - - :: - - * If you are using the **Tableau Server** - save the TDC file to *C:\\ProgramData\\Tableau\\Tableau Server\\data\\tabsvc\\vizqlserver\\Datasources*. - -Configuring the ODBC Connection -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The ODBC connection uses a DSN when connecting to ODBC data sources, and each DSN represents one SQream database. - -**To configure the ODBC connection:** - -1. Create an ODBC DSN. - - :: - -#. Open the Windows menu by pressing the Windows button (:kbd:`⊞ Win`) or clicking the **Windows** menu button. - - :: - -#. Type **ODBC** and select **ODBC Data Sources (64-bit)**. - - During installation, the installer created a sample user DSN named **SQreamDB**. - - :: - -#. *Optional* - Do one or both of the following: - - * Modify the DSN name. - - :: - - * Create a new DSN name by clicking **Add** and selecting **SQream ODBC Driver**. - -.. image:: /_static/images/odbc_windows_dsns.png - - -5. Click **Finish**. - - :: - -6. Enter your connection parameters. - - The following table describes the connection parameters: - - .. list-table:: - :widths: 15 38 38 - :header-rows: 1 - - * - Item - - Description - - Example - * - Data Source Name - - The Data Source Name. SQream recommends using a descriptive and easily recognizable name for referencing your DSN. Once set, the Data Source Name cannot be changed. - - - * - Description - - The description of your DSN. This field is optional. - - - * - User - - The username of a role to use for establishing the connection. - - ``rhendricks`` - * - Password - - The password of the selected role. - - ``Tr0ub4dor`` - * - Database - - The database name to connect to. For example, ``master`` - - ``master`` - * - Service - - The :ref:`service queue` to use. - - For example, ``etl``. For the default service ``sqream``, leave blank. - * - Server - - The hostname of the SQream worker. - - ``127.0.0.1`` or ``sqream.mynetwork.co`` - * - Port - - The TCP port of the SQream worker. - - ``5000`` or ``3108`` - * - Use Server Picker - - Uses the load balancer when establishing a connection. Use only if a load balancer exists, and verify the port. - - - * - SSL - - Uses SSL when establishing a connection. - - - * - Logging Options - - Lets you modify your logging options when tracking the ODBC connection for connection issues. - - - -.. tip:: Test the connection by clicking **Test** before saving your DSN. - -7. Save the DSN by clicking **OK**. - -Connecting Tableau to SQream -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -**To connect Tableau to SQream:** - -1. Start Tableau Desktop. - - :: - -#. In the **Connect** menu, in the **To a server** sub-menu, click **More Servers** and select **Other Databases (ODBC)**. - - The **Other Databases (ODBC)** window is displayed. - - :: - -#. In the Other Databases (ODBC) window, select the DSN that you created in :ref:`Setting Up SQream Tables as Data Sources `. - - Tableau may display the **Sqream ODBC Driver Connection Dialog** window and prompt you to provide your username and password. - -#. Provide your username and password and click **OK**. - -.. _tableau_connect_to_sqream_db: - - -Connecting to SQream ---------------------- -After installing the JDBC driver you can connect to SQream. - -**To connect to SQream:** - -#. Start Tableau Desktop. 
- - :: - -#. In the **Connect** menu, in the **To a Server** sub-menu, click **More...**. - - More connection options are displayed. - - :: - -#. Select **SQream DB by SQream Technologies**. - - The **New Connection** dialog box is displayed. - - :: - -#. In the New Connection dialog box, fill in the fields and click **Sign In**. - - The following table describes the fields: - - .. list-table:: - :widths: 15 38 38 - :header-rows: 1 - - * - Item - - Description - - Example - * - Server - - Defines the server of the SQream worker. - - ``127.0.0.1`` or ``sqream.mynetwork.co`` - * - Port - - Defines the TCP port of the SQream worker. - - ``3108`` when using a load balancer, or ``5100`` when connecting directly to a worker with SSL. - * - Database - - Defines the database to establish a connection with. - - ``master`` - * - Cluster - - Enables (``true``) or disables (``false``) the load balancer. After enabling or disabling the load balancer, verify the connection. - - - * - Username - - Specifies the username of a role to use when connecting. - - ``rhendricks`` - * - Password - - Specifies the password of the selected role. - - ``Tr0ub4dor&3`` - * - Require SSL (recommended) - - Sets SSL as a requirement for establishing this connection. - - - -The connection is established and the data source page is displayed. - -.. tip:: - Tableau automatically assigns your connection a default name based on the DSN and table. SQream recommends giving the connection a more descriptive name. - -.. _set_up_sqream_tables_as_data_sources: - -Setting Up SQream Tables as Data Sources ------------------------------------------ -After connecting to SQream you must set up the SQream tables as data sources. - -**To set up SQream tables as data sources:** - -1. From the **Table** menu, select the desired database and schema. - - SQream's default schema is **public**. - - :: - -#. Drag the desired tables into the main area (labeled **Drag tables here**). - - This area is also used for specifying joins and data source filters. - - :: - -#. Open a new sheet to analyze data. - -.. tip:: - For more information about configuring data sources, joins, and filters, see Tableau's `Set Up Data Sources `_ tutorials. - -Tableau Best Practices and Troubleshooting ------------------------------------------- -This section describes the following best practices and troubleshooting procedures when connecting to SQream using Tableau: - -.. contents:: - :local: - -Inserting Only Required Data -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When using Tableau, SQream recommends using only data that you need, as described below: - -* Insert only the data sources you need into Tableau, excluding tables that don't require analysis. - - :: - -* To increase query performance, add filters before analyzing. Every modification you make while analyzing data queries the SQream database, sometimes several times. Adding filters to the datasource before exploring limits the amount of data analyzed and increases query performance. - -Using Tableau's Table Query Syntax -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Dragging your desired tables into the main area in Tableau builds queries based on its own syntax. This helps ensure increased performance, while using views or custom SQL may degrade performance. In addition, SQream recommends using :ref:`create_view` to create pre-optimized views, which your datasources point to. - -Creating a Separate Service for Tableau -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -SQream recommends creating a separate service for Tableau with the DWLM. This reduces the impact that Tableau has on other applications and processes, such as ETL. 
In addition, this works in conjunction with the load balancer to ensure good performance. - -Troubleshooting Workbook Performance Before Deploying to the Tableau Server -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Tableau has a built-in `performance recorder `_ that shows how time is being spent. If you're seeing slow performance, this could be the result of a misconfiguration such as setting concurrency too low. - -Use the Tableau Performance Recorder for viewing the performance of queries run by Tableau. You can use this information to identify queries that can be optimized by using views. - -Troubleshooting Error Codes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Tableau may be unable to locate the SQream JDBC driver. The following message is displayed when Tableau cannot locate the driver: - -.. code-block:: console - - Error Code: 37CE01A3, No suitable driver installed or the URL is incorrect - -**To troubleshoot error codes:** - -If Tableau cannot locate the SQream JDBC driver, do the following: - - 1. Verify that the JDBC driver is located in the correct directory: - - * **Tableau Desktop on Windows:** *C:\Program Files\Tableau\Drivers* - * **Tableau Desktop on MacOS:** *~/Library/Tableau/Drivers* - * **Tableau on Linux**: */opt/tableau/tableau_driver/jdbc* - - 2. Find the file path for the JDBC driver and add it to the Java classpath: - - * **For Linux** - ``export CLASSPATH=;$CLASSPATH`` - - :: - - * **For Windows** - add an environment variable for the classpath: - - .. image:: /_static/images/third_party_connectors/tableau/envrionment_variable_for_classpath.png - -If you experience issues after restarting Tableau, see the `SQream support portal `_. diff --git a/third_party_tools/client_platforms/talend.rst b/third_party_tools/client_platforms/talend.rst deleted file mode 100644 index 6e34a7168..000000000 --- a/third_party_tools/client_platforms/talend.rst +++ /dev/null @@ -1,177 +0,0 @@ -.. _talend: - -********************************* -Connecting to SQream Using Talend -********************************* -.. _top: - -Overview -================= - -This page describes how to use Talend to interact with a SQream DB cluster. The Talend connector is used for reading data from a SQream DB cluster and loading data into SQream DB. - -In addition, this page provides a viability report on Talend's compatibility with SQream DB for stakeholders. - -It includes the following: - -* :ref:`A Quick Start guide ` -* :ref:`Information about supported SQream drivers ` -* :ref:`Supported data sources ` and :ref:`tool and operating system versions ` -* :ref:`A description of known issues ` -* :ref:`Related links ` - -About Talend -================= -Talend is an open-source data integration platform. It provides various software and services for Big Data integration and management, enterprise application integration, data quality and cloud storage. - -For more information about Talend, see `Talend `_. - -.. _quickstart_guide: - -Quick Start Guide -======================= - -Creating a New Metadata JDBC DB Connection -------------------------------------------- -**To create a new metadata JDBC DB connection:** - -1. In the **Repository** panel, navigate to **Metadata** and right-click **Db connections**. - -:: - -2. Select **Create connection**. - -3. In the **Name** field, type a name. - -The name cannot contain spaces. - -4. In the **Purpose** field, type a purpose and click **Next**. You cannot go to the next step until you define both a Name and a Purpose. - -:: - -5. In the **DB Type** field, select **JDBC**. - -:: - -6. In the **JDBC URL** field, type the relevant connection string. 
- - For connection string examples, see `Connection Strings `_. - -7. In the **Drivers** field, click the **Add** button. - - The **newLine** entry is added. - -8. On the **newLine** entry, click the ellipsis. - -.. image:: /_static/images/Third_Party_Connectors/Creating_a_New_Metadata_JDBC_DB_Connection_8.png - -The **Module** window is displayed. - -9. From the Module window, select **Artifact repository (local m2/nexus)** and select **Install a new module**. - -:: - -10. Click the ellipsis. - -.. image:: /_static/images/Third_Party_Connectors/Creating_a_New_Metadata_JDBC_DB_Connection_9.5.png - -Your hard drive is displayed. - -11. Navigate to a **JDBC jar file** (such as **sqream-jdbc-4.4.0.jar**) and click **Open**. - -:: - -12. Click **Detect the module install status**. - -:: - -13. Click **OK**. - -The JDBC jar file that you selected is displayed in the **Driver** field. - -14. Click **Select class name**. - -:: - -15. Click **Test connection**. - -If a driver class is not found (for example, because you didn't select a JDBC jar file), an error message is displayed. - -After creating a new metadata JDBC DB connection, you can do the following: - - * Use your new metadata connection. - * Drag it to the **job** screen. - * Build Talend components. - -For more information on loading data from JSON files into Talend Open Studio, see `How to Load Data from JSON Files in Talend `_. - -:ref:`Back to top ` - -.. _supported_sqream_drivers: - -Supported SQream Drivers -======================== - -The following list shows the supported SQream drivers and versions: - -* **JDBC** - Version 4.3.3 and higher. -* **ODBC** - Version 4.0.0. This version requires a Bridge to connect. For more information on the required Bridge, see `Connecting Talend on Windows to an ODBC Database `_. - -:ref:`Back to top ` - -.. _supported_data_sources: - -Supported Data Sources -============================ -Talend Cloud connectors let you create reusable connections with a wide variety of systems and environments, such as those shown below. This lets you access and read records of a range of diverse data. - -* **Connections:** Connections are environments or systems for storing datasets, including databases, file systems, distributed systems and platforms. Because these systems are reusable, you only need to establish connectivity with them once. - -* **Datasets:** Datasets include database tables, file names, topics (Kafka), queues (JMS) and file paths (HDFS). For more information on the complete list of connectors and datasets that Talend supports, see `Introducing Talend Connectors `_. - -:ref:`Back to top ` - -.. _supported_tools_os_sys_versions: - -Supported Tool and Operating System Versions -============================================= -Talend was tested using the following: - -* Talend version 7.4.1M6 -* Windows 10 -* SQream version 2021.1 -* JDBC version - -:ref:`Back to top ` - -.. _known_issues: - -Known Issues -=========================== -The list below describes the following known issues as of 6/1/2021: - -* Schemas not displayed for tables with identical names. - -:ref:`Back to top ` - -.. _related_links: - -Related Links -=============== -The following is a list of links relevant to the Talend connector: - -* `Talend Home page `_ -* `Talend Community page `_ -* `Talend BugTracker `_ - -Download Links -================== -The following is a list of download links relevant to the Talend connector: - -* `Talend Open Studio for Big Data `_ -* `Latest version of SQream JDBC `_ - -:ref:`Back to top ` - -.. 
contents:: In this topic: - :local: \ No newline at end of file diff --git a/third_party_tools/index.rst b/third_party_tools/index.rst deleted file mode 100644 index 1052f9f27..000000000 --- a/third_party_tools/index.rst +++ /dev/null @@ -1,18 +0,0 @@ -.. _third_party_tools: - -************************* -Third Party Tools -************************* -SQream supports the most common database tools and interfaces, giving you direct access through a variety of drivers, connectors, and visualization tools and utilities. The tools described on this page have been tested and approved for use with SQream. Most third party tools that work through JDBC, ODBC, and Python should work. - -This section provides information about the following third party tools: - -.. toctree:: - :maxdepth: 2 - :glob: - :titlesonly: - - client_platforms/index - client_drivers/index - -If you need a tool that SQream does not support, contact SQream Support or your SQream account manager for more information. \ No newline at end of file diff --git a/troubleshooting/core_dumping_related_issues.rst b/troubleshooting/core_dumping_related_issues.rst index ace7c8787..a4d71b9b0 100644 --- a/troubleshooting/core_dumping_related_issues.rst +++ b/troubleshooting/core_dumping_related_issues.rst @@ -1,8 +1,8 @@ .. _core_dumping_related_issues: -*********************** +*************************** Core Dumping Related Issues -*********************** +*************************** The **Core Dumping Related Issues** page describes the troubleshooting procedure to be followed if all parameters have been configured correctly, but the cores have not been created. diff --git a/troubleshooting/examining_logs.rst b/troubleshooting/examining_logs.rst deleted file mode 100644 index 9b5a5fb79..000000000 --- a/troubleshooting/examining_logs.rst +++ /dev/null @@ -1,6 +0,0 @@ -.. _examining_logs: - -*********************** -Examining Logs -*********************** -See the :ref:`collecting_logs` section of the :ref:`information_for_support` guide for information about collecting logs for support. \ No newline at end of file diff --git a/troubleshooting/identifying_configuration_issues.rst b/troubleshooting/identifying_configuration_issues.rst index 25708cda6..9ed4b850b 100644 --- a/troubleshooting/identifying_configuration_issues.rst +++ b/troubleshooting/identifying_configuration_issues.rst @@ -1,8 +1,8 @@ .. 
_identifying_configuration_issues: -*********************** +******************************** Identifying Configuration Issues -*********************** +******************************** The **Identifying Configuration Issues** page describes how to troubleshoot the following common issues: diff --git a/troubleshooting/index.rst b/troubleshooting/index.rst index efbcdd412..21586efaa 100644 --- a/troubleshooting/index.rst +++ b/troubleshooting/index.rst @@ -12,14 +12,9 @@ The **Troubleshooting** page describes solutions to the following issues: remedying_slow_queries resolving_common_issues - examining_logs identifying_configuration_issues lock_related_issues - sas_viya_related_issues - tableau_related_issues - solving_code_126_odbc_errors log_related_issues - node_js_related_issues core_dumping_related_issues - sqream_sql_installation_related_issues + retrieving_execution_plan_output_using_studio information_for_support \ No newline at end of file diff --git a/troubleshooting/information_for_support.rst b/troubleshooting/information_for_support.rst index 16dcb4174..3b3aa48b7 100644 --- a/troubleshooting/information_for_support.rst +++ b/troubleshooting/information_for_support.rst @@ -1,21 +1,17 @@ .. _information_for_support: -******************************************* +**************************************** Gathering Information for SQream Support -******************************************* +**************************************** -.. What do we want to look into a performance issue - -.. what about other kinds of issues - -.. what about bug reports - -`SQream Support `_ is ready to answer any questions, and help solve any issues with SQream DB. +.. contents:: + :local: + :depth: 1 Getting Support and Reporting Bugs -======================================= +================================== -When contacting `SQream Support `_, we recommend reporting the following information: +When contacting `SQreamDB Support `_, we recommend reporting the following information: * What problem was encountered? * What was the expected outcome? @@ -27,12 +23,13 @@ When possible, please attach as many of the following: * DDL and queries that reproduce the issue * :ref:`Log files` * Screen captures if relevant +* :ref:`Execution plan output` How SQream Debugs Issues =================================== Reproduce --------------- +--------- If we are able to easily reproduce your issue in our testing lab, this greatly improves the speed at which we can fix it. @@ -47,7 +44,7 @@ Reproducing an issue consists of understanding: See the :ref:`reproducible_statement` section ahead for information about collecting a full reproducible example. Logs -------- +---- The logs produced by SQream DB contain a lot of information that may be useful for debugging. @@ -57,7 +54,7 @@ See the :ref:`collecting_logs` section ahead for information about collecting a Fix ---------- +--- Once we have a fix, this can be issued as a hotfix to an existing version, or as part of a bigger major release. @@ -66,14 +63,14 @@ Your SQream account manager will keep you up-to-date about the status of the iss .. _reproducible_statement: Collecting a Reproducible Example of a Problematic Statement -=============================================================== +============================================================ SQream DB contains an SQL utility that can help SQream support reproduce a problem with a query or statement. 
This utility compiles and executes a statement, and collects the relevant data in a small database which can be used to recreate and investigate the issue. SQL Syntax ---------------- +---------- .. code-block:: postgres @@ -85,7 +82,7 @@ SQL Syntax Parameters ---------------- +---------- .. list-table:: :widths: auto @@ -99,7 +96,7 @@ Parameters - Statements to analyze. Example ------------ +------- .. code-block:: postgres @@ -108,14 +105,14 @@ Example .. _collecting_logs: Collecting Logs and Metadata Database -============================================= +===================================== SQream DB comes bundled with a data collection utility and an SQL utility intended for collecting logs and additional information that can help SQream support drill down into possible issues. See more information in the :ref:`Collect logs from your cluster` section of the :ref:`logging` guide. Examples ------------------ +-------- Write an archive to ``/home/rhendricks``, containing log files: @@ -133,7 +130,7 @@ Write an archive to ``/home/rhendricks``, containing log files and metadata data Using the Command Line Utility: -============================================= +=============================== .. code-block:: console diff --git a/troubleshooting/lock_related_issues.rst b/troubleshooting/lock_related_issues.rst index 1a15858ec..7a5ef908b 100644 --- a/troubleshooting/lock_related_issues.rst +++ b/troubleshooting/lock_related_issues.rst @@ -3,27 +3,15 @@ *********************** Lock Related Issues *********************** + Sometimes, a rare situation can occur where a lock is never freed. The workflow for troubleshooting locks is: -#. Identify which statement has obtained locks -#. Understand if the statement is itself stuck, or waiting for another statement -#. Try to abort the offending statement -#. Force the stale locks to be removed - -For example, we will assume that the statement from the previous example is stuck (statement #\ ``287``). We can attempt to abort it using :ref:`stop_statement`: - -.. code-block:: psql - - t=> SELECT STOP_STATEMENT(287); - executed - -If the locks still appear in the :ref:`show_locks` utility, we can force remove the stale locks: - -.. code-block:: psql +#. Identify which statement has obtained locks. +#. Understand if the statement is itself stuck, or waiting for another statement. +#. Try to :ref:`stop` the offending statement, as in the following example: - t=> SELECT RELEASE_DEFUNCT_LOCKS(); - executed +.. code-block:: sql -.. warning:: This operation can cause some statements to fail on the specific worker on which they are queued. This is intended as a "last resort" to solve stale locks. \ No newline at end of file + SELECT STOP_STATEMENT(2484923); \ No newline at end of file diff --git a/troubleshooting/log_related_issues.rst b/troubleshooting/log_related_issues.rst index a260f59d5..2f37f0962 100644 --- a/troubleshooting/log_related_issues.rst +++ b/troubleshooting/log_related_issues.rst @@ -3,6 +3,7 @@ *********************** Log Related Issues *********************** + The **Log Related Issues** page describes how to resolve the following common issues: .. toctree:: @@ -12,13 +13,14 @@ The **Log Related Issues** page describes how to resolve the following common is Loading Logs with Foreign Tables --------------------------------------- -Assuming logs are stored at ``/home/rhendricks/sqream_storage/logs/``, a database administrator can access the logs using the :ref:`external_tables` concept through SQream DB. 
+ +Assuming logs are stored at ``/home/rhendricks/sqream_storage/logs/``, a database administrator can access the logs using the :ref:`foreign_tables` concept through SQream DB. .. code-block:: postgres CREATE FOREIGN TABLE logs ( - start_marker VARCHAR(4), + start_marker TEXT(4), row_id BIGINT, timestamp DATETIME, message_level TEXT, @@ -32,7 +34,7 @@ Assuming logs are stored at ``/home/rhendricks/sqream_storage/logs/``, a databas service_name TEXT, message_type_id INT, message TEXT, - end_message VARCHAR(5) + end_message TEXT(5) ) WRAPPER csv_fdw OPTIONS @@ -81,8 +83,8 @@ Finding Fatal Errors .. code-block:: psql t=> SELECT message FROM logs WHERE message_type_id=1010; - Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/leveldb/LOCK: Resource temporarily unavailable - Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/leveldb/LOCK: Resource temporarily unavailable + Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/rocksdb/LOCK: Resource temporarily unavailable + Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/rocksdb/LOCK: Resource temporarily unavailable Mismatch in storage version, upgrade is needed,Storage version: 25, Server version is: 26 Mismatch in storage version, upgrade is needed,Storage version: 25, Server version is: 26 Internal Runtime Error,open cluster metadata database:IO error: lock /home/rhendricks/sqream_storage/LOCK: Resource temporarily unavailable diff --git a/troubleshooting/node_js_related_issues.rst b/troubleshooting/node_js_related_issues.rst deleted file mode 100644 index b3b95b2ed..000000000 --- a/troubleshooting/node_js_related_issues.rst +++ /dev/null @@ -1,54 +0,0 @@ -.. _node_js_related_issues: - -*********************** -Node.js Related Issues -*********************** -The **Node.js Related Issues** page describes how to resolve the following common issues: - -.. toctree:: - :maxdepth: 2 - :glob: - :titlesonly: - -Preventing Heap Out of Memory Errors -------------------------------------------- - -Some workloads may cause Node.js to fail with the error: - -.. code-block:: none - - FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory - -To prevent this error, modify the heap size configuration by setting the ``--max-old-space-size`` run flag. - -For example, set the space size to 2GB: - -.. code-block:: console - - $ node --max-old-space-size=2048 my-application.js - -Providing Support for BIGINT Data Type --------------------------------------- - -The Node.js connector supports fetching ``BIGINT`` values from SQream DB. However, some applications may encounter an error when trying to serialize those values. - -The error that appears is: - -.. code-block:: none - - TypeError: Do not know how to serialize a BigInt - -This is because the JSON specification does not support BIGINT values, even when they are supported by JavaScript engines. - -To resolve this issue, objects with BIGINT values should be converted to strings before serializing, and converted back after deserializing. - -For example: - -.. code-block:: javascript - - const rows = [{test: 1n}] - const json = JSON.stringify(rows, (key, value) => - typeof value === 'bigint' - ? 
value.toString() - : value // return everything else unchanged - ); - console.log(json); // [{"test": "1"}] \ No newline at end of file diff --git a/troubleshooting/remedying_slow_queries.rst b/troubleshooting/remedying_slow_queries.rst index 8a109f0c0..c8e7c5896 100644 --- a/troubleshooting/remedying_slow_queries.rst +++ b/troubleshooting/remedying_slow_queries.rst @@ -4,76 +4,68 @@ Remedying Slow Queries *********************** -The **Remedying Slow Queries** page describes how to troubleshoot the causes of slow queries. - -The following table is a checklist you can use to identify the cause of your slow queries: - -.. list-table:: - :widths: auto - :header-rows: 1 - - * - Step - - Description - - Results - * - 1 - - A single query is slow - - - If a query isn't performing as you expect, follow the :ref:`Query best practices` part of the :ref:`sql_best_practices` guide. - - If all queries are slow, continue to step 2. - * - 2 - - All queries on a specific table are slow - - - #. If all queries on a specific table aren't performing as you expect, follow the :ref:`Table design best practices` part of the :ref:`sql_best_practices` guide. - #. Check for active delete predicates in the table. Consult the :ref:`delete_guide` guide for more information. - - If the problem spans all tables, continue to step 3. - * - 3 - - Check that all workers are up - - - Use ``SELECT show_cluster_nodes();`` to list the active cluster workers. - - If the worker list is incomplete, follow the :ref:`cluster troubleshooting` section below. - - If all workers are up, continue to step 4. - * - 4 - - Check that all workers are performing well - - - #. Identify if a specific worker is slower than others by running the same query on different workers. (e.g. by connecting directly to the worker or through a service queue) - #. If a specific worker is slower than others, investigate performance issues on the host using standard monitoring tools (e.g. ``top``). - #. Restart SQream DB workers on the problematic host. - - If all workers are performing well, continue to step 5. - * - 5 - - Check if the workload is balanced across all workers - - - #. Run the same query several times and check that it appears across multiple workers (use ``SELECT show_server_status()`` to monitor) - #. If some workers have a heavier workload, check the service queue usage. Refer to the :ref:`workload_manager` guide. +This page describes how to troubleshoot the causes of slow queries. + +Slow queries may be the result of various factors, including inefficient query practices, suboptimal table designs, or issues with system resources. If you're experiencing sluggish query performance, it's essential to diagnose and address the underlying causes promptly. + +.. glossary:: + + Step 1: A single query is slow + If a query isn't performing as you expect, follow the :ref:`Query best practices` part of the :ref:`sql_best_practices` guide. - If the workload is balanced, continue to step 6. - * - 6 - - Check if there are long running statements - - - #. Identify any currently running statements (use ``SELECT show_server_status()`` to monitor) - #. If there are more statements than available resources, some statements may be in an ``In queue`` mode. - #. If there is a statement that has been running for too long and is blocking the queue, consider stopping it (use ``SELECT stop_statement()``). + If all queries are slow, continue to step 2. + + Step 2: All queries on a specific table are slow + #. 
If all queries on a specific table aren't performing as you expect, follow the :ref:`Table design best practices` part of the :ref:`sql_best_practices` guide. + #. Check for active delete predicates in the table. Consult the :ref:`delete_guide` guide for more information. - If the statement does not stop correctly, contact SQream support. + If the problem spans all tables, continue to step 3. + + + Step 3: Check that all workers are up + Use ``SELECT show_cluster_nodes();`` to list the active cluster workers. - If there are no long running statements or this does not help, continue to step 7. - * - 7 - - Check if there are active locks - - - #. Use ``SELECT show_locks()`` to list any outstanding locks. - #. If a statement is locking some objects, consider waiting for that statement to end or stop it. - #. If after a statement is completed the locks don't free up, refer to the :ref:`concurrency_and_locks` guide. + If the worker list is incomplete, locate and start the missing worker(s). - If performance does not improve after the locks are released, continue to step 8. - * - 8 - - Check free memory across hosts - - - #. Check free memory across the hosts by running ``$ free -th`` from the terminal. - #. If the machine has less than 5% free memory, consider **lowering** the ``limitQueryMemoryGB`` and ``spoolMemoryGB`` settings. Refer to the :ref:`configuration` guide. - #. If the machine has a lot of free memory, consider **increasing** the ``limitQueryMemoryGB`` and ``spoolMemoryGB`` settings. + If all workers are up, continue to step 4. + + Step 4: Check that all workers are performing well + + #. Identify if a specific worker is slower than others by running the same query on different workers. (e.g. by connecting directly to the worker or through a service queue) + #. If a specific worker is slower than others, investigate performance issues on the host using standard monitoring tools (e.g. ``top``). + #. Restart SQream DB workers on the problematic host. + + If all workers are performing well, continue to step 5. + + Step 5: Check if the workload is balanced across all workers + + #. Run the same query several times and check that it appears across multiple workers (use ``SELECT show_server_status()`` to monitor) + #. If some workers have a heavier workload, check the service queue usage. Refer to the :ref:`workload_manager` guide. - If performance does not improve, contact SQream support for more help. \ No newline at end of file + If the workload is balanced, continue to step 6. + + Step 6: Check if there are long running statements + + #. Identify any currently running statements (use ``SELECT show_server_status()`` to monitor) + #. If there are more statements than available resources, some statements may be in an ``In queue`` mode. + #. If there is a statement that has been running for too long and is blocking the queue, consider stopping it (use ``SELECT stop_statement()``). + + If the statement does not stop correctly, contact `SQream Support `_. + + If there are no long running statements or this does not help, continue to step 7. + + Step 7: Check if there are active locks + + #. Use ``SELECT show_locks()`` to list any outstanding locks. + #. If a statement is locking some objects, consider waiting for that statement to end or stop it. + #. If after a statement is completed the locks don't free up, refer to the :ref:`concurrency_and_locks` guide. + + If performance does not improve after the locks are released, continue to step 8. 
+ + Step 8: Check free memory across hosts + + #. Check free memory across the hosts by running ``$ free -th`` from the terminal. + #. If the machine has less than 5% free memory, consider **lowering** the ``limitQueryMemoryGB`` and ``spoolMemoryGB`` settings. Refer to the :ref:`spooling` guide. + #. If the machine has a lot of free memory, consider **increasing** the ``limitQueryMemoryGB`` and ``spoolMemoryGB`` settings. + + If performance does not improve, contact `SQream Support `_. \ No newline at end of file diff --git a/troubleshooting/resolving_common_issues.rst b/troubleshooting/resolving_common_issues.rst index fd90472d3..73a9c07e8 100644 --- a/troubleshooting/resolving_common_issues.rst +++ b/troubleshooting/resolving_common_issues.rst @@ -40,7 +40,7 @@ Troubleshooting Connectivity Issues Troubleshooting Query Performance ------------------------------------ -#. Use :ref:`show_node_info` to examine which building blocks consume time in a statement. If the query has finished, but the results are not yet materialized in the client, it could point to a problem in the application's data buffering or a network throughput issue.. +#. Use :ref:`show_node_info` to examine which building blocks consume time in a statement. If the query has finished, but the results are not yet materialized in the client, it could point to a problem in the application's data buffering or a network throughput issue. Alternatively, you may also :ref:`retrieve the query execution plan output` using SQreamDB Studio. #. If a problem occurs through a 3\ :sup:`rd` party client, try reproducing it directly with :ref:`the built in SQL client`. If the performance is better in the local client, it could point to a problem in the application or network connection. diff --git a/troubleshooting/retrieving_execution_plan_output_using_studio.rst b/troubleshooting/retrieving_execution_plan_output_using_studio.rst new file mode 100644 index 000000000..d7d0a2598 --- /dev/null +++ b/troubleshooting/retrieving_execution_plan_output_using_studio.rst @@ -0,0 +1,27 @@ +.. _retrieving_execution_plan_output_using_studio: + +******************************************************* +Retrieving Execution Plan Output Using SQreamDB Studio +******************************************************* + +You may use SQreamDB Studio to create a query plan snapshot to be used for monitoring and troubleshooting slow-running statements and for identifying long-running execution Workers (components that process data) that may cause performance issues. + +Retrieving Execution Plan Output +================================ + +You can retrieve the execution plan output either after the query execution has completed or while the query is still running, in the case of a hanging query or one that you suspect is making no progress. + +1. In the **Result Panel**, select **Execution Details View** |icon-execution-details-view|. + + The **Execution Tree** window opens. + +.. |icon-execution-details-view| image:: /_static/images/studio_icon_execution_details_view.png + +2. From the upper-right corner, select |icon-download| to download the execution plan as a CSV table. + +.. |icon-download| image:: /_static/images/studio_icon_download.png + :align: middle + +3. Save the execution plan on your local machine. + +You can analyze this information using :ref:`monitoring_query_performance` or with assistance from `SQreamDB Support `_. 
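If you would like to pre-screen the downloaded file before opening a support ticket, a few lines of Python can surface the slowest nodes. This is a hedged sketch: the file name and the ``timeSum`` column name are assumptions about the export format, so inspect the printed headers and adjust accordingly:

.. code-block:: python

   # First-pass look at an execution plan CSV exported from SQreamDB Studio.
   # The file name and column names are assumptions - check the printed headers.
   import pandas as pd

   plan = pd.read_csv('execution_plan.csv')
   print(plan.columns.tolist())  # inspect the actual headers first

   # Rank nodes by total elapsed time to spot long-running Workers
   slowest = plan.sort_values('timeSum', ascending=False).head(10)
   print(slowest)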
diff --git a/troubleshooting/sas_viya_related_issues.rst b/troubleshooting/sas_viya_related_issues.rst deleted file mode 100644 index 6661dec95..000000000 --- a/troubleshooting/sas_viya_related_issues.rst +++ /dev/null @@ -1,55 +0,0 @@ -.. _sas_viya_related_issues: - -*********************** -SAS Viya Related Issues -*********************** - -This section describes the following best practices and troubleshooting procedures when connecting to SQream using SAS Viya: - -.. contents:: - :local: - -Inserting Only Required Data ----------------------------- -When using SAS Viya, SQream recommends using only data that you need, as described below: - -* Insert only the data sources you need into SAS Viya, excluding tables that don’t require analysis. - - :: - - -* To increase query performance, add filters before analyzing. Every modification you make while analyzing data queries the SQream database, sometimes several times. Adding filters to the datasource before exploring limits the amount of data analyzed and increases query performance. - - -Creating a Separate Service for SAS Viya ----------------------------------------- -SQream recommends creating a separate service for SAS Viya with the DWLM. This reduces the impact that SAS Viya has on other applications and processes, such as ETL. In addition, this works in conjunction with the load balancer to ensure good performance. - -Locating the SQream JDBC Driver -------------------------------- -In some cases, SAS Viya cannot locate the SQream JDBC driver, generating the following error message: - -.. code-block:: text - - java.lang.ClassNotFoundException: com.sqream.jdbc.SQDriver - -**To locate the SQream JDBC driver:** - -1. Verify that you have placed the JDBC driver in a directory that SAS Viya can access. - - :: - - -2. Verify that the classpath in your SAS program is correct, and that SAS Viya can access the file that it references. - - :: - - -3. Restart SAS Viya. - -For more troubleshooting assistance, see the `SQream Support Portal `_. - - -Supporting TEXT ---------------- -In SAS Viya versions lower than 4.0, casting ``TEXT`` to ``CHAR`` changes the size to 1,024, such as when creating a table including a ``TEXT`` column. This is resolved by casting ``TEXT`` into ``CHAR`` when using the JDBC driver. diff --git a/troubleshooting/solving_code_126_odbc_errors.rst b/troubleshooting/solving_code_126_odbc_errors.rst deleted file mode 100644 index 2e652b113..000000000 --- a/troubleshooting/solving_code_126_odbc_errors.rst +++ /dev/null @@ -1,14 +0,0 @@ -.. _solving_code_126_odbc_errors: - -****************************** -Solving "Code 126" ODBC Errors -****************************** -After installing the ODBC driver, you may experience the following error: - -.. code-block:: none - - The setup routines for the SQreamDriver64 ODBC driver could not be loaded due to system error - code 126: The specified module could not be found. - (c:\Program Files\SQream Technologies\ODBC Driver\sqreamOdbc64.dll) - -This is an issue with the Visual Studio Redistributable packages. Verify you've correctly installed them, as described in the :ref:`Visual Studio 2015 Redistributables ` section above. diff --git a/troubleshooting/sqream_sql_installation_related_issues.rst b/troubleshooting/sqream_sql_installation_related_issues.rst deleted file mode 100644 index 8225a2f18..000000000 --- a/troubleshooting/sqream_sql_installation_related_issues.rst +++ /dev/null @@ -1,33 +0,0 @@ -.. 
_sqream_sql_installation_related_issues: - -************************************** -SQream SQL Installation Related Issues -************************************** - -The **SQream SQL Installation Related Issues** page describes how to resolve SQream SQL installation related issues. - -When running ``sqream sql`` for the first time, you may get the error ``error while loading shared libraries: libtinfo.so.5: cannot open shared object file: No such file or directory``. - -Solving this error requires installing the ncurses or libtinfo libraries, depending on your operating system. - -* Ubuntu: - - #. Install ``libtinfo``: - - ``$ sudo apt-get install -y libtinfo`` - #. Depending on your Ubuntu version, you may need to create a symbolic link to the newer libtinfo that was installed. - - For example, if ``libtinfo`` was installed as ``/lib/x86_64-linux-gnu/libtinfo.so.6.2``: - - ``$ sudo ln -s /lib/x86_64-linux-gnu/libtinfo.so.6.2 /lib/x86_64-linux-gnu/libtinfo.so.5`` - -* CentOS / RHEL: - - #. Install ``ncurses``: - - ``$ sudo yum install -y ncurses-libs`` - #. Depending on your RHEL version, you may need to create a symbolic link to the newer libtinfo that was installed. - - For example, if ``libtinfo`` was installed as ``/usr/lib64/libtinfo.so.6``: - - ``$ sudo ln -s /usr/lib64/libtinfo.so.6 /usr/lib64/libtinfo.so.5`` \ No newline at end of file diff --git a/troubleshooting/tableau_related_issues.rst b/troubleshooting/tableau_related_issues.rst deleted file mode 100644 index 99b4a04dd..000000000 --- a/troubleshooting/tableau_related_issues.rst +++ /dev/null @@ -1,73 +0,0 @@ -.. _tableau_related_issues: - -*********************** -Tableau Related Issues -*********************** -This section describes the following best practices and troubleshooting procedures when connecting to Tableau: - -.. contents:: - :local: - -Inserting Only Required Data ----------------------------- -When using Tableau, SQream recommends using only data that you need, as described below: - -* Insert only the data sources you need into Tableau, excluding tables that don't require analysis. - - :: - -* To increase query performance, add filters before analyzing. Every modification you make while analyzing data queries the SQream database, sometimes several times. Adding filters to the datasource before exploring limits the amount of data analyzed and increases query performance. - -Using Tableau's Table Query Syntax ----------------------------------- -Dragging your desired tables into the main area in Tableau builds queries based on its own syntax. This helps ensure increased performance, while using views or custom SQL may degrade performance. In addition, SQream recommends using :ref:`create_view` to create pre-optimized views, which your datasources point to. - -Creating a Separate Service for Tableau ---------------------------------------- -SQream recommends creating a separate service for Tableau with the DWLM. This reduces the impact that Tableau has on other applications and processes, such as ETL. - -Error Saving Large Quantities of Data as Files ----------------------------------------------- -An **FAB9A2C5** error can occur when saving large quantities of data as files. If you receive this error, set the ``fetchSize`` parameter to ``1`` in your connection string, as shown below: - -.. code-block:: text - - jdbc:Sqream:///;user=;password=sqream;[; fetchSize=1...] - -For more information on troubleshooting error **FAB9A2C5**, see the `Tableau Knowledge Base `_. 
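The effect of ``fetchSize`` can also be checked outside Tableau. The sketch below uses the third-party ``jaydebeapi`` package; the jar path, host, and credentials are illustrative assumptions, and the URL mirrors the connection string format shown above:

.. code-block:: python

   # Hypothetical standalone check of a SQream JDBC connection string with fetchSize=1.
   # The jar location and connection details are placeholders - adjust to your setup.
   import jaydebeapi

   url = ('jdbc:Sqream://127.0.0.1:3108/master'
          ';user=rhendricks;password=Tr0ub4dor&3;cluster=true;fetchSize=1')
   conn = jaydebeapi.connect('com.sqream.jdbc.SQDriver',  # driver class named in this guide
                             url,
                             jars='/opt/sqream/sqream-jdbc-4.4.0.jar')  # assumed jar path
   curs = conn.cursor()
   curs.execute('SELECT 1')
   print(curs.fetchall())  # expect [(1,)]
   curs.close()
   conn.close()

If this standalone session works while Tableau still fails, the problem is likely in the Tableau-side configuration rather than in the driver or the cluster.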
- -Troubleshooting Workbook Performance Before Deploying to the Tableau Server ----------------------------------------------------------------------------- -Tableau has a built-in `performance recorder `_ that shows how time is being spent. If you're seeing slow performance, this could be the result of a misconfiguration such as setting concurrency too low. - -Use the Tableau Performance Recorder for viewing the performance of queries run by Tableau. You can use this information to identify queries that can be optimized by using views. - -Troubleshooting Error Codes ---------------------------- -Tableau may be unable to locate the SQream JDBC driver. The following message is displayed when Tableau cannot locate the driver: - -.. code-block:: console - - Error Code: 37CE01A3, No suitable driver installed or the URL is incorrect - -**To troubleshoot error codes:** - -If Tableau cannot locate the SQream JDBC driver, do the following: - - 1. Verify that the JDBC driver is located in the correct directory: - - * **Tableau Desktop on Windows:** *C:\Program Files\Tableau\Drivers* - * **Tableau Desktop on MacOS:** *~/Library/Tableau/Drivers* - * **Tableau on Linux**: */opt/tableau/tableau_driver/jdbc* - - 2. Find the file path for the JDBC driver and add it to the Java classpath: - - * **For Linux** - ``export CLASSPATH=;$CLASSPATH`` - - :: - - * **For Windows** - add an environment variable for the classpath: - - - -If you experience issues after restarting Tableau, see the `SQream support portal `_.
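As a final sanity check that the driver jar actually exists where Tableau will look, a short script such as the following can be run on the affected machine. The directory list mirrors the paths above; the jar naming pattern is an assumption:

.. code-block:: python

   # Hypothetical helper: report where (if anywhere) a SQream JDBC jar is visible.
   import glob
   import os

   # Directories named in this guide, plus any Java CLASSPATH entries
   candidates = [
       r'C:\Program Files\Tableau\Drivers',
       os.path.expanduser('~/Library/Tableau/Drivers'),
       '/opt/tableau/tableau_driver/jdbc',
   ] + os.environ.get('CLASSPATH', '').split(os.pathsep)

   for directory in candidates:
       if not directory:
           continue
       hits = glob.glob(os.path.join(directory, 'sqream-jdbc*.jar'))  # assumed jar naming
       print(f"{directory}: {', '.join(hits) if hits else 'not found'}")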