diff --git a/guides/features/external_data/hdfs_yaniv.rst b/guides/features/external_data/hdfs_yaniv.rst
new file mode 100644
index 000000000..4c1a49c81
--- /dev/null
+++ b/guides/features/external_data/hdfs_yaniv.rst
@@ -0,0 +1,260 @@
+.. _hdfs_yaniv.rst:
+
+.. _back_to_top:
+
+Launching SQream in an HDFS Environment
+=======================================
+This page describes how to:
+
+* :ref:`Configure an HDFS environment for the user sqream <configuring_an_hdfs_environment_for_the_user_sqream>`
+* :ref:`Authenticate Hadoop servers that require Kerberos <authenticate_hadoop_servers_that_require_kerberos>`
+
+.. _configuring_an_hdfs_environment_for_the_user_sqream:
+
+Configuring an HDFS Environment for the User **sqream**
+----------------------------------------------------------
+
+This section describes how to configure an HDFS environment for the user **sqream** and is only relevant for users with an HDFS environment.
+
+**To configure an HDFS environment for the user sqream:**
+
+1. Open your **bash_profile** configuration file for editing:
+
+   .. code-block:: console
+
+      $ vim /home/sqream/.bash_profile
+
+2. Comment out the existing **PATH** and **PS1** settings and add the SQream HDFS environment variables, as shown below:
+
+   .. code-block:: console
+
+      #PATH=$PATH:$HOME/.local/bin:$HOME/bin
+      #export PATH
+
+      # PS1
+      #MYIP=$(curl -s -XGET "http://ip-api.com/json" | python -c 'import json,sys; jstr=json.load(sys.stdin); print jstr["query"]')
+      #PS1="\[\e[01;32m\]\D{%F %T} \[\e[01;33m\]\u@\[\e[01;36m\]$MYIP \[\e[01;31m\]\w\[\e[37;36m\]\$ \[\e[1;37m\]"
+
+      SQREAM_HOME=/usr/local/sqream
+      export SQREAM_HOME
+
+      export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk
+      export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop
+      export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob`
+      export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
+      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR
+
+      PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin
+      export PATH
+
+3. Apply the edits:
+
+   .. code-block:: console
+
+      $ source /home/sqream/.bash_profile
+
+4. Check that you can access Hadoop from your machine:
+
+   .. code-block:: console
+
+      $ hadoop fs -ls hdfs://<hostname>:8020/
+
+   **NOTICE:** If you cannot access Hadoop from your machine because it uses Kerberos, see :ref:`Authenticate Hadoop Servers that Require Kerberos <authenticate_hadoop_servers_that_require_kerberos>` below.
+
+5. Verify that an HDFS environment exists for SQream services:
+
+   .. code-block:: console
+
+      $ ls -l /etc/sqream/sqream_env.sh
+
+.. _step_6:
+
+6. If an HDFS environment does not exist for SQream services, create one (**sqream_env.sh**) with the following contents:
+
+   .. code-block:: console
+
+      #!/bin/bash
+
+      SQREAM_HOME=/usr/local/sqream
+      export SQREAM_HOME
+
+      export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk
+      export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop
+      export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob`
+      export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
+      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR
+
+      PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin
+      export PATH
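+   To confirm that the file resolves correctly for the user **sqream**, you can source it and re-run the access check. The following is a minimal hedged sketch; it reuses the variables defined above and the same ``<hostname>`` placeholder:
+
+   .. code-block:: console
+
+      $ source /etc/sqream/sqream_env.sh
+      $ echo $JAVA_HOME
+      $ echo $HADOOP_INSTALL
+      $ hadoop version
+      $ hadoop fs -ls hdfs://<hostname>:8020/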
+
+:ref:`Back to top <back_to_top>`
+
+.. _authenticate_hadoop_servers_that_require_kerberos:
+
+Authenticate Hadoop Servers that Require Kerberos
+---------------------------------------------------
+
+If your Hadoop server requires Kerberos authentication, do the following:
+
+1. Create a principal for the user **sqream**:
+
+   .. code-block:: console
+
+      $ kadmin -p root/admin@SQ.COM
+      $ addprinc sqream@SQ.COM
+
+2. If you do not know your Kerberos root credentials, connect to the Kerberos server as a root user with ssh and run **kadmin.local**:
+
+   .. code-block:: console
+
+      $ kadmin.local
+
+   Running **kadmin.local** does not require a password.
+
+3. If needed, change the password for the principal **sqream@SQ.COM**:
+
+   .. code-block:: console
+
+      $ change_password sqream@SQ.COM
+
+4. Connect to the Hadoop name node using ssh and navigate to the Cloudera agent process directory:
+
+   .. code-block:: console
+
+      $ cd /var/run/cloudera-scm-agent/process
+
+5. List the contents of the directory above, sorted by modification time:
+
+   .. code-block:: console
+
+      $ ls -lrt
+
+6. Look for a recently updated folder whose name contains the text **hdfs**, and navigate into it. The following is an example of the correct folder name, where the elided parts vary per deployment:
+
+   .. code-block:: console
+
+      $ cd <...>-hdfs-<...>
+
+   This folder should contain a file named **hdfs.keytab** or another similar .keytab file.
+
+7. Copy the .keytab file to the user **sqream**'s home directory on the remote machines on which you are planning to use Hadoop.
+
+8. Copy the following files to the **sqream@server:/hdfs/hadoop/etc/hadoop** directory:
+
+   * core-site.xml
+   * hdfs-site.xml
+
+9. Connect to the SQream server and verify that the .keytab file is owned by the user **sqream** and is granted the correct permissions:
+
+   .. code-block:: console
+
+      $ sudo chown sqream:sqream /home/sqream/hdfs.keytab
+      $ sudo chmod 600 /home/sqream/hdfs.keytab
+
+10. Log in to the SQream server as the user **sqream**.
+
+11. Navigate to the home directory and check the names of the Kerberos principals contained in the .keytab file:
+
+    .. code-block:: console
+
+       $ klist -kt hdfs.keytab
+
+    The following is an example of the correct output:
+
+    .. code-block:: console
+
+       sqream@Host-121 ~ $ klist -kt hdfs.keytab
+       Keytab name: FILE:hdfs.keytab
+       KVNO Timestamp           Principal
+       ---- ------------------- ------------------------------------------------------
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+          5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
+
+12. Verify that the hdfs service principal named **hdfs/nn1@SQ.COM** is shown in the generated output above.
+
+13. Obtain a Kerberos ticket from the keytab:
+
+    .. code-block:: console
+
+       $ kinit -kt hdfs.keytab hdfs/nn1@SQ.COM
+
+14. Check the output:
+
+    .. code-block:: console
+
+       $ klist
+
+    The following is an example of the correct output:
+
+    .. code-block:: console
+
+       Ticket cache: FILE:/tmp/krb5cc_1000
+       Default principal: sqream@SQ.COM
+
+       Valid starting       Expires              Service principal
+       09/16/2020 13:44:18  09/17/2020 13:44:18  krbtgt/SQ.COM@SQ.COM
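+    Kerberos tickets expire, after which HDFS access fails until **kinit** is run again. The following is a minimal hedged sketch of renewing the ticket non-interactively from cron; the script path and schedule are assumptions, while the keytab and principal are the ones used above:
+
+    .. code-block:: console
+
+       $ cat /home/sqream/renew_krb5.sh
+       #!/bin/bash
+       # Obtain a fresh ticket from the keytab if no valid ticket is cached
+       klist -s || kinit -kt /home/sqream/hdfs.keytab hdfs/nn1@SQ.COM
+       $ crontab -l
+       0 */8 * * * /home/sqream/renew_krb5.sh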
+
+15. List the files located at the defined server name or IP address:
+
+    .. code-block:: console
+
+       $ hadoop fs -ls hdfs://<hostname>:8020/
+
+16. Do one of the following:
+
+    * If the files are listed, your environment has been set up correctly.
+    * If the files are not listed, verify that your environment has been set up correctly:
+
+      * If any of the following are empty, verify that you followed :ref:`Step 6 <step_6>` in the **Configuring an HDFS Environment for the User sqream** section above correctly:
+
+        .. code-block:: console
+
+           $ echo $JAVA_HOME
+           $ echo $SQREAM_HOME
+           $ echo $CLASSPATH
+           $ echo $HADOOP_COMMON_LIB_NATIVE_DIR
+           $ echo $LD_LIBRARY_PATH
+           $ echo $PATH
+
+      * Verify that you copied the correct keytab file.
+      * Review this procedure to verify that you have followed each step.
+
+:ref:`Back to top <back_to_top>`
diff --git a/guides/operations/setup/index.rst b/guides/operations/setup/index.rst
index b422d9c5b..173c8ae26 100644
--- a/guides/operations/setup/index.rst
+++ b/guides/operations/setup/index.rst
@@ -14,6 +14,6 @@ The guides below cover installing SQream DB.
 
    before_you_begin
    local_docker
-   recommended_configuration
+   recommended_pre-installation_configurations
 
diff --git a/guides/operations/setup/installing_sqream_with_binary.rst b/guides/operations/setup/installing_sqream_with_binary.rst
new file mode 100644
index 000000000..1d715f96b
--- /dev/null
+++ b/guides/operations/setup/installing_sqream_with_binary.rst
@@ -0,0 +1,264 @@
+.. _installing_sqream_with_binary:
+
+*********************************************
+Installing SQream with Binary
+*********************************************
+This procedure describes how to install SQream from a binary package.
+
+**To install SQream with Binary:**
+
+1. Copy the SQream package for the current version into the **/home/sqream** directory and extract it:
+
+   .. code-block:: console
+
+      $ tar -xf sqream-db-v<2020.2>.tar.gz
+
+2. Append the version number to the name of the extracted **sqream** folder. The version number in the following example is **v2020.2**:
+
+   .. code-block:: console
+
+      $ mv sqream sqream-db-v<2020.2>
+
+3. Move the new version of the SQream folder to the **/usr/local/** directory:
+
+   .. code-block:: console
+
+      $ sudo mv sqream-db-v<2020.2> /usr/local/
+
+4. Change the ownership of the folder to the user **sqream**:
+
+   .. code-block:: console
+
+      $ sudo chown -R sqream:sqream /usr/local/sqream-db-v<2020.2>
+
+5. Navigate to the **/usr/local/** directory and create a symbolic link to SQream:
+
+   .. code-block:: console
+
+      $ cd /usr/local
+      $ sudo ln -s sqream-db-v<2020.2> sqream
+
+6. Verify that the symbolic link that you created points to the folder that you created:
+
+   .. code-block:: console
+
+      $ ls -l
+
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      sqream -> sqream-db-v<2020.2>
+
+7. Create the SQream configuration file destination folders and set their ownership to **sqream**:
+
+   .. code-block:: console
+
+      $ sudo mkdir /etc/sqream
+      $ sudo chown -R sqream:sqream /etc/sqream
+
+8. Create the SQream service log destination folders and set their ownership to **sqream**:
+
+   .. code-block:: console
+
+      $ sudo mkdir /var/log/sqream
+      $ sudo chown -R sqream:sqream /var/log/sqream
+
+9. Navigate to the **/usr/local/sqream/etc/** directory and copy the SQream configuration files from it:
+
+   .. code-block:: console
+
+      $ cd /usr/local/sqream/etc/
+      $ cp * /etc/sqream
+
+   The copied files are two **service configuration files** and two JSON-format **SQream configuration files**, for a total of four files. The number of service configuration files and SQream configuration files must be identical.
+
+   **NOTICE:** Verify that the JSON files have been configured correctly and that all required flags have been set to the correct values.
+
+In each JSON file, the following parameters **must be updated**:
+
+* instanceId
+* machineIP
+* metadataServerIp
+* spoolMemoryGB
+* limitQueryMemoryGB
+* gpu
+* port
+* ssl_port
+
+Note the following (a hypothetical example file is sketched below):
+
+* The value of the **metadataServerIp** parameter must point to the IP that the **metadataserver** service is running on, so it is the same in every JSON file across the cluster.
+* The value of the **machineIP** parameter must point to the IP of your local machine, so it is the same as **metadataServerIp** on the server running **metadataserver** and different on the other server nodes.
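+The following sketch shows what such a file might look like. All values are illustrative assumptions, and the exact field layout may differ between SQream releases, so treat the configuration files shipped with your package as the authoritative template:
+
+.. code-block:: console
+
+   $ cat /etc/sqream/sqream2_config.json
+   {
+       "instanceId": "sqream_2",
+       "machineIP": "192.168.0.11",
+       "metadataServerIp": "192.168.0.10",
+       "spoolMemoryGB": 64,
+       "limitQueryMemoryGB": 120,
+       "gpu": 0,
+       "port": 5001,
+       "ssl_port": 5101
+   }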
+
+10. **Optional** - To run additional SQream services, copy the required configuration files and create additional JSON files:
+
+    .. code-block:: console
+
+       $ cp sqream2_config.json sqream3_config.json
+       $ vim sqream3_config.json
+
+    **NOTICE:** A unique **instanceId** must be used in each JSON file. In the example above, the instanceId **sqream_2** is changed to **sqream_3**.
+
+11. **Optional** - If you created additional services in **Step 10**, verify that you have also created their additional configuration files:
+
+    .. code-block:: console
+
+       $ cp sqream2-service.conf sqream3-service.conf
+       $ vim sqream3-service.conf
+
+12. For each new SQream service configuration file, do the following:
+
+    1. Change the **SERVICE_NAME=sqream2** value to **SERVICE_NAME=sqream3**.
+
+    2. Change **LOGFILE=/var/log/sqream/sqream2.log** to **LOGFILE=/var/log/sqream/sqream3.log**.
+
+13. Set up **servicepicker**:
+
+    1. Open the **server_picker** configuration file for editing:
+
+       .. code-block:: console
+
+          $ vim /etc/sqream/server_picker.conf
+
+    2. Change the IP **127.0.0.1** to the IP of the server that the **metadataserver** service is running on.
+
+    3. Change the **CLUSTER** to the value of the cluster path.
+
+14. Set up your service files:
+
+    .. code-block:: console
+
+       $ cd /usr/local/sqream/service/
+       $ cp sqream2.service sqream3.service
+       $ vim sqream3.service
+
+15. Increment the **EnvironmentFile=/etc/sqream/sqream2-service.conf** line in each additional SQream service file, as shown below:
+
+    .. code-block:: console
+
+       EnvironmentFile=/etc/sqream/sqream<3>-service.conf
+
+16. Copy and register your service files into systemd:
+
+    .. code-block:: console
+
+       $ sudo cp metadataserver.service /usr/lib/systemd/system/
+       $ sudo cp serverpicker.service /usr/lib/systemd/system/
+       $ sudo cp sqream*.service /usr/lib/systemd/system/
+
+17. Verify that your service files have been copied into systemd, and reload the systemd daemon:
+
+    .. code-block:: console
+
+       $ ls -l /usr/lib/systemd/system/sqream*
+       $ ls -l /usr/lib/systemd/system/metadataserver.service
+       $ ls -l /usr/lib/systemd/system/serverpicker.service
+       $ sudo systemctl daemon-reload
+
+18. Copy the license into the **/etc/sqream** directory:
+
+    .. code-block:: console
+
+       $ cp license.enc /etc/sqream/
+
+If you have an HDFS environment, see :ref:`Launching SQream in an HDFS Environment <hdfs_yaniv.rst>`.
+
+Upgrading SQream Version
+-------------------------
+Upgrading your SQream version requires stopping all running services while you manually upgrade SQream.
+
+**To upgrade your version of SQream:**
+
+1. Stop all actively running SQream services, as shown in the sketch below.
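+   The source does not list the exact stop commands; a minimal hedged sketch, assuming the service names registered above (**metadataserver**, **serverpicker**, and one **sqream** service per JSON file), is:
+
+   .. code-block:: console
+
+      $ sudo systemctl stop sqream2 sqream3
+      $ sudo systemctl stop serverpicker
+      $ sudo systemctl stop metadataserver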
+
+2. Verify that SQream has stopped listening on ports **500X**, **510X**, and **310X**:
+
+   .. code-block:: console
+
+      $ sudo netstat -nltp
+
+3. Replace the old version, ``sqream-db-v2020.2``, with the new version, ``sqream-db-v2021.1``:
+
+   .. code-block:: console
+
+      $ cd /home/sqream
+      $ mkdir tempfolder
+      $ mv sqream-db-v2021.1.tar.gz tempfolder/
+      $ cd tempfolder
+      $ tar -xf sqream-db-v2021.1.tar.gz
+      $ sudo mv sqream /usr/local/sqream-db-v2021.1
+      $ cd /usr/local
+      $ sudo chown -R sqream:sqream sqream-db-v2021.1
+
+4. Remove the old symbolic link:
+
+   .. code-block:: console
+
+      $ sudo rm sqream
+
+5. Create a new symbolic link named **sqream** pointing to the new version:
+
+   .. code-block:: console
+
+      $ sudo ln -s sqream-db-v2021.1 sqream
+
+6. Verify that the symbolic SQream link points to the new folder:
+
+   .. code-block:: console
+
+      $ ls -l
+
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      sqream -> sqream-db-v2021.1
+
+7. **Optional** - For major versions, upgrade your SQream storage cluster, as shown in the following example:
+
+   .. code-block:: console
+
+      $ ./upgrade_storage
+
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      get_leveldb_version path{/home/rhendricks/raviga_database}
+      current storage version 23
+      upgrade_v24
+      upgrade_storage to 24
+      upgrade_storage to 24 - Done
+      upgrade_v25
+      upgrade_storage to 25
+      upgrade_storage to 25 - Done
+      upgrade_v26
+      upgrade_storage to 26
+      upgrade_storage to 26 - Done
+      validate_leveldb
+      ...
+      upgrade_v37
+      upgrade_storage to 37
+      upgrade_storage to 37 - Done
+      validate_leveldb
+      storage has been upgraded successfully to version 37
+
+8. Verify that the latest version has been installed:
+
+   .. code-block:: console
+
+      $ ./sqream sql --username sqream --password sqream --host localhost --databasename master -c "SELECT SHOW_VERSION();"
+
+   The following is an example of the correct output:
+
+   .. code-block:: console
+
+      v2021.1
+      1 row
+      time: 0.050603s
+
+For more information, see the ``upgrade_storage`` command line program.
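+After the upgrade, start the services that you stopped earlier. The source does not show this step; a minimal hedged sketch, assuming the same service names and that **metadataserver** must come up before the services that depend on it:
+
+.. code-block:: console
+
+   $ sudo systemctl start metadataserver
+   $ sudo systemctl start serverpicker
+   $ sudo systemctl start sqream2 sqream3
+   $ sudo netstat -nltp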
diff --git a/guides/operations/setup/recommended_pre-installation_configurations.rst b/guides/operations/setup/recommended_pre-installation_configurations.rst
new file mode 100644
index 000000000..330fbab57
--- /dev/null
+++ b/guides/operations/setup/recommended_pre-installation_configurations.rst
@@ -0,0 +1,1155 @@
+.. _recommended_pre-installation_configurations:
+
+*********************************************
+Recommended Pre-Installation Configuration
+*********************************************
+
+This page describes how to tune your system for better performance and stability before :ref:`installing SQream DB <installing_sqream_with_binary>`, and provides recommendations for production deployments of SQream DB.
+
+.. contents:: In this topic:
+   :local:
+
+Recommended BIOS Settings
+==========================
+The BIOS settings may have a variety of names, or may not exist on your system. Each system vendor has a different set of settings and variables.
+
+It is safe to skip any or all of the configuration steps below, but skipping them may impact performance.
+
+If any doubt arises, consult the documentation for your server or your hardware vendor for the correct way to apply the settings.
+
+.. list-table::
+   :widths: 25 25 50
+   :header-rows: 1
+
+   * - Item
+     - Setting
+     - Rationale
+   * - **Management console access**
+     - **Connected**
+     - A connection to the out-of-band (OOB) management console is required to preserve continuous network uptime.
+   * - **All drives**
+     - **Connected and displayed on the RAID interface**
+     - Prerequisite for cluster or OS installation.
+   * - **RAID volumes**
+     - **Configured according to project guidelines. The server must be rebooted for the changes to take effect.**
+     - Clustered to increase the logical volume size and provide redundancy.
+   * - **Fan speed Thermal Configuration**
+     - Dell fan speed: **High Maximum**. Specified minimum setting: **60**. HPe thermal configuration: **Increased cooling**.
+     - NVIDIA Tesla GPUs are passively cooled and require high airflow to operate at full performance.
+   * - **Power regulator** or **iDRAC power unit policy**
+     - HPe: **HP static high performance** mode enabled. Dell: **iDRAC power unit policy** (power cap policy) disabled.
+     - Other power profiles (such as "balanced") throttle the CPU and diminish performance. Throttling may also cause GPU failure.
+   * - **System Profile**, **Power Profile**, or **Performance Profile**
+     - **High Performance**
+     - The Performance profile provides potentially increased performance by maximizing processor frequency and disabling certain power-saving features, such as C-states. Use this setting for environments that are not sensitive to power consumption.
+   * - **Power Cap Policy** or **Dynamic power capping**
+     - **Disabled**
+     - Other power profiles (like "balanced") throttle the CPU and may diminish performance or cause GPU failure. This setting may appear together with the above (Power Profile or Power regulator) and allows disabling system ROM power calibration during the boot process. Power regulator settings are named differently in the BIOS and in iLO/iDRAC.
+   * - **Intel Turbo Boost**
+     - **Enabled**
+     - Intel Turbo Boost enables overclocking the processor to boost the performance of CPU-bound operations. Overclocking may risk computational jitter due to changes in the processor's turbo frequency. This causes brief pauses in processor operation, introducing uncertainty into application processing time. Turbo operation is a function of power consumption, processor temperature, and the number of active cores.
+   * - **Logical Processor**
+     - **HPe**: Enable **Hyperthreading**. **Dell**: Enable **Logical Processor**.
+     - Hyperthreading doubles the number of logical processors, which may improve performance by ~5-10% for CPU-bound operations.
+   * - **Intel Virtualization Technology** (VT-d)
+     - **Disable**
+     - VT-d is optimal for running VMs. However, when running Linux natively, disabling VT-d boosts performance by up to 10%.
+   * - **Processor C-States** (Minimum processor idle power core state)
+     - **Disable**
+     - Processor C-States reduce server power when the system is in an idle state. This causes slower cold-starts when the system transitions from an idle to a load state, and may reduce query performance by up to 15%. For background, see Dell's `What is the C-state? <https://www.dell.com/support/kbdoc/en-il/000060621/what-is-the-c-state>`__ article.
+   * - **HPe**: **Energy/Performance bias**
+     - **Maximum performance**
+     - Configures the processor sub-systems for high performance and low latency. Other power profiles (like "balanced") throttle the CPU and may diminish performance. Use this setting for environments that are not sensitive to power consumption.
+   * - **HPe**: **DIMM voltage**
+     - **Optimized for Performance**
+     - Setting a higher voltage for DIMMs may increase performance.
+   * - **Memory Operating Mode**
+     - **Optimizer Mode**, **Disable Node Interleaving**, **Auto Memory Operating Voltage**
+     - Memory Operating Mode is tuned for performance in **Optimizer** mode. Other modes may improve reliability but reduce performance. **Node Interleaving** should be disabled, because enabling it interleaves the memory between memory nodes, which harms NUMA-aware applications such as SQream DB.
+   * - **HPe**: **Memory power savings mode**
+     - **Maximum performance**
+     - This setting configures several memory parameters to optimize the performance of the memory sub-systems. The default setting is **Balanced**.
+   * - **HPe**: **ACPI SLIT**
+     - **Enabled**
+     - ACPI SLIT sets the relative access times between processors and memory and I/O sub-systems. ACPI SLIT enables operating systems to use this data to improve performance by more efficiently allocating resources and workloads.
+   * - **QPI Snoop**
+     - **Cluster on Die** or **Home Snoop**
+     - QPI (QuickPath Interconnect) Snoop lets you configure different snoop modes that impact the QPI interconnect. Changing this setting may improve the performance of certain workloads. The default setting of **Home Snoop** provides high memory bandwidth in an average NUMA environment. **Cluster on Die** may provide increased memory bandwidth in highly optimized NUMA workloads. **Early Snoop** may decrease memory latency, but may result in lower overall bandwidth compared to other modes.
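+After applying the BIOS changes, you can sanity-check some of them from Linux. The following optional sketch assumes the **cpupower** utility from the **kernel-tools** package on RHEL/CentOS; the exact output depends on your hardware:
+
+.. code-block:: console
+
+   $ sudo yum install -y kernel-tools
+   $ cpupower idle-info | head
+   $ cpupower frequency-info | grep -i governor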
+
+Installing the Operating System
+===================================================
+Either CentOS (versions 7.6-7.9) or RHEL (versions 7.6-7.9) must be installed before installing the SQream database. Either the customer or a SQream representative can perform the installation.
+
+**To install the operating system:**
+
+#. Select a language (English is recommended).
+#. From **Software Selection**, select **Minimal**.
+#. Select the **Development Tools** group checkbox.
+#. Continue the installation.
+#. Set up the necessary drives and users as per the installation process.
+
+   Installing the debugging tools is recommended for future problem-solving, if necessary.
+
+Selecting the **Development Tools** group installs the following tools:
+
+* autoconf
+* automake
+* binutils
+* bison
+* flex
+* gcc
+* gcc-c++
+* gettext
+* libtool
+* make
+* patch
+* pkgconfig
+* redhat-rpm-config
+* rpm-build
+* rpm-sign
+
+When the installation completes, the root user is created and the OS shell is booted up.
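+If the **Development Tools** checkbox was missed during installation, the group can usually be added afterwards. A hedged sketch for RHEL/CentOS:
+
+.. code-block:: console
+
+   $ sudo yum groupinstall -y "Development Tools"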
+
+Configuring the Operating System
+===================================================
+When configuring the operating system, several basic settings related to creating a new server are required. Configuring these as part of your basic set-up increases your server's security and usability.
+
+Logging In to the Server
+--------------------------------
+You can log in to the server using the server's IP address and the password for the **root** user. The server's IP address and **root** user were set while installing the operating system above.
+
+Automatically Creating a SQream User
+------------------------------------
+
+**To verify an automatically created SQream user:**
+
+1. If a SQream user was created during installation, verify that the same ID is used on every server:
+
+   .. code-block:: console
+
+      $ sudo id sqream
+
+   The ID **1000** is used on each server in the following example:
+
+   .. code-block:: console
+
+      uid=1000(sqream) gid=1000(sqream) groups=1000(sqream)
+
+2. If the IDs are different, delete the SQream user and SQream group from both servers:
+
+   .. code-block:: console
+
+      $ sudo userdel sqream
+
+3. Remove the deleted user's mail spool:
+
+   .. code-block:: console
+
+      $ sudo rm /var/spool/mail/sqream
+
+4. Recreate the user with the same ID on every server, as described in **Manually Creating a SQream User** below.
+
+Manually Creating a SQream User
+--------------------------------
+
+SQream enables you to manually create users. This section shows you how to manually create a user with the UID **1111**; use it if a suitable user was not created during the operating system installation.
+
+**To manually create a SQream user:**
+
+1. Add a user with an identical UID on all cluster nodes:
+
+   .. code-block:: console
+
+      $ useradd -u 1111 sqream
+
+2. Set the user's password:
+
+   .. code-block:: console
+
+      $ passwd sqream
+
+3. Add the user **sqream** to the **wheel** group:
+
+   .. code-block:: console
+
+      $ sudo usermod -aG wheel sqream
+
+   You can remove the user **sqream** from the **wheel** group when the installation and configuration are complete.
+
+4. Log out and log back in as **sqream**.
+
+   **Note:** If you deleted the **sqream** user and recreated it with a different ID, you must change the ownership of /home/sqream to avoid permission errors.
+
+5. Change the ownership of /home/sqream to the user **sqream**:
+
+   .. code-block:: console
+
+      $ sudo chown -R sqream:sqream /home/sqream
+
+Setting Up a Locale
+--------------------------------
+
+SQream enables you to set up a locale. The following example uses the **en_US.UTF-8** locale and the **Asia/Jerusalem** time zone; substitute your own values.
+
+**To set up a locale:**
+
+1. Set the language of the locale:
+
+   .. code-block:: console
+
+      $ sudo localectl set-locale LANG=en_US.UTF-8
+
+2. Set the time zone (time and date) of the locale:
+
+   .. code-block:: console
+
+      $ sudo timedatectl set-timezone Asia/Jerusalem
+
+   If needed, you can run the **timedatectl list-timezones** command to list the available time zones.
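+To confirm that both settings took effect, you can query systemd directly. A short hedged check:
+
+.. code-block:: console
+
+   $ localectl status
+   $ timedatectl | grep "Time zone"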
+
+Installing the Required Packages
+--------------------------------
+You can install the required packages by running the following command:
+
+.. code-block:: console
+
+   $ sudo yum install ntp pciutils monit zlib-devel openssl-devel kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc net-tools wget jq
+
+Installing the Recommended Tools
+--------------------------------
+You can install the recommended tools by running the following command:
+
+.. code-block:: console
+
+   $ sudo yum install bash-completion.noarch vim-enhanced vim-common net-tools iotop htop psmisc screen xfsprogs wget yum-utils deltarpm dos2unix
+
+Installing Python 3.6.7
+--------------------------------
+1. Download the Python 3.6.7 source code tarball file into the **/home/sqream** directory:
+
+   .. code-block:: console
+
+      $ wget https://www.python.org/ftp/python/3.6.7/Python-3.6.7.tar.xz
+
+2. Extract the Python 3.6.7 source code into your current directory:
+
+   .. code-block:: console
+
+      $ tar -xf Python-3.6.7.tar.xz
+
+3. Navigate to the Python 3.6.7 directory:
+
+   .. code-block:: console
+
+      $ cd Python-3.6.7
+
+4. Run the **./configure** script:
+
+   .. code-block:: console
+
+      $ ./configure
+
+5. Build the software:
+
+   .. code-block:: console
+
+      $ make -j30
+
+6. Install the software:
+
+   .. code-block:: console
+
+      $ sudo make install
+
+7. Verify that Python 3.6.7 has been installed:
+
+   .. code-block:: console
+
+      $ python3 --version
+
+Installing NodeJS on CentOS
+--------------------------------
+**To install node.js on CentOS:**
+
+1. Download the `setup_12.x <https://rpm.nodesource.com/setup_12.x>`__ script and run it in a root shell:
+
+   .. code-block:: console
+
+      $ curl -sL https://rpm.nodesource.com/setup_12.x | sudo bash -
+
+2. Clear the YUM cache and update the local metadata:
+
+   .. code-block:: console
+
+      $ sudo yum clean all && sudo yum makecache fast
+
+3. Install the **nodejs** package:
+
+   .. code-block:: console
+
+      $ sudo yum install -y nodejs
+
+4. Verify the node version:
+
+   .. code-block:: console
+
+      $ node -v
+
+Installing NodeJS on Ubuntu
+--------------------------------
+**To install node.js on Ubuntu:**
+
+1. Download the `setup_12.x <https://deb.nodesource.com/setup_12.x>`__ script and run it in a root shell:
+
+   .. code-block:: console
+
+      $ curl -sL https://deb.nodesource.com/setup_12.x | sudo bash -
+
+2. Install the **nodejs** package:
+
+   .. code-block:: console
+
+      $ sudo apt-get install -y nodejs
+
+3. Verify the node version:
+
+   .. code-block:: console
+
+      $ node -v
+
+Configuring the Network Time Protocol (NTP)
+-------------------------------------------
+This section describes how to configure your NTP client.
+
+If you do not have internet access, see **Configure NTP Client to Synchronize with NTP Server**.
+
+**To configure your NTP:**
+
+1. Install the NTP package:
+
+   .. code-block:: console
+
+      $ sudo yum install ntp
+
+2. Enable the **ntpd** service:
+
+   .. code-block:: console
+
+      $ sudo systemctl enable ntpd
+
+3. Start the **ntpd** service:
+
+   .. code-block:: console
+
+      $ sudo systemctl start ntpd
+
+4. Print a list of peers known to the server and a summary of their states:
+
+   .. code-block:: console
+
+      $ sudo ntpq -p
+
+Configuring the Network Time Protocol Server
+--------------------------------------------
+If your organization has an NTP server, you can configure the client to use it.
+
+**To configure your NTP server:**
+
+1. Append your NTP server address to the **/etc/ntp.conf** file:
+
+   .. code-block:: console
+
+      $ echo -e "\nserver <NTP server address>\n" | sudo tee -a /etc/ntp.conf
+
+2. Restart the service:
+
+   .. code-block:: console
+
+      $ sudo systemctl restart ntpd
+
+3. Check that synchronization is enabled:
+
+   .. code-block:: console
+
+      $ sudo timedatectl
+
+   Checking that synchronization is enabled generates the following output:
+
+   .. code-block:: console
+
+      Local time: Sat 2019-10-12 17:26:13 EDT
+      Universal time: Sat 2019-10-12 21:26:13 UTC
+      RTC time: Sat 2019-10-12 21:26:13
+      Time zone: America/New_York (EDT, -0400)
+      NTP enabled: yes
+      NTP synchronized: yes
+      RTC in local TZ: no
+      DST active: yes
+      Last DST change: DST began at
+                       Sun 2019-03-10 01:59:59 EST
+                       Sun 2019-03-10 03:00:00 EDT
+      Next DST change: DST ends (the clock jumps one hour backwards) at
+                       Sun 2019-11-03 01:59:59 EDT
+                       Sun 2019-11-03 01:00:00 EST
+
+Configuring the Server to Boot Without the UI
+---------------------------------------------
+You can configure your server to boot without a UI in cases where it is not required (recommended) by running the following command:
+
+.. code-block:: console
+
+   $ sudo systemctl set-default multi-user.target
+
+Running this command activates the **NO-UI** server mode.
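+You can confirm the default boot target with a quick hedged check:
+
+.. code-block:: console
+
+   $ systemctl get-default
+   multi-user.target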
+
+Configuring the Security Limits
+--------------------------------
+The security limits refer to the number of open files, processes, and so on.
+
+You can configure the security limits by running the following **echo -e** command in a root shell:
+
+.. code-block:: console
+
+   $ sudo bash
+
+.. code-block:: console
+
+   $ echo -e "sqream soft nproc 1000000\nsqream hard nproc 1000000\nsqream soft nofile 1000000\nsqream hard nofile 1000000\nsqream soft core unlimited\nsqream hard core unlimited" >> /etc/security/limits.conf
+
+Configuring the Kernel Parameters
+---------------------------------
+**To configure the kernel parameters:**
+
+1. Append the following kernel parameters to the **/etc/sysctl.conf** file, each on a new line:
+
+   .. code-block:: console
+
+      $ echo -e "vm.dirty_background_ratio = 5 \n vm.dirty_ratio = 10 \n vm.swappiness = 10 \n vm.vfs_cache_pressure = 200 \n vm.zone_reclaim_mode = 0 \n" >> /etc/sysctl.conf
+
+   **Notice:** In the past, the **vm.zone_reclaim_mode** parameter was set to **7**. In the latest SQream version, the **vm.zone_reclaim_mode** parameter must be set to **0**. If it is not set to **0**, when a NUMA node runs out of memory, the system will get stuck and will be unable to pull memory from other NUMA nodes.
+
+2. Check the maximum value of **fs.file-max**:
+
+   .. code-block:: console
+
+      $ sysctl -n fs.file-max
+
+3. *Optional* - If the maximum value of **fs.file-max** is smaller than **2097152**, run the following command:
+
+   .. code-block:: console
+
+      $ echo "fs.file-max=2097152" >> /etc/sysctl.conf
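+The new values only load after running **sysctl -p** or rebooting; a short hedged check that they are active:
+
+.. code-block:: console
+
+   $ sudo sysctl -p
+   $ sysctl vm.swappiness vm.zone_reclaim_mode
+   vm.swappiness = 10
+   vm.zone_reclaim_mode = 0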
+
+**IP4 forwarding** must be enabled for Docker and K8s installations only.
+
+Configuring the Firewall
+--------------------------------
+The example in this section shows the open ports for four sqreamd sessions. If more than four are required, open the required ports as needed. Port 8080 in the example below is the new UI port.
+
+**To configure the firewall:**
+
+1. Start the service and enable FirewallD on boot:
+
+   .. code-block:: console
+
+      $ systemctl start firewalld
+      $ systemctl enable firewalld
+
+2. Add the following ports to the permanent firewall:
+
+   .. code-block:: console
+
+      $ firewall-cmd --zone=public --permanent --add-port=8080/tcp
+      $ firewall-cmd --zone=public --permanent --add-port=3105/tcp
+      $ firewall-cmd --zone=public --permanent --add-port=3108/tcp
+      $ firewall-cmd --zone=public --permanent --add-port=5000-5003/tcp
+      $ firewall-cmd --zone=public --permanent --add-port=5100-5103/tcp
+
+3. Reload the firewall:
+
+   .. code-block:: console
+
+      $ firewall-cmd --reload
+
+4. List the permanent firewall configuration to verify the added ports (**--list-all** only displays the configuration; it does not add ports):
+
+   .. code-block:: console
+
+      $ firewall-cmd --permanent --list-all
+
+If you do not need the firewall, you can disable it instead:
+
+.. code-block:: console
+
+   $ sudo systemctl disable firewalld
+
+Disabling SELinux
+--------------------------------
+**To disable SELinux:**
+
+1. Show the status of **SELinux**:
+
+   .. code-block:: console
+
+      $ sudo sestatus
+
+2. If the output is not **disabled**, edit the **/etc/selinux/config** file:
+
+   .. code-block:: console
+
+      $ sudo vim /etc/selinux/config
+
+3. Change **SELINUX=enforcing** to **SELINUX=disabled**.
+
+The above changes only take effect after rebooting the server. Until you reboot, you can disable SELinux immediately by running the following command:
+
+.. code-block:: console
+
+   $ sudo setenforce 0
+
+Configuring the /etc/hosts File
+--------------------------------
+**To configure the /etc/hosts file:**
+
+1. Edit the **/etc/hosts** file:
+
+   .. code-block:: console
+
+      $ sudo vim /etc/hosts
+
+2. Verify that the file maps the loopback address to your local host, and add an entry for each cluster node:
+
+   .. code-block:: console
+
+      127.0.0.1 localhost
+      <server IP> <server name>
+
+Configuring the DNS
+--------------------------------
+**To configure the DNS:**
+
+1. Run the **ifconfig** command to check your NIC name, and open that NIC's configuration file. In the following example, **eth0** is the NIC name:
+
+   .. code-block:: console
+
+      $ sudo vim /etc/sysconfig/network-scripts/ifcfg-eth0
+
+2. In the file, replace the DNS lines with your own DNS addresses, as in the following example:
+
+   .. code-block:: console
+
+      DNS1=4.4.4.4
+      DNS2=8.8.8.8
+
+Installing the Nvidia CUDA Driver
+===================================================
+
+**Warning:** If your UI runs on the server, the server must be stopped before installing the CUDA drivers.
+
+CUDA Driver Prerequisites
+--------------------------------
+1. Verify that the NVIDIA card has been installed and is detected by the system:
+
+   .. code-block:: console
+
+      $ lspci | grep -i nvidia
+
+2. Check which version of gcc has been installed:
+
+   .. code-block:: console
+
+      $ gcc --version
+
+3. If gcc has not been installed, install it for one of the following operating systems:
+
+   * On RHEL/CentOS:
+
+     .. code-block:: console
+
+        $ sudo yum install -y gcc
+
+   * On Ubuntu:
+
+     .. code-block:: console
+
+        $ sudo apt-get install gcc
+
+Updating the Kernel Headers
+--------------------------------
+1. Update the kernel headers on one of the following operating systems:
+
+   * On RHEL/CentOS:
+
+     .. code-block:: console
+
+        $ sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
+
+   * On Ubuntu:
+
+     .. code-block:: console
+
+        $ sudo apt-get install linux-headers-$(uname -r)
+
+2. Install **wget** on one of the following operating systems:
+
+   * On RHEL/CentOS:
+
+     .. code-block:: console
+
+        $ sudo yum install wget
+
+   * On Ubuntu:
+
+     .. code-block:: console
+
+        $ sudo apt-get install wget
+
+Disabling Nouveau
+--------------------------------
+You can disable Nouveau, which is the default driver.
+
+**To disable Nouveau:**
+
+1. Check if the Nouveau driver has been loaded:
+
+   .. code-block:: console
+
+      $ lsmod | grep nouveau
+
+   If the Nouveau driver has been loaded, the command above generates output.
+
+2. Blacklist the Nouveau drivers to disable them:
+
+   .. code-block:: console
+
+      $ cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
+      blacklist nouveau
+      options nouveau modeset=0
+      EOF
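+   The follow-up commands are not shown in the source. On RHEL/CentOS, the initramfs is typically regenerated so that the blacklist takes effect on the next boot — a hedged sketch:
+
+   .. code-block:: console
+
+      $ sudo dracut --force
+      $ sudo reboot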
+
+3. Install the NVIDIA CUDA driver for your GPU model; see the driver's documentation for the additional set-up requirements. **For K80** GPUs, also enable persistence mode and disable auto-boost:
+
+   .. code-block:: console
+
+      $ nvidia-persistenced
+      $ nvidia-smi -pm 1
+      $ nvidia-smi -acp 0
+      $ nvidia-smi --auto-boost-permission=0
+      $ nvidia-smi --auto-boost-default=0
+
+4. Reboot the server and run the **NVIDIA System Management Interface (NVIDIA SMI)**:
+
+   .. code-block:: console
+
+      $ nvidia-smi
+
+Disabling Automatic Bug Reporting Tools
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+**To disable automatic bug reporting tools:**
+
+1. Disable and stop each of the **abrt** services:
+
+   .. code-block:: console
+
+      $ for i in abrt-ccpp.service abrtd.service abrt-oops.service abrt-pstoreoops.service abrt-vmcore.service abrt-xorg.service ; do sudo systemctl disable $i; sudo systemctl stop $i; done
+
+2. Run the following checks:
+
+   a. Check the OS release:
+
+      .. code-block:: console
+
+         $ cat /etc/os-release
+
+   b. Verify that a SQream user exists and has the same ID on all cluster member services:
+
+      .. code-block:: console
+
+         $ id sqream
+
+   c. Verify that the storage is mounted:
+
+      .. code-block:: console
+
+         $ mount
+
+   d. Verify that the driver has been installed correctly:
+
+      .. code-block:: console
+
+         $ nvidia-smi
+
+   e. Check the maximum value of **fs.file-max**:
+
+      .. code-block:: console
+
+         $ sysctl -n fs.file-max
+
+      The desired output when checking the maximum value of **fs.file-max** is greater than or equal to **2097152**.
+
+   f. As the user **sqream**, display the limits on the core file size (``-c``), the maximum number of user processes (``-u``), and the number of open files (``-n``):
+
+      .. code-block:: console
+
+         $ ulimit -c -u -n
+
+      The following shows the desired output:
+
+      .. code-block:: console
+
+         core file size (blocks, -c) unlimited
+         max user processes (-u) 1000000
+         open files (-n) 1000000
+
+3. If the limits above are not set, configure the security limits by running the **echo -e** command in a root shell:
+
+   .. code-block:: console
+
+      $ sudo bash
+      $ echo -e "sqream soft nproc 1000000\nsqream hard nproc 1000000\nsqream soft nofile 1000000\nsqream hard nofile 1000000\nsqream soft core unlimited\nsqream hard core unlimited" >> /etc/security/limits.conf
+
+The server is now ready for the SQream software installation.
+
+Enabling Core Dumps
+===================================================
+
+Enabling core dumps is recommended, but optional.
+
+**To enable core dumps:**
+
+1. Check the **abrtd** status.
+
+2. Set the limits.
+
+3. Create the core dumps directory.
+
+Each of these steps is described in the following sections.
+
+Checking the abrtd Status
+---------------------------------------------------
+
+**To check the abrtd status:**
+
+1. Check if **abrtd** is running:
+
+   .. code-block:: console
+
+      $ sudo ps -ef | grep abrt
+
+2. If **abrtd** is running, stop it:
+
+   .. code-block:: console
+
+      $ sudo service abrtd stop
+      $ sudo chkconfig abrt-ccpp off
+      $ sudo chkconfig abrt-oops off
+      $ sudo chkconfig abrt-vmcore off
+      $ sudo chkconfig abrt-xorg off
+      $ sudo chkconfig abrtd off
+
+Setting the Limits
+---------------------------------------------------
+
+**To set the limits:**
+
+1. Check the current core file size limit:
+
+   .. code-block:: console
+
+      $ ulimit -c
+
+2. If the output is **0**, add the following lines to the **limits.conf** file (/etc/security); a one-line way to append them is sketched after this procedure:
+
+   .. code-block:: console
+
+      * soft core unlimited
+      * hard core unlimited
+
+3. Log out and log in to apply the limit changes.
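+Rather than editing the file by hand, the two lines can be appended with a single hedged command:
+
+.. code-block:: console
+
+   $ echo -e "* soft core unlimited\n* hard core unlimited" | sudo tee -a /etc/security/limits.conf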
+
+Creating the Core Dumps Directory
+---------------------------------------------------
+
+**To create the core dumps directory:**
+
+1. Make the **/tmp/core_dumps** directory:
+
+   .. code-block:: console
+
+      $ mkdir /tmp/core_dumps
+
+2. Set the ownership of the **/tmp/core_dumps** directory:
+
+   .. code-block:: console
+
+      $ sudo chown sqream:sqream /tmp/core_dumps
+
+3. Grant read, write, and execute permissions to all users:
+
+   .. code-block:: console
+
+      $ sudo chmod -R 777 /tmp/core_dumps
+
+Setting the Output Directory in the /etc/sysctl.conf File
+-----------------------------------------------------------------
+
+**To set the output directory in the /etc/sysctl.conf file:**
+
+1. Edit the **/etc/sysctl.conf** file:
+
+   .. code-block:: console
+
+      $ sudo vim /etc/sysctl.conf
+
+2. Add the following to the bottom of the file. You can choose a different output directory, as long as it is the directory you created above:
+
+   .. code-block:: console
+
+      kernel.core_uses_pid = 1
+      kernel.core_pattern = /tmp/core_dumps/core-%e-%s-%u-%g-%p-%t
+      fs.suid_dumpable = 2
+
+3. Apply the changes without rebooting the server:
+
+   .. code-block:: console
+
+      $ sudo sysctl -p
+
+4. Check that the core output directory points to the following:
+
+   .. code-block:: console
+
+      $ sudo cat /proc/sys/kernel/core_pattern
+
+   The following shows the correct generated output:
+
+   .. code-block:: console
+
+      /tmp/core_dumps/core-%e-%s-%u-%g-%p-%t
+
+5. Verify that core dumping works, as described in the following section.
+
+Verifying that the Core Dumps Work
+---------------------------------------------------
+You can verify that the core dumps work only after installing and running SQream. The command below crashes the server and causes a new **core.xxx** file to be written to the folder defined in **/etc/sysctl.conf**.
+
+**To verify that the core dumps work:**
+
+1. Stop and restart all SQream services.
+
+2. Connect to SQream with ClientCmd and run the following command:
+
+   .. code-block:: console
+
+      $ select abort_server();
+
+Troubleshooting Core Dumping
+---------------------------------------------------
+This section describes the troubleshooting procedure to follow if all parameters have been configured correctly but the cores have not been created.
+
+**To troubleshoot core dumping:**
+
+1. Reboot the server.
+
+2. Verify that you have folder permissions:
+
+   .. code-block:: console
+
+      $ sudo chmod -R 777 /tmp/core_dumps
+
+3. Verify that the limits have been set correctly:
+
+   .. code-block:: console
+
+      $ ulimit -c
+
+   If all parameters have been configured correctly, the correct output is:
+
+   .. code-block:: console
+
+      unlimited
+
+4. If all parameters have been configured correctly but running **ulimit -c** outputs **0**, edit the **/etc/profile** file:
+
+   .. code-block:: console
+
+      $ sudo vim /etc/profile
+
+5. Search for the following line and comment it out with a **hash** (#) symbol:
+
+   .. code-block:: console
+
+      ulimit -S -c 0 > /dev/null 2>&1
+
+6. If the line is not found in the **/etc/profile** file, do the following:
+
+   a. Open the **/etc/init.d/functions** file:
+
+      .. code-block:: console
+
+         $ sudo vim /etc/init.d/functions
+
+   b. Search for the following line:
+
+      .. code-block:: console
+
+         ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0} >/dev/null 2>&1
+
+   c. If the line is found, comment it out with a **hash** (#) symbol and reboot the server.
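+After any of the fixes above, a quick hedged re-check confirms that cores are now written to the directory configured earlier:
+
+.. code-block:: console
+
+   $ ulimit -c
+   unlimited
+   $ ls -l /tmp/core_dumps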