Binary files added: images/ssb/mv1.png, images/ssb/mvconfig1.png, images/ssb/mvconfig2.png, images/ssb/mvconfig3.png, images/ssb/mvlist.png, images/ssb/ssb-iot-enriched-avro.png, images/ssb/ssb-job-running.png, images/ssb/ssb-kafka-source.png, images/ssb/ssb-new-kafka-table.png, images/ssb/ssb_job_status.png
113 changes: 68 additions & 45 deletions workshop_ssb.adoc
Albeit simple, this task will show the ease of use and power of SQL Stream Builder.

Before you can start querying data from Kafka topics you need to register the Kafka clusters as _data sources_ in SSB.

. Login to the Cloudera Manager console using username `admin` and password `Supersecret1`.

. On the Cloudera Manager console, click on the Cloudera logo at the top-left corner to ensure you are at the home page and then click on the *SQL Stream Builder* service.

. Click on the *SQLStreamBuilder Console* link to open the SSB UI, and log in with the same credentials (`admin` / `Supersecret1`).

. You will notice that SSB already has a project named `admin_default`. Click on the blue `-> Open` button to see what's inside.

. Under `Data Sources`, you will see a `Kafka` folder with a ready-made cluster named `Local Kafka`:
+
image::images/ssb/ssb-kafka-source.png[width=800]

. You can use this screen to add other external Kafka clusters as data providers to SSB. In this lab you'll add a second data provider using a different host name, just to show how simple it is.

. Click on the 3 dots next to the `Kafka` folder name, then click `New Kafka Data Source`. In the pop-up window enter the details for your new data source, as provided below.
+
[source,yaml]
----
Connection protocol: PLAINTEXT
----
+
image::images/ssb/add-kafka-provider.png[width=400]
+
TIP: If you are unsure about the format of the brokers list, you can always reference the existing `Local Kafka` source, or ask your workmates and Cloudera instructors for help. 😊

. Finally, click *Validate* (on the bottom left) and *Save changes* (on the bottom right) to create your data source.


[[lab_2, Lab 2]]
== Lab 2 - Create a Table for a topic with JSON messages

Now you can _map_ the `iot_enriched` topic to a _table_ in SQL Stream Builder.
_Tables_ in SSB are a way to associate a Kafka topic with a schema so that you can use it in your SQL queries.

. To create your first Table you first need to create a Job. Click on the 3 dots next to the `Jobs` folder and then click `New Job`. Enter a name for your job (e.g. "my_first_job") and click on the *Create* button.
. Click on the 3 dots next to the `Virtual Tables` folder and then click `New Kafka Table`.

. On the *Kafka Table* window, enter the following information:
+
Data Format: JSON
Topic Name: iot_enriched
----
+
image::images/ssb/ssb-new-kafka-table.png[width=800]

. Ensure the *Schema Definition* tab is selected. Click *Detect Schema* at the bottom of the window.
SSB will take a sample of the data flowing through the topic and will infer the schema used to parse the content.
Alternatively you could also specify the schema in this tab.
+
When it's done, click *OK* to acknowledge the "Schema Detection Complete" message.
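+
As a mental model for the inference step, the sketch below (plain JavaScript, illustration only, not SSB's actual algorithm) derives a field-to-type mapping from a sampled JSON message:
+
[source,javascript]
----
// Toy schema inference: map each field of a sampled JSON message to a
// SQL-ish type, the way a sampled topic message is turned into a schema.
function inferSchema(sampleJson) {
  var sample = JSON.parse(sampleJson);
  var schema = {};
  for (var field in sample) {
    var v = sample[field];
    if (typeof v === "number") {
      schema[field] = Number.isInteger(v) ? "BIGINT" : "DOUBLE";
    } else if (typeof v === "boolean") {
      schema[field] = "BOOLEAN";
    } else {
      schema[field] = "VARCHAR";
    }
  }
  return schema;
}
----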

. Whenever you need to manipulate the source data to fix, cleanse or convert some values, you can define transformations for the table.
Transformations are defined in JavaScript code.
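+
As an illustration, here is a minimal transformation sketch. In SSB the record is exposed through an implicit `record` variable; the sketch wraps the same logic in a function so it can be run standalone, and the `sensor_0` field name is an assumption, not taken from this workshop:
+
[source,javascript]
----
// Hypothetical transformation: parse the Kafka record's value,
// round one sensor reading, and emit the modified payload as a JSON string.
function transform(record) {
  var payload = JSON.parse(record.value);
  payload.sensor_0 = Math.round(payload.sensor_0); // assumed field name
  return JSON.stringify(payload);
}
----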
+
image::images/ssb/source-properties.png[width=400]
NOTE: Setting the *Consumer Group* properties for a virtual table will allow SSB to also store offsets in Kafka, in addition to storing offsets in the job state, which is the default.

. Click *Create and Review* to complete the table creation. On the *Review* window, click *Keep*.
. Let's query the newly created table to ensure things are working correctly. Go to the job you created (in this example, "my_first_job") and enter the following query in the SQL editor at the top of the screen:
+
[source,sql]
----
FROM
+
NOTE: The first query execution usually takes a bit longer, since SSB has to start the Job Manager that will handle the job execution.
+
image::images/ssb/ssb-job-running.png[width=800]


. Click *Stop* to stop the job and release all the cluster resources used by the query.
Evolve: checked
+
image::images/ssb/schema-registy-iot-enriched.png[width=800]

. Back on the SQL Stream Builder page, expand the `Data Sources` folder, click on the 3 dots next to the `Catalog` folder, and then click *New Catalog*.

. In the *Catalog* dialog box, enter the following details:
+
Schema Registry URL: http://<CLUSTER_HOSTNAME>:7788/api/v1
Enable TLS: No
----

. In the same window, under *Filters*, enter the following configuration for the filter:
+
[source,yaml]
----
Database Filter: .*
Table Filter: iot.*
----

NOTE: Make sure to write `.\*` in the *Database Filter* field, and not `*`, otherwise you'll get a validation error later.

. *IMPORTANT:* Click on the plus sign (*+*) beside the filter details to save the filter.

. Click on *Validate*. If the configuration is correct you should see the message "Provider is valid".
Hover your mouse over the message and you'll see the number of tables (schemas) that matched your filter.
image::images/ssb/add-sr-catalog.png[width=400]

. Click *Create* to complete the catalog registration.

. Under `External Resources/Virtual Tables/sr/default_database/` you should see the list of tables that were imported from the Schema Registry.
+
image::images/ssb/ssb-iot-enriched-avro.png[width=500]

. Use the job you created earlier (or create a new one) to query the imported table and ensure it is working correctly.
+
Clear the contents of the SQL editor and type the following query:
+
Streams Messaging Manager Web UI*).
... Availability: `Low`
... Cleanup Policy: `delete`

. On the SSB UI, create a new job by clicking on the 3 dots next to the `Jobs` folder, then on `New Job`.

. On the *Create New Job* dialog box, enter `Sensor6Stats` for the *Job Name* and click *Create*.

. In the SQL editor type the query shown below, *but do not execute it yet*.
+
This query will compute aggregates over 30-seconds windows that slide forward every second. For a specific sensor value in the record (`sensor_6`) it computes the following aggregations for each window:
+
image::images/ssb/template-table-edited.png[width=400]

. Click *Execute* and the table will be created.

. An additional way to create the table is by running this query:
+
[source,sql]
----
CREATE TABLE `ssb`.`ssb_default`.`sensor6stats` (
  `device_id` BIGINT,
  `windowEnd` TIMESTAMP(3) NOT NULL,
  `sensorCount` BIGINT NOT NULL,
  `sensorSum` BIGINT,
  `sensorAverage` FLOAT,
  `sensorMin` BIGINT,
  `sensorMax` BIGINT,
  `sensorGreaterThan60` INT NOT NULL
) WITH (
  'connector' = 'kafka: edge2ai-kafka',   -- Connector to use; for Local Kafka it must be 'kafka: Local Kafka'
  'format' = 'json',                      -- Data format
  'scan.startup.mode' = 'group-offsets',  -- Startup mode for the Kafka consumer; valid values are 'earliest-offset', 'latest-offset', 'group-offsets', 'timestamp' and 'specific-offsets'
  'topic' = 'sensor6stats',               -- Topic to read from when the table is used as a source; a semicolon-separated topic list is also supported
  'properties.group.id' = 'sensor6stats-group-id',
  'properties.auto.offset.reset' = 'latest'
);
----

. Type the original query into the editor again and press *Execute* to run it.
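+
As a mental model, each row the job emits corresponds to per-window aggregates like the ones sketched below (plain JavaScript, illustration only):
+
[source,javascript]
----
// What the windowed query computes for the sensor_6 readings of one
// device within a single 30-second window.
function windowStats(readings) {
  var sum = readings.reduce(function (a, b) { return a + b; }, 0);
  return {
    sensorCount: readings.length,
    sensorSum: sum,
    sensorAverage: readings.length ? sum / readings.length : null,
    sensorMin: Math.min.apply(null, readings),
    sensorMax: Math.max.apply(null, readings),
    sensorGreaterThan60: readings.filter(function (r) { return r > 60; }).length
  };
}
----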

. At the bottom of the screen you will see the log messages generated by your query execution.
Note that the data displayed on the screen is only a sample of the data returned by the query.
+
image::images/ssb/sql-aggr-results.png[width=800]

. Check the job execution details and logs by clicking on *SQL Jobs* (on the left bar). Explore the options on this screen:
+
In this lab you'll create and query Materialized Views (MV).

You will define MVs on top of the query you created in the previous lab. Make sure that query is running before executing the steps below.

. On the *SQL Jobs* screen, verify that the `Sensor6Stats` job is running, then select it:
+
image::images/ssb/ssb_job_status.png[width=800]

. In order to add Materialized Views to a query the job needs to be stopped.
On the job page, click the *Stop* button to pause the job.
Primary Key: device_id
Retention: 300
----
+
image::images/ssb/mv1.png[width=500]

. To create a MV you need to have an API Key.
The API key is the information given to clients so that they can access the MVs.
If you have multiple MVs and want them to be accessed by different clients you can have multiple API keys to control access to the different MVs.
+
If you have already created an API Key in SSB you can select it from the drop-down list.
Otherwise, create one on the spot by clicking on the "plus" button shown above.
Otherwise, create one on the spot by clicking on the "New Key" button shown above.
Use `ssb-lab` as the Key Name.
+
Once the API key is created, select it for your MV.
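+
Once an endpoint exists, clients read it over REST by passing the API key. The sketch below builds such a URL; the port (18131) and path are assumptions, so copy the exact URL that SSB displays for your endpoint:
+
[source,javascript]
----
// Build a Materialized View endpoint URL carrying the API key.
// Host, port and path are assumptions -- use the URL SSB shows you.
function mvUrl(host, endpointPath, apiKey) {
  return "http://" + host + ":18131/api/v1/query/" + endpointPath +
         "?key=" + encodeURIComponent(apiKey);
}

// With Node 18+ the rows could then be fetched with:
//   const rows = await (await fetch(mvUrl(host, "above60", key))).json();
----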

. Click *New Endpoint* to create a new MV.
You will create a view that shows all the devices for which `sensor6` has had at least one reading above 60 in the last 300 seconds (the MV retention period).
+
For this, enter the following parameters in the MV Query Configuration page:
URL Pattern: above60
URL Pattern: above60
Description: All devices with a sensor6 reading greater than 60
Query Builder: <click "Select All" to add all columns>
Filters: <click "+ Rule" to configure filter> sensorGreaterThan60 greater 0
----
+
image::images/ssb/mvconfig1.png[width=500]
+
image::images/ssb/mvconfig2.png[width=500]

. Click *Create*.
. Click *Save*.

. Close the *Materialized Views* tab and click on *Execute* to start the job again.

In this section you will create a new MV that allows filtering by specifying a range of values.

. First, stop the job again so that you can add another MV.

. Click on the *Materialized Views* button and then on *New Endpoint* to create a new MV.
+
Enter the following property values and click *Save*.
+
[source,yaml]
----
Filters: sensorGreaterThan60 greater 0
sensorAverage less or equal {upperTemp}
----
+
image::images/ssb/mvconfig3.png[width=600]

. You will notice that the new URL for this MV has placeholders for the `{lowerTemp}` and `{upperTemp}` parameters:
+
image::images/ssb/mvlist.png[width=500]
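+
Before calling a parameterized endpoint, clients must substitute concrete values for the placeholders. A minimal sketch (the placeholder syntax shown is taken from the URL above; how your client builds the final URL is up to you):
+
[source,javascript]
----
// Replace {name}-style placeholders in an MV URL with concrete values.
function fillParams(urlTemplate, params) {
  return urlTemplate.replace(/\{(\w+)\}/g, function (match, name) {
    return encodeURIComponent(params[name]);
  });
}
----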

. Close the *Materialized View* tab and execute the job again.

image::images/ssb/mv-parameters.png[width=400]
You have now taken data from one topic, calculated aggregated results and written these to another topic.
In order to validate that this was successful you have selected the result with an independent select query.
Finally, you created Materialized Views for one of your jobs and queried those views through their REST endpoints.