Conversation
There was a problem hiding this comment.
Thanks for the update — removing the old Service Bus polling / message-count logic is definitely the right direction now that all entities have SB triggers in place.
That said, I don’t think this PR matches the approach we discussed, so I don’t think it should be merged in its current state.
Main concern
We agreed not to use the orchestration file for this routing, and instead to simplify the pipeline by replacing the large set of IfConditions with a single Switch on entity_name.
This PR does not do that. Instead it:
- adds a lookup to orchestration_ServiceBus.json
- filters metadata to find a notebook name
- introduces a Switch based on EMPTY / NOT_EMPTY / NOT_EMPTY_NSIP_PROJECT
- still keeps nsip-project as a hardcoded special case anyway
So this is not really the clean Switch-based routing we discussed. It adds extra complexity and introduces metadata dependency into the pipeline when the intention was to keep the routing explicit and simple inside the pipeline itself.
Orchestration file point:
We also discussed not adding anything to the existing orchestration file.
The current orchestration file is already being used for broader source / ingestion metadata, and we didn’t want to start mixing in extra pipeline-routing metadata there as well.
If we ever decide to go down the orchestration-driven route in future, then it should be done via a separate orchestration/config file built specifically for this pipeline, rather than extending the current shared orchestration metadata and mixing responsibilities together.
So even if we were to accept metadata-driven routing later, it still shouldn’t be implemented by adding more logic to the current orchestration structure.
Why this is a problem
1 - The Switch is not routing by entity
- The case names (EMPTY, NOT_EMPTY, NOT_EMPTY_NSIP_PROJECT) are not clear processing routes.
- We wanted a Switch on the actual entity, which would be much easier to read and maintain.
2 - The orchestration file should not be involved here - This adds unnecessary lookup/filter/set-variable steps.
- It also creates another place where the pipeline can silently break if metadata is missing or inconsistent.
3 - nsip-project is still hardcoded separately - If we are moving to a Switch-based approach, then the entity routing should be explicit and consistent for all entities.
What I think the pipeline should do instead:
After the generic notebooks, the pipeline should go to a single Switch activity on entity_name, for example:
service-user → py_sb_horizon_harmonised_service_user
nsip-project → py_sb_horizon_harmonised_nsip_project then py_sb_harmonised_nsip_invoice and py_sb_harmonised_nsip_meeting
nsip-representation → py_sb_horizon_harmonised_nsip_representation
nsip-document → py_sb_horizon_harmonised_nsip_document
nsip-exam-timetable → py_sb_horizon_harmonised_nsip_exam_timetable
s51-advice → py_sb_horizon_harmonised_nsip_s51_advice
appeal-document → py_sb_horizon_harmonised_appeal_document
appeal-event → py_sb_horizon_harmonised_appeal_event
appeal-has → py_sb_horizon_harmonised_appeal_has
appeal-s78 → py_sb_horizon_harmonised_appeal_s78
default → generic py_sb_std_to_hrm
That would:
- remove the need for the orchestration lookup entirely
- make the pipeline easier to understand
- match the approach we agreed
- keep all entity routing in one visible place
Other cleanup points:
There are also still a few leftover items from the old design that should be removed if we are refactoring this pipeline properly, for example:
- Number_of_messages
- any old naming/messages that still refer to checking/loading Service Bus rather than processing already-landed data
Suggested action:
I’d recommend reworking this PR so it:
- removes the orchestration lookup/filter/variable steps
- replaces them with one Switch directly on entity_name
- routes each entity explicitly
- uses default branch for generic
At the moment this feels like a partial redesign in a different direction than what we agreed.
|
@frederic Jonquieres Thanks for the feedback. I wanted to clarify one point: the use of the orchestration lookup was an intentional design decision. After discussing the options with Ram, we agreed to keep the orchestration lookup because it will support our future maintenance strategy. It provides a central place to manage notebook routing as the number of entities grows, rather than hardcoding the routing logic directly into the pipeline. |
Thanks for the clarification, but I want to be clear that we did not agree to use the existing orchestration file. What we discussed were two possible approaches: 1 - Use a Switch on entity_name
2 - If we wanted to go down the orchestration/config route
That is not what has been done here. In this PR, routing has been added by using the existing orchestration file, which is specifically what we said not to do. So just to be clear, the two agreed options were:
At the moment, this implementation is neither of those, so I don’t think this PR can be merged in its current form. |
…upportServiceBustrigger
…upportServiceBustrigger
…upportServiceBustrigger
|
Azure Pipelines: Successfully started running 1 pipeline(s). 4 pipeline(s) were filtered out due to trigger conditions. |
|
Azure Pipelines: Successfully started running 1 pipeline(s). 4 pipeline(s) were filtered out due to trigger conditions. |
|
Azure Pipelines: Successfully started running 1 pipeline(s). 4 pipeline(s) were filtered out due to trigger conditions. |
|
Azure Pipelines: Successfully started running 1 pipeline(s). 4 pipeline(s) were filtered out due to trigger conditions. |
|
Azure Pipelines: Successfully started running 1 pipeline(s). 4 pipeline(s) were filtered out due to trigger conditions. |
|
Azure Pipelines: Successfully started running 1 pipeline(s). 4 pipeline(s) were filtered out due to trigger conditions. |
https://pins-ds.atlassian.net/browse/THEODW-2905
Updated Pipeline Flow pln_service_bus
Initialise variables and build pipeline name
Extract entity_name
Construct: pln_service_bus - <entity_name>
Log start to Application Insights
Stage: OnProgress
Message: “Progressing to load service bus: <entity_name>”
Create tables in Standardised, Harmonised, Curated layers
Runs create_table_from_schema for all 3 layers
Includes pipeline metadata
Raw → Standardised notebook execution
Notebook: py_sb_raw_to_std
Handles success + failure logging
Standardised → Harmonised notebook execution
Notebook: py_sb_std_to_hrm
Handles success + failure logging
Load orchestration config for entity‑specific harmonised logic
Loads orchestration_ServiceBus.json
Filters matching Source_Folder = ServiceBus and Source_Frequency_Folder = <entity_name>
Extracts harmonised_notebook_name
Run configured harmonised notebook(s)
Uses Switch condition:
NOT_EMPTY → run configured notebook
NOT_EMPTY_NSIP_PROJECT → run notebooks for nsip-project
EMPTY → skip harmonised step
Log completion or failure status
On success: Stage = Completion
On failure: Stage = Fail
Switch error captured if applicable
PR Template
Note: Run the correct ADO pipeline for this PR - check the list here:
ODW Repositories
JIRA Ticket Reference :
[ Enter JIRA ticket number and Title here]
Summary of the work :
[ Enter Summary here]
New Source-to-Raw Datasets
New Tables in Standardised Layer
New Tables in Harmonised or Curated Layers
Schema or Column Changes
(Only new columns or columns with changed data types are in scope)
Script Execution in Build
Table Creation and Schema Validation
Deployment and Schema Change Documentation
Archiving Process Review