
Multi Table Logic Execution

Val Huber edited this page Sep 19, 2020 · 4 revisions

Performance is key

Rules have been used in decision logic, and even in some ORM systems. They can provide great value, but typically do not provide scalable transaction performance.

In some cases such as decision logic, this is inescapable - you have to run all the rules on all the supplied data. But on transactional systems, this is not the case. Instead of just processing an array of row objects, the system can (and must) recognize that:

  • each updated row has an existing row on disk - the old values

  • the system can compare the old values to the new values, and

    1. Prune the rules that do not apply

    2. And when the rules must be run, the declarative nature of rules enables the system to execute the rules in any manner that returns the correct result. In particular, the system can avoid expensive aggregate queries, and use the old/new delta to compute a 1-row adjustment to the parent row.
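As a minimal sketch of the old/new comparison behind pruning (the helper and rule names here are hypothetical, not the engine's actual API):

```python
# Hypothetical sketch: prune rules whose referenced columns did not change.

def changed_columns(old_row: dict, new_row: dict) -> set:
    """Columns whose values differ between the old (disk) row and the new row."""
    return {col for col in new_row if old_row.get(col) != new_row[col]}

def prune(rules, old_row, new_row):
    """Keep only the rules that depend on at least one changed column."""
    changed = changed_columns(old_row, new_row)
    return [rule for rule in rules if rule["depends_on"] & changed]

rules = [
    {"name": "adjust_customer_balance", "depends_on": {"AmountTotal"}},
    {"name": "format_due_date", "depends_on": {"DueDate"}},
]

old = {"AmountTotal": 100, "DueDate": "2020-09-01"}
new = {"AmountTotal": 150, "DueDate": "2020-09-01"}  # only the total changed

to_run = prune(rules, old, new)  # only the balance rule survives
```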

For example, imagine you need to compute the balance for the credit limit check. You need to add up all the order totals (an expensive SQL `sum`). But it’s worse - the order total is itself a summed field, so you need to add up all of those, too.

If a customer has thousands of orders, each with thousands of items, this will be painfully slow.

But if the system leverages the old/new delta to make an adjustment update, an order of $50 simply means "add 50 to the existing balance" - no need to aggregate the totals.
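The adjustment arithmetic is just a delta applied to the stored parent value (function and column names here are illustrative):

```python
# Illustrative adjustment: a 1-row update using the old/new delta,
# instead of re-aggregating thousands of child rows.

def adjust_balance(customer_balance, old_order_total, new_order_total):
    """Apply only the delta to the stored parent aggregate."""
    return customer_balance + (new_order_total - old_order_total)

# Inserting a $50 order: the old total is 0, so the balance grows by 50.
balance = adjust_balance(1000.0, 0.0, 50.0)
```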

Adjustments - sums / counts adjusted in 1-row updates, not expensive aggregate SQLs

Rollups provoke an important design choice: store the aggregate, or sum things on the fly. Here, the stored aggregates are Customer.Balance, and Order.AmountTotal (a chained aggregate). There are good cases to be made for both approaches:

  • Sum on the fly - use SQL `select sum` queries to aggregate child data as required. This eliminates the consistency risk of storing redundant data (i.e., the aggregate becomes invalid if an application fails to adjust it in every case).

  • Stored Aggregates - a good choice when data volumes are large and/or the aggregates chain, since the application can adjust the aggregate (a 1-row update) based on the delta of the children.

This design decision can dominate application coding. It's insidious, since data volumes may not be known when coding begins. (Ideally, this can be a "late binding" decision, like a SQL index.)
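The trade-off can be made concrete with an in-memory SQLite table (the schema and names below are hypothetical, chosen only to mirror the Customer/Order example):

```python
import sqlite3

# Illustrative contrast of the two approaches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
conn.execute("INSERT INTO customer VALUES (1, 30.0)")  # stored aggregate of the rows below
conn.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                 [(1, 10.0), (1, 20.0)])

# Sum on the fly: re-aggregate every child row each time the balance is needed.
(on_the_fly,) = conn.execute(
    "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = 1").fetchone()

# Stored aggregate: inserting a $50 order is one child insert plus a single
# 1-row adjustment to the parent - no aggregate query over the children.
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 50.0)")
conn.execute("UPDATE customer SET balance = balance + 50.0 WHERE id = 1")
(stored,) = conn.execute("SELECT balance FROM customer WHERE id = 1").fetchone()
```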

The logic engine uses the Stored Aggregate approach. This optimizes multi-table update logic chaining, where updates to 1 row trigger updates to other rows, which further chain to still more rows.
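A sketch of that chaining, with plain dicts standing in for rows (names are illustrative, not the engine's API):

```python
# Illustrative chaining: an OrderDetail change adjusts its Order,
# whose delta in turn adjusts the Customer - three 1-row updates, no aggregates.

def adjust_chain(customer, order, old_detail_amount, new_detail_amount):
    delta = new_detail_amount - old_detail_amount
    order["AmountTotal"] += delta   # parent Order: 1-row adjustment
    customer["Balance"] += delta    # grandparent Customer: 1-row adjustment
    return customer, order

customer = {"Balance": 500.0}
order = {"AmountTotal": 100.0}
adjust_chain(customer, order, 10.0, 35.0)  # a detail line grows by 25
```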

Pruning

Pruning was core to changing Order dates:

  • DueDate had no dependencies, so all the logic for adjusting Customers and cascading OrderDetails was pruned.

  • Contrast this to the multiple rows retrieved / updated when ShippedDate is changed.
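Assuming, for illustration, that the balance rule sums unshipped orders (so it depends on ShippedDate but not DueDate), the pruning check might look like:

```python
# Hypothetical dependency set for the balance rule: it must re-run when
# a dependent column changes, and is pruned otherwise.
BALANCE_RULE_DEPENDS_ON = {"AmountTotal", "ShippedDate"}

def rules_triggered(changed_columns: set) -> set:
    """Dependencies touched by this update; empty means fully pruned."""
    return BALANCE_RULE_DEPENDS_ON & changed_columns

due_date_change = rules_triggered({"DueDate"})        # empty: no SQL issued
shipped_change = rules_triggered({"ShippedDate"})     # balance must be adjusted
```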
