@@ -73,7 +73,7 @@ Variant 1: Explicit equality-rules with the ``equality`` property
"equality": [
["eq", "a.x", "b.x"],
["eq", "b.x", "c.y"],
-["eq", "c.z", "d.z"],
+["eq", "c.z", "d.z"]
],
"supports_signalling": false
}
@@ -88,7 +88,7 @@ Variant 2: Implicit equality-rules with the ``equality_sets`` property
"datasets": ["A a", "B b", "C c", "D d"],
"equality_sets": [
["a.x", "b.x", "c.y"],
-["c.z", "d.z"],
+["c.z", "d.z"]
],
"supports_signalling": false
}
@@ -273,7 +273,7 @@ one at a time, like this:
"equality": [
["eq", "a.x", "b.x"],
["eq", "b.x", "c.y"],
-["eq", "c.z", "d.z"],
+["eq", "c.z", "d.z"]
],

The ``equality_sets`` property was added as a way to make it clearer which equality-rules belong together.
@@ -283,7 +283,7 @@ The equality-rules above could be expressed like this:

"equality_sets": [
["a.x", "b.x", "c.y"],
-["c.z", "d.z"],
+["c.z", "d.z"]
],

Note that the ``equality_sets`` property is just a bit of syntactic sugar; behind the scenes the implicit
@@ -296,7 +296,7 @@ if you accidentally specify two equality-sets that are actually overlapping. If

"equality_sets": [
["a.x", "b.x", "c.y"],
-["c.y", "d.y"],
+["c.y", "d.y"]
],

you won't actually get two equality-sets, since behind the scenes you end up with these equality-rules:
@@ -314,7 +314,7 @@ which is equivalent to specifying a single equality-set, like this:
.. code-block :: json

"equality_sets": [
-["a.x", "b.x", "c.y", "d.y"],
+["a.x", "b.x", "c.y", "d.y"]
],
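
The relationship between ``equality_sets`` and ``equality`` described in this hunk can be illustrated with a short Python sketch. This is not Sesam's implementation: the helper names ``expand_equality_sets`` and ``connected_groups`` are hypothetical, and the sketch assumes that adjacent members of a set are chained pairwise, which matches the ``equality`` / ``equality_sets`` pair shown above.

```python
def expand_equality_sets(equality_sets):
    """Chain adjacent members of each set into explicit ["eq", left, right] rules.

    Assumption: pairwise chaining of neighbours, as in the documented example.
    """
    rules = []
    for members in equality_sets:
        for left, right in zip(members, members[1:]):
            rules.append(["eq", left, right])
    return rules


def connected_groups(rules):
    """Group expressions that are transitively linked by eq rules (union-find)."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for _, left, right in rules:
        parent[find(left)] = find(right)

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


# Two sets that overlap on "c.y" ...
overlapping = [["a.x", "b.x", "c.y"], ["c.y", "d.y"]]
rules = expand_equality_sets(overlapping)
# ... collapse into a single group once the rules are followed transitively.
merged = connected_groups(rules)
print(rules)
print(merged)
```

Because both sets mention ``c.y``, the generated rules link all four expressions, so the result is one group rather than two. Disjoint sets such as ``["a.x", "b.x", "c.y"]`` and ``["c.z", "d.z"]`` stay separate.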

Continuation support
@@ -369,7 +369,7 @@ Dataset ``C``:
[
{"_id": "c1", "f3": "X"},
{"_id": "c2", "_deleted": true, "f3": "Y"},
-{"_id": "c3", "_deleted": true, "f3": "X"},
+{"_id": "c3", "_deleted": true, "f3": "X"}
]


@@ -384,7 +384,7 @@ Pipe configuration:
"datasets": ["A a", "B b", "C c"],
"equality": [
["eq", "a.f1", "b.f1"],
-["eq", "b.f2", ["lower", "c.f3"]],
+["eq", "b.f2", ["lower", "c.f3"]]
]
}
}
14 changes: 7 additions & 7 deletions hub/features/namespaces.rst
@@ -7,7 +7,7 @@ Namespaces

Namespaces in Sesam are inspired by the Resource Description Framework `(RDF) <https://www.w3.org/RDF/>`_, where namespaces are URL references that allow us to reuse names from different sources without losing context. In Sesam, namespaces are used to determine the origin of attributes, which is essential for master data management in :ref:`global pipes <whatis-global>`. Inside global pipes we often wish to merge entities with other similar entities, such as person data from a CRM system and the equivalent person data from an HR system. Sesam prefixes namespaces to each attribute name in order to merge data from multiple sources without losing context. Namespaces in Sesam are also essential for :ref:`late schema binding <transform-late-schema-binding>`, where we map global models to target models.

-With some exceptions, attributes will always inherit the pipe name where the attribute was first created as its namespace, i.e. an attribute ``x`` created in pipe ``y`` will use the namespace ``y:``, becoming ``y:x``.
+With some exceptions, attributes will always inherit the pipe name where the attribute was first created as its namespace, i.e. an attribute ``x`` created in pipe ``y`` will use the namespace ``y:``, becoming ``y:x``.

See examples below.

@@ -26,7 +26,7 @@ See examples below.
- visma-person:name
* - hubspot-company
- company_name
- - hubspot-company:company_name
+ - hubspot-company:company_name
* - visma-company
- company_name
- visma-company:company_name
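
The prefixing shown in the table rows above can be sketched in a few lines of Python. This only illustrates the naming rule (attribute ``x`` created in pipe ``y`` becomes ``y:x``); it is not how Sesam implements it. The function name ``add_namespace`` and the sample values are hypothetical, and the handling of ``_id`` and other reserved attributes is deliberately left out.

```python
def add_namespace(pipe_name, attributes):
    """Return a copy of `attributes` with every key prefixed by '<pipe_name>:'."""
    return {f"{pipe_name}:{key}": value for key, value in attributes.items()}


# Hypothetical sample data: the same attribute name arrives from two pipes.
hubspot = add_namespace("hubspot-company", {"company_name": "Acme AS"})
visma = add_namespace("visma-company", {"company_name": "Acme"})

# After prefixing, the two attributes no longer collide when merged.
merged_entity = {**hubspot, **visma}
print(merged_entity)
```

Without the prefix, the second ``company_name`` would overwrite the first in the merged entity. With it, both values survive and their origin remains visible.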
@@ -35,10 +35,10 @@ How to enable
-------------

**Enable on specific pipes:**
-Namespaces can be enabled on specific pipes by setting the required property/properties in the pipe configuration (see properties below).
+Namespaces can be enabled on specific pipes by setting the required property/properties in the pipe configuration (see properties below).

**Enable globally in a subscription:**
-You can enable namespaces in the service metadata for all the pipes in your subscription. This can be overridden at the pipe level.
+You can enable namespaces in the service metadata for all the pipes in your subscription. This can be overridden at the pipe level.

.. important::

@@ -80,14 +80,14 @@ Properties
- Boolean
- If ``true`` then the current identity namespace will be added to ``_id`` and the current property namespace will be added to all properties. The namespaces are added before the first transform. This property is normally only specified on inbound pipes.

-If ``namespaced_identifiers`` is enabled in the service metadata then the source default value is used. The following sources have a default value of ``true``: :ref:`csv <csv_source>`, :ref:`ldap <ldap_source>`, :ref:`sql <sql_source>`, :ref:`embedded <embedded_source>`, :ref:`http_endpoint <http_endpoint_source>`, and :ref:`json <json_source>`.
+If ``namespaced_identifiers`` is enabled in the service metadata then the source default value is used. The following sources have a default value of ``true``: :ref:`csv <csv_source>`, :ref:`kafka <kafka_source>`, :ref:`ldap <ldap_source>`, :ref:`sql <sql_source>`, :ref:`embedded <embedded_source>`, :ref:`http_endpoint <http_endpoint_source>`, and :ref:`json <json_source>`.
- Source default
-

* - ``remove_namespaces``
- Boolean
- If ``true`` then namespaces will be removed from ``_id``, properties and namespaced identifier values. The namespaces are removed after the last transform. This property is normally only specified on outbound pipes.

-If ``namespaced_identifiers`` is enabled in the service metadata then the sink default value is used. The following sinks have a default value of ``true``: :ref:`csv_endpoint <csv_endpoint_sink>`, :ref:`elasticsearch <elasticsearch_sink>`, :ref:`mail <mail_sink>`, :ref:`rest <rest_sink>`, :ref:`sms <sms_sink>`, :ref:`solr <solr_sink>`, :ref:`sql <sql_sink>`, :ref:`http_endpoint <http_endpoint_sink>`, and :ref:`json <json_sink>`.
+If ``namespaced_identifiers`` is enabled in the service metadata then the sink default value is used. The following sinks have a default value of ``true``: :ref:`csv_endpoint <csv_endpoint_sink>`, :ref:`elasticsearch <elasticsearch_sink>`, :ref:`mail <mail_sink>`, :ref:`rest <rest_sink>`, :ref:`sms <sms_sink>`, :ref:`solr <solr_sink>`, :ref:`kafka <kafka_sink>`, :ref:`sql <sql_sink>`, :ref:`http_endpoint <http_endpoint_sink>`, and :ref:`json <json_sink>`.
- Sink default
-
-