Two ways of doing reification with CONSTRUCT are not yielding equivalent and correct results #109

@rogerlucena

Description

To illustrate this issue I use the "family" and "grandparents" example below.

Given a ?family graph with the following data:

INSERT DATA INTO ?family {
  /u<joe> "parent_of"@[] /u<mary> .
  /u<joe> "parent_of"@[] /u<peter> .
  /u<peter> "parent_of"@[] /u<john> .
  /u<peter> "parent_of"@[] /u<eve>
};

(Note that "Joe" is the only grandparent here and his grandchildren are "John" and "Eve").

We can use CONSTRUCT to create a new ?grandparents graph that extracts only the "grandparent" relation from ?family, while also adding a new piece of information, "both_live_in", for each grandparent / grandchild fact (using a blank node, i.e. reification). For that, following the documentation here, we can proceed in two ways:

1) Using ";" and partial statements

We can do:

CONSTRUCT {
  ?ancestor "grandparent"@[] ?grandchildren ; "both_live_in"@[] /city<NY>
}
INTO ?grandparents
FROM ?family
WHERE {
  ?ancestor "parent_of"@[] ?c .
  ?c "parent_of"@[] ?grandchildren
};

Now, to have a look at the ?grandparents graph we can query all its triples using:

SELECT ?s, ?p, ?o
FROM ?grandparents
WHERE {
  ?s ?p ?o
};

Which returns:

Result:
?s	?p	?o
/_<110e9ee3-30dc-4915-ba70-5907c3ae22b5>	"both_live_in"@[]	/city<NY>
/_<20a64f0d-1523-4946-93d0-6cbe2ff284a0>	"_predicate"@[]	"grandparent"@[]
/_<20a64f0d-1523-4946-93d0-6cbe2ff284a0>	"both_live_in"@[]	/city<NY>
/_<20a64f0d-1523-4946-93d0-6cbe2ff284a0>	"_object"@[]	/u<eve>
/_<20a64f0d-1523-4946-93d0-6cbe2ff284a0>	"_subject"@[]	/u<joe>
/_<110e9ee3-30dc-4915-ba70-5907c3ae22b5>	"_predicate"@[]	"grandparent"@[]
/_<110e9ee3-30dc-4915-ba70-5907c3ae22b5>	"_subject"@[]	/u<joe>
/_<110e9ee3-30dc-4915-ba70-5907c3ae22b5>	"_object"@[]	/u<john>

2) Using the explicit blank node notation

To generate this ?grandparents graph we can also choose to use the explicit blank node notation below:

CONSTRUCT {
  ?ancestor "grandparent"@[] ?grandchildren .
  _:v "_subject"@[] ?ancestor .
  _:v "_predicate"@[] "grandparent"@[] .
  _:v "_object"@[] ?grandchildren .
  _:v "both_live_in"@[] /city<NY>
}
INTO ?grandparents
FROM ?family
WHERE {
  ?ancestor "parent_of"@[] ?c .
  ?c "parent_of"@[] ?grandchildren
};

In this case, if we query all the triples of ?grandparents with the same SELECT query used in 1) we get:

Result:
?s	?p	?o
/u<joe>	"grandparent"@[]	/u<eve>
/_<v>	"both_live_in"@[]	/city<NY>
/_<v>	"_predicate"@[]	"grandparent"@[]
/u<joe>	"grandparent"@[]	/u<john>
/_<v>	"_object"@[]	/u<eve>
/_<v>	"_object"@[]	/u<john>
/_<v>	"_subject"@[]	/u<joe>

This diverges from the result obtained in 1).

Note that in the result for 2) we have the expected triples /u<joe> "grandparent"@[] /u<john> and /u<joe> "grandparent"@[] /u<eve>, whereas they do not appear in 1). Also, note that in 1) we get two blank nodes, as predicted by the documentation, while in 2) we get only one blank node with two "_object" relations (one for /u<john> and another for /u<eve>).
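
The differences described above can also be checked mechanically. The Python sketch below is not part of BadWolf; it only copies the triples from the two result tables and compares them. As assumptions made for readability, the UUID blank node IDs from 1) are shortened to _:b1 and _:b2, /_<v> is written as _:v, and the "@[]" suffix is dropped from predicates.

```python
# Triples copied from the result table in 1).
# "_:b1" stands for /_<110e9ee3-...> and "_:b2" for /_<20a64f0d-...>.
RESULT_1 = {
    ("_:b1", "both_live_in", "/city<NY>"),
    ("_:b1", "_predicate", "grandparent"),
    ("_:b1", "_subject", "/u<joe>"),
    ("_:b1", "_object", "/u<john>"),
    ("_:b2", "both_live_in", "/city<NY>"),
    ("_:b2", "_predicate", "grandparent"),
    ("_:b2", "_subject", "/u<joe>"),
    ("_:b2", "_object", "/u<eve>"),
}

# Triples copied from the result table in 2).
RESULT_2 = {
    ("/u<joe>", "grandparent", "/u<eve>"),
    ("/u<joe>", "grandparent", "/u<john>"),
    ("_:v", "both_live_in", "/city<NY>"),
    ("_:v", "_predicate", "grandparent"),
    ("_:v", "_subject", "/u<joe>"),
    ("_:v", "_object", "/u<eve>"),
    ("_:v", "_object", "/u<john>"),
}


def blank_nodes(triples):
    """Return the set of blank node subjects in a graph."""
    return {s for s, _, _ in triples if s.startswith("_:")}


def plain_triples(triples):
    """Return the triples whose subject is not a blank node."""
    return {t for t in triples if not t[0].startswith("_:")}


def objects_per_blank_node(triples):
    """Map each blank node to the set of its "_object" values."""
    out = {}
    for s, p, o in triples:
        if s.startswith("_:") and p == "_object":
            out.setdefault(s, set()).add(o)
    return out


if __name__ == "__main__":
    for name, graph in (("1)", RESULT_1), ("2)", RESULT_2)):
        print(name, "blank nodes:", len(blank_nodes(graph)))
        print(name, "plain triples:", sorted(plain_triples(graph)))
        print(name, "_object per blank node:", objects_per_blank_node(graph))
```

Running it makes both symptoms visible at once: 1) has two blank nodes but no plain "grandparent" triples, while 2) has the plain triples but a single blank node carrying two "_object" values.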

Both ways should yield equivalent results, as the documentation states here in the paragraph below:

"The above query is equivalent to the query above it, but it explicitly does the reification by using _:v, which express a unique blank node linking the reification together. The CONSTRUCT clause supports creating an arbitrary number of blank nodes. The syntax is always the same, they all start with the prefix _: followed by a logical ID. On insertion of each new fact, BQL guarantees a new unique blank node will be generated by each of them. Example of multiple blank nodes generated at once are _:v0, _:v1, etc."

You can use the code in the file below to reproduce the aforementioned results.

two_ways_reification.pdf
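
For reference, here is a minimal Python model of one reading of the documentation paragraph quoted above, namely that every _: label gets a fresh blank node for each binding row produced by the WHERE clause. This is an assumption about the intended semantics, not BadWolf's actual implementation; the function name `construct`, the tuple encoding of the template, and the generated IDs (_:b0, _:b1, ...) are all hypothetical.

```python
import itertools


def resolve(token, row, fresh, counter):
    """Turn a template token into a concrete node for one binding row."""
    if token.startswith("?"):
        # Variable: look up its value in the current binding row.
        return row[token]
    if token.startswith("_:"):
        # Blank node label: reuse the ID within this row, mint one otherwise.
        if token not in fresh:
            fresh[token] = f"_:b{next(counter)}"
        return fresh[token]
    # Anything else is a constant node, predicate or literal.
    return token


def construct(template, bindings):
    """Expand a CONSTRUCT-style template once per binding row."""
    counter = itertools.count()
    graph = set()
    for row in bindings:
        fresh = {}  # label -> blank node ID, reset for every row
        for s, p, o in template:
            graph.add(tuple(resolve(t, row, fresh, counter) for t in (s, p, o)))
    return graph


# Template mirroring the CONSTRUCT clause of 2), without the "@[]" suffix.
TEMPLATE = [
    ("?ancestor", "grandparent", "?grandchildren"),
    ("_:v", "_subject", "?ancestor"),
    ("_:v", "_predicate", "grandparent"),
    ("_:v", "_object", "?grandchildren"),
    ("_:v", "both_live_in", "/city<NY>"),
]

# The two rows the WHERE clause binds for the ?family data above.
BINDINGS = [
    {"?ancestor": "/u<joe>", "?grandchildren": "/u<john>"},
    {"?ancestor": "/u<joe>", "?grandchildren": "/u<eve>"},
]

if __name__ == "__main__":
    for triple in sorted(construct(TEMPLATE, BINDINGS)):
        print(*triple, sep="\t")
```

Under this reading the graph would contain both plain "grandparent" triples and two separate blank nodes, each with exactly one "_object", which is what neither 1) nor 2) currently produces on its own.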
