Poor performance of nested OPTIONALs #72

@jindrichmynarz

A SPARQL query with nested OPTIONAL clauses, such as the following, performs poorly and typically times out.

PREFIX bibo:    <http://purl.org/ontology/bibo/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  {
    SELECT ?article
    WHERE {
      ?article a bibo:Article .
    }
    LIMIT 10
  }

  OPTIONAL {
    OPTIONAL {
      ?article dcterms:issued ?article_issued .
    }
  }
}

Output of Halyard Profile for this query:

Optimized query:
    Projection [2,955,991,897,878,706.5]
        ProjectionElemList
            ProjectionElem "article"
            ProjectionElem "article_issued"
        LeftJoin [2,955,991,897,878,706.5]
            Slice ( limit=10 ) [3,614,563.841]
                Projection [3,614,563.841]
                    ProjectionElemList
                        ProjectionElem "article"
                    StatementPattern [3,614,563.841]
                        Var (name=article)
                        Var (name=_const_f5e5585a_uri, value=http://www.w3.org/1999/02/22-rdf-syntax-ns#type, anonymous)
                        Var (name=_const_6dd7acd3_uri, value=http://purl.org/ontology/bibo/Article, anonymous)
            LeftJoin [226.251]
                SingletonSet [1]
                StatementPattern [226.251]
                    Var (name=article)
                    Var (name=_const_884f353b_uri, value=http://purl.org/dc/terms/issued, anonymous)
                    Var (name=article_issued)

The nested OPTIONAL in this query is unnecessary, but it reproduces the issue in a minimal way.
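
For comparison, here is a sketch of the flattened form of the same query, with the redundant outer OPTIONAL dropped. It should return the same solutions, since wrapping a single OPTIONAL in another OPTIONAL does not change which bindings are produced. It has not been profiled here, so whether it avoids the timeout is an assumption to verify rather than a measured result.

```sparql
PREFIX bibo:    <http://purl.org/ontology/bibo/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  {
    # Same subquery as above: pick 10 articles.
    SELECT ?article
    WHERE {
      ?article a bibo:Article .
    }
    LIMIT 10
  }

  # Single, non-nested OPTIONAL on the same triple pattern.
  OPTIONAL {
    ?article dcterms:issued ?article_issued .
  }
}
```

Running Halyard Profile on both forms and comparing the estimates for the outer LeftJoin would show whether the inflated estimate is specific to the nested structure.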
