improving handling of default reference for text offset #29

@kosloot

When a text content has an offset without an explicit reference, the offset is by definition relative to the text content of the nearest structure parent. In general this is fine, but some structure elements are not allowed to carry text,
notably <table> and <row>, and maybe more.
I suggest extending the search for a suitable parent to the first structure parent that is allowed to carry text.

This is a simple addition, which I have already implemented in libfolia.
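
As a rough illustration of the suggested lookup (this is not the libfolia code, and it does not use the FoLiA Python library), the sketch below walks up from the element holding the text and skips structure elements that cannot carry text. The sets `STRUCTURE` and `NO_TEXT`, the helper names, and the embedded fragment are assumptions made for this example; the element lists are deliberately not exhaustive and text classes are ignored.

```python
import xml.etree.ElementTree as ET

NS = "{http://ilk.uvt.nl/folia}"

# Assumption: a small, non-exhaustive subset of FoLiA structure elements.
STRUCTURE = {"text", "div", "p", "s", "table", "row", "cell"}
# Structure elements that are not allowed to carry text (named in this issue).
NO_TEXT = {"table", "row"}

SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia">
  <text>
    <div xml:id="d1">
      <t>rij 1 veld 1</t>
      <table>
        <row>
          <cell xml:id="c1">
            <t offset="0">rij 1 veld 1</t>
          </cell>
        </row>
      </table>
    </div>
  </text>
</FoLiA>"""


def local(tag):
    """Strip the namespace from an ElementTree tag."""
    return tag.split("}")[-1]


def default_reference(root, holder):
    """Return the first structure ancestor of `holder` that may carry text."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = parent_of.get(holder)
    while node is not None:
        name = local(node.tag)
        if name in STRUCTURE and name not in NO_TEXT:
            return node
        node = parent_of.get(node)
    return None


def offset_is_valid(root, holder):
    """Check the holder's <t offset> against the default reference's text."""
    t = holder.find(NS + "t")  # direct child only
    ref = default_reference(root, holder)
    if t is None or ref is None:
        return False
    ref_t = ref.find(NS + "t")
    if ref_t is None:
        return False
    text = t.text or ""
    offset = int(t.get("offset"))
    return (ref_t.text or "")[offset:offset + len(text)] == text


root = ET.fromstring(SAMPLE)
cell = next(root.iter(NS + "cell"))
print(local(default_reference(root, cell).tag))  # div
print(offset_is_valid(root, cell))               # True
```

With the strict "nearest structure parent" rule the lookup would stop at the row, which has no text; skipping `row` and `table` lets the offset resolve against the text of the enclosing div.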
Sample FoLiA to demonstrate the problem:

<?xml version="1.0" encoding="UTF-8"?>
<FoLiA xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://ilk.uvt.nl/folia" xml:id="tabel" generator="libfolia-v2.10" version="2.5.1">
  <metadata type="native">
    <annotations>
      <paragraph-annotation/>
      <division-annotation />
      <string-annotation/>
      <table-annotation/>
      <text-annotation set="https://raw.githubusercontent.com/proycon/folia/master/setdefinitions/text.foliaset.ttl"/>
    </annotations>
  </metadata>
  <text xml:id="tabel.text">
    <div xml:id="tabel.text.div.1">
      <t>rij 1 veld 1</t>
      <table xml:id="tabel.">
        <row xml:id="tabel.row.1">
          <cell xml:id="tabel.row.1.cell.1">
            <t offset="0">rij 1 veld 1</t>
          </cell>
        </row>
      </table>
    </div>
  </text>
</FoLiA>

The most recent folialint from libfolia accepts this document.

But the current foliavalidator states:

EXT VALIDATION ERROR: Text for Cell, ID tabel.row.1.cell.1, textclass current, has incorrect offset 0 or invalid reference: Reference (ID tabel.row.1) has no such text (class=current)
(also checked against older rules prior to FoLiA v2.4.1)
VALIDATION ERROR on full parse by library (stage 2/3), in cell-offset-bug.xml
UnresolvableTextContent: Reference (ID tabel.row.1) has no such text (class=current)


Labels: enhancement (New feature or request)
