Bug in SRLCompositionalGrounder #912

@kwalcock

EidosEnglishProcessor extends FastNLPProcessorWithSemanticRoles and overrides the tokenizer with
override lazy val tokenizer = new EidosTokenizer(localTokenizer, cutoff)

The SRLCompositionalGrounder makes its own processor with

lazy val proc = {
  Utils.initializeDyNet()
  new CluProcessor()
}

This processor uses the standard tokenizer. The two tokenizers split text differently, so the same text can end up with different sentences and sentence lengths. This leads to at least one problem in
def outgoingOfType(tok: Int, constraints: Seq[String]): Array[Int] = {

when it (the compositional grounder) does a
dependencies.outgoingEdges(tok)

tok comes from the Mention, which was tokenized one way, while the sentence has been reparsed and tokenized a different way. When the reparsed sentence ends up shorter, tok is out of bounds and processing of the document crashes.
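
The mismatch can be reproduced in isolation. This is only a sketch: tokenizeMention and tokenizeReparse are hypothetical stand-ins (not the real EidosTokenizer or the CluProcessor tokenizer), and it assumes outgoingEdges behaves like an array indexed by token position.

```scala
object TokenMismatchDemo {
  // Hypothetical stand-in for the tokenizer that produced the Mention:
  // splits on whitespace and on hyphens.
  def tokenizeMention(text: String): Array[String] = text.split("[\\s-]+")

  // Hypothetical stand-in for the tokenizer used when the sentence is reparsed:
  // splits on whitespace only.
  def tokenizeReparse(text: String): Array[String] = text.split("\\s+")

  // One (empty) edge list per token, mimicking an outgoing-edges array
  // that is indexed by token position.
  def outgoingEdgesFor(tokens: Array[String]): Array[Array[(Int, String)]] =
    Array.fill(tokens.length)(Array.empty[(Int, String)])

  // Bounds-checked lookup: None when tok does not exist in this tokenization.
  def safeOutgoing(edges: Array[Array[(Int, String)]], tok: Int): Option[Array[(Int, String)]] =
    if (tok >= 0 && tok < edges.length) Some(edges(tok)) else None

  def main(args: Array[String]): Unit = {
    val text = "food-insecurity increases conflict"
    val mentionTokens = tokenizeMention(text)   // food, insecurity, increases, conflict
    val reparsedTokens = tokenizeReparse(text)  // food-insecurity, increases, conflict
    val tok = mentionTokens.length - 1          // index of the last Mention token
    val edges = outgoingEdgesFor(reparsedTokens)
    // edges(tok) would throw ArrayIndexOutOfBoundsException here.
    println(safeOutgoing(edges, tok).isDefined)
  }
}
```

A bounds check like safeOutgoing only avoids the crash. The indices would still refer to different tokens, so the two tokenizations need to agree for the grounding to be correct.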

The SRLCompositionalGrounder should probably be given the Eidos processor when the grounder is constructed, so that both use the same tokenization. Alternatively, the Eidos tokenizer might be used with the CluProcessor, perhaps through an anonymous class defined at lazy val proc.
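
A rough sketch of the first option, passing the processor in instead of constructing one inside the grounder. All names here (Tokenizing, EidosLikeProcessor, GrounderSketch) are hypothetical placeholders for illustration and are not the actual Eidos or processors classes.

```scala
object SharedProcessorSketch {
  // Hypothetical minimal interface; the real processor API is much larger.
  trait Tokenizing {
    def tokenize(text: String): Array[String]
  }

  // Stand-in for the Eidos processor with its overridden tokenizer.
  class EidosLikeProcessor extends Tokenizing {
    def tokenize(text: String): Array[String] = text.split("[\\s-]+")
  }

  // Stand-in for the grounder: it receives the processor at construction
  // time rather than creating its own, so token indices from a Mention
  // remain valid for any sentence the grounder reparses.
  class GrounderSketch(proc: Tokenizing) {
    def isValidIndex(text: String, tok: Int): Boolean = {
      val tokens = proc.tokenize(text)
      tok >= 0 && tok < tokens.length
    }
  }
}
```

With this arrangement the reader and the grounder share one tokenizer, so an index computed from the Mention cannot exceed the length of the reparsed sentence.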
