Errors with geological time data #50

@seongjinpark-88

@MihaiSurdeanu

Hi,

I am currently working with geological data that contains temporal expressions such as "cold period", "2.65 million years ago", or "since Paleogene" (the Paleogene spans roughly 66 to 23.03 million years ago). When I try to normalize these expressions, I get the following errors:

scala> import org.clulab.timenorm.scate._
import org.clulab.timenorm.scate._
scala> val parser = new TemporalNeuralParser
parser: org.clulab.timenorm.scate.TemporalNeuralParser = org.clulab.timenorm.scate.TemporalNeuralParser@35d1457d

scala> val anchor = SimpleInterval.of(2019, 11, 20)
anchor: org.clulab.timenorm.scate.SimpleInterval = SimpleInterval(2019-11-20T00:00,2019-11-21T00:00)

scala> val text = "We searched for correlations between timing in diversification and timing of (1) a period of marked volcanism across the Trans-Mexican Volcanic Belt in central Mexico 37.5 million years ago (Ma)"
text: String = We searched for correlations between timing in diversification and timing of (1) a period of marked volcanism across the Trans-Mexican Volcanic Belt in central Mexico 37.5 million years ago (Ma)

scala> for (timex <- parser.parse(text, anchor)) timex match{
     | case interval: Interval =>
     | val Some((charStart, charEnd)) = interval.charSpan
     | println(s"${interval.start} ${interval.end} ${text.substring(charStart, charEnd)}")
     | }
scala.MatchError: periods (of class java.lang.String)
  at org.clulab.timenorm.scate.AnaforaReader.period(Readers.scala:94)
  at org.clulab.timenorm.scate.AnaforaReader.temporal(Readers.scala:306)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1$$anonfun$apply$8.apply(TemporalNeuralParser.scala:154)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1$$anonfun$apply$8.apply(TemporalNeuralParser.scala:154)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.Iterator$class.foreach(Iterator.scala:750)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
  at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
  at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
  at scala.collection.AbstractTraversable.map(Traversable.scala:104)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1.apply(TemporalNeuralParser.scala:154)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1.apply(TemporalNeuralParser.scala:151)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
  at org.clulab.timenorm.scate.TemporalNeuralParser.parseBatch(TemporalNeuralParser.scala:151)
  at org.clulab.timenorm.scate.TemporalNeuralParser.parse(TemporalNeuralParser.scala:137)
  ... 49 elided

Or, with a shortened version of the same sentence:

scala> import org.clulab.timenorm.scate._
import org.clulab.timenorm.scate._

scala> val parser = new TemporalNeuralParser
parser: org.clulab.timenorm.scate.TemporalNeuralParser = org.clulab.timenorm.scate.TemporalNeuralParser@6ded3740

scala> val anchor = SimpleInterval.of(2019, 11, 20)
anchor: org.clulab.timenorm.scate.SimpleInterval = SimpleInterval(2019-11-20T00:00,2019-11-21T00:00)

scala> val text = "marked volcanism across the Trans-Mexican Volcanic Belt in central Mexico 37.5 million years ago"
text: String = marked volcanism across the Trans-Mexican Volcanic Belt in central Mexico 37.5 million years ago

scala> for (timex <- parser.parse(text, anchor)) timex match {
     | case interval: Interval =>
     | val Some((charStart, charEnd)) = interval.charSpan
     | println(s"${interval.start} ${interval.end} ${text.substring(charStart, charEnd)}")
     | }
2019-11-21 13:20:46.711084: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
scala.MatchError: FractionalNumber(37,1,2,Some((74,78))) (of class org.clulab.timenorm.scate.FractionalNumber)
  at $anonfun$1.apply(<console>:14)
  at $anonfun$1.apply(<console>:14)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  ... 49 elided
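
A note on this second trace: it originates at `<console>:14`, that is, in the `match` of the snippet itself rather than inside the parser, because the match only has a case for `Interval`. Below is a minimal sketch of a match with a fallback case. The types are stand-ins invented for this sketch (they are not the timenorm classes, whose real fields may differ), so the example runs on its own:

```scala
// Stand-in types for illustration only; these are NOT the timenorm classes.
sealed trait TimeExpression
final case class Interval(start: String, end: String, charSpan: Option[(Int, Int)])
    extends TimeExpression
final case class FractionalNumber(number: Int, numerator: Int, denominator: Int,
                                  charSpan: Option[(Int, Int)]) extends TimeExpression

object MatchSketch {
  // Formats intervals that carry a character span; anything else
  // (including an interval without a span) hits the fallback case
  // instead of raising scala.MatchError.
  def describe(text: String, timex: TimeExpression): String = timex match {
    case Interval(start, end, Some((charStart, charEnd))) =>
      s"$start $end ${text.substring(charStart, charEnd)}"
    case other =>
      s"skipped: $other"
  }

  def main(args: Array[String]): Unit = {
    val text = "marked volcanism across the Trans-Mexican Volcanic Belt in central Mexico 37.5 million years ago"
    // Mirrors the value reported in the MatchError above.
    val results: Seq[TimeExpression] = Seq(FractionalNumber(37, 1, 2, Some((74, 78))))
    results.map(describe(text, _)).foreach(println)
  }
}
```

With a fallback case the loop keeps going past non-interval results. It does not address the errors raised inside `AnaforaReader`, which come from the parser itself.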

For a different sentence, I get the following error:

scala> val text = "major volcanic activity at DSDP Site 216 on Ninetyeast Ridge (Kerguelen hotspot) where volcanic sediments above basement basalt are dated ~69.5 Ma (nannofossil zone UC20a, planktic foraminiferal zones CF5CF4) and continued for about 2 million years."
text: String = major volcanic activity at DSDP Site 216 on Ninetyeast Ridge (Kerguelen hotspot) where volcanic sediments above basement basalt are dated ~69.5 Ma (nannofossil zone UC20a, planktic foraminiferal zones CF5CF4) and continued for about 2 million years.

scala> for (timex <- parser.parse(text, anchor)) timex match {
     | case interval: Interval =>
     | val Some((charStart, charEnd)) = interval.charSpan
     | println(s"${interval.start} ${interval.end} ${text.substring(charStart, charEnd)}")
     | }
org.clulab.timenorm.scate.AnaforaReader$Exception: cannot parse RepeatingInterval from "Some(million)" and Vector(<entity>
          <id>2@id</id>
          <span>235,242</span>
          <type>Frequency</type>
          <properties>
            <Number>1@id</Number><Type>million</Type>
          </properties>
        </entity>, <entity>
          <id>1@id</id>
          <span>235,242</span>
          <type>Number</type>
          <properties>
            <Value>1000000</Value>
          </properties>
        </entity>)
  at org.clulab.timenorm.scate.AnaforaReader.repeatingInterval(Readers.scala:280)
  at org.clulab.timenorm.scate.AnaforaReader.temporal(Readers.scala:316)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1$$anonfun$apply$8.apply(TemporalNeuralParser.scala:154)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1$$anonfun$apply$8.apply(TemporalNeuralParser.scala:154)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.Iterator$class.foreach(Iterator.scala:750)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
  at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
  at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
  at scala.collection.AbstractTraversable.map(Traversable.scala:104)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1.apply(TemporalNeuralParser.scala:154)
  at org.clulab.timenorm.scate.TemporalNeuralParser$$anonfun$parseBatch$1.apply(TemporalNeuralParser.scala:151)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
  at org.clulab.timenorm.scate.TemporalNeuralParser.parseBatch(TemporalNeuralParser.scala:151)
  at org.clulab.timenorm.scate.TemporalNeuralParser.parse(TemporalNeuralParser.scala:137)
  ... 49 elided

When I fed in a sentence I wrote myself, it printed a result followed by an error. Even though it produced a result, I don't think that result is what it should be.

scala> val text = "I haven't done it since 4.5 million years ago"
text: String = I haven't done it since 4.5 million years ago

scala> for (timex <- parser.parse(text, anchor)) timex match{
     | case interval: Interval =>
     | val Some((charStart, charEnd)) = interval.charSpan
     | println(s"${interval.start} ${interval.end} ${text.substring(charStart, charEnd)}")
     | }
2019-11-21 13:06:46.315482: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
-997981-05-22T00:00 2019-11-21T00:00 since 4.5 million years ago
scala.MatchError: FractionalNumber(4,1,2,Some((24,27))) (of class org.clulab.timenorm.scate.FractionalNumber)
  at $anonfun$1.apply(<console>:14)
  at $anonfun$1.apply(<console>:14)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  ... 49 elided
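
As a rough check on the printed start date, the year arithmetic can be sketched with `java.time`. These are plain JDK calls, not timenorm code, and `YearCheck` and `yearBefore` are names made up for this sketch:

```scala
import java.time.LocalDate

object YearCheck {
  // Year reached by going back `years` calendar years from `anchor`.
  def yearBefore(anchor: LocalDate, years: Long): Int =
    anchor.minusYears(years).getYear

  def main(args: Array[String]): Unit = {
    // End of the anchor interval used in the transcripts above.
    val anchorEnd = LocalDate.of(2019, 11, 21)
    println(yearBefore(anchorEnd, 4500000L)) // 4.5 million years back
    println(yearBefore(anchorEnd, 1000000L)) // 1 million years back
  }
}
```

Going back 4.5 million years from 2019 gives year -4497981, whereas the year in the printed start, -997981, is what going back 1 million years gives. This only compares the year component; it does not explain the month and day in the output.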

Do you have any thoughts on this type of issue?

Thanks,
