CoraXMLImporter requires local copy of DTD #148

@chiarcos

With the latest stable version, the CoraXMLImporter fails unless a copy of the CoraXML DTD exists at pepper/cora-xml.dtd, i.e. inside the Pepper directory rather than next to the data.

Requested action: Read DTD from data directory or bundle it with Pepper.
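
One way the requested behaviour could look is a SAX `EntityResolver` that first checks the data directory and then falls back to a copy shipped on the classpath. This is only a sketch: `CoraDtdResolver` and the classpath location `/cora-xml.dtd` are hypothetical names, and how such a resolver would be wired into `PepperUtil.readXMLResource` is an assumption, since I have not checked how Pepper sets up its parser.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver, not part of Pepper's API. */
final class CoraDtdResolver implements EntityResolver {
    private static final String DTD_NAME = "cora-xml.dtd";
    private final Path dataDir;

    CoraDtdResolver(Path dataDir) {
        this.dataDir = dataDir;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Only intercept requests for the CoraXML DTD.
        if (systemId == null || !systemId.endsWith(DTD_NAME)) {
            return null; // let the parser use its default resolution
        }
        // 1. Prefer the copy that ships with the corpus data.
        Path local = dataDir.resolve(DTD_NAME);
        if (Files.isRegularFile(local)) {
            return new InputSource(local.toUri().toString());
        }
        // 2. Fall back to a copy bundled on the classpath (assumed location).
        InputStream bundled = CoraDtdResolver.class.getResourceAsStream("/" + DTD_NAME);
        if (bundled != null) {
            InputSource source = new InputSource(bundled);
            source.setSystemId(systemId);
            return source;
        }
        return null; // nothing found: default resolution, which may still fail
    }
}
```

With a plain SAX setup, the resolver would be registered via `XMLReader.setEntityResolver(new CoraDtdResolver(dataDir))` before parsing.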

Error log below, using a sample file from https://www.laudatio-repository.org/download/format/20/36/1.0.
Note that cora-xml.dtd is bundled with the data and available in the data directory.

Configuration (template, with $tmp pointing to the data directory, generated using pepper-wrapper):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://korpling.german.hu-berlin.de/saltnpepper/pepper/schema/10/pepper.rnc" type="application/relax-ng-compact-syntax" ?>
<pepper-job version="1.0">
        <importer name="'$importer'" path="'$tmp'"/>
        <exporter name="'$exporter'" path="'$tgt'"/>
</pepper-job>

Error log:

Cannot map 'salt:/0/rem/M543-N1' with module 'CoraXMLImporter', because of a mapping result was 'FAILED'.
An exception was thrown by the mapper threads 'Thread[CoraXMLImporter_mapper(salt:/rem/M543-N1),5,CoraXMLImporter_mapperGroup]'. 
org.corpus_tools.pepper.modules.exceptions.PepperModuleXMLResourceException: Cannot read xml-file'file:/home/chiarcos/Desktop/github/powla/experimental/salt/samples/rem/M543-N1.xml', because of a nested exception. 
    at org.corpus_tools.pepper.common.PepperUtil.readXMLResource(PepperUtil.java:661)
    at org.corpus_tools.pepper.impl.PepperMapperImpl.readXMLResource(PepperMapperImpl.java:278)
    at org.corpus_tools.peppermodules.coraXMLModules.CoraXML2SaltMapper.mapSDocument(CoraXML2SaltMapper.java:116)
    at org.corpus_tools.pepper.impl.PepperMapperControllerImpl.map(PepperMapperControllerImpl.java:251)
    at org.corpus_tools.pepper.impl.PepperMapperControllerImpl.run(PepperMapperControllerImpl.java:188)
Caused by: java.io.FileNotFoundException: /home/chiarcos/Desktop/github/pepper-wrapper/pepper/cora-xml.dtd (No such file or directory)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:112)
    at java.base/sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:86)
    at java.base/sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:184)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:652)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1398)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1364)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:257)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1152)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1040)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:943)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216)
    at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
    at org.corpus_tools.pepper.common.PepperUtil.readXMLResource(PepperUtil.java:641)
    ... 4 more
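
Note that the trace looks for the DTD under pepper-wrapper/pepper/, not under samples/rem/ where the XML file lives. That pattern is typical when a SAX parser is handed a stream without a system ID, because relative DTD references are then resolved against the JVM working directory. Whether `PepperUtil.readXMLResource` actually does this is an assumption on my part. A minimal sketch of parsing with the system ID set, so that a relative `cora-xml.dtd` is looked up next to the XML file (`SystemIdParseSketch` is a hypothetical name):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper, not Pepper code. */
final class SystemIdParseSketch {
    static void parse(Path xmlFile, DefaultHandler handler) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        try (InputStream in = Files.newInputStream(xmlFile)) {
            InputSource source = new InputSource(in);
            // The system ID is the base URI for relative references such as
            // <!DOCTYPE ... SYSTEM "cora-xml.dtd">, so the DTD is resolved
            // relative to the XML file instead of the working directory.
            source.setSystemId(xmlFile.toUri().toString());
            reader.parse(source);
        }
    }
}
```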
