The HTML fact extractor does not support unmarked single tags such as
<br>.
The parser in use cannot distinguish start tags from single tags.
(The SAX parser in use does not handle single tags correctly.
A <br> leads to a wrong fragment file, whereas <br/> is handled correctly.)
Possible solutions:
-Find another parser
-Write a parser that finds single tags
-Use a preprocessor that converts single tags to the <br/> style
...
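
The preprocessor option listed above could look roughly like the following. This is only a sketch in Python, not the extractor's actual code: the function name close_single_tags is hypothetical, the set of tags treated as single tags is assumed to be the HTML void elements, and the regular expression assumes that attribute values contain no '<' or '>' characters.

```python
import re

# Assumption: "single tags" are the HTML void elements (tags with no end tag).
VOID_ELEMENTS = (
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
)

# Matches a void-element tag with or without the trailing slash,
# e.g. <br>, <BR >, <br/>, <img src="a.png">.
# Group 1 is the tag name, group 2 the raw attribute text.
# Simplification: attribute values must not contain '<' or '>'.
_VOID_TAG = re.compile(
    r"<(" + "|".join(VOID_ELEMENTS) + r")\b([^<>]*?)\s*/?>",
    re.IGNORECASE,
)


def close_single_tags(html: str) -> str:
    """Rewrite unmarked single tags (<br>) into the <br/> style.

    Tags that are already in the <br/> style are left equivalent,
    so running the function twice gives the same result as once.
    """
    return _VOID_TAG.sub(lambda m: f"<{m.group(1)}{m.group(2)}/>", html)


if __name__ == "__main__":
    print(close_single_tags('line one<br>line two<img src="a/b.png">'))
```

The output of such a step would then be passed to the existing SAX parser unchanged. A regular expression is used here only to keep the sketch short; input with '>' inside attribute values, comments, or script blocks would need a real tokenizer.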
Issue from the Fact Extraction paper of June 22.