In issue #42 I created an HTML parser that parses the whole HTML document into a single tree in one pass, but the paragraph splitter still uses it recursively.
A new HTML-specific "splitter" should be implemented that does the XLIFF extraction in one go, walking the parsed tree once instead of re-parsing fragments recursively.
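
To make the idea concrete, here is a rough sketch of what a single-pass splitter could look like. It is only an illustration: `Node`, `BLOCK_TAGS`, `extract_units` and `to_xliff` are hypothetical names, the node shape is assumed and may not match the tree produced by the parser from #42, and inline markup is simply dropped here, whereas a real implementation would probably map it to XLIFF inline elements.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from xml.sax.saxutils import escape


# Hypothetical node type; the real tree from issue #42 may look different.
@dataclass
class Node:
    tag: Optional[str] = None            # None marks a text node
    text: str = ""
    children: List["Node"] = field(default_factory=list)


# Assumed set of elements that start a new translatable segment.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "td", "blockquote"}


def extract_units(root: Node) -> List[str]:
    """Walk the tree once, without recursion, collecting one segment per block."""
    units: List[str] = []
    buffer: List[str] = []

    def flush() -> None:
        # Collapse whitespace and emit the segment if it is not empty.
        segment = " ".join("".join(buffer).split())
        buffer.clear()
        if segment:
            units.append(segment)

    # Each stack entry is (node, leaving); 'leaving' is True on the way out.
    stack = [(root, False)]
    while stack:
        node, leaving = stack.pop()
        if node.tag is None:
            buffer.append(node.text)
            continue
        if node.tag in BLOCK_TAGS:
            flush()                      # block boundary on both enter and leave
        if not leaving:
            stack.append((node, True))
            for child in reversed(node.children):
                stack.append((child, False))
    flush()
    return units


def to_xliff(units: List[str]) -> str:
    """Render the segments as an XLIFF 1.2 style <body> fragment."""
    lines = ["<body>"]
    for i, segment in enumerate(units, 1):
        lines.append(
            f'  <trans-unit id="{i}"><source>{escape(segment)}</source></trans-unit>'
        )
    lines.append("</body>")
    return "\n".join(lines)


if __name__ == "__main__":
    tree = Node("div", children=[
        Node("p", children=[Node(text="Hello "), Node("b", children=[Node(text="world")])]),
        Node("p", children=[Node(text="Second")]),
    ])
    print(to_xliff(extract_units(tree)))
```

The point of the explicit stack is that every node is visited exactly once, so the splitter never has to hand a sub-fragment back to the parser.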