See https://github.com/axa-group/Parsr. This could potentially let me cut a lot of my own code in opp/docparser.