If I remember correctly, we found some time ago that a lot of time was spent checking `PN_CHARS`.
We could add a `strict` flag to the parser configuration. When set to true, the parser would use the current code, rejecting any invalid character. When set to false (the default, following Postel's law), the parser would accept any non-ASCII character (since only ASCII characters are "significant" for the syntax anyway).
Not sure how much this would improve performance, but it is worth a try. @Tpt WDYT?
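
A minimal sketch of how such a flag could look, to make the proposal concrete. The names `ParserConfig`, `strict`, and `is_pn_chars` are hypothetical and not taken from the existing code; the strict branch follows the `PN_CHARS` production of the Turtle grammar, and the lenient branch only checks ASCII characters. Whether this is actually faster would have to be measured.

```rust
/// Hypothetical parser configuration (illustrative name, not an existing API).
/// `Default` yields `strict: false`, i.e. the lenient behaviour.
#[derive(Clone, Copy, Debug, Default)]
pub struct ParserConfig {
    pub strict: bool,
}

/// PN_CHARS_BASE from the Turtle grammar.
fn is_pn_chars_base(c: char) -> bool {
    matches!(c,
        'A'..='Z' | 'a'..='z'
        | '\u{00C0}'..='\u{00D6}' | '\u{00D8}'..='\u{00F6}'
        | '\u{00F8}'..='\u{02FF}' | '\u{0370}'..='\u{037D}'
        | '\u{037F}'..='\u{1FFF}' | '\u{200C}'..='\u{200D}'
        | '\u{2070}'..='\u{218F}' | '\u{2C00}'..='\u{2FEF}'
        | '\u{3001}'..='\u{D7FF}' | '\u{F900}'..='\u{FDCF}'
        | '\u{FDF0}'..='\u{FFFD}' | '\u{10000}'..='\u{EFFFF}')
}

/// Full PN_CHARS check: PN_CHARS_BASE plus '_', '-', digits and a few
/// extra code points.
fn is_pn_chars_strict(c: char) -> bool {
    is_pn_chars_base(c)
        || matches!(c,
            '_' | '-' | '0'..='9'
            | '\u{00B7}' | '\u{0300}'..='\u{036F}' | '\u{203F}'..='\u{2040}')
}

/// In strict mode, apply the full grammar. Otherwise, validate only ASCII
/// characters and let every non-ASCII character through.
pub fn is_pn_chars(c: char, config: ParserConfig) -> bool {
    if config.strict {
        is_pn_chars_strict(c)
    } else if c.is_ascii() {
        // ASCII subset of PN_CHARS.
        matches!(c, 'A'..='Z' | 'a'..='z' | '0'..='9' | '_' | '-')
    } else {
        true
    }
}
```

With this sketch, `is_pn_chars('\u{00D7}', ParserConfig::default())` is accepted, while the same call with `ParserConfig { strict: true }` rejects it, since U+00D7 falls in a gap of `PN_CHARS_BASE`. ASCII delimiters such as a space or `.` are rejected in both modes.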