The record() method creates the XML as a string rather than using XML::LibXML objects, and it passes the $field->data directly into controlfield and datafield strings.
As a result, characters that are invalid in XML 1.0, such as STX (U+0002) or US (U+001F), can be passed from the MARC::Field object into the XML string.
--
I'm not 100% sure of the best solution.
We could detect invalid characters with a regex like /[^\x{0009}\x{000A}\x{000D}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/ and raise an exception/die, but that would be a behaviour change.
We could silently or noisily erase the invalid characters, but that would be a bit like hiding the problem.
I suppose a configurable option that lets the caller choose between the two could be good. What do you think?
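To make the options concrete, here is a rough sketch of what a configurable check might look like. The helper name clean_xml_text and its mode values ('die', 'warn', 'strip') are made up for illustration and are not part of any existing module. Where it would be called from (presumably on $field->data before it is put into the controlfield/datafield strings) is also an assumption.

```perl
use strict;
use warnings;

# Matches any character outside the XML 1.0 "Char" production
# (the same character class as the regex quoted above).
my $INVALID_XML_CHARS =
    qr/[^\x{0009}\x{000A}\x{000D}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/;

# Hypothetical helper. $mode is an assumed option:
#   'die'   - throw an exception if an invalid character is found (default here)
#   'warn'  - remove invalid characters and emit a warning
#   'strip' - remove invalid characters silently
sub clean_xml_text {
    my ( $text, $mode ) = @_;
    $mode //= 'die';

    # Nothing to do if every character is valid in XML 1.0.
    return $text unless $text =~ $INVALID_XML_CHARS;

    die "Invalid XML character found in field data\n" if $mode eq 'die';

    ( my $clean = $text ) =~ s/$INVALID_XML_CHARS//g;
    warn "Removed invalid XML character(s) from field data\n" if $mode eq 'warn';
    return $clean;
}

# Example: STX (\x{02}) and US (\x{1F}) are removed in 'strip' mode.
my $data = "Title\x{02} with\x{1F} controls";
print clean_xml_text( $data, 'strip' ), "\n";    # prints "Title with controls"
```

Defaulting to 'die' in this sketch is only one possibility. Defaulting to 'strip' or 'warn' would avoid breaking existing callers, at the cost of hiding the problem as described above.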