Hi all, I wanted to let you know about a project I've started here (scripts) and here (output) to convert the Lewis and Short XML to JSON. There's still a long way to go before the JSON starts being useful, but I wanted to open an issue now to make sure that what I'm doing will be as helpful as possible to you upstream.

In particular, I want to know how I should respond to data errors. Obviously typo fixes should be sent upstream, but what about possible issues with the markup? For example, in some entries the <sense> level attribute drops by more than one step. I want to change this for the JSON, but I'm not sure whether it's a quirk of the actual Lewis and Short text that you would like to preserve. Likewise, I'm not sure what distinguishes type="main" from type="greek" in the <entryFree> tag, as there are many words that seem to me to be straightforward Greek adoptions and yet are marked type="main". Example:
<entryFree id="n51482" type="main" key="xeromyrrha">
<orth extent="full" lang="la">xērŏmyrrha</orth>, <itype>ae</itype>, <gen>f.</gen> (<foreign lang="greek">chro/s-mu/rra</foreign>), <sense id="n51482.0" n="I" level="1">dry myrrh, <bibl><author>Sedul.</author> Hymn. 2, 81</bibl>.</sense>
</entryFree>

I could go on with more questions about the markup, and will likely have yet more as I process the XML further. So I would like to know: in general, would you like me to edit the XML and submit a pull request for all the changes I make, leaving you to sort out which ones to accept and which to discard? Or are there certain types of edits that you would be uninterested in, so that I shouldn't bother putting them into the XML? I want to be as helpful as I can without spamming your pull requests, so let me know what you would like from me.
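To make the two markup questions concrete, here is a minimal sketch of how such entries could be found mechanically. It is only an illustration, not the conversion scripts themselves: it assumes Python's standard xml.etree.ElementTree, the function names level_jumps and is_main_with_greek are hypothetical, and the sample entry is made up for the demonstration. It also assumes that any change in level larger than one step between consecutive <sense> elements, in either direction, is worth flagging; whether such a jump is really an error is exactly the open question.

```python
import xml.etree.ElementTree as ET


def level_jumps(entry_xml):
    """Return (prev_id, cur_id, prev_level, cur_level) for each pair of
    consecutive <sense> elements whose levels differ by more than one."""
    entry = ET.fromstring(entry_xml)
    jumps = []
    prev = None  # (id, level) of the previous <sense>, if any
    for sense in entry.iter("sense"):
        level = int(sense.get("level"))
        if prev is not None and abs(level - prev[1]) > 1:
            jumps.append((prev[0], sense.get("id"), prev[1], level))
        prev = (sense.get("id"), level)
    return jumps


def is_main_with_greek(entry_xml):
    """True if the entry has type="main" but contains <foreign lang="greek">."""
    entry = ET.fromstring(entry_xml)
    has_greek = any(f.get("lang") == "greek" for f in entry.iter("foreign"))
    return entry.get("type") == "main" and has_greek


# Made-up entry for illustration only; not taken from the Lewis and Short data.
SAMPLE = """<entryFree id="x1" type="main" key="example">
<orth lang="la">example</orth> (<foreign lang="greek">placeholder</foreign>),
<sense id="x1.0" n="I" level="1">first</sense>
<sense id="x1.1" n="A" level="2">second</sense>
<sense id="x1.2" n="a" level="4">third</sense>
<sense id="x1.3" n="II" level="1">fourth</sense>
</entryFree>"""

if __name__ == "__main__":
    for prev_id, cur_id, prev_level, cur_level in level_jumps(SAMPLE):
        print(f"{prev_id} -> {cur_id}: level {prev_level} -> {cur_level}")
    print("main entry with Greek etymon:", is_main_with_greek(SAMPLE))
```

Running something like this over every <entryFree> would produce a list of candidate entries that could be reviewed in bulk before any pull request is prepared.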
Gratias vobis, Iohannes