user-facing chunking of lemmata #103

@wujastyk

I have a case like this:

[Image not reproduced]

This is a chapter colophon with three witnesses. Saktumiva has found strings to collate, but the resulting apparatus is messy and misleading. The witnesses actually contain three different colophon statements, whereas Saktumiva sees sub-strings that it tries to collate meaningfully. The outcome is that the witness texts are chopped up in ways that obscure the reality of the transmission, which is simpler than the apparatus suggests.

There are other places where I have wished that I had a way of telling Saktumiva's collation algorithm not to subdivide a string for the purposes of comparison.

Would it be possible to introduce a way of manually marking the preferred chunking of strings? The tag that comes to mind immediately is <seg> (maybe with an attribute).

Then, in the above example, my witnesses would read:

K: <seg>  kāya<abbr>ci</abbr><ex>kitsāyāṃ</ex> || <num style="letter-numeral">16</num>
            || 0 ||</seg>

H:  <seg>uttaratantre<lb n="4"/> ekapañcāśattamaḥ
                        kāyacikitsāyāṃ pratiśyāyacikitsā ṣo<cb/>ḍaśamo dhyāyaḥ ||</seg>

A: <seg> iti suśrutasaṃhitāyām uttaratantrāntargate
            śālākyatantre pratiśyāyapratiṣedho nāma caturviṃśatitamo 'dhyāyaḥ
            || 24 ||</seg>
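
To make the request concrete, here is a minimal sketch in Python of how a tokenizer might treat each `<seg>` as a single, indivisible unit before collation. This is not Saktumiva's actual code: the function name `tokenize`, the `SENTINEL` character, and the dummy `<w>` wrapper are all hypothetical, and the sketch assumes that `<seg>` appears only at the top level of the witness fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical private-use character that protects spaces inside a <seg>.
SENTINEL = "\ue000"

def tokenize(witness_xml: str) -> list[str]:
    """Split a witness fragment into tokens, keeping each <seg> whole."""
    # Wrap the fragment in a dummy root so mixed content parses as one tree.
    root = ET.fromstring(f"<w>{witness_xml}</w>")
    parts = [root.text or ""]
    for child in root:
        # Flatten the element to plain text, ignoring nested markup.
        inner = "".join(child.itertext())
        if child.tag == "seg":
            # Normalise internal whitespace and hide it from split().
            inner = " " + SENTINEL.join(inner.split()) + " "
        parts.append(inner)
        parts.append(child.tail or "")
    flat = "".join(parts)
    # Split on whitespace, then restore the protected spaces.
    return [tok.replace(SENTINEL, " ") for tok in flat.split()]

k = ('<seg>  kāya<abbr>ci</abbr><ex>kitsāyāṃ</ex> || '
     '<num style="letter-numeral">16</num>\n || 0 ||</seg>')
print(tokenize(k))
```

With something like this, each colophon above would reach the aligner as one token per witness, so the apparatus would report three competing readings rather than a patchwork of sub-strings.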
