-
-
Notifications
You must be signed in to change notification settings - Fork 148
Rich text calculated from tokenization data model instead of prerendering + bug fixes #870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
killergerbah
commented
Feb 1, 2026
- Subtitle model now has a structured tokenization field which can be used to calculate the rich text.
- This is used to pre-populate some tokens when "detect and display Netflix ruby" is enabled.
- The pre-populated tokens are respected and we just tokenize around them when we see them.
- Fixed various bugs encountered while working on this:
- Fix color updates being applied for wrong subtitle file
- Fix colors not loading if loading same subtitle file twice in a row
- Fix colors disappearing when changing subtitle offset
ShanaryS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the Netflix ruby issue, how is it handled when settings change and the SubtitleModel needs to clear it's data for reparsing with Yomitan. It's not clear to me where the existing data from the subtitle is treated different from the parsing we add later for annotation and if it survives resetting without affecting the Netflix ruby.
| export interface Token { | ||
| pos: number[]; // Relative position inside parent text | ||
| states?: TokenState[]; | ||
| status?: TokenStatus; | ||
| readings?: TokenReading[]; | ||
| } | ||
|
|
||
| export interface Tokenization { | ||
| readonly tokens?: Token[]; | ||
| readonly error?: boolean; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason so many fields are optional? I feel like it reduces the type safety of it. Would Tokenization ever be created with tokens but not error or vice versa? Same with the fields for Token, it seems to be they would always be created together, with the exception of status and readings. It seems like readings could just be an empty array if only colors are enabled but status probably does need to be optional.
| } | ||
|
|
||
| export interface TokenizedSubtitleModel extends RichSubtitleModel { | ||
| tokenization?: Tokenization; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, if we are using TokenizedSubtitleModel, it seems strange that tokenization would be optional.