Rich text calculated from tokenization data model instead of prerendering + bug fixes #870

killergerbah · 2026-02-01T01:07:59Z

The subtitle model now has a structured tokenization field from which the rich text is calculated.
The same field is used to pre-populate some tokens when "detect and display Netflix ruby" is enabled.
Pre-populated tokens are respected: when we see them, we just tokenize around them.
Fixed various bugs encountered while working on this:
- Fix color updates being applied for wrong subtitle file
- Fix colors not loading when the same subtitle file is loaded twice in a row
- Fix colors disappearing when changing subtitle offset
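
For readers following along, here is a minimal sketch of the idea described above. It is not the code in this PR. `SketchToken`, `tokenizeAround`, `renderRichText`, and the `tokenize` callback are hypothetical names, `pos` is assumed to be a `[start, end)` offset pair into the subtitle text, and each token is assumed to carry at most one reading.

```typescript
// Hypothetical, simplified token shape; the real Token in common/src/model.ts differs.
type Span = [number, number]; // assumed [start, end) offsets

interface SketchToken {
    pos: Span; // position relative to the subtitle text
    reading?: string; // assumed single reading, e.g. taken from Netflix ruby
}

function escapeHtml(s: string): string {
    return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Keeps pre-populated tokens as they are and only runs the tokenizer on the gaps between them.
// `tokenize` stands in for a real tokenizer and returns spans relative to the segment it is given.
function tokenizeAround(
    text: string,
    prepopulated: SketchToken[],
    tokenize: (segment: string) => Span[]
): SketchToken[] {
    const sorted = [...prepopulated].sort((a, b) => a.pos[0] - b.pos[0]);
    const result: SketchToken[] = [];
    let cursor = 0;

    const fillGap = (end: number) => {
        if (end <= cursor) {
            return;
        }
        for (const [s, e] of tokenize(text.slice(cursor, end))) {
            result.push({ pos: [cursor + s, cursor + e] });
        }
    };

    for (const token of sorted) {
        fillGap(token.pos[0]);
        result.push(token);
        cursor = Math.max(cursor, token.pos[1]);
    }

    fillGap(text.length);
    return result;
}

// Calculates rich text from the tokens at render time instead of storing prerendered markup.
// Assumes tokens are sorted and do not overlap.
function renderRichText(text: string, tokens: SketchToken[]): string {
    let html = '';
    let cursor = 0;

    for (const { pos: [start, end], reading } of tokens) {
        html += escapeHtml(text.slice(cursor, start));
        const surface = escapeHtml(text.slice(start, end));
        html += reading ? `<ruby>${surface}<rt>${escapeHtml(reading)}</rt></ruby>` : surface;
        cursor = end;
    }

    return html + escapeHtml(text.slice(cursor));
}
```

With a pre-populated token covering `日本語` in `日本語を話す` and a stand-in tokenizer that emits one token per character, `tokenizeAround` returns the ruby token followed by three single-character tokens, and `renderRichText` wraps only the first one in `<ruby>`.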


common/app/components/Player.tsx

ShanaryS

With the Netflix ruby issue, how is it handled when settings change and the SubtitleModel needs to clear its data for reparsing with Yomitan? It's not clear to me where the existing data from the subtitle is treated differently from the parsing we add later for annotation, and whether it survives a reset without affecting the Netflix ruby.

ShanaryS · 2026-02-01T15:43:02Z

common/src/model.ts

+export interface Token {
+    pos: number[]; // Relative position inside parent text
+    states?: TokenState[];
+    status?: TokenStatus;
+    readings?: TokenReading[];
+}
+
+export interface Tokenization {
+    readonly tokens?: Token[];
+    readonly error?: boolean;
+}


Is there a reason so many fields are optional? I feel like it reduces the type safety. Would Tokenization ever be created with tokens but not error, or vice versa? The same goes for the fields of Token: it seems to me they would always be created together, with the exception of status and readings. Readings could just be an empty array if only colors are enabled, but status probably does need to be optional.
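
As a rough illustration of what a stricter shape could look like, here is a sketch that assumes a tokenization is either an error or a token list, never both. `StrictToken`, `StrictTokenization`, and `tokenCount` are hypothetical names, and the three placeholder types stand in for the real `TokenState`, `TokenStatus`, and `TokenReading`, whose definitions are not shown in this diff. Whether the assumption holds for every call site is exactly the open question.

```typescript
// Placeholder definitions for illustration only; the real types live in common/src/model.ts.
type TokenState = string;
type TokenStatus = number;
interface TokenReading {
    pos: number[];
    reading: string;
}

interface StrictToken {
    pos: number[]; // Relative position inside parent text
    states: TokenState[]; // always present, possibly empty
    status?: TokenStatus; // stays optional, as suggested in the comment
    readings: TokenReading[]; // empty array when only colors are enabled
}

// Discriminated union: tokens exist exactly when there is no error.
type StrictTokenization =
    | { readonly error: false; readonly tokens: StrictToken[] }
    | { readonly error: true };

// The compiler narrows on `error`, so `tokens` is only reachable in the success case.
function tokenCount(tokenization: StrictTokenization): number {
    if (tokenization.error) {
        return 0;
    }
    return tokenization.tokens.length;
}
```

The trade-off is that every producer has to set `error` explicitly, which may or may not fit how pre-populated Netflix ruby tokens are created.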

ShanaryS · 2026-02-01T15:44:06Z

common/src/model.ts

 }

+export interface TokenizedSubtitleModel extends RichSubtitleModel {
+    tokenization?: Tokenization;


Same here, if we are using TokenizedSubtitleModel, it seems strange that tokenization would be optional.

Rich text calculated from tokenization data model instead of prerende…

00e9398

…ring + bug fixes

killergerbah requested a review from ShanaryS February 1, 2026 01:08
