legacy multi-byte encodings in TextDecoder are incorrect #6038

@ChALkeR

Similar to nodejs/node#61041, but the scope of this issue is limited to the multi-byte decoders.

The following additional tests fail. They can be found at https://github.com/ExodusOSS/bytes/blob/main/tests/encoding/mistakes.test.js and in web-platform-tests/wpt#56892:

✖ FAIL Common implementation mistakes > gb18030 version and ranges
✖ FAIL Common implementation mistakes > gbk version and ranges
✖ FAIL Common implementation mistakes > gbk decoder is gb18030 decoder
✖ FAIL Common implementation mistakes > Replacement, push back ASCII characters > big5 > loose
✖ FAIL Common implementation mistakes > Replacement, push back ASCII characters > iso-2022-jp > loose
✖ FAIL Common implementation mistakes > Replacement, push back ASCII characters > euc-jp > loose
✖ FAIL Common implementation mistakes > Replacement, push back ASCII characters > gbk > loose
✖ FAIL Common implementation mistakes > Sticky multibyte state > iso-2022-jp > loose
✖ FAIL Common implementation mistakes > Sticky multibyte state > gbk > loose
✖ FAIL Common implementation mistakes > Sticky multibyte state > gbk > fatal
✖ FAIL Common implementation mistakes > fatal stream > gb18030
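
For illustration, here is a minimal sketch of what the "Replacement, push back ASCII characters" group appears to check, assuming it follows the WHATWG Encoding Standard rule that an ASCII byte after an unfinished lead byte is re-processed rather than swallowed. The names `cases` and `checkDecoders` are hypothetical and are not taken from `mistakes.test.js`; the byte sequences are examples chosen here, not the ones used by the linked tests.

```javascript
// Hypothetical cases: a lead byte followed by an ASCII byte that cannot
// complete the sequence. In non-fatal ("loose") mode the Encoding Standard
// decoders emit one U+FFFD and then decode the ASCII byte on its own.
const cases = [
  // euc-jp: 0xA1 is a lead byte, 0x41 ("A") is not a valid trail byte.
  { encoding: 'euc-jp', bytes: [0xa1, 0x41], expected: '\uFFFDA' },
  // gbk and gb18030: 0x81 is a lead byte, 0x20 (space) is not a valid trail byte.
  { encoding: 'gbk', bytes: [0x81, 0x20], expected: '\uFFFD ' },
  { encoding: 'gb18030', bytes: [0x81, 0x20], expected: '\uFFFD ' },
];

// Runs every case against a decoder constructor (the global TextDecoder by
// default) and reports whether the output matches the expectation.
function checkDecoders(list, Decoder = TextDecoder) {
  return list.map(({ encoding, bytes, expected }) => {
    let actual;
    try {
      actual = new Decoder(encoding).decode(Uint8Array.from(bytes));
    } catch (err) {
      // The constructor throws a RangeError if the label is unsupported.
      return { encoding, pass: false, error: String(err) };
    }
    return { encoding, pass: actual === expected, actual };
  });
}

console.table(checkDecoders(cases));
```

A runtime affected by this issue would be expected to report `pass: false` for at least some of these rows. The second parameter exists only so that an alternative decoder implementation can be checked the same way.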

As a cheap, minor improvement, nodejs/node#61099 can be adopted.

For the remaining failures, it is probably best to wait for a fix in Node.js and port it.
