Skip to content

Comments

Avoid cloneStr for small strings and intern tokenizer identifiers#11267

Merged
rchiodo merged 2 commits intomicrosoft:mainfrom
Zaczero:zaczero/tkn
Feb 4, 2026
Merged

Avoid cloneStr for small strings and intern tokenizer identifiers#11267
rchiodo merged 2 commits intomicrosoft:mainfrom
Zaczero:zaczero/tkn

Conversation

@Zaczero
Copy link
Contributor

@Zaczero Zaczero commented Feb 2, 2026

By avoiding cloneStr overhead on small strings we recover about 6% of the performance at no peak rss change. Strings shorter than 13 are never SlicedString and they don't hold a reference to the parent memory:

if (!v8_flags.string_slices || length < SlicedString::kMinLength) return NewCopiedSubstring(...)
measurement          mean ± σ            min … max           outliers         delta
wall_time          14.9s  ±  400ms                              14.2s  … 15.6s                                     0 ( 0%)        ⚡-  6.0% ±  1.4%
peak_rss           2.66GB ± 7.72MB                              2.64GB … 2.67GB                                    1 ( 6%)          +  0.2% ±  0.2%
cpu_cycles          121G  ± 2.75G                                116G  …  125G                                     0 ( 0%)        ⚡-  4.1% ±  1.2%
instructions        198G  ±  193M                                198G  …  198G                                     2 (12%)        ⚡-  1.7% ±  0.1%
cache_references   8.67G  ± 41.2M                               8.57G  … 8.73G                                     0 ( 0%)        ⚡-  4.5% ±  0.4%
cache_misses       2.04G  ± 18.9M                               2.00G  … 2.07G                                     0 ( 0%)          -  1.3% ±  0.6%
branch_misses       734M  ± 3.59M                                728M  …  740M                                     0 ( 0%)          -  0.9% ±  0.4%

By further interning identifier strings we get in total 10% performance gain and reduce peak memory usage by about 1%. Tested on the openstreetmap-ng codebase.

measurement          mean ± σ            min … max           outliers         delta
wall_time          14.2s  ±  395ms                              13.6s  … 14.7s                                     0 ( 0%)        ⚡- 10.7% ±  1.4%
peak_rss           2.63GB ± 2.89MB                              2.62GB … 2.63GB                                    0 ( 0%)          -  0.9% ±  0.1%
cpu_cycles          106G  ± 2.55G                                102G  …  111G                                     0 ( 0%)        ⚡- 15.8% ±  1.1%
instructions        177G  ±  305M                                177G  …  178G                                     1 ( 6%)        ⚡- 11.9% ±  0.1%
cache_references   8.26G  ± 46.0M                               8.15G  … 8.34G                                     1 ( 6%)        ⚡-  9.0% ±  0.4%
cache_misses       1.96G  ± 19.8M                               1.93G  … 2.00G                                     2 (12%)        ⚡-  5.1% ±  0.6%
branch_misses       687M  ± 4.15M                                680M  …  694M                                     0 ( 0%)        ⚡-  7.3% ±  0.4%

This improves the runtime performance while preserving the memory reduction of cloneStr.

Personal note: I find it surprising that #9993 did not mention (or measure) performance regressions at all. Especially since in memory-constrained environments you can opt out of the SlicedString by compiling with --no-string-slices.

Copy link
Collaborator

@rchiodo rchiodo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me. @heejaechang any thoughts?

@github-actions

This comment has been minimized.

@rchiodo rchiodo enabled auto-merge (squash) February 3, 2026 22:56
@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2026

Diff from mypy_primer, showing the effect of this PR on open source code:

sympy (https://github.com/sympy/sympy)
+   .../projects/sympy/sympy/solvers/diophantine/diophantine.py:173:44 - error: Cannot access attribute "expand" for class "Basic"
+     Attribute "expand" is unknown (reportAttributeAccessIssue)
-   .../projects/sympy/sympy/solvers/diophantine/diophantine.py:528:23 - error: Operator "+" not supported for types "Generator[Unknown | int, Unknown, None] | list[Unknown | int]" and "list[Unknown | int]"
+   .../projects/sympy/sympy/solvers/diophantine/diophantine.py:528:23 - error: Operator "+" not supported for types "Generator[Unknown | Literal[1], Unknown, None] | list[Unknown | Literal[1]]" and "list[Unknown | int]"
-     Operator "+" not supported for types "Generator[Unknown | int, Unknown, None]" and "list[Unknown | int]" (reportOperatorIssue)
+     Operator "+" not supported for types "Generator[Unknown | Literal[1], Unknown, None]" and "list[Unknown | int]" (reportOperatorIssue)
+   .../projects/sympy/sympy/solvers/diophantine/tests/test_diophantine.py:313:30 - error: Cannot access attribute "as_independent" for class "Basic"
+     Attribute "as_independent" is unknown (reportAttributeAccessIssue)
+   .../projects/sympy/sympy/solvers/diophantine/tests/test_diophantine.py:382:30 - error: Cannot access attribute "as_independent" for class "Basic"
+     Attribute "as_independent" is unknown (reportAttributeAccessIssue)
+   .../projects/sympy/sympy/solvers/ode/hypergeometric.py:246:67 - error: Operator "**" not supported for types "Basic" and "Literal[2]" (reportOperatorIssue)
-   .../projects/sympy/sympy/solvers/ode/lie_group.py:618:61 - error: Operator "-" not supported for type "Unknown | Basic" (reportOperatorIssue)
-   .../projects/sympy/sympy/solvers/ode/nonhomogeneous.py:467:28 - error: Argument of type "Unknown | None" cannot be assigned to parameter "expr" of type "Expr" in function "make_args"
+   .../projects/sympy/sympy/solvers/ode/nonhomogeneous.py:467:28 - error: Argument of type "Expr | Unknown | None" cannot be assigned to parameter "expr" of type "Expr" in function "make_args"
-     Type "Unknown | None" is not assignable to type "Expr"
+     Type "Expr | Unknown | None" is not assignable to type "Expr"
+   .../projects/sympy/sympy/solvers/ode/ode.py:1515:38 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1516:38 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1526:9 - error: No overloads for "update" match the provided arguments (reportCallIssue)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1526:12 - error: "update" is not a known attribute of "None" (reportOptionalMemberAccess)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1526:19 - error: Argument of type "Unknown | dict[Unknown, Unknown] | None" cannot be assigned to parameter "m" of type "Iterable[tuple[str, Unknown]]" in function "update"
+     Type "Unknown | dict[Unknown, Unknown] | None" is not assignable to type "Iterable[tuple[str, Unknown]]"
+       "None" is incompatible with protocol "Iterable[tuple[str, Unknown]]"
+         "__iter__" is not present (reportArgumentType)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1533:43 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1533:64 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1539:9 - error: No overloads for "update" match the provided arguments (reportCallIssue)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1539:12 - error: "update" is not a known attribute of "None" (reportOptionalMemberAccess)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1539:19 - error: Argument of type "Unknown | dict[Unknown, Unknown] | None" cannot be assigned to parameter "m" of type "Iterable[tuple[str, Unknown]]" in function "update"
+     Type "Unknown | dict[Unknown, Unknown] | None" is not assignable to type "Iterable[tuple[str, Unknown]]"
+       "None" is incompatible with protocol "Iterable[tuple[str, Unknown]]"
+         "__iter__" is not present (reportArgumentType)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1546:43 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1546:74 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1552:9 - error: No overloads for "update" match the provided arguments (reportCallIssue)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1552:12 - error: "update" is not a known attribute of "None" (reportOptionalMemberAccess)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1552:19 - error: Argument of type "Unknown | dict[Unknown, Unknown] | None" cannot be assigned to parameter "m" of type "Iterable[tuple[str, Unknown]]" in function "update"
+     Type "Unknown | dict[Unknown, Unknown] | None" is not assignable to type "Iterable[tuple[str, Unknown]]"
+       "None" is incompatible with protocol "Iterable[tuple[str, Unknown]]"
+         "__iter__" is not present (reportArgumentType)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1559:49 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:1559:70 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
-   .../projects/sympy/sympy/solvers/ode/ode.py:1689:36 - error: Cannot access attribute "lhs" for class "Expr"
-     Attribute "lhs" is unknown (reportAttributeAccessIssue)
-   .../projects/sympy/sympy/solvers/ode/ode.py:1689:55 - error: Cannot access attribute "rhs" for class "Expr"
-     Attribute "rhs" is unknown (reportAttributeAccessIssue)
-   .../projects/sympy/sympy/solvers/ode/ode.py:1690:36 - error: Cannot access attribute "lhs" for class "Expr"
-     Attribute "lhs" is unknown (reportAttributeAccessIssue)
-   .../projects/sympy/sympy/solvers/ode/ode.py:1691:17 - error: No overloads for "__setitem__" match the provided arguments (reportCallIssue)
-   .../projects/sympy/sympy/solvers/ode/ode.py:1691:17 - error: Argument of type "Equality | BooleanFalse | BooleanTrue | Unknown | Expr" cannot be assigned to parameter "value" of type "Equality | BooleanFalse | BooleanTrue" in function "__setitem__"
-     Type "Equality | BooleanFalse | BooleanTrue | Unknown | Expr" is not assignable to type "Equality | BooleanFalse | BooleanTrue"
-       Type "Expr" is not assignable to type "Equality | BooleanFalse | BooleanTrue"
-         "Expr" is not assignable to "Equality"
-         "Expr" is not assignable to "BooleanFalse"
-         "Expr" is not assignable to "BooleanTrue" (reportArgumentType)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3378:7 - error: "update" is not a known attribute of "None" (reportOptionalMemberAccess)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3378:38 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3379:7 - error: "update" is not a known attribute of "None" (reportOptionalMemberAccess)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3379:38 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3380:14 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3381:14 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3382:14 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3394:50 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3395:50 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3396:50 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3436:5 - error: No overloads for "update" match the provided arguments (reportCallIssue)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3436:7 - error: "update" is not a known attribute of "None" (reportOptionalMemberAccess)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3436:14 - error: Argument of type "Unknown | dict[Unknown, Unknown] | None" cannot be assigned to parameter "m" of type "Iterable[tuple[str, Unknown]]" in function "update"
+     Type "Unknown | dict[Unknown, Unknown] | None" is not assignable to type "Iterable[tuple[str, Unknown]]"
+       "None" is incompatible with protocol "Iterable[tuple[str, Unknown]]"
+         "__iter__" is not present (reportArgumentType)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3437:18 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3437:43 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3438:9 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3438:16 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3438:24 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3438:31 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3439:9 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
+   .../projects/sympy/sympy/solvers/ode/ode.py:3439:15 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)

... (truncated 302 lines) ...

@rchiodo rchiodo merged commit 321c182 into microsoft:main Feb 4, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants