Skip to content

Comments

Port :erlang.binary_to_term/1 to JS#458

Open
0bvim wants to merge 33 commits intobartblast:devfrom
0bvim:port-erlang-binary-to-term-1
Open

Port :erlang.binary_to_term/1 to JS#458
0bvim wants to merge 33 commits intobartblast:devfrom
0bvim:port-erlang-binary-to-term-1

Conversation

@0bvim
Copy link

@0bvim 0bvim commented Dec 15, 2025

Implements :erlang.binary_to_term/1 to decode Erlang External Term Format (ETF) binaries, enabling support for Code.fetch_docs/1 in the browser.

Implementation

Created a dedicated EtfDecoder class that handles ETF parsing with support for the following tag types:

Integers:

  • SMALL_INTEGER_EXT (97), INTEGER_EXT (98)
  • SMALL_BIG_EXT (110), LARGE_BIG_EXT (111)

Atoms:

  • ATOM_EXT (100), SMALL_ATOM_EXT (115)
  • ATOM_UTF8_EXT (118), SMALL_ATOM_UTF8_EXT (119)

Collections:

  • BINARY_EXT (109) - binaries/strings
  • SMALL_TUPLE_EXT (104), LARGE_TUPLE_EXT (105)
  • NIL_EXT (106), STRING_EXT (107), LIST_EXT (108)
  • MAP_EXT (116)

The decoder uses a recursive approach with offset tracking, leveraging DataView for efficient multi-byte value reading. Error handling matches Erlang's exact error messages.

Testing

  • JavaScript unit tests: 23 tests covering all implemented ETF types, edge cases, and error conditions
  • Elixir consistency tests: 26 tests verifying behavior matches OTP exactly
  • All existing tests (544) continue to pass

describetag :binary added to binary_to_term test. To allow the same use of moduletag :consistency. But focused to execute only specific kinds of test instead all tests in module.
Later we can improve test sections to gather all tests from the same type.
If you think that it's a good idea, later I can add it to each section in a separated pr.

I guess that is a good idea add a Makefile to this repository and add targets to ease tests and etc... what you think about?

image image image

Summary by CodeRabbit

  • New Features

    • Public Erlang binary decoder supporting full ETF term set, compressed input, and stricter validation/error messages.
  • Tests

    • Large JS and Elixir test suites covering decoding, round-trip cases, compressed data, edge cases, and many error scenarios.
  • Tests (Helpers)

    • Added an async test helper to assert expected boxed errors and centralized error-validation for test assertions.

Copy link
Owner

@bartblast bartblast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! Added a few comments.

@bartblast
Copy link
Owner

bartblast commented Dec 25, 2025

I guess that is a good idea add a Makefile to this repository and add targets to ease tests and etc... what you think about?

Thanks for the suggestion! There are already a few mix tasks and test tags that provide similar functionality. To be honest, I don't see a strong need for such granular control of specific subsets of tests right now - you can achieve the same goal by:

  • Running tests at a specific line location: mix test path/to/test.exs:123
  • Using the @moduletag or @tag filters in Elixir tests
  • Applying .only to a specific describe block in JS tests

If you find specific workflows that aren't well-supported by the existing tooling, we could revisit this. For now, let's keep the build tooling simple.

@bartblast
Copy link
Owner

Could you pull the new changes from the dev branch? (the CI reports the PR branch has conflicts)

@0bvim
Copy link
Author

0bvim commented Dec 26, 2025

Could you pull the new changes from the dev branch? (the CI reports the PR branch has conflicts)

ofc!

@0bvim
Copy link
Author

0bvim commented Dec 26, 2025

I guess that is a good idea add a Makefile to this repository and add targets to ease tests and etc... what you think about?

There are already a few mix tasks and test tags with similar function. To be honest I don't see a point in such a granular control of a specific subset of sets - you can achieve the same goal with running a test with specific line location in mix test or apply .only to the given describe block in JS tests.

Thanks for the suggestion! There are already a few mix tasks and test tags that provide similar functionality. To be honest, I don't see a strong need for such granular control of specific subsets of tests right now - you can achieve the same goal by:

  • Running tests at a specific line location: mix test path/to/test.exs:123
  • Using the @moduletag or @tag filters in Elixir tests
  • Applying .only to a specific describe block in JS tests

If you find specific workflows that aren't well-supported by the existing tooling, we could revisit this. For now, let's keep the build tooling simple.

Nice!!! I'm used too use Makefile since my first project... So I ever want to use it... So your arguments are pretty nice. I'll get used to elixir tooling.

I will fix all requests and pull new changes from dev branch and update this pr!

@bartblast
Copy link
Owner

Hey @0bvim, just checking in - do you need any help getting this PR across the finish line?

@0bvim
Copy link
Author

0bvim commented Jan 17, 2026

Hey @0bvim, just checking in - do you need any help getting this PR across the finish line?

Hello @bartblast!
Probably I'll do.
Sorry for delay in responding... this week was an chaos to me... :(
Today, sunday and next week I'll work on it...
Now I have enough time... get it? :3

@bartblast
Copy link
Owner

@0bvim Do you think you’ll finish it by next Friday? I’m planning a big release with all the ported functions around January 25th.

@0bvim
Copy link
Author

0bvim commented Jan 17, 2026

@0bvim Do you think you’ll finish it by next Friday? I’m planning a big release with all the ported functions around January 25th.
@bartblast
Ofc! I will ask your help if have any question!

@coderabbitai
Copy link

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough

Walkthrough

Adds an exported async Erlang ETF decoder binary_to_term/1 with full tag-based decoding, COMPRESSED-term decompression using DecompressionStream, extensive validation and offset checks, and corresponding JS and Elixir test suites plus an async boxed-error test helper.

Changes

Cohort / File(s) Summary
Core ETF Decoder Implementation
assets/js/erlang/erlang.mjs
Adds exported async binary_to_term/1. Implements ETF version validation, tag-dispatching decodeTerm with decoders for integers/bigints, atoms (UTF‑8/Latin‑1), binaries/bitstrings (including partial bytes), tuples, lists, maps, floats, refs/PIDs/ports/exports, COMPRESSED handling via zlibInflate using DecompressionStream, offset propagation, strict boundary checks, and unified "invalid external representation of a term" errors.
JavaScript Tests
test/javascript/erlang/erlang_test.mjs
Adds extensive binary_to_term test suite covering ETF tags, round‑trips, compressed terms, malformed/truncated inputs, and edge cases. Imports new async error helper assertBoxedErrorAsync.
Elixir Consistency Tests
test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs
Adds comprehensive Elixir tests for parity with JS decoding, exercising many ETF tags, compressed cases, round‑trips via term_to_binary/binary_to_term, and invalid-input/error scenarios.
Test Helpers
test/javascript/support/helpers.mjs
Introduces internal validateBoxedError and exports new assertBoxedErrorAsync(asyncCallable, expectedErrorType, expectedErrorMessage); refactors assertBoxedError to delegate to shared validation logic.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Erlang.binary_to_term
  participant DecompressionStream
  participant Decoder

  Caller->>Erlang.binary_to_term: call(binary)
  Erlang.binary_to_term->>Erlang.binary_to_term: validate type & ETF version
  alt COMPRESSED tag
    Erlang.binary_to_term->>DecompressionStream: decompress payload
    DecompressionStream-->>Erlang.binary_to_term: decompressed bytes
    Erlang.binary_to_term->>Decoder: decodeTerm(decompressed bytes, offset=0)
  else uncompressed
    Erlang.binary_to_term->>Decoder: decodeTerm(input bytes, offset=0)
  end
  Decoder->>Decoder: dispatch on ETF tags, decode nested terms, update offsets
  Decoder-->>Erlang.binary_to_term: parsed term + final offset
  Erlang.binary_to_term->>Erlang.binary_to_term: verify full consumption
  Erlang.binary_to_term-->>Caller: return term or throw error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • bartblast

Poem

🐰 I nibble bytes and unzip the night,

I hop through atoms, tuples taking flight.
Bits become carrots, lists a leafy heap,
Decompressed secrets wake from sleep.
JS and Elixir dance — a binary leap.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately describes the main objective of the PR: porting the Erlang binary_to_term/1 function to JavaScript.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch 2 times, most recently from 3905854 to e859979 Compare January 17, 2026 02:43
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@assets/js/erlang/etf_decoder.mjs`:
- Around line 110-113: The default branch in the switch that handles ETF tags
currently calls Interpreter.raiseArgumentError with a custom message; update
that call in etf_decoder.mjs (the default: branch that raises via
Interpreter.raiseArgumentError) to use the exact string "invalid external
representation of a term" so it matches the other ETF validation errors and the
tests in test/javascript/erlang/etf_decoder_test.mjs.
- Around line 28-49: The decode(binary) method can throw RangeError/TypeError
from DataView reads when given truncated/malformed binaries; wrap the body of
static decode(binary) in a try-catch that catches RangeError and TypeError and,
on catch, call
Interpreter.raiseArgumentError(Interpreter.buildArgumentErrorMsg(1, "invalid
external representation of a term")); keep the existing successful path that
calls $.#decodeTerm(dataView, bytes, 1) and returns result.term, rethrow any
other unexpected errors.
♻️ Duplicate comments (1)
assets/js/erlang/erlang.mjs (1)

475-769: binary_to_term/1 still calls EtfDecoder, which won’t be bundled.
The compiler only bundles code between the // Start/// End markers, so EtfDecoder won’t exist in the client bundle. This will throw at runtime. Use the inlined helpers and fix recursive calls (decodeTerm instead of $.#decodeTerm).

🛠️ Suggested inlined decode path
-        const result = $.#decodeTerm(dataView, bytes, currentOffset);
+        const result = decodeTerm(dataView, bytes, currentOffset);
-        const result = $.#decodeTerm(dataView, bytes, currentOffset);
+        const result = decodeTerm(dataView, bytes, currentOffset);
-      const tailResult = $.#decodeTerm(dataView, bytes, currentOffset);
+      const tailResult = decodeTerm(dataView, bytes, currentOffset);
-      const keyResult = $.#decodeTerm(dataView, bytes, currentOffset);
-      const valueResult = $.#decodeTerm(dataView, bytes, keyResult.newOffset);
+      const keyResult = decodeTerm(dataView, bytes, currentOffset);
+      const valueResult = decodeTerm(dataView, bytes, keyResult.newOffset);
-    return EtfDecoder.decode(binary);
+    Bitstring.maybeSetBytesFromText(binary);
+    const bytes = binary.bytes;
+    const dataView = new DataView(
+      bytes.buffer,
+      bytes.byteOffset,
+      bytes.byteLength,
+    );
+
+    if (dataView.getUint8(0) !== 131) {
+      Interpreter.raiseArgumentError(
+        Interpreter.buildArgumentErrorMsg(
+          1,
+          "invalid external representation of a term",
+        ),
+      );
+    }
+
+    return decodeTerm(dataView, bytes, 1).term;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 475-771: The file defines a full set of ETF decoding functions
(decodeTerm, decodeSmallInteger, decodeInteger, decodeSmallBig, decodeLargeBig,
decodeAtom, decodeSmallAtom, decodeBinary, decodeSmallTuple, decodeLargeTuple,
decodeString, decodeList, decodeMap) but then incorrectly delegates to
EtfDecoder.decode(binary) which is undefined in production; replace that final
delegation with the inlined decoding flow: obtain the raw bytes/Uint8Array and
DataView from the incoming binary, validate the ETF version byte (e.g., 131) and
start decoding by calling decodeTerm(dataView, bytes, versionOffset) (skip the
version byte), propagate and return the decoded term.term (or raise the same
argument error on failure), using the existing helper decoders (decodeTerm and
the others) so the inlined logic is actually executed instead of calling
EtfDecoder.decode.
♻️ Duplicate comments (3)
assets/js/erlang/etf_decoder.mjs (2)

28-51: Wrap decoder logic in try-catch to normalize truncated/malformed binary errors.

DataView methods like getInt32(), getUint32(), and getUint16() throw RangeError when the offset exceeds available bytes. Currently these bubble out as JavaScript errors instead of the expected ArgumentError with the standard message. This affects truncated binary inputs.

Proposed fix
  static decode(binary) {
+   try {
      Bitstring.maybeSetBytesFromText(binary);

      const bytes = binary.bytes;
      const dataView = new DataView(
        bytes.buffer,
        bytes.byteOffset,
        bytes.byteLength,
      );

      // Check ETF version byte (must be 131)
      if (dataView.getUint8(0) !== 131) {
        Interpreter.raiseArgumentError(
          Interpreter.buildArgumentErrorMsg(
            1,
            "invalid external representation of a term",
          ),
        );
      }

      const result = $.#decodeTerm(dataView, bytes, 1);
      return result.term;
+   } catch (err) {
+     if (err instanceof RangeError || err instanceof TypeError) {
+       Interpreter.raiseArgumentError(
+         Interpreter.buildArgumentErrorMsg(
+           1,
+           "invalid external representation of a term",
+         ),
+       );
+     }
+     throw err;
+   }
  }

111-114: Align unsupported-tag error message with other ETF validation errors.

The default case uses a custom message format that differs from the standard "invalid external representation of a term" used elsewhere in this file and expected by tests.

Proposed fix
      default:
        Interpreter.raiseArgumentError(
-          `unsupported external term format tag: ${tag}`,
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
        );
assets/js/erlang/erlang.mjs (1)

558-561: Align unsupported-tag error message with other validation errors.

Same issue as in etf_decoder.mjs: the default case uses a custom message format instead of the standard error message.

Proposed fix
        default:
          Interpreter.raiseArgumentError(
-            `unsupported external term format tag: ${tag}`,
+            Interpreter.buildArgumentErrorMsg(
+              1,
+              "invalid external representation of a term",
+            ),
          );

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@mix.exs`:
- Around line 19-20: The setup alias currently uses a POSIX-only shell
invocation ("cmd sh -c 'cd test/features && mix deps.get'") which breaks on
Windows; update the mix.exs setup alias (the entry in the aliases/setup list) to
use a plain cmd variant so Mix handles OS differences — e.g. replace the POSIX
"sh -c" command with a cross-platform command such as "cmd cd test/features &&
mix deps.get" (or alternatively use a Mix-invocation that avoids cd, e.g. "cmd
mix deps.get --prefix test/features") to remove the sh dependency.
- Line 144: Replace the incorrect Mix config key `preferred_envs` with
`preferred_cli_env` in the project configuration (where `preferred_envs:
preferred_cli_env(),` appears in mix.exs) so Mix recognizes CLI env overrides;
alternatively for Elixir 1.15+ consider moving CLI envs into a `def cli do ...
end` block and return the `preferred_cli_env` mapping there.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Line 786: The current return at the end of the binary_to_term implementation
returns the whole object from decodeTerm instead of the decoded value; change
the return to extract and return only the term from decodeTerm's result (i.e.,
call decodeTerm(dataView, bytes, 1) and return its .term). Update any variable
names if needed to capture {term, newOffset} from decodeTerm so binary_to_term/1
yields the actual decoded Erlang term.
♻️ Duplicate comments (2)
mix.exs (2)

19-20: Windows compatibility and incorrect config key issues remain unaddressed.

The issues flagged in past reviews are still present:

  1. Line 20 uses sh -c which breaks on Windows
  2. Line 144 uses preferred_envs instead of the correct preferred_cli_env

Please address the proposed fixes from the previous review comments.


14-14: Same Windows compatibility issue as line 20.

Line 14 also uses sh -c which will fail on Windows. Consider using a cross-platform approach consistent with how you handle it on line 20.

🧹 Nitpick comments (1)
assets/js/erlang/erlang.mjs (1)

767-786: Consider adding error handling for malformed/truncated ETF data.

If the ETF binary is truncated, DataView methods will throw RangeError when reading beyond bounds. Erlang's binary_to_term/1 raises ArgumentError with "invalid external representation of a term" for such cases.

Proposed enhancement
     Bitstring.maybeSetBytesFromText(binary);

     const bytes = binary.bytes;
     const dataView = new DataView(
       bytes.buffer,
       bytes.byteOffset,
       bytes.byteLength,
     );

     // Check ETF version byte (must be 131)
     if (dataView.getUint8(0) !== 131) {
       Interpreter.raiseArgumentError(
         Interpreter.buildArgumentErrorMsg(
           1,
           "invalid external representation of a term",
         ),
       );
     }

-    return decodeTerm(dataView, bytes, 1);
+    try {
+      return decodeTerm(dataView, bytes, 1).term;
+    } catch (err) {
+      if (err instanceof RangeError) {
+        Interpreter.raiseArgumentError(
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
+        );
+      }
+      throw err;
+    }
   },

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from f235867 to 8fe18a1 Compare January 19, 2026 02:56
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 767-786: The code currently creates a DataView and reads bytes
(Bitstring.maybeSetBytesFromText, new DataView(...), dataView.getUint8(0),
decodeTerm) which can throw RangeError/TypeError for malformed/truncated
binaries; wrap the DataView construction and the initial getUint8/check plus the
decodeTerm call in a try/catch that catches RangeError and TypeError and maps
them to the same ArgumentError used elsewhere by calling
Interpreter.raiseArgumentError(Interpreter.buildArgumentErrorMsg(1, "invalid
external representation of a term")); rethrow other unexpected errors unchanged.
- Around line 715-743: The decodeList function incorrectly treats a non-empty
proper list tail as an improper list; change the tail handling in decodeList
(where decodeTerm is used and tailResult is inspected) so that if
Type.isList(tailResult.term) and tailResult.term.data.length > 0 you concatenate
tailResult.term.data onto elements and return Type.list(mergedElements) with the
updated newOffset; keep the existing NIL (empty list) branch and the
improper-list fallback (using Type.improperList) for non-list tails.
♻️ Duplicate comments (1)
test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs (1)

1727-1872: Missing tests for several ETF tags (still outstanding).
This block still doesn’t cover LARGE_BIG_EXT, ATOM_EXT, SMALL_ATOM_EXT, ATOM_UTF8_EXT (non‑small), LARGE_TUPLE_EXT, or improper list tails, which were previously requested.

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from 8fe18a1 to 31f3b64 Compare January 19, 2026 03:08
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 3-8: Remove the Git conflict markers (<<<<<<<, =======, >>>>>>>)
from the import block in erlang.mjs and reconcile the imports so the module
imports are valid; ensure the imports for Bitstring, ERTS (if used) and
HologramBoxedError are present and correctly named (e.g., keep or remove the
ERTS import based on its usage in this file), then run a quick lint/build to
confirm no syntax errors remain.
♻️ Duplicate comments (3)
test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs (1)

1894-2039: Remaining ETF tag coverage gaps.
Several supported tags/edge-cases still lack explicit tests (e.g., LARGE_BIG_EXT, ATOM_EXT/SMALL_ATOM_EXT non‑UTF8, ATOM_UTF8_EXT non‑small, LARGE_TUPLE_EXT, improper list tails). Please add targeted cases to close the gap.

assets/js/erlang/erlang.mjs (2)

804-832: Proper list tails are still decoded as improper lists.
Non-empty proper list tails should be merged into a proper list, not treated as improper.

✅ Proposed fix
-      // If tail is NIL (empty list), it's a proper list
-      if (Type.isList(tailResult.term) && tailResult.term.data.length === 0) {
-        return {
-          term: Type.list(elements),
-          newOffset: currentOffset,
-        };
-      }
-
-      // Otherwise, it's an improper list
-      elements.push(tailResult.term);
-      return {
-        term: Type.improperList(elements),
-        newOffset: currentOffset,
-      };
+      // If tail is a list, merge it to preserve proper list semantics
+      if (Type.isList(tailResult.term)) {
+        const merged = elements.concat(tailResult.term.data);
+
+        return Type.isProperList(tailResult.term)
+          ? {term: Type.list(merged), newOffset: currentOffset}
+          : {term: Type.improperList(merged), newOffset: currentOffset};
+      }
+
+      // Otherwise, it's an improper list
+      elements.push(tailResult.term);
+      return {
+        term: Type.improperList(elements),
+        newOffset: currentOffset,
+      };

856-875: Normalize malformed payload errors to ArgumentError.
RangeError/TypeError from DataView reads should map to the same ArgumentError used elsewhere for invalid ETF binaries.

🛡️ Proposed fix
-    Bitstring.maybeSetBytesFromText(binary);
-
-    const bytes = binary.bytes;
-    const dataView = new DataView(
-      bytes.buffer,
-      bytes.byteOffset,
-      bytes.byteLength,
-    );
-
-    // Check ETF version byte (must be 131)
-    if (dataView.getUint8(0) !== 131) {
-      Interpreter.raiseArgumentError(
-        Interpreter.buildArgumentErrorMsg(
-          1,
-          "invalid external representation of a term",
-        ),
-      );
-    }
-
-    return decodeTerm(dataView, bytes, 1).term;
+    try {
+      Bitstring.maybeSetBytesFromText(binary);
+
+      const bytes = binary.bytes;
+      const dataView = new DataView(
+        bytes.buffer,
+        bytes.byteOffset,
+        bytes.byteLength,
+      );
+
+      // Check ETF version byte (must be 131)
+      if (dataView.getUint8(0) !== 131) {
+        Interpreter.raiseArgumentError(
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
+        );
+      }
+
+      return decodeTerm(dataView, bytes, 1).term;
+    } catch (err) {
+      if (err instanceof RangeError || err instanceof TypeError) {
+        Interpreter.raiseArgumentError(
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
+        );
+      }
+      throw err;
+    }
🧹 Nitpick comments (1)
test/javascript/erlang/erlang_test.mjs (1)

21-21: Keep the test runner API consistent (import it with describe).
Mixing node:test’s describe with a global it is fragile if globals aren’t enabled; importing both avoids runner/linter mismatch.

♻️ Suggested tweak
-import { describe } from "node:test";
+import { describe, it } from "node:test";

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from 31f3b64 to 9b42f9c Compare January 19, 2026 04:02
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 710-746: The decode functions (decodeAtom, decodeSmallAtom,
decodeBinary and decodeString) currently call bytes.slice(...) without verifying
the requested payload fits in the buffer, allowing silent truncation; before
slicing, validate that offset + headerSize + length <= bytes.length (headerSize
is 2 for decodeAtom, 1 for decodeSmallAtom, 4 for decodeBinary and the
equivalent for decodeString) and throw an error (or return a parse-failure) if
the payload overruns the buffer so malformed external term data is rejected
instead of decoded truncated data.

In `@mix.exs`:
- Line 16: The prettier command string contains an incorrectly double-quoted
glob (''scripts/**/*.{mjs,js,json}''), which will pass literal quotes to the
shell and break matching; update the command (the string that begins with "cmd
assets/node_modules/.bin/prettier ...") to remove the extra single quotes so the
glob is 'scripts/**/*.{mjs,js,json}' (i.e., replace the doubled single-quotes
with a single pair).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 728-754: The bounds checks are wrong: in decodeSmallAtom the
header is 1 byte so change the check to validate that offset + 1 + length <=
bytes.length (i.e. error if offset + 1 + length > bytes.length) and ensure the
slice uses offset + 1 .. offset + 1 + length and newOffset is offset + 1 +
length; in decodeBinary the header is 4 bytes so check offset + 4 + length <=
bytes.length (error if offset + 4 + length > bytes.length), ensure the payload
slice uses offset + 4 .. offset + 4 + length, update newOffset to offset + 4 +
length, and fix the error message text to mention "binary" instead of "atom";
apply these changes in the functions decodeSmallAtom and decodeBinary.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs`:
- Around line 1925-1928: The test uses 1_000_000_000_000 which only exercises
SMALL_BIG_EXT; replace that literal with a deterministic integer that requires
>255 bytes so LARGE_BIG_EXT (tag 111) is used, e.g. set big_int = 1 <<< (8 *
255) and then use binary = :erlang.term_to_binary(big_int) and assert
:erlang.binary_to_term(binary) == big_int in the test "decodes large positive
integer (LARGE_BIG_EXT)".

In `@test/javascript/erlang/erlang_test.mjs`:
- Around line 2709-2716: The test "decodes large positive integer
(LARGE_BIG_EXT)" is using the SMALL_BIG_EXT tag (110) and a 1-byte length, so it
doesn't actually exercise LARGE_BIG_EXT; update the test vector used by
Bitstring.fromBytes in that test so the tag byte is changed to LARGE_BIG_EXT
(111) and the arity/length is encoded as a 4-byte little-endian unsigned integer
per Erlang's LARGE_BIG_EXT format, keeping the sign byte (0 for positive) and
the same magnitude bytes for 1000000000000n; the change should be made where
binary_to_term is invoked in the test to ensure Type.integer(1000000000000n) is
validated against a true LARGE_BIG_EXT encoding.
♻️ Duplicate comments (2)
test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs (1)

2070-2077: Additional error condition tests may be needed.

Per previous review feedback, consider adding tests for:

  • Unsupported ETF tag (exercises the default case in the switch statement)
  • Malformed length fields that exceed available bytes
💡 Suggested additional tests
test "raises ArgumentError for unsupported ETF tag" do
  # Tag 112 (FUN_EXT) is not implemented
  binary = <<131, 112, 0, 0, 0, 1>>
  
  assert_error ArgumentError,
               ~r/unsupported external term format tag/,
               {:erlang, :binary_to_term, [binary]}
end

test "raises ArgumentError for malformed atom length" do
  # Atom claims length 100 but only has 3 bytes of data
  binary = <<131, 119, 100, "abc">>
  
  assert_error ArgumentError,
               build_argument_error_msg(1, "invalid external representation of a term"),
               {:erlang, :binary_to_term, [binary]}
end
assets/js/erlang/erlang.mjs (1)

748-761: Critical: decodeBinary bounds check is off by 2 bytes.

The bounds check uses offset + 2 + length but BINARY_EXT has a 4-byte length header (read via getUint32), so it should be offset + 4 + length. This mismatch allows malformed binaries with insufficient data to pass validation, causing silent truncation via bytes.slice().

Also, the error message incorrectly references "atom" instead of "binary".

🐛 Proposed fix
 const decodeBinary = (dataView, bytes, offset) => {
   const length = dataView.getUint32(offset);
-  if (offset + 2 + length > bytes.length) {
+  if (offset + 4 + length > bytes.length) {
     Interpreter.raiseArgumentError(
-      `atom length exceeds available bytes: ${length}`,
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
     );
   }
   const binaryBytes = bytes.slice(offset + 4, offset + 4 + length);
🧹 Nitpick comments (3)
assets/js/erlang/erlang.mjs (3)

710-726: Error messages should use consistent format.

The error messages in decodeAtom and decodeSmallAtom use a custom format with internal details (atom length exceeds available bytes: ${length}) rather than the standard buildArgumentErrorMsg format used elsewhere in the codebase. For consistency with Erlang's error behavior, use the standard error message.

♻️ Suggested refactor
 const decodeAtom = (dataView, bytes, offset, isUtf8) => {
   const length = dataView.getUint16(offset);
   if (offset + 2 + length > bytes.length) {
     Interpreter.raiseArgumentError(
-      `atom length exceeds available bytes: ${length}`,
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
     );
   }
   // ...
 };

 const decodeSmallAtom = (dataView, bytes, offset, isUtf8) => {
   const length = dataView.getUint8(offset);
   if (offset + 1 + length > bytes.length) {
     Interpreter.raiseArgumentError(
-      `atom length exceeds available bytes: ${length}`,
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
     );
   }
   // ...
 };

Also applies to: 728-743


668-706: Add bounds checks for big integer decoders.

decodeSmallBig and decodeLargeBig read bytes in a loop without first validating that sufficient data exists. While the outer try-catch handles RangeError, explicit validation provides clearer error messages and matches the pattern used in atom/binary decoders.

♻️ Suggested refactor for decodeSmallBig
 const decodeSmallBig = (dataView, bytes, offset) => {
   const n = dataView.getUint8(offset);
   const sign = dataView.getUint8(offset + 1);
+  
+  if (offset + 2 + n > bytes.length) {
+    Interpreter.raiseArgumentError(
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
+    );
+  }

   let value = 0n;
   for (let i = 0; i < n; i++) {
♻️ Suggested refactor for decodeLargeBig
 const decodeLargeBig = (dataView, bytes, offset) => {
   const n = dataView.getUint32(offset);
   const sign = dataView.getUint8(offset + 4);
+  
+  if (offset + 5 + n > bytes.length) {
+    Interpreter.raiseArgumentError(
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
+    );
+  }

   let value = 0n;
   for (let i = 0; i < n; i++) {

801-819: Error message format inconsistency in decodeString.

Similar to the atom decoders, decodeString uses a custom error message format. For consistency, use buildArgumentErrorMsg.

♻️ Suggested fix
 const decodeString = (dataView, bytes, offset) => {
   const length = dataView.getUint16(offset);
   if (offset + 2 + length > bytes.length) {
     Interpreter.raiseArgumentError(
-      `string length exceeds available bytes: ${length}`,
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
     );
   }

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from 7a77563 to 918f417 Compare January 23, 2026 00:11
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 683-686: The default branch currently raises a custom message
using Interpreter.raiseArgumentError with the tag string; change it to use the
same standard malformed ETF message as other cases by calling
Interpreter.raiseArgumentError(Interpreter.buildArgumentErrorMsg(1, "invalid
external representation of a term")) instead of the custom `unsupported external
term format tag: ${tag}` so the error matches the pattern used elsewhere
(referencing the Interpreter.raiseArgumentError and
Interpreter.buildArgumentErrorMsg symbols and the local tag variable in that
default case).

In `@test/javascript/erlang/erlang_test.mjs`:
- Line 21: Remove the explicit Node test import "import { describe } from
\"node:test\""; it shadows Mocha's global describe and breaks the test
runner—delete that import line in erlang_test.mjs and rely on Mocha's globally
provided describe (no other changes needed to the describe usages).
♻️ Duplicate comments (3)
assets/js/erlang/erlang.mjs (1)

788-800: Fix BINARY_EXT payload bounds check (header size + error message).

BINARY_EXT uses a 4‑byte length header; the current offset + 2 + length check can accept truncated binaries and slice will silently clamp. Also the error text references “atom”. This can decode malformed payloads instead of raising the standard badarg.

🔧 Proposed fix
 const decodeBinary = (dataView, bytes, offset) => {
   const length = dataView.getUint32(offset);
-  if (offset + 2 + length > bytes.length) {
-    Interpreter.raiseArgumentError(
-      `atom length exceeds available bytes: ${length}`,
-    );
-  }
+  if (offset + 4 + length > bytes.length) {
+    Interpreter.raiseArgumentError(
+      Interpreter.buildArgumentErrorMsg(
+        1,
+        "invalid external representation of a term",
+      ),
+    );
+  }
   const binaryBytes = bytes.slice(offset + 4, offset + 4 + length);
 
   return {
test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs (1)

1925-1928: LARGE_BIG_EXT test still exercises SMALL_BIG_EXT.

1_000_000_000_000 fits in SMALL_BIG_EXT; this test won’t validate the LARGE_BIG_EXT path. Use an integer that requires >255 bytes.

🔧 Proposed fix
 test "decodes large positive integer (LARGE_BIG_EXT)" do
-  binary = :erlang.term_to_binary(1_000_000_000_000)
-  assert :erlang.binary_to_term(binary) == 1_000_000_000_000
+  # LARGE_BIG_EXT requires >255 bytes; 2^(8*256) is 257 bytes.
+  large_int = 1 <<< (8 * 256)
+  binary = :erlang.term_to_binary(large_int)
+  assert :erlang.binary_to_term(binary) == large_int
 end
test/javascript/erlang/erlang_test.mjs (1)

2709-2717: Fix LARGE_BIG_EXT tag in the test vector.

The test claims to verify LARGE_BIG_EXT support but the byte sequence uses tag 110 (SMALL_BIG_EXT) with 1-byte length encoding. LARGE_BIG_EXT uses tag 111 with 4-byte big-endian length.

Proposed fix
       it("decodes large positive integer (LARGE_BIG_EXT)", () => {
-        // :erlang.term_to_binary(1000000000000) but manually crafted with LARGE_BIG_EXT tag
-        // Note: In practice, Erlang might use SMALL_BIG_EXT for this, but we test LARGE_BIG_EXT support
+        // Manually crafted LARGE_BIG_EXT encoding for 1000000000000
+        // Tag 111, 4-byte length (5), sign byte (0), magnitude bytes
         const binary = Bitstring.fromBytes(
-          new Uint8Array([131, 110, 5, 0, 0, 16, 165, 212, 232]),
+          new Uint8Array([131, 111, 0, 0, 0, 5, 0, 0, 16, 165, 212, 232]),
         );
         const result = binary_to_term(binary);
         assert.deepStrictEqual(result, Type.integer(1000000000000n));
       });

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from ff25c46 to 14df279 Compare January 23, 2026 01:27
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs`:
- Around line 1989-1997: The test "decodes large tuple (LARGE_TUPLE_EXT)" is
building an invalid manual ETF by concatenating <<97, i::8>> which truncates
values >255; instead construct the tuple natively (e.g. build a list of elements
for arity then convert to a tuple) and call :erlang.term_to_binary(tuple) to
produce a correct LARGE_TUPLE_EXT binary, then feed that binary to
:erlang.binary_to_term and keep the existing assertions (references: variables
arity, elements, binary, result and the test name).
🧹 Nitpick comments (1)
test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs (1)

2069-2075: Rename test to clarify it tests NEWER_REFERENCE_EXT, not NEW_REFERENCE_EXT.

On modern OTP (23+), :erlang.term_to_binary(make_ref()) emits NEWER_REFERENCE_EXT (tag 90), not NEW_REFERENCE_EXT (tag 114). The test name is misleading—either rename it to reflect the actual tag being tested, or add a handcrafted NEW_REFERENCE_EXT binary to ensure tag 114 coverage.

✏️ Suggested rename (if keeping term_to_binary)
-    test "decodes NEW_REFERENCE_EXT" do
+    test "decodes reference (NEWER_REFERENCE_EXT via term_to_binary)" do

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@test/javascript/erlang/erlang_test.mjs`:
- Around line 3278-3288: The error message in decodeBinary is a copy/paste from
atom decoding; update the thrown message inside function decodeBinary (in
erlang.mjs) to use "binary length exceeds available bytes" instead of "atom
length exceeds available bytes" when the declared BINARY_EXT length is larger
than remaining bytes; locate the throw or assert inside decodeBinary that fires
on length > available and change only the message text to reference "binary" so
tests and logs reflect the correct type.
♻️ Duplicate comments (1)
test/javascript/erlang/erlang_test.mjs (1)

2708-2716: Fix LARGE_BIG_EXT tag in the test vector.

The test name indicates LARGE_BIG_EXT (tag 111), but the byte sequence uses SMALL_BIG_EXT (tag 110) with a 1-byte length. LARGE_BIG_EXT uses a 4-byte big-endian length.

Proposed fix
       it("decodes large positive integer (LARGE_BIG_EXT)", () => {
-        // :erlang.term_to_binary(1000000000000) but manually crafted with LARGE_BIG_EXT tag
-        // Note: In practice, Erlang might use SMALL_BIG_EXT for this, but we test LARGE_BIG_EXT support
+        // Manually crafted LARGE_BIG_EXT encoding for 1000000000000
         const binary = Bitstring.fromBytes(
-          new Uint8Array([131, 110, 5, 0, 0, 16, 165, 212, 232]),
+          new Uint8Array([131, 111, 0, 0, 0, 5, 0, 0, 16, 165, 212, 232]),
         );
         const result = binary_to_term(binary);
         assert.deepStrictEqual(result, Type.integer(1000000000000n));
       });

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@test/javascript/erlang/erlang_test.mjs`:
- Around line 3007-3014: The BIT_BINARY_EXT test constructs an ETF bitstring
with Bits=0 which is invalid; update the test case in erlang_test.mjs where
Bitstring.fromBytes(...) is used to encode the two-byte bitstring so the Bits
field is 8 (not 0) for the last byte; specifically modify the byte array passed
to Bitstring.fromBytes (used by binary_to_term) so the ETF header's Bits value
is 8, then run the same assertion that Bitstring.calculateBitCount(result)
equals 16.
♻️ Duplicate comments (5)
assets/js/erlang/erlang.mjs (3)

687-690: Use standard Erlang error message for unsupported tags.

The default case uses a custom error message, while all other invalid ETF scenarios use Interpreter.buildArgumentErrorMsg(1, "invalid external representation of a term"). This inconsistency affects parity with Erlang's behavior.

Proposed fix
       default:
         Interpreter.raiseArgumentError(
-          `unsupported external term format tag: ${tag}`,
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
         );

754-788: Inconsistent error messages in atom decoders.

Both decodeAtom and decodeSmallAtom use custom error messages that diverge from the standard format used elsewhere in this function. For OTP parity, use the standard error message.

Proposed fix
     const decodeAtom = (dataView, bytes, offset, isUtf8) => {
       const length = dataView.getUint16(offset);
       if (offset + 2 + length > bytes.length) {
         Interpreter.raiseArgumentError(
-          `atom length exceeds available bytes: ${length}`,
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
         );
       }
       // ...
     };

     const decodeSmallAtom = (dataView, bytes, offset, isUtf8) => {
       const length = dataView.getUint8(offset);
       if (offset + 1 + length > bytes.length) {
         Interpreter.raiseArgumentError(
-          `atom length exceeds available bytes: ${length}`,
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
         );
       }
       // ...
     };

792-805: Critical: decodeBinary bounds check uses wrong header size and incorrect error message.

The bounds check at line 794 uses offset + 2 + length but BINARY_EXT has a 4-byte length prefix, so it should be offset + 4 + length. The current code may accept truncated binaries that should be rejected. Additionally, the error message incorrectly references "atom" instead of "binary".

Proposed fix
     const decodeBinary = (dataView, bytes, offset) => {
       const length = dataView.getUint32(offset);
-      if (offset + 2 + length > bytes.length) {
+      if (offset + 4 + length > bytes.length) {
         Interpreter.raiseArgumentError(
-          `atom length exceeds available bytes: ${length}`,
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
         );
       }
       const binaryBytes = bytes.slice(offset + 4, offset + 4 + length);
test/javascript/erlang/erlang_test.mjs (2)

2708-2716: LARGE_BIG_EXT test still uses SMALL_BIG_EXT tag (110) instead of LARGE_BIG_EXT (111).

This issue was flagged in a previous review. The test is named "decodes large positive integer (LARGE_BIG_EXT)" but uses tag 110 (SMALL_BIG_EXT). To properly test LARGE_BIG_EXT support, use tag 111 with a 4-byte big-endian length encoding.

Proposed fix
       it("decodes large positive integer (LARGE_BIG_EXT)", () => {
-        // :erlang.term_to_binary(1000000000000) but manually crafted with LARGE_BIG_EXT tag
-        // Note: In practice, Erlang might use SMALL_BIG_EXT for this, but we test LARGE_BIG_EXT support
+        // Manually crafted LARGE_BIG_EXT (tag 111) with 4-byte length encoding
         const binary = Bitstring.fromBytes(
-          new Uint8Array([131, 110, 5, 0, 0, 16, 165, 212, 232]),
+          new Uint8Array([131, 111, 0, 0, 0, 5, 0, 0, 16, 165, 212, 232]),
         );
         const result = binary_to_term(binary);
         assert.deepStrictEqual(result, Type.integer(1000000000000n));
       });

3299-3309: Test asserts incorrect error message from implementation.

This issue was flagged in a previous review. The test verifies behavior for malformed BINARY_EXT but expects the error message "atom length exceeds available bytes". The implementation's decodeBinary function has a copy-paste error—it should say "binary length exceeds available bytes" instead.

The test correctly captures current implementation behavior, but the implementation error message needs correction in erlang.mjs.

🧹 Nitpick comments (1)
assets/js/erlang/erlang.mjs (1)

845-863: Inconsistent error message in decodeString.

Uses custom error message format instead of the standard Interpreter.buildArgumentErrorMsg(1, "invalid external representation of a term").

Proposed fix
     const decodeString = (dataView, bytes, offset) => {
       const length = dataView.getUint16(offset);
       if (offset + 2 + length > bytes.length) {
         Interpreter.raiseArgumentError(
-          `string length exceeds available bytes: ${length}`,
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
         );
       }

0bvim added 2 commits February 2, 2026 20:44
Replace cli with preferred_cli_env in the project config.
Update format aliases: use 'cmd cd test/features && mix format' and
broaden the prettier scripts glob.
Raise ArgumentError when BIT_BINARY_EXT 'bits' byte is not 1..8. Update
tests to use bits=8 for the full-byte case and apply minor formatting
fixes in test fixtures.
@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch 2 times, most recently from 43c9a9a to 82983bb Compare February 3, 2026 20:03
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@test/javascript/erlang/erlang_test.mjs`:
- Around line 2943-2960: The test "decodes large tuple (LARGE_TUPLE_EXT)"
currently encodes element values >255 using SMALL_INTEGER_EXT (variable
elementBytes and the for loop), causing wraparound; either update the comment to
state that values wrap modulo 256, or change the encoding logic so values >255
are emitted as INTEGER_EXT (tag 98) with a 4-byte big-endian signed integer
payload. To fix, inside the for loop that builds elementBytes, branch on i: if i
<= 255 keep SMALL_INTEGER_EXT (97) + one byte, otherwise emit INTEGER_EXT (98)
followed by the 4 big-endian bytes of i; update tests/comments accordingly to
reflect the chosen behavior.
🧹 Nitpick comments (2)
assets/js/erlang/erlang.mjs (1)

921-924: Use TextDecoder("iso-8859-1") for consistency and robustness.

While ATOM_EXT's 2-byte length field limits payloads to 65,535 bytes (below String.fromCharCode's ~65,536 argument limit), using TextDecoder is more maintainable and avoids potential issues across different JS engines. Note: the proposed encoding should be "iso-8859-1" (TextDecoder does not support "latin1").

💡 Corrected fix
-      const atomString = isUtf8
-        ? new TextDecoder("utf-8").decode(atomBytes)
-        : String.fromCharCode(...atomBytes);
+      const decoder = new TextDecoder(isUtf8 ? "utf-8" : "iso-8859-1");
+      const atomString = decoder.decode(atomBytes);
test/javascript/erlang/erlang_test.mjs (1)

3538-3588: Consider deduplicating the compressed-term fixture.
The “validates internal zlibInflate functionality” test reuses the same binary as the first compressed test; a shared constant would reduce repetition.

♻️ Example refactor
     describe("compressed terms", () => {
+      const compressedHelloBinary = Bitstring.fromBytes(
+        new Uint8Array([
+          131, 80, 0, 0, 1, 249, 120, 218, 203, 101, 96, 96, 252, 146, 145,
+          154, 147, 147, 63, 74, 140, 40, 2, 0, 21, 94, 209, 51,
+        ]),
+      );
+
       it("decodes compressed term (COMPRESSED tag 80) with repeated string", async () => {
-        const binary = Bitstring.fromBytes(
-          new Uint8Array([
-            131, 80, 0, 0, 1, 249, 120, 218, 203, 101, 96, 96, 252, 146, 145,
-            154, 147, 147, 63, 74, 140, 40, 2, 0, 21, 94, 209, 51,
-          ]),
-        );
+        const binary = compressedHelloBinary;
         const result = await binary_to_term(binary);
         const expected = Type.bitstring("hello".repeat(100));
         assertBoxedStrictEqual(result, expected);
       });
@@
       it("validates internal zlibInflate functionality within binary_to_term context", async () => {
-        const binary = Bitstring.fromBytes(
-          new Uint8Array([
-            131, 80, 0, 0, 1, 249, 120, 218, 203, 101, 96, 96, 252, 146, 145,
-            154, 147, 147, 63, 74, 140, 40, 2, 0, 21, 94, 209, 51,
-          ]),
-        );
+        const binary = compressedHelloBinary;
         const result = await binary_to_term(binary);
         const expected = Type.bitstring("hello".repeat(100));
         assertBoxedStrictEqual(result, expected);
       });

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from 82983bb to 906b4ee Compare February 3, 2026 20:22
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 790-849: The decodeCompressed function currently rejects nested
COMPRESSED by checking offset !== 2; remove that top-level-only restriction and
instead decode the decompressed payload recursively with decodeTerm, then
compute and return the correct newOffset relative to the original bytes buffer
(e.g., newOffset = offset + 4 + result.newOffset where result.newOffset is how
many bytes decodeTerm consumed inside the decompressedData) so nested COMPRESSED
tags are handled; keep the size checks (uncompressedSize vs decompressed.length)
and existing error handling via Interpreter.raiseArgumentError and
Interpreter.buildArgumentErrorMsg.

0bvim added 2 commits February 3, 2026 17:33
Replace hard-coded argument-error strings with
Interpreter.buildArgumentErrorMsg
calls in erlang_test.mjs. Also normalize spacing in imports and
object-literal
entries for consistent formatting.
@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from df8be22 to 28c6ab5 Compare February 3, 2026 21:04
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 921-954: Replace the custom length-exceeded error messages in
decodeSmallAtom and decodeBinary with the standard OTP-style argument error by
calling Interpreter.raiseArgumentError(Interpreter.buildArgumentErrorMsg(1,
"invalid external representation of a term")) when the length check fails;
update the error branches inside decodeSmallAtom and decodeBinary (the checks
using offset + 1 + length and offset + 4 + length) so they use
Interpreter.buildArgumentErrorMsg(1, "invalid external representation of a
term") instead of the current template strings.
- Around line 1081-1142: Replace the custom bounds-check error strings in
decodeFloatExt and decodeBitBinary with the standard argument-error format used
elsewhere: call
Interpreter.raiseArgumentError(Interpreter.buildArgumentErrorMsg(1, "invalid
external representation of a term")) instead of the literal messages ("float
string exceeds available bytes" and "bit binary length exceeds available bytes:
...") so they match the pattern used in decodeSmallAtom/decodeBinary and other
decoders.

In `@test/elixir/hologram/ex_js_consistency/erlang/erlang_test.exs`:
- Around line 1995-1998: The test "decodes UTF-8 atom (SMALL_ATOM_UTF8_EXT)"
currently uses term_to_binary(:test) which yields an ASCII atom and will not
exercise SMALL_ATOM_UTF8_EXT (tag 119); update the test to use a true UTF-8 atom
or an explicit tag-119 binary so the decoder actually sees
SMALL_ATOM_UTF8_EXT—for example replace :test with a UTF-8 atom like :"café"
when calling :erlang.term_to_binary/1 or construct the binary manually with tag
119 before passing it to :erlang.binary_to_term/1 so that the assertion still
checks that :erlang.binary_to_term(binary) returns the expected atom.
🧹 Nitpick comments (3)
assets/js/erlang/erlang.mjs (3)

831-832: Remove console.error from production code.

Logging errors to the console before re-raising as ArgumentError may leak internal decoder state in production environments. Consider removing this line or gating it behind a debug flag.

Suggested fix
       } catch (error) {
-        console.error("Decode error:", error);
         Interpreter.raiseArgumentError(

860-898: Add explicit bounds checks to big integer decoders.

decodeSmallBig and decodeLargeBig access bytes[offset + N + i] without verifying the buffer has sufficient data. While the outer try/catch converts the resulting TypeError to ArgumentError, explicit bounds checking would be more robust and consistent with decodeAtom and decodeBinary.

Suggested fix for decodeSmallBig
     const decodeSmallBig = (dataView, bytes, offset) => {
       const n = dataView.getUint8(offset);
       const sign = dataView.getUint8(offset + 1);
+      if (offset + 2 + n > bytes.length) {
+        Interpreter.raiseArgumentError(
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
+        );
+      }

       let value = 0n;
Suggested fix for decodeLargeBig
     const decodeLargeBig = (dataView, bytes, offset) => {
       const n = dataView.getUint32(offset);
       const sign = dataView.getUint8(offset + 4);
+      if (offset + 5 + n > bytes.length) {
+        Interpreter.raiseArgumentError(
+          Interpreter.buildArgumentErrorMsg(
+            1,
+            "invalid external representation of a term",
+          ),
+        );
+      }

       let value = 0n;

1130-1137: Misleading comments about bits=0 case.

The comments at lines 1131 and 1136 mention bits=0 as a valid case, but the validation at lines 1112-1118 correctly rejects bits < 1. Since bits=0 would already raise an error before reaching this code, the comments are misleading.

Suggested comment fix
-      // Adjust leftoverBitCount based on the bits field
-      // If bits is 0, all bytes are full (leftoverBitCount = 0)
-      // If bits is 1-7, the last byte is partial (leftoverBitCount = bits)
+      // Adjust leftoverBitCount based on the bits field (valid range: 1-8)
+      // If bits is 1-7, the last byte is partial (leftoverBitCount = bits)
+      // If bits is 8, all bytes are full (leftoverBitCount = 0)
       if (bits > 0 && bits < 8) {
         bitstring.leftoverBitCount = bits;
       }
-      // If bits is 0 or 8, leftoverBitCount is already 0 (all full bytes)
+      // If bits is 8, leftoverBitCount is already 0 (all full bytes)

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from 33e0478 to e56bb09 Compare February 3, 2026 21:40
@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch 2 times, most recently from 397e87b to 2b82e36 Compare February 3, 2026 22:01
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 1415-1416: After decoding, ensure no trailing bytes remain: in the
binary_to_term/1 flow (the call to decodeTerm(dataView, bytes, 1) returning
result), check the decoded byte offset (e.g., result.offset or result.consumed)
against the total buffer length and if they differ throw an ArgumentError
indicating extra bytes; update the code around the decodeTerm call (and use the
existing ArgumentError class/constructor) instead of silently returning
result.term when extra bytes exist.
- Around line 860-898: In decodeSmallBig and decodeLargeBig validate the sign
byte read (variable sign) is either 0 or 1 and if not throw an ArgumentError;
specifically, after reading sign in decodeSmallBig (from
dataView.getUint8(offset + 1)) and in decodeLargeBig (from
dataView.getUint8(offset + 4)), add a check that sign === 0 || sign === 1 and
throw new ArgumentError("invalid BIG_EXT sign") (or the project's existing
ArgumentError) before computing the value so invalid sign bytes are not silently
treated as positive.
- Around line 790-830: In decodeCompressed, the code currently slices
compressedData to the buffer end and sets newOffset to
offset+4+compressedData.length which breaks parsing of following terms; change
decodeCompressed in function decodeCompressed to (1) determine the actual
compressed payload length instead of using bytes.slice(...).length (e.g., call
zlibInflate in a way that returns bytesRead or parse the zlib stream to compute
compressedLen), (2) after inflating call decodeTerm on the decompressed DataView
and ensure decodeTerm returns/exports the number of bytes consumed in the
decompressed buffer so you can verify decompressedBytesConsumed ===
uncompressedSize, and (3) set newOffset = offset + 4 + compressedLen (the exact
number of bytes consumed from the original bytes buffer). Ensure you
update/depend on decodeTerm to return both term and bytesRead and use that
bytesRead to validate full decompressed consumption and to compute offsets
correctly (symbols: decodeCompressed, decodeTerm, zlibInflate, uncompressedSize,
compressedData, newOffset).

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch 3 times, most recently from 90da603 to ed3c535 Compare February 4, 2026 01:23
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@assets/js/erlang/erlang.mjs`:
- Around line 831-833: Remove the debug logging inside the catch block: delete
the console.error("Decode error:", error) statement and leave the existing
error-handling call to Interpreter.raiseArgumentError intact (i.e., keep the
catch block that calls Interpreter.raiseArgumentError to surface the error
without logging internal details to the console).
🧹 Nitpick comments (1)
test/javascript/erlang/erlang_test.mjs (1)

3554-3677: De-duplicate repeated compressed payload in tests.

Two tests reuse the same compressed byte array; extracting a helper keeps the suite DRY and easier to maintain.

♻️ Suggested refactor
     describe("compressed terms", () => {
+      const repeatedHelloBinary = () =>
+        Bitstring.fromBytes(
+          new Uint8Array([
+            131, 80, 0, 0, 1, 249, 120, 218, 203, 101, 96, 96, 252, 146, 145,
+            154, 147, 147, 63, 74, 140, 40, 2, 0, 21, 94, 209, 51,
+          ]),
+        );
+
       it("decodes compressed term (COMPRESSED tag 80) with repeated string", async () => {
         // Real Erlang-generated compressed term for String.duplicate("hello", 100)
         // Generated with: :erlang.term_to_binary(String.duplicate("hello", 100), [compressed: 9])
         // Format: 131 (version) + 80 (COMPRESSED) + uncompressed_size (4 bytes) + zlib compressed data
         // Uncompressed size = 505 bytes
-        const binary = Bitstring.fromBytes(
-          new Uint8Array([
-            131, 80, 0, 0, 1, 249, 120, 218, 203, 101, 96, 96, 252, 146, 145,
-            154, 147, 147, 63, 74, 140, 40, 2, 0, 21, 94, 209, 51,
-          ]),
-        );
+        const binary = repeatedHelloBinary();
         const result = await binary_to_term(binary);
         const expected = Type.bitstring("hello".repeat(100));
         assertBoxedStrictEqual(result, expected);
       });
@@
       it("validates internal zlibInflate functionality within binary_to_term context", async () => {
         // This test validates that the zlibInflate function was successfully moved from Utils
         // to within the binary_to_term context and still works correctly for compressed terms.
 
         // Use the same working test data as the first compressed test
-        const binary = Bitstring.fromBytes(
-          new Uint8Array([
-            131, 80, 0, 0, 1, 249, 120, 218, 203, 101, 96, 96, 252, 146, 145,
-            154, 147, 147, 63, 74, 140, 40, 2, 0, 21, 94, 209, 51,
-          ]),
-        );
+        const binary = repeatedHelloBinary();
 
         const result = await binary_to_term(binary);
         const expected = Type.bitstring("hello".repeat(100));

@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch 4 times, most recently from 6b7fe75 to a1d202d Compare February 4, 2026 02:30
@0bvim 0bvim force-pushed the port-erlang-binary-to-term-1 branch from a1d202d to 6a2cc87 Compare February 4, 2026 02:33
@0bvim 0bvim requested a review from bartblast February 4, 2026 03:28
@bartblast
Copy link
Owner

Thanks for the work on this, @0bvim!

Given the size of the PR, I'd like to push merging this to the next minor release in 1-2 weeks so I can review it thoroughly. I'll get back to you soon!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants