From 35249c47519cae9e6e3ab6c12c81877016b9a084 Mon Sep 17 00:00:00 2001
From: Tito hiera
Date: Tue, 7 Apr 2026 01:21:18 +0200
Subject: [PATCH 1/5] added yield expression test

---
 CLAUDE.md                                     |  83 ++++++
 bazel-bin                                     |   1 +
 bazel-jsir                                    |   1 +
 bazel-out                                     |   1 +
 bazel-testlogs                                |   1 +
 .../conversion/tests/yield_expression/BUILD   |  41 +++
 .../tests/yield_expression/ast.json           | 246 ++++++++++++++++++
 .../tests/yield_expression/input.js           |   4 +
 .../tests/yield_expression/jshir.mlir         |  22 ++
 .../tests/yield_expression/output.js          |   4 +
 .../conversion/tests/yield_expression/run.lit |  19 ++
 11 files changed, 423 insertions(+)
 create mode 100644 CLAUDE.md
 create mode 120000 bazel-bin
 create mode 120000 bazel-jsir
 create mode 120000 bazel-out
 create mode 120000 bazel-testlogs
 create mode 100644 maldoca/js/ir/conversion/tests/yield_expression/BUILD
 create mode 100644 maldoca/js/ir/conversion/tests/yield_expression/ast.json
 create mode 100644 maldoca/js/ir/conversion/tests/yield_expression/input.js
 create mode 100644 maldoca/js/ir/conversion/tests/yield_expression/jshir.mlir
 create mode 100644 maldoca/js/ir/conversion/tests/yield_expression/output.js
 create mode 100644 maldoca/js/ir/conversion/tests/yield_expression/run.lit

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..6a8a9c3
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,83 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+JSIR is a Google project for JavaScript analysis and source-to-source transformation. It uses an MLIR-based intermediate representation with two dialect levels (JSIR and JSHIR) that supports both dataflow analysis and lossless roundtrip back to source code. Primary use cases are Hermes bytecode decompilation and JavaScript deobfuscation.
+
+## Build System
+
+This project uses Bazel (via Bazelisk) with C++17. LLVM/MLIR is fetched as an external dependency and takes significant time and disk space on first build.
+
+```shell
+# Build everything
+bazelisk build //...
+
+# Build a single target
+bazelisk build //maldoca/js/ir:jsir_gen
+
+# Build all targets in a directory
+bazelisk build //maldoca/js/ir/...
+
+# Run all tests
+bazelisk test //...
+
+# Run a specific test
+bazelisk test //maldoca/js/quickjs:quickjs_test
+
+# Run all tests under a directory
+bazelisk test //maldoca/js/ir/conversion/...
+
+# Run the main tool (jsir_gen)
+bazelisk run //maldoca/js/ir:jsir_gen -- \
+  --input_file=$(pwd)/maldoca/js/ir/conversion/tests/if_statement/input.js \
+  --passes=source2ast,ast2hir
+```
+
+The `.bazelrc` configures C++17, clang compiler, and macOS SDK settings. Both bzlmod (`MODULE.bazel`) and workspace (`WORKSPACE`) are enabled in hybrid mode.
+
+## Architecture
+
+### Pipeline: JS Source → AST → JSIR/JSHIR → AST → JS Source
+
+The core data flow is managed by the **driver** (`maldoca/js/driver/`), which orchestrates the full pipeline:
+
+1. **Parsing**: JavaScript source → Babel AST JSON (via QuickJS-embedded Babel parser in `maldoca/js/quickjs_babel/`)
+2. **AST**: JSON → C++ AST types (`maldoca/js/ast/`) — types are code-generated by `maldoca/astgen/`
+3. **IR Conversion**: AST ↔ JSHIR (`maldoca/js/ir/conversion/`) — bidirectional, with both generated and handwritten converters
+4. **Transforms**: MLIR passes on JSHIR (`maldoca/js/ir/transforms/`) — constant propagation, expression splitting, etc.
+5. **Analyses**: Dataflow analyses on JSHIR (`maldoca/js/ir/analyses/`) — scope analysis, constant propagation analysis
+6. **Roundtrip**: JSHIR → AST → JS Source
+
+### Two IR Dialects
+
+- **JSIR** (`jsir` dialect): Lower-level IR
+- **JSHIR** (`jshir` dialect): High-level IR that preserves control flow structure using MLIR regions; this is the primary working dialect
+
+Both are defined via MLIR TableGen (`.td` files in `maldoca/js/ir/`) and use `mlir-tblgen` to generate C++ code.
+
+### Code Generation
+
+Many source files are generated, not hand-authored:
+- `*.generated.cc`/`.h` files in `maldoca/js/ast/` — generated by `maldoca/astgen/`
+- `*.generated.cc` in `maldoca/js/ir/conversion/` — AST↔IR conversion code
+- `jsir_ops.generated.td` — IR op definitions
+- `.h.inc`/`.cc.inc` files — generated by MLIR TableGen from `.td` files
+
+### Key Components
+
+- **`maldoca/js/ir/`**: IR dialect definitions, ops, attributes, types. The `jsir_gen` binary is the main CLI tool.
+- **`maldoca/js/driver/`**: Orchestrates the full conversion pipeline; `JsRepr` is the central representation type with variants for source, AST, and JSHIR.
+- **`maldoca/js/quickjs/`**: QuickJS JavaScript engine wrapper.
+- **`maldoca/js/quickjs_babel/`**: Babel parser running inside QuickJS — this is how JS gets parsed without requiring Node.js.
+- **`maldoca/js/babel/`**: Babel AST types and scope information (protobuf-based).
+- **`maldoca/astgen/`**: Code generator that produces C++ AST types, visitors, and IR conversion code from protobuf definitions.
+- **`maldoca/base/`**: Utility library (status macros, filesystem, error handling).
+
+### Testing
+
+- C++ tests use GoogleTest (`cc_test` rules)
+- IR tests use LLVM's `lit` framework with `FileCheck` — test files have `.js`, `.mlir`, `.lit`, or `.txt` suffixes
+- Conversion tests are organized by JS construct in `maldoca/js/ir/conversion/tests//` with `input.js` files
+- Transform tests follow a similar pattern in `maldoca/js/ir/transforms//`
diff --git a/bazel-bin b/bazel-bin
new file mode 120000
index 0000000..83e3a9d
--- /dev/null
+++ b/bazel-bin
@@ -0,0 +1 @@
+/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin
\ No newline at end of file
diff --git a/bazel-jsir b/bazel-jsir
new file mode 120000
index 0000000..ea750c0
--- /dev/null
+++ b/bazel-jsir
@@ -0,0 +1 @@
+/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main
\ No newline at end of file
diff --git a/bazel-out b/bazel-out
new file mode 120000
index 0000000..ee1fa8c
--- /dev/null
+++ b/bazel-out
@@ -0,0 +1 @@
+/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main/bazel-out
\ No newline at end of file
diff --git a/bazel-testlogs b/bazel-testlogs
new file mode 120000
index 0000000..6142954
--- /dev/null
+++ b/bazel-testlogs
@@ -0,0 +1 @@
+/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main/bazel-out/darwin_arm64-fastbuild/testlogs
\ No newline at end of file
diff --git a/maldoca/js/ir/conversion/tests/yield_expression/BUILD b/maldoca/js/ir/conversion/tests/yield_expression/BUILD
new file mode 100644
index 0000000..51e32be
--- /dev/null
+++ b/maldoca/js/ir/conversion/tests/yield_expression/BUILD
@@ -0,0 +1,41 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+load("//bazel:lit.bzl", "glob_lit_tests")
+
+package(default_applicable_licenses = ["//:license"])
+
+licenses(["notice"])
+
+filegroup(
+    name = "test_files",
+    srcs = [
+        "ast.json",
+        "input.js",
+        "jshir.mlir",
+        "output.js",
+    ],
+    tags = ["ignore_srcs"],
+)
+
+glob_lit_tests(
+    name = "all_tests",
+    data = [
+        ":test_files",
+        "//maldoca/js/ir:lit_test_files",
+    ],
+    test_file_exts = [
+        "lit",
+    ],
+)
diff --git a/maldoca/js/ir/conversion/tests/yield_expression/ast.json b/maldoca/js/ir/conversion/tests/yield_expression/ast.json
new file mode 100644
index 0000000..31954c4
--- /dev/null
+++ b/maldoca/js/ir/conversion/tests/yield_expression/ast.json
@@ -0,0 +1,246 @@
+// AST:      {
+// AST-NEXT:   "type": "File",
+// AST-NEXT:   "loc": {
+// AST-NEXT:     "start": {
+// AST-NEXT:       "line": 1,
+// AST-NEXT:       "column": 0
+// AST-NEXT:     },
+// AST-NEXT:     "end": {
+// AST-NEXT:       "line": 5,
+// AST-NEXT:       "column": 0
+// AST-NEXT:     }
+// AST-NEXT:   },
+// AST-NEXT:   "start": 0,
+// AST-NEXT:   "end": 48,
+// AST-NEXT:   "program": {
+// AST-NEXT:     "type": "Program",
+// AST-NEXT:     "loc": {
+// AST-NEXT:       "start": {
+// AST-NEXT:         "line": 1,
+// AST-NEXT:         "column": 0
+// AST-NEXT:       },
+// AST-NEXT:       "end": {
+// AST-NEXT:         "line": 5,
+// AST-NEXT:         "column": 0
+// AST-NEXT:       }
+// AST-NEXT:     },
+// AST-NEXT:     "start": 0,
+// AST-NEXT:     "end": 48,
+// AST-NEXT:     "scopeUid": 0,
+// AST-NEXT:     "interpreter": null,
+// AST-NEXT:     "sourceType": "script",
+// AST-NEXT:     "body": [
+// AST-NEXT:       {
+// AST-NEXT:         "type": "FunctionDeclaration",
+// AST-NEXT:         "loc": {
+// AST-NEXT:           "start": {
+// AST-NEXT:             "line": 1,
+// AST-NEXT:             "column": 0
+// AST-NEXT:           },
+// AST-NEXT:           "end": {
+// AST-NEXT:             "line": 4,
+// AST-NEXT:             "column": 1
+// AST-NEXT:           }
+// AST-NEXT:         },
+// AST-NEXT:         "start": 0,
+// AST-NEXT:         "end": 47,
+// AST-NEXT:         "scopeUid": 1,
+// AST-NEXT:         "definedSymbols": [
+// AST-NEXT:           {
+// AST-NEXT:             "name": "gen",
+// AST-NEXT:             "defScopeUid": 0
+// AST-NEXT:           }
+// AST-NEXT:         ],
+// AST-NEXT:         "id": {
+// AST-NEXT:           "type": "Identifier",
+// AST-NEXT:           "loc": {
+// AST-NEXT:             "start": {
+// AST-NEXT:               "line": 1,
+// AST-NEXT:               "column": 10
+// AST-NEXT:             },
+// AST-NEXT:             "end": {
+// AST-NEXT:               "line": 1,
+// AST-NEXT:               "column": 13
+// AST-NEXT:             },
+// AST-NEXT:             "identifierName": "gen"
+// AST-NEXT:           },
+// AST-NEXT:           "start": 10,
+// AST-NEXT:           "end": 13,
+// AST-NEXT:           "scopeUid": 1,
+// AST-NEXT:           "name": "gen"
+// AST-NEXT:         },
+// AST-NEXT:         "params": [],
+// AST-NEXT:         "generator": true,
+// AST-NEXT:         "async": false,
+// AST-NEXT:         "body": {
+// AST-NEXT:           "type": "BlockStatement",
+// AST-NEXT:           "loc": {
+// AST-NEXT:             "start": {
+// AST-NEXT:               "line": 1,
+// AST-NEXT:               "column": 16
+// AST-NEXT:             },
+// AST-NEXT:             "end": {
+// AST-NEXT:               "line": 4,
+// AST-NEXT:               "column": 1
+// AST-NEXT:             }
+// AST-NEXT:           },
+// AST-NEXT:           "start": 16,
+// AST-NEXT:           "end": 47,
+// AST-NEXT:           "scopeUid": 1,
+// AST-NEXT:           "body": [
+// AST-NEXT:             {
+// AST-NEXT:               "type": "ExpressionStatement",
+// AST-NEXT:               "loc": {
+// AST-NEXT:                 "start": {
+// AST-NEXT:                   "line": 2,
+// AST-NEXT:                   "column": 2
+// AST-NEXT:                 },
+// AST-NEXT:                 "end": {
+// AST-NEXT:                   "line": 2,
+// AST-NEXT:                   "column": 10
+// AST-NEXT:                 }
+// AST-NEXT:               },
+// AST-NEXT:               "start": 20,
+// AST-NEXT:               "end": 28,
+// AST-NEXT:               "scopeUid": 1,
+// AST-NEXT:               "expression": {
+// AST-NEXT:                 "type": "YieldExpression",
+// AST-NEXT:                 "loc": {
+// AST-NEXT:                   "start": {
+// AST-NEXT:                     "line": 2,
+// AST-NEXT:                     "column": 2
+// AST-NEXT:                   },
+// AST-NEXT:                   "end": {
+// AST-NEXT:                     "line": 2,
+// AST-NEXT:                     "column": 9
+// AST-NEXT:                   }
+// AST-NEXT:                 },
+// AST-NEXT:                 "start": 20,
+// AST-NEXT:                 "end": 27,
+// AST-NEXT:                 "scopeUid": 1,
+// AST-NEXT:                 "argument": {
+// AST-NEXT:                   "type": "NumericLiteral",
+// AST-NEXT:                   "loc": {
+// AST-NEXT:                     "start": {
+// AST-NEXT:                       "line": 2,
+// AST-NEXT:                       "column": 8
+// AST-NEXT:                     },
+// AST-NEXT:                     "end": {
+// AST-NEXT:                       "line": 2,
+// AST-NEXT:                       "column": 9
+// AST-NEXT:                     }
+// AST-NEXT:                   },
+// AST-NEXT:                   "start": 26,
+// AST-NEXT:                   "end": 27,
+// AST-NEXT:                   "scopeUid": 1,
+// AST-NEXT:                   "value": 1.0,
+// AST-NEXT:                   "extra": {
+// AST-NEXT:                     "raw": "1",
+// AST-NEXT:                     "rawValue": 1.0
+// AST-NEXT:                   }
+// AST-NEXT:                 },
+// AST-NEXT:                 "delegate": false
+// AST-NEXT:               }
+// AST-NEXT:             },
+// AST-NEXT:             {
+// AST-NEXT:               "type": "ExpressionStatement",
+// AST-NEXT:               "loc": {
+// AST-NEXT:                 "start": {
+// AST-NEXT:                   "line": 3,
+// AST-NEXT:                   "column": 2
+// AST-NEXT:                 },
+// AST-NEXT:                 "end": {
+// AST-NEXT:                   "line": 3,
+// AST-NEXT:                   "column": 16
+// AST-NEXT:                 }
+// AST-NEXT:               },
+// AST-NEXT:               "start": 31,
+// AST-NEXT:               "end": 45,
+// AST-NEXT:               "scopeUid": 1,
+// AST-NEXT:               "expression": {
+// AST-NEXT:                 "type": "YieldExpression",
+// AST-NEXT:                 "loc": {
+// AST-NEXT:                   "start": {
+// AST-NEXT:                     "line": 3,
+// AST-NEXT:                     "column": 2
+// AST-NEXT:                   },
+// AST-NEXT:                   "end": {
+// AST-NEXT:                     "line": 3,
+// AST-NEXT:                     "column": 15
+// AST-NEXT:                   }
+// AST-NEXT:                 },
+// AST-NEXT:                 "start": 31,
+// AST-NEXT:                 "end": 44,
+// AST-NEXT:                 "scopeUid": 1,
+// AST-NEXT:                 "argument": {
+// AST-NEXT:                   "type": "ArrayExpression",
+// AST-NEXT:                   "loc": {
+// AST-NEXT:                     "start": {
+// AST-NEXT:                       "line": 3,
+// AST-NEXT:                       "column": 9
+// AST-NEXT:                     },
+// AST-NEXT:                     "end": {
+// AST-NEXT:                       "line": 3,
+// AST-NEXT:                       "column": 15
+// AST-NEXT:                     }
+// AST-NEXT:                   },
+// AST-NEXT:                   "start": 38,
+// AST-NEXT:                   "end": 44,
+// AST-NEXT:                   "scopeUid": 1,
+// AST-NEXT:                   "elements": [
+// AST-NEXT:                     {
+// AST-NEXT:                       "type": "NumericLiteral",
+// AST-NEXT:                       "loc": {
+// AST-NEXT:                         "start": {
+// AST-NEXT:                           "line": 3,
+// AST-NEXT:                           "column": 10
+// AST-NEXT:                         },
+// AST-NEXT:                         "end": {
+// AST-NEXT:                           "line": 3,
+// AST-NEXT:                           "column": 11
+// AST-NEXT:                         }
+// AST-NEXT:                       },
+// AST-NEXT:                       "start": 39,
+// AST-NEXT:                       "end": 40,
+// AST-NEXT:                       "scopeUid": 1,
+// AST-NEXT:                       "value": 2.0,
+// AST-NEXT:                       "extra": {
+// AST-NEXT:                         "raw": "2",
+// AST-NEXT:                         "rawValue": 2.0
+// AST-NEXT:                       }
+// AST-NEXT:                     },
+// AST-NEXT:                     {
+// AST-NEXT:                       "type": "NumericLiteral",
+// AST-NEXT:                       "loc": {
+// AST-NEXT:                         "start": {
+// AST-NEXT:                           "line": 3,
+// AST-NEXT:                           "column": 13
+// AST-NEXT:                         },
+// AST-NEXT:                         "end": {
+// AST-NEXT:                           "line": 3,
+// AST-NEXT:                           "column": 14
+// AST-NEXT:                         }
+// AST-NEXT:                       },
+// AST-NEXT:                       "start": 42,
+// AST-NEXT:                       "end": 43,
+// AST-NEXT:                       "scopeUid": 1,
+// AST-NEXT:                       "value": 3.0,
+// AST-NEXT:                       "extra": {
+// AST-NEXT:                         "raw": "3",
+// AST-NEXT:                         "rawValue": 3.0
+// AST-NEXT:                       }
+// AST-NEXT:                     }
+// AST-NEXT:                   ]
+// AST-NEXT:                 },
+// AST-NEXT:                 "delegate": true
+// AST-NEXT:               }
+// AST-NEXT:             }
+// AST-NEXT:           ],
+// AST-NEXT:           "directives": []
+// AST-NEXT:         }
+// AST-NEXT:       }
+// AST-NEXT:     ],
+// AST-NEXT:     "directives": []
+// AST-NEXT:   },
+// AST-NEXT:   "comments": []
+// AST-NEXT: }
diff --git a/maldoca/js/ir/conversion/tests/yield_expression/input.js b/maldoca/js/ir/conversion/tests/yield_expression/input.js
new file mode 100644
index 0000000..423d686
--- /dev/null
+++ b/maldoca/js/ir/conversion/tests/yield_expression/input.js
@@ -0,0 +1,4 @@
+function* gen() {
+  yield 1;
+  yield* [2, 3];
+}
diff --git a/maldoca/js/ir/conversion/tests/yield_expression/jshir.mlir b/maldoca/js/ir/conversion/tests/yield_expression/jshir.mlir
new file mode 100644
index 0000000..62a3fbc
--- /dev/null
+++ b/maldoca/js/ir/conversion/tests/yield_expression/jshir.mlir
@@ -0,0 +1,22 @@
+// JSHIR:      "jsir.file"() <{comments = []}> ({
+// JSHIR-NEXT:   "jsir.program"() <{source_type = "script"}> ({
+// JSHIR-NEXT:     "jsir.function_declaration"() <{async = false, generator = true, id = #jsir, , "gen", 10, 13, 1, "gen">}> ({
+// JSHIR-NEXT:       "jsir.exprs_region_end"() : () -> ()
+// JSHIR-NEXT:     }, {
+// JSHIR-NEXT:       "jshir.block_statement"() ({
+// JSHIR-NEXT:         %0 = "jsir.numeric_literal"() <{extra = #jsir, value = 1.000000e+00 : f64}> : () -> !jsir.any
+// JSHIR-NEXT:         %1 = "jsir.yield_expression"(%0) <{delegate = false}> : (!jsir.any) -> !jsir.any
+// JSHIR-NEXT:         "jsir.expression_statement"(%1) : (!jsir.any) -> ()
+// JSHIR-NEXT:         %2 = "jsir.numeric_literal"() <{extra = #jsir, value = 2.000000e+00 : f64}> : () -> !jsir.any
+// JSHIR-NEXT:         %3 = "jsir.numeric_literal"() <{extra = #jsir, value = 3.000000e+00 : f64}> : () -> !jsir.any
+// JSHIR-NEXT:         %4 = "jsir.array_expression"(%2, %3) : (!jsir.any, !jsir.any) -> !jsir.any
+// JSHIR-NEXT:         %5 = "jsir.yield_expression"(%4) <{delegate = true}> : (!jsir.any) -> !jsir.any
+// JSHIR-NEXT:         "jsir.expression_statement"(%5) : (!jsir.any) -> ()
+// JSHIR-NEXT:       }, {
+// JSHIR-NEXT:       ^bb0:
+// JSHIR-NEXT:       }) : () -> ()
+// JSHIR-NEXT:     }) : () -> ()
+// JSHIR-NEXT:   }, {
+// JSHIR-NEXT:   ^bb0:
+// JSHIR-NEXT:   }) : () -> ()
+// JSHIR-NEXT: }) : () -> ()
diff --git a/maldoca/js/ir/conversion/tests/yield_expression/output.js b/maldoca/js/ir/conversion/tests/yield_expression/output.js
new file mode 100644
index 0000000..770ed83
--- /dev/null
+++ b/maldoca/js/ir/conversion/tests/yield_expression/output.js
@@ -0,0 +1,4 @@
+// SOURCE:      function* gen() {
+// SOURCE-NEXT:   yield 1;
+// SOURCE-NEXT:   yield* [2, 3];
+// SOURCE-NEXT: }
diff --git a/maldoca/js/ir/conversion/tests/yield_expression/run.lit b/maldoca/js/ir/conversion/tests/yield_expression/run.lit
new file mode 100644
index 0000000..c7362e4
--- /dev/null
+++ b/maldoca/js/ir/conversion/tests/yield_expression/run.lit
@@ -0,0 +1,19 @@
+// RUN: INPUT=%s && \
+// RUN: jsir_gen --input_file "$(dirname "${INPUT}")"/input.js \
+// RUN: --passes "source2ast,ast2hir" \
+// RUN: | FileCheck --check-prefix JSHIR "$(dirname "${INPUT}")"/jshir.mlir
+
+// RUN: INPUT=%s && \
+// RUN: jsir_gen --input_file "$(dirname "${INPUT}")"/input.js \
+// RUN: --passes "source2ast,ast2hir,hir2ast,ast2source" \
+// RUN: | FileCheck --check-prefix SOURCE "$(dirname "${INPUT}")"/output.js
+
+// RUN: INPUT=%s && \
+// RUN: jsir_gen --input_file "$(dirname "${INPUT}")"/input.js \
+// RUN: --passes source2ast \
+// RUN: | FileCheck --check-prefix AST "$(dirname "${INPUT}")"/ast.json
+
+// RUN: INPUT=%s && \
+// RUN: jsir_gen --input_file "$(dirname "${INPUT}")"/input.js \
+// RUN: --passes source2ast,ast2hir,hir2ast \
+// RUN: | FileCheck --check-prefix AST "$(dirname "${INPUT}")"/ast.json

From 78b170892608ce780e2d020244a2180098a17ecb Mon Sep 17 00:00:00 2001
From: Tito hiera
Date: Tue, 7 Apr 2026 01:21:57 +0200
Subject: [PATCH 2/5] ignored bazel-*

---
 bazel-bin      | 1 -
 bazel-jsir     | 1 -
 bazel-out      | 1 -
 bazel-testlogs | 1 -
 4 files changed, 4 deletions(-)
 delete mode 120000 bazel-bin
 delete mode 120000 bazel-jsir
 delete mode 120000 bazel-out
 delete mode 120000 bazel-testlogs

diff --git a/bazel-bin b/bazel-bin
deleted file mode 120000
index 83e3a9d..0000000
--- a/bazel-bin
+++ /dev/null
@@ -1 +0,0 @@
-/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin
\ No newline at end of file
diff --git a/bazel-jsir b/bazel-jsir
deleted file mode 120000
index ea750c0..0000000
--- a/bazel-jsir
+++ /dev/null
@@ -1 +0,0 @@
-/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main
\ No newline at end of file
diff --git a/bazel-out b/bazel-out
deleted file mode 120000
index ee1fa8c..0000000
--- a/bazel-out
+++ /dev/null
@@ -1 +0,0 @@
-/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main/bazel-out
\ No newline at end of file
diff --git a/bazel-testlogs b/bazel-testlogs
deleted file mode 120000
index 6142954..0000000
--- a/bazel-testlogs
+++ /dev/null
@@ -1 +0,0 @@
-/private/var/tmp/_bazel_tito/ef3f53791a0e006b79b8a1e2fb20ac77/execroot/_main/bazel-out/darwin_arm64-fastbuild/testlogs
\ No newline at end of file

From 820a4992bc6f052e7c524eed0eb48e1e53029821 Mon Sep 17 00:00:00 2001
From: Tito hiera
Date: Tue, 7 Apr 2026 11:07:03 +0200
Subject: [PATCH 3/5] removed .md file

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 40bbb9b..2e2afa5 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# JSIR - Next Generation JavaScript Analysis Tooling
+# JSIR - Next Generation JavaScript Analysis Tooling
 
 JSIR is a next-generation JavaScript analysis tool. At its core is an
 [MLIR](https://mlir.llvm.org)-based high-level

From 38c2d43464b62edaffbe17e40c332273c5593c1e Mon Sep 17 00:00:00 2001
From: Tito hiera
Date: Tue, 7 Apr 2026 11:08:20 +0200
Subject: [PATCH 4/5] removed .md file

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2e2afa5..40bbb9b 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# JSIR - Next Generation JavaScript Analysis Tooling
+# JSIR - Next Generation JavaScript Analysis Tooling
 
 JSIR is a next-generation JavaScript analysis tool. At its core is an
 [MLIR](https://mlir.llvm.org)-based high-level

From eb27205e3ebb6e93d651c4e26807aa076ab478a9 Mon Sep 17 00:00:00 2001
From: Tito hiera
Date: Tue, 7 Apr 2026 11:09:28 +0200
Subject: [PATCH 5/5] removed cached file

---
 CLAUDE.md | 83 -----------------------------------------------------------------------------------
 1 file changed, 83 deletions(-)
 delete mode 100644 CLAUDE.md

diff --git a/CLAUDE.md b/CLAUDE.md
deleted file mode 100644
index 6a8a9c3..0000000
--- a/CLAUDE.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Project Overview
-
-JSIR is a Google project for JavaScript analysis and source-to-source transformation. It uses an MLIR-based intermediate representation with two dialect levels (JSIR and JSHIR) that supports both dataflow analysis and lossless roundtrip back to source code. Primary use cases are Hermes bytecode decompilation and JavaScript deobfuscation.
-
-## Build System
-
-This project uses Bazel (via Bazelisk) with C++17. LLVM/MLIR is fetched as an external dependency and takes significant time and disk space on first build.
-
-```shell
-# Build everything
-bazelisk build //...
-
-# Build a single target
-bazelisk build //maldoca/js/ir:jsir_gen
-
-# Build all targets in a directory
-bazelisk build //maldoca/js/ir/...
-
-# Run all tests
-bazelisk test //...
-
-# Run a specific test
-bazelisk test //maldoca/js/quickjs:quickjs_test
-
-# Run all tests under a directory
-bazelisk test //maldoca/js/ir/conversion/...
-
-# Run the main tool (jsir_gen)
-bazelisk run //maldoca/js/ir:jsir_gen -- \
-  --input_file=$(pwd)/maldoca/js/ir/conversion/tests/if_statement/input.js \
-  --passes=source2ast,ast2hir
-```
-
-The `.bazelrc` configures C++17, clang compiler, and macOS SDK settings. Both bzlmod (`MODULE.bazel`) and workspace (`WORKSPACE`) are enabled in hybrid mode.
-
-## Architecture
-
-### Pipeline: JS Source → AST → JSIR/JSHIR → AST → JS Source
-
-The core data flow is managed by the **driver** (`maldoca/js/driver/`), which orchestrates the full pipeline:
-
-1. **Parsing**: JavaScript source → Babel AST JSON (via QuickJS-embedded Babel parser in `maldoca/js/quickjs_babel/`)
-2. **AST**: JSON → C++ AST types (`maldoca/js/ast/`) — types are code-generated by `maldoca/astgen/`
-3. **IR Conversion**: AST ↔ JSHIR (`maldoca/js/ir/conversion/`) — bidirectional, with both generated and handwritten converters
-4. **Transforms**: MLIR passes on JSHIR (`maldoca/js/ir/transforms/`) — constant propagation, expression splitting, etc.
-5. **Analyses**: Dataflow analyses on JSHIR (`maldoca/js/ir/analyses/`) — scope analysis, constant propagation analysis
-6. **Roundtrip**: JSHIR → AST → JS Source
-
-### Two IR Dialects
-
-- **JSIR** (`jsir` dialect): Lower-level IR
-- **JSHIR** (`jshir` dialect): High-level IR that preserves control flow structure using MLIR regions; this is the primary working dialect
-
-Both are defined via MLIR TableGen (`.td` files in `maldoca/js/ir/`) and use `mlir-tblgen` to generate C++ code.
-
-### Code Generation
-
-Many source files are generated, not hand-authored:
-- `*.generated.cc`/`.h` files in `maldoca/js/ast/` — generated by `maldoca/astgen/`
-- `*.generated.cc` in `maldoca/js/ir/conversion/` — AST↔IR conversion code
-- `jsir_ops.generated.td` — IR op definitions
-- `.h.inc`/`.cc.inc` files — generated by MLIR TableGen from `.td` files
-
-### Key Components
-
-- **`maldoca/js/ir/`**: IR dialect definitions, ops, attributes, types. The `jsir_gen` binary is the main CLI tool.
-- **`maldoca/js/driver/`**: Orchestrates the full conversion pipeline; `JsRepr` is the central representation type with variants for source, AST, and JSHIR.
-- **`maldoca/js/quickjs/`**: QuickJS JavaScript engine wrapper.
-- **`maldoca/js/quickjs_babel/`**: Babel parser running inside QuickJS — this is how JS gets parsed without requiring Node.js.
-- **`maldoca/js/babel/`**: Babel AST types and scope information (protobuf-based).
-- **`maldoca/astgen/`**: Code generator that produces C++ AST types, visitors, and IR conversion code from protobuf definitions.
-- **`maldoca/base/`**: Utility library (status macros, filesystem, error handling).
-
-### Testing
-
-- C++ tests use GoogleTest (`cc_test` rules)
-- IR tests use LLVM's `lit` framework with `FileCheck` — test files have `.js`, `.mlir`, `.lit`, or `.txt` suffixes
-- Conversion tests are organized by JS construct in `maldoca/js/ir/conversion/tests//` with `input.js` files
-- Transform tests follow a similar pattern in `maldoca/js/ir/transforms//`
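
Note on the `yield_expression` fixture from PATCH 1/5: the snippet is a minimal sketch, not part of the series, of what `input.js` evaluates to at runtime. It assumes a standard JavaScript engine such as Node.js, and the `values` name is introduced here only for illustration. `yield 1` produces a single value, while `yield* [2, 3]` delegates to the array's iterator, which is the distinction the AST and JSHIR checks record as `delegate = false` and `delegate = true`.

```javascript
// Same generator as the input.js fixture added in PATCH 1/5.
function* gen() {
  yield 1;        // plain yield: "delegate": false in ast.json
  yield* [2, 3];  // delegating yield: "delegate": true in ast.json
}

// `values` is a hypothetical name used only in this sketch.
// Spreading the generator drains it and collects every yielded value.
const values = [...gen()];
console.log(values); // [ 1, 2, 3 ] in Node.js
```

Running this alongside the lit test is one way to confirm that the round-tripped `output.js` should behave the same as the input, since both contain the identical generator body.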