feat: fix build with latest llama.cpp, security-upgrade deps, improve DX#200
feat: fix build with latest llama.cpp, security-upgrade deps, improve DX#200aviallon wants to merge 8 commits intongxson:masterfrom
Conversation
…ted by Playwright
… CMAKE_SYSTEM_PROCESSOR to "wasm", since we rely on it for specific behavior
…perly use wasm code paths
📝 WalkthroughWalkthroughThe changes add comprehensive test infrastructure for Docker and WebAssembly, configure new build options (GGML CPU repack and WASM memory64), update Docker Compose services, and mark auto-generated files. Build scripts are enhanced with strict error handling, ARM64 architecture detection, and EMSDK version updates. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/build_wasm.sh (1)
10-11:$GIDmay be unset on some systems.On macOS and certain Linux distributions, the
$GIDenvironment variable is not set by default (unlike$UID). This could result inD_GIDbeing empty, potentially causing issues in the Docker Compose configuration.Suggested fix
export D_UID=$UID -export D_GID=$GID +export D_GID=${GID:-$(id -g)}
🤖 Fix all issues with AI agents
In `@scripts/test_in_docker.sh`:
- Around line 3-4: Assign TEST_ARGS using "$*" instead of "$@" to preserve a
single concatenated string of all arguments, and set SCRIPT_DIR from the first
element of the BASH_SOURCE array by using ${BASH_SOURCE[0]}; ensure both
assignments are quoted to handle spaces (i.e., set TEST_ARGS to "$*" and
SCRIPT_DIR to "$(dirname -- "${BASH_SOURCE[0]}")" so argument boundaries and
BASH_SOURCE indexing are handled correctly).
In `@scripts/test_wasm.sh`:
- Around line 10-11: The script exports D_UID and D_GID using $UID and $GID but
$GID is not a standard shell variable and may be empty; change the D_GID export
to derive the current group ID reliably (e.g., export D_GID="$(id -g)" or
fallback to "$(getent passwd "$USER" | cut -d: -f4)" if needed) while keeping
the D_UID export as is (export D_UID="$UID" or use id -u for consistency), so
D_GID is always set before being used.
🧹 Nitpick comments (1)
scripts/test_wasm.sh (1)
7-8: Consider quoting the variable for robustness.If the script path contains spaces, the
cdcommand will fail.Proposed fix
CURRENT_PATH="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )" -cd $CURRENT_PATH +cd "$CURRENT_PATH"
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (10)
.gitattributes.gitignoreCMakeLists.txtcpp/wllama.cpppackage.jsonscripts/build_wasm.shscripts/docker-compose.ymlscripts/test_in_docker.shscripts/test_wasm.shsrc/workers-code/.gitattributes
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-04-18T08:24:14.434Z
Learnt from: ngxson
Repo: ngxson/wllama PR: 0
File: :0-0
Timestamp: 2025-04-18T08:24:14.434Z
Learning: The file `generate_wasm_from_cdn.ts` in the wllama project is a generated script that gets stored on CDN, making it impossible to use `require('../package.json')` to dynamically access version information. This is why version references in this file need to be manually updated when the package version changes.
Applied to files:
scripts/test_wasm.shpackage.jsonscripts/docker-compose.ymlscripts/build_wasm.sh
📚 Learning: 2025-12-29T03:44:02.649Z
Learnt from: reeselevine
Repo: ngxson/wllama PR: 198
File: src/workers-code/llama-cpp.js:269-279
Timestamp: 2025-12-29T03:44:02.649Z
Learning: In wllama's src/workers-code/llama-cpp.js, the isAsync parameter in callWrapper should only be true for functions included in JSPI_EXPORTS build configuration (wllama_start and wllama_action). Other functions like wllama_exit, wllama_debug, and wllama_malloc should use isAsync=false even though the callWrapper returns an async function, because they don't perform async operations at the C/WASM level.
Applied to files:
cpp/wllama.cppscripts/docker-compose.yml
🪛 Shellcheck (0.11.0)
scripts/test_in_docker.sh
[warning] 3-3: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[warning] 4-4: Expanding an array without an index only gives the first element.
(SC2128)
🔇 Additional comments (11)
.gitignore (1)
17-17: LGTM!Appropriate addition to ignore Playwright-generated screenshot artifacts from the new test infrastructure.
src/workers-code/.gitattributes (1)
1-1: LGTM!Marking
generated.tsas linguist-generated is good practice to exclude it from language statistics and collapse diffs in PRs..gitattributes (1)
1-2: LGTM!Marking the multi-thread and single-thread directories as linguist-generated improves the review experience by collapsing generated WASM build outputs in diffs.
cpp/wllama.cpp (1)
173-175: Good type correction for pointer arithmetic.Using
uintptr_tinstead ofunsigned intis the correct approach for storing pointer values fromsbrk(0)and heap sizes. This ensures portability and avoids potential truncation issues on platforms with larger address spaces, aligning with the PR's build compatibility improvements.scripts/build_wasm.sh (2)
3-3: Good DX improvements.
- Enabling
set -eensures the script fails fast on errors.- ARM64 detection with platform and image tag adjustments is well-implemented.
- The
--abort-on-container-exitand--no-log-prefixflags improve build reliability and output readability.Also applies to: 13-19
5-5: EMSDK Docker image version 4.0.23 is available on Docker Hub.Both the standard tag (4.0.23) and ARM64 variant (4.0.23-arm64) are confirmed to exist on Docker Hub. No issues with the version bump.
CMakeLists.txt (1)
4-5: LGTM - Build configuration options are correctly declared.The new cache options properly configure the llama.cpp build. One minor observation: the description for
LLAMA_WASM_MEM64says "disable MEMORY64" but it's actually a toggle (ON enables, OFF disables). Consider rewording to "enable MEMORY64 for wasm" for consistency withGGML_CPU_REPACK's phrasing.package.json (1)
27-29: LGTM!The new test script entries properly reference the shell scripts added in this PR, providing convenient npm commands for Docker-based and WASM testing workflows.
scripts/docker-compose.yml (3)
2-29: Security-sensitive container configuration for testing.The
wllama-testservice uses highly permissive settings (privileged: true,network_mode: host,pid: host,ipc: host). While these may be necessary for Playwright's browser automation to work correctly, be aware this essentially removes container isolation.Ensure this configuration is only used in trusted CI environments and never in production contexts.
30-77: LGTM - Well-structured WASM test service.The
llamacpp-wasm-testsservice properly configures the WASM build environment with appropriate cmake flags and runs the ctest suite. The configuration aligns with the builder service pattern and correctly disablesLLAMA_WASM_MEM64as intended by this PR.
101-122: LGTM - Build configuration improvements.The refactoring to use bash arrays for
SHARED_EMCC_CFLAGSimproves readability. The addition of-DEMSCRIPTEN_SYSTEM_PROCESSOR=wasmensures correct build paths as mentioned in the PR objectives.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| export TEST_ARGS="$@" | ||
| SCRIPT_DIR=$(dirname -- "$BASH_SOURCE") |
There was a problem hiding this comment.
Fix shell variable handling issues.
Two shellcheck warnings apply here:
-
Line 3:
$@is an array; assigning to a string with"$@"may not preserve argument boundaries correctly. Use"$*"if you want a single concatenated string. -
Line 4:
$BASH_SOURCEis an array; use${BASH_SOURCE[0]}to explicitly get the first element.
Proposed fix
-export TEST_ARGS="$@"
-SCRIPT_DIR=$(dirname -- "$BASH_SOURCE")
+export TEST_ARGS="$*"
+SCRIPT_DIR=$(dirname -- "${BASH_SOURCE[0]}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export TEST_ARGS="$@" | |
| SCRIPT_DIR=$(dirname -- "$BASH_SOURCE") | |
| export TEST_ARGS="$*" | |
| SCRIPT_DIR=$(dirname -- "${BASH_SOURCE[0]}") |
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 3-3: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[warning] 4-4: Expanding an array without an index only gives the first element.
(SC2128)
🤖 Prompt for AI Agents
In `@scripts/test_in_docker.sh` around lines 3 - 4, Assign TEST_ARGS using "$*"
instead of "$@" to preserve a single concatenated string of all arguments, and
set SCRIPT_DIR from the first element of the BASH_SOURCE array by using
${BASH_SOURCE[0]}; ensure both assignments are quoted to handle spaces (i.e.,
set TEST_ARGS to "$*" and SCRIPT_DIR to "$(dirname -- "${BASH_SOURCE[0]}")" so
argument boundaries and BASH_SOURCE indexing are handled correctly).
| export D_UID=$UID | ||
| export D_GID=$GID |
There was a problem hiding this comment.
$GID is not a standard shell variable and may be unset.
Unlike $UID which is a Bash built-in, $GID is not guaranteed to exist. On many systems, this will result in D_GID being empty, potentially causing permission issues in the container.
Proposed fix
export D_UID=$UID
-export D_GID=$GID
+export D_GID=$(id -g)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export D_UID=$UID | |
| export D_GID=$GID | |
| export D_UID=$UID | |
| export D_GID=$(id -g) |
🤖 Prompt for AI Agents
In `@scripts/test_wasm.sh` around lines 10 - 11, The script exports D_UID and
D_GID using $UID and $GID but $GID is not a standard shell variable and may be
empty; change the D_GID export to derive the current group ID reliably (e.g.,
export D_GID="$(id -g)" or fallback to "$(getent passwd "$USER" | cut -d: -f4)"
if needed) while keeping the D_UID export as is (export D_UID="$UID" or use id
-u for consistency), so D_GID is always set before being used.
There was a problem hiding this comment.
@aviallon, thank you for confirming! Let me know if you need any assistance with the fix. 🐰
┏━━━━━━━━━━━━━━━┓
┃ (\_/) ✓ ┃
┃ (•‿•) ┃
┃ />📝 ┃
┗━━━━━━━━━━━━━━━┛
|
Nice, thanks! I'll try to merge this after #201 to avoid any potential conflicts |
|
Hey @aviallon! I landed here because I was trying to find out if @ngxson I see no references to memory64 in this repo, is it on your roadmap? It bumps the max memory available from Thanks for you work both! If I can support in the implementation hit me up. |
Yes, I disabled memory64 because it caused immediate runtime issues, and I was not competent enough to fix them. |
@ngxson Here are my modifications to be able to build wllama with the latest's llama.cpp's code (including my recent PRs).
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.