ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten. #18628
reeselevine merged 2 commits into ggml-org:master from
|
interesting, you see failures with simple-chat specifically? I was running test-backend-ops with the Emscripten build and didn't see any issues related to this. I also have an integration with wllama in progress here: ngxson/wllama#198, which seems to work on my Mac without errors. I know it might be a bit of work, but is there any way to tell why I haven't run into this error yet? Is that assert you're hitting not reached in the paths wllama is taking to call the llama.cpp APIs? |
|
Thank you for the review.
This error originates from
params.mem_buffer: 0, ctx->mem_buffer: 0x0x1b3b80 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x2b8980 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x377fa40 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3786340 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3786340 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3786340 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3796680 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3796680 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x37a69c0 (at ggml_init)
simple-chat.js:1744 ...........................................................
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3786280 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x3786280 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0, ctx->mem_buffer: 0x0x394400 (at ggml_init)
simple-chat.html:146 params.mem_buffer: 0x238328, ctx->mem_buffer: 0x0x238328 (at ggml_init)
simple-chat.js:1744 /home/masashiyoshimura.linux/workspace/llama.cpp/ggml/src/ggml.c:1556: GGML_ASSERT(((uintptr_t) (ctx->mem_buffer))%GGML_MEM_ALIGN == 0) failed
simple-chat.js:616 Aborted()
As you say, test-backend-ops works correctly in my environment regardless of this fix.
I have not tested wllama yet, but I will take a look at it later. |
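The failing address in the log above can be checked by hand. The sketch below replays the modulo test from the assertion on two addresses copied from the log. It assumes the alignment in effect before the fix was 16 on this build, which the discussion implies but does not state; `addr_is_aligned` and `check_logged_addresses` are hypothetical helper names, not part of ggml.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: same modulo test as the failed GGML_ASSERT. */
static int addr_is_aligned(uintptr_t addr, uintptr_t align) {
    return addr % align == 0;
}

/* Replays the check on addresses copied from the log above. */
static void check_logged_addresses(void) {
    /* Buffer allocated inside ggml_init (params.mem_buffer was 0). */
    assert(addr_is_aligned(0x37a69c0, 16));

    /* Caller-supplied buffer from the last log line before the abort.
     * 0x238328 ends in hex digit 8, so it is a multiple of 8 but not of 16. */
    assert(!addr_is_aligned(0x238328, 16)); /* assumed old alignment: fails */
    assert(addr_is_aligned(0x238328, 8));   /* alignment proposed in this PR: passes */
}
```

In this log, the only address that is not a multiple of 16 is the one where params.mem_buffer is non-zero, which would be consistent with a caller-provided buffer that is only 8-byte aligned.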
|
Ah, I think I might know what's happening here. We recently set llama.cpp to default to 64-bit builds here, but the wllama integration is using 32-bit builds for now; some updates are needed to support 64-bit. Looking at the logic around the change in this PR, align is set to 4 if the max pointer value is 32 bits long, which it would be in the 32-bit wasm build, so this error only occurs in 64-bit wasm builds. I think this change will actually work: if it is a 32-bit wasm build, the first conditional will be true; otherwise it must be a 64-bit build, in which case the align fix makes sense. The only suggestion I would make is to add a comment explaining this in the code. |
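The conditional order described in the comment above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of the ggml header: SKETCH_MEM_ALIGN is a hypothetical stand-in for GGML_MEM_ALIGN, the fallback value of 16 is assumed, and any other branches in the real header are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* SKETCH_MEM_ALIGN is a hypothetical stand-in for GGML_MEM_ALIGN. */
#if UINTPTR_MAX == 0xFFFFFFFF
    /* 32-bit pointers: this branch is taken first, so 32-bit wasm stays at 4. */
#    define SKETCH_MEM_ALIGN 4
#elif defined(__EMSCRIPTEN__)
    /* Only reachable on 64-bit wasm: max_align_t is 8 under Emscripten. */
#    define SKETCH_MEM_ALIGN 8
#else
    /* Assumed default for other 64-bit targets. */
#    define SKETCH_MEM_ALIGN 16
#endif

/* Hypothetical helper: true when ptr is a multiple of the selected alignment. */
static int sketch_is_aligned(const void *ptr) {
    return (uintptr_t) ptr % SKETCH_MEM_ALIGN == 0;
}

/* Hypothetical helper: aborts when ptr is not suitably aligned. */
static void sketch_assert_aligned(const void *ptr) {
    assert(sketch_is_aligned(ptr));
}
```

Because the pointer-width test comes first, the Emscripten branch never affects 32-bit wasm builds, which matches the observation that the wllama integration did not hit the assertion.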
93c746c to 8dde886
|
Thank you for the analysis. As you pointed out, this appears to be an issue specific to 64-bit wasm. I’ve added a comment explaining it—could you please review it again? |
* Fix GGML_MEM_ALIGN to 8 for emscripten.
* Add a comment explaining the need for GGML_MEM_ALIGN == 8 in 64-bit wasm with emscripten
Hello. When attempting to run LLM inference in the browser using WebGPU, the following error occurs.
In Emscripten, max_align_t is 0x8 (ref. emscripten-core/emscripten#10072, https://github.com/emscripten-core/emscripten/pull/14599/files), which causes GGML_ASSERT_ALIGNED to fail.
Therefore, I modified the code to set GGML_MEM_ALIGN to 8 when running in an Emscripten environment.
I have confirmed that inference works correctly with examples/simple-chat for at least the following three models: