[Bug]: random characters as response and crash opencl backend demo #590

### Prerequisites

- [x] I have searched the existing issues and confirmed this is not a duplicate.
- [x] I am using the latest version of the MLLM framework.

### Bug Description

The OpenCL backend demo loads and runs, but it outputs only \<unk\> tokens and then exits with a SIGSEGV.

Below are the steps to reproduce.

### Steps to Reproduce

1. Build the OpenCL demo using the steps here: https://github.com/UbiquitousLearning/mllm/issues/570
2. Generate the GGUF_Q4 model using the steps here: https://github.com/UbiquitousLearning/mllm/issues/586
3. Run the demo on a Qualcomm Snapdragon mobile device.
PD2505:/data/local/tmp/opencl-package-latest $ ./mllm-tiny-llama-runner
[INFO]mllm/mllm/backends/opencl/OpenCLBackend.cpp:71 Initializing OpenCL backend..
[INFO] mllm/mllm/backends/opencl/OpenCLBackend.cpp:72 Device: QUALCOMM Adreno(TM) 840
[INFO] mllm/backends/opencl/OpenCLBackend.cpp:73 GPUType: 1
[INFO] mllm/latest-jan/mllm/mllm/backends/opencl/OpenCLBackend.cpp:74 Global Memory Cache Size: 1024 KB
[INFO] mllm/mllm/backends/opencl/OpenCLBackend.cpp:75 Compute Units: 12
[INFO] mllm/backends/opencl/OpenCLBackend.cpp:76 Max Work Group Size: 1024
[INFO] mllm/backends/opencl/OpenCLBackend.cpp:77 FP16 Supported: Yes
[INFO] mllm/backends/opencl/OpenCLBackend.cpp:78 Int8 dot Supported: No
[INFO] mllm/backends/opencl/OpenCLBackend.cpp:79 Int8 dot Accumulate Supported: No
completed initOpenCLBackend
completed mllm::load tinyllama-1.1b-chat-q40.mllm
completed llama.to(mllm::kOpenCL)
Entering for loop
**\<unk\> \<unk\> \<unk\> \<unk\> \<unk\> \<unk\> \<unk\> \<unk\> \<unk\> \<unk\>**
Completed for loop
Error: Received signal11 - SIGSEGV (Segmentation violation)
Stack trace:
#0 0x61b9777844
#1 0x61b9777708
#2 0x7e6306c874 __kernel_rt_sigreturn
#3 0x7bbecbcbc8 gsl_memory_free_pure
#4 0x7bbe3d4624
#5 0x7bbe3d9d5c
#6 0x7bbe3da060
#7 0x7bbe3d5b08 cb_release_mem_object
#8 0x7bcc669648 qCLDrvAPI_clReleaseMemObject
#9 0x7e5d3771b0 mllm::opencl::OpenCLAllocator::~OpenCLAllocator()
#10 0x7e5fb52f40 mllm::MemoryManager::~MemoryManager()
#11 0x7e5fb4789c mllm::Context::~Context()
#12 0x7e5d91b550 __cxa_finalize
#13 0x7e5d92057c exit
#14 0x7e5d913dac
Possible causes: invalid memory access, dangling pointer, stack overflow.
Shutting down...
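
A possible reading of the trace above (unconfirmed): the SIGSEGV is raised inside `clReleaseMemObject`, called from `~OpenCLAllocator()` while `exit` runs `__cxa_finalize`. One thing that might cause this is buffers being released after the driver has already torn down its own state. The sketch below only illustrates that ordering problem. `FakeDriver`, `FakeAllocator`, and `failed_releases` are hypothetical stand-ins, not MLLM or OpenCL APIs.

```cpp
// Hypothetical stand-in for a GPU driver; not part of MLLM or OpenCL.
struct FakeDriver {
  bool alive = true;
  int released = 0;

  // Returns false when called after shutdown. A real driver may crash
  // here instead of reporting an error.
  bool release_buffer() {
    if (!alive) return false;
    ++released;
    return true;
  }

  void shutdown() { alive = false; }
};

// Hypothetical allocator that owns buffers and frees them via the driver.
class FakeAllocator {
 public:
  explicit FakeAllocator(FakeDriver& driver) : driver_(driver) {}

  void allocate() { ++live_buffers_; }

  // Releases every live buffer and returns how many releases failed.
  int release_all() {
    int failures = 0;
    while (live_buffers_ > 0) {
      if (!driver_.release_buffer()) ++failures;
      --live_buffers_;
    }
    return failures;
  }

 private:
  FakeDriver& driver_;
  int live_buffers_ = 0;
};

// Allocates three buffers, then tears down in one of two orders and
// returns the number of failed releases.
int failed_releases(bool release_before_shutdown) {
  FakeDriver driver;
  FakeAllocator alloc(driver);
  alloc.allocate();
  alloc.allocate();
  alloc.allocate();

  if (release_before_shutdown) {
    int failures = alloc.release_all();  // driver still valid
    driver.shutdown();
    return failures;
  }

  driver.shutdown();           // driver goes away first
  return alloc.release_all();  // every release now fails
}
```

In this toy model, `failed_releases(true)` reports no failures and `failed_releases(false)` reports three. If the real crash has the same cause, releasing the OpenCL buffers explicitly before `main` returns might avoid it, but that is an assumption and has not been verified against the MLLM code.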

### Expected Behavior

Display valid characters as the response and exit gracefully without crashing.

### Operating System

Android

### Device

IQOO15

### MLLM Framework Version

Latest

### Model Information

_No response_

### Additional Context

_No response_
