This repository was archived by the owner on Sep 12, 2024. It is now read-only.

Can't run example on llama-2-13b-chat q4_0 #116

@gioragutt

Description


I apologize in advance if I omit any useful details; I'm just a simple dev with no knowledge of or background in DS, so I'm in trial-and-error land.

I followed the instructions from llama.cpp on the llama-2-13b-chat model, and I now have the q4_0 file: llama-2-13b-chat/ggml-model-q4_0.gguf.

I used the example code from this repo, changed (of course) to point to that model file, but loading fails:

The code:

import { LLM } from 'llama-node';
import { LLamaCpp } from 'llama-node/dist/llm/llama-cpp.js';
import path from 'path';

const model = path.resolve(
	process.cwd(),
	'../llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf',
);

console.log(model);

const llama = new LLM(LLamaCpp);
/** @type {import('llama-node/dist/llm/llama-cpp').LoadConfig} */
const config = {
	modelPath: model,
	enableLogging: true,
	nCtx: 1024,
	seed: 0,
	f16Kv: false,
	logitsAll: false,
	vocabOnly: false,
	useMlock: false,
	embedding: false,
	useMmap: true,
	nGpuLayers: 128,
};

const template = `How are you?`;
const prompt = `A chat between a user and an assistant.
USER: ${template}
ASSISTANT:`;

const params = {
	nThreads: 4,
	nTokPredict: 2048,
	topK: 40,
	topP: 0.1,
	temp: 0.2,
	repeatPenalty: 1,
	prompt,
};

const run = async () => {
	await llama.load(config);

	await llama.createCompletion(params, response => {
		process.stdout.write(response.token);
	});
};

run();

The error:

Debugger listening on ws://127.0.0.1:59899/c72280cb-a098-4c15-859f-54025e513896
For help, see: https://nodejs.org/en/docs/inspector
Debugger attached.
/Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf
llama.cpp: loading model from /Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf
error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?
llama_init_from_file: failed to load model
Waiting for the debugger to disconnect...
node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: Failed to initialize LLama context from file: /Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf] {
  code: 'GenericFailure'
}

Node.js v18.17.1

I can see that the error complains about unexpected constants in the file (error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?), and I notice that mine is a .gguf file, not a ggml one.
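For what it's worth, the first number in that message decodes cleanly: 46554747 is just the ASCII bytes of "GGUF" read as a little-endian 32-bit integer (and 00000001 would then be a version field). A quick sketch to verify the decoding:

```javascript
// Decode the "magic" from the error message: 0x46554747 written as a
// little-endian 32-bit integer yields the bytes 'G' 'G' 'U' 'F'.
const magic = Buffer.alloc(4);
magic.writeUInt32LE(0x46554747);
console.log(magic.toString('ascii')); // → GGUF
```

So the loader is reading a GGUF header it simply doesn't recognize.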

From a quick Google search, I got to this post on r/LocalLLaMA which states that GGUF is sort of a successor to GGML.
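To tell the two formats apart without guessing from the extension, a minimal sketch (not from the issue; `modelMagic` is a hypothetical helper) can peek at the first four bytes of the file. GGUF files start with the raw bytes "GGUF"; the older GGML-family magics (e.g. "ggml", "ggjt") are stored byte-reversed on disk, so their raw bytes read "lmgg", "tjgg", and so on:

```javascript
// Read the first 4 bytes of a model file and return them as ASCII,
// so GGUF vs. GGML-family containers can be distinguished by magic.
import fs from 'fs';

function modelMagic(filePath) {
	const buf = Buffer.alloc(4);
	const fd = fs.openSync(filePath, 'r');
	fs.readSync(fd, buf, 0, 4, 0);
	fs.closeSync(fd);
	return buf.toString('ascii');
}
```

Running this against the file from the log should print "GGUF", which would mean that if llama-node bundles a pre-GGUF llama.cpp, it simply cannot read this file.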

I have literally zero understanding of what I'm doing, and would appreciate it if someone could point me in the right direction for dealing with this. Even just pointing out keywords I might have missed, which could have led me to a better answer in the first place, would help 😅

Thanks in advance for your time!
