agilord/llamacpp_rpc_client


HTTP client bindings to call the llama.cpp RPC server.

Usage

import 'package:llamacpp_rpc_client/llamacpp_rpc_client.dart';

void main() async {
  final client = LlamacppRpcClient('http://localhost:8080');

  // Text completion
  final completion = await client.completion(
    'The capital of France is',
    options: CompletionOptions(
      maxTokens: 50,
      temperature: 0.7,
    ),
  );
  print(completion.content);

  // Streaming completion
  await for (final chunk in client.streamCompletion('Tell me a story')) {
    print(chunk.content);
  }

  // Text embedding
  final embedding = await client.embedding('Hello world');
  print(embedding.embedding.length);

  client.close();
}
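Embedding vectors are commonly compared with cosine similarity. A minimal sketch follows; the `cosine` helper is illustrative and not part of the package, and it assumes `embedding.embedding` is a `List<double>`, as the `.length` access above suggests:

```dart
import 'dart:math';

import 'package:llamacpp_rpc_client/llamacpp_rpc_client.dart';

/// Cosine similarity between two equal-length vectors:
/// dot(a, b) / (|a| * |b|), ranging from -1.0 to 1.0.
double cosine(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (sqrt(normA) * sqrt(normB));
}

void main() async {
  final client = LlamacppRpcClient('http://localhost:8080');

  // Embed two related phrases and compare them.
  final a = await client.embedding('machine learning');
  final b = await client.embedding('deep learning');
  print(cosine(a.embedding, b.embedding)); // values near 1.0 mean more similar

  client.close();
}
```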

CLI Usage

The package also includes a command-line interface for interacting with llama.cpp servers:

Completion Command

Generate text completions:

dart run bin/llamacpp_rpc_client.dart completion \
  --url http://localhost:8080 \
  --prompt "The capital of France is" \
  --temperature 0.7 \
  --max-tokens 50

# Stream completion in real-time
dart run bin/llamacpp_rpc_client.dart completion \
  --url http://localhost:8080 \
  --prompt "Tell me a story" \
  --stream

# Deterministic generation with seed
dart run bin/llamacpp_rpc_client.dart completion \
  --url http://localhost:8080 \
  --prompt "Hello world" \
  --seed 42

Options:

  • --url, -u: Base URL of the llama.cpp RPC server (required)
  • --prompt, -p: Input prompt for completion (required)
  • --temperature, -t: Sampling temperature (0.0-2.0); higher values produce more random output
  • --max-tokens, -m: Maximum tokens to generate
  • --top-p: Nucleus sampling parameter (0.0-1.0)
  • --top-k: Top-k sampling parameter
  • --seed: Random seed for deterministic generation
  • --stream, -s: Stream completion in real-time

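The sampling flags can be combined in a single invocation. For example, a sketch that constrains generation with both top-k and nucleus (top-p) filtering, assuming the server applies the samplers together as llama.cpp does by default:

```shell
# Combine temperature, top-k, and nucleus sampling in one request
dart run bin/llamacpp_rpc_client.dart completion \
  --url http://localhost:8080 \
  --prompt "The capital of France is" \
  --temperature 0.7 \
  --top-k 40 \
  --top-p 0.9
```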
Embedding Command

Generate text embeddings:

dart run bin/llamacpp_rpc_client.dart embedding \
  --url http://localhost:8080 \
  --input "machine learning"

# Output raw embedding values
dart run bin/llamacpp_rpc_client.dart embedding \
  --url http://localhost:8080 \
  --input "artificial intelligence" \
  --raw

Options:

  • --url, -u: Base URL of the llama.cpp RPC server (required)
  • --input, -i: Input text for embedding generation (required)
  • --raw, -r: Output raw embedding vector values

About

HTTP client bindings to call the llama.cpp RPC server in Dart
