Add LLM benchmarking framework to staging by kubraaksux · Pull Request #2405 · apache/systemds

kubraaksux · 2026-01-19T14:05:51Z

Generic LLM benchmark suite for evaluating inference performance across different backends (vLLM, Ollama, OpenAI, MLX).

Features:

- Multiple workload categories: math (GSM8K), reasoning (BoolQ, LogiQA), summarization (XSum, CNN/DM), JSON extraction
- Pluggable backend architecture for different inference engines
- Performance metrics: latency, throughput, memory usage
- Accuracy evaluation per workload type
- HTML report generation

This framework can be used to evaluate SystemDS LLM inference components once they are developed.
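
As a rough illustration of how the pluggable backend, performance metrics, and accuracy evaluation listed above could fit together, here is a minimal sketch. It is not the code from this PR: the names `Backend`, `EchoBackend`, `BenchmarkResult`, `exact_match`, and `run_benchmark` are all hypothetical, the stand-in backend returns canned answers instead of calling vLLM, Ollama, OpenAI, or MLX, and memory measurement is left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


class Backend(Protocol):
    """Hypothetical backend interface; the PR's real interface is not shown here."""

    def generate(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend that looks up canned answers instead of running a model."""

    def __init__(self, answers: dict[str, str]) -> None:
        self.answers = answers

    def generate(self, prompt: str) -> str:
        return self.answers.get(prompt, "")


@dataclass
class BenchmarkResult:
    mean_latency_s: float   # average seconds per request
    throughput_rps: float   # requests per second over the whole run
    accuracy: float         # fraction of samples scored as correct


def exact_match(prediction: str, reference: str) -> bool:
    """Simplest possible scorer; a real workload would plug in its own."""
    return prediction.strip().lower() == reference.strip().lower()


def run_benchmark(
    backend: Backend,
    samples: Sequence[tuple[str, str]],
    scorer: Callable[[str, str], bool] = exact_match,
) -> BenchmarkResult:
    """Send each (prompt, reference) pair to the backend sequentially and time it."""
    if not samples:
        raise ValueError("samples must not be empty")

    latencies = []
    correct = 0
    run_start = time.perf_counter()
    for prompt, reference in samples:
        request_start = time.perf_counter()
        prediction = backend.generate(prompt)
        latencies.append(time.perf_counter() - request_start)
        correct += int(scorer(prediction, reference))
    elapsed = time.perf_counter() - run_start

    return BenchmarkResult(
        mean_latency_s=sum(latencies) / len(latencies),
        throughput_rps=len(samples) / elapsed if elapsed > 0 else float("inf"),
        accuracy=correct / len(samples),
    )


if __name__ == "__main__":
    samples = [("2+2?", "4"), ("Is the sky green?", "no")]
    backend = EchoBackend({"2+2?": "4", "Is the sky green?": "yes"})
    print(run_benchmark(backend, samples))
```

Under these assumptions, any object with a `generate(prompt)` method can be swapped in without changing the measurement loop, and each workload type can supply its own `scorer`.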


- Connection.java: Changed loadModel(modelName) to loadModel(modelName, workerScriptPath)
- Connection.java: Removed findPythonScript() method
- LLMCallback.java: Added Javadoc for generate() method
- JMLCLLMInferenceTest.java: Updated to pass script path to loadModel()

github-project-automation bot added this to SystemDS PR Queue Jan 19, 2026

github-project-automation bot moved this to In Progress in SystemDS PR Queue Jan 19, 2026

Merge upstream main

28b466d

kubraaksux force-pushed the llm-benchmark branch from ced6001 to 37ff5f3 on February 13, 2026 16:45

kubraaksux added 2 commits February 13, 2026 17:48

Add LLM inference support to JMLC API via Py4J bridge

477e204

kubraaksux force-pushed the llm-benchmark branch from 37ff5f3 to 53a7d70 on February 13, 2026 16:49

kubraaksux closed this Feb 13, 2026

kubraaksux deleted the llm-benchmark branch February 13, 2026 16:51

github-project-automation bot moved this from In Progress to Done in SystemDS PR Queue Feb 13, 2026

kubraaksux wants to merge 4 commits into apache:main from kubraaksux:llm-benchmark
