dsprrr adds signatures, optimization, and tracing on top of ellmer. It implements ideas from DSPy for R.
The problem: Hand-tuned prompts are fragile. They break when you switch models, change requirements, or scale up. dsprrr treats prompts as programs that can be systematically improved using your data.
Use cases:
- RAG pipelines where you want to optimize retrieval + generation together
- Classification or extraction tasks with labeled examples to learn from
- Multi-step agents where you need to trace what went wrong
- Any LLM workflow you want to improve without manually rewriting prompts
When to just use ellmer: If you have a prompt that works and don’t need to optimize it with data. ellmer already tracks conversation history and token costs.
# install.packages("pak")
pak::pak("JamesHWade/dsprrr")

A compact notation for defining LLM inputs and outputs:
library(dsprrr)
#>
#> Attaching package: 'dsprrr'
#> The following object is masked from 'package:stats':
#>
#> step
#> The following object is masked from 'package:methods':
#>
#> signature
# Arrow notation: inputs -> output
signature("question -> answer")
#>
#> ── Signature ──
#>
#> ── Inputs
#> • question: "string" - Input: question
#>
#> ── Output
#> Type: "object(answer: string)"
#>
#> ── Instructions
#> Given the fields `question`, produce the fields `answer`.
# Multiple inputs
signature("context, question -> answer")
#>
#> ── Signature ──
#>
#> ── Inputs
#> • context: "string" - Input: context
#> • question: "string" - Input: question
#>
#> ── Output
#> Type: "object(answer: string)"
#>
#> ── Instructions
#> Given the fields `context`, `question`, produce the fields `answer`.
# Typed outputs (uses ellmer types under the hood)
signature("review -> rating: enum('1', '2', '3', '4', '5')")
#>
#> ── Signature ──
#>
#> ── Inputs
#> • review: "string" - Input: review
#>
#> ── Output
#> Type: "object(rating: enum(1, 2, 3, 4, 5))"
#>
#> ── Instructions
#> Given the fields `review`, produce the fields `rating`.
# With instructions
signature("text -> summary", instructions = "Maximum 50 words.")
#>
#> ── Signature ──
#>
#> ── Inputs
#> • text: "string" - Input: text
#>
#> ── Output
#> Type: "object(summary: string)"
#>
#> ── Instructions
#> Maximum 50 words.

Reusable, stateful wrappers around LLM calls:
library(ellmer)
# Create a module from a signature
mod <- module(signature("text -> sentiment"), type = "predict")
# Run it
run(mod, text = "This is great!", .llm = chat_openai())
# Or convert an existing Chat
classifier <- chat_openai() |>
as_module("text -> sentiment: enum('positive', 'negative', 'neutral')")
classifier$predict(text = "Terrible experience")

Chain modules together with the %>>% operator. Outputs flow
automatically to inputs:
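Values are matched by field name: if one module's output field has the same name as the next module's input field, it is passed along automatically. The two-step chain below is a minimal sketch; the module and field names are made up for illustration:

# Hypothetical sketch: the "summary" output of the first module feeds the
# "summary" input of the second because the field names match
mod_condense <- module(signature("text -> summary"), type = "predict")
mod_headline <- module(signature("summary -> headline"), type = "predict")
run(mod_condense %>>% mod_headline, text = "...", .llm = chat_openai())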
# Chain three modules together
qa_pipeline <- mod_extract %>>% mod_answer %>>% mod_format
# Run the pipeline
result <- run(qa_pipeline, document = "...", .llm = chat_openai())
# With explicit field mapping when names don't match
rag_pipeline <- pipeline(
mod_retrieve,
step(mod_answer, map = c(documents = "context")),
mod_summarize
)

Automatically improve prompts using training data. dsprrr implements several optimizers from DSPy:
- LabeledFewShot: Add examples from your training set as demonstrations
- MIPROv2: Joint optimization of instructions and examples using Bayesian search
- GEPA: Reflection-based instruction optimization; sample-efficient and often outperforms manual prompts
# Compile with few-shot examples
optimized <- compile(
LabeledFewShot(k = 3),
mod,
trainset = my_labeled_data
)
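The optimizers listed above share this interface: construct the optimizer, then pass it to compile() along with the module and training data. The GEPA sketch below is hypothetical; the GEPA() constructor and its arguments are assumptions, not confirmed dsprrr API, so check the package documentation before use.

# Hypothetical: reflection-based optimization with GEPA. The GEPA()
# constructor shown here is assumed; GEPA also needs an evaluation metric,
# and how that is supplied is not shown in this sketch.
optimized_gepa <- compile(
  GEPA(),
  mod,
  trainset = my_labeled_data
)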
# Grid search over parameters
mod$optimize_grid(
devset = dev_data,
metric = metric_exact_match(),
parameters = list(temperature = c(0.1, 0.5, 1.0))
)

ellmer tracks individual chat history and costs. dsprrr adds module-level traces across pipelines—useful for debugging multi-step workflows:
mod$trace_summary()
export_traces(mod)

library(dsprrr)
library(ellmer)
# Define what you want
sig <- signature(
"context, question -> answer",
instructions = "Answer based only on the given context."
)
# Create a module
mod <- module(sig, type = "predict")
# Run it
result <- run(
mod,
context = "R is a programming language for statistical computing.",
question = "What is R used for?",
.llm = chat_openai()
)

| Type | Use case |
|---|---|
| predict | Basic text generation |
| react | Tool use (wraps ellmer tools) |
| chain_of_thought | Step-by-step reasoning |
| multichain | Ensemble reasoning with multiple chains |
| program_of_thought | Generate and execute R code |
# ReAct agent with tools
agent <- module(
signature("question -> answer"),
type = "react",
tools = list(my_search_tool)
)
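The multichain and program_of_thought types are constructed the same way; for example, a program-of-thought module (the signature fields here are only illustrative):

# Program of thought: the module writes and executes R code to produce its answer
pot <- module(
  signature("question -> answer"),
  type = "program_of_thought"
)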
# Chain of thought
mod <- module(signature("question -> answer"), type = "chain_of_thought")

dsprrr uses ellmer for all LLM calls. The integration is straightforward:
| ellmer | dsprrr equivalent |
|---|---|
| chat_openai() | Pass to run(..., .llm = ) |
| type_string(), type_enum() | Used inside signatures |
| tool() | Pass to module(..., tools = ) |
| chat$chat_structured() | dsp(chat, signature, ...) |
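For a one-off structured call without building a module, the dsp() helper from the table above can be applied directly to an ellmer chat. The sketch below assumes inputs are supplied as named field arguments, mirroring run(); treat that as an assumption rather than documented behaviour.

# Hypothetical usage of dsp(); passing the input field as a named argument is assumed
chat <- chat_openai()
dsp(chat, signature("question -> answer"), question = "What does CRAN host?")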
Start here:
- Getting Started — Choose your learning path

Tutorial sequence (learn step by step):
1. Your First LLM Call
2. Building a Classifier
3. Structured Outputs
4. Improving with Examples
5. Optimization
6. Production

How-to guides:
- Compile & Optimize
- Build RAG Pipelines

Concepts:
- The DSPy Philosophy
- How Optimization Works

Reference:
- Quick Reference
- API Documentation
Experimental. The API may change. See PLAN.md for the roadmap.
