Declarative Self-Improving Language Programs for R


dsprrr adds signatures, optimization, and tracing on top of ellmer. It implements ideas from DSPy for R.

The problem: Hand-tuned prompts are fragile. They break when you switch models, change requirements, or scale up. dsprrr treats prompts as programs that can be systematically improved using your data.

Use cases:

  • RAG pipelines where you want to optimize retrieval + generation together
  • Classification or extraction tasks with labeled examples to learn from
  • Multi-step agents where you need to trace what went wrong
  • Any LLM workflow you want to improve without manually rewriting prompts

When to just use ellmer: If you have a prompt that works and don’t need to optimize it with data. ellmer already tracks conversation history and token costs.

Installation

# install.packages("pak")
pak::pak("JamesHWade/dsprrr")

What dsprrr adds

Signatures

A compact notation for defining LLM inputs and outputs:

library(dsprrr)
#> 
#> Attaching package: 'dsprrr'
#> The following object is masked from 'package:stats':
#> 
#>     step
#> The following object is masked from 'package:methods':
#> 
#>     signature

# Arrow notation: inputs -> output
signature("question -> answer")
#> 
#> ── Signature ──
#> 
#> ── Inputs
#> • question: "string" - Input: question
#> 
#> ── Output
#> Type: "object(answer: string)"
#> 
#> ── Instructions
#> Given the fields `question`, produce the fields `answer`.

# Multiple inputs
signature("context, question -> answer")
#> 
#> ── Signature ──
#> 
#> ── Inputs
#> • context: "string" - Input: context
#> • question: "string" - Input: question
#> 
#> ── Output
#> Type: "object(answer: string)"
#> 
#> ── Instructions
#> Given the fields `context`, `question`, produce the fields `answer`.

# Typed outputs (uses ellmer types under the hood)
signature("review -> rating: enum('1', '2', '3', '4', '5')")
#> 
#> ── Signature ──
#> 
#> ── Inputs
#> • review: "string" - Input: review
#> 
#> ── Output
#> Type: "object(rating: enum(1, 2, 3, 4, 5))"
#> 
#> ── Instructions
#> Given the fields `review`, produce the fields `rating`.

# With instructions
signature("text -> summary", instructions = "Maximum 50 words.")
#> 
#> ── Signature ──
#> 
#> ── Inputs
#> • text: "string" - Input: text
#> 
#> ── Output
#> Type: "object(summary: string)"
#> 
#> ── Instructions
#> Maximum 50 words.

Modules

Reusable, stateful wrappers around LLM calls:

library(ellmer)

# Create a module from a signature
mod <- module(signature("text -> sentiment"), type = "predict")

# Run it
run(mod, text = "This is great!", .llm = chat_openai())

# Or convert an existing Chat
classifier <- chat_openai() |>
  as_module("text -> sentiment: enum('positive', 'negative', 'neutral')")

classifier$predict(text = "Terrible experience")

Pipelines

Chain modules together with the %>>% operator. Outputs flow automatically to inputs:

# Chain three modules together
qa_pipeline <- mod_extract %>>% mod_answer %>>% mod_format

# Run the pipeline
result <- run(qa_pipeline, document = "...", .llm = chat_openai())

# With explicit field mapping when names don't match
rag_pipeline <- pipeline(
  mod_retrieve,
  step(mod_answer, map = c(documents = "context")),
  mod_summarize
)

Optimization

Automatically improve prompts using training data. dsprrr implements several optimizers from DSPy:

  • LabeledFewShot: Add examples from your training set as demonstrations
  • MIPROv2: Joint optimization of instructions and examples using Bayesian search
  • GEPA: Reflection-based instruction optimization; sample efficient and often outperforms manual prompts

# Compile with few-shot examples
optimized <- compile(
  LabeledFewShot(k = 3),
  mod,
  trainset = my_labeled_data
)

# Grid search over parameters
mod$optimize_grid(
  devset = dev_data,
  metric = metric_exact_match(),
  parameters = list(temperature = c(0.1, 0.5, 1.0))
)
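
The same compile() pattern shown above for LabeledFewShot presumably extends to the other optimizers. The sketch below is an assumption based on that pattern and on DSPy's MIPROv2 interface; the exact arguments MIPROv2() accepts in dsprrr are not confirmed here:

```r
library(dsprrr)

# Sketch only: compiling with MIPROv2, assuming it mirrors the
# LabeledFewShot call above. Passing a metric is an assumption;
# check the package documentation for the real signature.
optimized <- compile(
  MIPROv2(),
  mod,
  trainset = my_labeled_data,
  metric = metric_exact_match()
)
```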

Tracing

ellmer tracks per-chat history and costs. dsprrr adds module-level traces across pipelines, which is useful for debugging multi-step workflows:

mod$trace_summary()
export_traces(mod)

Quick example

library(dsprrr)
library(ellmer)

# Define what you want
sig <- signature(
  "context, question -> answer",
  instructions = "Answer based only on the given context."
)

# Create a module
mod <- module(sig, type = "predict")

# Run it
result <- run(
  mod,
  context = "R is a programming language for statistical computing.",
  question = "What is R used for?",
  .llm = chat_openai()
)
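
How you read the answer back depends on what run() returns, which the snippets above do not show. Assuming the result exposes the signature's output fields by name (an assumption, not confirmed by these docs):

```r
# Hypothetical: if run() returns the output fields by name,
# the answer defined in the signature would be accessed as:
result$answer
```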

Module types

| Type               | Use case                                |
|--------------------|-----------------------------------------|
| predict            | Basic text generation                   |
| react              | Tool use (wraps ellmer tools)           |
| chain_of_thought   | Step-by-step reasoning                  |
| multichain         | Ensemble reasoning with multiple chains |
| program_of_thought | Generate and execute R code             |

# ReAct agent with tools
agent <- module(
  signature("question -> answer"),
  type = "react",
  tools = list(my_search_tool)
)

# Chain of thought
mod <- module(signature("question -> answer"), type = "chain_of_thought")
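
The remaining types from the table follow the same constructor pattern. These lines are a sketch assuming no additional required arguments beyond the signature and type:

```r
# Program of thought: the module generates and executes R code
pot <- module(signature("question -> answer"), type = "program_of_thought")

# Multichain: ensemble reasoning across multiple chains
ens <- module(signature("question -> answer"), type = "multichain")
```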

ellmer compatibility

dsprrr uses ellmer for all LLM calls. The integration is straightforward:

| ellmer                       | dsprrr equivalent            |
|------------------------------|------------------------------|
| chat_openai()                | Pass to run(..., .llm = )    |
| type_string(), type_enum()   | Used inside signatures       |
| tool()                       | Pass to module(..., tools = )|
| chat$chat_structured()       | dsp(chat, signature, ...)    |
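
The dsp() row suggests a one-shot, signature-driven call on an existing Chat. This usage is inferred from that row, not taken from a documented example, so treat the argument names as assumptions:

```r
library(dsprrr)
library(ellmer)

# Sketch: dsp() as a signature-driven alternative to
# chat$chat_structured(). Argument passing is an assumption.
chat <- chat_openai()
dsp(chat, signature("text -> summary"),
    text = "R is a language for statistical computing.")
```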

Learning more

Start here:

  • Getting Started: choose your learning path

Tutorial sequence (learn step by step):

  1. Your First LLM Call
  2. Building a Classifier
  3. Structured Outputs
  4. Improving with Examples
  5. Optimization
  6. Production

How-to guides:

  • Compile & Optimize
  • Build RAG Pipelines

Concepts:

  • The DSPy Philosophy
  • How Optimization Works

Reference:

  • Quick Reference
  • API Documentation

Status

Experimental. The API may change. See PLAN.md for the roadmap.

Acknowledgments

Built on ellmer and S7. Inspired by DSPy.
