feat(datasets): support running experiments on versioned datasets #727
3 files reviewed, 1 comment
Additional Comments (1)
This test derives |
Important
Adds support for running experiments on versioned datasets by introducing a `version` parameter to dataset fetching and experiment running functions, with a new E2E test to verify functionality.

- `DatasetManager.get()` now accepts an optional `version` timestamp, forwards it to `datasetItems.list`, and exposes it on the returned `FetchedDataset` object.
- `runExperiment()` from a `FetchedDataset` forwards `version` as `datasetVersion` to `experiment.run()`.
- `ExperimentParams` gains an optional `datasetVersion?: string` field.
- `datasets.e2e.test.ts`: new E2E test for fetching a dataset at a given version and running an experiment over that snapshot.
- The test computes its version timestamp with local time arithmetic (`createdAt + 1000ms`).

This description was created for 1ddf69c. It will automatically update as commits are pushed.
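The parameter threading described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: only the names `version`, `datasetVersion`, `FetchedDataset`, and `ExperimentParams` come from the description, while the `*Sketch` types and the two `build*` helpers are hypothetical and exist only to show an optional value being forwarded when present and omitted otherwise.

```typescript
// Hypothetical request shape for one page of datasetItems.list.
interface ListQuerySketch {
  datasetName: string;
  page: number;
  limit: number;
  version?: string; // only set when the caller asked for a snapshot
}

// Hypothetical stand-ins for FetchedDataset and ExperimentParams.
interface FetchedDatasetSketch {
  name: string;
  items: string[];
  version?: string;
}

interface ExperimentParamsSketch {
  name: string;
  data: string[];
  datasetVersion?: string;
}

// Build the query for one page, including `version` only if provided.
function buildListQuery(
  datasetName: string,
  page: number,
  limit: number,
  version?: string,
): ListQuerySketch {
  const query: ListQuerySketch = { datasetName, page, limit };
  if (version !== undefined) {
    query.version = version;
  }
  return query;
}

// Forward the dataset's version to the experiment as `datasetVersion`.
function buildExperimentParams(
  dataset: FetchedDatasetSketch,
  experimentName: string,
): ExperimentParamsSketch {
  const params: ExperimentParamsSketch = {
    name: experimentName,
    data: dataset.items,
  };
  if (dataset.version !== undefined) {
    params.datasetVersion = dataset.version;
  }
  return params;
}
```

Leaving the key out entirely when no version is given, rather than sending `undefined`, keeps unversioned requests identical to what they were before the change; whether the actual SDK does the same is an assumption here.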
Disclaimer: Experimental PR review
Greptile Overview
Greptile Summary
This PR adds support for running experiments against a versioned snapshot of a Langfuse dataset.
`DatasetManager.get()` now accepts an optional `version` timestamp, forwards it to `datasetItems.list`, and exposes it on the returned `FetchedDataset` object. When calling `runExperiment()` from that dataset, the same value is forwarded as `datasetVersion` to `experiment.run()`. `ExperimentParams` gains an optional `datasetVersion?: string` field.

Main issue to address before merge: the new E2E test’s version timestamp computation can be flaky because it relies on local time arithmetic (`createdAt + 1000ms`) rather than a server-observed ordering guarantee between the initial item write and the upsert.

Confidence Score: 4/5
[...] `datasetVersion` param. The main risk is CI instability from the added E2E test relying on local time arithmetic and ingestion timing, which can cause intermittent failures even when the feature works correctly.

Important Files Changed
[...] `version` through item listing, exposes it on `FetchedDataset`, and forwards it as `datasetVersion` when calling `experiment.run`.
[...] `ExperimentParams` with optional `datasetVersion` string to support running experiments against a snapshot of a versioned dataset.

Sequence Diagram
sequenceDiagram
    participant T as Test (datasets.e2e)
    participant C as LangfuseClient
    participant D as DatasetManager
    participant API as Langfuse API
    participant E as ExperimentRunner
    T->>C: api.datasets.create(name)
    T->>C: api.datasetItems.create(item1)
    T->>C: waitForServerIngestion()
    T->>D: dataset.get(name)
    D->>API: datasets.get(name)
    loop paginate items
        D->>API: datasetItems.list(datasetName, page, limit)
        Note over D,API: includes {version} if provided
    end
    T->>C: api.datasetItems.create(upsert item1)
    T->>C: api.datasetItems.create(item2)
    T->>D: dataset.get(name, {version})
    D->>API: datasetItems.list(..., version)
    D->>E: experiment.run({data: items, datasetVersion: version, ...})
    E->>API: create dataset run + item runs
    E-->>T: ExperimentResult
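One possible way to address the flakiness concern raised in the review is to derive the version from timestamps the server itself reported, instead of adding a fixed offset locally. The sketch below is an assumption-heavy illustration: `pickVersionBetween` is a hypothetical helper, and it assumes the test can read a server-recorded ISO timestamp for both the initial item write and the later upsert, and that a version means "the dataset as of this instant". None of that is confirmed by the PR.

```typescript
// Hypothetical helper: choose a version timestamp strictly between two
// server-reported instants, and fail loudly if they are not ordered.
function pickVersionBetween(initialWriteAt: string, upsertAt: string): string {
  const start = Date.parse(initialWriteAt);
  const end = Date.parse(upsertAt);

  if (Number.isNaN(start) || Number.isNaN(end)) {
    throw new Error("Both timestamps must be valid ISO 8601 strings");
  }
  // A gap of at least 2 ms is needed for the midpoint to differ from both ends.
  if (end - start < 2) {
    throw new Error("Writes are not separated enough to pick a version between them");
  }

  // Midpoint of the two server-observed instants, as an ISO string.
  return new Date(start + Math.floor((end - start) / 2)).toISOString();
}
```

With this approach an ordering problem surfaces as an explicit error in the test setup rather than as an intermittent assertion failure later on.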