Add LangChain integration to main package with auto_instrument() support #1320
base: main
Conversation
Move LangChain wrapper from integrations/langchain-py into the main braintrust package, enabling auto-instrumentation via braintrust.auto_instrument(). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The deprecation wrapper can be added after the new braintrust package is released. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
```python
    """
    span = current_span()
    if span == NOOP_SPAN:
        init_logger(project=project_name, api_key=api_key, project_id=project_id)
```
do we know what happens if the logger is initialized up front without a project name etc.? I have a vague recollection that it could make traces show up in the project log instead of in an ongoing eval.
if you want to add to this repo:
```python
import asyncio

from braintrust import EvalAsync, Score, init_dataset, init_logger
from braintrust_langchain import BraintrustCallbackHandler, set_global_handler
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

project_name = "test-braintrust-converted"
logger = init_logger(project=project_name)
set_global_handler(BraintrustCallbackHandler(logger=logger))

chat_model = ChatOpenAI(model="gpt-4o-mini", temperature=0)


async def toxicity_classifier(inputs: dict) -> dict:
    instructions = (
        "Please review the user query below and determine if it contains any form of toxic behavior, "
        "such as insults, threats, or highly negative comments. Respond with 'Toxic' if it does "
        "and 'Not toxic' if it doesn't."
    )
    messages = [
        SystemMessage(content=instructions),
        HumanMessage(content=inputs["text"]),
    ]
    result = await chat_model.ainvoke(messages)
    return {"class": result.content}


examples = [
    {
        "input": {"text": "Shut up, idiot"},
        "expected": "Toxic",
    },
    {
        "input": {"text": "You're a wonderful person"},
        "expected": "Not toxic",
    },
    {
        "input": {"text": "This is the worst thing ever"},
        "expected": "Toxic",
    },
    {
        "input": {"text": "I had a great day today"},
        "expected": "Not toxic",
    },
    {
        "input": {"text": "Nobody likes you"},
        "expected": "Toxic",
    },
    {
        "input": {"text": "This is unacceptable. I want to speak to the manager."},
        "expected": "Not toxic",
    },
]

dataset = init_dataset(project=project_name, name="Toxic Queries")
if len(list(dataset.fetch())) == 0:
    for example in examples:
        dataset.insert(**example)
    dataset.summarize()


def correct(input, output, expected):
    return Score(
        name="Correct",
        score=1 if output["class"] == expected else 0,
    )


async def run_evaluation():
    await EvalAsync(
        project_name,
        data=dataset,
        task=toxicity_classifier,
        scores=[correct],
        experiment_name="gpt-4o-mini, baseline",
        metadata={"description": "Testing the baseline system."},
        max_concurrency=4,
    )


if __name__ == "__main__":
    asyncio.run(run_evaluation())
```
py/noxfile.py
```python
# langchain requires Python >= 3.10
# Note: langchain ecosystem packages have tight version coupling, so we pin
# entire sets of compatible versions rather than testing "latest"
LANGCHAIN_VERSIONS = ("0.3.27",)
```
there's a 1.x now too
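One way the matrix could pick up the 1.x line is to parametrize the nox session over both version sets. This is a sketch only: the `1.0.0` pin is a placeholder, not a tested version, and session names and paths are assumptions.

```python
# Sketch of extending the nox matrix to cover both langchain major lines.
# The 1.x version string is a placeholder, not a verified compatible pin.
import nox

LANGCHAIN_VERSIONS = (
    "0.3.27",  # pinned 0.x set
    "1.0.0",   # placeholder 1.x pin (or a "LATEST" marker resolved at install time)
)


@nox.session
@nox.parametrize("version", LANGCHAIN_VERSIONS)
def test_langchain(session, version):
    # Install the parametrized langchain version, then run the suite.
    session.install(f"langchain=={version}")
    session.run("pytest", "py/tests/langchain")
```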
```python
def test_langchain(session, version):
    """Test LangChain integration."""
    # langchain requires Python >= 3.10
    if sys.version_info < (3, 10):
```
we don't support 3.9 anymore
py/noxfile.py
```python
# langsmith is needed for the wrapper module but not in VENDOR_PACKAGES
session.install("langsmith")
# langchain dependencies for the langchain wrapper (pinned compatible versions)
session.install("langchain==0.3.27", "langchain-openai==0.3.35", "langchain-anthropic==0.3.22", "langgraph>=0.2.1,<0.4.0", "tenacity")
```
we should probably test the 1.x stuff as well. there shouldn't be any breaking changes between 0.x and 1.x, but it's good to have the coverage now
```diff
 from .context import clear_global_handler, set_global_handler

-__all__ = ["BraintrustCallbackHandler", "set_global_handler"]
+__all__ = ["BraintrustCallbackHandler", "set_global_handler", "clear_global_handler"]
```
not sure we needed this change. should we just kill the source in the repo? the published PyPI package may be enough, and perhaps we can keep a tag or branch in case we need to ship patch fixes.
ibolmo left a comment:
I would update and run the langchain.py golden tests https://github.com/braintrustdata/braintrust-sdk/blob/main/internal/golden/langchain.py
I also have a few more examples (in a separate local repo) that I'll try to add to the examples here
- Add LATEST to LANGCHAIN_VERSIONS for testing against newest releases
- Remove redundant version pinning and explicit transitive deps (tenacity, pydantic)
- Remove conditional skip for langgraph - it's now a required test dependency

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary

- Move `integrations/langchain-py` into the main braintrust package
- Add `braintrust.auto_instrument()` for automatic instrumentation
- Add `setup_langchain()` for manual setup with a global callback handler

Test plan

- `nox -s "test_langchain(0.3.27)"` passes (335 tests)
- `make fixup` passes
- `python py/examples/langchain/auto.py`

🤖 Generated with Claude Code
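The auto-instrumentation pattern this PR describes (one entry point registers a global callback handler that subsequent library calls consult) can be illustrated with a self-contained toy. All names below are stand-ins for illustration, not the shipped braintrust API.

```python
# Toy illustration of the auto-instrument pattern: calling a single
# entry point installs a global handler, and instrumented library code
# reports events to it. Stand-in names, not the real braintrust SDK.

_GLOBAL_HANDLER = None


class RecordingHandler:
    """Minimal stand-in for a callback handler that records events."""

    def __init__(self):
        self.events = []

    def on_event(self, name):
        self.events.append(name)


def auto_instrument():
    # In the real SDK this would detect installed integrations (such as
    # LangChain) and wire up the appropriate callback handler; here we
    # just install a recorder, idempotently.
    global _GLOBAL_HANDLER
    if _GLOBAL_HANDLER is None:
        _GLOBAL_HANDLER = RecordingHandler()
    return _GLOBAL_HANDLER


def instrumented_call(name):
    # Library code consults the global handler, if one is registered.
    if _GLOBAL_HANDLER is not None:
        _GLOBAL_HANDLER.on_event(name)


handler = auto_instrument()
instrumented_call("chat_model.invoke")
print(handler.events)  # → ['chat_model.invoke']
```

The key design point is that instrumentation is opt-in and global: user code calls the entry point once, and every traced call afterward flows through the same handler without per-call setup.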