Rename Chunk to EmbeddableItem #37623
Summary of Changes

Hello @claudevdm, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request undertakes a significant refactoring effort by renaming the core
/gemini review
Code Review
This pull request refactors the codebase by renaming Chunk to the more generic EmbeddableItem. The change is applied consistently across multiple files, including type hints, docstrings, and variable names. A backward compatibility alias Chunk = EmbeddableItem is thoughtfully included. Overall, this is a great improvement for code clarity. I've added a couple of minor suggestions to improve consistency in PTransform labels and code formatting.
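The backward compatibility alias mentioned in the review can be illustrated with a minimal sketch. The dataclass below is a hypothetical stand-in with illustrative fields, not the actual Beam class definition; only the `Chunk = EmbeddableItem` line is taken from the review.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class EmbeddableItem:
    """Hypothetical stand-in for the renamed class; fields are illustrative."""
    id: str
    text: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


# Backward compatibility alias: the old name is bound to the same class object,
# so existing imports and isinstance checks that use Chunk keep working.
Chunk = EmbeddableItem

legacy = Chunk(id="doc-1", text="hello")
print(type(legacy).__name__)  # the instance reports the new class name
```

Because the alias points at the same class object, code written against the old name constructs `EmbeddableItem` instances without modification.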
R: @damccorm
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment
damccorm left a comment
Thanks - we can get an extra pass from Gemini, but this LGTM
/gemini review
Code Review
This pull request effectively renames the Chunk class to EmbeddableItem across the codebase, which is a positive change for clarity and generality as outlined in the PR description. The introduction of EmbeddableItem as a universal container for embeddable content, along with factory methods like from_text, significantly improves the API. Backward compatibility is well-maintained through the use of aliases. The changes are consistently applied to imports, type hints, variable names, and docstrings throughout the affected files. Additionally, the Embedding dataclass has been enhanced with a metadata field, which is a valuable functional improvement for richer embedding representations. Minor inconsistencies in example code and the scope of the Embedding dataclass change have been noted for further refinement.
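As a rough illustration of the two API points raised in this review, the sketch below shows a `from_text` factory method and an `Embedding` dataclass carrying a `metadata` field. The field names, the `from_text` signature, and the ID generation are assumptions made for this example; they are not confirmed by the PR and may differ from the actual Beam code.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Embedding:
    """Illustrative embedding container (field names are assumptions)."""
    dense_embedding: Optional[List[float]] = None
    # Metadata attached to the embedding itself, e.g. which model produced it.
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class EmbeddableItem:
    """Illustrative container for one unit of embeddable content."""
    id: str
    text: Optional[str] = None
    embedding: Optional[Embedding] = None

    @classmethod
    def from_text(cls, text: str, id: Optional[str] = None) -> "EmbeddableItem":
        # Hypothetical factory: generate an ID when the caller supplies none.
        return cls(id=id or str(uuid.uuid4()), text=text)


item = EmbeddableItem.from_text("a product description", id="item-1")
item.embedding = Embedding(
    dense_embedding=[0.1, 0.2], metadata={"model": "example-model"})
```

A factory method like this keeps call sites short for the common text-only case while leaving the plain constructor available for other content.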
Sounds like a breaking change: #37646
The naming `Chunk` is confusing because it is associated with splitting a text document into smaller documents. `EmbeddableItem` keeps a unit of embeddable content more generic.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
- Update `CHANGES.md` with noteworthy changes.

See the Contributor Guide for more tips on how to make the review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.