Add LLM-powered content summarization feature with litellm integratio…#2
Open
KTS-o7 wants to merge 1 commit intofinancial-datasets:mainfrom
Open
Add LLM-powered content summarization feature with litellm integratio…#2KTS-o7 wants to merge 1 commit intofinancial-datasets:mainfrom
KTS-o7 wants to merge 1 commit intofinancial-datasets:mainfrom
Conversation
…n. Update search functionality to include optional summaries, enhance README with new features, and add tests for summarization service.
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request introduces LLM-powered content summarization to the web crawler, allowing users to generate high-quality summaries of search results using GPT-4o-mini and other supported models. The changes update the documentation, add new dependencies, enhance the CLI workflow, and introduce new modules for summarization and improved search functionality.
LLM Summarization Feature
README.mdto announce LLM-powered summarization, document new features, configuration steps, example output, architecture, supported LLM providers, and production considerations. [1] [2]litellmas a dependency inpyproject.tomlto enable LLM integration.CLI and Workflow Enhancements
src/main.pyto prompt users for AI summaries, allow configuration of the number of summaries, and integrate the newSummarizationServicefor summarizing search results. [1] [2] [3]Search and Summarization Modules
src/search/searcher.pyimplementingWebSearcherfor robust, multi-source web search with improved RSS parsing, Google News URL resolution, and result cleaning/sorting.src/summarizer/__init__.pyto expose summarization engine components for use in the crawler.