cursor/update debian pipeline for diff structure a696 #133
- fixing tests now
package_managers/debian/README.md
## Approach

There is a 1 to 1 mapping between Packages and Sources. During the load step, we
Many to 1, right?
Yes, many to 1, my bad
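For reference, the many-to-1 relationship can be sketched roughly as follows. This is only an illustration, not the code in this PR: `build_source_to_packages` and the stanza dictionaries are hypothetical, and it assumes each parsed Packages stanza exposes `Package` and an optional `Source` key (when `Source` is absent, the source name is taken to be the package name).

```python
from collections import defaultdict


def build_source_to_packages(package_stanzas):
    """Group binary package names by source name (hypothetical helper)."""
    source_to_packages = defaultdict(list)
    for stanza in package_stanzas:
        name = stanza["Package"]
        # Source may carry a version in parentheses, e.g. "glibc (2.36-9)";
        # keep only the name. Without a Source field, fall back to the package name.
        source = stanza.get("Source", name).split(" ", 1)[0]
        source_to_packages[source].append(name)
    return dict(source_to_packages)


# Made-up stanzas: two binary packages share one source, one stands alone.
stanzas = [
    {"Package": "libc6", "Source": "glibc"},
    {"Package": "libc6-dev", "Source": "glibc (2.36-9)"},
    {"Package": "curl"},
]
mapping = build_source_to_packages(stanzas)
print(mapping)
```

Several packages landing under one source key is what makes the mapping many to 1 rather than 1 to 1.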
Pull Request Overview
This PR updates the Debian indexing pipeline by refining the diff structure, enhancing the parser and differential processing, and improving related tests and helper modules. Key changes include:
- Extensive modifications across the Debian parser, diff, and main modules to streamline data processing.
- Updates to test suites and fixtures to ensure full coverage of the new diff pipeline.
- Removal of outdated modules (e.g. the transformer and loader) and the introduction of new modules for sources mapping, database handling, and utility functions.
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/package_managers/debian/* | New and updated tests to cover sources, parser, diff, and fixtures |
| package_managers/debian/parser.py | Refactoring to simplify multiline field processing and URL normalization |
| package_managers/debian/main.py | Major revisions to the pipeline including diff processing and fetch handling |
| package_managers/debian/diff.py | New module handling differential comparisons for packages, URLs, and dependencies |
| package_managers/debian/debian_sources.py | New functions for building package-to-source mappings and enriching package data |
| core/utils.py, core/structs.py | Minor updates including a new helper (file_exists) and addition of the DiffResult dataclass |
| Other files (README.md, db.py) | Documentation and database ingestion tweaks supporting the new pipeline |
Comments suppressed due to low confidence (1)
package_managers/debian/debian_sources.py:20
- Consider specifying an encoding (e.g. encoding='utf-8') when opening the sources file, to align with other file operations and ensure consistent behavior across platforms.
with open(sources_file_path) as f:
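A minimal sketch of what the suggestion amounts to, assuming the sources file is UTF-8 text. `read_sources_file` and the sample stanza are hypothetical and only serve to show the explicit `encoding` argument; the actual function in `debian_sources.py` may look different.

```python
import tempfile
from pathlib import Path


def read_sources_file(sources_file_path):
    """Read the file as UTF-8 instead of relying on the platform default."""
    with open(sources_file_path, encoding="utf-8") as f:
        return f.read()


# Round-trip a made-up stanza containing a non-ASCII maintainer name.
sample = "Package: example\nMaintainer: José Müller <jose@example.org>\n"
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "Sources"
    path.write_text(sample, encoding="utf-8")
    loaded = read_sources_file(path)
```

Without the explicit argument, `open` falls back to a locale-dependent encoding, so the same file could decode differently (or fail) on another platform.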
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>