Aws Sdk V3 #206
base: development
Conversation
* Remove old S3 files after the events are flushed to the queue.
* Allow a transform to return nothing to be written, a single event, or multiple events.
…entity-table-loading
…e-loading AWSv3 s3 entity table loading
Use the new Upload class for the ES connector
default host to https:// if not specified
* Added new auto-configure Fast S3 read options to entity table read.
* Fixed a bug in the ES connector where NOT saving the results would result in the bot not checkpointing.
⛔ Snyk checks have failed. 5 issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.
chore: upgrade js-beautify version in leo-connector-entity-table to fix vulnerability
For some reason, the OpenSearch project client does NOT add the correct `Accept-Encoding` gzip headers by default. You have to opt in by setting `suggestCompression` to `true`. Without this, large ES response bodies take 5–6x longer to download, which hurts response times and concurrency because of the added time spent downloading the full, uncompressed response.
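For reference, enabling compression on the client looks roughly like this (a minimal sketch assuming the `@opensearch-project/opensearch` client; the endpoint is a placeholder, not a value from this PR):

```js
// Minimal sketch: ask the cluster to gzip response bodies.
const { Client } = require('@opensearch-project/opensearch');

const client = new Client({
	node: 'https://search-example.us-east-1.es.amazonaws.com', // placeholder endpoint
	// Without this, large responses come back uncompressed and take far longer to download.
	suggestCompression: true,
});
```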
- Use a proxy to preserve OpenSearch client methods
…r-connector ES-2352 - use a proxy
It will only accept `scroll_id` now.
- Added the ability to filter the stream before loading to DynamoDB (for fanout).
- Added support for FastJSON parsing (pass along the whole event if it was parsed initially by FastJSON).
…d-fast-json-support ES-2352 - improvements to entity table connector
```js
	});
} else {
	done(err);
	done(err, meta);
```
Bug: Inconsistent Upload Payload Breaks Backward Compatibility
The S3 upload logic in stream and streamParallel now returns an inconsistent payload. The new Promise-based Upload.done() includes file only on success and uploadError only on failure, unlike the previous s3.upload callback that always provided both. This breaks backward compatibility for downstream consumers.
Additional Locations (1)
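One way to restore the old shape is to always resolve with both fields. A minimal sketch, assuming the `@aws-sdk/lib-storage` `Upload` class and the surrounding stream's `done`-style callback (the exact `file` value in the connector may differ):

```js
// Sketch only: resolve with a payload that always carries both `file` and
// `uploadError`, mirroring what the old s3.upload callback produced.
// `client`, `params`, and `done` are assumed from the surrounding stream code.
const { Upload } = require('@aws-sdk/lib-storage');

new Upload({ client, params }).done()
	.then(() => done(null, { file: params.Key, uploadError: null }))
	.catch((err) => done(err, { file: params.Key, uploadError: err }));
```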
- updated `leo-logger` and `leo-sdk` in connectors
```js
if (needsComparison) {
	if (JSON.stringify(data.payload.old || {}) === JSON.stringify(data.payload.new || {})) {
		return resolve(null);
	}
```
Bug: Data Consistency Varies by Storage Location
The needsComparison flag only triggers when data is fetched from S3, but the comparison skips events when old and new are identical. This means non-S3 records with identical old/new values still emit events, while S3-backed identical records are filtered out, creating inconsistent behavior based on storage location rather than data content.
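A possible fix is to run the comparison for every record, not just the S3-backed ones. A minimal sketch, not the connector's actual code:

```js
// Sketch only: filter identical old/new payloads the same way regardless of
// where the record was stored. `payload` mirrors data.payload in the snippet above.
function isUnchanged(payload) {
	return JSON.stringify(payload.old || {}) === JSON.stringify(payload.new || {});
}

// In the connector this would replace the needsComparison-guarded check:
// if (isUnchanged(data.payload)) { return resolve(null); }
```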
```js
if (s3Updates.length > 0) {
	logger.info(`finished writing ${s3Updates.length} records to DynamoDB`);
}
```
Bug: Log Message Falsely Claims Completion
The log message says "finished writing records to DynamoDB" but appears before the batchWrite call executes. The message should say "finished writing records to S3" or be moved after the DynamoDB write completes to accurately reflect what operation finished.
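A minimal sketch of the suggested ordering, assuming a `batchWrite` helper and the existing `logger` (names are illustrative, not the connector's actual API):

```js
// Sketch only: log the S3 result immediately, and log the DynamoDB result
// only after the batch write actually resolves.
async function flushBatch({ s3Updates, batchWrite, logger }) {
	if (s3Updates.length > 0) {
		logger.info(`finished writing ${s3Updates.length} records to S3`);
	}
	await batchWrite(s3Updates);
	logger.info(`finished writing ${s3Updates.length} records to DynamoDB`);
}
```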
```js
	oldS3Files.push(image._s3);
} catch (e) {
	return reject(e);
}
```
Bug: S3 Deletes Referenced Files, Causing Data Loss
When an old S3 file is fetched and added to oldS3Files at line 341, but the comparison at lines 375-378 determines old and new are identical and returns null, the S3 file still gets deleted at the end. This causes data loss because the file is deleted even though it's still referenced in DynamoDB (since the event was filtered and no update occurred).
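A minimal sketch of one way to avoid this: defer the push to `oldS3Files` until the event is known to be written (variable names are assumptions based on the snippets above, not the connector's actual code):

```js
// Sketch only: decide whether the event survives filtering BEFORE marking the
// old S3 file for deletion.
const unchanged = JSON.stringify(data.payload.old || {}) === JSON.stringify(data.payload.new || {});
if (unchanged) {
	// Event is filtered out and no DynamoDB update happens, so the old file is
	// still referenced and must NOT be queued for deletion.
	return resolve(null);
}
// Only events that will actually be written make their old file safe to remove.
oldS3Files.push(image._s3);
```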
Note
Updates connector packages to AWS SDK v3, bumps shared deps (leo-connector-common v5, leo-sdk v7), and aligns peer deps; includes version bumps (e.g., mongo 4.0.1) and regenerated Postgres lockfile.
- Migrate `aws-sdk` v2 to AWS SDK v3 packages (e.g., `@aws-sdk/client-lambda`) across connectors.
- Bump `leo-connector-common` to `^5.x` and add/update `leo-sdk` to `^7.1.x` / peer `>=7.1.x`.
- Connectors now depend on `^5.0.0-awsv3` of `leo-connector-common` and `leo-sdk >=7.1.0-awsv3`.
- `mongo/package.json`: version to `4.0.1` and dependency/peer updates.
- `postgres/package-lock.json`: updated to `5.0.0-awsv3` with AWS SDK v3 dependency tree.
- `package.json` tweaks to align versions and ranges.

Written by Cursor Bugbot for commit 10b4209. This will update automatically on new commits.
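For context, the v2-to-v3 change in each connector roughly follows this pattern (a hypothetical sketch using `@aws-sdk/client-lambda`; function name, region, and payload are placeholders, and the connectors' actual call sites may differ):

```js
// AWS SDK v3: modular client plus a command object per operation.
const { LambdaClient, InvokeCommand } = require('@aws-sdk/client-lambda');

async function invokeBot(payload) {
	// v2 equivalent: new AWS.Lambda({ region }).invoke(params).promise()
	const lambda = new LambdaClient({ region: 'us-east-1' });
	return lambda.send(new InvokeCommand({
		FunctionName: 'example-bot',          // placeholder function name
		Payload: JSON.stringify(payload),     // v3 accepts a string or Uint8Array
	}));
}
```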