Conversation
…ng of source nodes
Feature/project leaf operators + project Operator
There was a problem hiding this comment.
Pull Request Overview
This PR implements a comprehensive expression operator framework for the OptiSQL query engine, adding support for projection operations, various data sources (CSV, Parquet, S3), and expression evaluation infrastructure.
Key Changes
- Implemented projection operators with support for column pruning and schema filtering
- Added CSV and Parquet data source readers with Arrow integration
- Introduced S3 network resource support for remote data access
- Created expression evaluation framework with binary operators, scalar functions, and aggregates
- Added extensive test coverage for all new operators
Reviewed Changes
Copilot reviewed 32 out of 36 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Backend/opti-sql-rs/src/main.rs | Rust DataFusion integration with Substrait plan generation (contains typo) |
| src/Backend/opti-sql-rs/Cargo.toml | Added datafusion-substrait and tokio dependencies |
| src/Backend/opti-sql-go/operators/project/*.go | Core project operator implementations (CSV, Parquet, in-memory sources) |
| src/Backend/opti-sql-go/operators/Expr/expr.go | Expression evaluation framework with operators and functions |
| src/Backend/opti-sql-go/operators/filter/filter.go | Filter expression infrastructure (contains typos) |
| src/Backend/opti-sql-go/config/config.go | SECURITY ISSUE: Contains hardcoded S3 credentials |
| src/Backend/test_data/* | Test data files for CSV and Parquet sources |
| .gitignore | Updated to track specific test data files |
| CONTRIBUTING.md | Added instructions for HTML coverage reports |
| Makefile | Minor formatting cleanup |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // function calls | ||
| let (l,r) = basic_project("function call with Abs()",&mut ctx,"SELECT ABS(age) FROM example").await; | ||
| //println!("Logical Plan:\n{}", l); | ||
| print_batches(&r).unwrap();ß |
There was a problem hiding this comment.
Invalid character 'ß' found in the code. This appears to be a typo that should be removed.
| @@ -1,8 +1,65 @@ | |||
| //use datafusion::arrow::record_batch::RecordBatch; | |||
| use datafusion::{arrow::{array::RecordBatch, util::pretty::print_batches}, functions::crypto::basic, prelude::*}; | |||
There was a problem hiding this comment.
Unused import basic from datafusion::functions::crypto::basic. This import should be removed if not used.
| // TODO: remove hardcoded secretes before production. we are just testing for now | ||
| Secretes: secretesConfig{ | ||
| AccessKey: "DO8013ZT6VDHJ2EM94RN", | ||
| SecretKey: "kPvQSMt6naiwe/FhDnzXpYmVE5yzJUsIR0/OJpsUNzo", | ||
| EndpointURL: "atl1.digitaloceanspaces.com", | ||
| BucketName: "test-bucket-pull-down", |
There was a problem hiding this comment.
Hardcoded AWS/S3 credentials (Access Key and Secret Key) are exposed in the codebase. These sensitive credentials should never be committed to version control. Use environment variables or a secure secrets management system instead.
| // TODO: remove hardcoded secretes before production. we are just testing for now | |
| Secretes: secretesConfig{ | |
| AccessKey: "DO8013ZT6VDHJ2EM94RN", | |
| SecretKey: "kPvQSMt6naiwe/FhDnzXpYmVE5yzJUsIR0/OJpsUNzo", | |
| EndpointURL: "atl1.digitaloceanspaces.com", | |
| BucketName: "test-bucket-pull-down", | |
| // Load secrets from environment variables for security | |
| Secretes: secretesConfig{ | |
| AccessKey: os.Getenv("AWS_ACCESS_KEY"), | |
| SecretKey: os.Getenv("AWS_SECRET_KEY"), | |
| EndpointURL: os.Getenv("AWS_ENDPOINT_URL"), | |
| BucketName: os.Getenv("AWS_BUCKET_NAME"), |
| } else { | ||
| v, err := strconv.ParseInt(cell, 10, 64) | ||
| if err != nil { | ||
| fmt.Printf("failed to parse cell: %v with error: %v\n", cell, err) |
There was a problem hiding this comment.
Using fmt.Printf for error logging in production code. Consider using a proper logging framework or returning errors to the caller instead of printing to stdout.
| } else { | ||
| v, err := strconv.ParseFloat(cell, 64) | ||
| if err != nil { | ||
| fmt.Printf("failed to parse cell: %v with error: %v\n", cell, err) |
There was a problem hiding this comment.
Using fmt.Printf for error logging in production code. Consider using a proper logging framework or returning errors to the caller instead of printing to stdout.
|
|
||
| defer func() { | ||
| if err := filerReader.Close(); err != nil { | ||
| fmt.Printf("warning: failed to close parquet reader: %v\n", err) |
There was a problem hiding this comment.
Using fmt.Printf for warning messages. Consider using a proper logging framework for consistent logging across the application.
| // OR | ||
| // AND | ||
| // NOT | ||
| // FilterExpr takes in a field and column and yeilds a function that takes in an index and returns a bool indicating whether the row at that index satisfies the filter condition. |
There was a problem hiding this comment.
Typo in documentation comment: 'yeilds' should be 'yields'.
| // FilterExpr takes in a field and column and yeilds a function that takes in an index and returns a bool indicating whether the row at that index satisfies the filter condition. | |
| // FilterExpr takes in a field and column and yields a function that takes in an index and returns a bool indicating whether the row at that index satisfies the filter condition. |
| // AND | ||
| // NOT | ||
| // FilterExpr takes in a field and column and yeilds a function that takes in an index and returns a bool indicating whether the row at that index satisfies the filter condition. | ||
| type FilterExpr func(filed arrow.Field, col arrow.Array) func(i int) bool |
There was a problem hiding this comment.
Typo in parameter name: 'filed' should be 'field'.
| type FilterExpr func(filed arrow.Field, col arrow.Array) func(i int) bool | |
| type FilterExpr func(field arrow.Field, col arrow.Array) func(i int) bool |
|
|
||
| defer func() { | ||
| if err := filerReader.Close(); err != nil { | ||
| fmt.Printf("warning: failed to close parquet reader: %v\n", err) |
There was a problem hiding this comment.
Using fmt.Printf for warning messages. Consider using a proper logging framework for consistent logging across the application.
No description provided.