-
Notifications
You must be signed in to change notification settings - Fork 85
added logic for finding test data in root and parent dir of project f… #824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #824 +/- ##
============================================
- Coverage 77.18% 77.17% -0.02%
Complexity 607 607
============================================
Files 85 84 -1
Lines 8858 8854 -4
Branches 1043 1043
============================================
- Hits 6837 6833 -4
Misses 1781 1781
Partials 240 240
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @shivendra-dev54, python sdk is built using pybind11 and depends on c++ library |
|
@yangxk1 Thanks for the clarification. |
|
Let me see why this is happening, please wait a moment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds fallback logic for locating test data in Spark and Java tests when the GAR_TEST_DATA environment variable is not set. The implementation checks the environment variable first, then a system property, and finally searches for the testing directory in common relative paths.
Changes:
- Added
resolveTestData()method in BaseTestSuite.scala to locate test data with fallback logic - Added
resolveTestData()method in GraphInfoTest.java to locate test data with fallback logic
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| maven-projects/spark/graphar/src/test/scala/org/apache/graphar/BaseTestSuite.scala | Replaces direct environment variable check with a new resolveTestData() method that tries environment variable, system property, and relative path candidates |
| maven-projects/java/src/test/java/org/apache/graphar/graphinfo/GraphInfoTest.java | Adds a static resolveTestData() method that implements the same fallback logic for Java tests |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return dir.getAbsolutePath(); | ||
| } | ||
| } | ||
| throw new IllegalStateException("GAR_TEST_DATA not found"); |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exception type differs from the one used in TestUtil.java, which uses RuntimeException. Consider using RuntimeException instead of IllegalStateException for consistency across the codebase, or document why a different exception type is appropriate here.
| throw new IllegalStateException("GAR_TEST_DATA not found"); | |
| throw new RuntimeException("GAR_TEST_DATA not found"); |
| ) | ||
| .map(_.getAbsolutePath) | ||
| .getOrElse( | ||
| throw new IllegalArgumentException("GAR_TEST_DATA not found") |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exception type differs from the one used in TestUtil.java, which uses RuntimeException. Consider using RuntimeException instead of IllegalArgumentException for consistency across the codebase, or document why a different exception type is appropriate here.
| throw new IllegalArgumentException("GAR_TEST_DATA not found") | |
| throw new RuntimeException("GAR_TEST_DATA not found") |
| return dir.getAbsolutePath(); | ||
| } | ||
| } | ||
| throw new IllegalStateException("GAR_TEST_DATA not found"); | ||
| } | ||
| return path; |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation approach differs from TestUtil.java which performs additional verification after finding a path. TestUtil.java re-validates the chosen path to ensure both the directory and the marker file exist. This double-checking approach is more robust and catches edge cases. Consider adding a final validation step after selecting a path to ensure it's valid before returning it.
| return dir.getAbsolutePath(); | |
| } | |
| } | |
| throw new IllegalStateException("GAR_TEST_DATA not found"); | |
| } | |
| return path; | |
| path = dir.getAbsolutePath(); | |
| break; | |
| } | |
| } | |
| if (path == null) { | |
| throw new IllegalStateException("GAR_TEST_DATA not found"); | |
| } | |
| } | |
| java.io.File baseDir = new java.io.File(path).getAbsoluteFile(); | |
| java.io.File markerFile = | |
| new java.io.File(baseDir, "ldbc_sample/csv/ldbc_sample.graph.yml"); | |
| if (!baseDir.isDirectory() || !markerFile.exists()) { | |
| throw new IllegalStateException("GAR_TEST_DATA not found"); | |
| } | |
| return baseDir.getAbsolutePath(); |
| def resolveTestData(): String = { | ||
| Option(System.getenv("GAR_TEST_DATA")) | ||
| .orElse(Option(System.getProperty("gar.test.data"))) | ||
| .getOrElse { | ||
| val candidates = Seq("../../testing", "../testing", "testing") | ||
| candidates | ||
| .map(p => new java.io.File(p).getAbsoluteFile) | ||
| .find(d => | ||
| new java.io.File(d, "ldbc_sample/csv/ldbc_sample.graph.yml") | ||
| .exists() | ||
| ) | ||
| .map(_.getAbsolutePath) | ||
| .getOrElse( | ||
| throw new IllegalArgumentException("GAR_TEST_DATA not found") | ||
| ) | ||
| } | ||
| } | ||
| testData = System.getenv("GAR_TEST_DATA") | ||
|
|
||
| testData = resolveTestData() |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation approach differs from TestUtil.java which performs additional verification after finding a path. TestUtil.java re-validates the chosen path to ensure both the directory and the marker file exist. This double-checking approach is more robust and catches edge cases. Consider adding a final validation step after selecting a path to ensure it's valid before returning it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivendra-dev54 Please consider this comment, it looks good
| if (new java.io.File(dir, "ldbc_sample/csv/ldbc_sample.graph.yml").exists()) { | ||
| return dir.getAbsolutePath(); |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation logic is inconsistent with the existing implementation in TestUtil.java. The current code only checks if the file exists, but TestUtil.java also verifies that the directory exists and is actually a directory. This could lead to false positives if the parent path doesn't exist. Consider adding an additional check to verify that the parent directory exists and is a directory before checking for the specific file.
| if (new java.io.File(dir, "ldbc_sample/csv/ldbc_sample.graph.yml").exists()) { | |
| return dir.getAbsolutePath(); | |
| if (dir.exists() && dir.isDirectory()) { | |
| java.io.File graphFile = | |
| new java.io.File(dir, "ldbc_sample/csv/ldbc_sample.graph.yml"); | |
| if (graphFile.exists()) { | |
| return dir.getAbsolutePath(); | |
| } |
| new java.io.File(d, "ldbc_sample/csv/ldbc_sample.graph.yml") | ||
| .exists() |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation logic is inconsistent with the existing implementation in TestUtil.java. The Scala code only checks if the file exists, but TestUtil.java also verifies that the directory exists and is actually a directory. This could lead to false positives if the parent path doesn't exist. Consider adding an additional check to verify that the parent directory exists and is a directory before checking for the specific file.
| new java.io.File(d, "ldbc_sample/csv/ldbc_sample.graph.yml") | |
| .exists() | |
| d.exists() && d.isDirectory && | |
| new java.io.File(d, "ldbc_sample/csv/ldbc_sample.graph.yml") | |
| .isFile |
| ) | ||
| .map(_.getAbsolutePath) | ||
| .getOrElse( | ||
| throw new IllegalArgumentException("GAR_TEST_DATA not found") |
Copilot
AI
Jan 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message is less informative than the one used in TestUtil.java. TestUtil.java provides a more detailed message that explains both the environment variable option and the directory structure requirement. Consider updating the message to match the clarity of TestUtil.java's error message which states: "GAR_TEST_DATA not found or invalid. Please set GAR_TEST_DATA environment variable to point to the testing directory or ensure the testing directory exists with ldbc_sample/csv/ldbc_sample.graph.yml".
| throw new IllegalArgumentException("GAR_TEST_DATA not found") | |
| throw new IllegalArgumentException( | |
| "GAR_TEST_DATA not found or invalid. Please set GAR_TEST_DATA environment variable to point to the testing directory or ensure the testing directory exists with ldbc_sample/csv/ldbc_sample.graph.yml" | |
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivendra-dev54, You can consider adopting this suggestion, it makes the error message clearer
|
@shivendra-dev54 This problem seems to be solved. In addition. |
|
@yangxk1 |
|
Great! In the |
yangxk1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you~ @shivendra-dev54
|
Oh!!! I forgot to check the PR name, please remember to remind me next time |
|
@yangxk1 |
|
here https://github.com/apache/incubator-graphar/blob/main/CONTRIBUTING.md#title, This will make it easier to classify PRs |
Fixes #757
Reason for this PR
Spark and Java tests currently rely on the
GAR_TEST_DATAenvironment variable.This PR adds fallback logic to locate the
testingdirectory from the projectroot or its parent when the environment variable is not set, avoiding test
failures in local environments.
What changes are included in this PR?
The following test files were updated to add fallback test-data resolution logic:
maven-projects/java/src/test/java/org/apache/graphar/graphinfo/GraphInfoTest.javamaven-projects/spark/graphar/src/test/scala/org/apache/graphar/BaseTestSuite.scalaAre these changes tested?
Yes.
Are there any user-facing changes?
No.