@TebaleloS
Collaborator

@TebaleloS TebaleloS commented Mar 23, 2023

Notes

  • Initialized DEFAULT_PARQUET_DATETIME_READ_MODE and DEFAULT_PARQUET_DATETIME_WRITE_MODE with the default value CORRECTED.
  • Added an option condition that checks the value of the datetime read/write mode.
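
For readers without the diff at hand, here is a minimal sketch of how a default plus an option check could look in a bash wrapper. Only the two `DEFAULT_PARQUET_DATETIME_*` names and the value CORRECTED come from the notes above; the `resolve_mode` helper and the `PARQUET_DATETIME_READ_MODE` / `PARQUET_DATETIME_WRITE_MODE` variables are hypothetical and are not taken from the PR's code.

```shell
#!/usr/bin/env bash
# Sketch only: the DEFAULT_* names come from the PR notes,
# everything else here is a hypothetical illustration.
DEFAULT_PARQUET_DATETIME_READ_MODE="CORRECTED"
DEFAULT_PARQUET_DATETIME_WRITE_MODE="CORRECTED"

# resolve_mode <user_value> <default_value>
# Prints the user-supplied value when it is non-empty, otherwise the default.
resolve_mode() {
  local user_value="$1"
  local default_value="$2"
  if [ -n "$user_value" ]; then
    echo "$user_value"
  else
    echo "$default_value"
  fi
}

# Hypothetical variables that an option parser would have filled in.
READ_MODE="$(resolve_mode "${PARQUET_DATETIME_READ_MODE:-}" "$DEFAULT_PARQUET_DATETIME_READ_MODE")"
WRITE_MODE="$(resolve_mode "${PARQUET_DATETIME_WRITE_MODE:-}" "$DEFAULT_PARQUET_DATETIME_WRITE_MODE")"

echo "read mode:  $READ_MODE"
echo "write mode: $WRITE_MODE"
```

Spark accepts EXCEPTION, CORRECTED and LEGACY for these rebase modes, so a stricter variant could also reject any other value before calling spark-submit.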

Closes #2175

@TebaleloS TebaleloS marked this pull request as ready for review March 23, 2023 07:41
Collaborator

@miroslavpojer miroslavpojer left a comment

  • pulled
  • code review
  • command generation test

Used command-line calls:

  • run_conformance.sh --rest-api-credentials-file ~/.ssh/menasCredential.properties --dataset-name test --dataset-version 1 --report-date 2020-03-03 --dry-run
  • run_standardization.sh --rest-api-credentials-file ~/.ssh/menasCredential.properties --dataset-name test --dataset-version 1 --report-date 2020-03-03 --dry-run
  • run_standardization_conformance.sh --rest-api-credentials-file ~/.ssh/menasCredential.properties --dataset-name test --dataset-version 1 --report-date 2020-03-03 --dry-run

Only the bash scripts were tested.
The cmd scripts still need to be tested.

Contributor

@dk1844 dk1844 left a comment

LGTM

  • code reviewed
  • pulled
  • prepared dummy Enceladus env to see the spark-submit command
  • run

@dk1844
Contributor

dk1844 commented Mar 24, 2023

I have tested both the sh and cmd scripts to confirm that they correctly pass the datetime configuration fields down to spark-submit. I have not run the actual job.

sh

./run_standardization.sh ... --parquet-datetime-read-mode XX --parquet-datetime-write-mode YY -> spark-submit ... --conf spark.sql.parquet.datetimeRebaseModeInRead=XX --conf spark.sql.parquet.datetimeRebaseModeInWrite=YY --conf spark.sql.parquet.int96RebaseModeInRead=XX --conf spark.sql.parquet.int96RebaseModeInWrite=YY ... ✔️

./run_conformance.sh ... --parquet-datetime-read-mode XX --parquet-datetime-write-mode YY -> spark-submit ... --conf spark.sql.parquet.datetimeRebaseModeInRead=XX --conf spark.sql.parquet.datetimeRebaseModeInWrite=YY --conf spark.sql.parquet.int96RebaseModeInRead=XX --conf spark.sql.parquet.int96RebaseModeInWrite=YY ... ✔️

./run_standardization_conformance.sh ... --parquet-datetime-read-mode XX --parquet-datetime-write-mode YY -> spark-submit ... --conf spark.sql.parquet.datetimeRebaseModeInRead=XX --conf spark.sql.parquet.datetimeRebaseModeInWrite=YY --conf spark.sql.parquet.int96RebaseModeInRead=XX --conf spark.sql.parquet.int96RebaseModeInWrite=YY ... ✔️

cmd

.\run_standardization.cmd ... --parquet-datetime-read-mode XX --parquet-datetime-write-mode YY ->
spark-submit ... --conf spark.sql.parquet.datetimeRebaseModeInRead=XX --conf spark.sql.parquet.datetimeRebaseModeInWrite=YY --conf spark.sql.parquet.int96RebaseModeInRead=XX --conf spark.sql.parquet.int96RebaseModeInWrite=YY ... ✔️

.\run_conformance.cmd ... --parquet-datetime-read-mode XX --parquet-datetime-write-mode YY ->
spark-submit --conf spark.sql.parquet.datetimeRebaseModeInRead=XX --conf spark.sql.parquet.datetimeRebaseModeInWrite=YY --conf spark.sql.parquet.int96RebaseModeInRead=XX --conf spark.sql.parquet.int96RebaseModeInWrite=YY ✔️

.\run_standardization_conformance.cmd ... --parquet-datetime-read-mode XX --parquet-datetime-write-mode YY ->
spark-submit ... --conf spark.sql.parquet.datetimeRebaseModeInRead=XX --conf spark.sql.parquet.datetimeRebaseModeInWrite=YY --conf spark.sql.parquet.int96RebaseModeInRead=XX --conf spark.sql.parquet.int96RebaseModeInWrite=YY ... ✔️
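
The mapping checked above (one read mode and one write mode, each fanned out to a `datetimeRebaseMode*` and an `int96RebaseMode*` setting) can be sketched in bash as follows. This is an illustration of the observed behaviour, not the scripts' actual code: `build_parquet_datetime_conf` is a hypothetical helper name, and only the four Spark configuration keys are taken from the output above.

```shell
#!/usr/bin/env bash
# Sketch only: build_parquet_datetime_conf is a hypothetical helper.
# It prints the four --conf arguments in the same order as the
# spark-submit output shown in the test results.
build_parquet_datetime_conf() {
  local read_mode="$1"
  local write_mode="$2"
  local conf=""
  conf+="--conf spark.sql.parquet.datetimeRebaseModeInRead=${read_mode} "
  conf+="--conf spark.sql.parquet.datetimeRebaseModeInWrite=${write_mode} "
  conf+="--conf spark.sql.parquet.int96RebaseModeInRead=${read_mode} "
  conf+="--conf spark.sql.parquet.int96RebaseModeInWrite=${write_mode}"
  echo "$conf"
}

# Dry-run style usage: print the command instead of executing it.
echo "spark-submit $(build_parquet_datetime_conf XX YY)"
```

Printing the command rather than executing it mirrors the `--dry-run` approach used in the reviews, so the sketch can be checked without a Spark installation.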

@dk1844 dk1844 added the PR:tested label (Only for PR - PR was tested by a tester (person)) Mar 24, 2023
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

No coverage information
No duplication information

Collaborator

@lsulak lsulak left a comment

LGTM, just read the code, given that two reviewers tested it

@benedeki benedeki removed their request for review November 20, 2025 20:42

Development

Successfully merging this pull request may close these issues.

Add ability to configure how Spark handles dates in parquet files.

5 participants