Replies: 1 comment
@squalud, I note there is already some code to pass
I use Alluxio's proxy to provide S3 interface access.

By setting `spark.hadoop.fs.ks3.endpoint` to `http://<alluxio-proxy-service-name>:39999/api/v1/s3/` and setting the `spark.hadoop.fs.s3a.path.style.access` parameter to `true` to use path-style access to S3, I can use PySpark to successfully read CSV files through URLs of the form `s3a://data/tmp/file.csv`. Note that `/data` is a path, not the bucket.

But when I switch to Gluten and set `spark.gluten.sql.native.arrow.reader.enabled` to `true` to read with Arrow's reader, I get an error.

It seems that Arrow's reader treats the first-level path `data` as the bucket; that is, the configuration `spark.hadoop.fs.s3a.path.style.access` does not take effect for Gluten/Arrow. How can I use Gluten with Arrow's reader to access S3 with path-style addressing, just like Spark's original reader?
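To make the difference between the two addressing styles concrete, here is a minimal sketch of how an S3 client generally maps an `s3a://` URI onto an HTTP request URL. In S3 terms, the first component of the URI is the bucket name; path-style addressing places it as the first path segment after the endpoint, while virtual-hosted-style addressing turns it into a subdomain of the endpoint host. The function name `s3_request_url` and the host `alluxio-proxy` are hypothetical, chosen only for illustration, and the sketch does not show what Gluten or Arrow do internally.

```python
from urllib.parse import urlsplit, urlunsplit


def s3_request_url(endpoint: str, s3_uri: str, path_style: bool) -> str:
    """Build the HTTP URL an S3 client would request for `s3_uri`.

    Illustrative only: real clients also sign requests, handle regions, etc.
    """
    uri = urlsplit(s3_uri)
    # For s3a://data/tmp/file.csv: bucket = "data", key = "tmp/file.csv".
    bucket, key = uri.netloc, uri.path.lstrip("/")

    ep = urlsplit(endpoint)
    base_path = ep.path.rstrip("/")

    if path_style:
        # Path-style: the bucket is the first path segment after the endpoint path.
        path = f"{base_path}/{bucket}/{key}"
        return urlunsplit((ep.scheme, ep.netloc, path, "", ""))

    # Virtual-hosted-style: the bucket is prepended to the endpoint host.
    path = f"{base_path}/{key}"
    return urlunsplit((ep.scheme, f"{bucket}.{ep.netloc}", path, "", ""))


if __name__ == "__main__":
    # Hypothetical service name standing in for <alluxio-proxy-service-name>.
    endpoint = "http://alluxio-proxy:39999/api/v1/s3/"
    uri = "s3a://data/tmp/file.csv"
    print(s3_request_url(endpoint, uri, path_style=True))
    print(s3_request_url(endpoint, uri, path_style=False))
```

With path-style addressing the request stays on the proxy host and `data` travels in the URL path, which is the only form a proxy behind a single service name can normally serve. A reader that ignores the path-style setting would instead try to resolve a host prefixed with `data`.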