Whenever I try to use Flint locally (no Hadoop/EMR involved), it keeps barfing at me with the error message in the subject (shown in the stack trace below). The setup is Python 3.7 with PySpark 2.4.4 and OpenJDK 8 on an Ubuntu 19.04 install.
Note: As I'm running locally only, I'm getting this log message from Spark, but everything does run perfectly using vanilla PySpark:
19/10/23 09:59:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
It happens when I try to read a PySpark dataframe into a ts.flint.TimeSeriesDataFrame. This example is adapted from the Flint Example.ipynb:
import pyspark
import ts.flint
from ts.flint import FlintContext
sc = pyspark.SparkContext('local', 'Flint Example')
spark = pyspark.sql.SparkSession(sc)
flint_context = FlintContext(spark)
sp500 = (
    spark.read
    .option('header', True)
    .option('inferSchema', True)
    .csv('sp500.csv')
    .withColumnRenamed('Date', 'time')
)
sp500 = flint_context.read.dataframe(sp500)

The last line causes the "boom". Here is the first part of the stack trace:
TypeError Traceback (most recent call last)
~/.virtualenvs/pyspark-test/lib/python3.7/site-packages/ts/flint/java.py in new_reader(self)
     37         try:
---> 38             return utils.jvm(self.sc).com.twosigma.flint.timeseries.io.read.TSReadBuilder()
     39         except TypeError:
TypeError: 'JavaPackage' object is not callable
Any ideas what may be going wrong and how the problem could be solved?
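A quick check that may help narrow this down: py4j hands back a JavaPackage placeholder for any dotted name it cannot find on the JVM classpath, and calling that placeholder raises exactly this TypeError. The sketch below walks a dotted name and reports whether it resolved to a class. The helper name resolves_to_jvm_class is hypothetical, and the commented usage assumes a running session plus the private PySpark attribute spark.sparkContext._jvm. A False result would suggest, though not prove, that the Flint jar is missing from the Spark classpath.

```python
def resolves_to_jvm_class(jvm, dotted_name):
    """Report whether a dotted JVM name resolves to a class on a py4j JVM view.

    Hypothetical helper, not part of Flint or PySpark. py4j returns a
    JavaPackage placeholder for names it cannot find on the classpath.
    """
    obj = jvm
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    # Compare by type name so the helper itself does not need to import py4j.
    return type(obj).__name__ != "JavaPackage"


# Usage sketch (assumes an existing SparkSession named `spark`;
# `_jvm` is a private attribute and may change between versions):
#
# ok = resolves_to_jvm_class(
#     spark.sparkContext._jvm,
#     "com.twosigma.flint.timeseries.io.read.TSReadBuilder",
# )
# print(ok)
```

If the check reports that the name is unresolved, the next thing to look at would be how the Flint jar is passed to Spark when the context is created.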