This repository was archived by the owner on Apr 27, 2022. It is now read-only.

PySpark cannot find Python #3

@jest

The python3 APK package installs only the /usr/bin/python3 binary, but by default PySpark looks for a binary named python in PATH. This results in a kernel error when enabling Spark in a notebook:

2019-11-19T10:55:15.696952916Z /usr/bin/find-spark-home: line 40: python: command not found
2019-11-19T10:55:15.697407078Z /usr/bin/spark-submit: line 27: /bin/spark-class: No such file or directory

I see two possible solutions to this problem, both in the Dockerfile:

  1. Either link python to python3:
    RUN cd /usr/bin && ln -s python3 python
    
  2. or set the PYSPARK_PYTHON variable (not tested, as per https://stackoverflow.com/questions/30279783/apache-spark-how-to-use-pyspark-with-python-3):
    ENV PYSPARK_PYTHON python3
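
Until the image is fixed, the second option could also be tried from inside the notebook. The following is only a minimal sketch, untested against this image: the helper names `pick_pyspark_python` and `configure_pyspark_python` are hypothetical, and it assumes the variable is set before any Spark session or context is started, since the launcher scripts read it at startup.

```python
import os
import shutil
import sys


def pick_pyspark_python(candidates=("python", "python3")):
    """Return the full path of the first candidate found on PATH.

    Falls back to the interpreter running this code if none is found.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return sys.executable


def configure_pyspark_python(env=os.environ):
    """Set PYSPARK_PYTHON unless it is already set; return the value in effect.

    An existing value (for example one set via ENV in the Dockerfile)
    is left untouched.
    """
    env.setdefault("PYSPARK_PYTHON", pick_pyspark_python())
    return env["PYSPARK_PYTHON"]
```

Calling `configure_pyspark_python()` in the first notebook cell, before importing or starting Spark, would point PySpark at python3 when no python binary exists. Fixing the Dockerfile remains the cleaner option, because it applies to every kernel without per-notebook code.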
    
