Tag Archives: pyspark
Run PySpark on your Windows machine
1) Download the Spark distribution to your local machine and decompress the archive. Then set the SPARK_HOME and HADOOP_HOME environment variables to point to the decompressed folder, for example: C:\Users\some_user\PycharmProjects\spark-2.4.4-bin-hadoop2.7. Also find the winutils executable online and put it in the Spark bin folder.
2) Install the Java JDK if you do not […]
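As a rough illustration of step 1, here is a minimal Python sketch that sets both variables for the current process and checks whether winutils is in place. The helper names `configure_spark_env` and `has_winutils` are made up for this example, and the sketch assumes the executable is named `winutils.exe` and sits in the `bin` subfolder. Note that it only affects the running Python process, not the system-wide environment settings.

```python
import os
from pathlib import Path


def configure_spark_env(spark_dir, env=None):
    """Point SPARK_HOME and HADOOP_HOME at the decompressed Spark folder.

    Hypothetical helper: `env` defaults to os.environ, so the change is
    visible only to the current process and its children.
    """
    env = os.environ if env is None else env
    spark_dir = str(spark_dir)
    env["SPARK_HOME"] = spark_dir
    env["HADOOP_HOME"] = spark_dir
    return env


def has_winutils(spark_dir):
    """Return True if winutils.exe exists in the Spark bin folder (assumed name)."""
    return (Path(spark_dir) / "bin" / "winutils.exe").is_file()
```

You would call `configure_spark_env(...)` with your own decompressed folder before creating a Spark session, and use `has_winutils(...)` as a quick sanity check.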
java-jdk in pyspark project
A PySpark project that runs locally requires the JAVA_HOME environment variable to be set. If you’re using conda or anaconda-project to manage packages, you do not need to install the bloated Oracle Java JDK; just add the java-jdk package from the bioconda (Linux) or cyclus (Linux and Windows) channel and point the JAVA_HOME property to the bin […]
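To see whether JAVA_HOME is actually visible to your Python process before starting Spark, something like the sketch below can help. The function name `java_home_status` is hypothetical, and the check only verifies that the variable is set and points to an existing directory; it does not validate the Java installation itself.

```python
import os


def java_home_status(env=None):
    """Describe the state of JAVA_HOME in the given environment mapping.

    Hypothetical helper: `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    if not os.path.isdir(java_home):
        return f"JAVA_HOME points to a missing directory: {java_home}"
    return f"JAVA_HOME is set to {java_home}"
```

Printing `java_home_status()` at the top of a script is a cheap way to catch a missing or misconfigured variable early.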