Installing Apache Spark from source

1. Introduction

I will show how to install Apache Spark 3.0.1 from source.

2. Install procedure

(1) Install Java 8

sudo yum install java-1.8.0-openjdk
sudo yum install java-1.8.0-openjdk-devel
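
Before moving on, it can help to confirm which Java version is active in the shell. The following is only a minimal sketch: the helper name java_major_version is hypothetical (not part of any package), and it assumes the version string has the usual form printed by java -version, such as 1.8.0_292 or 11.0.2.

```shell
# Hypothetical helper: turn a Java version string into its major version.
# "1.8.0_292" -> 8 (legacy 1.x scheme), "11.0.2" -> 11.
java_major_version() {
  ver="$1"
  case "$ver" in
    1.*) ver="${ver#1.}" ;;   # drop the legacy "1." prefix
  esac
  echo "${ver%%[._]*}"        # keep what precedes the first "." or "_"
}

# Usage: read the version from the installed JDK, if there is one.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "Java major version: $(java_major_version "$raw")"
fi
```

After step (1) the reported major version should be 8, provided no other JDK takes precedence on the PATH.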

(2) Install Maven 3.6.3, a Java build tool

sudo tar xf ./apache-maven-3.6.3-bin.tar.gz -C /opt
sudo ln -s /opt/apache-maven-3.6.3 /opt/maven

(3) Set environment variables for Maven

sudo vi /etc/profile.d/
–add the following lines
export JAVA_HOME=/usr/lib/jvm/jre-openjdk
export M2_HOME=/opt/maven
export MAVEN_HOME=/opt/maven
export PATH=${M2_HOME}/bin:${PATH}

sudo chmod +x /etc/profile.d/
source /etc/profile.d/
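
To check that the settings from step (3) took effect in the current shell, something like the sketch below could be used. The helper name path_has_dir is hypothetical, and the fallback value /opt/maven is only an assumption taken from the symlink created in step (2).

```shell
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current PATH).
path_has_dir() {
  case ":${2:-$PATH}:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage, assuming the exports from step (3) were sourced in this shell.
if path_has_dir "${M2_HOME:-/opt/maven}/bin"; then
  echo "Maven bin directory is on PATH"
else
  echo "Maven bin directory is NOT on PATH"
fi
```

If the second message appears, the profile script was probably not sourced in this shell session.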

(4) Install Scala 2.12

sudo tar xf ./scala-2.12.13.tgz -C /usr/local
sudo ln -s /usr/local/scala-2.12.13 /usr/local/scala
vi ~/.bashrc
–add the following lines
export SCALA_HOME=/usr/local/scala

source ~/.bashrc

(5) Install Spark 3.0.1

tar xf ./spark-3.0.1.tgz
cd ./spark-3.0.1
./build/mvn -DskipTests clean package
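
Once Maven reports a successful build, a quick sanity check is to count the jars it assembled. This is a rough sketch under assumptions: the helper name count_jars is hypothetical, and the directory assembly/target/scala-2.12/jars is where I would expect the jars with the default Scala 2.12 build, so adjust the path if your tree differs.

```shell
# Hypothetical helper: count the *.jar files directly inside directory $1.
# Prints 0 when the directory does not exist.
count_jars() {
  find "$1" -maxdepth 1 -type f -name '*.jar' 2>/dev/null | wc -l | tr -d ' '
}

# Assumption: with the default Scala 2.12 build the assembled jars are
# placed under assembly/target/scala-2.12/jars in the source tree.
jars_dir="./assembly/target/scala-2.12/jars"
echo "jars found in ${jars_dir}: $(count_jars "$jars_dir")"
```

A count of 0 suggests that the build did not finish or that the jars were written somewhere else.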

(6) Verify that the installation succeeded

./bin/run-example SparkPi 10
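
The SparkPi example prints a line starting with "Pi is roughly" among the log messages. To pick out just that result, a small filter like the one below may be handy. The helper name extract_pi is hypothetical, and the sketch assumes the result line has exactly that prefix.

```shell
# Hypothetical helper: read run-example output on stdin and print the
# estimated value of Pi; returns non-zero if no result line is present.
extract_pi() {
  sed -n 's/^Pi is roughly \([0-9.]*\).*$/\1/p' | grep .
}

# Usage (run from the Spark source directory after the build):
#   ./bin/run-example SparkPi 10 | extract_pi
```

A value close to 3.14 means that the build can run a job locally.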

3. References

[1] Spark Standalone Mode

[2] Building Spark

[3] Installing Apache Maven on CentOS 7

Published by ktke109
