Tuesday, April 20, 2021

How to use latest version of Rapids Accelerator for Spark on EMR

Goal:

This article shows how to use latest version of Rapids Accelerator for Spark on EMR. 

Currently the latest EMR 6.2 only ships with Rapids Accelerator 0.2.0 with cuDF 0.15 jar.

However as of today, the latest Rapids Accelerator is 0.4.1 with cuDF 0.18 jar.

Note: This is NOT official steps on enabling rapids+Spark on EMR, but just some technical research.

Env:

EMR 6.2

Concept:

As per EMR Doc on Using the Nvidia Spark-RAPIDS Accelerator for Spark, it provides an option "enableSparkRapids":"true" in the configuration file when creating EMR.

Basically before we look for the solution to use latest version of Rapids Accelerator for Spark, we need to understand what does this option do. 

As per my tests on EMR 6.2, this option will do below stuff:

1. Put the Rapids Accelerator 0.2.0 jar and cuDF 0.15 jar in below location with soft links

/usr/lib/spark/jars/rapids-4-spark_2.12-0.2.0.jar -> /usr/share/aws/emr/spark-rapids/lib/rapids-4-spark_2.12-0.2.0.jar
/usr/lib/spark/jars/cudf-0.15-cuda10-1.jar -> /usr/share/aws/emr/spark-rapids/lib/cudf-0.15-cuda10-1.jar

2. Put the getGpusResources.sh and xgboost4j-spark_3.0-1.0.0-0.2.0.jar

/usr/lib/spark/jars/xgboost4j-spark_3.0-1.0.0-0.2.0.jar
/usr/lib/spark/scripts/gpu/getGpusResources.sh

Now here is another action item which is done regardless of the option(event when enableSparkRapids":"false"):

3.  Install the CUDA toolkit 10.1 with the soft link /usr/local/cuda pointing to it.

/usr/local/cuda -> /mnt/nvidia/cuda-10.1

After knowing all of above, then we may think of how about using bootstrap actions to change those jars, and install a newer version of CUDA toolkit say 11.0? 

The answer is no.   This is because unfortunately, our bootstrap action script will run BEFORE above steps. 

It is like above steps are 2nd bootstrap actions.

Even if we used bootstrap actions script to replace above jars with the latest, and also install the latest CUDA toolkit 11.0 which can change the soft link /usr/local/cuda to point to cuda-11.0, eventually you will see 2 versions of Rapids Accelerator and cuDF jars in the same location, and also the the /usr/local/cuda will be changed back to point to cuda-10.1.

Solution:

The solution is to disable the option to set it false in configuration: "enableSparkRapids":"false".

Since we already know what this option does, we just need to use bootstrap actions to mimic the same thing(of course, using all latest&greatest versions).

1. Install CUDA Toolkit 11.0 and cuda-compat-11-0 

We can not just simply install CUDA Toolkit 11.0 because the nvidia driver installed on EMR 6.2 is R418. To make CUDA Toolkit 11.0 running on the R418 driver, as per the CUDA compatibility matrix, the minimum required driver version is >= 450.36.06. 

To make CUDA Toolkit 11.0 work on lower version of driver(forward compatible), we need to install a package named "cuda-compat".

We need to firstly know which commands to install this version by going to this CUDA download page.

Then how could we know the OS version on EMR? EMR has its own customized linux OS "Amazon Linux 2":

# cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"

To figure out which package is compatible, we can get the base OS version by using this command:

rpm -E %{rhel}

Above will tell you it is redhat 7 based or compatible. Then we know which OS version to choose.

Below commands are what we need:

sudo yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
sudo yum clean all
sudo yum -y install cuda-toolkit-11-0
sudo yum -y install cuda-compat-11-0

2. Fetch the Rapids Accelerator jar and cuDF jar

You can always fetch the latest versions(or whatever version you want) by going to this download page

Save the URLs for those 2 jars. Or you can choose to download them firstly and upload on a S3 bucket.

In below example, I will fetch one jar directly from a URL, and fetch another jar from S3 bucket.

3. Fetch the xgboost4j-spark jar

 For spark 3.0, the latest jar can be downloaded here.

https://repo1.maven.org/maven2/com/nvidia/xgboost4j-spark_3.0/

As of today, the latest version is:

https://repo1.maven.org/maven2/com/nvidia/xgboost4j-spark_3.0/1.3.0-0.1.0/xgboost4j-spark_3.0-1.3.0-0.1.0.jar

Save this link.

4. Fetch the getGpusResources.sh

Basically this file exist in Spark directory as well, but sometimes we do not know if our bootstrap script or some other EMR internal bootstrap script will run firstly.

It is better to always choose a stable link. Here let's use below link:

https://raw.githubusercontent.com/apache/spark/master/examples/src/main/scripts/getGpusResources.sh

5. Prepare a bootstrap action script

Sample script named bootstrap-install-cuda-compat-11.sh:

#!/bin/bash

set -ex

sudo chmod a+rwx -R /sys/fs/cgroup/cpu,cpuacct
sudo chmod a+rwx -R /sys/fs/cgroup/devices

echo "Install the cuda-compat-11-0"
sudo yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
sudo yum clean all
sudo yum -y install cuda-toolkit-11-0
sudo yum -y install cuda-compat-11-0
sudo rm -f /usr/lib/spark/jars/rapids-4-spark_2.12-0.2.0.jar
sudo rm -f /usr/share/aws/emr/spark-rapids/lib/rapids-4-spark_2.12-0.2.0.jar
sudo rm -f /usr/lib/spark/jars/cudf-0.15-cuda10-1.jar
sudo rm -f /usr/share/aws/emr/spark-rapids/lib/cudf-0.15-cuda10-1.jar
sudo mkdir -p /usr/share/aws/emr/spark-rapids/lib/
sudo mkdir -p /usr/lib/spark/jars/
sudo wget https://xxx/cudf-<version>.jar -O /usr/share/aws/emr/spark-rapids/lib/cudf-<version>.jar
sudo ln -s /usr/share/aws/emr/spark-rapids/lib/cudf-<version>.jar /usr/lib/spark/jars/cudf-<version>.jar
sudo aws s3 cp s3://<BUCKET-NAME>/rapids-4-spark_<version>.jar /usr/share/aws/emr/spark-rapids/lib/rapids-4-spark_<version>.jar
sudo ln -s /usr/share/aws/emr/spark-rapids/lib/rapids-4-spark_<version>.jar /usr/lib/spark/jars/rapids-4-spark_<version>.jar
sudo wget https://repo1.maven.org/maven2/com/nvidia/xgboost4j-spark_3.0/1.3.0-0.1.0/xgboost4j-spark_3.0-1.3.0-0.1.0.jar -O /usr/lib/spark/jars/xgboost4j-spark_3.0-1.3.0-0.1.0.jar
sudo mkdir -p /usr/lib/spark/scripts/gpu/
sudo wget https://raw.githubusercontent.com/apache/spark/master/examples/src/main/scripts/getGpusResources.sh -O /usr/lib/spark/scripts/gpu/getGpusResources.sh
sudo chmod +x /usr/lib/spark/scripts/gpu/getGpusResources.sh
sudo alternatives --set java /usr/lib/jvm/java-11-amazon-corretto.x86_64/bin/java

Of course, you can make above shell script more robust by adding more checks but this is just a simplest demo.

I can find many other EMR bootstrap action scripts in this github which you can refer to.

And then copy the above bootstrap actions script on S3 bucket:

chmod +x bootstrap-install-cuda-compat-11.sh
aws s3 cp bootstrap-install-cuda-compat-11.sh s3://BUCKET-NAME/bootstrap-install-cuda-compat-11.sh

6. Prepare a configuration file

Say the name is EMR_java11_custom_bootstrap.json:

[
{
"Classification": "spark",
"Properties": {
"enableSparkRapids": "false"
},
"Configurations": []
},
{
"Classification": "yarn-site",
"Properties": {
"yarn.nodemanager.linux-container-executor.cgroups.mount": "true",
"yarn.nodemanager.linux-container-executor.cgroups.mount-path": "/sys/fs/cgroup",
"yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables": "/usr/bin",
"yarn.nodemanager.linux-container-executor.cgroups.hierarchy": "yarn",
"yarn.nodemanager.container-executor.class": "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
"yarn.resource-types": "yarn.io/gpu",
"yarn.nodemanager.resource-plugins": "yarn.io/gpu",
"yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices": "auto"
},
"Configurations": []
},
{
"Classification": "container-executor",
"Properties": {},
"Configurations": [
{
"Classification": "gpu",
"Properties": {
"module.enabled": "true"
},
"Configurations": []
},
{
"Classification": "cgroups",
"Properties": {
"root": "/sys/fs/cgroup",
"yarn-hierarchy": "yarn"
},
"Configurations": []
}
]
},
{
"Classification": "spark-defaults",
"Properties": {
"spark.task.cpus ": "1",
"spark.rapids.sql.explain": "ALL",
"spark.submit.pyFiles": "/usr/lib/spark/jars/xgboost4j-spark_3.0-1.3.0-0.1.0.jar",
"spark.executor.extraLibraryPath": "/usr/local/cuda-11.0/targets/x86_64-linux/lib:/usr/local/cuda-11.0/extras/CUPTI/lib64:/usr/local/cuda-11.0/compat/:/usr/local/cuda-11.0/lib:/usr/local/cuda-11.0/lib64:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native",
"spark.plugins": "com.nvidia.spark.SQLPlugin",
"spark.executor.cores": "1",
"spark.sql.files.maxPartitionBytes": "512m",
"spark.executor.resource.gpu.discoveryScript": "/usr/lib/spark/scripts/gpu/getGpusResources.sh",
"spark.sql.shuffle.partitions": "200",
"spark.executor.defaultJavaOptions": "-XX:+IgnoreUnrecognizedVMOptions",
"spark.task.resource.gpu.amount": "0.0625",
"spark.rapids.memory.pinnedPool.size": "2G",
"spark.executor.resource.gpu.amount": "1",
"spark.rapids.sql.enabled": "true",
"spark.sql.adaptive.enabled": "false",
"spark.locality.wait": "0s",
"spark.sql.sources.useV1SourceList": "",
"spark.executor.memoryOverhead": "2G",
"spark.driver.defaultJavaOptions": "-XX:+IgnoreUnrecognizedVMOptions",
"spark.rapids.sql.concurrentGpuTasks": "1"
},
"Configurations": []
},
{
"Classification": "capacity-scheduler",
"Properties": {
"yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator"
},
"Configurations": []
},
{
"Classification": "spark-env",
"Properties": {},
"Configurations": [
{
"Classification": "export",
"Properties": {
"JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64/"
},
"Configurations": []
}
]
}
]

Note: in above configuration file, we specified the /usr/local/cuda-11.0 in "spark.executor.extraLibraryPath" because the soft link /usr/local/cuda is still pointing to old cuda-10.1.

Note: /usr/local/cuda-11.0/compat/ contains the libs from cuda-compat-11-0 we installed earlier.

7. Start the EMR cluster using CLI

aws emr create-cluster \
--release-label emr-6.2.0 \
--applications Name=Hadoop Name=Spark Name=Livy Name=JupyterEnterpriseGateway \
--service-role EMR_DefaultRole \
--ec2-attributes KeyName=hao-emr,InstanceProfile=EMR_EC2_DefaultRole \
--instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.4xlarge \
InstanceGroupType=CORE,InstanceCount=1,InstanceType=g4dn.2xlarge \
InstanceGroupType=TASK,InstanceCount=1,InstanceType=g4dn.2xlarge \
--configurations file:///xxx/EMR_java11_custom_bootstrap.json \
--bootstrap-actions Name='My Spark Rapids Bootstrap action',Path=s3://BUCKET-NAME/bootstrap-install-cuda-compat-11.sh \
--ebs-root-volume-size 100

Note: EBS root value size should be increased from default 10G to larger to avoid running out of disk space when installing packages using yum.

8. Monitor the bootstrap process

Normally master node will be ready first. So SSH on master node, and find the bootstrap actions' logs here: /mnt/var/log/bootstrap-actions

9. Test

Once all nodes are ready, run below in spark-shell from master node to make sure the GPU plan is shown:

val data = 1 to 100
val df1 = sc.parallelize(data).toDF()
val df2 = sc.parallelize(data).toDF()
val out = df1.as("df1").join(df2.as("df2"), $"df1.value" === $"df2.value")
out.count()
out.explain()

10. Delete the EMR cluster once tests are done.

aws emr terminate-clusters --cluster-ids j-xxxxxxxxxxx

Common issues

1.  ERROR NativeDepsLoader: Could not load cudf jni library...

Below errors and stack trace show in Spark executor logs when launching spark-shell:

Caused by: java.util.concurrent.ExecutionException: java.lang.UnsatisfiedLinkError: /mnt/yarn/usercache/hadoop/appcache/application_xxx_xxx/container_xxx_xxx_01_xxxxx/tmp/nvcomp4429409488498215695.so: libcudart.so.11.0: cannot open shared object file: No such file or directory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at ai.rapids.cudf.NativeDepsLoader.loadNativeDeps(NativeDepsLoader.java:167)
... 34 more
Caused by: java.lang.UnsatisfiedLinkError: /mnt/yarn/usercache/hadoop/appcache/application_xxx_xxx/container_xxx_xxx_01_xxxxx/tmp/nvcomp4429409488498215695.so: libcudart.so.11.0: cannot open shared object file: No such file or directory
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1934)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1817)
at java.lang.Runtime.load0(Runtime.java:810)
at java.lang.System.load(System.java:1088)
at ai.rapids.cudf.NativeDepsLoader.loadDep(NativeDepsLoader.java:184)
at ai.rapids.cudf.NativeDepsLoader.loadDep(NativeDepsLoader.java:198)
at ai.rapids.cudf.NativeDepsLoader.lambda$loadNativeDeps$1(NativeDepsLoader.java:161)
... 5 more

Make sure the CUDA Toolkit 11.0 is installed and is set in spark.executor.extraLibraryPath of configuration file.

2. ai.rapids.cudf.CudaException: CUDA driver version is insufficient for CUDA runtime version

Below errors and stack trace show in Spark executor logs when launching spark-shell:

ai.rapids.cudf.CudaException: CUDA driver version is insufficient for CUDA runtime version
at ai.rapids.cudf.Cuda.setDevice(Native Method)
at com.nvidia.spark.rapids.GpuDeviceManager$.setGpuDeviceAndAcquire(GpuDeviceManager.scala:95)
at com.nvidia.spark.rapids.GpuDeviceManager$.$anonfun$initializeGpu$1(GpuDeviceManager.scala:122)
at scala.runtime.java8.JFunction1$mcII$sp.apply(JFunction1$mcII$sp.java:23)
at scala.Option.map(Option.scala:230)
at com.nvidia.spark.rapids.GpuDeviceManager$.initializeGpu(GpuDeviceManager.scala:122)
at com.nvidia.spark.rapids.GpuDeviceManager$.initializeGpuAndMemory(GpuDeviceManager.scala:130)
at com.nvidia.spark.rapids.RapidsExecutorPlugin.init(Plugin.scala:168)
at org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:111)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
at org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:99)
at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:164)
at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:152)
at org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:220)
at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:221)
at org.apache.spark.executor.Executor.<init>(Executor.scala:220)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:168)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:203)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Make sure the cuda-compat-11-0 is installed and its location is set correctly in spark.executor.extraLibraryPath of configuration file.

 

No comments:

Post a Comment

Popular Posts