Wednesday, March 10, 2021

Understanding RAPIDS Accelerator For Apache Spark parameter -- spark.rapids.memory.gpu.allocFraction and GPU pool related ones.

Goal:

This article explains the RAPIDS Accelerator For Apache Spark parameter -- spark.rapids.memory.gpu.allocFraction and other GPU memory pool related ones: spark.rapids.memory.gpu.maxAllocFraction, spark.rapids.memory.gpu.reserve, spark.rapids.memory.gpu.debug and spark.rapids.memory.gpu.pool.

Env:

Spark 3.1.1

RAPIDS Accelerator For Apache Spark 0.4

Quadro RTX 6000 with 24G memory

Solution:

1. Concept

As per the configuration guide, spark.rapids.memory.gpu.pooling.enabled is DEPRECATED and we should use spark.rapids.memory.gpu.pool to switch the GPU memory pooling feature on or off, and also to choose which RMM (RAPIDS Memory Manager) pooling allocator to use.

  • ARENA: rmm::mr::arena_memory_resource
  • DEFAULT: rmm::mr::pool_memory_resource
  • NONE: Turn off pooling, and RMM just passes through to CUDA memory allocation directly

Even though the value "DEFAULT" could be confusing (it selects the pool allocator, not the default behavior), as of now we would recommend "ARENA".

To learn more about RMM, the blog post "Fast, Flexible Allocation for NVIDIA CUDA with RAPIDS Memory Manager" is a good starting point.

If you want to dig into the source code of RMM, here it is: https://github.com/rapidsai/rmm .

In this article, I will use ARENA for all the tests below.

After GPU memory pooling is enabled, the following 3 parameters control how much memory will be pooled:

  • spark.rapids.memory.gpu.allocFraction: The fraction of total GPU memory that should be initially allocated for pooled memory. Default 0.9.
  • spark.rapids.memory.gpu.maxAllocFraction: The fraction of total GPU memory that limits the maximum size of the RMM pool. Default 1.0.
  • spark.rapids.memory.gpu.reserve: The amount of GPU memory that should remain unallocated by RMM and left for system use, such as memory needed for kernels, kernel launches or JIT compilation. Default 1g.

In short, with the default settings 90% of the GPU memory will be pooled initially, and the pool cannot grow beyond 100% of the GPU memory minus the 1g reserve.
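The sizing rule can be sketched with some quick arithmetic. Below is a simplified Python model (not the plugin's actual code) using the total memory of the Quadro RTX 6000 in this environment, as reported later in the executor logs:

```python
def pool_sizes(total_mb, alloc_fraction=0.9, max_alloc_fraction=1.0,
               reserve_mb=1024.0):
    """Simplified model of how the RMM pool sizes are derived.

    Returns (initial_mb, max_mb). The reserve is carved out of the
    maximum allocation, and the initial allocation is lowered to the
    adjusted maximum if it would exceed it.
    """
    initial = total_mb * alloc_fraction
    max_alloc = total_mb * max_alloc_fraction - reserve_mb
    return min(initial, max_alloc), max_alloc

# 24220.3125 MB total GPU memory, all defaults:
initial, max_alloc = pool_sizes(24220.3125)
print(initial, max_alloc)  # matches the "Initializing RMM ARENA" log line below
```

With the defaults, this yields an initial size of about 21798.28 MB and a maximum of about 23196.31 MB, which is exactly what the default test case below reports.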

Finally, there is another parameter, spark.rapids.memory.gpu.debug, which can be used to enable debug logging to STDOUT or STDERR. Default is NONE.

2. Test

In the tests below, I keep spark.rapids.memory.gpu.maxAllocFraction at its default of 1.0 and vary spark.rapids.memory.gpu.allocFraction and spark.rapids.memory.gpu.reserve, while monitoring the executor logs and nvidia-smi output after "spark-shell" is launched with only 1 executor on a single node.

a. Default 

spark.rapids.memory.gpu.allocFraction 0.9 (default)
spark.rapids.memory.gpu.reserve 1073741824 (default)

GPU memory utilization:

utilization.gpu [%], utilization.memory [%], memory.total [MiB], memory.free [MiB], memory.used [MiB]
0 %, 0 %, 24220 MiB, 24209 MiB, 11 MiB
4 %, 0 %, 24220 MiB, 23801 MiB, 419 MiB
3 %, 0 %, 24220 MiB, 1719 MiB, 22501 MiB
0 %, 0 %, 24220 MiB, 1719 MiB, 22501 MiB
0 %, 0 %, 24220 MiB, 1693 MiB, 22527 MiB

Executor Log:

21/03/10 10:42:25 INFO RapidsExecutorPlugin: Initializing memory from Executor Plugin
21/03/10 10:42:30 INFO GpuDeviceManager: Initializing RMM ARENA initial size = 21798.28125 MB, max size = 23196.3125 MB on gpuId 0

b. Increased spark.rapids.memory.gpu.allocFraction from 0.9 to 0.99

spark.rapids.memory.gpu.allocFraction=0.99
spark.rapids.memory.gpu.reserve 1073741824 (default)

GPU memory utilization:

utilization.gpu [%], utilization.memory [%], memory.total [MiB], memory.free [MiB], memory.used [MiB]
0 %, 0 %, 24220 MiB, 24209 MiB, 11 MiB
0 %, 0 %, 24220 MiB, 24161 MiB, 59 MiB
3 %, 0 %, 24220 MiB, 23723 MiB, 497 MiB
0 %, 0 %, 24220 MiB, 321 MiB, 23899 MiB
0 %, 0 %, 24220 MiB, 321 MiB, 23899 MiB
0 %, 0 %, 24220 MiB, 321 MiB, 23899 MiB
0 %, 0 %, 24220 MiB, 297 MiB, 23923 MiB

Executor Log: 

21/03/10 10:46:54 INFO RapidsExecutorPlugin: Initializing memory from Executor Plugin
21/03/10 10:46:59 WARN GpuDeviceManager: Initial RMM allocation (23978.109375 MB) is larger than free memory (23519.3125 MB)
21/03/10 10:46:59 WARN GpuDeviceManager: Initial RMM allocation (23978.109375 MB) is larger than the adjusted maximum allocation (23196.3125 MB), lowering initial allocation to the adjusted maximum allocation.
21/03/10 10:46:59 INFO GpuDeviceManager: Initializing RMM ARENA initial size = 23196.3125 MB, max size = 23196.3125 MB on gpuId 0
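The clamping behavior described by these WARN messages can be reproduced with simple arithmetic (again a simplified model, not the plugin's own code):

```python
total_mb = 24220.3125    # Quadro RTX 6000, as reported in the logs
reserve_mb = 1024.0      # default spark.rapids.memory.gpu.reserve (1g)

requested_initial = total_mb * 0.99          # allocFraction = 0.99
adjusted_max = total_mb * 1.0 - reserve_mb   # maxAllocFraction = 1.0, minus reserve

# The requested initial allocation exceeds the adjusted maximum,
# so it is lowered to the adjusted maximum, as the warnings state.
initial = min(requested_initial, adjusted_max)
print(requested_initial)  # ~23978.11 MB, as in the WARN lines
print(initial)            # ~23196.31 MB, as in the INFO line
```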

c. Increased spark.rapids.memory.gpu.allocFraction from 0.9 to 0.99 and also spark.rapids.memory.gpu.reserve from 1g to 2g

spark.rapids.memory.gpu.allocFraction=0.99
spark.rapids.memory.gpu.reserve 2147483648

GPU memory utilization:

utilization.gpu [%], utilization.memory [%], memory.total [MiB], memory.free [MiB], memory.used [MiB]
0 %, 0 %, 24220 MiB, 24209 MiB, 11 MiB
4 %, 0 %, 24220 MiB, 24041 MiB, 179 MiB
5 %, 0 %, 24220 MiB, 23711 MiB, 509 MiB
0 %, 0 %, 24220 MiB, 1345 MiB, 22875 MiB
0 %, 0 %, 24220 MiB, 1345 MiB, 22875 MiB
0 %, 0 %, 24220 MiB, 1345 MiB, 22875 MiB
0 %, 0 %, 24220 MiB, 1321 MiB, 22899 MiB

Executor Log: 

21/03/10 10:49:49 INFO RapidsExecutorPlugin: Initializing memory from Executor Plugin
21/03/10 10:49:54 WARN GpuDeviceManager: Initial RMM allocation (23978.109375 MB) is larger than free memory (23519.3125 MB)
21/03/10 10:49:54 WARN GpuDeviceManager: Initial RMM allocation (23978.109375 MB) is larger than the adjusted maximum allocation (22172.3125 MB), lowering initial allocation to the adjusted maximum allocation.
21/03/10 10:49:54 INFO GpuDeviceManager: Initializing RMM ARENA initial size = 22172.3125 MB, max size = 22172.3125 MB on gpuId 0

d. Disable GPU memory pool

spark.rapids.memory.gpu.pool NONE

GPU memory utilization:

utilization.gpu [%], utilization.memory [%], memory.total [MiB], memory.free [MiB], memory.used [MiB]
0 %, 0 %, 24220 MiB, 24209 MiB, 11 MiB
3 %, 0 %, 24220 MiB, 23891 MiB, 329 MiB
5 %, 0 %, 24220 MiB, 23567 MiB, 653 MiB
0 %, 0 %, 24220 MiB, 23519 MiB, 701 MiB
0 %, 0 %, 24220 MiB, 23519 MiB, 701 MiB
1 %, 0 %, 24220 MiB, 23495 MiB, 725 MiB

Executor Log:

21/03/10 12:03:07 INFO RapidsExecutorPlugin: Initializing memory from Executor Plugin
21/03/10 12:03:12 INFO GpuDeviceManager: Initializing RMM initial size = 21798.28125 MB, max size = 0.0 MB on gpuId 0

e. Enable DEBUG

spark.rapids.memory.gpu.debug STDOUT

stdout:

$  tail -100f stdout
Thread,Time,Action,Pointer,Size,Stream
15129,11:04:56:292725,allocate,0x7f7192600000,18480,0x0
15129,11:04:56:293529,allocate,0x7f7140000000,50686648,0x0
15129,11:04:56:317040,allocate,0x7f7143200000,14174424,0x0
15129,11:04:56:319691,allocate,0x7f7192800000,13951936,0x0
15129,11:04:56:321843,allocate,0x7f713e000000,13936328,0x0
15129,11:04:56:323874,allocate,0x7f713ee00000,13929272,0x0
15129,11:04:56:325937,allocate,0x7f7192604a00,26432,0x0
15129,11:04:56:326309,allocate,0x7f7134000000,139910792,0x0
15129,11:04:56:326346,allocate,0x7f719260b200,13216,0x0
15129,11:04:56:326371,allocate,0x7f719260e600,6608,0x0
15129,11:04:56:370310,free,0x7f719260e600,6608,0x0
15129,11:04:56:370327,free,0x7f719260b200,13216,0x0
15129,11:04:56:370335,free,0x7f7140000000,50686648,0x0
15129,11:04:56:370490,free,0x7f7143200000,14174424,0x0
15129,11:04:56:371885,free,0x7f7192800000,13951936,0x0
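Since the debug output is CSV-formatted (Thread,Time,Action,Pointer,Size,Stream), it is easy to post-process. Here is a small Python sketch that sums the allocations not yet freed; the helper name outstanding_bytes is my own, not part of the plugin:

```python
import csv
import io

# A few sample lines in the debug log format shown above.
sample = """Thread,Time,Action,Pointer,Size,Stream
15129,11:04:56:292725,allocate,0x7f7192600000,18480,0x0
15129,11:04:56:293529,allocate,0x7f7140000000,50686648,0x0
15129,11:04:56:370335,free,0x7f7140000000,50686648,0x0
"""

def outstanding_bytes(log_text):
    """Sum the sizes of allocations that have not been freed yet."""
    live = {}
    for row in csv.DictReader(io.StringIO(log_text)):
        if row["Action"] == "allocate":
            live[row["Pointer"]] = int(row["Size"])
        elif row["Action"] == "free":
            live.pop(row["Pointer"], None)
    return sum(live.values())

print(outstanding_bytes(sample))  # 18480 -- only the first allocation is still live
```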

3. Key takeaways

Allocating memory on a GPU can be an expensive operation, so it is recommended to use the GPU memory pool feature.

The DEBUG log is useful because it shows each allocate/free action.

