How to Use the YARN API to Determine Resources Available for Spark Application Submission (Part 3)

Welcome to the final section of this introduction to the YARN API.

Capacity Scheduler

The capacity scheduler is pretty straightforward. It assigns each user a percentage of the parent queue that they are allowed to use. The queue object corresponding to your user will have an “absoluteCapacity” field containing a double. This is the percentage of the entire cluster (both CPU and memory) that is available for jobs submitted by your user.

Thus you need to multiply this percentage by the total resources available on the cluster. These metrics can be retrieved from the “clusterMetrics” API, which contains both an “availableVirtualCores” and an “availableMB” field.

Finding total memory and total vCores available on the cluster
API Call: http://<rm http address:port>/ws/v1/cluster/metrics
Path to JSON: clusterMetrics → availableVirtualCores
→ availableMB

(total_vcores, total_memory) = http://<rm http address:port>/ws/v1/cluster/metrics → “clusterMetrics” → {“availableVirtualCores”, “availableMB”}
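
To make this concrete, here is a minimal Python sketch of that call using the requests library; the resource manager address is a placeholder you would fill in for your cluster:

    import requests

    # Placeholder: substitute your resource manager's HTTP address and port.
    RM = "http://<rm http address:port>"

    # The response nests everything under the "clusterMetrics" key.
    metrics = requests.get(RM + "/ws/v1/cluster/metrics").json()["clusterMetrics"]
    total_vcores = metrics["availableVirtualCores"]
    total_memory = metrics["availableMB"]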

Finding the percentage of resources available to your user if the scheduler type is capacity scheduler
API Call: http://<rm http address:port>/ws/v1/cluster/scheduler
Path to JSON: scheduler → schedulerInfo
→ rootQueue
→ queues
…find your queue
→ myUserName
→ absoluteCapacity
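
In code, walking that path might look like the sketch below. Queue layouts vary from cluster to cluster, so the recursive search and the example queue name (“myUserName”) are assumptions for illustration, not part of the YARN API itself:

    import requests

    RM = "http://<rm http address:port>"  # placeholder for your resource manager

    def find_queue(queue, name):
        # Recursively search the capacity scheduler queue tree for the named queue.
        if queue.get("queueName") == name:
            return queue
        for child in queue.get("queues", {}).get("queue", []):
            found = find_queue(child, name)
            if found is not None:
                return found
        return None

    scheduler_info = requests.get(RM + "/ws/v1/cluster/scheduler").json()["scheduler"]["schedulerInfo"]
    my_queue = find_queue(scheduler_info, "myUserName")  # assumption: queue named after the user
    # Note: absoluteCapacity may be reported as a percentage (e.g. 25.0) rather
    # than a fraction (0.25); scale it accordingly before using the equations below.
    absolute_capacity = my_queue["absoluteCapacity"]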

Then we can calculate the cores and memory available for our job with the following equations:

Available vcores for Spark Application = absoluteCapacity x availableVirtualCores

Available memory (MB) for Spark Application = absoluteCapacity x availableMB
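
For example, with purely hypothetical numbers: if your queue’s absoluteCapacity works out to 0.25 (a quarter of the cluster) and the cluster metrics report 128 availableVirtualCores and 512000 availableMB, your application has roughly 0.25 x 128 = 32 vcores and 0.25 x 512000 = 128000 MB to work with.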

Fair Scheduler

The fair scheduler is a little more complicated. Essentially, resources are divided equally among the active users on the cluster. If the scheduler type is the fairScheduler, then to get the resources available to our queue we have to query the queue object for the “maxResources” object, which contains a “vCores” and a “memory” field. These values represent the resources available on each node. Thus, to get the total resources available in our queue, we need to multiply each of these numbers by the number of active nodes on the cluster (which can be retrieved from the cluster metrics API).

Finding the number of active nodes on the cluster
API Call: http://<rm http address:port>/ws/v1/cluster/metrics
Path to JSON: clusterMetrics → activeNodes

Finding the per-node resources available to your user if the scheduler type is fair scheduler
API Call: http://<rm http address:port>/ws/v1/cluster/scheduler
Path to JSON: scheduler → schedulerInfo
→ rootQueue
→ queues
…find your queue
→ myUserName
→ maxResources
→ vCores
→ memory
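
Here is a sketch of the same lookup for the fair scheduler, again in Python. The search helper and the queue name are assumptions about your cluster’s layout, and some Hadoop versions nest the children under “childQueues” slightly differently, so the code below tries to handle both shapes:

    import requests

    RM = "http://<rm http address:port>"  # placeholder for your resource manager

    def find_queue(queue, name):
        # Recursively search the fair scheduler queue tree for the named queue.
        if queue.get("queueName") == name:
            return queue
        children = queue.get("childQueues", [])
        if isinstance(children, dict):  # some versions wrap the list in {"queue": [...]}
            children = children.get("queue", [])
        for child in children:
            found = find_queue(child, name)
            if found is not None:
                return found
        return None

    metrics = requests.get(RM + "/ws/v1/cluster/metrics").json()["clusterMetrics"]
    active_nodes = metrics["activeNodes"]

    root = requests.get(RM + "/ws/v1/cluster/scheduler").json()["scheduler"]["schedulerInfo"]["rootQueue"]
    my_queue = find_queue(root, "root.myUserName")  # assumption: user queue under root
    vcores_per_node = my_queue["maxResources"]["vCores"]
    memory_per_node = my_queue["maxResources"]["memory"]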

Then we can calculate the cores and memory available to our Spark application with the following equations:

Available vcores for Spark Application = activeNodes x vCores per node (see above)

Available memory (MB) for Spark Application = activeNodes x memory per node (see above)
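
Plugging in hypothetical numbers: with 10 active nodes and maxResources of 8 vCores and 16384 memory (MB) per node, the queue could use about 10 x 8 = 80 vcores and 10 x 16384 = 163840 MB in total.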

How To Calculate Memory Overhead

This calculation differs depending on whether you are running in yarn-client or yarn-cluster mode. In yarn-cluster mode, both the executor memory overhead and the driver memory overhead can be set manually in the conf with the “spark.yarn.executor.memoryOverhead” and “spark.yarn.driver.memoryOverhead” values, respectively. If these values are unset, the overhead can be calculated with the following equation:

memory overhead = max(MEMORY_OVERHEAD_FACTOR x requested memory, MEMORY_OVERHEAD_MINIMUM)

where MEMORY_OVERHEAD_FACTOR = 0.10 and MEMORY_OVERHEAD_MINIMUM = 384 (MB).
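
As a quick sketch, the same rule in Python, with a worked example in the comments:

    MEMORY_OVERHEAD_FACTOR = 0.10
    MEMORY_OVERHEAD_MINIMUM = 384  # MB

    def memory_overhead(requested_memory_mb):
        # Overhead YARN adds on top of the requested executor/driver memory, in MB.
        return max(MEMORY_OVERHEAD_FACTOR * requested_memory_mb, MEMORY_OVERHEAD_MINIMUM)

    # A 10240 MB (10 GB) request adds max(0.10 x 10240, 384) = 1024 MB of overhead,
    # so the container needs 11264 MB in total; a 2048 MB request falls back to the
    # 384 MB minimum.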

That does it for our three-part series on using the YARN API within Chorus. We’re constantly working to deliver new enterprise functionality, so check back with our blog to stay up to date on the latest features.
