Configure the Fair Scheduler to resolve application delays

Fair scheduling is a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time.

With the Fair Scheduler we can create a separate pool (queue) for each team and configure the resources available to that pool, which helps prevent application delays.

In the exam you may be asked to create a pool with min and max resources so that jobs submitted to the pool won’t be delayed.

Before we begin, ensure that the Fair Scheduler is selected as the YARN scheduler.

YARN – Configuration – search ‘scheduler’

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

Now go to the YARN – Resource Pools section.

Select Configuration and you’ll be redirected to Dynamic Resource Pool Configuration.

Click Create Resource Pool.

Here, provide the resource limits for the pool: its weight, the min/max cores and memory, and any other details the task gives you.

Set the scheduling policy, preemption, and so on, if the task specifies them.
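
For reference, the values entered on this page correspond to entries in the Fair Scheduler allocation file (fair-scheduler.xml). Below is a minimal sketch of such an entry, assuming a hypothetical pool named team_a with made-up limits. The pool name and every number here are illustrative only and do not come from an actual cluster.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the pool name "team_a" and all values are assumptions for illustration. -->
<allocations>
  <queue name="team_a">
    <!-- Minimum share the pool is entitled to when it has demand -->
    <minResources>4096 mb,4 vcores</minResources>
    <!-- Upper limit the pool is allowed to grow to -->
    <maxResources>16384 mb,16 vcores</maxResources>
    <!-- Share relative to sibling pools; the default weight is 1.0 -->
    <weight>2.0</weight>
    <!-- One of fifo, fair or drf (Dominant Resource Fairness) -->
    <schedulingPolicy>drf</schedulingPolicy>
  </queue>
</allocations>
```

On a cluster managed through the Dynamic Resource Pool Configuration page you would normally set these values in the UI rather than edit the file by hand. The sketch is only meant to show what each field controls.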

Once the pool is configured, you can submit jobs to it by passing the queue name as a parameter on the command line.

# hadoop jar mapreduce.jar -D mapreduce.job.queuename=<pool name>

Problem Scenarios:

· Create a resource pool with given min and max resources.

· Create a pool with given resources and set the scheduling policy to DRF (Dominant Resource Fairness).

That covers how to use the Fair Scheduler to create resource pools that resolve application delays.

Use the comments section below to post your doubts, questions and feedback.

Please follow my blog to get notified of more certification related posts, exam tips, etc.

 


 

  1. Hi Kannan,
    I have a doubt about YARN resource allocation. Let’s say we have 3 projects, and each project has 3 environments (DEV, SIT & UAT), so 9 environments and 9 queues in total. In this situation, how do we control the resources for each queue in the YARN Fair Scheduler? What placement policy should we set?
