Fair scheduling is a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time.
With the Fair Scheduler, we can create a separate pool (queue) for each team and configure the resources for that pool, which helps prevent application delays.
In the exam you may be asked to create a pool with min and max resources so that jobs submitted to the pool won't be delayed.
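To get a feel for how weight, min and max interact, here is a minimal Python sketch of a weighted fair share calculation for a single resource. It is only an illustration: the pool names, the numbers and the `compute_fair_shares` helper are made up for this post, it assumes every pool has a positive weight, and the real scheduler also deals with pool hierarchies, multiple resource types and preemption.

```python
def compute_fair_shares(pools, total):
    """Split `total` units of one resource among pools by weight, honoring min/max."""

    def share(pool, ratio):
        # A pool's share grows with its weight but is clamped to [min, max].
        return min(max(pool["weight"] * ratio, pool["min"]), pool["max"])

    def allocated(ratio):
        return sum(share(p, ratio) for p in pools)

    # If every pool can be filled to its max, there is nothing left to divide.
    if sum(p["max"] for p in pools) <= total:
        return {p["name"]: p["max"] for p in pools}

    # Find an upper bound for the weight-to-resource ratio, then bisect on it.
    lo, hi = 0.0, 1.0
    while allocated(hi) < total:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if allocated(mid) < total:
            lo = mid
        else:
            hi = mid
    return {p["name"]: round(share(p, hi), 2) for p in pools}


# Hypothetical pools sharing 90 GB of memory.
pools = [
    {"name": "etl", "weight": 2, "min": 0, "max": 100},
    {"name": "adhoc", "weight": 1, "min": 40, "max": 100},
]
shares = compute_fair_shares(pools, total=90)
print(shares)
```

By weight alone, adhoc would get only 30 GB here. Because its min is 40 GB, it keeps 40 and etl receives the remaining 50.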
Before we begin, make sure the Fair Scheduler is selected as YARN's scheduler.
YARN – Configuration – search for 'scheduler':
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
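If you would rather check the setting from a config file than from the UI, a small Python sketch along these lines can read the property. The `scheduler_class` helper and the inline sample are assumptions for illustration; on a real cluster you would pass in the contents of your own yarn-site.xml.

```python
import xml.etree.ElementTree as ET

FAIR = "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler"
PROPERTY = "yarn.resourcemanager.scheduler.class"


def scheduler_class(yarn_site_xml):
    """Return the configured scheduler class, or None if the property is absent."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == PROPERTY:
            return (prop.findtext("value") or "").strip()
    return None


# Inline sample standing in for a real yarn-site.xml.
sample = """<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
</configuration>"""

uses_fair = scheduler_class(sample) == FAIR
print("Fair Scheduler enabled:", uses_fair)
```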
Now go to YARN – Resource Pools.
Select Configuration and you'll be redirected to the Dynamic Resource Pool Configuration page.
Click Create Resource Pool.
Here, provide the resource limits for the pool, such as its weight and min/max cores and memory, along with any other details.
Set the scheduling policy, preemption, etc., if specified.
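For reference, the settings above correspond to a queue entry in the Fair Scheduler allocation file. The Python sketch below builds such an entry using the standard `minResources`, `maxResources`, `weight` and `schedulingPolicy` elements. The pool name, the values and the `build_pool` helper are hypothetical, and Cloudera Manager normally generates this configuration for you from the UI, so treat it as an illustration rather than something you need to write by hand.

```python
import xml.etree.ElementTree as ET


def build_pool(name, min_res, max_res, weight=1.0, policy="fair"):
    """Build a <queue> element for a Fair Scheduler allocation file."""
    queue = ET.Element("queue", {"name": name})
    # Resources use the "X mb,Y vcores" form.
    ET.SubElement(queue, "minResources").text = min_res
    ET.SubElement(queue, "maxResources").text = max_res
    ET.SubElement(queue, "weight").text = str(weight)
    # Policy is one of "fifo", "fair" or "drf".
    ET.SubElement(queue, "schedulingPolicy").text = policy
    return queue


allocations = ET.Element("allocations")
allocations.append(
    build_pool("analytics", "4096 mb,4vcores", "16384 mb,16vcores",
               weight=2.0, policy="drf")
)
xml_text = ET.tostring(allocations, encoding="unicode")
print(xml_text)
```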
Once the pool is configured, you can submit jobs to it by passing the parameter below on the command line.
# hadoop jar mapreduce.jar -D mapreduce.job.queue.name=<pool_name>
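If you script your submissions, the position of `-D` matters: generic options are only picked up when the job's driver parses them (for example through ToolRunner), and they must come before the job's own arguments. Here is a rough Python sketch that assembles such a command. The `build_submit_command` helper, the pool name and the paths are assumptions for illustration; the jar and program name follow the stock MapReduce examples.

```python
import shlex


def build_submit_command(jar, program, pool, job_args):
    """Assemble a `hadoop jar` command that targets a specific pool."""
    return [
        "hadoop", "jar", jar, program,
        # Generic options go before the job's own arguments.
        "-D", f"mapreduce.job.queue.name={pool}",
        *job_args,
    ]


cmd = build_submit_command(
    "hadoop-mapreduce-examples.jar", "wordcount",
    "analytics", ["/user/demo/input", "/user/demo/output"],
)
print(shlex.join(cmd))
```

To actually run it, you could hand `cmd` to `subprocess.run` on a node that has the Hadoop client installed.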
· Create a resource pool with the given min and max resources.
· Create a pool with the given resources and set the scheduling policy to DRF.
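On the DRF policy mentioned above: Dominant Resource Fairness compares pools by their dominant share, that is, the largest fraction of any one cluster resource (memory or cores) that the pool is using. The Python sketch below shows the idea with made-up numbers. The `dominant_share` helper and the cluster figures are hypothetical, and this is a simplification of what the scheduler really does.

```python
def dominant_share(usage, capacity):
    """Return (resource, fraction) for the resource this pool uses most heavily."""
    shares = {res: usage[res] / capacity[res] for res in capacity}
    resource = max(shares, key=shares.get)
    return resource, shares[resource]


# Hypothetical cluster capacity and current usage per pool.
cluster = {"memory_mb": 100_000, "vcores": 50}
pool_usage = {
    "etl": {"memory_mb": 40_000, "vcores": 10},
    "adhoc": {"memory_mb": 10_000, "vcores": 15},
}

for name, usage in pool_usage.items():
    print(name, dominant_share(usage, cluster))

# Under DRF, the pool with the smallest dominant share is served next.
next_pool = min(pool_usage, key=lambda p: dominant_share(pool_usage[p], cluster)[1])
print("next:", next_pool)
```

Here etl is dominated by memory (40% of the cluster) and adhoc by cores (30%), so adhoc would be offered resources first.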
We have now covered how to use the Fair Scheduler to create resource pools and resolve application delays.
Use the comments section below to post your doubts, questions and feedback.
Please follow my blog to get notified of more certification related posts, exam tips, etc.
Great work Kanna, I’m waiting for the topic “Revise YARN resource assignment based on user feedback” in manage section.
Great work ..
I have a doubt about YARN resource allocation. Let's say we have 3 projects, and each project has 3 environments (DEV, SIT & UAT), so 9 environments and 9 queues in total. In this situation, how do we control the resources for each queue in the YARN Fair Scheduler? What placement policy should we set?