Hadoop

Perform OS-level configuration for Hadoop installation

Before installing CDH on a server, make the following OS-level configuration changes so that the installation succeeds.

  • Disable SELinux

“Security-Enhanced Linux (SELinux) is a Linux kernel security module that provides a mechanism for supporting access control security policies”

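If SELinux is enforcing, the Cloudera server installation will fail.

To turn off SELinux enforcement temporarily (this switches to permissive mode and lasts only until the next reboot):

# setenforce 0

On older releases where the SELinux filesystem is mounted at /selinux, “echo 0 > /selinux/enforce” has the same effect.

To disable SELinux permanently, edit the SELinux config file and set “SELINUX=disabled”. The key is case-sensitive and must be uppercase. This change requires a server restart.

# vi /etc/sysconfig/selinux
SELINUX=disabled

To check the status of SELinux:

# sestatus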
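
As a quick cross-check of what will apply after the next reboot, the configured mode can be read straight from the config file. The following is a minimal sketch, not part of any Cloudera tooling; the function name selinux_config_mode is made up for illustration, and it assumes the standard “SELINUX=value” layout of the file.

```shell
# Hypothetical helper: print the SELINUX= mode configured in the given file.
# Defaults to /etc/selinux/config, which /etc/sysconfig/selinux links to
# on Red Hat-family systems.
selinux_config_mode() {
    config_file="${1:-/etc/selinux/config}"
    # Match only uncommented SELINUX= lines (SELINUXTYPE= does not match),
    # and keep the last one in case the key appears more than once.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$config_file" | tail -n 1
}
```

Running “selinux_config_mode” with no argument should print “disabled” once the edit above has been made; compare it with the output of sestatus to see whether a reboot is still pending.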


  • VM swappiness

vm.swappiness is a Linux kernel parameter that controls how aggressively memory pages are swapped to disk. It can be set to a value between 0 and 100; the higher the value, the more aggressively the kernel seeks out inactive memory pages and swaps them to disk.

To check the current vm.swappiness value:

# cat /proc/sys/vm/swappiness

On most systems, it is set to 60 by default. This is not suitable for Hadoop cluster nodes, because it can cause processes to get swapped out even when there is free memory available. This can affect stability and performance, and may cause problems such as lengthy garbage collection pauses for important system daemons.

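Cloudera recommends that you set this parameter to 10 or less. To change it on the running system (the setting is lost at reboot), use either of the following:

# sysctl vm.swappiness=10
or
# echo 10 > /proc/sys/vm/swappiness

To make the setting persist across reboots, add it to /etc/sysctl.conf, then run “sysctl -p” to load it immediately:

# vi /etc/sysctl.conf
vm.swappiness=10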
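
When preparing several nodes, it can help to script the check rather than read the value by eye. The sketch below is one possible way to do that; the function names swappiness_ok and check_swappiness are illustrative, and the default limit of 10 simply mirrors the recommendation above.

```shell
# Hypothetical helper: succeed if the given value is at or below the limit
# (default 10).
swappiness_ok() {
    value="$1"
    limit="${2:-10}"
    [ "$value" -le "$limit" ]
}

# Report the current setting. The file path can be overridden, which makes
# the function easy to try out against a scratch file.
check_swappiness() {
    proc_file="${1:-/proc/sys/vm/swappiness}"
    current="$(cat "$proc_file")"
    if swappiness_ok "$current"; then
        echo "OK: vm.swappiness=$current"
    else
        echo "WARN: vm.swappiness=$current (recommended: 10 or less)"
        return 1
    fi
}
```

Calling “check_swappiness” with no argument reads the live kernel value and returns a non-zero status when it is above the limit, so it can be used in a pre-install script.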

 

  • Hostname resolution

If you are not using a DNS server, make sure that every host’s entry in the /etc/hosts file includes its fully qualified domain name (FQDN).
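
A typical /etc/hosts entry lists the IP address first, then the FQDN, then any short aliases, for example “192.168.1.10 node1.example.com node1” (the address and names here are placeholders, not values from a real cluster). The sketch below shows one way to verify that a host’s first listed name is an FQDN; the function names are illustrative, and the dot test in is_fqdn is a simple heuristic rather than a full hostname validation.

```shell
# Hypothetical helper: print the first name listed (field 2) on the hosts-file
# line that mentions the given host name.
hosts_canonical_name() {
    name="$1"
    hosts_file="${2:-/etc/hosts}"
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        {
            sub(/#.*/, "")                 # strip trailing comments
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $2; exit }
        }' "$hosts_file"
}

# Heuristic: treat a name as fully qualified if it contains a dot.
is_fqdn() {
    case "$1" in
        *.*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example, is_fqdn "$(hosts_canonical_name "$(hostname -s)")" succeeds only when the local host’s entry lists a dotted name first.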

Other recommendations:

  • In fstab, mount the data disks with the “noatime” option, so that the filesystem does not write an access-time update to disk every time a block file is read.
# cat /etc/fstab
/dev/sda2 /hadoop ext3 noatime,defaults 0 0
  • Most Linux platforms supported by CDH 5 include a feature called transparent hugepage compaction which interacts poorly with Hadoop workloads and can seriously degrade performance.

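To disable THP compaction at boot, add the following entry to the /etc/rc.local file. The path shown is specific to Red Hat 6-family kernels; other distributions use /sys/kernel/mm/transparent_hugepage/defrag.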

# vi /etc/rc.local
echo 'never' > /sys/kernel/mm/redhat_transparent_hugepage/defrag
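
To confirm that the setting took effect, read the same file back: the kernel lists the available options and marks the active one in square brackets, for example “always madvise [never]”. The following sketch extracts that active value; the function names are made up for illustration, and it checks both the Red Hat-specific directory and the generic one since only one of them exists on a given kernel.

```shell
# Hypothetical helper: print the active value from a THP sysfs file, whose
# contents look like "always madvise [never]" with the selected option in
# square brackets.
thp_active_setting() {
    sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$1"
}

# Report the defrag setting for whichever THP directory exists on this kernel.
thp_report() {
    for dir in /sys/kernel/mm/redhat_transparent_hugepage \
               /sys/kernel/mm/transparent_hugepage; do
        if [ -r "$dir/defrag" ]; then
            echo "$dir/defrag: $(thp_active_setting "$dir/defrag")"
        fi
    done
}
```

After a reboot with the rc.local entry in place, “thp_report” should show “never” for the defrag file.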
