Learn Hadoop Administration

CCA 131 Perform OS-level configuration for Hadoop installation

Note: This post is part of the CCA Administrator Exam (CCA131) objectives series

In the last post we set up the local CDH repository. Before performing any installation, or even before setting up the repository, you must complete a few OS-level configuration tasks.

 

The OS-level configuration includes:

1. Enabling NTP

2. Configuring Network Names (hostnames/FQDNs)

3. Disabling SELinux

4. Disabling the Firewall

 

1. Enabling NTP

Step 1. To install NTP on CentOS/RHEL 7 systems:

# yum install ntp

 

Step 2. Edit the ntp configuration file /etc/ntp.conf and add the available NTP servers in your setup.

# vi /etc/ntp.conf

server 192.168.1.100

server 192.168.1.101

server 192.168.1.102
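
If you prefer to script Step 2 instead of editing the file by hand, the sketch below appends each server line only when it is not already present. The function name add_ntp_server is hypothetical, and the demo writes to a scratch file rather than /etc/ntp.conf so that it is safe to try.

```shell
# add_ntp_server FILE ADDRESS
# Appends "server ADDRESS" to FILE unless an identical line already exists.
add_ntp_server() {
    conf=$1
    addr=$2
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "server $addr" "$conf"; then
        printf 'server %s\n' "$addr" >> "$conf"
    fi
}

# Demo against a scratch file; on a real node FILE would be /etc/ntp.conf.
NTP_DEMO=$(mktemp)
for ip in 192.168.1.100 192.168.1.101 192.168.1.102 192.168.1.100; do
    add_ntp_server "$NTP_DEMO" "$ip"
done
cat "$NTP_DEMO"
```

The repeated 192.168.1.100 in the loop is deliberate: it shows that running the helper twice does not duplicate an entry.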

 

Step 3. Start the ntpd service and enable it to start automatically on boot.

# systemctl start ntpd

# systemctl enable ntpd

 

Note: In CentOS/RHEL 7, chronyd replaces ntpd as the default network time protocol daemon. ntpd is still available in the yum repository for those who need it. Run only one of the two daemons at a time: if you use ntpd, stop and disable chronyd first (systemctl stop chronyd; systemctl disable chronyd). I have included the NTP configuration because it is part of the CCA 131 exam objectives.

 

2. Configuring Network Names (hostnames/FQDNs)

All the nodes in the CDH cluster must be able to communicate with each other, and the FQDN of each node must resolve to its IP address.

 

Step 1. Add the IP address of each node of the CDH cluster and their respective FQDN in the file /etc/hosts:

# vi /etc/hosts

192.168.1.10   master.localdomain

192.168.1.11   node01.localdomain

192.168.1.12   node02.localdomain

192.168.1.13   node03.localdomain

Note: If you are using DNS, storing this information in /etc/hosts is not required, but it is good practice.

 

Step 2. Verify that you can get the FQDN of each node of the cluster using the command:

# hostname -f
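
Before copying the same /etc/hosts to every node, a quick lint of the entries can catch a short hostname or a repeated IP address. The following is a minimal sketch; check_hosts is a hypothetical helper, and the demo runs against a scratch file with one deliberately incomplete entry (node02).

```shell
# check_hosts FILE
# Reports entries whose first name is not an FQDN and IPs listed more than once.
check_hosts() {
    awk '
        /^[ \t]*(#|$)/       { next }   # skip comments and blank lines
        $1 ~ /^(127\.|::1)/  { next }   # skip loopback entries
        {
            if ($2 !~ /\./) { printf "not an FQDN: %s\n", $2; bad = 1 }
            if (seen[$1]++) { printf "duplicate IP: %s\n", $1; bad = 1 }
        }
        END { exit bad }
    ' "$1"
}

# Demo against a scratch file; on a real node FILE would be /etc/hosts.
HOSTS_DEMO=$(mktemp)
cat > "$HOSTS_DEMO" <<'EOF'
192.168.1.10   master.localdomain
192.168.1.11   node01.localdomain
192.168.1.12   node02
EOF
check_hosts "$HOSTS_DEMO" || echo "hosts file needs attention"
```

The function exits non-zero when it finds a problem, so it can gate a larger setup script.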

 

3. Disabling SELinux

Step 1. To get the current status of SELinux:

# sestatus

SELinux status:                 enabled

SELinuxfs mount:                /sys/fs/selinux

SELinux root directory:         /etc/selinux

Loaded policy name:             targeted

Current mode:                   enforcing

Mode from config file:          enforcing

Policy MLS status:              enabled

Policy deny_unknown status:     allowed

Max kernel policy version:      31

 

# getenforce

Enforcing

 

In my case, SELinux is enabled and in enforcing mode. You can skip this section if the mode is already “permissive”.

 

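Step 2. Edit the /etc/selinux/config file (/etc/sysconfig/selinux is a symbolic link to it) and change the parameter value “SELINUX=enforcing” to “SELINUX=permissive”. The parameter name is case-sensitive and must be in upper case.

# vi /etc/selinux/config

SELINUX=permissive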
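
The same change can be scripted with sed. This is a sketch under a few assumptions: set_selinux_mode is a hypothetical helper, GNU sed is available (as it is on CentOS/RHEL), and the demo edits a scratch file. On a real node, point it at /etc/selinux/config rather than the /etc/sysconfig/selinux link, because sed -i replaces a symbolic link with a regular file.

```shell
# set_selinux_mode FILE MODE
# Rewrites the SELINUX= line in FILE; SELINUXTYPE= is left untouched.
set_selinux_mode() {
    conf=$1
    mode=$2
    case $mode in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$conf"
}

# Demo against a scratch file with the two relevant settings.
SELINUX_DEMO=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$SELINUX_DEMO"
set_selinux_mode "$SELINUX_DEMO" permissive
cat "$SELINUX_DEMO"
```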

 

Step 3. You can either reboot the system or execute the below command for the changes to take effect immediately.

# setenforce 0

 

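I usually disable SELinux completely by setting “SELINUX=disabled” and reboot the system once all the OS-level configuration is complete. Note that “setenforce 0” only switches to permissive mode; the disabled state takes effect only after a reboot.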

 

Step 4. Verify the status again:

# getenforce

Permissive

 

4. Disabling the Firewall

Last but not least, disable the firewall on the system. The default firewall in CentOS/RHEL 6 is iptables, whereas CentOS/RHEL 7 uses firewalld.

For CentOS/RHEL 6

# chkconfig iptables off

# service iptables stop

 

For CentOS/RHEL 7

# systemctl disable firewalld

# systemctl stop firewalld
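
When the same setup script has to cover both releases, the choice of commands can be kept in one place. The sketch below only prints the commands for a given major release so that you can review them before running anything; firewall_disable_cmds is a hypothetical helper, and the release number is passed in by the caller.

```shell
# firewall_disable_cmds MAJOR_RELEASE
# Prints (does not run) the commands that disable the default firewall.
firewall_disable_cmds() {
    case $1 in
        6) printf '%s\n' 'chkconfig iptables off' 'service iptables stop' ;;
        7) printf '%s\n' 'systemctl disable firewalld' 'systemctl stop firewalld' ;;
        *) echo "unsupported release: $1" >&2; return 1 ;;
    esac
}

firewall_disable_cmds 7
```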

 

Other Recommended Settings

In addition to the four OS-level configurations above, it is recommended to disable “transparent hugepage” and to set “vm.swappiness” to the value recommended by Cloudera.

 

Verify if THP is enabled

Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attempts to allocate hugepages whenever possible and any Linux process will receive 2MB pages if the mmap region is 2MB naturally aligned.

 

To verify whether THP is enabled or disabled:

# cat /sys/kernel/mm/transparent_hugepage/enabled

[always] madvise never
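
In this output the value in square brackets is the active one. If you want to check it from a script, a small helper can extract it; thp_active_mode below is a hypothetical name, and the demo passes the sample output as a string instead of reading the sysfs file.

```shell
# thp_active_mode STRING
# Prints the word enclosed in square brackets, e.g. "always".
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

thp_active_mode "[always] madvise never"
# On a real host:
# thp_active_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"
```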

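Note: The value in square brackets is the active setting. Writing “never” to this file changes the setting on a running machine, but the change is lost at reboot; the steps below make it permanent through a kernel boot parameter, which requires a reboot.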

 

Disabling THP

Step 1. Add the “transparent_hugepage=never” kernel parameter to the GRUB_CMDLINE_LINUX option in the /etc/default/grub file, or change the existing value if the parameter is already present.

# vi /etc/default/grub

GRUB_TIMEOUT=5

GRUB_DEFAULT=saved

GRUB_DISABLE_SUBMENU=true

GRUB_TERMINAL_OUTPUT="console"

GRUB_CMDLINE_LINUX="nomodeset crashkernel=auto rd.lvm.lv=vg_os/lv_root rd.lvm.lv=vg_os/lv_swap rhgb quiet transparent_hugepage=never"

GRUB_DISABLE_RECOVERY="true"
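
Appending the parameter can also be scripted. The sketch below assumes GNU sed and a GRUB_CMDLINE_LINUX value enclosed in double quotes, as in the file above; add_kernel_param is a hypothetical helper, and the demo works on a shortened scratch copy rather than /etc/default/grub.

```shell
# add_kernel_param FILE PARAM
# Appends PARAM inside the quoted GRUB_CMDLINE_LINUX value if it is missing.
add_kernel_param() {
    conf=$1
    param=$2
    # Do nothing if the parameter is already on the GRUB_CMDLINE_LINUX line.
    if grep -q "^GRUB_CMDLINE_LINUX=.*$param" "$conf"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "/^GRUB_CMDLINE_LINUX=/ s/\"\$/ $param\"/" "$conf"
}

# Demo against a scratch file with a shortened command line.
GRUB_DEMO=$(mktemp)
cat > "$GRUB_DEMO" <<'EOF'
GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"
EOF
add_kernel_param "$GRUB_DEMO" transparent_hugepage=never
add_kernel_param "$GRUB_DEMO" transparent_hugepage=never   # second call is a no-op
grep '^GRUB_CMDLINE_LINUX=' "$GRUB_DEMO"
```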

 

Step 2. Rebuild the /boot/grub2/grub.cfg file by running the grub2-mkconfig -o command. Take a backup of the existing /boot/grub2/grub.cfg before rebuilding it.

# grub2-mkconfig -o /boot/grub2/grub.cfg

 

Step 3. Reboot the system so that the new option takes effect.

# shutdown -r now

 

Step 4. Verify the parameter is set correctly

# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.10.0-514.10.2.el7.x86_64 root=/dev/mapper/vg_os-lv_root ro nomodeset crashkernel=auto rd.lvm.lv=vg_os/lv_root rd.lvm.lv=vg_os/lv_swap rhgb quiet transparent_hugepage=never LANG=en_US.UTF-8

For more info on disabling THP, refer to the posts below:

 

CentOS / RHEL 7 : How to disable Transparent Huge pages (THP)

CentOS / RHEL 6 : How to disable Transparent Huge pages (THP)

 

VM swappiness

Swappiness is a property for the Linux kernel that changes the balance between swapping out runtime memory, as opposed to dropping pages from the system page cache. Swappiness can be set to values between 0 and 100, inclusive. A low value means the kernel will try to avoid swapping as much as possible where a higher value instead will make the kernel aggressively try to use swap space.

 

Step 1. Cloudera recommends setting the value of swappiness to 10 or lower. To view the value currently configured in the tuned virtual-guest profile:

# grep vm.swappiness /usr/lib/tuned/virtual-guest/tuned.conf

vm.swappiness = 30

 

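Step 2. Let’s set the value of vm.swappiness to 10. Change only that line; redirecting into the file with “>” would overwrite all the other settings in it.

# sed -i 's/^vm.swappiness *=.*/vm.swappiness = 10/' /usr/lib/tuned/virtual-guest/tuned.conf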

 

Step 3. Verify the value of vm.swappiness again.

# grep vm.swappiness /usr/lib/tuned/virtual-guest/tuned.conf

vm.swappiness = 10
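
Because a shell redirect with “>” truncates the target file before writing, it is safer to rewrite only the vm.swappiness line and leave every other setting in place. The sketch below shows one way to do that, assuming GNU sed; set_swappiness is a hypothetical helper, and the scratch file is a simplified stand-in for a tuned profile, not a copy of the real one.

```shell
# set_swappiness FILE VALUE
# Updates the vm.swappiness line in FILE, or appends one if none exists.
set_swappiness() {
    conf=$1
    value=$2
    if grep -q '^vm\.swappiness[ ]*=' "$conf"; then
        sed -i "s/^vm\.swappiness[ ]*=.*/vm.swappiness = $value/" "$conf"
    else
        # The append branch suits flat files; in a sectioned file, check
        # that the new line ends up under the intended section.
        printf 'vm.swappiness = %s\n' "$value" >> "$conf"
    fi
}

# Demo against a scratch file laid out like a tuned profile.
TUNED_DEMO=$(mktemp)
cat > "$TUNED_DEMO" <<'EOF'
[main]
include=throughput-performance

[sysctl]
vm.dirty_ratio = 30
vm.swappiness = 30
EOF
set_swappiness "$TUNED_DEMO" 10
cat "$TUNED_DEMO"
```

After the call, only the vm.swappiness line differs; the section headers and the other settings are unchanged.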