After the installation of Cloudera Manager (SCM server), we can install CDH on our hosts using Cloudera Manager.
Step 1: Log in to the CM URL.
When you log in to CM for the first time after installing the Cloudera SCM server, you are redirected to the following steps.
Select the desired edition. Choose the Data Hub trial edition (60 days), which has all the advanced features of Cloudera.
Step 2:
On this page, provide the list of hostnames or IP addresses you want to add to the cluster.
CM will try to contact each server; if a server is reachable, it is shown as ‘Host ready’.
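If you want to pre-check reachability yourself before opening the wizard, something like the sketch below can help. It is not how CM performs its own check; it simply assumes SSH is listening on port 22 on every host (an assumption on my part) and tries a TCP connection to it. The function names are made up for this example.

```python
import socket


def is_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def split_by_reachability(hosts, probe=is_reachable):
    """Partition hosts into (ready, unreachable) lists using the given probe."""
    ready, unreachable = [], []
    for host in hosts:
        (ready if probe(host) else unreachable).append(host)
    return ready, unreachable
```

Calling split_by_reachability(["master", "standby", "slave1"]) with your own host names gives you the two lists to review before you paste the hosts into CM.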
Step 3:
In this step, choose which version of CDH to install, along with any additional parcels.
Remember we configured a local repository in the very first task?
If you want CM to install CDH from that local repository instead of connecting to the internet, specify it under “Remote Parcel Repository URLs”.
Click ‘More Options’ next to ‘Use Parcels’ and edit the remote parcel repository URLs.
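Before pointing CM at a local repository, it can be worth confirming that the URL really serves parcels. The sketch below assumes the repository exposes a manifest.json file whose top-level "parcels" list has entries with a "parcelName" key; check this against your own repository, since I am working from memory of the parcel repository layout. The function names and the example URL are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def manifest_url(repo_url):
    """Return the manifest.json URL for a parcel repository URL."""
    # urljoin replaces the last path segment unless the base ends with "/".
    base = repo_url if repo_url.endswith("/") else repo_url + "/"
    return urljoin(base, "manifest.json")


def list_parcels(repo_url, timeout=10):
    """Fetch manifest.json and return the parcel file names it lists.

    Assumes each entry in the manifest's "parcels" list has a "parcelName" key.
    """
    with urlopen(manifest_url(repo_url), timeout=timeout) as resp:
        manifest = json.load(resp)
    return [p["parcelName"] for p in manifest.get("parcels", [])]
```

For example, list_parcels("http://repo.example/cdh5/parcels") should return the parcel file names if the repository is set up correctly, and raise an error if the manifest is missing.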
Note: By default, the parcel directory is /opt/cloudera/parcels. If /opt is not a separate filesystem on your servers, the parcels will be installed on the root (/) filesystem.
This can affect the server, because the root filesystem should be reserved for system-related files and may fill up.
The best practice is to create a 15 GB filesystem for /opt, or to change the parcel directory to a larger mount.
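A quick way to see whether the parcel directory has enough room is to check the free space on the filesystem that holds it. This is only a sketch: the 15 GB threshold is the figure from this post, and the helper names are my own. Because /opt/cloudera/parcels may not exist before the install, the code walks up to the nearest existing parent directory.

```python
import os
import shutil

GIB = 1024 ** 3


def nearest_existing(path):
    """Walk up from path until an existing directory is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        path = os.path.dirname(path)
    return path


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds (or would hold) path."""
    return shutil.disk_usage(nearest_existing(path)).free / GIB


def has_room_for_parcels(path="/opt/cloudera/parcels", required_gib=15,
                         free_fn=free_gib):
    """Return True if at least required_gib GiB are free for the parcel dir."""
    return free_fn(path) >= required_gib
```

Run has_room_for_parcels() on each host; if it returns False, either extend /opt or pick a different parcel directory in the wizard.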
Step 4:
Java installation option.
Since we don’t want to install Java manually, select ‘Install Oracle Java SE Development Kit (JDK)’ and continue.
Step 5:
In this step, provide the details of a user account that has SSH access to all the hosts.
Using root is not recommended; use a separate account for Hadoop instead. Ensure that the non-root account you provide exists on all the nodes and has sudo access.
As I’m using AWS, I’ve chosen private key authentication.
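To verify the account before handing it to CM, you could run a small check like the one below against each host. It is a sketch under a few assumptions: the ssh client is installed where you run it, the account uses key-based login, and sudo is configured to work without a password (the -n flag makes sudo fail instead of prompting). The function names, user name and key path are placeholders.

```python
import subprocess


def sudo_check_command(host, user, key_path):
    """Build an ssh command that runs `sudo -n true` on the remote host."""
    return [
        "ssh",
        "-i", key_path,              # private key used for authentication
        "-o", "BatchMode=yes",       # never prompt for a password
        "-o", "ConnectTimeout=5",
        f"{user}@{host}",
        "sudo", "-n", "true",        # -n: fail rather than ask for a password
    ]


def has_passwordless_sudo(host, user, key_path, run=subprocess.run):
    """Return True if the user can log in over SSH and run sudo non-interactively."""
    result = run(sudo_check_command(host, user, key_path),
                 capture_output=True, text=True)
    return result.returncode == 0
```

For example, has_passwordless_sudo("slave1", "hadoop", "/path/to/key.pem") returns False if either the SSH login or the sudo call fails, which tells you the host is not ready for step 6.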
Step 6:
In the next step, CM automatically logs in to each server and installs the Cloudera Manager Agent package, Java, and so on.
As you can see, I provided details for only two hosts, slave1 and standby, and forgot to include the ‘master’ node on which CM is running.
So far we have installed only CM on the master node; to use that node in our cluster, we have to install CDH on it as well.
We don’t have the luxury of a separate server for CM alone, and in most organizations the CM server also acts as the NameNode.
So let’s go back to the “Add hosts” step and provide the master host details as well.
As you can see, CM selected the master node, but the options for standby and slave1 are greyed out. This is because CM agents are already installed on those hosts, so they are already managed by CM. You can see them in the ‘Currently Managed Hosts’ section.
CM will now install the CM agent on the master node, as in steps 5 and 6.
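The greyed-out behaviour boils down to a simple rule: only hosts that are not yet managed need the agent install. Here is a tiny illustration of that rule; it is not CM code, and the function name is made up.

```python
def hosts_needing_agent(requested, managed):
    """Return the requested hosts that are not already managed, in order."""
    managed_set = set(managed)
    return [host for host in requested if host not in managed_set]
```

With the hosts from this walkthrough, hosts_needing_agent(["master", "standby", "slave1"], ["standby", "slave1"]) leaves only "master", which matches what the wizard selected.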
Step 7:
After Java and the CM agents are installed, CM installs the CDH parcel and any additional parcels (if you chose any in step 3) on the hosts.
Installing Parcels involves four steps:
Downloading the parcel
Distributing the parcel
Unpacking the parcel
Activating the parcel. You can add services on the host(s) only after parcel activation succeeds.
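One way to picture the four steps above is as a strictly ordered sequence in which services are unlocked only at the end. The sketch below models that ordering. The stage names simply mirror the steps in this post plus a final ‘activated’ state; they are not CM’s internal state names.

```python
from enum import Enum


class ParcelStage(Enum):
    DOWNLOADING = 1
    DISTRIBUTING = 2
    UNPACKING = 3
    ACTIVATING = 4
    ACTIVATED = 5   # terminal state reached once activation succeeds


def next_stage(stage):
    """Return the stage that follows; ACTIVATED is terminal."""
    if stage is ParcelStage.ACTIVATED:
        return stage
    return ParcelStage(stage.value + 1)


def can_add_services(stage):
    """Services can be added only once the parcel is activated."""
    return stage is ParcelStage.ACTIVATED
```

No stage can be skipped: a parcel that is still unpacking, for instance, has can_add_services(...) returning False.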
Step 8:
In this step, the Host Inspector runs. It validates the hosts added to the cluster and notifies you of any warnings or errors.
If a check fails, you will be able to continue only after rectifying the error.
Warnings can be ignored for now and remediated later.
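The rule for errors versus warnings can be written down in a few lines. This is just an illustration of the gating logic described above, assuming each inspector finding is represented as a (severity, message) pair; that representation and the function names are my own, not the Host Inspector’s output format.

```python
def can_continue(findings):
    """Return True if no finding has severity 'error'."""
    return not any(severity == "error" for severity, _ in findings)


def to_fix_later(findings):
    """Return the messages of findings that are only warnings."""
    return [message for severity, message in findings if severity == "warning"]
```

So a run with only warnings lets you proceed, and to_fix_later(...) gives you the list to come back to once the cluster is up.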
Please go through each validation point, as it’s very informative. You’ll learn which checklist items Cloudera uses to validate the hosts in the cluster.
You can add services and assign roles to the hosts in this section.
I’ll explain this in the ‘adding hosts to the cluster’ topic.
Then in the next step, you’ll be prompted to choose the list of services to be installed.
You can choose the custom option at the bottom and install the services of your choice.
You can add any service later using the ‘Add Service’ option in the cluster dropdown after finishing this setup.
I’ve chosen only the Cloudera Management Service.
Assign the roles and verify the database details.
If you’re using an external database, select the custom database option and provide the database details.
After you add a service, its configuration details are shown for review. Review them and click Finish.
The selected services will now be available for use in Cloudera Manager.
This is part of cluster setup. Since you’ll be given a cluster that is already running, you can try this out for learning purposes.
That covers how to install CDH (and the CM agents) using Cloudera Manager.