Multinode Hadoop cluster configuration for CCA131 preparation

The CCA131 exam is fully hands-on, so you need practical experience working on a Hadoop cluster to pass it.

If you don’t have that experience, I’d recommend practicing on the Cloudera QuickStart VM, or building a multi-node cluster on your laptop using VMs or in AWS.

Below are the minimum configuration requirements for the cluster.



CM server & NameNode: 8 GB RAM, 10 GB disk

The CM server needs 8 GB of memory to run Cloudera Manager; with less, it will hang.


Standby node: 4 GB RAM, 8-10 GB disk

Ideally the standby should have the same configuration as the active NameNode, but since this is for learning/POC, 4 GB is enough.


DataNodes (2): 1 GB RAM each, 8-10 GB disk each

Total minimum memory required for the cluster: 14 GB (8 + 4 + 1 + 1)

So if you have a laptop with 16 GB of memory, you can create separate Linux VMs with the above configurations, install Cloudera, and build the cluster free of cost.
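The memory budget is simple arithmetic. A minimal Python sketch, summing the per-node figures listed above against an assumed 16 GB host (the node names are illustrative only), might look like this:

```python
# RAM allocated to each VM, in GB, taken from the sizing listed above.
# The key names are illustrative labels, not required hostnames.
vm_ram_gb = {
    "cm_namenode": 8,
    "standby": 4,
    "datanode1": 1,
    "datanode2": 1,
}

host_ram_gb = 16  # assumption: the 16 GB laptop mentioned in the post

total_vm_ram_gb = sum(vm_ram_gb.values())
headroom_gb = host_ram_gb - total_vm_ram_gb

print(f"VMs need {total_vm_ram_gb} GB; {headroom_gb} GB is left for the host OS")
```

Whatever is left after the VMs is all the host OS and the hypervisor get, so there is little room to spare on a 16 GB machine.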

Refer to Install Cloudera SCM server and Cloudera agents to build the cluster.



AWS instances: 

If you don’t have prior AWS experience/exposure, please skip this topic and check the Online Labs section at the end.

This is just a high-level overview of AWS EC2 instances and pricing, not an introduction to AWS basics.


Most of us don’t have a laptop with 16 GB of memory, and if you reduce the memory allocated to the VMs, the cluster will hang for ages or run out of memory, which makes practice impractical.

If you have experience with AWS, you can use EC2 instances to build your cluster.

AWS EC2 instance prices can be found on the AWS EC2 pricing page.

I used a t2.large for the CM/NN, a t2.medium for the standby, and two t2.micro instances for the DataNodes.


AWS Charges:

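AWS EC2 instances used to be billed per hour, so stopping and starting an instance three times within a single hour was charged as 3 hours. AWS has since introduced per-second billing (with a one-minute minimum) for many Linux instances, so check the pricing page to see which model applies to your OS.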
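To see why the billing model matters for short practice sessions, here is a rough Python sketch comparing the two models. The function names are my own, and the 60-second minimum is an assumption; check the AWS pricing page for your OS.

```python
import math


def billed_seconds_hourly(session_seconds):
    """Hourly model: every start begins a new hour, and partial hours round up."""
    return sum(math.ceil(s / 3600) * 3600 for s in session_seconds)


def billed_seconds_per_second(session_seconds, minimum=60):
    """Per-second model: each session is billed for its length, with a minimum."""
    return sum(max(s, minimum) for s in session_seconds)


# Three hypothetical 10-minute sessions started within the same clock hour.
sessions = [600, 600, 600]

hourly = billed_seconds_hourly(sessions)
per_second = billed_seconds_per_second(sessions)
print(f"hourly model: {hourly / 3600:.0f} h, per-second model: {per_second / 60:.0f} min")
```

With hourly billing the three short sessions are billed as three full hours; with per-second billing only the 30 minutes actually used are billed.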

Since the t2.micro is covered by the AWS free tier for one year after signup (750 hours per month), let’s calculate the pricing for the masters.

t2.large -> $0.1152/hour

t2.medium -> $0.0576/hour

Total: $0.1728 per hour. So if you use them for 20 hours over a weekend, you’ll incur about $3.5 for the instances, plus EBS volume charges and taxes.
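The same estimate as a small Python sketch, using the hourly rates quoted above (they vary by region and over time, so treat them as example inputs rather than current prices):

```python
# Hourly on-demand rates quoted in this post, in USD.
hourly_rate_usd = {
    "t2.large": 0.1152,   # CM server / NameNode
    "t2.medium": 0.0576,  # standby
}

hours = 20  # e.g. one weekend of practice

combined_rate = sum(hourly_rate_usd.values())
cost = combined_rate * hours

# EBS volume charges and taxes are not included in this figure.
print(f"${combined_rate:.4f}/hour x {hours} h = ${cost:.2f}")
```

Change `hours` to match your own study plan to get a rough instance-only budget.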

The free tier includes 30 GB of EBS storage. Since the minimum root volume size is 8 GB per instance, four instances (4 × 8 GB = 32 GB) already exceed the free limit.
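A quick check of that EBS arithmetic in Python, using the 30 GB allowance and 8 GB minimum volume size quoted above (the variable names are illustrative):

```python
free_tier_ebs_gb = 30    # free-tier EBS allowance quoted in this post
min_root_volume_gb = 8   # minimum root volume size per instance
instance_count = 4       # CM/NN, standby, and two DataNodes

provisioned_gb = instance_count * min_root_volume_gb
billable_gb = max(0, provisioned_gb - free_tier_ebs_gb)

print(f"Provisioned {provisioned_gb} GB; {billable_gb} GB falls outside the free tier")
```

Larger root volumes or extra data volumes push `billable_gb` up quickly, which is how EBS charges can creep in unnoticed.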

I paid around $15 during my preparation. I didn’t know about EBS charges beforehand, so I ended up paying $3-$4 extra for provisioning 60+ GB of EBS volumes.

Once you have launched the instances, refer to Install Cloudera SCM server and Cloudera agents to build the cluster.


  • Launch the AWS instances in a new VPC so that they keep static private IPs.
  • The public IP changes on every stop/start.
  • Before running the Cloudera binary installer, disable SELinux and reboot the instance. (SELinux is enabled by default, and the installation will fail otherwise.) A reboot, unlike a stop/start, does not begin a new billing period.
  • Stop and start the instances, rather than terminating them, until your POC is done. If you terminate an instance, the cluster is gone.
  • Update /etc/hosts on all instances with the hostname and IP of every instance, so that forward and reverse lookups resolve.
  • You may run into a lot of issues while building the cluster in AWS; don’t get frustrated. Try to debug and resolve them, as it’s valuable learning.
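For the /etc/hosts step, a small Python sketch can generate identical entries for every node. The IPs and hostnames below are hypothetical placeholders; substitute the private IPs and names of your own instances.

```python
# Hypothetical private IPs and hostnames; replace with your instances' values.
nodes = [
    ("10.0.0.10", "cm-nn.example.internal", "cm-nn"),
    ("10.0.0.11", "standby.example.internal", "standby"),
    ("10.0.0.12", "dn1.example.internal", "dn1"),
    ("10.0.0.13", "dn2.example.internal", "dn2"),
]


def hosts_entries(nodes):
    """Format one /etc/hosts line per node: IP, fully qualified name, short alias."""
    return "\n".join(f"{ip}\t{fqdn}\t{alias}" for ip, fqdn, alias in nodes)


print(hosts_entries(nodes))
```

The same lines go into /etc/hosts on every instance, so each node resolves all the others consistently.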

AWS charges vary by region and over time. Please check the pricing details and use AWS at your own discretion.



Online Labs:

If your laptop isn’t powerful enough or you’re new to AWS, I’d suggest trying an online lab to avoid all these hassles.



Please post your queries and feedback in the comments section.


3 thoughts on “Multinode hadoop cluster configuration for CCA131 preparation”

  1. Hi Kannan,

    I emailed itversity support asking if I can use their labs to install Cloudera Manager. The support person, Vinod, said that right now the labs are only for developers.

    Any other cloud service?


  2. Hi Kannan,

    Did you try Azure? They provide pricing in Indian Rupees (INR), so I hope it will be cheaper compared to AWS. What are your thoughts on this?
