I took the Cloudera CCA131 Administrator exam last week and passed the certification.
In this post, I’ll cover the exam pattern as described by Cloudera, how to prepare for the exam, the topics you should know, study materials, tips, and feedback.
A little background about me:
I have 3+ years of experience as a Hadoop admin and 3+ years as a Linux admin, with a solid background in Hadoop/Linux administration and AWS. We use the Cloudera distribution at my company, so basically every day of my work is spent in Cloudera Manager and Linux sessions.
You can read more about me in the About Me section.
Certification format change:
I had wanted to take the CCAH exam since last December, but I kept postponing it. When I finally decided to take it by May, Cloudera had replaced the multiple-choice CCAH exam with the practical, hands-on CCA131 exam.
So I looked for resources and materials to prepare for the exam, but there were none. Cloudera says that CCA131 follows the same objectives as the Cloudera Administrator Training course and that the course is an excellent part of preparation for the exam. But the training costs $3195 (about 2 lakh Indian rupees), 10x the price of the certification, and I wonder how Cloudera justifies that price.
After giving up on finding resources, I decided to prepare from the topics listed in the exam blueprint. We have a test cluster at my company, so I made use of it for preparation. Since I’m not allowed to destroy and rebuild it, I spun up instances in AWS, installed Cloudera, built a cluster from scratch, and played with it for weeks.
Once I covered all the exam blueprint topics and felt confident, I gave it a shot and passed.
In the exam, you’ll be given a fully built, operational, running cluster with CDH 5.10.x on CentOS 7. Your user account credentials, cluster details, and the Cloudera Manager UI link and login credentials will all be listed on a page in the browser.
You’ll be given 8-12 problems covering various scenarios, and you can solve them in any order. For every problem, read the scenario carefully and understand what output is expected. You have to score 70% to pass the exam. There are no partial marks for a problem, so even if you do all the required tasks but forget to change the permission of one file, the problem will be marked incorrect.
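To make the pass mark concrete, here is a rough sketch of the arithmetic. It assumes every problem carries equal weight, which is my assumption for illustration and not something Cloudera states; `min_passes` is a made-up helper name.

```shell
# Smallest number of fully correct problems that reaches the 70% pass mark,
# ASSUMING all problems are weighted equally (an assumption, not an exam rule).
min_passes() {
  local total="$1" pass_pct=70
  # Integer ceiling of total * pass_pct / 100.
  echo $(( (total * pass_pct + 99) / 100 ))
}

for total in 8 10 12; do
  echo "$total problems -> at least $(min_passes "$total") must be fully correct"
done
# 8 problems -> at least 6 must be fully correct
# 10 problems -> at least 7 must be fully correct
# 12 problems -> at least 9 must be fully correct
```

Under that assumption, a single missed permission on each of a few problems is enough to drop you below the line, since a partly solved problem counts as zero.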
Once you complete the exam, you’ll get the result by email, confirming whether you passed or failed, along with the result of each problem.
Ex:
Result: FAIL
Problem 1 : PASS
Problem 2 : FAIL (Wrong configuration)
Problem 3 : FAIL (Incorrect Permissions)
Note: You may or may not be allowed to use the Cloudera documentation and the Apache Hadoop documentation in the exam browser; you’ll find out during the exam. I can’t confirm or comment on that.
How to prepare:
Four words – Get Your Hands Dirty!!
Without hands-on practice, it’s very difficult to pass this exam.
If you’re new to Hadoop and have no background in Linux, first familiarize yourself with the basic Linux commands. You should at least know how to change the ownership and permissions of files, copy/move files, create/delete directories, start/stop system services, etc.
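If you want a quick warm-up, something like the following covers those basics in a throwaway directory. It is only a practice sketch: the file names are made up, `stat -c` assumes GNU coreutils (as on CentOS), and the service name in the comments is just an example of a service you might manage.

```shell
#!/usr/bin/env bash
# Practice sandbox: everything happens under a throwaway temp directory.
set -eu
workdir="$(mktemp -d)"

# Create a directory tree and a sample file (names are illustrative).
mkdir -p "$workdir/conf/backup"
echo "sample=1" > "$workdir/conf/app.properties"

# Copy a file, then move (rename) the copy.
cp "$workdir/conf/app.properties" "$workdir/conf/backup/app.properties.bak"
mv "$workdir/conf/backup/app.properties.bak" "$workdir/conf/backup/app.properties.old"

# Change permissions: owner read/write, group read, others nothing.
chmod 640 "$workdir/conf/app.properties"

# Change ownership. Handing a file to another user needs root, e.g.
#   sudo chown hdfs:hadoop /path/to/file    (user/group are examples)
# Here we only re-assert the current user so the script runs unprivileged.
chown "$(id -u)" "$workdir/conf/app.properties"

# Inspect the result.
ls -l "$workdir/conf"
mode="$(stat -c '%a' "$workdir/conf/app.properties")"
echo "mode=$mode"

# Services on CentOS 7 are controlled with systemctl, for example:
#   sudo systemctl status cloudera-scm-agent
#   sudo systemctl restart cloudera-scm-agent
```

Once these feel automatic, repeat them with a different owner and group as root, since the exam checks ownership as well as the permission bits.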
- If you have a laptop with 8GB RAM, install the Cloudera QuickStart VM and get started right away. Though you won’t be able to add new nodes or set up High Availability in the QuickStart VM, you can practice the rest of the exam topics.
- If you want to practice building a cluster and setting up HA (which I feel you should really do), then you need a laptop/desktop with more than 16 GB of RAM. Upgrade your RAM and create CentOS VMs of size 8GB, 4GB, and 2GB, then install the Cloudera packages on top of them.
- If you’re worried that your laptop doesn’t have enough RAM capacity/slots to upgrade and you have no physical machines to use, then pass the worries to the Cloud. The Cloud has got your back.
Sign up for AWS, choose the instance configuration you want, spin it up, and BAM, you’re set to go.
AWS charges for instances on a per-hour basis. If your instance runs for 1 hr 10 mins (70 mins), you are already charged for 2 hours, so it’s better to plan beforehand what you’re going to practice and stop the instance before the next hour begins.
AWS charges are very cheap: you can get an 8GB instance for $0.20/hour. I incurred $14 for a month (30 hours of usage), including instances, storage costs, and taxes.
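The rounding is easy to underestimate, so here is a small sketch of it. It assumes the round-up-to-the-next-hour rule and the $0.20/hour rate quoted above (billing rules change, so check the current pricing page); the function names are made up for illustration.

```shell
# Hours billed when every started hour is charged in full
# (the rounding rule described above; an assumption about current billing).
billed_hours() {
  local minutes="$1"
  echo $(( (minutes + 59) / 60 ))
}

# Instance cost in US cents at the quoted $0.20/hour (20 cents).
instance_cost_cents() {
  local minutes="$1" rate_cents=20
  echo $(( $(billed_hours "$minutes") * rate_cents ))
}

echo "70 minutes -> $(billed_hours 70) billed hours"   # 2
echo "60 minutes -> $(billed_hours 60) billed hour"    # 1
echo "30 one-hour sessions -> $(( 30 * $(instance_cost_cents 60) )) cents"   # 600
```

That works out to about $6 of pure instance time for 30 well-planned hours; the rest of my $14 was storage and taxes.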
I’ll write a post soon on how to spin up a cluster on AWS and install Cloudera.
Make a cluster, break a cluster, play with Cloudera Manager, add a node, remove a node, look into every warning and error and work out how to fix it, add all the available services, and do all the crazy work you can with the cluster. That way you’re better prepared for the exam and can face any type of scenario question.
Below are the reference materials I used for my exam preparation, as well as in my career as a Hadoop admin.
Hadoop: The Definitive Guide – This book is considered the Bible of the Hadoop community. It covers all the Hadoop components and services in depth. It’s useful beyond the exam, and you can master Hadoop with this book.
Purchase Link: http://amzn.to/2wzI33z
Hadoop Operations – Though this book is outdated (2012 edition), it is valuable for Hadoop administrators because it focuses heavily on the administration side.
Purchase Link: http://amzn.to/2wISwdL
Cloudera Administration Handbook – This book explains Hadoop administration using Cloudera in detail. The books above take the Apache Hadoop view, whereas this one covers the Cloudera distribution. I’d recommend it for exam preparation.
Purchase Link: http://amzn.to/2wJz7cB
Cloudera documentation – No explanation needed. It’s a manual.
And last but not least, hadoopandcloud.com 😀
I’ll be writing about all the CCA131 exam topics in the blueprint, which you can use as a reference for your preparation. Check CCA131 – Exam Notes and Preparation Guide
Also I’ll be sharing valuable posts about Hadoop Administration, issues we face in our day to day work, exam tips and AWS certifications down the line. So make sure to bookmark this site!
IMPORTANT: This exam pattern is new, and questions can be of any type as it’s entirely hands-on. So don’t get fooled by sites offering dumps or exam questions for sale. Prepare as per the blueprint; with enough practice and knowledge, you’ll pass the exam. Again, don’t trust any dumps.
Be ready 15 minutes before the start of the exam. The online proctor will verify your ID, your room, your desk, etc.
- Address one problem at a time. Don’t go through all the problems at the beginning. You’ll get confused about which one to pick and may get stressed out by seeing the tough questions. Take them one by one.
- Understand the output expected. Give importance to wording and filesystems.
- If a directory has to be created, read again and confirm in which filesystem. If you create a dir in HDFS, but the question is to create it in local filesystem, then the problem will be marked wrong.
- Do cross-check file permissions. When you use the “cp” copy command, pass the ‘-p’ argument to preserve the file’s ownership, timestamps, and permissions. Even if just one write permission differs from the expected output, the problem will be marked wrong.
- If no filesystems are explicitly specified, then it is in the local Linux filesystem. For HDFS, it’ll be explicitly specified in the description.
- Always ensure that the client configurations are deployed after every change and that no services are left in a stale state.
- If any of your changes produces warnings or error messages in the service status, analyze and fix the issue. Otherwise the problem will be marked wrong.
- If some commands have to be executed as a specific user, make sure you switch to that user before running them. For the HDFS user, use “sudo -u hdfs” as a prefix for the “hadoop fs” / “hdfs dfs” commands.
- If you’re allowed to use documentation, don’t spend too much time on that. That will eat up all your time. Use it for reference if you have any doubt, don’t rely on it for every question.
- Ensure that no services are down and no warnings/messages are showing when you are about to finish the exam.
- DON’T EVER FINISH THE EXAM WITH THE CLUSTER IN A DOWN STATE. ALL YOUR PROBLEMS WILL BE MARKED WRONG AND YOU WILL FAIL THE EXAM IF THE CLUSTER IS NOT UP AT THE END OF THE EXAM.
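To see why the “cp -p” and “sudo -u hdfs” tips above matter, here is a small sketch you can run on any Linux box. The file names and the HDFS path `/tmp/cca131_demo` are made up for illustration, `stat -c` and `touch -d` assume GNU coreutils, and the HDFS part only runs if an `hdfs` client is actually installed.

```shell
set -eu
umask 027                               # fixed umask so the demo is repeatable
workdir="$(mktemp -d)"
src="$workdir/report.txt"
echo "data" > "$src"
chmod 666 "$src"
touch -d '2017-01-01 00:00:00' "$src"   # give the source an old timestamp

cp    "$src" "$workdir/plain.txt"       # mode filtered by umask, new timestamp
cp -p "$src" "$workdir/preserved.txt"   # mode and timestamp kept
# Note: ownership is only preserved by -p when you copy as root.

src_mode="$(stat -c '%a' "$src")"
plain_mode="$(stat -c '%a' "$workdir/plain.txt")"
kept_mode="$(stat -c '%a' "$workdir/preserved.txt")"
echo "source=$src_mode plain=$plain_mode preserved=$kept_mode"
# source=666 plain=640 preserved=666

# HDFS is a separate filesystem: the same directory name there is a
# different object. Run HDFS commands as the hdfs user where required.
if command -v hdfs >/dev/null 2>&1; then
  sudo -u hdfs hdfs dfs -mkdir -p /tmp/cca131_demo   # hypothetical path
  sudo -u hdfs hdfs dfs -ls -d /tmp/cca131_demo
fi
```

The plain copy silently ends up with different permissions than the source, which is exactly the kind of detail that gets a problem marked wrong.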
I’ll add more points as I recall them. Hope this helps.
Use the comments section below to post your doubts, questions, feedback.
Please follow my blog to get notified of more certification related posts, exam tips, as I’ll be regularly updating the blog.
P.S: Please don’t ask for exam questions or exam content as it violates Cloudera’s confidentiality agreement.