How to increase the HDFS capacity of AWS Elastic Mapreduce EMR cluster

emr hdfs

In this tutorial, we’re going to see how to increase the HDFS capacity of a running EMR cluster. Some time back, we received an alert that HDFS utilization was high on one of our clusters. Upon checking, the usage was expected, but we had under-provisioned the storage capacity when creating the cluster and …

AWS EMR Uniform Instance groups

This post gives an overview of AWS EMR uniform instance groups, along with the advantages and caveats of using them. The AWS EMR architecture consists of a master node, core node(s) and task nodes. If you’re new to EMR, refer to https://www.hadoopandcloud.com/aws/amazon-emr/ for a quick introduction. While creating the cluster, you have two configuration options for the nodes - instance …

How your resume shouldn’t be – observations on screening profiles for Hadoop Admin role

I've screened around 50 profiles/resumes in recent months for the technical round of a Hadoop Admin/Engineer position, and it's shocking to see how terrible the candidates' resumes are. All the resumes came through consultancies, which bear a major share of the blame for their poor quality. Below are the points I observed during screening …

Hadoop Cluster – Pre Maintenance procedure

In IT, it's inevitable that all servers undergo monthly security and vulnerability patching, and Hadoop servers are no exception. There would usually be a separate Systems team to perform OS-related patching, security updates, etc., and your role is to bring the cluster down and back up and ensure the application is healthy after patching. You have to schedule …