AWS

AWS Certified SysOps Administrator – Associate exam feedback

I passed the AWS Certified SysOps Administrator exam last week, completing the trifecta of AWS associate certifications. I took the AWS Solutions Architect associate exam 3 months ago and the Developer associate exam 2 months ago, which really helped increase my AWS expertise and my confidence, which is key for this certification. As expected, SysOps is…

cca131

Multinode Hadoop cluster configuration for CCA131 preparation

The CCA131 exam is fully hands-on, and you need practical experience working on a Hadoop cluster to pass it. If you don't have that experience, I'd recommend practicing with the Cloudera QuickStart VM, or building a multi-node cluster on your laptop using VMs or in AWS.…

Hadoop

Hadoop Cluster – Pre Maintenance procedure

In IT, it's inevitable that all servers undergo monthly security and vulnerability patching, and Hadoop servers are no exception. There will usually be a separate Systems team to perform OS-related patching, security updates, etc., and your role is to bring the cluster down and back up and to ensure the application is healthy after patching. You have to schedule…

Hadoop

Configure the Fair Scheduler to resolve application delays

Fair scheduling is a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time. Using the Fair Scheduler, we can create separate pools (queues) for each team and configure the resources for each pool, which helps overcome application delays. In the exam you may be asked…

Hadoop

Determine reason for application failure

A job/application can fail for many reasons, ranging from code errors, environment issues, file availability, permissions, MapReduce/YARN configuration, and resource allocation to server I/O or network issues. So the first thing to do when a job fails is to look at the error message and correlate it with your job. If an…

Hadoop

Benchmark the cluster (I/O, CPU, network)

Benchmarking is the process of stress testing the resources of the cluster. It's very useful for understanding the performance of your cluster and for checking that it performs as expected before taking it live. Here we are going to test the speed at which files are read from and written to HDFS, the time taken for mappers/reducers to process…

Hadoop

Resolve performance problems/errors in cluster operation

This is another scenario-based topic. Some common performance problems are jobs running slowly, services crashing due to running out of memory, etc. Example: when I was adding YARN roles, one of the NodeManagers failed to start. In this case, we have to identify the cause of the failure. Select the Log files dropdown…