Hadoop and Cloud

A place for Hadoop Admins and AWS aspirants

Tag: mapreduce

Why are map outputs stored on the local FS and not in HDFS?

April 9, 2018 ~ Kannan ~ 1 Comment

Map outputs are temporary intermediate data that serve no purpose to the user running the job. They are shuffled, sorted, and merged on their way to the reducers, which use them to produce the final output. Storing them in HDFS is not recommended because the data would be replicated across the cluster, the NameNode would have to update its metadata, etc. …
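To make the replication cost concrete, here is a rough back-of-the-envelope sketch. It assumes the default HDFS replication factor of 3 (`dfs.replication`) and that the first replica is written on the node running the map task, so only the remaining copies cross the network. The function name and the 10 GiB figure are hypothetical, chosen only for illustration.

```python
def intermediate_write_cost(map_output_bytes: int, replication: int = 3) -> dict:
    """Compare bytes written for map output on local disk vs. in HDFS.

    Assumption: the first HDFS replica lands on the local DataNode, so
    (replication - 1) copies are shipped over the network.
    """
    if map_output_bytes < 0 or replication < 1:
        raise ValueError("size must be >= 0 and replication must be >= 1")
    return {
        # Local FS: the data is written once, on the node that produced it.
        "local_fs_bytes": map_output_bytes,
        # HDFS: every block is stored `replication` times across the cluster.
        "hdfs_bytes": map_output_bytes * replication,
        # Extra network traffic caused purely by replication.
        "hdfs_network_bytes": map_output_bytes * (replication - 1),
    }


if __name__ == "__main__":
    gib = 1024 ** 3
    cost = intermediate_write_cost(10 * gib)  # hypothetical 10 GiB of map output
    for name, value in cost.items():
        print(f"{name}: {value / gib:.0f} GiB")
```

Under these assumptions, data that is discarded once the job finishes would cost three times the disk space and add replication traffic on top of the shuffle itself, in addition to the NameNode bookkeeping mentioned above.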
