Archives for 


What is HBase? – Hadoop HBase Introduction

What is HBase used for? HBase is Hadoop’s distributed, scalable NoSQL database for big data. It runs on HDFS, the distributed file system and primary storage layer for Hadoop. The HBase physical architecture consists of servers in a master-slave relationship. Typically, an HBase cluster has one Master node, called […]

In Map Reduce – Record reader importance

Hadoop runs on the Hadoop Distributed File System (HDFS), which means it is based on distributed computing. When a data set passes through the Hadoop system, it is split into blocks. The default block size is 64 MB in Hadoop 1.x (128 MB from Hadoop 2.x onward). The block size can also be set to a multiple of this default, that is, 64 MB or […]
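
As a rough illustration of the block arithmetic described in this excerpt, here is a minimal sketch in plain Python. It does not call any Hadoop API; the function name `split_into_blocks` and the 200 MB example file are assumptions made up for illustration, and the 64 MB default is the figure quoted in the excerpt.

```python
def split_into_blocks(file_size_mb, block_size_mb=64):
    """Return the sizes (in MB) of the blocks a file would occupy.

    Hypothetical helper for illustration only; it models the arithmetic,
    not how HDFS actually places or replicates blocks.
    """
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative, block size positive")

    blocks = []
    remaining = file_size_mb
    while remaining > 0:
        # Every block is full-sized except possibly the last one,
        # which holds only the leftover data.
        current = min(block_size_mb, remaining)
        blocks.append(current)
        remaining -= current
    return blocks


if __name__ == "__main__":
    # A 200 MB file with the 64 MB block size from the excerpt.
    print(split_into_blocks(200))
```

With these assumed numbers, the file occupies three full 64 MB blocks plus one final 8 MB block; the last block takes only as much space as the data that remains.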

Relationship with Input Splits and HDFS Blocks

Hadoop is a very dynamic technology that currently dominates the technology world. When learning Hadoop, you first have to understand what Hadoop is. Hadoop is usually defined as a framework running on distributed data systems. Data ingested into Hadoop is divided into blocks and stored in […]