HDFS Replication

“HDFS replication is not decided by the NameNode.” Surprised?

It is a common misconception that replication is dictated by the master node (the NameNode). That is wrong: the replication factor is always decided by the client, that is, the node that initiates the operation.

There is no single cluster-wide replication factor, because replication is set per file. The NameNode only enforces bounds on the value a client requests: a minimum (dfs.namenode.replication.min, default: 1) and a maximum (dfs.replication.max, default: 512).
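
The division of responsibility described above can be sketched as a small model. This is not Hadoop code: the function names and the exception class are hypothetical, and the assumption that an out-of-range request is rejected rather than silently adjusted is a simplification made for illustration. The bounds are the defaults quoted in the text.

```python
# Simplified model: the client picks the factor, the NameNode only checks bounds.
# All names here are hypothetical; none of this is part of the Hadoop API.

MIN_REPLICATION = 1    # default of dfs.namenode.replication.min
MAX_REPLICATION = 512  # default of dfs.replication.max


class ReplicationOutOfRange(ValueError):
    """Raised when a requested factor falls outside the NameNode bounds."""


def namenode_check(requested, minimum=MIN_REPLICATION, maximum=MAX_REPLICATION):
    """Return the requested factor unchanged if it is within bounds, else raise."""
    if requested < minimum:
        raise ReplicationOutOfRange(f"{requested} is below the minimum of {minimum}")
    if requested > maximum:
        raise ReplicationOutOfRange(f"{requested} exceeds the maximum of {maximum}")
    return requested


def effective_replication(client_default, per_file_override=None):
    """Resolve the factor for one file: a per-file override wins over the client default."""
    if per_file_override is not None:
        requested = per_file_override
    else:
        requested = client_default
    return namenode_check(requested)
```

In this model the NameNode never chooses a value of its own; it only accepts or refuses what the client asked for.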

dfs.replication is 3 by default, but it can be changed in any of the following ways:

  1. hdfs-site.xml -> modify the dfs.replication value in the client's configuration.
  2. -Ddfs.replication=<n> -> pass it while copying a file.
  3. setrep -> run the setrep command on a file that already exists.
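
Options 2 and 3 can be sketched as command lines. The helpers below only build the argument lists for the `hdfs dfs` shell and do not run anything; the helper names and the example paths are hypothetical, and running the commands assumes a machine with the `hdfs` client installed and configured.

```python
import shlex


def put_command(local_path, hdfs_path, replication):
    """Build an `hdfs dfs -put` call that overrides dfs.replication for this copy only."""
    return ["hdfs", "dfs", "-D", f"dfs.replication={replication}",
            "-put", local_path, hdfs_path]


def setrep_command(hdfs_path, replication, wait=True):
    """Build an `hdfs dfs -setrep` call that changes the factor of an existing file."""
    cmd = ["hdfs", "dfs", "-setrep"]
    if wait:
        cmd.append("-w")  # -w waits until the target replication is reached
    cmd += [str(replication), hdfs_path]
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them; the paths are made-up examples.
    print(shlex.join(put_command("report.csv", "/data/report.csv", 2)))
    print(shlex.join(setrep_command("/data/report.csv", 5)))
```

Either list can be handed to `subprocess.run(cmd, check=True)` on a client node. In both cases the value comes from the machine issuing the command, which matches the point that the client decides the replication factor.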

To understand how replication is always decided by the source, watch my explanation on the channel: https://www.youtube.com/watch?v=t20niJDO1f4


