HDFS#
Features of HDFS#
Highly Scalable
Replication
Fault tolerance
Distributed data storage
Portable
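To make the replication feature concrete, here is a rough sketch of how much raw cluster storage a single file consumes. The 1 GiB file size is a hypothetical value chosen for illustration; the block size of 128 MiB and the replication factor of 3 are the stock defaults for `dfs.blocksize` and `dfs.replication` in Hadoop 2.x and later, and your cluster may be configured differently.

```shell
#!/usr/bin/env bash
# Assumed values for illustration only; check dfs.blocksize and
# dfs.replication in your own hdfs-site.xml.
file_mib=1024     # hypothetical 1 GiB file
block_mib=128     # default dfs.blocksize (Hadoop 2.x and later)
replication=3     # default dfs.replication

# Number of blocks, rounded up so a partial last block is counted.
blocks=$(( (file_mib + block_mib - 1) / block_mib ))
# Total block replicas stored across the DataNodes.
replicas=$(( blocks * replication ))
# Raw disk space consumed across the whole cluster.
raw_mib=$(( file_mib * replication ))

echo "blocks=$blocks replicas=$replicas raw_mib=$raw_mib"
# prints: blocks=8 replicas=24 raw_mib=3072
```

Losing one DataNode therefore removes at most one of the three copies of any block, which is what the fault tolerance feature relies on.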
Where to use HDFS#
Very Large Files
Streaming Data Access
Commodity Hardware
Where not to use HDFS#
Low Latency Data Access
Lots of Small Files
Multiple Writes
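The "lots of small files" case is a problem because the NameNode keeps an entry for every file and every block in memory. The sketch below compares the number of namespace objects for the same 1 GiB of data stored as one large file versus many 1 KiB files. The sizes are hypothetical and the 128 MiB block size is the assumed default; the count ignores directories.

```shell
#!/usr/bin/env bash
block_mib=128     # assumed default dfs.blocksize

# Case 1: a single 1 GiB file -> 1 file entry plus its blocks.
big_blocks=$(( (1024 + block_mib - 1) / block_mib ))
big_objects=$(( 1 + big_blocks ))

# Case 2: the same 1 GiB split into 1 KiB files.
# Every file occupies at least one block, however small it is.
small_files=$(( 1024 * 1024 ))
small_objects=$(( small_files * 2 ))   # one file entry + one block each

echo "one big file:     $big_objects objects"
echo "many small files: $small_objects objects"
```

The data volume is identical in both cases, but the small-file layout asks the NameNode to track several orders of magnitude more objects.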
HDFS Architecture#

HDFS Read Image:

HDFS Write Image:

Starting HDFS#
hadoop namenode -format
start-dfs.sh
Listing Files in HDFS#
$HADOOP_HOME/bin/hadoop fs -ls <args>
Inserting Data into HDFS#
hadoop fs -mkdir -p /user/input
hadoop fs -put /home/hadoop/input/README.txt /user/input
hadoop fs -ls /user/input
Retrieving Data from HDFS#
hadoop fs -cat /user/input/README.txt
hadoop fs -get /user/input/README.txt /home/hadoop/
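The insert and retrieve steps above can be chained into one small helper. This is only a sketch: `hdfs_roundtrip` is a hypothetical function name, and it assumes `hadoop` is on the `PATH` and HDFS is already running.

```shell
#!/usr/bin/env bash
# Upload a local file into an HDFS directory, then fetch it back
# into a local output directory. Stops at the first failing step.
hdfs_roundtrip() {
  local src="$1" hdfs_dir="$2" out_dir="$3"
  local name
  name="$(basename "$src")"

  hadoop fs -mkdir -p "$hdfs_dir" &&
  hadoop fs -put "$src" "$hdfs_dir" &&
  hadoop fs -get "$hdfs_dir/$name" "$out_dir/"
}
```

For example, `hdfs_roundtrip /home/hadoop/input/README.txt /user/input /home/hadoop` would repeat the commands shown above in a single call.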
Shutting Down the HDFS#
stop-dfs.sh
HDFS Basic File Operations#
Copying data from the local file system to HDFS:
hadoop fs -copyFromLocal /usr/home/Desktop/data.txt /user/test
Copying data from HDFS to the local file system:
hadoop fs -copyToLocal /user/test/data.txt /usr/bin/data_copy.txt
Comparing the two files to confirm they are identical:
md5 /usr/bin/data_copy.txt /usr/home/Desktop/data.txt
Deleting recursively:
hadoop fs -rmr <arg>
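The checksum comparison in this section uses `md5`, which is the BSD/macOS tool; most Linux systems ship `md5sum` instead. The following sketch wraps both so the comparison can be scripted. The function names `md5_of` and `same_file` are hypothetical helpers, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Print only the MD5 digest of a file, using whichever tool is installed.
md5_of() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$1" | awk '{print $1}'   # GNU coreutils: "digest  filename"
  else
    md5 -q "$1"                      # BSD/macOS: -q prints only the digest
  fi
}

# Succeed (exit status 0) only when both files have the same digest.
same_file() {
  [ "$(md5_of "$1")" = "$(md5_of "$2")" ]
}
```

With these helpers, `same_file /usr/bin/data_copy.txt /usr/home/Desktop/data.txt && echo "files match"` reports whether the copy fetched from HDFS matches the original.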
HDFS Other commands#
put <localSrc> <dest>
copyFromLocal <localSrc> <dest>
moveFromLocal <localSrc> <dest>
get [-crc] <src> <localDest>
cat <filename>
moveToLocal <src> <localDest>
setrep [-R] [-w] rep <path>
touchz <path>
test -[ezd] <path>
stat [format] <path>
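As an illustration of how `test` combines with the other commands, the sketch below uploads a file only when the destination does not exist yet. `hadoop fs -test -e <path>` exits with status 0 when the path exists; `put_if_absent` is a hypothetical helper name, and the function assumes `hadoop` is on the `PATH`.

```shell
#!/usr/bin/env bash
# Upload src to dest unless dest already exists in HDFS.
put_if_absent() {
  local src="$1" dest="$2"
  if hadoop fs -test -e "$dest"; then
    echo "skip: $dest already exists"
  else
    hadoop fs -put "$src" "$dest"
  fi
}
```

A call such as `put_if_absent /home/hadoop/input/README.txt /user/input/README.txt` can then be repeated safely in a script without failing on the second run.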