HDFS#
Features of HDFS#
Highly Scalable
Replication
Fault tolerance
Distributed data storage
Portable
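To make the replication and distributed-storage features concrete, the sketch below estimates how many blocks a file occupies and how much raw disk its replicas consume. It assumes the common defaults of a 128 MB block size and a replication factor of 3; both are configurable (`dfs.blocksize`, `dfs.replication`), so treat the numbers as illustrative. The function names are hypothetical helpers, not part of Hadoop.

```shell
#!/bin/sh
# Assumed defaults; a real cluster may be configured differently.
BLOCK_SIZE_MB=128
REPLICATION=3

# blocks_for <file_size_mb>: number of HDFS blocks, rounded up
blocks_for() {
  echo $(( ($1 + BLOCK_SIZE_MB - 1) / BLOCK_SIZE_MB ))
}

# raw_storage_mb <file_size_mb>: disk space used across all replicas
# (the last block is not padded, so this is simply size * replication)
raw_storage_mb() {
  echo $(( $1 * REPLICATION ))
}

blocks_for 1000       # prints 8
raw_storage_mb 1000   # prints 3000
```

With these assumptions a 1000 MB file is split into 8 blocks, and losing any single DataNode still leaves two copies of every block.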
Where to use HDFS#
Very Large Files
Streaming Data Access
Commodity Hardware
Where not to use HDFS#
Low Latency Data Access
Lots of Small Files
Multiple Writes
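The small-files limitation comes from the NameNode keeping the whole namespace in memory. A rough way to see the effect is sketched below, assuming the widely quoted rule of thumb of about 150 bytes of NameNode memory per namespace object (file or block). That figure is an approximation, not a measured value, and `namenode_bytes` is a hypothetical helper.

```shell
#!/bin/sh
# Assumption: ~150 bytes of NameNode heap per file or block object.
BYTES_PER_OBJECT=150

# namenode_bytes <file_count> <blocks_per_file>
# Counts one object per file plus one per block.
namenode_bytes() {
  echo $(( ($1 + $1 * $2) * BYTES_PER_OBJECT ))
}

# One million 1 MB files, one block each
namenode_bytes 1000000 1   # prints 300000000

# Roughly the same volume stored as 8000 files of one 128 MB block each
namenode_bytes 8000 1      # prints 2400000
```

Under this estimate the same amount of data costs about 300 MB of NameNode memory as small files and about 2.4 MB as block-sized files.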
HDFS Architecture#
HDFS Read Image:
HDFS Write Image:
Starting HDFS#
hdfs namenode -format
start-dfs.sh
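After starting HDFS it is worth confirming that the daemons actually came up. The sketch below parses the output of `jps` (the JDK tool that lists running Java processes) and reports any missing daemon. It assumes a single-node setup where NameNode, DataNode, and SecondaryNameNode all run on the same machine; on a multi-node cluster the DataNodes run on other hosts. `check_daemons` is a hypothetical helper.

```shell
#!/bin/sh
# check_daemons: reads `jps` output ("<pid> <ClassName>" per line) on stdin
# and reports which of the expected HDFS daemons are missing.
check_daemons() {
  input=$(cat)
  missing=""
  for d in NameNode DataNode SecondaryNameNode; do
    # Exact match on the second field, so "NameNode" does not
    # accidentally match "SecondaryNameNode".
    printf '%s\n' "$input" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      missing="$missing $d"
  done
  if [ -z "$missing" ]; then
    echo "all HDFS daemons running"
  else
    echo "missing:$missing"
    return 1
  fi
}

# Usage on the machine where start-dfs.sh was run:
#   jps | check_daemons
```

If a daemon is reported missing, its log file under the Hadoop logs directory is usually the first place to look.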
Listing Files in HDFS#
$HADOOP_HOME/bin/hadoop fs -ls <args>
Inserting Data into HDFS#
hadoop fs -mkdir -p /user/input
hadoop fs -put /home/hadoop/input/README.txt /user/input
hadoop fs -ls /user/input
Retrieving Data from HDFS#
hadoop fs -cat /user/input/README.txt
hadoop fs -get /user/input/README.txt /home/hadoop/
Shutting Down HDFS#
stop-dfs.sh
HDFS Basic File Operations#
Copying data from the local file system to HDFS:
hadoop fs -copyFromLocal /usr/home/Desktop/data.txt /user/test
Copying data from HDFS to the local file system:
hadoop fs -copyToLocal /user/test/data.txt /usr/bin/data_copy.txt
Compare the two files to verify that they are identical:
md5 /usr/bin/data_copy.txt /usr/home/Desktop/data.txt
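The `md5` command is the BSD/macOS tool; on most Linux systems the equivalent is `md5sum`. If the goal is only to confirm that the round trip through HDFS left the file unchanged, a byte-for-byte comparison with the POSIX `cmp` utility works on both. The sketch below wraps it in a hypothetical `verify_copy` helper.

```shell
#!/bin/sh
# verify_copy <original> <copy>
# Prints "identical" if the files match byte for byte,
# otherwise prints "different" and returns a non-zero status.
verify_copy() {
  if cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
    return 1
  fi
}

# Usage with the paths from the steps above:
#   verify_copy /usr/home/Desktop/data.txt /usr/bin/data_copy.txt
```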
Recursive deleting:
hadoop fs -rm -r <arg>
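Recursive deletion is easy to get wrong when the path comes from a variable that turns out to be empty. The sketch below is one possible guard, using `-rm -r`, the current form of the deprecated `-rmr`. `safe_rmr` is a hypothetical wrapper, and the `HADOOP` variable is an assumption added so the command can be swapped for `echo` as a dry run.

```shell
#!/bin/sh
# Command used to reach HDFS; set HADOOP=echo for a dry run.
HADOOP="${HADOOP:-hadoop}"

# safe_rmr <hdfs_path>
# Refuses to act on an empty path or the filesystem root.
safe_rmr() {
  case "$1" in
    ""|"/")
      echo "refusing to delete '$1'" >&2
      return 1
      ;;
  esac
  $HADOOP fs -rm -r "$1"
}

# Usage:
#   safe_rmr /user/test
```

When trash is enabled on the cluster, `-rm -r` moves the files to the trash directory rather than removing them immediately.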
Other HDFS Commands#
put <localSrc> <dest>
copyFromLocal <localSrc> <dest>
moveFromLocal <localSrc> <dest>
get [-crc] <src> <localDest>
cat <filename>
moveToLocal <src> <localDest>
setrep [-R] [-w] rep <path>
touchz <path>
test -[ezd] <path>
stat [format] <path>
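As an illustration of how `test`, `setrep`, and `stat` combine, the sketch below changes the replication factor of a path only if it exists and then prints the result. `ensure_replication` is a hypothetical helper, and the `HADOOP` variable is an assumption added so the commands can be previewed with `HADOOP=echo`. The `stat` format uses `%n` (name), `%r` (replication), and `%b` (size in bytes).

```shell
#!/bin/sh
# Command used to reach HDFS; set HADOOP=echo to preview the calls.
HADOOP="${HADOOP:-hadoop}"

# ensure_replication <hdfs_path> <factor>
ensure_replication() {
  # -test -e exits with status 0 when the path exists
  if $HADOOP fs -test -e "$1"; then
    # -w waits until the new replication factor has been reached
    $HADOOP fs -setrep -w "$2" "$1"
    $HADOOP fs -stat "%n: replication=%r size=%b" "$1"
  else
    echo "no such path: $1" >&2
    return 1
  fi
}

# Usage:
#   ensure_replication /user/test/data.txt 2
```

Waiting with `-w` can take a while on large files, so it may be reasonable to drop it when the call is part of a batch job.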