HDFS
Features of HDFS
Highly Scalable
Replication
Fault tolerance
Distributed data storage
Portable
Where to use HDFS
Very Large Files
Streaming Data Access
Commodity Hardware
Where not to use HDFS
Low Latency data access
Lots Of Small Files
Multiple Writes
HDFS Architecture
HDFS Read Image:
HDFS Write Image:
Starting HDFS
hadoop namenode -format
start-dfs.sh
Listing Files in HDFS
$HADOOP_HOME/bin/hadoop fs -ls <args>
Inserting Data into HDFS
hadoop fs -mkdir -p /user/input
hadoop fs -put /home/hadoop/input/README.txt /user/input
hadoop fs -ls /user/input
Retrieving Data from HDFS
hadoop fs -cat /user/input/README.txt
hadoop fs -get /user/input/README.txt /home/hadoop/
Shutting Down the HDFS
stop-dfs.sh
HDFS Basic File Operations
Putting data to HDFS from local file system:
hadoop fs -copyFromLocal /usr/home/Desktop/data.txt /user/test
Copying data from HDFS to local file system:
hadoop fs -copyToLocal /user/test/data.txt /usr/bin/data_copy.txt
Compare the files and see that both are same:
md5 /usr/bin/data_copy.txt /usr/home/Desktop/data.txt
Recursive deleting:
hadoop fs -rmr <arg>
HDFS Other commands
put <localSrc><dest>
copyFromLocal <localSrc><dest>
moveFromLocal <localSrc><dest>
get [-crc] <src><localDest>
cat <filen-ame>
moveToLocal <src><localDest>
setrep [-R] [-w] rep <path>
touchz <path>
test -[ezd] <path>
stat [format] <path>