(I) HDFS
HDFS is a distributed file system for big data storage. It offers high fault tolerance, high reliability, high scalability, and high throughput.
1. HDFS Architecture
HDFS follows a master/slave architecture. An HDFS cluster consists of one metadata node (NameNode) and several data nodes (DataNodes). The NameNode is the master node; it stores and manages file metadata (file name, size, storage path; each metadata entry is about 150 bytes, so roughly 8 entries per KB) and handles requests from clients. The DataNodes are the slave nodes; each manages the data storage on its own machine. File data is stored on the DataNodes.
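The 150-bytes-per-entry figure leads to a common back-of-envelope estimate of NameNode heap usage. The sketch below is illustrative arithmetic only, not an official Hadoop sizing formula:

```python
# Rough estimate of NameNode heap needed for metadata, assuming ~150 bytes
# per metadata entry (one entry per file plus one per block), as stated above.
BYTES_PER_ENTRY = 150

def namenode_memory_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Approximate heap used by file and block metadata."""
    return num_files * (1 + blocks_per_file) * BYTES_PER_ENTRY

# 10 million single-block files -> ~3 GB of heap for metadata alone,
# which is why HDFS favors a smaller number of large files.
print(namenode_memory_bytes(10_000_000))  # 3000000000
```

This is also the usual argument for why many small files are a problem for HDFS: each file costs NameNode memory regardless of its size.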
2. Main components
Data in HDFS is stored as blocks, 128 MB each by default. (If a file's last block is smaller than 128 MB, it occupies only its actual size on disk, not the full block.)
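The block-splitting rule above can be sketched as follows; this is a conceptual illustration, not Hadoop's actual implementation:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # HDFS default block size: 128 MB

def split_into_blocks(file_size: int) -> list[int]:
    """Return the sizes (in bytes) of the blocks a file occupies.
    The last block only takes its actual size, not the full 128 MB."""
    blocks = []
    remaining = file_size
    while remaining > 0:
        blocks.append(min(remaining, BLOCK_SIZE))
        remaining -= BLOCK_SIZE
    return blocks

# A 300 MB file is stored as blocks of 128 MB, 128 MB, and 44 MB.
print([b // (1024 * 1024) for b in split_into_blocks(300 * 1024 * 1024)])
# [128, 128, 44]
```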
(1) NameNode: stores metadata and handles client requests.
Metadata (file name, file size, storage location, etc.) is persisted in two files, fsimage and edits:
FsImage: the filesystem image file; it holds a snapshot of the metadata.
Edits: the operation log file; HDFS file operations are recorded in it.
(2) DataNode: stores data in blocks, and periodically sends heartbeats to the NameNode to report on the blocks it stores.
How the DataNode works:
- Data blocks are stored on the DataNode as files. Each block consists of two files: the data itself and a metadata file holding the block's length, checksums, and timestamp.
- After startup, the DataNode registers with the NameNode, then periodically (every hour) reports its block list to the NameNode.
- The DataNode and NameNode exchange heartbeats every 3 seconds. If the NameNode receives no heartbeat from a DataNode for a period of time (about 10 minutes), it marks that node as unavailable.
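The "about 10 minutes" above comes from Hadoop's dead-node formula, which combines the heartbeat interval with a recheck interval. A sketch using the default values (`dfs.heartbeat.interval` = 3 s, `dfs.namenode.heartbeat.recheck-interval` = 5 min):

```python
HEARTBEAT_INTERVAL_S = 3              # DN -> NN heartbeat every 3 seconds
HEARTBEAT_RECHECK_MS = 5 * 60 * 1000  # NN recheck interval: 5 minutes

def dead_node_timeout_seconds(recheck_ms: int = HEARTBEAT_RECHECK_MS,
                              interval_s: int = HEARTBEAT_INTERVAL_S) -> float:
    """Time without heartbeats before the NN declares a DataNode dead:
    2 * recheck interval + 10 * heartbeat interval."""
    return 2 * recheck_ms / 1000 + 10 * interval_s

print(dead_node_timeout_seconds())  # 630.0 seconds, i.e. 10 min 30 s
```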
(3) SecondaryNameNode (SNN): assists the NameNode; introduced to solve the problem of long startup times.
Metadata is kept in memory so read requests can be served quickly. To guard against loss on power failure, a backup copy (FsImage) is kept on disk.
Synchronously rewriting FsImage on every in-memory metadata modification would be too slow, so the Edits log is introduced: operations are only appended to it, which is very efficient.
Role of the SNN: during startup, the NameNode loads FsImage and Edits; once the complete metadata has been reconstructed, it writes out a new FsImage. If the system runs for a long time between two startups, Edits grows large, and merging fsimage and edits then takes a long time, so startup is slow. The SecondaryNameNode therefore helps the NameNode merge FsImage and Edits ahead of time.
SNN workflow:
- The SNN asks the NN whether a checkpoint is needed; the NN returns the answer.
- The SNN requests that the NN perform a checkpoint.
- The NN rolls the Edits log it is currently writing.
- The image file FsImage and the edit log Edits are copied to the SNN via HTTP GET.
- The SNN loads the image and log files into memory, merges them, and generates fsimage.chkpoint.
- The SNN sends fsimage.chkpoint back to the NN via HTTP POST; the NN renames it to FsImage, overwriting the original file.
A SecondaryNameNode checkpoint is typically triggered on a timer or when the number of entries in Edits reaches a threshold.
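The two triggers can be sketched as a simple predicate. The thresholds below mirror the usual defaults (checkpoint period 1 hour, 1 million transactions), but the constant names are illustrative, not actual configuration keys:

```python
CHECKPOINT_PERIOD_S = 3600   # trigger a checkpoint every hour, or...
CHECKPOINT_TXNS = 1_000_000  # ...when Edits accumulates this many entries

def need_checkpoint(seconds_since_last: int, txns_since_last: int) -> bool:
    """True when either the time-based or the size-based trigger fires."""
    return (seconds_since_last >= CHECKPOINT_PERIOD_S
            or txns_since_last >= CHECKPOINT_TXNS)

print(need_checkpoint(100, 500))        # False: neither threshold reached
print(need_checkpoint(4000, 500))       # True: time threshold reached
print(need_checkpoint(100, 2_000_000))  # True: transaction threshold reached
```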
3. File Read and Write Flow
(1) File read flow
- The client sends a read request for a file to the NameNode (C -> NN).
- The NameNode returns the file's metadata: the list of blocks the file is divided into, and for each block the list of DataNodes holding it.
- For each block, the client uses the block's replica list to pick the nearest DataNode and establishes a connection to it.
- After the block's data has been received, the client closes the connection to that DataNode and moves on to download the next block from its node.
- Once all blocks are read, the client closes the stream.
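The read steps above can be sketched as a toy simulation. All names here are illustrative mock structures; the real client entry point is `FileSystem.open()` in the Hadoop Java API:

```python
# Mock NameNode: path -> ordered list of (block_id, replica DataNodes).
# Mock DataNodes: node name -> {block_id: bytes}.
def read_file(namenode: dict, datanodes: dict, path: str) -> bytes:
    data = b""
    # 1-2. Ask the NameNode for the file's block list and replica locations.
    for block_id, replica_nodes in namenode[path]:
        # 3. Pick the nearest replica (here simply the first in the list).
        nearest = replica_nodes[0]
        # 4. Download this block, then move on to the next one.
        data += datanodes[nearest][block_id]
    # 5. All blocks read; the client would now close the stream.
    return data

namenode = {"/logs/a.txt": [("blk_1", ["dn1", "dn2"]), ("blk_2", ["dn3"])]}
datanodes = {"dn1": {"blk_1": b"hello "}, "dn3": {"blk_2": b"world"}}
print(read_file(namenode, datanodes, "/logs/a.txt"))  # b'hello world'
```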
(2) File write flow
- The client sends a write request to the NameNode (C -> NN).
- The NameNode checks permissions, whether the file already exists, and so on, then returns whether the upload may proceed.
- The client loops over the file's blocks; for each block it sends a request to the NN and receives the list of DataNodes to write to, e.g. [ds1, ds2, ds3].
- For the current block, the client picks the nearest DataNode (ds1), establishes a connection using an FSDataOutputStream, and writes the data.
- The replication mechanism takes care of the remaining copies: ds1 forwards the written data to ds2, and ds2 forwards it to ds3.
- After all blocks are written, the client closes the stream.
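The ds1 -> ds2 -> ds3 forwarding above is the replication pipeline. A minimal sketch, with illustrative names and an in-memory dict standing in for DataNode storage:

```python
def write_block(block: bytes, pipeline: list[str], storage: dict) -> None:
    """Client sends the block to pipeline[0]; each node stores it and
    forwards it to the next node in the pipeline (ds1 -> ds2 -> ds3)."""
    for node in pipeline:
        storage.setdefault(node, []).append(block)

storage: dict = {}
for block in [b"block-0", b"block-1"]:
    # In real HDFS the NameNode may return a different pipeline per block.
    write_block(block, ["ds1", "ds2", "ds3"], storage)

print(sorted(storage))     # ['ds1', 'ds2', 'ds3'] -- every node got a copy
print(storage["ds3"])      # [b'block-0', b'block-1']
```

The pipeline design means the client only pushes each block once; the DataNodes fan the data out among themselves, saving client bandwidth.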
4. Key Points
(1) Network topology: calculating node distance
Distance between two nodes: the sum of each node's distance to their nearest common ancestor.
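This rule can be computed from topology paths of the form `/datacenter/rack/node`, the convention Hadoop uses for rack awareness. A small sketch:

```python
def distance(a: str, b: str) -> int:
    """Hops from each node up to the nearest common ancestor, summed.
    Nodes are topology paths like '/d1/rack1/n1'."""
    pa, pb = a.strip("/").split("/"), b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)

print(distance("/d1/rack1/n1", "/d1/rack1/n1"))  # 0: same node
print(distance("/d1/rack1/n1", "/d1/rack1/n2"))  # 2: same rack
print(distance("/d1/rack1/n1", "/d1/rack2/n3"))  # 4: same data center
print(distance("/d1/rack1/n1", "/d2/rack3/n4"))  # 6: different data centers
```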
(2) Hadoop replica node selection
First replica: if the client runs on a cluster node, it is placed on that node; otherwise a node is chosen at random.
Second replica: on a different node in the same rack as the first.
Third replica: on a rack different from the first two, on a random node.
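The placement policy above can be sketched as follows. Nodes are modeled as `(rack, host)` pairs; this is a simplified illustration, not Hadoop's actual `BlockPlacementPolicy` implementation (which also weighs load and free space):

```python
import random

def choose_replicas(client, nodes, seed=0):
    """Pick 3 replica locations per the policy described above."""
    rng = random.Random(seed)
    # 1st replica: the client's own node if it is in the cluster, else random.
    first = client if client in nodes else rng.choice(nodes)
    # 2nd replica: a different node on the same rack as the first.
    second = rng.choice([n for n in nodes if n[0] == first[0] and n != first])
    # 3rd replica: a random node on a different rack.
    third = rng.choice([n for n in nodes if n[0] != first[0]])
    return [first, second, third]

nodes = [("rack1", "n1"), ("rack1", "n2"), ("rack2", "n3"), ("rack2", "n4")]
replicas = choose_replicas(("rack1", "n1"), nodes)
print(replicas[0])     # ('rack1', 'n1') -- the client's own node
print(replicas[1][0])  # 'rack1'         -- same rack, different node
print(replicas[2][0])  # 'rack2'         -- a different rack
```

The trade-off: two replicas in one rack keep the write pipeline cheap, while the third replica in another rack survives a whole-rack failure.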