ObjectStore
Scenario
Write a file
Read a file
Use multiple machines to store these files
Storage
How to save a file in one machine
Metadata
FileInfo
Name = dengchao.mp4
CreatedTime = 201505031232
Size = 2044323
Index
Block 11 -> diskOffset1
Block 12 -> diskOffset2
Block 13 -> diskOffset3
Block
1 block = 1024 bytes
Advantages
Error checking
Fragmenting the data for storage
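The single-machine layout above can be sketched as a small data structure. This is only an illustration: the FileInfo class, the blocks_needed helper, and the back-to-back disk layout starting at block 11 are assumptions, not something the notes specify.

```python
from dataclasses import dataclass, field
from typing import Dict

BLOCK_SIZE = 1024  # bytes per block, as in the notes


@dataclass
class FileInfo:
    name: str
    created_time: str
    size: int  # file size in bytes
    # Index: block id -> offset of that block on disk.
    index: Dict[int, int] = field(default_factory=dict)


def blocks_needed(size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks needed to hold `size` bytes (ceiling division)."""
    return (size + block_size - 1) // block_size


info = FileInfo(name="dengchao.mp4", created_time="201505031232", size=2044323)

# Assumed layout: blocks are numbered from 11 and stored back to back from offset 0.
for i in range(blocks_needed(info.size)):
    info.index[11 + i] = i * BLOCK_SIZE
```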
How to save a much larger file in one machine
Change chunk size
1 chunk = 64 MB = 64 * 1024 KB
Advantages
Reduce size of metadata
Disadvantages
Waste space for small files
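The trade-off above can be checked with a little arithmetic. The 10 GB file size is an arbitrary example, and the wasted-space figure assumes a full chunk is reserved on disk even for a tiny file.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def index_entries(file_size: int, unit_size: int) -> int:
    """Number of index entries needed when the file is split into units of unit_size bytes."""
    return -(-file_size // unit_size)  # ceiling division


file_size = 10 * GB  # arbitrary example size

entries_with_blocks = index_entries(file_size, 1 * KB)   # 1 KB blocks
entries_with_chunks = index_entries(file_size, 64 * MB)  # 64 MB chunks

# Assumption: a whole chunk is reserved on disk, so a 1 KB file leaves the rest unused.
wasted_for_1kb_file = 64 * MB - 1 * KB
```

With these numbers the index shrinks from 10,485,760 entries to 160, while a 1 KB file would leave almost all of its 64 MB chunk unused.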
Scale
Architecture style
Peer-to-peer (BitComet, Cassandra)
Advantage: No single point of failure
Disadvantage: Multiple machines need to negotiate with each other
Master-slave
Advantage: Simple design. Easy to keep data consistent
Disadvantage: Master is a single point of failure
Final decision
Master + slave
Restart the single master when it fails
How to save an extra large file on several machines
One master + many chunk servers
Move chunk offset from master to slaves
The master doesn't record the disk offset of a chunk; each chunk server tracks the offsets of its own chunks
Advantage: Reduce the size of metadata in master; Reduce the traffic between master and chunk server
Write process
The client divides the file into chunks and creates a chunk index for each chunk
The client sends (FileName, chunk index) to the master, and the master replies with the assigned chunk servers
The client transfers the data directly to the assigned chunk servers
Modifying a file after it is written is not supported
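The write steps above can be sketched as follows. This is a minimal in-memory sketch: the Master and ChunkServer classes, the round-robin assignment policy, and the tiny 4-byte chunk size are all assumptions made for readability (the notes use 64 MB chunks and do not name a policy).

```python
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.chunks: Dict[Tuple[str, int], bytes] = {}

    def write(self, file_name: str, chunk_index: int, data: bytes) -> None:
        self.chunks[(file_name, chunk_index)] = data


class Master:
    def __init__(self, servers: List[ChunkServer]) -> None:
        self.servers = servers
        # (file name, chunk index) -> chunk server name; no disk offsets are kept here.
        self.chunk_map: Dict[Tuple[str, int], str] = {}

    def assign(self, file_name: str, chunk_index: int) -> ChunkServer:
        # Hypothetical policy: round-robin over the known chunk servers.
        server = self.servers[chunk_index % len(self.servers)]
        self.chunk_map[(file_name, chunk_index)] = server.name
        return server


def write_file(master: Master, file_name: str, data: bytes) -> int:
    # Step 1: the client splits the file into chunks.
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for index, chunk in enumerate(chunks):
        # Step 2: ask the master which chunk server should hold this chunk.
        server = master.assign(file_name, index)
        # Step 3: send the data directly to that chunk server.
        server.write(file_name, index, chunk)
    return len(chunks)


servers = [ChunkServer("cs1"), ChunkServer("cs2")]
master = Master(servers)
chunk_count = write_file(master, "a.txt", b"helloworld")
```

Note that the file data never passes through the master; it only hands out assignments.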
Read process
The client sends (FileName) to master and receives a chunk list (chunk index, chunk server) from the master
The client connects to each listed chunk server to read the corresponding chunks
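A sketch of the read path, using plain dictionaries in place of real servers. The names chunk_map, chunk_servers, master_lookup, and read_file are hypothetical, and the stored contents are made-up sample data.

```python
from typing import Dict, List, Tuple

# Master state: (file name, chunk index) -> chunk server name.
chunk_map: Dict[Tuple[str, int], str] = {
    ("a.txt", 0): "cs1",
    ("a.txt", 1): "cs2",
    ("a.txt", 2): "cs1",
    ("b.txt", 0): "cs2",
}

# Chunk server state: server name -> the chunks it stores.
chunk_servers: Dict[str, Dict[Tuple[str, int], bytes]] = {
    "cs1": {("a.txt", 0): b"hell", ("a.txt", 2): b"ld"},
    "cs2": {("a.txt", 1): b"owor", ("b.txt", 0): b"hi"},
}


def master_lookup(file_name: str) -> List[Tuple[int, str]]:
    """Step 1: the master returns the chunk list, ordered by chunk index."""
    return sorted(
        (index, server)
        for (name, index), server in chunk_map.items()
        if name == file_name
    )


def read_file(file_name: str) -> bytes:
    """Step 2: the client fetches each chunk from its server and joins them in order."""
    parts = []
    for index, server in master_lookup(file_name):
        parts.append(chunk_servers[server][(file_name, index)])
    return b"".join(parts)
```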
Master task
Store metadata for different files
Store Map (file name + chunk index -> chunk server)
Find the corresponding chunk servers when a file is read
Direct writes to the chunk servers with the most available capacity
Failure and recovery
Single master
Double master (Apache Hadoop Goes Realtime at Facebook)
Multi master (Paxos algorithm)
What if a chunk is broken
Checksum: 4 bytes = 32 bits
Each chunk has a checksum
Write checksum when writing out a chunk
Verify the checksum when reading in a chunk
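The checksum steps above might look like this. The notes only say the checksum is 32 bits; using CRC-32 via zlib.crc32 and appending it to the end of the chunk are assumptions made for this sketch.

```python
import zlib


def write_chunk(data: bytes) -> bytes:
    """Return the bytes to store: the chunk data followed by its 4-byte checksum."""
    checksum = zlib.crc32(data)  # unsigned 32-bit value in Python 3
    return data + checksum.to_bytes(4, "big")


def read_chunk(stored: bytes) -> bytes:
    """Verify the trailing checksum and return the chunk data, or raise if it is broken."""
    data, expected = stored[:-4], int.from_bytes(stored[-4:], "big")
    if zlib.crc32(data) != expected:
        raise ValueError("chunk is corrupted")
    return data
```

A chunk server that hits the ValueError would treat the chunk as broken and fall back to the recovery path.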
Avoid loss of data when chunk server is down
Replica: 3 copies
Two copies in the same data center but on different racks
Third copy in a different data center
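The placement rule above can be sketched as a small selection function. The Server tuple and place_replicas are hypothetical names, and taking the first server in the list as the primary copy is an assumption.

```python
from typing import List, NamedTuple


class Server(NamedTuple):
    name: str
    data_center: str
    rack: str


def place_replicas(servers: List[Server]) -> List[Server]:
    """Pick 3 servers: two in one data center on different racks, one in another data center."""
    first = servers[0]  # assumption: the first candidate holds the primary copy
    # Second copy: same data center, different rack.
    second = next(
        s for s in servers
        if s.data_center == first.data_center and s.rack != first.rack
    )
    # Third copy: a different data center.
    third = next(s for s in servers if s.data_center != first.data_center)
    return [first, second, third]
```

If no server satisfies a rule, next raises StopIteration; a real system would need a fallback, which is left out here.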
How to choose chunk servers
Find servers which are not busy
Find servers with lots of available disk space
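One way to combine the two criteria above is to filter out servers that cannot fit the chunk and then sort by load. The ServerStatus fields and the tie-breaking order (fewest active requests first, then most free space) are assumptions.

```python
from typing import List, NamedTuple


class ServerStatus(NamedTuple):
    name: str
    active_requests: int  # how busy the server is
    free_bytes: int       # available disk space


def choose_servers(statuses: List[ServerStatus], chunk_size: int, count: int) -> List[str]:
    """Return up to `count` server names, least busy first, then most free space."""
    candidates = [s for s in statuses if s.free_bytes >= chunk_size]
    candidates.sort(key=lambda s: (s.active_requests, -s.free_bytes))
    return [s.name for s in candidates[:count]]
```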
How to recover when a chunk is broken
Ask master for help
How to find whether a chunk server is down
Heartbeat messages sent from each chunk server to the master
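A sketch of heartbeat-based failure detection on the master side. HeartbeatMonitor and the timeout value are hypothetical; timestamps are passed in explicitly so the example stays deterministic.

```python
from typing import Dict, List


class HeartbeatMonitor:
    def __init__(self, timeout: float) -> None:
        self.timeout = timeout  # seconds of silence before a server is considered down
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, server: str, now: float) -> None:
        """Record that `server` reported in at time `now`."""
        self.last_seen[server] = now

    def dead_servers(self, now: float) -> List[str]:
        """Servers whose last heartbeat is older than the timeout."""
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```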
How to solve client bottleneck
Client only writes to a leader chunk server. The leader chunk server is responsible for communicating with other chunk servers.
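The leader-based write above can be sketched like this. The ChunkServer class and the store_as_leader method are hypothetical names, and the forwarding is done synchronously for simplicity.

```python
from typing import Dict, List


class ChunkServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.chunks: Dict[str, bytes] = {}

    def store(self, chunk_id: str, data: bytes) -> None:
        self.chunks[chunk_id] = data

    def store_as_leader(self, chunk_id: str, data: bytes,
                        followers: List["ChunkServer"]) -> int:
        """Store the chunk locally, then forward it to every follower replica."""
        self.store(chunk_id, data)
        for follower in followers:
            follower.store(chunk_id, data)
        return 1 + len(followers)  # number of copies written


leader = ChunkServer("cs1")
followers = [ChunkServer("cs2"), ChunkServer("cs3")]

# The client uploads the chunk once; the leader fans it out to the other replicas.
copies = leader.store_as_leader("a.txt#0", b"chunk-bytes", followers)
```

The client's upload bandwidth is spent once per chunk instead of once per replica.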
How to select the leader chunk server
How to solve chunk server failure
Ask the client to retry
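A client-side retry could be as simple as the loop below. write_with_retry is a hypothetical helper, and treating ConnectionError as the signal for a chunk server failure is an assumption.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def write_with_retry(write: Callable[[], T], max_attempts: int = 3) -> T:
    """Call `write` until it succeeds or `max_attempts` attempts have failed."""
    last_error: Optional[ConnectionError] = None
    for _ in range(max_attempts):
        try:
            return write()
        except ConnectionError as error:  # assumed signal that a chunk server failed
            last_error = error
    assert last_error is not None  # only reached after at least one failure
    raise last_error
```

In practice the client would likely ask the master for a fresh chunk server assignment before each retry; that step is omitted here.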