hadoop, distributed, systems, storage, cloud computing
15-440, Hadoop Distributed File System
Allison Naaktgeboren
Ur doin' it rong kitteh
Wut u mean? I iz loadin a HA-doop fileh
Announcements
Go Vote!
Interpretive Dances happen only after Lecture
Office Hour Change
    Mon: 6:30-9:30
    Tues: 6-7:30
Exams are graded
Hadoop Core at 30,000 ft
Back to the Map Reduce Model
Recall that
    map (in_key, in_value) -> (inter_key, inter_value) list
    combine (inter_key, inter_value) -> (inter_key, inter_value)
    reduce (inter_key, inter_value list) -> (out_key, out_value)
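To make the three signatures concrete, here is a minimal single-process word count sketch. It is not how Hadoop runs a job: the helper names (map_fn, combine_fn, reduce_fn, group_by_key, run_job) are made up for illustration, and combine is assumed to receive all values a single mapper produced for one key.

```python
from collections import defaultdict

def map_fn(in_key, in_value):
    # in_key: document name (unused here); in_value: document text.
    return [(word, 1) for word in in_value.split()]

def combine_fn(inter_key, inter_values):
    # Local pre-aggregation on the mapper's machine, so less data
    # has to cross the network before the reduce step.
    return (inter_key, sum(inter_values))

def reduce_fn(inter_key, inter_values):
    # Final aggregation over the values collected from all mappers.
    return (inter_key, sum(inter_values))

def group_by_key(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def run_job(documents):
    combined = []
    for name, text in documents.items():
        local = group_by_key(map_fn(name, text))
        combined.extend(combine_fn(k, vs) for k, vs in local.items())
    shuffled = group_by_key(combined)  # stands in for the shuffle phase
    return dict(reduce_fn(k, vs) for k, vs in shuffled.items())

counts = run_job({"a.txt": "to be or not to be", "b.txt": "to do"})
```

The combine step matters for the "skinny pipes" point: for a.txt it ships one ("to", 2) pair instead of two ("to", 1) pairs.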
What resource are we most constrained by?
“Oceans of Data, Skinny pipes”
How many types of data will the file system care about?
How long will we need each kind?
What is the common case for each?
What would an MR Filesystem need?
General use case: large files
    Mostly append to end, long sequential reads, few deletes
    Appends might be concurrent
Scalability
    Adding (or losing) machines should be relatively painless
Nodes work on nearby data
    Minimize moving data between machines
    Bandwidth is our limiting resource
    Remember how much data
Failure (handling) is Common
    Yea, yea we know, we took 213, we know hardware sucks
    Disks, processors, whole nodes, racks, and datacenters
    No, really, failure (handling) is common (constant)
Addressing Those Concerns
Sequential reads, appends need to be fast
    Deletes can be painful
“Hot plug” machines
    Add or lose machines while system is running jobs
    System should auto detect the change
HDFS should distribute data somewhat evenly
    So that all workers have a reasonable amount of data to chew on
    And coordinating with the Jobtracker (job master)
Data Replication
    Should be spread out. Why?
    What type of problems could arise?
Moving into the Details
Nodes in HDFS
    NameNode (master) (like GFS Master)
    DataNodes (slaves) (like GFS chunkservers)
“careful use of jargon defines the true expert”
NB: Hadoop and HDFS are closely paired
    “worker node A” and “data node 1” are frequently the same machine
Two types of Masters
    Jobtracker (Hadoop Job Master)
    NameNode (file system Master)
        What I mean by 'master' for the rest of the lecture
Your Data goes in ....
Files are divided into Chunks
    64 MB
The mapping between filename and chunks goes to the Master
Each chunk is replicated and sent off to DataNodes
    By default, 3 replicas
    The Master determines which DataNodes
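A rough sketch of the two steps above: cutting a file into 64 MB chunks and choosing 3 DataNodes per chunk. The round-robin placement is an assumption made only for illustration; the slides do not say how the Master picks DataNodes, and split_into_chunks and place_replicas are hypothetical names.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size from the slide
REPLICATION = 3                # default replica count from the slide

def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return an (offset, length) pair for each chunk of the file."""
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]

def place_replicas(num_chunks, datanodes, replication=REPLICATION):
    """Assign each chunk to `replication` distinct DataNodes.

    Round-robin is a stand-in policy, not the real HDFS one.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    return {
        chunk_id: [
            datanodes[(chunk_id + r) % len(datanodes)]
            for r in range(replication)
        ]
        for chunk_id in range(num_chunks)
    }

chunks = split_into_chunks(200 * 1024 * 1024)  # a 200 MB file
placement = place_replicas(len(chunks), ["dn1", "dn2", "dn3", "dn4"])
```

A 200 MB file becomes three full chunks plus one 8 MB tail, and each chunk ends up on three different nodes.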
What the Clients Do
Where the data starts
On file creation, creates a separate file w/ checksum
    When data is fetched back from a DataNode, checksum is computed again
Cache file data
    Avoid bothering the Master too often
When a Client has 1 chunk's worth of data
    Contacts the Master; Master sends names of the DataNodes to send it to
    Client ONLY sends it to the 1st
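The checksum round trip can be sketched as follows. CRC32 from Python's zlib module is used here purely as an example checksum, and write_with_checksum and verify are hypothetical helper names, not part of any Hadoop API.

```python
import zlib

def write_with_checksum(data: bytes):
    # On write, the client computes a checksum and stores it alongside the data.
    return data, zlib.crc32(data)

def verify(data: bytes, expected_checksum: int) -> bool:
    # On read, the checksum is recomputed and compared with the stored one.
    return zlib.crc32(data) == expected_checksum

stored, checksum = write_with_checksum(b"chunk payload")
ok = verify(stored, checksum)              # data came back intact
bad = verify(b"chunk pay1oad", checksum)   # one byte changed in transit
```

If verification fails, a client would typically fetch the chunk from a different replica, which is one reason replication and checksums work well together.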
What the DataNodes Do
Heartbeat to the Master
Opens, closes, or replicates a chunk if requested by the Master
During replication, sends data to next DataNode in chain
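The chained replication described above can be sketched like this. The DataNode class and its receive method are invented for illustration and ignore networking, acknowledgements, and failures.

```python
class DataNode:
    """Toy DataNode that stores a chunk and forwards it down the chain."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}

    def receive(self, chunk_id, data, pipeline):
        # Store the chunk locally first.
        self.chunks[chunk_id] = data
        # Then pass it to the next node, handing over the rest of the chain.
        if pipeline:
            next_node, rest = pipeline[0], pipeline[1:]
            next_node.receive(chunk_id, data, rest)

nodes = [DataNode(name) for name in ("dn1", "dn2", "dn3")]
# The client only talks to the first node; the chain does the rest.
nodes[0].receive("chunk-0", b"data", nodes[1:])
```

The client uploads each chunk once, and the copying cost is spread across the DataNodes' links instead of the client's.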
What the NameNode Does
System metadata!
    Holds Name->ID mapping
    Chunk replica locations
    Transaction Logs
        EditLog
        FSImage
It is responsible for coherency
    Uses the logs atomically
    Addresses the concurrent writes issue
    Similar to AFS volume snapshots
It is checkpointed
    Will pull last consistent log upon restart
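One way to picture how the FSImage and EditLog fit together is a checkpoint plus a replayable log. The class below is a toy model under that assumption: NameNodeMetadata, record, checkpoint, and restart are made-up names, and the in-memory fsimage and edit_log stand in for files that would live on disk.

```python
import copy

class NameNodeMetadata:
    """Toy model of NameNode metadata: a checkpoint image plus an edit log."""

    def __init__(self):
        self.namespace = {}  # live view: filename -> list of chunk IDs
        self.fsimage = {}    # last checkpointed copy (on disk in a real system)
        self.edit_log = []   # operations recorded since that checkpoint

    @staticmethod
    def _apply(namespace, op, name, chunk_ids):
        if op == "create":
            namespace[name] = list(chunk_ids)
        elif op == "delete":
            namespace.pop(name, None)
        else:
            raise ValueError(f"unknown operation: {op}")

    def record(self, op, name, chunk_ids=()):
        # Log first, then apply, so an applied change is never missing from the log.
        self.edit_log.append((op, name, tuple(chunk_ids)))
        self._apply(self.namespace, op, name, chunk_ids)

    def checkpoint(self):
        # Fold the current state into the image and start a fresh log.
        self.fsimage = copy.deepcopy(self.namespace)
        self.edit_log.clear()

    def restart(self):
        # Rebuild the live view: load the image, then replay the log.
        self.namespace = copy.deepcopy(self.fsimage)
        for op, name, chunk_ids in self.edit_log:
            self._apply(self.namespace, op, name, chunk_ids)

meta = NameNodeMetadata()
meta.record("create", "/logs/a", [1, 2])
meta.checkpoint()
meta.record("create", "/logs/b", [3])
meta.record("delete", "/logs/a")
meta.namespace = {}  # simulate losing in-memory state in a crash
meta.restart()
```

After the restart the namespace again contains only /logs/b, because the image supplied /logs/a and the replayed log created /logs/b and deleted /logs/a.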
What the NameNode Does
Listens for Heartbeats
    If no heartbeat, marks a node as dead
    Its data is deregistered
Listens for Client Requests
    Signals creating, opening, closing
It selects DataNodes
    Which nodes get which chunks
Deletes
    Orders move to /trash
    Starts delete timer
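The heartbeat bookkeeping might look roughly like the sketch below. The 30-second timeout, the HeartbeatMonitor class, and its method names are all assumptions for illustration; timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Toy view of the Master tracking which DataNodes are alive."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}        # node name -> time of last heartbeat
        self.chunk_locations = {}  # chunk ID -> set of node names holding it

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def register_chunk(self, chunk_id, node):
        self.chunk_locations.setdefault(chunk_id, set()).add(node)

    def sweep(self, now):
        # Any node silent for longer than the timeout is marked dead,
        # and its replicas are removed from the location table.
        dead = [n for n, seen in self.last_seen.items()
                if now - seen > self.timeout]
        for node in dead:
            del self.last_seen[node]
            for holders in self.chunk_locations.values():
                holders.discard(node)
        return dead

monitor = HeartbeatMonitor(timeout_seconds=30)  # timeout value is an assumption
monitor.heartbeat("dn1", now=0)
monitor.heartbeat("dn2", now=0)
monitor.register_chunk("chunk-0", "dn1")
monitor.register_chunk("chunk-0", "dn2")
monitor.heartbeat("dn1", now=40)  # dn1 checks in again; dn2 stays silent
dead = monitor.sweep(now=45)
```

Once dn2 is deregistered, chunk-0 has fewer live replicas than the target, which is the Master's cue to ask a surviving DataNode to replicate it.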
All together Now!
Additional Resources
Hadoop wiki
Youtube → “Hadoop” → Google developer videos (1-3 will be helpful)
Google University
    Includes UW course, the other UW course, a couple others
    Use at your own risk
“The Google File System” paper is rather readable as research papers go