System Design for Facebook
This may be a complete mismatch with what actually goes on inside Facebook. It only shows how one can reason logically and draw rough estimates about a big cloud service.
According to this reference (Facebook stats on Zephoria), Facebook handled the following (as of 2013-2014):
Let's see how this translates into system load per second.
(Assuming 10^5 seconds per day instead of the actual 86,400, to keep the arithmetic simple)
Estimation of storage
5.1 GB/sec translates to 5.1 GB x 60 x 60 x 24 x 365 ≈ 160 PB per year.
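The yearly figure can be verified with a quick calculation (the 5.1 GB/sec rate is taken from the load estimate above):

```python
# Back-of-envelope check of yearly storage growth at 5.1 GB/sec.
GB_PER_SEC = 5.1
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # 31,536,000 (using the real day length here)

gb_per_year = GB_PER_SEC * SECONDS_PER_YEAR
pb_per_year = gb_per_year / 1_000_000  # 1 PB = 10^6 GB (decimal units)

print(f"{pb_per_year:.1f} PB per year")  # prints "160.8 PB per year"
```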
Assuming Facebook keeps the past 3 years of data for serving promptly (with older data
accessed rarely from a different kind of storage), we need a storage size of around 3 x 160 PB ≈ 500 PB.
This is about one-third the size of the database we calculated for Google Maps, and so should be manageable.
Number of storage nodes required
Assuming a single machine could manage around 15 TB of space, we would need 500 PB / 15 TB ≈ 33,000 database shards.
We would need some replicas for each shard to handle read traffic and to add fault tolerance.
Assuming a 1-master, 2-replica setup per shard, we would need around 100,000 database nodes, each holding 15 TB of space.
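The shard and node counts follow directly from the assumptions above:

```python
# Shard and node counts for ~500 PB at 15 TB per machine.
TOTAL_PB = 500
TB_PER_SHARD = 15
NODES_PER_SHARD = 3  # 1 master + 2 replicas

total_tb = TOTAL_PB * 1000            # 1 PB = 10^3 TB (decimal units)
shards = total_tb / TB_PER_SHARD      # ~33,000 shards
nodes = shards * NODES_PER_SHARD      # ~100,000 nodes

print(round(shards), round(nodes))    # prints "33333 100000"
```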
Note that this kind of data is perfect for caching as once uploaded, a photo will never be changed.
(It can be deleted, but not updated).
Comments and likes can be updated but that is also not very frequent and we can omit those updates here.
So instead of this giant cloud of DB nodes, we could serve most of this read traffic from a much cheaper caching layer, keeping the database nodes mainly for writes and cache misses.
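Because a photo never changes after upload, a simple cache-aside pattern with no invalidation logic is enough. Here is a minimal sketch, where `fetch_from_db` is a hypothetical stand-in for the expensive sharded-DB read:

```python
# Cache-aside sketch for immutable photo data. A cached photo never needs
# invalidation; a delete just evicts the key.
photo_cache = {}

def fetch_from_db(photo_id):
    # Placeholder for the expensive read from a database shard.
    return f"<bytes of photo {photo_id}>"

def get_photo(photo_id):
    if photo_id not in photo_cache:               # cache miss: one DB read
        photo_cache[photo_id] = fetch_from_db(photo_id)
    return photo_cache[photo_id]                  # later reads hit the cache

def delete_photo(photo_id):
    photo_cache.pop(photo_id, None)               # evict; no update path exists
```

In production this role would be played by a distributed cache (e.g. memcached) rather than an in-process dictionary, but the access pattern is the same.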
Traffic analysis for Facebook
We arrived at around 75,000 requests per second for Facebook above.
But those are only write queries. Number of reads would be substantially higher.
Assuming a factor of 10 reads per write, we get around 750,000 read requests per second.
Here are two sources for the maximum queries per second a server can handle:
These links suggest 10,000 to 15,000 read requests per second when most of the data is served from cache.
For our case, let us assume that a single server can handle 10,000 reads per second and 1% of that as write load (i.e., 100 writes per second).
Writes are expensive because they need to go through DB constraints and update indexes, caches, etc.
So to handle 75,000 writes, we need 75,000 / 100 = 750 shards where each master takes 100 writes per second.
Since the data is distributed among the shards, each shard will receive an equal part of read traffic as well.
That figure will be 750,000 / 750 = 1,000 reads per shard per second, which is easily manageable by a single shard machine as well.
Thus a cloud with 750 shards should be sufficient for this purpose where each machine takes 100 writes/sec and 1000 reads/second.
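The per-shard load numbers above can be checked in a few lines:

```python
# Per-shard load check for the traffic figures above.
WRITES_PER_SEC = 75_000
READ_WRITE_RATIO = 10
WRITES_PER_SHARD = 100   # assumed write capacity of one shard master

reads_per_sec = WRITES_PER_SEC * READ_WRITE_RATIO   # 750,000 reads/sec
shards = WRITES_PER_SEC // WRITES_PER_SHARD         # 750 shards
reads_per_shard = reads_per_sec // shards           # 1,000 reads/shard/sec

print(shards, reads_per_shard)  # prints "750 1000"
```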
Also see: Who's got the most web servers