Architecture tradeoff analysis

Review Rubrics

Soft skills

Requirements gathering
Make decisions and tradeoffs with justification
Describe the solution using concise language and accurate technical terms

Hard skills

Design quality: scalability, reliability, efficiency, etc. (L4/L5)
Basic facts about existing software and hardware capabilities (L4 partly, L5)
Project lifecycle awareness, e.g. how a project is developed and maintained (L5)

Non-functional requirements (NFRs)

| Type | Description |
| --- | --- |
| Performance | Efficiency, such as throughput and response time |
| Availability | Uptime percentage over a year |
| Scalability | Service capacity increases linearly as the number of nodes increases |
| Extensibility | Pluggable design that makes it easy to add new functionality |
| Security | Privacy and security |
| Observability | Ability to detect problems and find the root cause quickly |
| Testability | Ease of testing the different components |
| Robustness | Fault tolerance and fast recovery; high robustness usually indicates high availability |
| Portability / Compatibility | Support for different operating systems, hardware, software (browsers, etc.) and versions |
| Consistency | Whether all nodes or clients see the same data at the same time |

Availability

Availability percentage and service downtime
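
An availability percentage translates directly into a yearly downtime budget. The following is a minimal sketch of that conversion; the function name is illustrative and a 365-day year is assumed.

```python
# Convert an availability percentage into the downtime it allows per year.
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Return the yearly downtime budget, in seconds."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_seconds(pct) / 3600
        print(f"{pct}% availability -> {hours:.2f} hours of downtime per year")
```

Each extra "nine" cuts the budget by a factor of ten, which is why the jump from 99.9% to 99.99% usually requires automated failover rather than manual recovery.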

Commodity hardware failure trend

If your service has 4-5 systems and dozens of database servers (around 10) on the critical path, and you assume an annual failure rate of 2%, you should expect roughly two disk failures every year.
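
The failure estimate above is a simple expected-value calculation. The sketch below assumes independent failures and a hypothetical fleet of 100 disks, a figure chosen only because it makes the 2% rate yield two failures per year; it is not taken from the text.

```python
def expected_failures_per_year(component_count: int, annual_failure_rate: float) -> float:
    """Expected number of failed components in one year."""
    return component_count * annual_failure_rate


def probability_of_any_failure(component_count: int, annual_failure_rate: float) -> float:
    """Chance that at least one component fails in a year (independence assumed)."""
    return 1 - (1 - annual_failure_rate) ** component_count


if __name__ == "__main__":
    disks = 100  # hypothetical fleet size
    afr = 0.02   # 2% annual failure rate, as in the text
    print(expected_failures_per_year(disks, afr))
    print(round(probability_of_any_failure(disks, afr), 3))
```

Even at a low per-disk failure rate, the chance of seeing at least one failure in a year becomes high once the fleet is large, so failure handling belongs in the design rather than in an incident playbook.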

Failure trends in a large disk drive population

Decision chart

[TODO: Decision chart]

COGS

Commodity hardware

https://www.brentozar.com/archive/2014/12/commodity-hardware/#:~:text=Commodity hardware refers to cheap,E5%2D2600 v3 CPU sockets
Two Intel Xeon E5-2623 v3’s (quad core) – $900 total
128GB RAM (using 8GB DIMMs) – $1,920
Two 512GB SSDs for fast storage – $450
Six 4TB hard drives for slow storage – $900
Grand total: $5,070

Capacity planning

1. Get a baseline: MAU and DAU

Industry benchmarks show the average stickiness of products for various industries. Stickiness is calculated as (DAU/MAU)*100. Benchmarks usually report the median along with the average, because medians are less likely to be skewed by outliers.
For the SaaS industry, the average stickiness is 13%, which means slightly less than 4 days of activity per user per month. The median for the SaaS industry is 9.4%, implying less than 3 days of activity per user per month.
Multiply DAU/WAU by WAU/MAU to get the actual DAU/MAU ratio:
- Facebook: ~72%
- Ecommerce:
  - Amazon: 17%
  - Walmart: 15%
  - eBay: 3%
- Finance:
  - Paypal: 12.5%
  - Venmo: 10%
- Uber: 12.5%
- Netflix: 3%
- Groupon: 4.5%
References:
- https://medium.com/sequoia-capital/selecting-the-right-user-metric-de95015aa38
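
The stickiness arithmetic above can be sketched as follows. The function names and the 30-day month are assumptions made for illustration.

```python
def stickiness_percent(dau: float, mau: float) -> float:
    """Stickiness = (DAU / MAU) * 100."""
    return dau / mau * 100


def active_days_per_month(stickiness_pct: float, days_in_month: int = 30) -> float:
    """Average number of active days per user per month implied by stickiness."""
    return stickiness_pct / 100 * days_in_month


def dau_mau_from_weekly(dau_over_wau: float, wau_over_mau: float) -> float:
    """Chain the two ratios: (DAU/WAU) * (WAU/MAU) = DAU/MAU."""
    return dau_over_wau * wau_over_mau


if __name__ == "__main__":
    print(active_days_per_month(13.0))  # SaaS average from the text
    print(active_days_per_month(9.4))   # SaaS median from the text
```

With a 30-day month, 13% stickiness works out to about 3.9 active days and 9.4% to about 2.8, which matches the "slightly less than 4 days" and "less than 3 days" statements.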

2. Growth speed

For fast-growing data (e.g. order data on an ecommerce website), provision 2x the planned capacity to avoid resharding.
For slow-growing data (e.g. user identity data on an ecommerce website), provision the estimated 3-year capacity to avoid resharding.
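
A rough sketch of the two provisioning rules follows. The compound annual growth model and the parameter names are assumptions; the text only states the 2x and 3-year rules.

```python
def fast_growing_capacity(planned_capacity: float, headroom_factor: float = 2.0) -> float:
    """Fast-growing data: provision a multiple of the planned capacity."""
    return planned_capacity * headroom_factor


def slow_growing_capacity(current_size: float, annual_growth_rate: float, years: int = 3) -> float:
    """Slow-growing data: provision for the size expected after `years`.

    Assumes compound annual growth, which is a modelling choice, not a rule
    from the text.
    """
    return current_size * (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    print(fast_growing_capacity(10.0))       # e.g. 10 TB planned for order data
    print(slow_growing_capacity(100.0, 0.10))  # e.g. 100 GB of user data, 10% growth
```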

3. Divide capacity by system capability

Single Kafka instance

Single-machine write: 250K messages per second (50 MB/s)
Single-machine read: 550K messages per second (110 MB/s)
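
Dividing the required capacity by the per-instance capability gives an instance count. The sketch below uses the Kafka figures above; the `target_utilization` headroom parameter and its 0.7 default are assumptions, not values from the text.

```python
import math

KAFKA_WRITE_MSGS_PER_SEC = 250_000  # single-machine write figure from the text
KAFKA_READ_MSGS_PER_SEC = 550_000   # single-machine read figure from the text


def instances_needed(peak_msgs_per_sec: float,
                     per_instance_msgs_per_sec: float,
                     target_utilization: float = 0.7) -> int:
    """Instances required so that each one stays below the target utilization."""
    usable = per_instance_msgs_per_sec * target_utilization
    return max(1, math.ceil(peak_msgs_per_sec / usable))


if __name__ == "__main__":
    # Hypothetical workload: 1 million messages written per second at peak.
    print(instances_needed(1_000_000, KAFKA_WRITE_MSGS_PER_SEC))
```

Sizing against a utilization target rather than the raw maximum leaves room for traffic bursts and for replication traffic, which this sketch does not model.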

Appendix: Conversions

Power of two

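| Power of two | Approximate base-10 number | Short name |
| --- | --- | --- |
| 2^10 | 1 thousand (10^3) | 1 KB |
| 2^20 | 1 million (10^6) | 1 MB |
| 2^30 | 1 billion (10^9) | 1 GB |
| 2^40 | 1 trillion (10^12) | 1 TB |
| 2^50 | 1 quadrillion (10^15) | 1 PB |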

Time scale conversion

Total seconds in a day: 86400 ~ 10^5
2.5 million requests per month: 1 request per second
100 million requests per month: 40 requests per second
1 billion requests per month: 400 requests per second
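
These rules of thumb follow from dividing by the number of seconds in the period. A minimal sketch, assuming a 30-day month:

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY  # assumption: 30-day month


def monthly_to_per_second(requests_per_month: float) -> float:
    """Average requests per second for a monthly request volume."""
    return requests_per_month / SECONDS_PER_MONTH


def daily_to_per_second(requests_per_day: float) -> float:
    """Average requests per second for a daily request volume."""
    return requests_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for volume in (2_500_000, 100_000_000, 1_000_000_000):
        print(f"{volume:>13,} per month -> {monthly_to_per_second(volume):.1f} per second")
```

The exact results are slightly below the rounded figures in the list, which is fine for back-of-the-envelope estimates.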

Performance estimation

Memory

Random access: 300K times/s
Sequential access: 5M times/s
Throughput: on the order of GB per second
Reading 1 MB of data from memory takes about 0.25 ms

Disk IO

Operating system page size for reads and writes: 4 KB
SATA mechanical hard disk
- IOPS: about 120 per second
- Sequential read throughput: 100 MB/s
- Random read throughput: 2 MB/s
- Sector size: 0.5 KB
SSD: much faster than a mechanical disk, though still slower than memory
- Access latency: 0.1-0.2 ms
- Sector size: 4 KB
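
To get a feel for these figures, the sketch below estimates how long it takes to read a given amount of data from memory and from a mechanical disk. It uses only the throughput numbers listed above; the function names are illustrative, and real hardware will vary.

```python
HDD_SEQUENTIAL_MB_PER_SEC = 100  # from the text
HDD_RANDOM_MB_PER_SEC = 2        # from the text
MEMORY_MS_PER_MB = 0.25          # from the text


def hdd_sequential_read_seconds(size_mb: float) -> float:
    """Time to read size_mb sequentially from a mechanical disk."""
    return size_mb / HDD_SEQUENTIAL_MB_PER_SEC


def hdd_random_read_seconds(size_mb: float) -> float:
    """Time to read size_mb with random access from a mechanical disk."""
    return size_mb / HDD_RANDOM_MB_PER_SEC


def memory_read_seconds(size_mb: float) -> float:
    """Time to read size_mb from memory."""
    return size_mb * MEMORY_MS_PER_MB / 1000


if __name__ == "__main__":
    size_mb = 1024  # 1 GB
    print(f"memory:         {memory_read_seconds(size_mb):.3f} s")
    print(f"HDD sequential: {hdd_sequential_read_seconds(size_mb):.2f} s")
    print(f"HDD random:     {hdd_random_read_seconds(size_mb):.0f} s")
```

The 50x gap between sequential and random disk reads is the reason log-structured and append-only designs favor sequential I/O.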

Network latency

Single DC network round trip: 0.5ms
Multi DC network round trip: 30-100ms
RPC timeouts within a single DC are usually set to 500ms
Interactive latency checker (use the slider at the top to select a year)
- https://colin-scott.github.io/personal_website/research/interactive_latency.html

Typical API latency

[TODO: Add a section for typical API latency]

Load balancing design

Example: design a load balancing mechanism for an application with 10M DAU (e.g. GitHub has around 10M DAU).
Traffic volume estimation
- 10M DAU. Suppose each user performs 10 operations a day. The average is then roughly 1,160 QPS.
- Peak traffic is about 10 times the average: ~11,600 QPS.
- Static resources and microservice fan-out multiply the request volume. Suppose a factor of 10: ~116,000 QPS.
Capacity planning
- Multiple DCs: QPS * 2 = 232,000
- Half-year volume increase: QPS * 1.5 = 348,000
Mechanism
- No DNS layer
- LVS
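
The estimation steps above can be reproduced with a few multiplications. This is a minimal sketch; the function and parameter names are illustrative, and the defaults mirror the factors used in the example.

```python
SECONDS_PER_DAY = 86_400


def estimate_lb_qps(dau: int = 10_000_000,
                    ops_per_user_per_day: int = 10,
                    peak_factor: float = 10,
                    fanout_factor: float = 10,
                    dc_factor: float = 2,
                    growth_factor: float = 1.5) -> dict:
    """Return the QPS after each estimation step."""
    average = dau * ops_per_user_per_day / SECONDS_PER_DAY
    peak = average * peak_factor
    with_fanout = peak * fanout_factor
    multi_dc = with_fanout * dc_factor
    with_growth = multi_dc * growth_factor
    return {
        "average": average,
        "peak": peak,
        "with_fanout": with_fanout,
        "multi_dc": multi_dc,
        "with_growth": with_growth,
    }


if __name__ == "__main__":
    for step, qps in estimate_lb_qps().items():
        print(f"{step:>12}: {qps:,.0f} QPS")
```

The unrounded results are slightly lower than the figures in the example because the example rounds the average up to 1,160 QPS before multiplying.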

Stress testing tools

mysqlslap: Shipped with MySQL. Cannot run long-duration stress tests.
Sysbench: Works on macOS and Linux.
JMeter: Offers only basic functionality for database stress testing.

Scale numbers with examples

Typeahead service

Google search

Google has been visited 62.19 billion times this year.
Google processes over 3.5 billion searches per day.
- It means that Google processes over 40,000 search queries every second on average. Let’s also take a look at how Google’s searches per year have progressed. In 1998, Google was processing over 10,000 search queries per day. In comparison, by the end of 2006, the same amount of searches would be processed by Google in a single second.
84 percent of respondents use Google 3+ times a day or more often.
- Google has 92.18 percent of the market share as of July 2019.
More than one billion questions have been asked on Google Lens.
63 percent of Google’s US organic search traffic originated from mobile devices.
Facebook was the most searched keyword on Google.
46 percent of product searches begin on Google.
90 percent of survey respondents said they were likely to click on the first set of results.
https://www.oberlo.com/blog/google-search-statistics

Instant messaging app

https://everysecond.io/messenger
WhatsApp: 1.6 billion MAU
Facebook Messenger: 1.3 billion MAU
WeChat: 1.1 billion MAU
Snapchat: 0.3 billion MAU
Telegram: 0.2 billion MAU

Microsoft Teams

140 million DAU
240 million MAU

WhatsApp

1.6 billion WhatsApp users access the app on a monthly basis. 53 percent of WhatsApp users in the US use the app at least once a day.
More than 65 billion messages are sent via WhatsApp every day. In other words, that boils down to 2.7 billion per hour, 45 million per minute, and more than 750,000 per second.
WhatsApp was downloaded 96 million times in February 2020.
WhatsApp is available in more than 180 countries and 60 different languages.
With 340 million users, India is WhatsApp’s biggest market.
There are more than five million businesses using WhatsApp Business.

Video Streaming

Netflix

200 million subscribers Q4/2020. US has 74 million subscribers.
- vs Amazon Prime - 150 million subscribers
- vs Hulu - 39 million subscribers
Subscribers spent 3.2 hours per day watching Netflix
https://www.businessofapps.com/data/netflix-statistics/
Per the Netflix tech blog, its CDN serves 100% of Netflix video, over 125 million hours every day, to 100 million members across the globe. https://netflixtechblog.com/how-data-science-helps-power-worldwide-delivery-of-netflix-content-bac55800f9a7
For each episode of The Crown, over 1,200 files are created. https://netflixtechblog.com/content-popularity-for-open-connect-b86d56f613b

Watch video load estimate:
100 million daily active users * 2 hours (7,200 seconds) watched per user per day / 86,400 seconds per day ~ 8.3 million concurrent streams on average
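
One way to read the watch-load formula above is as an average number of concurrent streams, which can then be turned into a request rate. In the sketch below, `segment_seconds` is a hypothetical parameter (the length of video fetched per request); it is not a figure from the text or from Netflix.

```python
def average_concurrent_streams(dau: float, hours_per_user_per_day: float) -> float:
    """Average number of streams playing at any moment."""
    return dau * hours_per_user_per_day / 24


def segment_requests_per_second(concurrent_streams: float, segment_seconds: float) -> float:
    """Request rate if every stream fetches one segment per segment_seconds.

    segment_seconds is a hypothetical value chosen for illustration.
    """
    return concurrent_streams / segment_seconds


if __name__ == "__main__":
    streams = average_concurrent_streams(100_000_000, 2)
    print(f"{streams:,.0f} concurrent streams on average")
    print(f"{segment_requests_per_second(streams, 4):,.0f} segment requests per second")
```

Peak-hour load will be well above this daily average, so a peak multiplier should be applied before sizing anything.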

YouTube

Every second: https://everysecond.io/youtube
2.3 billion MAU
720,000 hours of video uploaded daily
- 500 hours of video uploaded every minute
- (2012) 4 billion hours of video watched every day. 60 hours of video is uploaded every minute. 350+ million devices are YouTube enabled.
- (2009) 1 billion views per day. That’s at least 11,574 views per second, 694,444 views per minute, and 41,666,667 views per hour. https://mashable.com/2009/10/09/youtube-billion-views/
- 8.4 minutes per person per day if everyone watches YouTube
Second most popular search engine after Google
Localized in 100 countries and 80 languages
70% of traffic comes from mobile
Reference: https://www.oberlo.com/blog/youtube-statistics#:~:text=500 hours of video are,uploaded every day to YouTube.

Newsfeed

Twitter

There are 330 million monthly active users and 145 million daily active users.
There are 500 million tweets sent each day. That’s 6,000 tweets every second.
A total of 1.3 billion accounts have been created.
Of those, 44% made an account and left before ever sending a tweet.
Based on US accounts, 10% of users write 80% of tweets.
During the 2014 FIFA World Cup Final, 618,725 tweets were sent in a single minute.
Reference: https://www.brandwatch.com/blog/twitter-stats-and-statistics/#:~:text=Twitter user statistics,billion accounts have been created.&text=As of Q1 2019%2C 68m,access the site via mobile.

Facebook

In total, 250 billion photos have been uploaded since 2004.
Photo uploads total 300 million per day.
243,055 new photos are uploaded per minute.
127 photos are uploaded on average per Facebook user.

Instagram

https://everysecond.io/instagram
There were 1.074 billion Instagram MAU worldwide in 2021.
Instagram users spend an average of 53 minutes per day on the app.
Dec 2012: more than 25 photos and 90 likes every second.
https://www.statista.com/topics/1882/instagram/#:~:text=As of June 2018%2C the,market based on audience size.

File system

Dropbox

Assume the application has 50 million signed-up users and 10 million DAU.
Users get 10 GB of free space.
Assume users upload 2 files per day. The average file size is 500 KB.
1:1 read-to-write ratio.
Total space allocated: 50 million * 10 GB = 500 PB
QPS for upload API: 10 million * 2 uploads / 24 hours / 3600 seconds = ~240
Peak QPS = QPS * 2 = 480
Reference: Dropbox statistics
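
The Dropbox estimate above can be checked with a short script. This is a sketch using decimal units (1 PB = 10^6 GB, 1 TB = 10^9 KB); the function names and the daily-ingest calculation are additions for illustration.

```python
SECONDS_PER_DAY = 24 * 3600


def total_allocated_pb(signed_up_users: int, free_gb_per_user: float) -> float:
    """Total space promised to users, in PB (decimal units)."""
    return signed_up_users * free_gb_per_user / 1_000_000


def upload_qps(dau: int, uploads_per_user_per_day: float) -> float:
    """Average upload requests per second."""
    return dau * uploads_per_user_per_day / SECONDS_PER_DAY


def daily_ingest_tb(dau: int, uploads_per_user_per_day: float, avg_file_kb: float) -> float:
    """New data written per day, in TB (decimal units)."""
    return dau * uploads_per_user_per_day * avg_file_kb / 1_000_000_000


if __name__ == "__main__":
    print(f"allocated: {total_allocated_pb(50_000_000, 10):.0f} PB")
    qps = upload_qps(10_000_000, 2)
    print(f"upload QPS: {qps:.0f}, peak: {qps * 2:.0f}")
    print(f"daily ingest: {daily_ingest_tb(10_000_000, 2, 500):.0f} TB")
```

The exact average is about 231 QPS; the ~240 figure in the estimate is a rounded value. Daily ingest of roughly 10 TB is far below the 500 PB allocation, which suggests allocated space is a poor proxy for the storage that must actually be provisioned.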

Geo location

Yelp

Yelp has more than 178 million unique visitors monthly across mobile, desktop and app platforms
Reference: https://review42.com/resources/yelp-statistics/

Uber

https://everysecond.io/uber
103 million MAU
Uber had 5 million drivers as of Q4 2019 and averaged 18.7 million trips per day in Q1 2020
- versus Lyft, which has 2 million drivers serving over 21.2 million active riders per quarter

References

分布式服务架构原理、设计与实战 (Distributed Service Architecture: Principles, Design, and Practice)
