Mess around software system design

Unread count

Badge count

Question1: Whether to store badge and conversation count separately

In theory, the badge count can be calculated from the conversation counts on the fly.
In practice, the badge count is read far more often than the per-conversation counters, so calculating it on every read would hurt performance.
So the badge count and the conversation counts are usually stored separately.

Question2: Keep consistency between badge and conversation count

Problems

Incrementing the total unread count and incrementing the unread count for a specific person are two separate operations that are not atomic as a pair. One could fail while the other succeeds, or a clearing operation could run between the two.

Solution

Distributed lock

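Memcached add or Redis SETNX. Both set a key only if it does not already exist, so only the first caller acquires the lock.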
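
A minimal sketch of the lock-guarded update, assuming a set-if-absent primitive with a TTL. `InMemoryKV`, `increment_unread`, and the key names are hypothetical stand-ins for illustration; a real deployment would call Memcached or Redis instead of the in-process dictionary used here.

```python
import time
import uuid


class InMemoryKV:
    """Hypothetical stand-in for Memcached/Redis; models only set-if-absent with a TTL."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set_if_absent(self, key, value, ttl_seconds):
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # another holder owns the lock and it has not expired
        self._data[key] = (value, now + ttl_seconds)
        return True

    def delete_if_equals(self, key, value):
        current = self._data.get(key)
        if current is not None and current[0] == value:
            del self._data[key]
            return True
        return False


def increment_unread(kv, counters, user_id, sender_id, ttl_seconds=5.0):
    """Bump the conversation count and the badge count while holding one lock."""
    lock_key = f"lock:unread:{user_id}"
    token = uuid.uuid4().hex  # identifies this holder so it never releases someone else's lock
    if not kv.set_if_absent(lock_key, token, ttl_seconds):
        return False  # lock is busy; the caller may retry later
    try:
        conv_key = (user_id, sender_id)
        counters["conversation"][conv_key] = counters["conversation"].get(conv_key, 0) + 1
        counters["badge"][user_id] = counters["badge"].get(user_id, 0) + 1
        return True
    finally:
        kv.delete_if_equals(lock_key, token)
```

The lock keeps clearing operations from running between the two increments, provided they take the same lock. It does not by itself undo the first increment if the process dies before the second one.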

Transaction

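Use Redis's MULTI, EXEC, DISCARD and WATCH commands. WATCH acts as an optimistic lock: EXEC aborts the transaction if any watched key was modified after WATCH was issued, and the client retries.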
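
The optimistic-lock pattern can be sketched in-process as a version check followed by a retry. This is only an illustration of the idea behind WATCH and EXEC, not Redis client code; `VersionedStore`, `increment_both`, and the key names are hypothetical.

```python
class VersionedStore:
    """Hypothetical in-process store; models WATCH/EXEC as a per-key version check."""

    def __init__(self):
        self._values = {}
        self._versions = {}

    def read(self, key):
        # Returns (value, version); unknown keys read as 0 at version 0.
        return self._values.get(key, 0), self._versions.get(key, 0)

    def commit(self, watched, writes):
        """Apply all writes only if no watched key changed since it was read."""
        for key, seen_version in watched.items():
            if self._versions.get(key, 0) != seen_version:
                return False  # conflict: behaves like an aborted EXEC
        for key, value in writes.items():
            self._values[key] = value
            self._versions[key] = self._versions.get(key, 0) + 1
        return True


def increment_both(store, badge_key, conv_key, max_retries=3):
    """Increment both counters together, retrying if another writer interfered."""
    for _ in range(max_retries):
        badge, badge_version = store.read(badge_key)
        conv, conv_version = store.read(conv_key)
        committed = store.commit(
            watched={badge_key: badge_version, conv_key: conv_version},
            writes={badge_key: badge + 1, conv_key: conv + 1},
        )
        if committed:
            return True
    return False
```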

Lua script

Redis executes a Lua script atomically, so both counters can be updated inside one script without other commands running in between.

Count of unread messages in a thread

Naive solution with SQL

Keep all attributes in a single table and use (thread id + user id) as the primary key.

SELECT unread_count FROM threadToUser WHERE thread_id = ? AND user_id = ?

Improve write requests with hash based sharding

Prefer hash sharding on (thread id + user id) over time-range-based sharding. Chat data has clear hot and cold data, so time-range-based sharding would concentrate recent traffic on a few hot shards.
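
A small sketch of hash-based shard selection. The shard count, the key format, and the function name `shard_for` are assumptions made for illustration, not part of the original design.

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count, for illustration only


def shard_for(thread_id, user_id, num_shards=NUM_SHARDS):
    """Map a (thread id, user id) pair to a shard index in [0, num_shards)."""
    key = f"{thread_id}:{user_id}".encode("utf-8")
    # A stable hash (not Python's per-process randomized hash()) keeps the
    # mapping identical across processes and restarts.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Because the shard depends only on the key, new and old conversations spread across all shards instead of piling onto the most recent one.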

Improve read requests with Redis

Cache unread counts in Redis so that reads do not hit the database.

Not enough memory

Redis is designed for general-purpose data structures and can use a lot of memory:
- Revise Redis's native data structures
  - For example, when a key is stored as a string, an 8-byte long takes 8 bytes (sdshdr header) + 19 bytes (the 8-byte long written as a decimal string) + 1 byte ('\0') = 28 bytes. In addition, remove unneeded pointers.
- Redis + SSD: The popularity of tweets usually fades over time, so old data can be persisted to SSD.

Avoid notification storm in large group chat

Suppose there is a group of 5000 people in which 10 messages are sent per second. Each message increments the unread count of every member, so the QPS for updating unread counts is 5000 x 10 = 50K. With 1000 such groups, the QPS reaches 50M.

Solution: Aggregate and update

Multiple queues A/B/C/... buffer all incoming requests.
Two components pull from the queues:
- Timer: triggered after a certain amount of time
- Flusher: triggered when any queue exceeds a certain length
An aggregator service receives the messages pulled by the Timer and the Flusher and merges the increment and decrement operations on the unread counts into one update per counter.

Cons:
- Since the queues are not persisted, buffered updates are lost on a restart and the number of unread messages becomes inaccurate
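
The aggregate-and-update idea above can be sketched as a single in-memory buffer with a size trigger and a time trigger. This collapses the queues, Timer, Flusher, and aggregator into one hypothetical class (`UnreadAggregator`); the thresholds and the `apply_batch` callback are assumptions for illustration.

```python
import time
from collections import defaultdict


class UnreadAggregator:
    """Buffers unread-count deltas and writes them out as one merged batch."""

    def __init__(self, apply_batch, max_pending=1000, max_delay_seconds=1.0,
                 clock=time.monotonic):
        self._apply_batch = apply_batch   # called with {(group_id, user_id): net_delta}
        self._max_pending = max_pending
        self._max_delay = max_delay_seconds
        self._clock = clock
        self._pending = defaultdict(int)  # (group_id, user_id) -> net delta
        self._pending_ops = 0
        self._last_flush = clock()

    def record(self, group_id, user_id, delta):
        """Buffer one increment (+1) or decrement (-1)."""
        self._pending[(group_id, user_id)] += delta
        self._pending_ops += 1
        if self._pending_ops >= self._max_pending:
            self.flush()  # size-based trigger, the "Flusher" role

    def tick(self):
        """Call periodically; time-based trigger, the "Timer" role."""
        if self._pending and self._clock() - self._last_flush >= self._max_delay:
            self.flush()

    def flush(self):
        # Deltas that cancelled out need no write at all.
        batch = {key: delta for key, delta in self._pending.items() if delta != 0}
        self._pending.clear()
        self._pending_ops = 0
        self._last_flush = self._clock()
        if batch:
            self._apply_batch(batch)
```

As with the design it illustrates, everything buffered here lives only in memory, so the drawback listed under Cons applies to this sketch as well.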

Count of unread messages in newsfeed

How is the scenario different?

Compared with counts of reposts, comments, etc.: those counts change only in response to a user action (follow, edit, etc.), while the newsfeed unread count must be updated for everyone without any action from the reader. The former is triggered by user activity; the latter always happens automatically.
Compared with counts of system notifications: the unread number in the newsfeed varies from user to user.

Idea

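Record how many posts each user has made. For each reader, also keep a snapshot of the post counts of everyone they follow, taken the last time they read the newsfeed.
To find how many unread messages there are in the newsfeed, calculate the difference between the current post counts and the snapshot.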
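
A minimal sketch of the snapshot-difference calculation, assuming post counts are kept per author and the snapshot is stored per reader. The function names and the dictionary layout are hypothetical.

```python
def unread_in_feed(post_counts, followees, snapshot):
    """Sum, over followed authors, the posts made since the reader's snapshot.

    Authors missing from the snapshot (for example, newly followed ones) are
    treated as having a snapshot count of 0; that is a choice of this sketch.
    """
    return sum(post_counts.get(author, 0) - snapshot.get(author, 0)
               for author in followees)


def mark_feed_read(post_counts, followees):
    """Take a fresh snapshot of the followed authors' current post counts."""
    return {author: post_counts.get(author, 0) for author in followees}
```

Writing a post then only increments one per-author counter, and the per-reader cost is paid at read time instead of fanning out a counter update to every follower.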


Last updated 3 years ago
