Snowflake
The IDs are made up of the following components:
First bit - 1 bit (not used; the sign bit, always 0)
Epoch timestamp in millisecond precision - 41 bits (gives us 69 years with a custom epoch)
Configured machine id - 10 bits (gives us up to 1024 machines)
Sequence number - 12 bits (A local counter per machine that rolls over every 4096)
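In theory, a single machine can generate up to 4096 IDs per millisecond, so the QPS for snowflake could be 409.6 * 10^4 /s (about 4.1 million IDs per second) per machine.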
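The bit layout above can be sketched as a pair of pack/unpack helpers. This is a minimal illustration, not the reference implementation: the names `compose` and `decompose` are hypothetical, and `EPOCH_MS` is an assumed custom epoch (2020-01-01 UTC) chosen only for the example.

```python
# Field widths from the layout above.
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

# Assumed custom epoch for illustration: 2020-01-01T00:00:00Z in milliseconds.
EPOCH_MS = 1_577_836_800_000

MAX_MACHINE = (1 << MACHINE_BITS) - 1    # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1  # 4095


def compose(unix_ms: int, machine_id: int, sequence: int) -> int:
    """Pack timestamp, machine id and sequence into one 63-bit value."""
    elapsed = unix_ms - EPOCH_MS
    if not 0 <= elapsed < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE:
        raise ValueError("machine id does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence does not fit in 12 bits")
    return (
        (elapsed << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def decompose(snowflake_id: int) -> tuple:
    """Return (unix_ms, machine_id, sequence) for a packed ID."""
    sequence = snowflake_id & MAX_SEQUENCE
    machine_id = (snowflake_id >> SEQUENCE_BITS) & MAX_MACHINE
    elapsed = snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)
    return elapsed + EPOCH_MS, machine_id, sequence


# Capacity figures quoted in the notes.
IDS_PER_SECOND_PER_MACHINE = (1 << SEQUENCE_BITS) * 1000
YEARS_OF_TIMESTAMPS = (1 << TIMESTAMP_BITS) / (1000 * 60 * 60 * 24 * 365)
```

Because the timestamp occupies the highest bits, comparing two IDs as plain integers orders them by creation time first, which is what makes them sortable.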
64-bit unique IDs, half the size of a UUID
Can use time as first component and remain sortable
Distributed system that can survive nodes dying
Would introduce additional complexity and more ‘moving parts’ (ZooKeeper, Snowflake servers) into our architecture.
If local system time is not accurate, it might generate duplicate IDs. For example, when the clock is reset or rolled back, IDs that were already issued can be generated again.
If the QPS is low, such as 1 ID per second, the sequence number rarely advances, so generated IDs will almost always end with the same value (for example "1"), which results in uneven shards when the ID is used as a primary key.
Solution: Start the sequence number at a random value, rather than 0.
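To make the low-QPS skew concrete, here is a small simulation sketch. It assumes rows are sharded by `id % shards`, and the helper names and the `start_range` parameter are hypothetical. Restricting the random start to a small range is one possible choice, since a start near 4095 would leave few sequence values for the rest of that millisecond.

```python
import random


def first_sequence_for_new_ms(rng, randomize=True, start_range=16):
    """Sequence value for the first ID issued in a new millisecond."""
    return rng.randrange(start_range) if randomize else 0


def shard_histogram(randomize, shards=4, ids=10_000, seed=0):
    """Count how many IDs land on each shard at one ID per millisecond."""
    rng = random.Random(seed)
    counts = [0] * shards
    for tick in range(ids):
        seq = first_sequence_for_new_ms(rng, randomize)
        # 41-bit timestamp | 10-bit machine id (here: 1) | 12-bit sequence
        snowflake_id = (tick << 22) | (1 << 12) | seq
        counts[snowflake_id % shards] += 1
    return counts
```

With `randomize=False` every ID has sequence 0, so all rows fall on shard 0. With a random start the low bits vary and the rows spread across the shards.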
If there are too many requests and the 12 sequence bits are exhausted for a given millisecond, more bits could be allocated to the sequence number.
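The clock rollback and sequence exhaustion problems can both be handled inside the generator. The sketch below shows one common mitigation, not the only one: it refuses to issue IDs while the clock is behind the last timestamp used, and on exhaustion it waits for the next millisecond instead of widening the sequence field. The class name, the exception, the injectable `clock` parameter and `EPOCH_MS` are assumptions made for the example.

```python
import threading
import time

EPOCH_MS = 1_577_836_800_000  # assumed custom epoch (2020-01-01 UTC)
MACHINE_BITS = 10
SEQUENCE_BITS = 12
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class ClockMovedBackwards(Exception):
    """Raised when the clock is behind the last timestamp already used."""


class GuardedGenerator:
    def __init__(self, machine_id, clock=lambda: time.time_ns() // 1_000_000):
        self.machine_id = machine_id
        self.clock = clock          # returns Unix time in milliseconds
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                # Issuing an ID now could repeat one that was already handed out.
                raise ClockMovedBackwards(
                    f"clock is {self.last_ms - now} ms behind last issued ID"
                )
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & SEQUENCE_MASK
                if self.sequence == 0:
                    # All 4096 values used in this millisecond: wait for the next.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self.sequence
            )
```

Raising an error keeps IDs unique at the cost of availability while the clock catches up. Note that this guard only covers rollbacks seen by a running process. A rollback across a restart would need the last timestamp to be persisted somewhere.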
Embed business sharding key inside generated IDs. Useful in sharding cases.
It cannot guarantee global uniqueness: what if two identical sequence numbers are generated for the same user ID? The probability is low, though.
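One way to picture the embedded sharding key is a layout where the 10-bit machine field carries a key derived from the user ID instead. The notes do not specify the actual layout, so the field widths, `EPOCH_MS`, and the function names below are hypothetical.

```python
EPOCH_MS = 1_577_836_800_000  # assumed custom epoch (2020-01-01 UTC)
SHARD_BITS = 10
SEQUENCE_BITS = 12
SHARD_MASK = (1 << SHARD_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


def id_for_user(unix_ms, user_id, sequence):
    """Build an ID whose middle field is a shard key derived from user_id."""
    shard_key = user_id & SHARD_MASK  # same as user_id % 1024
    return (
        ((unix_ms - EPOCH_MS) << (SHARD_BITS + SEQUENCE_BITS))
        | (shard_key << SEQUENCE_BITS)
        | (sequence & SEQUENCE_MASK)
    )


def shard_of(snowflake_id):
    """Recover the shard key from an ID without looking up the user."""
    return (snowflake_id >> SEQUENCE_BITS) & SHARD_MASK
```

The routing benefit is that `shard_of(id)` locates the row's shard from the ID alone. The uniqueness concern is also visible here: two servers that call `id_for_user` for the same user, in the same millisecond, with the same sequence value return the same ID, because no field identifies the server.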
Retrieve IDs in batch
Prefetch IDs
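Batch retrieval and prefetching can be combined in a small client-side buffer. This is a sketch under assumptions: `fetch_batch` stands in for whatever call returns a list of IDs from the ID service, and `batch_size` and `low_watermark` are hypothetical tuning parameters.

```python
from collections import deque
from typing import Callable, List


class PrefetchingIdClient:
    """Hands out IDs from a local buffer and refills it before it runs dry."""

    def __init__(
        self,
        fetch_batch: Callable[[int], List[int]],
        batch_size: int = 100,
        low_watermark: int = 20,
    ):
        self._fetch_batch = fetch_batch
        self._batch_size = batch_size
        self._low_watermark = low_watermark
        self._buffer = deque()

    def next_id(self) -> int:
        # Refill once the buffer drops to the watermark, not when it is empty.
        if len(self._buffer) <= self._low_watermark:
            self._buffer.extend(self._fetch_batch(self._batch_size))
        return self._buffer.popleft()
```

The refill here happens synchronously inside `next_id` to keep the example short. A real client would more likely refill from a background thread so that callers never wait on the network. IDs fetched in advance carry an older timestamp than their time of use, so ordering by ID becomes approximate.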