🐝
Mess around software system design
  • README
  • ArchitectureTradeOffAnalysis
    • Estimation
    • Middleware
    • Network
    • Server
    • Storage
  • Conversion cheat sheet
  • Scenarios
    • TinyURL
      • Estimation
      • Flowchart
      • Shortening mechanisms
      • Rest API
      • Performance
      • Storage
      • Follow-up
    • TaskScheduler
      • JDK delay queue
      • Timer based
      • RabbitMQ based
      • Kafka-based fixed delay time
      • Redis-based customized delay time
      • MySQL-based customized delay time
      • Timer TimingWheel
      • Industrial Scheduler
      • Workflow Engine
      • Airflow Arch
    • GoogleDrive
      • Estimation
      • Flowchart
      • Storage
      • Follow-up
    • Youtube
      • Estimation
      • Flowchart
      • Performance
      • Storage
      • Follow-up
      • Netflix
    • Uber
      • Estimation
      • Rest api
      • Flowchart
      • KNN algorithms
      • Geohash-based KNN mechanism
      • Redis implementation
      • Storage
    • Twitter
      • Estimation
      • Flowchart
      • Storage
      • Scalability
      • Follow-up
    • Instant messenger
      • Architecture overview
      • Presence
      • Unread count
      • Notifications
      • Read receipt
      • Large group chat
      • Storage-Offline 1:1 Chat
      • Storage-Offline group chat
      • Storage-Message roaming
      • NonFunc-Realtime
      • NonFunc-Reliability
      • NonFunc-Ordering
      • NonFunc-Security
      • Livecast-LinkedIn
    • Distributed Lock
      • Single machine
      • AP model based
      • CP model based
      • Chubby-TODO
    • Payment system
      • Resilience
      • Consistency
      • Flash sale
    • Key value store
      • Master-slave KV
      • Peer-to-peer KV
      • Distributed cache
  • Time series scenarios
    • Observability
      • TimeSeries data
      • Distributed traces
      • Logs
      • Metrics
      • NonFunc requirments
  • Search engine
    • Typeahead
    • Search engine
    • Distributed crawler
      • Estimation
      • Flowchart
      • Efficiency
      • Robustness
      • Performance
      • Storage
      • Standalone implementation
      • Python Scrapy framework
    • Stream search
  • Big data
    • GFS/HDFS
      • Data flow
      • High availability
      • Consistency
    • Map reduce
    • Big table/Hbase
    • Haystack
    • TopK
    • Stateful stream
    • Lambda architecture
    • storm架构
    • Beam架构
    • Comparing stream frameworks
    • Instagram-[TODO]
  • MicroSvcs
    • Service Registry
      • Flowchart
      • Data model
      • High availability
      • Comparison
      • Implementation
    • Service governance
      • Load balancing
      • Circuit breaker
      • Bulkhead
      • Downgrade
      • Timeout
      • API gateway
      • RateLimiter
        • Config
        • Algorithm comparison
        • Sliding window
        • Industrial impl
    • MicroSvcs_ConfigCenter-[TODO]
    • MicroSvcs_Security
      • Authentication
      • Authorization
      • Privacy
  • Cache
    • Typical topics
      • Expiration algorithm
      • Access patterns
      • Cache penetration
      • Big key
      • Hot key
      • Distributed lock
      • Data consistency
      • High availability
    • Cache_Redis
      • Data structure
      • ACID
      • Performance
      • Availability
      • Cluster
      • Applications
    • Cache_Memcached
  • Message queue
    • Overview
    • Kafka
      • Ordering
      • At least once
      • Message backlog
      • Consumer idempotency
      • High performance
      • Internal leader election
    • MySQL-based msg queue
    • Other msg queues
      • ActiveMQ-TODO
      • RabbitMQ-TODO
      • RocketMQ-TODO
      • Comparison between MQ
  • Traditional DB
    • Index data structure
    • Index categories
    • Lock
    • MVCC
    • Redo & Undo logs
    • Binlog
    • Schema design
    • DB optimization
    • Distributed transactions
    • High availability
    • Scalability
    • DB migration
    • Partition
    • Sharding
      • Sharding strategies
      • Sharding ID generator overview
        • Auto-increment key
        • UUID
        • Snowflake
        • Implement example
      • Cross-shard pagination queries
      • Non-shard key queries
      • Capacity planning
  • Non-Traditional DB
    • NoSQL overview
    • Rum guess
    • Data structure
    • MySQL based key value
    • KeyValueStore
    • ObjectStore
    • ElasticSearch
    • TableStore-[TODO]
    • Time series DB
    • DistributedAcidDatabase-[TODO]
  • Java basics
    • IO
    • Exception handling
  • Java concurrency
    • Overview
      • Synchronized
      • Reentrant lock
      • Concurrent collections
      • CAS
      • Others
    • Codes
      • ThreadLocal
      • ThreadPool
      • ThreadLifeCycle
      • SingletonPattern
      • Future
      • BlockingQueue
      • Counter
      • ConcurrentHashmap
      • DelayedQueue
  • Java JVM
    • Overview
    • Dynamic proxy
    • Class loading
    • Garbage collection
    • Visibility
  • Server
    • Nginx-[TODO]
  • Distributed system theories
    • Elementary school with CAP
    • Consistency
      • Eventual with Gossip
      • Strong with Raft
      • Tunable with Quorum
      • Fault tolerant with BFT-TODO
      • AutoMerge with CRDT
    • Time in distributed system
      • Logical time
      • Physical time
    • DDIA_Studying-[TODO]
  • Protocols
    • ApiDesign
      • REST
      • RPC
    • Websockets
    • Serialization
      • Thrift
      • Avro
    • HTTP
    • HTTPS
    • Netty-TODO
  • Statistical data structure
    • BloomFilter
    • HyperLoglog
    • CountMinSketch
  • DevOps
    • Container_Docker
    • Container_Kubernetes-[TODO]
  • Network components
    • CDN
    • DNS
    • Load balancer
    • Reverse proxy
    • 云中网络-TODO
  • Templates
    • interviewRecord
  • TODO
    • RecommendationSystem-[TODO]
    • SessionServer-[TODO]
    • Disk
    • Unix philosophy and Kafka
    • Bitcoin
    • Design pattern
      • StateMachine
      • Factory
    • Akka
    • GoogleDoc
      • CRDT
Powered by GitBook
On this page
  • SETNX command
  • NX
  • randomValue
  • PX
  • Use cases - AP model
  • Flowchart
  • Retry case
  • Resume lock
  • Options to avoid distributed lock
  • Optimistic lock
  • Consistent hashing
  • Redlock
  • Motivation
  • Limitations
  • References
  • References
  • Redisson

Was this helpful?

  1. Scenarios
  2. Distributed Lock

AP model based

PreviousSingle machineNextCP model based

Last updated 1 year ago

Was this helpful?

SETNX command

  • Example: SET resourceName randomValue NX PX 30000

NX

  • Succeed only if key does not exist; Otherwise fail the operation. Use the atomic property of NX to guarantee that only one client could configure it successfully.

randomValue

  • Definition: Used for validation when releasing lock among different threads. Only release lock if the randomValue is the same. Typically set as UUID.

PX

  • Automatic expiration time in case there are some exceptions happening (Thread crash)

Motivation

Use cases - AP model

  • Is actually an AP model. Applicable for scenarios that prioritize efficiency over correctness.

Flowchart

Retry case

  • If failed to acquire the lock,

    • Retry every certain period (typically set to 99 percentile of lock usage duration)

    • Monitor deletes events of the key

Resume lock

  • The lock is about to expire but business logic needs more time, resume lock to rescue.

  • If failed to acquire lock, one option is to use interrupt

// Interrupt sample flow in loop case (while/for)

for condition {
  // Interrupt signal
  if interrupted {
    break;
  }
  // Business logic
  DoSomething()
}
// Interrupt sample flow for no-loop case

step1()
if interrupted {
  return
}
step2()
if interrupted {
  return
}

Options to avoid distributed lock

Optimistic lock

  • For scenarios below, it could be improved with optimistic lock.

addDistributedLock()

compute()

updateDatabase()

Consistent hashing

  • Use consistent hashing to guarantee the same key is always routed to the same node. There will be no need for distributed lock.

Redlock

Motivation

  • There is an obvious race condition with this model:

    • Client A acquires the lock in the master.

    • The master crashes before the write to the key is transmitted to the slave.

    • The slave gets promoted to master.

    • Client B acquires the lock to the same resource A already holds a lock for. SAFETY VIOLATION!

Limitations

  • Redlock does not have any facility to generate fencing tokens. And it is not straightforward to repurpose Redlock for generating fencing tokens.

  • Relying on expiration time to avoid deadlock is not reliable.

    • What if the lock owner dies? The lock will be held forever and we could be in a deadlock. To prevent this issue Redis will set an expiration time on the lock, so the lock will be auto-released. However, if the time expires before the task handled by the owner isn't yet finish, another microservice can acquire the lock, and both lock holders can now release the lock causing inconsistency.

  • Redlock depends on a lot of timing assumptions

    1. All Redis nodes hold keys for approximately the right length of time before expiring

    2. The network delay is small compared to the expiry duration

    3. Process pauses are much shorter than the expiry duration

  • References: https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

References

References

Redisson

  • Relationship with Redis could be thought as similar to Curator to Zookeeper

A fencing token needed to be used to avoid race conditions. Please see for details.

.

this post
Failures of distributed locks
A hot debate on the security perspective of RedLock algorithm
SETNX command
NX
randomValue
PX
Motivation
Use cases - AP model
Flowchart
Retry case
Resume lock
Options to avoid distributed lock
Optimistic lock
Consistent hashing
Redlock
Motivation
Limitations
References
References
Redisson
Use case for randomValue
Use case for auto expiration