Availability


Last updated 3 years ago


Replication

Ring buffer as replication buffer

Def

  • A ring buffer is a circular queue with a fixed maximum size (capacity); when the write position reaches the end of the buffer, it wraps around to the beginning.

  • http://tutorials.jenkov.com/java-performance/ring-buffer.html

Size

  • On the ring buffer, the master maintains a pointer called master_repl_offset (the write position) and each slave maintains a pointer called slave_repl_offset (its read position).

  • The size of the buffer should take the following factors into consideration:

    • The write speed on the master

    • The transmission speed between master and slaves

  • It could be calculated as:

BufferSpace = (MasterWriteSpeed - TransmissionSpeed) * OperationSize

// By default, the replication buffer size is 1M (repl-backlog-size).
// e.g. master write speed 2000 RPS, transmission speed 1000 RPS, operation size 2K:
BufferSpace = (2000 - 1000) * 2K = 2M for every second of lag
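The sizing rule above can be worked through in a small helper (a sketch; the function and parameter names are illustrative, not Redis configuration keys):

```python
def repl_backlog_size(write_rps, transmit_rps, op_size_bytes, safety_factor=2):
    """Estimate the replication ring-buffer size needed per second of lag
    so a slave that falls behind can still catch up without a full resync.

    write_rps     : operations written on the master per second
    transmit_rps  : operations shipped to the slave per second
    op_size_bytes : average size of one operation
    safety_factor : headroom multiplier for network jitter / brief disconnects
    """
    backlog_rps = max(write_rps - transmit_rps, 0)
    return backlog_rps * op_size_bytes * safety_factor

# The example from the text: 2000 RPS writes, 1000 RPS transmission, 2K ops
size = repl_backlog_size(2000, 1000, 2 * 1024, safety_factor=1)
print(size)  # 2048000 bytes, i.e. roughly 2M per second of lag
```

Multiplying by a safety factor (e.g. 2) is a common rule of thumb so that a short transmission stall does not immediately force a full resynchronization.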

Advantages

  • Circular Queues offer a quick and clean way to store FIFO data with a maximum size.

  • Doesn’t use dynamic memory → No memory leaks

  • Conserves memory, as we only ever store up to the capacity (as opposed to a queue, which could continue to grow if input outpaces output).

  • Simple Implementation → easy to trust and test

  • Never has to reorganize / copy data around

  • All operations occur in constant time O(1)

Disadvantages

  • Circular Queues can only store the pre-determined maximum number of elements.

  • Have to know the max size beforehand
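Putting the pieces together, a minimal fixed-capacity ring buffer with the two offsets described above might look like this (a sketch of the idea, not Redis internals):

```python
class RingBuffer:
    """Fixed-capacity circular queue. The writer tracks an ever-growing
    write offset (like master_repl_offset); each reader tracks its own
    read offset (like slave_repl_offset). A reader that falls more than
    `capacity` behind has lost data and must do a full resync."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buf = [None] * capacity
        self.write_offset = 0  # total items ever written

    def write(self, item):
        # O(1): overwrite the oldest slot and advance the write position.
        self.buf[self.write_offset % self.capacity] = item
        self.write_offset += 1

    def read(self, read_offset):
        """Return the item at read_offset, or None if it is unavailable."""
        if read_offset < self.write_offset - self.capacity:
            return None  # already overwritten: reader lagged too far
        if read_offset >= self.write_offset:
            return None  # nothing new written yet
        return self.buf[read_offset % self.capacity]

rb = RingBuffer(4)
for i in range(6):
    rb.write(i)
print(rb.read(5))  # 5: still inside the buffer window
print(rb.read(1))  # None: overwritten, a full resync would be needed
```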

HA Availability - Sentinel

Def

  • Initialization: Sentinel is a Redis server running in a special mode.

    • Sentinel will not load RDB or AOF file.

    • Sentinel will load a special set of Sentinel commands.

  • It will

    • Monitor your master & slave instances, notify you about changed behaviour.

    • Handle automatic failover in case a master is down.

    • Act as a configuration provider, so your clients can find the current master instance.

Health detection

  • Detects an instance's subjective and objective down states by sending PING commands.

  • Redis Sentinels form a cluster, and each Sentinel instance pings the Redis master/slave servers for connectivity. If an instance does not respond to PING requests correctly, that Sentinel considers it subjectively down. Only when a quorum of the Sentinel cluster considers the master down (objectively down) will master-slave failover happen.
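The per-Sentinel detection side can be sketched as follows (the PING transport is stubbed out; `down_after_ms` mirrors the down-after-milliseconds setting):

```python
import time

class InstanceMonitor:
    """Tracks one monitored Redis instance from a single Sentinel's
    point of view, i.e. the subjective (SDOWN) state only."""

    def __init__(self, down_after_ms=30000):
        self.down_after_ms = down_after_ms
        self.last_valid_pong_ms = self._now_ms()

    def _now_ms(self):
        return int(time.monotonic() * 1000)

    def on_pong(self):
        # Called whenever a valid reply to a PING arrives.
        self.last_valid_pong_ms = self._now_ms()

    def is_sdown(self):
        # SDOWN: no valid PING reply for longer than down_after_ms.
        return self._now_ms() - self.last_valid_pong_ms > self.down_after_ms

m = InstanceMonitor(down_after_ms=100)
print(m.is_sdown())  # False: a valid reply was just recorded
```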

Quorum

  • Use cases: marking a master as objectively down; authorizing the failover process.

  • Quorum could be used to tune sentinel in two ways:

    • If the quorum is set to a value smaller than the majority of the Sentinels we deploy, we make Sentinel more sensitive to master failures, triggering a failover as soon as even a minority of Sentinels can no longer talk to the master.

    • If the quorum is set to a value larger than the majority of Sentinels, Sentinel can fail over only when a very large number (more than the majority) of well-connected Sentinels agree that the master is down.

Subjective and objective down state

  • Subjective down state (SDOWN): a Sentinel reaches an SDOWN condition for an instance when it does not receive a valid reply to PING requests for longer than the time configured as the down-after-milliseconds parameter.

  • Objective down state (ODOWN): SDOWN escalates to ODOWN when enough Sentinels (at least the number configured as the quorum parameter of the monitored master) report an SDOWN condition within a given time range. A Sentinel collects the other Sentinels' views using the SENTINEL is-master-down-by-addr command.
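The SDOWN-to-ODOWN promotion then reduces to a vote count against the configured quorum (a sketch of the rule, not Sentinel's actual message handling):

```python
def is_odown(sdown_votes, quorum):
    """A master is objectively down when at least `quorum` Sentinels
    (including the local one) currently see it as subjectively down.
    `sdown_votes` is the set of Sentinel run IDs reporting SDOWN,
    as learned via SENTINEL is-master-down-by-addr."""
    return len(sdown_votes) >= quorum

# 5 Sentinels, quorum = 2: sensitive, a minority can declare ODOWN
print(is_odown({"s1", "s3"}, quorum=2))  # True
# quorum = 4: conservative, needs more than a majority to agree
print(is_odown({"s1", "s3"}, quorum=4))  # False
```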

Master election

Step 1: Remove unsuitable nodes

  • A slave is skipped as unsuitable for failover if it is found to have been disconnected from the master for more than ten times the configured master timeout (the down-after-milliseconds option), plus the time for which the master has also been unreachable from the point of view of the Sentinel performing the failover.

Step 2: Rank remaining nodes

  • Slaves are compared on the following factors in order. As soon as a round produces a single winner, that slave is elected as the new master; otherwise the comparison continues with the next factor.

    1. Slave priority: each slave can be configured with a manual slave-priority number in the redis.conf file of the Redis instance, and slaves are sorted by it. A lower priority number is preferred; a priority of 0 marks a slave that must never be promoted.

    2. Replication offset (closeness to the old master): if priorities are equal, the replication offset processed by each slave is compared, and the slave that received the most data from the master is selected.

    3. Run ID: a lower run ID is not a real advantage for a slave, but it makes slave selection deterministic instead of resorting to picking a random slave.
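The three criteria map naturally onto a lexicographic sort key (a sketch; the dictionary field names are illustrative):

```python
def pick_new_master(candidates):
    """candidates: list of dicts with keys priority, repl_offset, run_id.
    Ranking: lowest slave-priority first, then largest replication offset
    (closest to the old master), then lowest run ID as a tie-breaker.
    Slaves with priority 0 are never eligible for promotion."""
    eligible = [c for c in candidates if c["priority"] != 0]
    if not eligible:
        return None
    return min(eligible,
               key=lambda c: (c["priority"], -c["repl_offset"], c["run_id"]))

slaves = [
    {"priority": 100, "repl_offset": 900, "run_id": "b"},
    {"priority": 100, "repl_offset": 950, "run_id": "c"},
    {"priority": 0,   "repl_offset": 999, "run_id": "a"},  # never promoted
]
print(pick_new_master(slaves)["run_id"])  # "c": same priority, more data
```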

State notifications

Connection between sentinels

  • You don't need to configure a list of other Sentinel addresses in every Sentinel instance you run, as Sentinel uses the Redis instances Pub/Sub capabilities in order to discover the other Sentinels that are monitoring the same masters and slaves.

    • Every Sentinel publishes a message to the Pub/Sub channel __sentinel__:hello of every monitored master and slave every two seconds, announcing its presence with its ip, port and runid.

    • Every Sentinel is subscribed to the Pub/Sub channel __sentinel__:hello of every master and slave, looking for unknown Sentinels. When new Sentinels are detected, they are added as Sentinels of that master.

    • Hello messages also include the full current configuration of the master. If the receiving Sentinel has a configuration for a given master which is older than the one received, it updates to the new configuration immediately.
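A hello message is a small comma-separated payload; a parsing sketch, assuming the eight-field layout described in the Sentinel documentation:

```python
def parse_hello(payload):
    """Parse one __sentinel__:hello message. Assumed layout (per the
    Sentinel docs): sentinel_ip,sentinel_port,sentinel_runid,
    current_epoch,master_name,master_ip,master_port,master_config_epoch"""
    (s_ip, s_port, s_runid, cur_epoch,
     m_name, m_ip, m_port, m_epoch) = payload.split(",")
    return {
        "sentinel": {"ip": s_ip, "port": int(s_port), "runid": s_runid,
                     "current_epoch": int(cur_epoch)},
        "master": {"name": m_name, "ip": m_ip, "port": int(m_port),
                   "config_epoch": int(m_epoch)},
    }

msg = "10.0.0.5,26379,abc123,7,mymaster,10.0.0.1,6379,7"
print(parse_hello(msg)["master"]["name"])  # mymaster
```

The master's config_epoch in the payload is what lets a receiving Sentinel decide whether its local configuration for that master is stale.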

Connection between sentinel and redis master

  • When starting a Sentinel instance, you must provide the ip address and port of the master it should monitor.
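For example, a minimal sentinel.conf fragment (addresses and values are illustrative):

```
# Monitor a master named "mymaster" at 127.0.0.1:6379, with a quorum of 2
sentinel monitor mymaster 127.0.0.1 6379 2
# Consider it subjectively down after 30s without a valid PING reply
sentinel down-after-milliseconds mymaster 30000
```

Slaves and other Sentinels do not need to be listed; they are discovered automatically via the master and the hello channel.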

Connection between sentinel and redis slave

  • Configuration epochs

    • An epoch is similar to a term in the Raft algorithm.

    • When a Sentinel is authorized, it gets a unique configuration epoch for the master it is failing over. This is a number that will be used to version the new configuration after the failover is completed. Because a majority agreed that a given version was assigned to a given Sentinel, no other Sentinel will be able to use it.
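Epoch-based configuration versioning reduces to "accept only strictly newer" (a sketch of the rule):

```python
class MasterConfig:
    """Local view of a master's address, versioned by configuration
    epoch. A received configuration replaces the local one only if its
    epoch is strictly higher, so stale failover results can never win."""

    def __init__(self, ip, port, epoch):
        self.ip, self.port, self.epoch = ip, port, epoch

    def maybe_update(self, ip, port, epoch):
        if epoch > self.epoch:  # strictly newer configuration wins
            self.ip, self.port, self.epoch = ip, port, epoch
            return True
        return False  # same or older epoch: ignore

cfg = MasterConfig("10.0.0.1", 6379, epoch=5)
print(cfg.maybe_update("10.0.0.2", 6379, epoch=6))  # True: newer epoch
print(cfg.maybe_update("10.0.0.3", 6379, epoch=4))  # False: stale
print(cfg.ip)  # 10.0.0.2
```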

Ref

  • Sentinel is Redis's high-availability solution for a standalone Redis instance.

  • Electing the Sentinel that performs the failover relies on the Raft protocol.

  • See https://redis.io/topics/sentinel for details.