High availability
Replication
Category
Synchronous replication:
Asynchronous replication:
Semi-Synchronous replication:
Replication topology
Process
Flowchart
binlog
Format
Please see MySQL official documents
Statement
Statement: not always safe, but may save storage place and faster
Pros:
Low number of logs generated, save I/O bandwidth
Does not require consistency between master and slave table schema
Cons:
Inapplicable for many SQL statements. For non-deterministic functions, could not gaurantee the consistency between master and slave server
Database locks: Needs to lock a bigger chunk of data
Row
Binlog: always safe, possibly very slow and inefficient in time and space
Pros:
Database locks: Only need to lock a specific row
Applicable to any SQL statement.
Cons:
High number of logs generated
Does not require consistency between master and slave table schema
Mixed
Mixed: best of both worlds in theory. Most queries are replicated by statement. But transactions MySQL knows are non-deterministic are replicated by row. Mixed Mode uses row-based replication for any transaction that:
Uses user defined functions
Uses the UUID(), USER(), or CURRENT_USER() functions
Uses LOAD_FILE (which otherwise assumes every slave has the exact same file on the local file system and doesn't replicate the data)
Updates two tables with auto_increment columns (the binlog format only carries one auto_increment value per statement)
Why MySQL 5.7 default to Row instead of Mixed
Main reason is for backup and data recovery
For example
For delete SQL commands, Row based format will record the entire original record.
For insert SQL commands, Row based format will record the inserted record.
For update SQL commands, Row based format will record the before and after record.
Replication delay
Def
The master-slave latency is defined as the difference between T3 and T1.
Master DB executes a transaction, writes into binlog and finishes at timestamp T1.
The statement is replicated to binlog, Slave DB received it from the binlog T2.
Slave DB executes the transaction and finishes at timestamp T3.
Delay sources
Inferior slave machines: Slave machine is insuperior to master
Too much load for slave
Causes: Many analytical queries run on top of slave.
Solutions:
Multiple slaves
Output telemetry to external statistical systems such as Hadoop through binlog
Big transactions
If a transaction needs to run for as long as 10 minutes on the master database, then it must wait for the transaction to finish before running it on slave. Slave will be behind master for 10 minutes.
e.g. Use del to delete too many records within DB
e.g. mySQL DDL within big tables.
Before MySQL 5.6, there isn't much built-in support for parallel relay log processing. If there is a steady difference between the write speed on master server and relay speed on slave server, then the replication delay between master and slave could become several hours. For more details, please check this file in geektime in Chinese
How to reduce replication delay
Solution1: After write to master, write to cache as well.
What if write to cache fails
If read from master, slave useless
If read from slave, still replication delay
Solution2: If cannot read from slave, then read from master.
It works for DB add operation
It doesn't work for DB update operation
Solution3: If master and slave are located within the same location, synchronous replication
Use cases
Handle old data - Archive
Archive old data in time and save disk space
Use case
If a single SQL table's size exceeds 20M rows, then its performance will be slow. Partitioning and Sharding have already been applied. Due to the use cases, there are some hot and cold data, e.g. ecommerce website orders.
Implementation
MySQL engine for archiving
InnoDB is based on B+ tree, each time a write happens, the clustered index will need to modified again. The high traffic during archiving process decides that InnoDB will not be a great fit.
TukoDB has higher write performance
Create archiving tables
Pt-archiver: One of the utils of Percona-toolkit and used to archive rows from a MySQL table into another table or a file. https://www.percona.com/doc/percona-toolkit/LATEST/pt-archiver.html
Flowchart
Backup
Recovery test on the backup data
High availability with failover
Usually reliability first is preferred to availability first to avoid data consistency problems.
For reliability first failover, the duration for downtime depends on the replication delay between master and slave.
Problems to be solved:
After switch, how to notify applications the new address of master
How to determine the master is available
After switch, how to decide on the master-slave replication relationship
Two servers
Reliability first failover
There will be a brief period when both server A and server B are readonly between step (2-4)
Check the value of slave server's seconds_behind_master, if it is smaller than certain threshold (e.g. 5s), continue to next step; Else repeat this step
Change master server A's state to readonly (set readonly flag to true)
Check the value of slave server's seconds_behind_master until its value becomes 0.
Change slave server B's state to read-write (set readonly flag to false)
Switch the traffic to slave server B
Approaches
Reliability first - After step 2 and before step4 below, both master and slave will be in readonly state.
Availability first failover
There is no blank period for availability. The switch always happens immediately.
When the binlog format is set to row based, it will be easier to discover the inconsistency between master and slave.
Example setup: Server A is master and B is slave.
In mixed format, only the statement is sent along.
In row format, the entire record is sent along so it will be easier to detect the conflict (duplicate key error shown below)
Multiple servers
New challenges - find sync point between multiple servers
It has the following setup
After failing over, A' becomes the new master. The new challenges
Problem with binlog position
Hard to be accurate with MASTER_LOG_POS
Reason for the inaccuracy. The sync point is usually found by locating a timestamp in original master server. Then according to this problematic timestamp, find transaction id in new master server. It is hard to guarantee that slave servers are behind new master server (e.g. might execute the same command twice)
GTID to rescue
Traditional MySQL replication is based on relative coordinates — each replica keeps track of its position with respect to its current primary’s binary log files. GTID enhances this setup by assigning a unique identifier to every transaction, and each MySQL server keeps track of which transactions it has already executed. This permits “auto-positioning,” the ability for a replica to be pointed at a primary instance without needing to specify a binlog filename or position in the CHANGE PRIMARY statement.
Auto-positioning makes failover simpler, faster, and less error-prone. It becomes trivial to get replicas in sync after a primary failure, without requiring an external tool such as Master High Availability (MHA). Planned primary promotions also become easier, as it is no longer necessary to stop all replicas at the same position first. Database administrators need not worry about manually specifying incorrect positions; even in the case of human error, the server is now smart enough to ignore transactions it has already executed.
More detailed explanation available at Geektime MySQL
Last updated