DB migration

Flowchart
Initial load phase
- mysqldump
  - Best practices
- XtraBackup
Dual write phase
- Primary key
  - innodb_autoinc_lock_mode parameter
  - Consistency
- Dual write flag in ORM/AOP
  - UpdateTime field based approach
    Flowchart
    Soft delete on source table
  - Binlog triggered

Flowchart

The motivation for introduce the third phase: In case something goes wrong on the target table, you could rollback to the original table.

Initial load phase

There are two typically used DB backup tools: mysqldump and XtraBackup.

mysqldump

Pros:
- Free
- Could backup the entire DB, including table structure and data
Cons:
- Backup time could be long

Best practices

To increase speed when using mysqldump tool, there are a couple options:
- Close the unique key check and external key check because source table already guarantees this.
- Close binlog
- Adjust the disk flush time of redo log by setting innodb_flush_log_at_trx_commit = 0

XtraBackup

Pros:
- Faster backup speed
- Support incremental backup
- Backup process won't interrupt executing transactions
Cons:
- Only applicable to InnoDB engine
- Poor cross-platform compatibility.

Dual write phase

Primary key

innodb_autoinc_lock_mode parameter

"innodb_autoinc_lock_mode" parameter determines how InnoDB generates primary key. All three cases rely on table-wise lock.
- 0: The table lock will be released after insert statements are executed.
- 1: The table lock release times depend on SQL statements:
  - For INSERT INTO VALUE / INSERT INTO VALUES, the primary key lock will be released immediately after primary keys are generated, instead of after SQL statements are executed.
  - For INSERT / SELECT, the primary key lock will be released after SQL statements are executed.
- 2: The table lock will be released immediately after primary keys are generated.

Consistency

When writing to the first table of dual write, return the primary key and use it in subsequent table.

Dual write flag in ORM/AOP

Typically there is a dual write flag inside the AOP/ORM.
This flag controls the dual write mode.

UpdateTime field based approach

Flowchart

for {
  // Execute the SQL query
  // SELECT * FROM xx WHERE update_time >= last_time
  rows := findUpdatedRows()

  for row in rows {
    // Use row primary key / id to find corresponding rows in target table
    tgtRow = findTgt(row.id)
    if row != tgtRow {
      // fix data
      fix()
    }
  }

  // Record the maximum timestamp of these records, use for next query. 
  last_time = maxUpdateTime(row)

  // Sleep for a while
  sleep(1s)
}

Soft delete on source table

If using hard delete on source table, delete failures on target table won't be discovered.

If we need to use hard delete on source table, there needs to be a reverse check from target table to source table.

Binlog triggered

PreviousScalability NextPartition

Last updated 1 year ago

Was this helpful?