Coming up with a backup strategy that does not affect database performance or lock your tables can be tricky. How do you backup your production database cluster without affecting your applications? Should you use mysqldump or Percona Xtrabackup? When should you use incremental backups? Where do you store the backups? In this article, we will cover some of the common backup methods for Galera Cluster for MySQL/MariaDB, and how you can get the most out of these.
There are various ways to backup your Galera Cluster data:
Xtrabackup is an open-source MySQL hot backup utility from Percona. It is a combination of xtrabackup (built in C) and innobackupex (built on Perl) and can back up data from InnoDB, XtraDB and MyISAM tables.
Xtrabackup does not lock your database during the backup process. For large databases (100+ GB), it provides much better restoration time as compared to mysqldump. The restoration process involves preparing MySQL data from the backup files before replacing or switching it with the current data directory on the target node. However, the restoration process is not very straightforward.
If you do not want to backup your entire database every single time, then you should look into incremental backup. Xtrabackup supports incremental backup where it can copy the data that has changed since the last backup. You can have many incremental backups between each full backup. For every incremental backup, you need information on the last one you did so it knows where to start the new one.
Note that without a full backup to start from, the incremental backups are useless.
This is probably the most popular backup method for MySQL. mysqldump is the perfect tool when migrating data between different versions or storage engines of MySQL, or when you need to change something in the text file dump before restoring it. Use it in conjunction with xtrabackup to give you more recovery options.
You may perform mysqldump against all databases by using the –single-transaction option. When using –single-transaction you will get a consistent backup for the Innodb tables without making database read only. So –single-transaction does not work if you have MyISAM tables (so these would be inconsistent). However when using Galera Cluster, all tables should be InnoDB (except the mysql system tables, but that is okay).
This means that using mysqldump is safe, but it has drawbacks. mysqldump will do, for each database and for each table, “SELECT * FROM
” and write the content to the mysqldump file. The problem with the “SELECT * FROM .. ” is if you have tables (and a data set/DB size) that does not fit in the innodb buffer pool. The active data set (that your application uses) will take a hit when the SELECT * FROM .. will load up data from disk, store the pages in the InnoDB buffer pool, and to do so, expunge pages part of the active data set from the InnoDB buffer pool, and put them on disk.
Hence you will get a performance degradation on that node, since the active data set is no longer in RAM but on DISK (if the InnoDB buffer pool is not large enough to fit the entire database).
If you want to avoid that, then use xtrabackup. Nevertheless, it is common to use –single-transaction and it does not block the nodes (except for a very short time when a START TRANSACTION is made, but that can be neglected). And yes, all nodes can still perform read and writes. But you will take a performance hit in the cluster while mysqldump is running – since CPU, DISK and RAM are used by the mysqldump process. A Galera Cluster is as fast the slowest running node.**
**Note that this is only true if “wsrep_desync=OFF”. When you allow the node to desync from the cluster momentarily, the cluster performance won’t get degraded for the duration of desync, which is suitable for backup workloads. However there is a risk if the node doesn’t get back in sync before desync is disabled, it still may cause some impact on the cluster.