Database Replication Trouble Shooting
Database replication is a robust solution that should handle issues such as loss of network connection between the two servers. However certain chains of events can cause replication to fail. These are where transactions created on one server (whilst there is no connection between the two servers) can then not be implemented on the other server.
- Sync job runs on both servers whilst connection is lost leading to trying to delete the same account twice
- Sync jobs running at exactly the same time on both servers possibly leading to trying to create the same account twice.
An issue like these will stop database changes from replicating and the databases will become out of sync.
Checking Replication Status
To check the replication status of the two databases log onto both servers and access the database console
Then on each server type the command
show slave status\G;
You should see on both servers
Slave_IO_Running: Yes Slave_SQL_Running: Yes
You can also run
show master status;
and you would expect to see the value in the position column (626 in this example) to match the value for Read_Master_Log_Pos from the show slave status command on the*other* sever.
If the slave has stopped running you can try and restart them by running the start slave command from within the MySQL console on both servers.
if after this command show slave status still shows that they are not both running that the replication needs to be repaired.
Scripts are supplied to help repair replication. These scripts are under /opt/idenprotect/scripts. If you do not have these on your installation email firstname.lastname@example.org
The scripts take a back-up of each database, however, it is highly recommended that a manual back up of both databases is taken before attempting to repair replication
The replication repair process will copy across the "correct database" to the "out of date" database. Before beginning this process it has to be decided which database is going to be preserved.
- Stop the idenprotect service from running on both servers
- Run the script resetReplication.sh on both servers this will stop replication, and take a back up of the database
- Check that the back up has been created, it will be under /etc/idenprotect and be called database<date>.sql where <date> is today's date
- Run the script restartReplication.sh, this will recreate the database and then restart replication.
- Only on the server that has the database to be preserved run the script restoreDatabase.sh. This will restore the database locally and also recreate an identical copy on the other server
- Start the idenprotect service on both servers
Once this process has been completed you can check the replication status as described earlier.