通过Heartbert2 让Mysql Replication 具有HA-技术开发专区

通过Heartbert2 让Mysql Replication 具有HA

作者：IT168 沈刚编辑：覃里 2009-04-16 00:19 来源：IT168�

　　6、失败测试

　　6.1 网络切换测试

　　将ha1 的网线拔掉，在ha2 看资源的切换情况

　　Node: ha2 (0c267980-de77-4421-bcb5-9bb6b0743eef): online

　　Node: ha1 (ab30057d-03f6-4be8-a787-98c5fc7f4c64): OFFLINE

　　Resource Group: group_1

　　IPaddr_10_0_0_180 (ocf::heartbeat:IPaddr): Started ha2

　　Filesystem_2 (ocf::heartbeat:Filesystem): Started ha2

　　mysqld_3 (lsb:mysqld): Started ha2

　　没有问题，把ha1 网线插上，资源又回切到ha1。

　　Node: ha2 (0c267980-de77-4421-bcb5-9bb6b0743eef): online

　　Node: ha1 (ab30057d-03f6-4be8-a787-98c5fc7f4c64): online

　　Resource Group: group_1

　　IPaddr_10_0_0_180 (ocf::heartbeat:IPaddr): Started ha1

　　Filesystem_2 (ocf::heartbeat:Filesystem): Started ha1

　　mysqld_3 (lsb:mysqld): Started ha1

　　6.2 宕机切换

　　将ha1 强制关机在开机，这和6.2 的情况差不多，能正常切换。

　　6.3 服务切换

　　将ha1 的mysql 服务强制中断，修改/etc/my.cf

　　datadir=/u09/data #实际这个路径不存在

　　[root@ha1 ~]# ps -ef|grep mysql

　　root 4597 1 0 10:44 ? 00:00:00 /bin/sh ./bin/mysqld_safe

　　--datadir=/u01/data --pid-file=/u01/data/ha1.pid

　　mysql 4760 4597 0 10:44 ? 00:00:00 /usr/local/mysql/bin/mysqld

　　--basedir=/usr/local/mysql --datadir=/u01/data --user=mysql --log-error=/u01/data/ha1.err

　　--pid-file=/u01/data/ha1.pid --socket=/tmp/mysql.sock --port=3306

　　root 13174 4285 0 11:22 ? 00:00:00 /bin/sh /etc/init.d/mysqld start

　　root 13192 4208 0 11:22 pts/0 00:00:00 grep mysql

　　[root@ha1 ~]# kill 4597 4760

　　[root@ha1 ~]# ps -ef|grep mysql

　　root 13532 4208 0 11:23 pts/0 00:00:00 grep mysql

　　在ha2 上查看资源情况，已经切换过来。

　　Node: ha2 (0c267980-de77-4421-bcb5-9bb6b0743eef): online

　　Node: ha1 (ab30057d-03f6-4be8-a787-98c5fc7f4c64): online

　　Resource Group: group_1

　　IPaddr_10_0_0_180 (ocf::heartbeat:IPaddr): Started ha2

　　Filesystem_2 (ocf::heartbeat:Filesystem): Started ha2

　　mysqld_3 (lsb:mysqld): Started ha2

　　Failed actions:

　　mysqld_3_monitor_120000 (node=ha1, call=10, rc=1): complete

　　mysqld_3_start_0 (node=ha1, call=12, rc=1): complete

　　6.4 slave 同步测试：

　　将ha1 的heartbeat 关闭。

　　[root@ha1 data]# /etc/init.d/heartbeat stop

　　在ha2 看资源情况。

　　Node: ha2 (0c267980-de77-4421-bcb5-9bb6b0743eef): online

　　Node: ha1 (ab30057d-03f6-4be8-a787-98c5fc7f4c64): OFFLINE

　　Resource Group: group_1

　　IPaddr_10_0_0_180 (ocf::heartbeat:IPaddr): Started ha2

　　Filesystem_2 (ocf::heartbeat:Filesystem): Started ha2

　　mysqld_3 (lsb:mysqld): Started ha2

　　资源全部切换到ha2，我们执行一些操作，看是否同步。

　　[root@ha2 ~]# /usr/local/mysql/bin/mysql -u root -p -D test

　　Enter password:

　　Welcome to the MySQL monitor. Commands end with ; or \g.

　　Your MySQL connection id is 30

　　Server version: 5.1.31-log MySQL Community Server (GPL)

　　Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

　　root@test >create table t(t1 int);

　　root@test >insert into t values(1);

　　root@test >select * from t;

　　+------+

　　| t1 |

　　+------+

　　| 1 |

　　+------+

　　1 row in set (0.00 sec)

　　在slave 查看是否同步

　　[root@mysql2 ~]# /root/cron/lg.sh

　　Welcome to the MySQL monitor. Commands end with ; or \g.

　　Your MySQL connection id is 7

　　Server version: 5.1.31-log MySQL Community Server (GPL)

　　Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

　　mysql> use test

　　Database changed

　　mysql> select * from t;

　　+------+

　　| t1 |

　　+------+

　　| 1 |

　　+------+

　　1 row in set (0.00 sec)

　　在将资源切回到ha1，在进行操作，看slave 的情况

　　[root@ha1 ~]# /etc/init.d/heartbeat start

　　Node: ha2 (0c267980-de77-4421-bcb5-9bb6b0743eef): online

　　Node: ha1 (ab30057d-03f6-4be8-a787-98c5fc7f4c64): online

　　Resource Group: group_1

　　IPaddr_10_0_0_180 (ocf::heartbeat:IPaddr): Started ha1

　　Filesystem_2 (ocf::heartbeat:Filesystem): Started ha1

　　mysqld_3 (lsb:mysqld): Started ha1

　　资源已经切回到ha1，对其进行操作。

　　[root@ha1 ~]# /usr/local/mysql/bin/mysql -u root -p -D test

　　Enter password:

　　Welcome to the MySQL monitor. Commands end with ; or \g.

　　Your MySQL connection id is 3

　　Server version: 5.1.31-log MySQL Community Server (GPL)

　　Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

　　mysql> insert into t values(2);

　　mysql> insert into t values(3);

　　mysql> select * from t;

　　+------+

　　| t1 |

　　+------+

　　| 1 |

　　| 2 |

　　| 3 |

　　+------+

　　3 rows in set (0.00 sec)

　　在slave 查看是否同步

　　mysql> select * from t;

　　+------+

　　| t1 |

　　+------+

　　| 1 |

　　| 2 |

　　| 3 |

　　+------+

　　3 rows in set (0.00 sec)

　　mysql> show slave status\G;

　　*************************** 1. row ***************************

　　Slave_IO_State:Waiting for master to send event

　　Master_Host: 10.0.0.180

　　Master_User: slave162

　　Master_Port: 3306

　　Connect_Retry: 60

　　Master_Log_File: master.000003

　　Read_Master_Log_Pos: 278

　　Relay_Log_File: mysql2-relay-bin.000023

　　Relay_Log_Pos: 420

　　Relay_Master_Log_File: master.000003

　　Slave_IO_Running: Yes

　　Slave_SQL_Running: Yes

　　Master 切换后，slave 无需任何人工介入，自动同步。

　　6.5 模拟写入，切换测试。

　　写个简单的循环，

　　[root@mysql2 cron]# cat test.sh

　　for ((num=1;num<10000000;num=num+1))

　　echo

　　/usr/local/mysql/bin/mysql -u sg -psg109504 -h 10.0.0.180 -D test -e"

　　insert into t values($num);

　　if (( $? )); then

　　echo $num:no

　　else

　　echo $num:ok

　　done

　　在写入过程中，手工切换，手工reboot 等操作，和6.4 的测试结果一样。在写入过程中强制ha1 断电，需要手工处理slave 同步。将ha1 断电后，资源切到ha2 后，在slave 会报错：

　　mysql> show slave status\G;

　　*************************** 1. row ***************************

　　Slave_IO_State:

　　Master_Host: 10.0.0.180

　　Master_User: slave162

　　Master_Port: 3306

　　Connect_Retry: 60

　　Master_Log_File: master.000023

　　Read_Master_Log_Pos: 657144

　　Relay_Log_File: mysql2-relay-bin.000013

　　Relay_Log_Pos: 657286

　　Relay_Master_Log_File: master.000023

　　Slave_IO_Running: No

　　Slave_SQL_Running: Yes

　　查看slave 报错日志：

　　090225 16:54:51 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log

　　'master.000023' at postion 657144

　　090225 16:54:51 [ERROR] Error reading packet from server: Client requested master to start

　　replication from impossible position ( server_errno=1236)

　　090225 16:54:51 [ERROR] Got fatal error 1236: 'Client requested master to start replication from

　　impossible position' from master when reading data from binary log

　　090225 16:54:51 [Note] Slave I/O thread exiting, read up to log 'master.000023', position 657144

　　在master.000023 中找不到position 657144。

　　查看master 的binlog。

　　[root@ha2 data]# /usr/local/mysql/bin/mysqlbinlog master.000023|tail -10

　　/*!*/;

　　# at 610152

　　#090225 16:50:52 server id 180 end_log_pos 610241 Query thread_id=6871

　　exec_time=0 error_code=0

　　SET TIMESTAMP=1235551852/*!*/;

　　insert into t values(6867)

　　/*!*/;

　　DELIMITER ;

　　# End of log file

　　ROLLBACK /* added by mysqlbinlog */;

　　/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

　　在master 上，master.000023 最后值是610152，而在slave 却是657144。Slave 的Log_Pos比master 大，应该是在断电的时候，binlog 有部分没有及时写入磁盘，但从库已经读入。我们需要在slave 执行change maseter 操作，MASTER_LOG_FILE 在原来的基础上加1，Log_Pos 为1。

　　mysql> stop slave;

　　Query OK, 0 rows affected (0.00 sec)

　　mysql> CHANGE MASTER TO MASTER_HOST='10.0.0.180',MASTER_PORT =

　　3306,MASTER_USER

　　='slave162',MASTER_PASSWORD='nslave162',MASTER_LOG_FILE='master.000024',MAST

　　ER_LOG_POS = 000000001;

　　Query OK, 0 rows affected (0.00 sec)

　　mysql> start slave;

　　Query OK, 0 rows affected (0.00 sec)

　　mysql> show slave status\G;

　　*************************** 1. row ***************************

　　Slave_IO_State:Waiting for master to send event

　　Master_Host: 10.0.0.180

　　Master_User: slave162

　　Master_Port: 3306

　　Connect_Retry: 60

　　Master_Log_File: master.000024

　　Read_Master_Log_Pos: 106

　　Relay_Log_File: mysql2-relay-bin.000003

　　Relay_Log_Pos: 248

　　Relay_Master_Log_File: master.000025

　　Slave_IO_Running: Yes

　　Slave_SQL_Running: Yes

　　在这次断电的切换过程中，将会有657144 减去610152 条binlog 操作在slave 存在，而在masert 没有。在后面的复制中可能会产生错误，不过这些错误影响不大。如果对数据的同步要求很严格，可以在主库添加参数sync_binlog=1，这样最多将只会有1 条数据错误，不过这样将影响mysql 写性能。

第1页：前言第2页：网络、磁盘设置 MYSQL安装第3页：mysql slave 配置、同步第4页：失败测试第5页：heartbeat crm 常用命令第6页：讨论

关注我们