Discussion:
[drbd-mc] DRBD sync stalled after abrupt server shutdown
vijay patel
2012-04-25 02:20:42 UTC
Hi Friends,

I have a DRBD and Heartbeat setup for a production file server. The cluster has two servers running the following versions on RHEL 5.8.

[vpatel@Pd02 ~]$ rpm -qa | grep drbd
drbd82-8.2.6-1.el5.centos
kmod-drbd82-8.2.6-2
[vpatel@Pd02 ~]$ rpm -qa | grep heartbeat
heartbeat-gui-2.1.4-11.el5
heartbeat-pils-2.1.4-11.el5
heartbeat-2.1.4-11.el5
heartbeat-ldirectord-2.1.4-11.el5
heartbeat-stonith-2.1.4-11.el5
heartbeat-devel-2.1.4-11.el5


Until now everything was working fine. Today we had an outage and one of the DRBD servers was shut down abruptly. Once the server came back up, the two servers were out of sync. On the secondary server I found a "split brain detected" message in the logs.

To resync the two servers I run the 'connect' command. The sync starts but stalls after a few seconds. Below is the output of the commands I tried.

DRBD secondary server:

[root@Pd02 ~]# drbdadm disconnect all
[root@Pd02 ~]# drbdadm -- --discard-my-data connect all
[root@Pd02 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
ns:0 nr:0 dw:8041264 dr:929 al:46 bm:102 lo:0 pe:0 ua:0 ap:0 oos:1260216
[>....................] sync'ed: 0.4% (1260216/1260216)K
finish: 1:45:01 speed: 0 (0) K/sec
[root@Pd02 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
ns:0 nr:0 dw:8041264 dr:929 al:46 bm:102 lo:0 pe:0 ua:0 ap:0 oos:1260216
[>....................] sync'ed: 0.4% (1260216/1260216)K
finish: 4:22:32 speed: 0 (0) K/sec
[root@Pd02 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
ns:0 nr:0 dw:8041264 dr:929 al:46 bm:102 lo:0 pe:0 ua:0 ap:0 oos:1260216
[>....................] sync'ed: 0.4% (1260216/1260216)K
finish: 8:45:05 speed: 0 (0) K/sec
[root@Pd02 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
ns:0 nr:412 dw:8020164 dr:929 al:46 bm:102 lo:0 pe:0 ua:0 ap:0 oos:1260216
[>....................] sync'ed: 0.4% (1260216/1260216)K
finish: 17:30:10 speed: 0 (0) K/sec
[root@Pd02 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate B r---
ns:0 nr:412 dw:8020164 dr:929 al:46 bm:102 lo:0 pe:0 ua:0 ap:0 oos:1260216
[>....................] sync'ed: 0.4% (1260216/1260216)K
stalled
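
In the samples above the oos counter (the out-of-sync amount, which appears to be reported in K) stays at 1260216, which is what separates a real stall from a merely slow resync. A rough way to check this without reading the output by eye is sketched below. It assumes the 8.2-style /proc/drbd layout shown here and a single resource; the helper names drbd_oos and sync_state are made up for illustration and are not part of DRBD.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not shipped with DRBD.

# Print the first oos: value found in /proc/drbd-style text read from stdin.
# With several resources only the first one is reported.
drbd_oos() {
    sed -n 's/.*oos:\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare two oos samples taken some time apart.
# $1 = earlier sample, $2 = later sample.
# Prints "progressing" if the out-of-sync amount dropped, otherwise "stalled".
sync_state() {
    if [ "$2" -lt "$1" ]; then
        echo progressing
    else
        echo stalled
    fi
}

# Usage on a live node (the 30 second interval is an arbitrary choice):
#   a=$(drbd_oos < /proc/drbd); sleep 30; b=$(drbd_oos < /proc/drbd)
#   sync_state "$a" "$b"
```

Comparing oos rather than the "finish:" estimate avoids being misled by the estimate itself, which keeps growing in the output above even though nothing is being transferred.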

I tried to force a sync with the invalidate command but got the following message.

[root@Pd02 ~]# drbdadm invalidate all
Device '/dev/drbd0' is configured!
Command 'drbdmeta /dev/drbd0 v08 /dev/sdb2 0 invalidate' terminated with exit code 20
command exited with code 20


Can anyone tell me how I can get my DRBD servers back in sync?

Regards,
Vijay
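
For reference, the manual split-brain recovery attempted above is usually described as a short sequence: on the node whose changes are to be thrown away, demote the resource and reconnect with --discard-my-data; on the other node, reconnect only if it has dropped to StandAlone. The sketch below just prints those commands instead of running them, so they can be reviewed first. The helper name split_brain_cmds and the resource name r0 are hypothetical, and the sequence is an assumption based on the commands in this post plus the standard drbdadm secondary and drbdadm connect subcommands. It is not expected to cure a resync that stalls after connecting.

```shell
#!/bin/sh
# Hypothetical helper for illustration: prints (does not execute) the manual
# split-brain recovery commands for one resource.
# $1 = resource name, $2 = role of this node: "victim" or "survivor".
split_brain_cmds() {
    case "$2" in
        victim)
            # Node whose local changes will be discarded.
            echo "drbdadm disconnect $1"
            echo "drbdadm secondary $1"
            echo "drbdadm -- --discard-my-data connect $1"
            ;;
        survivor)
            # Only needed if this node is in StandAlone state.
            echo "drbdadm connect $1"
            ;;
        *)
            echo "usage: split_brain_cmds <resource> victim|survivor" >&2
            return 1
            ;;
    esac
}

# Example: review the commands for the secondary node before running them.
#   split_brain_cmds r0 victim
```

Printing rather than executing is a deliberate choice here, since --discard-my-data throws away data on the node where it runs.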
Trevor Hemsley
2012-04-25 08:42:44 UTC

You're running a pretty old version of DRBD there. The CentOS 'extras'
repo has 8.3.12 in it and that upgrade fixed someone else's problem with
sync stalling the other day.

Trevor
vijay patel
2012-04-25 13:31:34 UTC
Isn't there any other way to solve this? Earlier, when there were sync issues, I was able to solve them with the connect all command. I don't know what is happening this time.

I can only apply updates during the weekend, as I have to schedule downtime for them.

Regards,
Vijay



