[drbd-mc] New version problems
Peter Ambroz
2011-01-17 14:56:08 UTC
Hello,

I upgraded to drbd-mc-0.8.10, and now, a few seconds after the console
starts, it says my cluster is down,
even though the cluster is in fact healthy and running; crm_mon proves
it. I didn't have this problem in 0.8.8.
The only weird thing in 0.8.8 was that it used to think that corosync was
stopped on both nodes. But it didn't affect anything. Maybe now it does.

A picture is worth 1000 words: http://files.guri.sk/drbd.png

Peter Ambroz
Rasto Levrinc
2011-01-17 15:10:57 UTC
Post by Peter Ambroz
Hello,
I upgraded to drbd-mc-0.8.10, and now, a few seconds after the console
starts, it says my cluster is down, even though the cluster is in fact
healthy and running; crm_mon proves it. I didn't have this problem in
0.8.8.
The only weird thing in 0.8.8 was that it used to think that corosync was
stopped on both nodes. But it didn't affect anything. Maybe now it does.
A picture is worth 1000 words: http://files.guri.sk/drbd.png
This must be one of those weird gentoo issues. :) DRBD MC checks
/etc/init.d/corosync status or /etc/init.d/openais status, to find out if
corosync is running. I assume now that on gentoo corosync is not called
openais.
/etc/init.d/corosync status should return 0 if corosync is running. If
these scripts don't have a status parameter,
readlink -f /proc/`pidof corosync`/exe is executed and its output is
compared with /usr/sbin/corosync.

I suspect that on gentoo /etc/init.d/corosync doesn't know status
parameter and the corosync program is not /usr/sbin/corosync. Can you
check it?
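
The two-step check described above can be sketched roughly as follows. This is only an illustration, not DRBD MC's actual code: the function name is_running and its three parameters are made up for the example, and it simplifies "the script has no status parameter" to "the status call did not return 0".

```shell
#!/bin/sh
# Hypothetical sketch of the check described above (not DRBD MC's code).
# Usage: is_running INIT_SCRIPT PROC_NAME EXPECTED_EXE
is_running() {
    init_script=$1
    proc_name=$2
    expected_exe=$3

    # Step 1: if the init script exists and "status" returns 0,
    # treat the service as running.
    if [ -x "$init_script" ] && "$init_script" status >/dev/null 2>&1; then
        return 0
    fi

    # Step 2 (fallback): find one PID of the process and compare the
    # executable it was started from with the expected path.
    pid=$(pidof -s "$proc_name" 2>/dev/null) || return 1
    [ "$(readlink -f "/proc/$pid/exe" 2>/dev/null)" = "$expected_exe" ]
}

# Example call with the paths mentioned in this thread.
if is_running /etc/init.d/corosync corosync /usr/sbin/corosync; then
    echo "corosync looks like it is running"
else
    echo "corosync does not look like it is running"
fi
```

Running both steps by hand on a node, as in the commands quoted in this thread, shows which of the two the distribution actually supports.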

Rasto
--
: Dipl-Ing Rastislav Levrinc
: DRBD MC http://oss.linbit.com/drbd-mc/
: DRBD MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.
Peter Ambroz
2011-01-17 15:17:53 UTC
Post by Rasto Levrinc
This must be one of those weird gentoo issues. :) DRBD MC checks
/etc/init.d/corosync status or /etc/init.d/openais status, to find out if
corosync is running. I assume now that on gentoo corosync is not called
openais.
Init script is OK, returns 0
# /etc/init.d/corosync status; echo $?
* status: started
0
Post by Rasto Levrinc
/etc/init.d/corosync status should return 0, if corosync is running. If
these scripts don't have status parameter,
readlink -f /proc/`pidof corosync`/exe is executed and compared if it is
/usr/sbin/corosync.
Readlink method also works for me
# readlink -f /proc/`pidof corosync`/exe
/usr/sbin/corosync
Post by Rasto Levrinc
I suspect that on gentoo /etc/init.d/corosync doesn't know status
parameter and the corosync program is not /usr/sbin/corosync. Can you
check it?
Gentoo init scripts seem to be OK, also working with LSB resources.

Comp
Rasto Levrinc
2011-01-17 15:27:31 UTC
Post by Peter Ambroz
Gentoo init scripts seem to be OK, also working with LSB resources.
That's bad. I mean that's good, but I don't know what's going on. Can you
send me the output of the following command from one of the nodes?

/usr/local/bin/drbd-gui-helper-0.8.10 get-cluster-versions


Thanks,

Rasto
Peter Ambroz
2011-01-17 15:33:27 UTC
Post by Rasto Levrinc
That's bad. I mean that's good, but I don't know what's going on. Can you
send me an output from one of the nodes?
/usr/local/bin/drbd-gui-helper-0.8.10 get-cluster-versions
Here.. equal on both nodes.
moose ~ # drbd-gui-helper-0.8.10 get-cluster-versions
/etc/corosync/corosync.conf
hb:
pm:1.0.9
cs:1.2.8
ais:wrapper
hb-rc:off
cs-ais-rc:off
hb-running:127
cs-ais-running:127
hb-conf:2
cs-ais-conf:on
drbd:8.3.8.1
drbd-mod:
drbd-loaded:1
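
For reading output like the above, a small sketch that pulls a single field out of the key:value lines may help. The helper name get_field is made up for this example, and the interpretation of the values is an assumption: it supposes that cs-ais-running carries the exit status of the helper's check, in which case 127 would be the conventional shell status for "command not found".

```shell
#!/bin/sh
# Hypothetical helper: print the value for KEY from "key:value" lines on stdin.
get_field() {
    awk -F: -v key="$1" '$1 == key { sub(/^[^:]*:/, ""); print; exit }'
}

# A few of the lines from the output above, used as sample input.
sample='pm:1.0.9
cs:1.2.8
ais:wrapper
cs-ais-running:127'

status=$(printf '%s\n' "$sample" | get_field cs-ais-running)

# Assumption: the field is an exit status, 0 meaning "running".
case $status in
    0)   echo "corosync/openais reported as running" ;;
    127) echo "status 127: in a shell this usually means command not found" ;;
    *)   echo "not running or unknown (status: $status)" ;;
esac
```

On a real node the sample text would be replaced by the output of the get-cluster-versions command shown above.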

Comp
Rasto Levrinc
2011-01-17 15:53:00 UTC
Post by Peter Ambroz
Here.. equal on both nodes.
moose ~ # drbd-gui-helper-0.8.10 get-cluster-versions
/etc/corosync/corosync.conf
pm:1.0.9
cs:1.2.8
ais:wrapper
hb-rc:off
cs-ais-rc:off
hb-running:127
cs-ais-running:127
hb-conf:2
cs-ais-conf:on
drbd:8.3.8.1
drbd-loaded:1
The problem is that there's a /usr/sbin/aisexec program that is a wrapper
script for corosync. As a quick fix you can rename aisexec. I don't
think that it is used by anything in your case. I'll try to make a proper
fix later.
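
To see whether aisexec on a given node is such a wrapper script rather than a real binary, something like the following sketch could be used. The function name is_script is made up for this example, and it only tests for a leading "#!" line, which is a heuristic and not how DRBD MC decides anything.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE is readable and starts with "#!".
is_script() {
    [ -r "$1" ] && [ "$(head -c 2 "$1")" = '#!' ]
}

f=/usr/sbin/aisexec
if is_script "$f"; then
    echo "$f is a script, possibly a wrapper"
elif [ -e "$f" ]; then
    echo "$f exists but is not a script"
else
    echo "$f is not present"
fi
```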

Rasto