Discussion:
[drbd-mc] Kind of timeout problem
INVITU
2012-03-23 19:57:45 UTC
Dear Rasto

First, thanks for your work.

I have 40 resources (DRBD and VMs) on 2 nodes running
corosync and pacemaker.
Cluster management no longer works in the CRM view. Everything seems to
be stuck. (I still have access to DRBD management.)
It was working fine until I reached 30 resources.

I launch LCMC from the LAN so there are no network problems

I can't find any logs.
Could you help me debug this?

Thanks in advance

Best regards

Cyril
Rasto Levrinc
2012-03-24 08:01:53 UTC
There were some fixes in this area in the upcoming release. Could you
please check whether it is fixed for you?

http://sourceforge.net/projects/lcmc/files/testing/

If not, run it with the --debug 3 option and send me the output.

There are still some timeouts that could affect this. They are not
configurable at the moment, but it would be easy to make them so.

There were some changes in 1.3.0. Could you try version 1.2.3 and see
whether it has the same problem?

http://sourceforge.net/projects/lcmc/files/all-releases/

It could also be that it is not the number of resources, but something in
the config, that causes the freeze. Could you post the output of
crm configure show?

What OS and Java version do you have?

Thanks,

Rasto
--
Dipl.-Ing. Rastislav Levrinc
rasto.levrinc at gmail.com
Linux Cluster Management Console
http://lcmc.sf.net/
INVITU
2012-03-26 23:23:24 UTC
Thanks to Rasto, the problem is solved.

On the server side, ssh must be tuned, as LCMC can use 5 ssh sessions
on each server.
An ssh timeout was also one of the reasons for my problem.
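
For anyone hitting the same thing: the thread does not say which ssh settings were changed, so the following is only a rough sketch under my own assumptions. It reads sshd_config-style text and flags the two OpenSSH directives that cap sessions and connections (MaxSessions and MaxStartups) if they are set below the 5 sessions mentioned above. The function names and the threshold constant are made up for illustration. It only handles plain "Keyword value" lines, ignores Match blocks, and does not try to guess which timeout was involved.

```python
# Sketch only: check sshd_config-style text for limits that could get in
# the way of a client that opens several ssh sessions per server.
# Assumption: MaxSessions and MaxStartups are the relevant directives;
# the thread itself does not name the settings that were tuned.

REQUIRED_SESSIONS = 5  # from the thread: LCMC can use 5 ssh sessions per server


def parse_sshd_config(text):
    """Return {lowercased keyword: value}, keeping the first occurrence."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword without a value
        key, value = parts[0].lower(), parts[1].strip()
        # sshd uses the first value it obtains for a keyword
        settings.setdefault(key, value)
    return settings


def check_limits(settings, required=REQUIRED_SESSIONS):
    """Return a list of warnings for limits set below `required`."""
    warnings = []
    max_sessions = settings.get("maxsessions")
    if max_sessions is not None and int(max_sessions) < required:
        warnings.append(f"MaxSessions {max_sessions} is below {required}")
    max_startups = settings.get("maxstartups")
    if max_startups is not None:
        # MaxStartups may be "start" or "start:rate:full"
        start = int(max_startups.split(":")[0])
        if start < required:
            warnings.append(f"MaxStartups {max_startups} starts below {required}")
    return warnings


if __name__ == "__main__":
    sample = "MaxSessions 2\nMaxStartups 3:30:10\n"
    for warning in check_limits(parse_sshd_config(sample)):
        print(warning)
```

Directives that are not set at all are left alone, since the compiled-in defaults depend on the OpenSSH version. To check a real server, pass the contents of its sshd_config (commonly /etc/ssh/sshd_config) to parse_sshd_config.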