On Tuesday, 7 November 2006 00:16, Montrose, Ernest wrote:
> Hi all,
>
> I have submitted this issue before, sorry for the resubmit. Essentially, on
> the primary node, if I do an ifdown on the heartbeat interface and I have
> fencing set to "resource-only", then on the primary node the
> outdate-peer script gets called twice: once for state Disconnecting,
> and once for state Networkfailure.

Maybe the return code of the outdate-peer handler did not indicate
success.

> I also notice that if, on a node that is primary, I issue a "drbdadm
> secondary r0", the outdate-peer script gets called again, this time from
> drbd_set_role().

Maybe the return code of the outdate-peer handler did not indicate
success.

> What is the exact policy for the outdate-peer script?

This is the section of the ROADMAP file that describes how it *should*
work:

7 Handle split brain situations; Support IO fencing;

New commands:
  drbdadm outdate r0
    When the device is configured, this works via an ioctl() call.
    Otherwise it modifies the meta-data directly by calling drbdmeta.

remove option: on-disconnect

New meta-data flag: "Outdated"

introduce:
  disk {
    fencing [ dont-care | resource-only | resource-and-stonith ];
  }
  handlers {
    outdate-peer "some script";
  }

If the disk state of the peer is unknown, drbd calls this
handler (yes, a call to userspace from kernel space). The handler's
return codes are:
  3 -> peer is inconsistent
  4 -> peer is outdated (this handler outdated it) [ resource fencing ]
  5 -> peer was down / unreachable
  6 -> peer is primary
  7 -> peer got stonithed [ node fencing ]
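
As a rough illustration only, the return codes listed above can be written down as a lookup table. The names OUTDATE_PEER_CODES and describe_exit_status are made up for this sketch and are not part of DRBD; the sketch only shows which exit statuses are documented, and says nothing about how DRBD reacts to an undocumented one.

```python
# Return codes of the outdate-peer handler, as listed in the ROADMAP excerpt.
# The table name and the helper below are hypothetical, for illustration only.
OUTDATE_PEER_CODES = {
    3: "peer is inconsistent",
    4: "peer is outdated (this handler outdated it)",
    5: "peer was down / unreachable",
    6: "peer is primary",
    7: "peer got stonithed",
}


def describe_exit_status(status: int) -> str:
    """Return the documented meaning, or flag the status as undocumented."""
    meaning = OUTDATE_PEER_CODES.get(status)
    if meaning is None:
        return f"{status}: not a documented outdate-peer return code"
    return f"{status}: {meaning}"


if __name__ == "__main__":
    for status in (0, 4, 7):
        print(describe_exit_status(status))
```

Note that an exit status of 0, the usual shell convention for success, does not appear in the list.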

Let us assume that we have two boxes (N1 and N2) and that these
two boxes are connected by two networks (net and cnet [ clients' net ]).
Net is used by DRBD, while heartbeat uses both net and cnet.

I know that you are talking about fencing by STONITH, but DRBD is
not limited to that. Here is my understanding of how resource fencing
should work with DRBDv8:

N1   net   N2
P/S  ---   S/P   everything up and running.
P/?  - -   S/?   network breaks; N1 freezes IO
P/?  - -   S/?   N1 fences N2:
                 In the STONITH case: turn off N2.
                 In the resource fencing case:
                   N1 asks N2 to fence itself from the storage via cnet.
                   HB calls "drbdadm outdate r0" on N2.
                   N2 replies to N1 that fencing is done via cnet.
                   The outdate-peer script on N1 returns success to DRBD.
P/D  - -   S/?   N1 thaws IO

N2 now has the "Outdated" flag set in its meta-data, by the outdate
command.

Setting "fencing" to "resource-only" enables this behaviour. In the
resource-only case the outdate-peer handler should return
3, 4, 5 or 6, but should not return 7.

If "fencing" is set to "resource-and-stonith", all IO operations
are frozen immediately upon loss of connection (even IO operations
that are currently outstanding will not finish).
Then the "outdate-peer" handler is started. In this configuration
the outdate-peer handler may return any of the documented return
values.
When the outdate-peer handler returns, IO is resumed.
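
The relation between the fencing setting and the permitted handler return values might be expressed as follows. This is only a sketch of the rule as described in the text; ALLOWED_CODES and handler_code_allowed are hypothetical names, and "dont-care" is left out because the text does not say what the handler may return in that mode.

```python
# Permitted outdate-peer return values per fencing policy, as described above.
# Hypothetical names; "dont-care" is omitted because the text does not cover it.
ALLOWED_CODES = {
    "resource-only": {3, 4, 5, 6},            # 7 (stonithed) must not be returned
    "resource-and-stonith": {3, 4, 5, 6, 7},  # any documented value
}


def handler_code_allowed(fencing: str, code: int) -> bool:
    """Check a handler return value against the given fencing policy."""
    if fencing not in ALLOWED_CODES:
        raise ValueError(f"no rule described for fencing policy {fencing!r}")
    return code in ALLOWED_CODES[fencing]


if __name__ == "__main__":
    print(handler_code_allowed("resource-only", 7))
    print(handler_code_allowed("resource-and-stonith", 7))
```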

Notes:
* Why do we need to freeze IO in the "resource-and-stonith" case?
  Stonith protects you when all communication paths fail. In
  that case both (isolated) nodes try to stonith each other.
  If the current primary continued to allow IO, it could
  accept transactions and then get stonithed by the
  current secondary node.
  -> Others could therefore see committed transactions that
  would be gone after the successful stonith operation.
* The outdate-peer handler also gets called if an unconnected
  secondary wants to become primary.
  In other words, it may only become primary when it knows that
  the peer is outdated/inconsistent.
* We need to store the fact that the peer is outdated/inconsistent
  in the meta-data, to allow a standalone primary to be rebooted.

Does this answer your question?
-Phil
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :
Next Message by Thread:
RE: DRBD8: Fencing and outdate-peer handler getting called multiple times
Phil,
My outdate-peer handler simply runs "echo /deve/drbd0: Running handler for
outdate-peer >>/tmp/drbdio.log".
The log file is created and populated, so I can only assume that the handler
did indicate success, success in this case being an exit status of 0.
Thanks,
EM--
-----Original Message-----
From: Philipp Reisner
[mailto:philipp.reisner-63ez5xqkn6DQT0dZR+AlfA@xxxxxxxxxxxxxxxx]
Sent: Thursday, November 16, 2006 3:54 AM
To: drbd-dev-63ez5xqkn6DQT0dZR+AlfA@xxxxxxxxxxxxxxxx
Cc: Montrose, Ernest
Subject: Re: [Drbd-dev] DRBD8: Fencing and outdate-peer handler getting called
multiple times