RE: DRBD8: Split-brain false positive on Primary/primary potential patch
linux.kernel.drbd.devel, msg#00010
I'm not sure I agree that the current behavior protects users from themselves. It only causes the split-brain if you lose the network during 'normal' operation, and nothing protects against mounting a 1-node fs on both nodes of a primary-primary DRBD cluster.

Running primary-secondary doesn't work if you are in a situation where it is not possible to switch primaryness when failing over. A good example is running a Xen virtual machine on top of a DRBD partition with support for live migration of the VM: Xen doesn't provide the means to execute a script to change primaryness at the required point in the migration.

Of course you could argue that this is a Xen bug _but_ pragmatically, the proposed patch to delay updating the UUID until an actual write occurs preserves (I believe) correctness in DRBD and works without introducing new features into Xen.

Recovering from split-brain automatically is of course incredibly valuable, but I think it can be treated orthogonally to the proposed fix.

Simon

-----Original Message-----
From: Philipp Reisner [mailto:philipp.reisner-63ez5xqkn6DQT0dZR+AlfA@xxxxxxxxxxxxxxxx]
Sent: Thursday, November 16, 2006 4:10 AM
To: drbd-dev-63ez5xqkn6DQT0dZR+AlfA@xxxxxxxxxxxxxxxx
Cc: Montrose, Ernest; Graham, Simon
Subject: Re: [Drbd-dev] DRBD8: Split-brain false positive on Primary/primary potential patch

On Tuesday, 7 November 2006 00:47, Montrose, Ernest wrote:
> When running Primary/Primary, if the Heartbeat connection goes down, when
> we recover we always split brain. Simon had an idea which I have
> implemented. He is on vacation so this may not reflect his exact idea.
>
> Essentially, with this change we do not create a new current UUID on the
> node unless I/O is seen. This prevents a split-brain from being declared
> when both nodes are primary but only one node is originating I/O and the
> other never does. The other is only a stand-by in that case.
>
> Take a look and let me know.

Hi Ernest,

I understand your reasoning, and I see the patch, which I guess does what you expect of it. I do not want to do it that way for the following reasons:

* It is only applicable if you are using a 1-node filesystem on a primary-primary DRBD cluster.
* I do not want users to do this, because with this setup it is easily possible to mount the FS on both nodes concurrently. I want to protect them from themselves ;)
* Users using a 1-node filesystem should use DRBD with primary and secondary roles.
* I would rather fix DRBD's split-brain recovery methods to deal with a cluster crash of a primary-primary cluster (this is actually item 41 in the ROADMAP file).

I have a few hours today, so I will work on this today...

-Phil
--
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :
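
For readers following the thread, the idea under discussion (create a new current UUID only once I/O is actually seen, rather than as soon as the peer is lost) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not DRBD's code: the `Node` class, `on_peer_lost`, `on_reconnect`, and the integer "UUIDs" are all hypothetical names invented for the illustration, and real DRBD tracks more state than a single current UUID.

```python
import itertools

# Hypothetical stand-in for UUID generation: each call yields a fresh value.
_fresh = itertools.count(1)


class Node:
    """Toy model of one Primary node; not DRBD's real data structures."""

    def __init__(self, shared_uuid, lazy):
        self.current_uuid = shared_uuid  # generation both nodes agree on
        self.lazy = lazy                 # True: proposed patch, False: eager
        self.uuid_pending = False

    def on_peer_lost(self):
        """Called when the replication link to the peer goes down."""
        if self.lazy:
            self.uuid_pending = True     # defer until data really diverges
        else:
            self.current_uuid = next(_fresh)

    def write(self):
        """A local write while disconnected makes the data diverge."""
        if self.uuid_pending:
            self.current_uuid = next(_fresh)
            self.uuid_pending = False


def on_reconnect(a, b, shared_uuid):
    """Decide what to do once the two nodes see each other again."""
    a_changed = a.current_uuid != shared_uuid
    b_changed = b.current_uuid != shared_uuid
    if a_changed and b_changed:
        return "split-brain"
    if a_changed:
        return "resync from a"
    if b_changed:
        return "resync from b"
    return "in sync"


def simulate(lazy):
    """Both nodes Primary, link drops, only node a writes, link returns."""
    shared = 0
    a, b = Node(shared, lazy), Node(shared, lazy)
    a.on_peer_lost()
    b.on_peer_lost()
    a.write()
    return on_reconnect(a, b, shared)
```

In this model, `simulate(lazy=False)` reports a split-brain even though node b never wrote anything, while `simulate(lazy=True)` lets node a resync to b. The model says nothing about the objection raised in the reply, namely that a 1-node filesystem can still be mounted on both nodes at once.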