osdir.com
mailing list archive

Subject: Re: Data file corruption and recovery - msg#00080

List: web.zope.zodb

My first guess would be that the storage was shut down without an index
file, and the slowness was just the scan of the data file itself to
recreate the index.

Jeremy
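
A minimal sketch of why a missing index makes startup slow. This is a toy append-only record file, not FileStorage's actual on-disk format or code; the record layout and the names append_record and rebuild_index are assumptions made for illustration. Without a saved index, the only way to learn where each object's newest record lives is to read the whole file from the start, so the cost grows with the file size.

```python
import io
import struct

# Hypothetical record header: 8-byte object id, 4-byte payload length.
HEADER = struct.Struct(">QI")


def append_record(f, oid, payload):
    """Append one record at the end of the file and return its offset."""
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(HEADER.pack(oid, len(payload)) + payload)
    return offset


def rebuild_index(f):
    """Scan the whole file and map each oid to the offset of its newest record."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        header = f.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of file, or a truncated trailing header
        oid, length = HEADER.unpack(header)
        payload = f.read(length)
        if len(payload) < length:
            break  # truncated trailing payload: ignore the partial record
        index[oid] = offset  # later records for the same oid win
    return index
```

With a saved index the dictionary could simply be loaded from disk; rebuild_index is the slow fallback that touches every byte of the data file.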



_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@xxxxxxxx
http://mail.zope.org/mailman/listinfo/zodb-dev



Thread at a glance:

Previous Message by Date:

Re: Data file corruption and recovery

Jeremy,

I never got the zeo server to tell me anything (I started it in debug mode to make sure). I wasn't in a very patient mood, though, so I didn't wait more than 3 minutes or so. I figured that the problem was some sort of corruption, as the older file loaded with no problem. I don't see any .trN file, but maybe this is due to my impatience. I have the original file, but it is fairly big (67MB when gzipped). fsrecover didn't say anything other than that 0 data was lost during recovery. As Toby suggests, it may be possible that the machine was writing bad data for a while as a result of the cpu failure.

Is it possible that there was no problem and that I should have been more patient with the startup time of zeo?

-EAD

Jeremy Hylton wrote:

> On Thu, 2003-02-13 at 09:33, Erik Dahl wrote:
> > Yesterday I had a cpu failure on a box that caused the sudden reboot of a zeo server. When the service was brought up on the other side of the cluster it didn't start. I figured this was due to data corruption, and when using a backup the server started fine. The problem was that the backup was a little stale, so I wanted to try recovering the corrupt file. I found two methods for fixing the file: running fsrecover.py, or running tranalyzer.py and then using its output to truncate the data file. fsrecover.py did fix my problem, but only after running for around 6 hours and generating no output other than to say that no data was lost. The tranalyzer method never worked. My questions are:
>
> What happened when you tried to start up the zeo server? I must admit that I haven't run into this problem in real deployment, and I don't remember what the storage / server is supposed to say. Did it create a .trN file? That would indicate it figured out what transactions to delete.
>
> Do you have the original file? If you run into file storage corruption problems, it's helpful to us developers if you keep a copy of the damaged file.
>
> > 1. how can you figure out what the server is doing when you have a corrupted file (I tried setting STUPID_LOG_SEVERITY to -300 with no results).
>
> It did say something, right? Otherwise you would not have known that the storage was corrupted. There should be some complaint during the initial startup, but if it starts up successfully I wouldn't expect further error reports in the log.
>
> > 2. any idea why taking transactions off the end of the file didn't fix the problem?
>
> Do you know what changes fsrecover made to fix the problem?
>
> > 3. would directory storage handle this situation better or do I need to go to a berkeley db backend?
>
> It's surprising that truncating the file didn't solve the problem. FileStorage should be pretty robust against these sorts of crashes. It calls sync() in tpc_finish(), so it's quite unlikely that a reboot would do anything other than leave incomplete transaction data at the end.
>
> Jeremy
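
The point about sync() in tpc_finish() can be illustrated with a toy log. This is only a sketch under stated assumptions: the framing (a 4-byte length before and after each payload) and the names commit and truncate_incomplete_tail are hypothetical, and none of this is FileStorage's real format or the logic of fsrecover.py or tranalyzer.py. If every commit is forced to disk before it returns, a sudden reboot should leave at most one partial record at the end of the file, and recovery amounts to cutting that tail off.

```python
import os
import struct

# Hypothetical framing: 4-byte length before and after each payload.
LEN = struct.Struct(">I")


def commit(path, payload):
    """Append one framed record and force it to disk before returning."""
    with open(path, "ab") as f:
        f.write(LEN.pack(len(payload)) + payload + LEN.pack(len(payload)))
        f.flush()
        os.fsync(f.fileno())  # plays the role of the sync() call mentioned above


def truncate_incomplete_tail(path):
    """Cut off a partial trailing record; return the number of bytes removed."""
    size = os.path.getsize(path)
    good = 0  # offset just past the last complete record
    with open(path, "r+b") as f:
        while True:
            head = f.read(LEN.size)
            if len(head) < LEN.size:
                break
            (length,) = LEN.unpack(head)
            body = f.read(length + LEN.size)
            # A complete record repeats its length after the payload.
            if len(body) < length + LEN.size or body[length:] != head:
                break
            good = f.tell()
        f.truncate(good)
    return size - good
```

Truncation of this kind only repairs an incomplete tail. If bad data was written earlier in the file, as Toby suggests may have happened here, cutting records off the end would not be expected to help.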

Next Message by Date:

Re: AdaptableStorage and ZClass instances

On Thu, 13 Feb 2003 15:47:38 -0500 Shane Hathaway <shane@xxxxxxxx> wrote:

> However, at configuration time, you have access to the classes of the
> objects to be loaded/stored. During configuration, you could take the
> opportunity to automatically configure mappers and gateways based on
> class inspection.

Can I only access the class at configuration time? Can't one get access to the class at runtime through the MetaTypeClassifier?

> Application code might look very much like the
> SQLObject example (see Christian's post), except that classes would use
> ZODB.Persistent as a base class (rather than SQLObject) and the same
> code would work in regular ZODB.

Mmm, I don't know if I want to put any storage-specific details in my application code just to make the OR mapping work. What I think is genius about AS is that I can keep writing apps in the way I'm used to and in a way that fits the framework, i.e. I don't have to subclass from this or that to persist my classes. I'd rather do more work on the side of the OR mapper and teach it how to persist objects in my problem domain.

For a start, one can definitely have a SQL base class that implements basic load and store methods, which leaves you with only having to specify SQL strings in the subclass, like Rocky suggested. But then one still has to write SQL for each class that you want to persist.

My idea was to have a default properties gateway that generates SQL at runtime for all classes that don't have a specific gateway. This seems problematic if one doesn't have access to the object, and since this is a default gateway one wouldn't know anything about the class either. All one really needs is the class name and the property types and values on the object. Is this at all possible? The class name can then be used in the load and store methods of the gateway to query the right table, and property types will be looked up in a typemap for the RDBMS.

--
Roché Compaan
Upfront Systems
http://www.upfrontsystems.co.za
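
A rough sketch of the default-gateway idea described above, assuming the gateway is handed the object itself (which is exactly the open question in the message). It does not use AdaptableStorage's API; TYPEMAP, table_sql, store and the Invoice class are hypothetical names, and SQLite stands in for the RDBMS. The table name comes from the class name and the column types from a typemap lookup on each property's type.

```python
import sqlite3

# Hypothetical typemap from Python property types to SQL column types (SQLite here).
TYPEMAP = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "INTEGER"}


def table_sql(obj):
    """Build a CREATE TABLE statement from the object's class name and properties."""
    cols = ", ".join(
        f'"{name}" {TYPEMAP[type(value)]}' for name, value in sorted(vars(obj).items())
    )
    return f'CREATE TABLE IF NOT EXISTS "{type(obj).__name__}" ({cols})'


def store(conn, obj):
    """Create the table if needed and insert the object's property values."""
    props = sorted(vars(obj).items())
    names = ", ".join(f'"{name}"' for name, _ in props)
    marks = ", ".join("?" for _ in props)
    conn.execute(table_sql(obj))
    conn.execute(
        f'INSERT INTO "{type(obj).__name__}" ({names}) VALUES ({marks})',
        [value for _, value in props],
    )


class Invoice:  # plain class: no storage-specific base class
    def __init__(self, number, total):
        self.number = number
        self.total = total
```

Calling store(conn, Invoice(7, 12.5)) on a sqlite3 connection creates an "Invoice" table with "number" and "total" columns and inserts one row, with no SQL written for the class by hand.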

Next Message by Thread:

Re: Data file corruption and recovery

On Thursday 13 February 2003 2:33 pm, Erik Dahl wrote:

> Yesterday I had a cpu failure on a box that caused the sudden reboot of
> a zeo server.

cpu failure - That sounds bad. How do you know the failure really was sudden, and that your server hadn't been writing badness into this storage for a while before the reboot? Did you check the filesystem containing the storage?

> 3. would directory storage handle this situation better

I've never been able to provoke a failure by pulling the power plug from a loaded FileStorage.

--
Toby Dickenson
http://www.geminidataloggers.com/people/tdickenson