Subject: Re: Oops crash in yaffs_AddOrFindLevel0Tnode during mount - msg#00098
List: linux.file-systems.yaffs
Thanks, Charles,
I also tend to think that applying your latest check-ins
is the right thing to do.
Could you please just confirm whether writing to a chunk that has not been
erased can really cause trouble (fs corruption, etc.)?
Regards,
Gennady Dagman.
On Tue, 2006-09-26 at 11:31 +1200, Charles Manning wrote:
> On Tuesday 26 September 2006 06:11, Gennady Dagman wrote:
> > Hello,
> >
> > We ran into this Linux kernel crash while mounting a yaffs2 partition
> > (please find the full Oops file attached below),
> > and from the trace-back and register analysis I conclude the following.
> >
> > The trace-back function call chain:
> > get_sb_bdev ->
> > yaffs_internal_read_super ->
> > yaffs_GutsInitialise ->
> > yaffs_CheckpointRestore ->
> > yaffs_ReadCheckpointData ->
> > yaffs_ReadCheckpointObjects ->
> > yaffs_ReadCheckpointTnodes ->
> > yaffs_AddOrFindLevel0Tnode -> memcpy
> >
> > From looking at the yaffs_AddOrFindLevel0Tnode code, it's pretty clear
> > that the only reason for the memcpy
> > (at the end of yaffs_AddOrFindLevel0Tnode) to crash is having both
> > fStruct->topLevel = 0 and fStruct->top = 0.
> >
> > It looks like this problem is not easily reproducible - we have seen it
> > only once so far, and I suspect that
> > the root cause of it (as well as of a few other odd problems we run into
> > from time to time - see, for example,
> > http://aleph1.co.uk/lurker/message/20060914.181435.951c1454.en.html) is
> > flash file system corruption.
> >
> > Questions:
> > ---------------
> >
> > 1) Can you think of any reason (other than flash fs
> > corruption) for this Oops crash?
>
> I'll have a look at that oops to see what happened.
>
> >
> > 2) I see that our yaffs code currently defines:
> >
> > #define CONFIG_YAFFS_DISABLE_CHUNK_ERASED_CHECK
> >
> > which means that the erasure check of NAND chunks is NOT performed before
> > a write. However, I know for sure that from time to time we do encounter
> > chunks that are not erased, as a result of power-off during block erasure.
> > What could be the consequences of passing such chunks to
> > yaffs_WriteChunkWithTagsToNAND? Could it cause fs corruption,
> > problems like
> > http://aleph1.co.uk/lurker/message/20060914.181435.951c1454.en.html
> > or this current crash?
>
> Try out the new code I recently checked in for handling retirement better.
> Amongst other things it handles the erased checking far better, and my hunch
> is that it will fix the problem you describe here without burdening you with
> lots of erase checks.
>
> See http://aleph1.co.uk/lurker/message/20060921.084052.7a405cd0.en.html
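
As background to question 2, here is a minimal sketch of why writing to a chunk that was not fully erased can matter. It is not yaffs code. It assumes the usual NAND behaviour that an erased chunk reads back as all 0xFF and that a program operation can only clear bits, never set them. The names nand_model_program and chunk_is_erased are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model of a NAND program operation (an assumption for illustration,
 * not a yaffs or MTD API): programming can only change bits from 1 to 0,
 * so the stored result is the AND of the old contents and the new data.
 */
void nand_model_program(uint8_t *cell, const uint8_t *data, size_t len)
{
    assert(cell != NULL && data != NULL);
    for (size_t i = 0; i < len; i++)
        cell[i] &= data[i];
}

/*
 * Returns 1 if every byte of the chunk is 0xFF (the erased state),
 * 0 otherwise. A check of this kind is what an "erased check" before
 * a write would do in this model.
 */
int chunk_is_erased(const uint8_t *chunk, size_t len)
{
    assert(chunk != NULL);
    for (size_t i = 0; i < len; i++)
        if (chunk[i] != 0xFF)
            return 0;
    return 1;
}
```

In this model, programming 0x34 into a byte that was left at 0x0F by an interrupted erase stores 0x04, so the chunk reads back differently from what was written. Whether that shows up as an ECC error or as corrupt data depends on the real driver and hardware, which this sketch does not model.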
Next Message by Date:
Re: Oops crash in yaffs_AddOrFindLevel0Tnode during mount
On Tuesday 26 September 2006 12:06, Gennady Dagman wrote:
> Thanks, Charles,
>
> I also tend to think that applying your latest check-ins
> is the right thing to do.
>
> Could you please just confirm whether writing to a chunk that has not been
> erased can really cause trouble (fs corruption, etc.)?
It is likely to fix #2, but it could fix #1 too.
Of course we shouldn't oops no matter what is on the flash :-).
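
The point about not oopsing whatever is on the flash can be illustrated with a minimal sketch. This is not the yaffs implementation: the struct, the fixed size and the function name are hypothetical stand-ins for the fStruct->topLevel / fStruct->top state described in the original report. The idea is only that state rebuilt from flash is validated and rejected with an error, rather than passed straight to memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_TNODE_BYTES 32   /* hypothetical size, for illustration only */

/* Hypothetical stand-in for the per-file tree state named in the report. */
struct sketch_file_tree {
    int topLevel;
    unsigned char *top;
};

/*
 * Copies a level-0 node into the tree top. Returns 0 on success and -1 if
 * the tree state is inconsistent. This sketch only handles topLevel == 0;
 * any other level is rejected.
 */
int sketch_store_level0(struct sketch_file_tree *t, const unsigned char *src)
{
    /* Null arguments would be a caller bug, so assert on them. */
    assert(t != NULL && src != NULL);

    /* State restored from flash may be corrupt, so fail instead of crashing. */
    if (t->topLevel != 0 || t->top == NULL)
        return -1;

    memcpy(t->top, src, SKETCH_TNODE_BYTES);
    return 0;
}
```

A caller restoring a checkpoint could treat the -1 return as "checkpoint invalid" and fall back to a full scan. That fallback is an assumption about how such an error might be used, not something stated in this thread.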