>
-----Original Message-----
>
From: Charles Manning [mailto:manningc2@xxxxxxxxxxxxx]
>
Sent: Sunday, December 04, 2005 9:55 AM
>
To: yaffs@xxxxxxxxxxxxxxxxxxxxxx
>
Cc: Lawson.Reed
>
Subject: Re: [Yaffs] YAFFS2 and kswapd dead lock problem
>
>
>
On Saturday 03 December 2005 07:56, Lawson.Reed wrote:
>
> Found it.... took me 5 work days.
>
> And this deadlock issue IS in the YAFFS2 tip code on CVS here:
>
>
>
http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_fs.c?
>
rev=1.34&view
>
>=auto
>
>
>
>
Thanks for this huge effort Reed.
>
>
I am quite surprised that someone else has not stumbled on
>
this either.
>
>
Does it only impact on 2.4.x kernels or also 2.6?
>
From looking at the inode.c in 2.6 I see they are still doing the
wait_queue thing on the I_FREEING flag just the same as 2.4.
So, this deadlock issue should also be in 2.6 kernels. I do not
have a system running 2.6 so I can not verify this.
It only occurs under heavy multi thread load on the yaffs2 fs.
It is fairly infrequent and to troubleshoot it, I had to write
a torture test program.
>
> So, no one has seen this???
>
>
>
> Here is what is happening:
>
>
>
> Process 'A' grabs the YAFFS2 grossLock.
>
> Process 'B' preempts and it's job is to free unused inodes
>
everywhere.
>
> (hint: 'B' is kswapd). So, 'B' sets I_FREEING. Then it calls
>
> yaffs_clear_inode() which needs the grossLock. So, it goes
>
> on the wait queue because 'A' has the grossLock.
>
>
>
> Now process 'A' runs. It's holding the grossLock. It calls
>
> yaffs_get_inode() which calls BACK UP to iget()... With
>
> the grossLock held! That calls find_inode(). It finds
>
> I_FREEING set and then gets put on a wait queue in
>
> __wait_on_freeing_inode().
>
>
>
> Presto chango deadlock.
>
>
>
> So, my solution is to make sure the grossLock is not held when
>
> calling yaffs_get_inode(). Plus, I added grossLocking to
>
> yaffs_read_inode() since NB's comment in there is no longer
>
> true.
>
>
>
> I ran my 20 thread torture test which usually deadlocks in under 30
>
> seconds. It ran overnight with this fix. The test found no compare
>
> errors in the 20 files that it reads and writes at random times with
>
> random data and random lengths.
>
>
>
> So, I strongly suggest that someone close to the YAFFSs
>
effort review
>
> this change and incorporate it. I am kinda new to all this and I'm
>
> not even sure what the correct way to submit the changes are.
>
> So, let me know how I can help.
>
>
Did you mean to include some code?
No.
>
The easiest is to send something in patch form using "diff
>
-Naur old new", but others are fine too.
OK, how is this? Warning... I did not even compile this in the
tip YAFFS2 code because I do not have a 2.6 tree. So, I copied my
changes from my 2.4 tree into the tip yaffs2/yaffs_fs.c code.
So, please review it carefully...
I started with this:
http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_fs.c?rev=1.34&view=auto
File: [Development] / yaffs2 / yaffs_fs.c (download) (as text)
Revision: 1.34, Mon Nov 14 21:00:54 2005 UTC (2 weeks, 3 days ago) by charles
Thanks,
- Reed.
--------------------------------------------8<--------------------------------
diff -Naur yaffs2_old/yaffs_fs.c yaffs2/yaffs_fs.c
--- yaffs2_old/yaffs_fs.c 2005-12-05 09:53:10.000000000 -0800
+++ yaffs2/yaffs_fs.c 2005-12-05 10:15:32.000000000 -0800
@@ -310,6 +310,9 @@
obj = yaffs_GetEquivalentObject(obj); /* in case it was a hardlink */
+ // can not hold grossLock when calling yaffs_get_inode()
+ yaffs_GrossUnlock(dev);
+
if (obj) {
T(YAFFS_TRACE_OS,
(KERN_DEBUG "yaffs_lookup found %d\n", obj->objectId));
@@ -325,8 +328,6 @@
/*dget(dentry); // try to solve directory bug */
d_add(dentry, inode);
- yaffs_GrossUnlock(dev);
-
/* return dentry; */
return NULL;
#endif
@@ -336,7 +337,6 @@
T(YAFFS_TRACE_OS, (KERN_DEBUG "yaffs_lookup not found\n"));
}
- yaffs_GrossUnlock(dev);
/* added NCB for 2.5/6 compatability - forces add even if inode is
* NULL which creates dentry hash */
@@ -725,6 +725,8 @@
/* NB Side effect: iget calls back to yaffs_read_inode(). */
/* iget also increments the inode's i_count */
+ // RL AND you better not be holding the grossLock, or dead lock with
wait_on_freeing_inode()
+ // will occur because of kswapd (and other things)!!
return inode;
}
@@ -941,6 +943,9 @@
break;
}
+ // can not call yaffs_get_inode() with grossLock held.
+ yaffs_GrossUnlock(dev);
+
if (obj) {
inode = yaffs_get_inode(dir->i_sb, mode, rdev, obj);
d_instantiate(dentry, inode);
@@ -954,8 +959,6 @@
error = -ENOMEM;
}
- yaffs_GrossUnlock(dev);
-
return error;
}
@@ -1238,6 +1241,7 @@
{
/* NB This is called as a side effect of other functions and
* thus gross locking should always be in place already.
+ * RL Put the grossLock back in because of dead lock issue.
*/
yaffs_Object *obj;
@@ -1246,10 +1250,14 @@
T(YAFFS_TRACE_OS,
(KERN_DEBUG "yaffs_read_inode for %d\n", (int)inode->i_ino));
+ yaffs_GrossLock(dev);
+
obj = yaffs_FindObjectByNumber(dev, inode->i_ino);
yaffs_FillInodeFromObject(inode, obj);
+ yaffs_GrossUnlock(dev);
+
}
static LIST_HEAD(yaffs_dev_list);
@@ -1478,13 +1486,13 @@
("yaffs_read_super: guts initialised %s\n",
(err == YAFFS_OK) ? "OK" : "FAILED"));
+ // Can not call yaffs_get_inode() with the grossLock held.
+ yaffs_GrossUnlock(dev);
+
/* Create root inode */
if (err == YAFFS_OK)
inode = yaffs_get_inode(sb, S_IFDIR | 0755, 0,
yaffs_Root(dev));
-
- yaffs_GrossUnlock(dev);
-
if (!inode)
return NULL;
--------------------------------------------8<--------------------------------
>
>
Thanx
>
>
Charles
>