On 4/21/07, Vadivelan Ranjith <velan.aero@xxxxxxxxx> wrote:
Hi,
Some of our compute nodes went down due to a power failure. We booted some of the nodes after a few days. After booting them, I deleted all jobs manually using qdel on the server. All jobs were deleted except two. When I type showq, I get:
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
12377 prashant Running 1 -INFINITY Fri Mar 23 15:34:49
12361 prashant Running 1 -INFINITY Fri Mar 23 15:34:49
12769 vilask Running 1 1:08:57:44 Tue Apr 17 19:42:11
12775 dmashok Running 1 1:10:46:14 Tue Apr 17 21:30:41
12777 shinisha Running 1 1:10:48:18 Tue Apr 17 21:32:45
12778 mehta Running 1 1:10:51:55 Tue Apr 17 21:36:22
12779 mehta Running 1 1:10:51:55 Tue Apr 17 21:36:22
12789 atuls Running 1 1:21:59:27 Wed Apr 18 08:43:54
12790 atuls Running 1 1:21:59:58 Wed Apr 18 08:44:25
12791 atuls Running 1 1:22:00:29 Wed Apr 18 08:44:56
12796 sndatta Running 1 2:01:59:11 Wed Apr 18 12:43:38
12768 deepa Running 1 2:02:23:59 Wed Apr 18 13:08:26
12803 dipankar Running 1 2:22:35:34 Thu Apr 19 09:20:01
12804 dipankar Running 1 2:22:45:54 Thu Apr 19 09:30:21
12805 shinisha Running 1 2:23:00:22 Thu Apr 19 09:44:49
12806 mahendra Running 1 2:23:30:20 Thu Apr 19 10:14:47
12816 mahendra Running 1 3:05:38:12 Thu Apr 19 16:22:39
12838 dmashok Running 1 4:00:31:31 Fri Apr 20 11:15:58
12839 shinisha Running 1 4:01:04:04 Fri Apr 20 11:48:31
12851 dmashok Running 1 4:11:04:12 Fri Apr 20 21:48:39
12849 vilask Running 1 4:23:25:54 Sat Apr 21 10:10:21
12850 deepa Running 1 4:23:25:54 Sat Apr 21 10:10:21
22 Active Jobs 22 of 32 Processors Active (68.75%)
14 of 16 Nodes Active (87.50%)
IDLE JOBS----------------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
0 Idle Jobs
BLOCKED JOBS----------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
12333 mahendra Deferred 1 5:00:00:00 Thu Mar 8 08:56:33
12342 dipankar Deferred 1 5:00:00:00 Thu Mar 8 10:37:22
Total Jobs: 24 Active Jobs: 22 Idle Jobs: 0 Blocked Jobs: 2
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Here the first two jobs show -INFINITY, and they are not actually running. They also cannot be deleted. I logged in to the compute nodes and ran top; the jobs are not running there. When I check the job, it shows:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
checking job 12377
State: Running
Creds: user:prashant group:prashant class:batch qos:DEFAULT
WallTime: 41:03:45:05 of 1:12:00:00
SubmitTime: Sat Mar 10 13:13:50
(Time Queued Total: 13:02:20:59 Eligible: 13:02:20:59)
StartTime: Fri Mar 23 15:34:49
Total Tasks: 1
Req[0] TaskCount: 1 Partition: DEFAULT
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
NodeCount: 1
Allocated Nodes:
[node08:1]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 2
PartitionMask: [ALL]
Flags: RESTARTABLE
Reservation '12377' ( -INFINITY -> 00:00:01 Duration: 28:19:08:37)
PE: 1.00 StartPriority: 18860
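For reference, stale entries like the two above can be picked out of a showq listing mechanically. The following is a minimal Python sketch, not part of TORQUE or the scheduler: the function name find_stale_jobs and the assumed row layout (job ID, user, state, processor count, remaining time, start time) are illustrative assumptions taken from the listing in this message, and other showq versions may format rows differently.

```python
import re

# Assumption: a job row is "JOBID USER STATE PROC REMAINING <start time...>",
# as in the showq listing quoted in this message.
ROW = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+)\s+(.*)$")


def find_stale_jobs(showq_text):
    """Return (job_id, user) for rows in state Running with -INFINITY remaining."""
    stale = []
    for line in showq_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # headers, separators, and summary lines are skipped
        job_id, user, state, _procs, remaining, _start = match.groups()
        if state == "Running" and remaining == "-INFINITY":
            stale.append((job_id, user))
    return stale


# A few rows copied from the listing above, used only as sample input.
SAMPLE = """\
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
12377 prashant Running 1 -INFINITY Fri Mar 23 15:34:49
12361 prashant Running 1 -INFINITY Fri Mar 23 15:34:49
12769 vilask Running 1 1:08:57:44 Tue Apr 17 19:42:11
12333 mahendra Deferred 1 5:00:00:00 Thu Mar 8 08:56:33
"""

if __name__ == "__main__":
    for job_id, user in find_stale_jobs(SAMPLE):
        print(job_id, user)
```

The sketch only reports candidates; it does not remove anything from the server or the scheduler.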
Can you please help me sort this out?
Velan
_______________________________________________
torqueusers mailing list
torqueusers@xxxxxxxxxxxxxxxx
http://www.supercluster.org/mailman/listinfo/torqueusers