Re: Torque not deleting job: msg#00092 clustering.torque.user
Chris,

Here are the answers to your questions (output is from the compute node after the reboot):

1. pbs_mom is set to start automatically upon reboot, and I did confirm that it starts.

2. [root@rrmaster ~]# checknode n01-01-06

   checking node n01-01-06

   State: Running (in current state for 00:00:00)
   Configured Resources: PROCS: 8 MEM: 31G SWAP: 31G DISK: 1M
   Utilized Resources: [NONE]
   Dedicated Resources: PROCS: 1
   Opsys: linux Arch: [NONE]
   Speed: 1.00 Load: 0.190
   Network: [DEFAULT]
   Features: [dev][compute][opteron][cu01][rack01][ww26][x86_64][optdev_DRV_20070211]
   Attributes: [Batch]
   Classes: [batch 8:8][dque 7:8][loadl 8:8]

   Total Time: 13:13:36:09 Up: 13:13:23:34 (99.94%) Active: 6:20:49:40 (50.62%)

   Reservations:
     Job '1160'(x1) -00:07:01 -> 11:52:59 (12:00:00)
   JobList: 1160
   ALERT: node has 1 procs dedicated but load is low (0.190)

3. [root@rrmaster ~]# pbsnodes -l
   n01-01-02 down,job-exclusive

4. This is on a Linux cluster where both server and nodes are running Fedora Core 6.

Thanks

Adam Emerich
IBM Corporation - Rochester, MN
Staff Engineer
Office: 030-3 F305
Office: (507) 253-5483
Cell: (507) 358-2999
aemerich@xxxxxxxxxx

"Insanity: doing the same thing over and over again and expecting different results." -Albert Einstein

Chris Samuel <csamuel@xxxxxxxx>
Sent by: torqueusers-bounces@xxxxxxxxxxxxxxrg
04/20/2007 03:56 AM
To: torqueusers@xxxxxxxxxxxxxxxx
cc:
Subject: Re: [torqueusers] Torque not deleting job

On Fri, 20 Apr 2007, Adam Emerich wrote:

> I am seeing a case in which torque does not delete an interactive job if
> the node on which the job is running goes down.

Some (probably silly) questions:

Is someone starting pbs_mom on the node once the node is back up ?
What does checknode say ?
What does pbsnodes -l say ?
Is this on AIX by some chance ?

cheers,
Chris
--
 Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

_______________________________________________
torqueusers mailing list
torqueusers@xxxxxxxxxxxxxxxx
http://www.supercluster.org/mailman/listinfo/torqueusers
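
For anyone wanting to spot this state across many nodes, here is a minimal Python sketch. It assumes `pbsnodes -l` prints one node per line as a name followed by a comma-separated state list, as in the output quoted above. The function names `parse_pbsnodes_list` and `down_with_jobs` are hypothetical helpers and not part of Torque; the sample string is the single line from this thread, not fresh output.

```python
def parse_pbsnodes_list(text):
    """Parse `pbsnodes -l` style lines into {node_name: [states]}.

    Assumes each line is "<node> <state>[,<state>...]"; any further
    columns on the line are ignored.
    """
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, states = parts[0], parts[1]
        nodes[name] = [s for s in states.split(",") if s]
    return nodes


def down_with_jobs(nodes):
    """Return nodes reported both down and job-exclusive, sorted by name."""
    return sorted(
        name
        for name, states in nodes.items()
        if "down" in states and "job-exclusive" in states
    )


if __name__ == "__main__":
    # Sample taken from the `pbsnodes -l` output quoted in this thread.
    sample = "n01-01-02 down,job-exclusive\n"
    print(down_with_jobs(parse_pbsnodes_list(sample)))  # ['n01-01-02']
```

In practice the text would come from running `pbsnodes -l` on the server; a node that appears in the result is one the server considers down while still holding a job, which is the situation described in this thread.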