Re: Odd job reject problem: msg#00119

Subject: Re: Odd job reject problem
On Fri, 2006-12-29 at 11:40 -0500, Tim Miller wrote:
> I'm running Torque 2.1.4. I would like all of the nodes and desktop 
> computers on our internal network to be able to submit jobs, but only 
> some of them are able to and I'm not seeing why.
> 
> My setup is simple; a single routing queue that feeds into a single 
> execution queue. The queues are configured as follows:
> 
> routing:
> Queue entry
>          queue_type = Route
>          total_jobs = 0
>          state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0
>          acl_host_enable = False
>          resources_default.nodes = 1:xeon306
>          mtime = Fri Dec 29 11:19:27 2006
>          route_destinations = xeon
>          enabled = True
>          started = True
> 
> exec:
> Queue xeon
>          queue_type = Execution
>          total_jobs = 42
>          state_count = Transit:0 Queued:1 Held:0 Waiting:0 Running:41 Exiting:0
>          acl_host_enable = False
>          from_route_only = True
>          mtime = Fri Dec 29 11:19:21 2006
>          resources_assigned.nodect = 58
>          enabled = True
>          started = True
> 
> Server setup:
> Server <name removed by me>
>          server_state = Active
>          scheduling = True
>          total_jobs = 50
>          state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:50 Exiting:0
>          managers = <manager list removed>
>          default_queue = entry
>          log_events = 511
>          mail_from = adm
>          query_other_jobs = True
>          resources_assigned.nodect = 67
>          scheduler_iteration = 600
>          node_check_rate = 120
>          tcp_timeout = 6
>          pbs_version = 2.1.4
> 
> As you can see, I've explicitly set acl_host_enable to false on both 
> queues. Nonetheless, when I try to submit a job from certain hosts I get 
> a "job rejected by all possible destinations" and the following in the 
> server log:
> 
> 12/29/2006 11:20:22;0100;PBS_Server;Req;;Type AuthenticateUser request received from tim@xxxxxxxxxxxxxxxx, sock=10
> 12/29/2006 11:20:22;0100;PBS_Server;Req;;Type QueueJob request received from tim@xxxxxxxxxxxxxxxx, sock=9
> 12/29/2006 11:20:22;0100;PBS_Server;Req;;Type ReadyToCommit request received from tim@xxxxxxxxxxxxxxxx, sock=9
> 12/29/2006 11:20:22;0100;PBS_Server;Req;;Type Commit request received from tim@xxxxxxxxxxxxxxxx, sock=9
> 12/29/2006 11:20:22;0080;PBS_Server;Req;req_reject;Reject reply code=15039(Job rejected by all possible destinations), aux=0, type=Commit, from tim@xxxxxxxxxxxxxxxx
> 
> It looks like the job is never even assigned a number and is rejected 
> before it even reaches the routing queue.
> 
> I've scratched my head over this a little and just can't see what I'm 
> doing wrong. Any ideas?

What does the job look like?  It's hard to say why the job was rejected
without seeing what resources it requested.

        --Troy
-- 
Troy Baer                       troy@xxxxxxx
Science & Technology Support    http://www.osc.edu/hpc/
Ohio Supercomputer Center       614-292-9701
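
When comparing submissions from hosts that work against hosts that do not, it can help to pull the reject records out of the server log mechanically. The following is only a minimal sketch in Python, assuming the semicolon-separated layout seen in the quoted log excerpt (timestamp;event code;server;class;object;message). The function name parse_reject and the regular expression are hypothetical and not part of Torque; they are inferred from the one reject line quoted in this thread and may not cover other message formats.

```python
import re

# Pattern inferred from the single reject record quoted in this thread
# (an assumption, not a documented Torque log format).
REJECT_RE = re.compile(
    r"Reject reply code=(\d+)\(([^)]*)\), aux=(-?\d+), type=(\w+), from (\S+)"
)


def parse_reject(line):
    """Return a dict for a req_reject record, or None for any other line."""
    # Split into at most six fields so semicolons in the message survive.
    fields = line.rstrip("\n").split(";", 5)
    if len(fields) != 6 or fields[4] != "req_reject":
        return None
    match = REJECT_RE.search(fields[5])
    if match is None:
        return None
    return {
        "timestamp": fields[0],
        "code": int(match.group(1)),
        "reason": match.group(2),
        "request_type": match.group(4),
        "user_host": match.group(5),
    }


# The reject record from the quoted log, joined back onto one line.
sample = (
    "12/29/2006 11:20:22;0080;PBS_Server;Req;req_reject;Reject reply "
    "code=15039(Job rejected by all possible destinations), aux=0, "
    "type=Commit, from tim@xxxxxxxxxxxxxxxx"
)

# A non-reject record from the same excerpt, for contrast.
other = (
    "12/29/2006 11:20:22;0100;PBS_Server;Req;;Type QueueJob request "
    "received from tim@xxxxxxxxxxxxxxxx, sock=9"
)

if __name__ == "__main__":
    print(parse_reject(sample))
    print(parse_reject(other))
```

Grouping the resulting records by user_host would show which submitting hosts are being rejected, which could then be set against the resources each job requested.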

