Please take our Survey
logo       

Choosing A Webhost:
A web hosting service is a type of Internet hosting service that allows individuals and organizations to provide their own website accessible via the World Wide Web. Web hosts are companies that provide space on a server they own for use by their clients as well as providing Internet connectivity, typically in a data center. Web hosts can also provide data center space and connectivity to the Internet for servers they do not own to be located in their data center, called colocation. more...

Job Allocation problem: msg#00093

clustering.torque.user

Subject: Job Allocation problem

I have come across an interesting problem. A user of the cluster submitted a couple of hundred jobs to the queue - by default they went into the feed queue and the resources requested were such that they were moved to the long queue, which then filled up to the max_queuable limit. The jobs that remain then were moved to the parallel queue (which is logical even if I did not anticipate it). However the jobs when they went into the parallel queue were allocated 8 nodes - which is not the behaviour I would have expected. details of the queue structure are copied below.

I am currently running torque-1.2.0p2, although intend to upgrade to the latest version in the near future.

The jobs are fed into short, long, verylong, parallel

create queue short
set queue short queue_type = Execution
set queue short max_queuable = 500
set queue short max_running = 120
set queue short resources_max.cput = 02:00:00
set queue short resources_max.nodect = 1
set queue short resources_max.nodes = 1
set queue short resources_max.walltime = 04:00:00
set queue short resources_default.cput = 01:00:00
set queue short resources_default.nodes = 1
set queue short resources_default.walltime = 04:00:00
set queue short enabled = True
set queue short started = True
#
# Create and define queue feed
#
create queue feed
set queue feed queue_type = Route
set queue feed max_queuable = 3000
set queue feed route_destinations = short
set queue feed route_destinations += long
set queue feed route_destinations += verylong
set queue feed route_destinations += parallel
set queue feed enabled = True
set queue feed started = True
#
# Create and define queue long
#
create queue long
set queue long queue_type = Execution
set queue long max_queuable = 160
set queue long max_running = 40
set queue long resources_max.cput = 30:00:00
set queue long resources_max.nodect = 1
set queue long resources_max.nodes = 1
set queue long resources_max.walltime = 50:00:00
set queue long resources_default.cput = 24:00:00
set queue long resources_default.nodes = 1
set queue long resources_default.walltime = 50:00:00
set queue long enabled = True
set queue long started = True
#
# Create and define queue verylong
#
create queue verylong
set queue verylong queue_type = Execution
set queue verylong max_queuable = 20
set queue verylong max_running = 10
set queue verylong resources_max.nodect = 1
set queue verylong resources_max.nodes = 1
set queue verylong resources_min.cput = 24:00:01
set queue verylong resources_min.walltime = 24:00:01
set queue verylong resources_default.nodes = 1
set queue verylong resources_default.walltime = 288:00:00
set queue verylong enabled = True
set queue verylong started = True
#
# Create and define queue parallel
#
create queue parallel
set queue parallel queue_type = Execution
set queue parallel max_queuable = 10
set queue parallel max_running = 5
set queue parallel resources_max.nodect = 8
set queue parallel resources_max.nodes = 8
set queue parallel resources_default.nodes = 1
set queue parallel resources_default.walltime = 288:00:00
set queue parallel enabled = True
set queue parallel started = True




I have since added the following lines of configuration and expect that this will solve the particular problem, but I would like to know the reason why the resources allocated was 8 nodes


> set queue parallel resources_min.nodect = 2
> set queue parallel resources_min.nodes = 2
> set queue parallel resources_default.nodect = 2
> set queue parallel resources_default.nodes = 2




Thanks


Mark Meenan
University of Glasgow
Computing Service


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

Recently Viewed:
hardware.arm.at...    cms.citadel.dev...    video.gstreamer...    java.facelets.u...    misc.basics.qna...    web.wiki.instik...    network.uip.use...    xdg.devel/2003-...    tex.bibtex.bibd...    finance.quotesp...    ietf.zeroconf/2...    redhat.blinux.g...    suse.db2/2003-0...    php.phpesp/2004...    uml.devel/2003-...    gnome.labyrinth...    qnx.openqnx.dev...    boot-loaders.gr...    db.dataperfect....    audio.audacity....    linux.uclinux.m...    editors.j.devel...    os.openbsd.tech...    kde.users.multi...   
Home | advertise | OSDir is an inevitable website. super tiny logo

Free Magazines

Cisco News
Receive a free quarterly e-newsletter with exclusive articles on how Cisco IT uses its own products and solutions to enable the business.
subscribe

Systems Management News, the newspaper for IT systems administration and data center managers! Each issue of Systems Management News is chock-full of news and analysis to help you understand what's happening in your field.
subscribe

The Enterprise Newsweekly eWeek is the essential technology information source for builders of e-business.
subscribe

Oracle Magazine Oracle Magazine contains technology strategy articles, sample code, tips, Oracle and partner news, how to articles for developers and DBAs, and more. Oracle (NASDAQ: ORCL) is the world's largest enterprise software company.
subscribe

Total Telecom Total Telecom is "The Economist of the communications industry".
subscribe

Navigation