
Re: OSC mpiexec with torque on Fedora6: msg#00156

Subject: Re: OSC mpiexec with torque on Fedora6

On Jan 30, 2007, at 1:34 PM, Tony Schreiner wrote:

I want to upgrade a cluster from Fedora 4 to Fedora 6, but I am stuck on the OSC mpiexec part.

I have torque 2.1.6-1 from the Fedora repo installed.

mpiexec compiles fine. I used
./configure --with-default-comm=mpich-p4

My script, dompi, is basically

/path/to/mpiexec ./app

I submit the dompi script with
qsub -l nodes=nodeX dompi

On the node I upgraded (node5), I get this in the error log:
mpiexec: Error: get_hosts: pbs_connect: no error.

This is because pbs_connect(0) in get_hosts.c returns -1 for me on this node. I guess it's supposed to return the number of available nodes.

It still works on the other nodes, though.

Is it some sort of host resolution error? Everything seems fine to me.


If I may answer my own question: I got the vital clue from Pete Wyckoff at OSC. The error pointed to a problem with the pbs_iff program.

I had installed the torque, torque-mom and libtorque RPMs from Fedora, but had not installed torque-client, which is where pbs_iff is found. After I installed it, the problem was solved.
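
For anyone hitting the same error, a quick check along these lines might help. It is only a sketch, assuming an RPM-based node and the Fedora package names mentioned above. The have_cmd helper and the extra sbin directories searched are my own assumptions, not something from the torque packages, and pbs_iff may live somewhere else on your install.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command can be found via PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the packages named in this thread are installed.
if have_cmd rpm; then
    for pkg in torque torque-mom libtorque torque-client; do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg: installed"
        else
            echo "$pkg: MISSING"
        fi
    done
else
    echo "rpm not found; this check assumes an RPM-based system"
fi

# Look for pbs_iff itself. The sbin directories are an assumption about
# where it may be installed; the subshell keeps the PATH change local.
if (PATH="$PATH:/usr/sbin:/usr/local/sbin"; have_cmd pbs_iff); then
    echo "pbs_iff: found"
else
    echo "pbs_iff: not found (is torque-client installed?)"
fi
```

Run it on the node that fails and on one that works, then compare the output.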

Tony Schreiner

