
Re: scp error: msg#00135

clustering.torque.user

Subject: Re: scp error

On Thu, Nov 30, 2006 at 03:15:44PM +0100, LEROY Christine alleged:
> Hello,
>
>
>
> We are using torque and maui alongside our grid middleware, and users are
> complaining that their jobs are sometimes failing with no output.
>
> We had a look in our logs and we can see those errors:
>
>
>
> Nov 30 02:18:31 wn021 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/pbs/spool/87831.node0.OU atlp@xxxxxxxxxxxxxxxxxxxxxx:/home/atlp/.lcgjm/globus-cache-export.Y30406/batch.out' failed with status=1, giving up after 4 attempts
>
> Nov 30 02:18:36 wn021 pbs_mom: sys_copy, command '/usr/bin/scp -rpB /var/spool/pbs/spool/87831.node0.ER atlp@xxxxxxxxxxxxxxxxxxxxxx:/home/atlp/.lcgjm/globus-cache-export.Y30406/batch.err' failed with status=1, giving up after 4 attempts
>
>
>
> (node07.datagrid.cea.fr is our pbs server, and wn021 is one of our nodes
> where pbs_mom is running)
>
>
>
> Are those files, "/var/spool/pbs/spool/87831.node0.OU" and
> "/var/spool/pbs/spool/87831.node0.ER", deleted too soon by the system on
> the pbs_mom node?
>
> Or is it possible to configure the number of attempts?
>
>
>
> Thanks in advance for your help.
>
> Cheers
>
> Christine
>
>
>
>
>
> PS: We also see the same type of error at the beginning of the job:
>
>
>
> Nov 30 04:40:21 wn021 pbs_mom: sys_copy, command '/usr/bin/scp -rpB fus176@xxxxxxxxxxxxxxxxxxxxxx:/home/fus176/.lcgjm/globus-cache-export.g22960/globus-cache-export.g22960.gpg globus-cache-export.g22960.gpg' failed with status=1, giving up after 4 attempts
>

The number of tries isn't configurable, and IMHO doesn't need to be:
generally speaking, a copy that fails once will fail the same way on
every retry until pbs_mom gives up. So 4 tries is as good as 1 try, or
50 tries.

Make sure you are on 2.1.6; 2.1.4 and 2.1.5 have some things broken in
this area.

Since this is likely an ssh configuration error, the exact error message
should have been sent to the user in an email.

If /home is shared on your cluster, add suitable $usecp lines to your
MOM config so that scp isn't used anymore.
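
As a rough sketch of what such a line could look like, assuming /home
is mounted at the same path on every node and on the host the files are
copied to or from. The wildcard host pattern and the config file
location (mom_priv/config under the MOM's home directory, which looks
like /var/spool/pbs here judging from the spool path in the logs) are
assumptions about this site, so adjust them to match:

```
# mom_priv/config on each compute node (location assumed, see above)
# Syntax: $usecp <host pattern>:<remote path prefix> <local path prefix>
# Any copy whose remote side is under /home on any host is done
# with a local copy through the shared filesystem instead of scp.
$usecp *:/home /home
```

pbs_mom has to reread its config (a restart or a HUP) before the new
line takes effect. A narrower host pattern than * is probably wiser if
some hosts do not share /home.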

