Steve Young wrote:
Thanks Dave. This is why I am wondering how torque checks an OS to
verify how much memory is being used. I suspect that when the job is
first being started that a lot more resources are used but after it's
underway it evens out to expected operation. I am hoping once I can find
out how torque does it that perhaps I can do the same from the command line
to try to find out for myself why torque thinks that it needs so much
more memory.
Basically just add up what you find belonging to the job from ps aux.
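As a rough sketch of that adding up (not anything taken from Torque itself): the routine below totals VSZ and RSS for one session ID from text in the shape printed by something like "ps -eo sid,vsz,rss". The ps invocation, the struct and the name sum_session are my own assumptions for illustration. Plain "ps aux" does not print the session ID, which is what the code further down matches on.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Totals for one job session, in KiB as ps reports VSZ and RSS. */
struct job_usage {
    unsigned long long vsz_kib;
    unsigned long long rss_kib;
    int nprocs;
};

/*
 * Hypothetical helper: sum VSZ and RSS over every line of `text` whose
 * first column equals `sid`. Each line is expected to hold "SID VSZ RSS";
 * lines that do not parse as three integers (the header, for instance)
 * are skipped.
 */
static struct job_usage sum_session(const char *text, long sid)
{
    struct job_usage u = {0, 0, 0};
    const char *line = text;

    assert(text != NULL);
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        char buf[128];
        long line_sid;
        unsigned long long vsz, rss;

        /* Copy one line out so sscanf cannot read into the next one. */
        if (len < sizeof buf) {
            memcpy(buf, line, len);
            buf[len] = '\0';
            if (sscanf(buf, "%ld %llu %llu", &line_sid, &vsz, &rss) == 3
                && line_sid == sid) {
                u.vsz_kib += vsz;
                u.rss_kib += rss;
                u.nprocs++;
            }
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return u;
}
```

Feed it the ps output (read through popen, say) and the job's session ID, then compare the totals against what MOM reports for the job.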
You bring up an interesting point... having MOM ignore resource usage
for young processes. I didn't see anything on the parameters page for MOM to
configure this. Would you mind elaborating on how you did that? =).
Thanks in advance,
Fairly simplistically. These are cut-down versions of the routines in
mom_mach.c that find a job's vmem and mem respectively (I've chopped out
the gory shared memory details but left the gratuitous macros in).
David
static memsize_t mem_sum(job *pjob)
{
    char *id="mem_sum";
    memsize_t memsize=0;
    int iproc;

    for (iproc=0; iproc<nproc; iproc++) {
        psinfo_t *pi = &proc_info[iproc];

        if (!injob(pjob, pi->pr_sid)) continue;
        /*
         * A feeble attempt to ignore the memory use of recently forked
         * processes - ignore processes less than 2 seconds old
         */
        if ( time_now < (time_t) ISECS(pi->pr_start) + 2 ) continue;
        if ( PRVMEM_TO_BYTES(pi->pr_size) < PROC_MEM_MAX)
            memsize += PRVMEM_TO_BYTES(pi->pr_size);
    }
    return (memsize);
}

static memsize_t resi_sum(job *pjob)
{
    char *id="resi_sum";
    memsize_t resisize=0;
    int iproc;

    for (iproc=0; iproc<nproc; iproc++) {
        psinfo_t *pi = &proc_info[iproc];

        if (!injob(pjob, pi->pr_sid)) continue;
        /*
         * A feeble attempt to ignore the memory use of recently forked
         * processes - ignore processes less than 2 seconds old
         */
        if ( time_now < (time_t) ISECS(pi->pr_start) + 2 ) continue;
        if (PRRSS_TO_BYTES(pi->pr_rssize) < PROC_MEM_MAX)
            resisize += PRRSS_TO_BYTES(pi->pr_rssize);
    }
    return (resisize);
}
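
If you want to play with the idea outside of MOM, here is a minimal standalone sketch of the same age filter. It uses none of Torque's types or macros: struct proc_sample, job_vmem and the grace_secs parameter are all made up for illustration, and it leaves out the PROC_MEM_MAX sanity check. The usual motivation, as far as I understand it, is that a freshly forked child still reports its parent's address space until it execs, so counting it briefly doubles the job's apparent usage.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-process sample; not a Torque or OS structure. */
struct proc_sample {
    long sid;                  /* session ID the process belongs to */
    time_t start;              /* process start time, seconds since epoch */
    unsigned long long vsize;  /* virtual size in bytes */
};

/*
 * Sum the virtual size of every process in session `job_sid`, skipping
 * any process that started less than `grace_secs` seconds before `now`.
 */
static unsigned long long job_vmem(const struct proc_sample *procs,
                                   size_t nprocs, long job_sid,
                                   time_t now, time_t grace_secs)
{
    unsigned long long total = 0;
    size_t i;

    assert(procs != NULL || nprocs == 0);
    for (i = 0; i < nprocs; i++) {
        if (procs[i].sid != job_sid)
            continue;                       /* not part of this job */
        if (now < procs[i].start + grace_secs)
            continue;                       /* too young, ignore for now */
        total += procs[i].vsize;
    }
    return total;
}
```

With grace_secs set to 0 nothing is skipped, which makes it easy to compare the two totals and see how much of a spike comes from young processes alone.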