More CPUs doesn't equal more speed
On Mon, May 27, 2019 at 4:06 AM Grant Edwards <grant.b.edwards at gmail.com> wrote:
> On 2019-05-23, Chris Angelico <rosuav at gmail.com> wrote:
> > On Fri, May 24, 2019 at 5:37 AM Bob van der Poel <bob at mellowood.ca> wrote:
> >> I've got a short script that loops through a number of files and
> >> processes them one at a time. I had a bit of time today and figured
> >> I'd rewrite the script to process the files 4 at a time by using 4
> >> different instances of python. My basic loop is:
> >> for i in range(0, len(filelist), CPU_COUNT):
> >>     for z in range(i, i+CPU_COUNT):
> >>         doit(filelist[z])
> >> With the function doit() calling up the program to do the
> >> lifting. Setting CPU_COUNT to 1 or 5 (I have 6 cores) makes no
> >> difference in total speed. I'm processing about 1200 files and my
> >> total duration is around 2 minutes. No matter how many cores I use
> >> the total is within a 5 second range.
> > Where's the part of the code that actually runs them across multiple
> > CPUs? Also, are you spending your time waiting on the disk, the CPU,
> > IPC, or something else?
> He said he's using N different Python instances, and he even provided
> the code that runs in each instance which is obviously processing
> 1/Nth of the files.
> It's a pretty good bet that I/O is the limiting factor.
Sometimes, the "simple" and "obvious" code, the part that clearly has
no bugs in it, is the part that has the problem. :)
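
To illustrate: as quoted, the loop calls doit() on one file at a time.
If doit() waits for the external program to exit, changing CPU_COUNT
only changes how the files are grouped, not how many run at once. Below
is a minimal sketch of one way to overlap the work, assuming doit()
blocks until its program finishes. The doit() body is a hypothetical
stand-in (a short sleep), process_all() is a name made up for this
example, and the thread pool is a swapped-in technique, not something
taken from the original script.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CPU_COUNT = 4  # assumed number of simultaneous workers


def doit(filename):
    # Hypothetical stand-in for the real doit(), which launches an
    # external program. The sleep simulates waiting for it to finish.
    time.sleep(0.05)
    return filename.upper()


def process_all(filelist, workers=CPU_COUNT):
    # Up to `workers` doit() calls are in flight at the same time.
    # pool.map() returns results in the same order as filelist.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(doit, filelist))


if __name__ == "__main__":
    files = ["file%d.txt" % n for n in range(20)]
    for workers in (1, 4):
        start = time.perf_counter()
        process_all(files, workers)
        print(workers, "worker(s):", round(time.perf_counter() - start, 2), "s")
```

If the job really is limited by disk I/O, more workers may not help
much. Timing a run with several worker counts would show which case
applies.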