On 21/02/19 1:15 PM, george trojan wrote:
> def create_box(x_y):
>     return geometry.box(x_y[0] - 1, x_y[1], x_y[0], x_y[1] - 1)
>
> x_range = range(1, 1001)
> y_range = range(1, 801)
> x_y_range = list(itertools.product(x_range, y_range))
>
> grid = list(map(create_box, x_y_range))
>
> Which creates and populates an 800x1000 "grid" (represented as a flat list
> at this point) of "boxes", where a box is a shapely.geometry.box(). This
> takes about 10 seconds to run.
>
> Looking at this, I am thinking it would lend itself well to
> parallelization. Since the box at each "coordinate" is independent of all
> others, it seems I should be able to simply split the list up into chunks
> and process each chunk in parallel on a separate core. To that end, I
> created a multiprocessing pool:

I recall a similar discussion from when folk were being encouraged to move
away from monolithic, straight-line processing towards modular functions: it
is more (CPU-time) efficient to run in a straight line than it is to
repeatedly call, set up, execute, and return from a function or sub-routine!
i.e. there is an overhead to many/all constructs!

Isn't the 'problem' that it is a 'toy example'? That is, the amount of
computing within each parallel process is small in relation to the inherent
'overhead' of handing the work to another process and collecting the result.

Thus, if the code performed a reasonable analytical task within each box
after it had been defined (increased CPU load), would you then notice the
expected difference between the single- and multi-process implementations?

From AKL to AK
--
Regards =dn
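
PS: a rough sketch of the experiment suggested above, for anyone who wants to
try it. It is an illustration under stated assumptions, not a measurement:
cheap_task is a plain-tuple stand-in for create_box (so the sketch runs
without shapely), and heavy_task, its iterations parameter, the smaller
100x80 grid, the four worker processes and the chunksize are all made up for
the purpose. Timings will vary by machine, so none are quoted here.

```python
import time
from multiprocessing import Pool


def cheap_task(x_y):
    """Stand-in for create_box: almost no work per item (assumption: a
    plain tuple replaces shapely's geometry.box, same argument order)."""
    x, y = x_y
    return (x - 1, y, x, y - 1)


def heavy_task(x_y, iterations=1000):
    """Hypothetical 'analysis' per box: the same tuple plus a CPU-bound loop."""
    x, y = x_y
    total = 0
    for i in range(iterations):
        total += (x * i + y) % 7
    return (x - 1, y, x, y - 1, total)


def timed(label, func):
    """Run func(), print the elapsed wall-clock time, return its result."""
    start = time.perf_counter()
    result = func()
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result


def compare(task, points, processes=4):
    """Time task over points serially and through a Pool; check they agree."""
    name = task.__name__
    serial = timed(f"{name} serial", lambda: list(map(task, points)))
    with Pool(processes) as pool:
        parallel = timed(
            f"{name} pool  ",
            lambda: pool.map(task, points, chunksize=1000),
        )
    return serial == parallel


if __name__ == "__main__":
    # Smaller grid than the original 1000x800, to keep the run short.
    points = [(x, y) for x in range(1, 101) for y in range(1, 81)]
    compare(cheap_task, points)   # per-item work is tiny
    compare(heavy_task, points)   # per-item work is larger
```

If the reasoning above holds, the pool should show little or no gain for
cheap_task, and more of one for heavy_task as iterations grows.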

