Asked  7 Months ago    Answers:  5   Viewed   127 times

This is probably a trivial question, but how do I parallelize the following loop in python?

# setup output lists
output1 = list()
output2 = list()
output3 = list()

for j in range(0, 10):
    # calc individual parameter value
    parameter = j * offset
    # call the calculation
    out1, out2, out3 = calc_stuff(parameter = parameter)

    # put results into correct output list
    output1.append(out1)
    output2.append(out2)
    output3.append(out3)

I know how to start single threads in Python but I don't know how to "collect" the results.

Multiple processes would be fine too - whatever is easiest for this case. I'm using currently Linux but the code should run on Windows and Mac as-well.

What's the easiest way to parallelize this code?

 Answers

88

Using multiple threads on CPython won't give you better performance for pure-Python code due to the global interpreter lock (GIL). I suggest using the multiprocessing module instead:

pool = multiprocessing.Pool(4)
out1, out2, out3 = zip(*pool.map(calc_stuff, range(0, 10 * offset, offset)))

Note that this won't work in the interactive interpreter.

To avoid the usual FUD around the GIL: There wouldn't be any advantage to using threads for this example anyway. You want to use processes here, not threads, because they avoid a whole bunch of problems.

Tuesday, June 1, 2021
 
jedwards
answered 7 Months ago
73

Actually, using a base class would work out better here:

class InstancesList(object): 
    def __new__(cls, *args, **kw):
        if not hasattr(cls, 'instances'):
            cls.instances = []
        return super(InstancesList, cls).__new__(cls, *args, **kw)

    def __init__(self):
        self.index = len(type(self).instances)
        type(self).instances.append(self)

class Foo(InstancesList):
    def __init__(self, arg1, arg2):
        super(Foo, self).__init__()
        # Foo-specific initialization
Sunday, August 15, 2021
 
dannyhobby
answered 4 Months ago
69

Thanks to @jp2011 for pointing to pygmo

First, worth noting the difference from pygmo 1, since the fist link on google still directs to the older version.

Second, Multiprocessing island are available only for python 3.4+

Third, it works. The processes I started when I first asked the question are still running while I write, the pygmo archipelago running an extensive test of all the 18 possible DE variations present in saDE made in less than 3h. The compiled version using Numba as suggested here https://esa.github.io/pagmo2/docs/python/tutorials/coding_udp_simple.html will probably finish even earlier. Chapeau.

I personally find it a bit less intuitive than the scipy version, given the need to build a new class (vs a signle function in scipy) to define the problem but is probably just a personal preference. Also, the mutation/crossing over parameters are defined less clearly, for someone approaching DE for the first time might be a bit obscure.
But, since serial DE in scipy just isn't cutting it, welcome pygmo(2).

Additionally I found a couple other options claiming to parallelize DE. I didn't test them myself, but might be useful to someone stumbling on this question.

Platypus, focused on multiobjective evolutionary algorithms https://github.com/Project-Platypus/Platypus

Yabox
https://github.com/pablormier/yabox

from Yabox creator a detailed, yet IMHO crystal clear, explaination of DE https://pablormier.github.io/2017/09/05/a-tutorial-on-differential-evolution-with-python/

Wednesday, August 25, 2021
 
altermativ
answered 3 Months ago
45

Two options come to mind:

  1. Port your program into a windows service. You can probably share much of your code between the two implementations.

  2. Does your program really use any daemon functionality? If not, you rewrite it as a simple server that runs in the background, manages communications through sockets, and perform its tasks. It will probably consume more system resources than a daemon would, but it would be quote platform independent.

Sunday, September 5, 2021
 
Timur Mustafaev
answered 3 Months ago
85
from concurrent.futures import ProcessPoolExecutor, Future, wait
from itertools import combinations
from functools import partial

similarity_matrix = [[0]*word_count for _ in range(word_count)]

def callback(i, j, future):
    similarity_matrix[i][j] = future.result()
    similarity_matrix[j][i] = future.result()

with ProcessPoolExecutor(max_workers=4) as executer:
    fs = []
    for i, j in combinations(range(wordcount), 2):
        future = excuter.submit(
                    calculate_similarity, 
                    t_matrix[i], 
                    t_matrix[j])

        future.add_done_callback(partial(callback, i, j))
        fs.append(future)

    wait(fs)
Saturday, October 16, 2021
 
huhushow
answered 2 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :  
Share