Speedy numpy replacement for matlab accumarray

TLDR: check out this: numpy-groupies

I regularly have to translate some matlab code into python. Most functions there translate fairly well to numpy functions, but the accumarray receipe, that I used to use up to now, sucked quite hard performance wise. So I was looking for some more elegant solution. Unfortunately, there is not too much around, and I was already about to write something together, to ask it at stackoverflow, when I had the Idea for this little snippet:

def accum_np(accmap, a, func=np.sum):
    indices = np.where(np.ediff1d(accmap, to_begin=[1],
                                  to_end=[1]))[0]
    vals = np.zeros(len(indices) - 1)
    for i in xrange(len(indices) - 1):
        vals[i] = func(a[indices[i]:indices[i+1]])
    return vals

Careful: This quick hack only works with contiguous accmaps, like 222111333, but not 1212323. Every change from one number to another will be seen as a new value. This avoids the slow sorting.

Benchmarking shows, that it’s more than 18x faster than the previous solution:

accmap = np.repeat(np.arange(100000), 20)
a = np.random.randn(accmap.size)

timeit accum_py(accmap, a)
>>> 1 loops, best of 3: 16.7 s per loop

timeit accum_np(accmap, a)
>>> 1 loops, best of 3: 887 ms per loop

For completeness, here the timings with octave:

accmap = repmat(1:100000, 20, 1)(:);
a = randn([numel(accmap), 1]);
tic; accumarray(accmap, a); toc
>>> Elapsed time is 0.05152 seconds.

Which actually makes me think of using some bigger guns for the problem now.

So after some days of hacking around with scipy.weave now, I’m down to this:

timeit accum(accmap, a)
1 loops, best of 3: 27 ms per loop

This seems pretty reasonable now, when comparing it with octave.

The new implementation comes with fast implementations written in C for most common functions (sum, prod, min, max, mean, std, …), and falls back to a pure numpy solution for everything less common. It comes with a complete test suite, but if you should face some issues with it, please let me know!

Edit:

This blog post is quite outdated right now. In collaboration with @d1mansion, all this developed further into some nifty little python package called numpy-groupies available at PyPI and on Github. For more info on this topic, usage details and benchmarks, see the project page at Github.