Dell: How can we help you?

Dear Dell team,

I’m a business user and in principle a big fan of your XPS series, especially because of the beautiful small frame, the long battery life and the elegant design. I’d actually buy one, but I really don’t understand your model policy. The XPS was at least partly marketed as the perfect workhorse for business users. And if you have low hardware requirements, it may actually be one, but only in combination with the matte FHD screen. Let me briefly explain why:

Speedy numpy replacement for matlab accumarray

TL;DR: check out numpy-groupies

I regularly have to translate MATLAB code into Python. Most functions translate fairly well to numpy functions, but the accumarray recipe that I had been using up to now performed really badly. So I was looking for a more elegant solution. Unfortunately, there is not much around, and I was already about to put something together to ask about it on Stack Overflow when I had the idea for this little snippet:

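import numpy as np

def accum_np(accmap, a, func=np.sum):
    # Run boundaries: index 0, every index where the label changes, and len(accmap)
    indices = np.where(np.ediff1d(accmap, to_begin=[1], to_end=[1]))[0]
    vals = np.zeros(len(indices) - 1)
    for i in range(len(indices) - 1):
        # Apply func to each contiguous run of equal labels
        vals[i] = func(a[indices[i]:indices[i+1]])
    return vals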

Careful: This quick hack only works with contiguous accmaps, like 222111333, but not 1212323. Every change from one number to another will be seen as a new value. This avoids the slow sorting.
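As a side note, for the special case of summing, the same run-boundary idea can be sketched without the Python loop. This is only a minimal sketch, not the implementation benchmarked in this post: the function names accum_sum_contiguous and accum_sum_any_order are made up for illustration, and the second one assumes the labels are non-negative integers.

```python
import numpy as np

def accum_sum_contiguous(accmap, a):
    """Sum runs of `a` that share the same label in a contiguous accmap."""
    accmap = np.asarray(accmap)
    a = np.asarray(a)
    # Start index of every run: position 0 plus every position where the label changes
    starts = np.flatnonzero(np.ediff1d(accmap, to_begin=[1]))
    # reduceat sums a[starts[i]:starts[i+1]] for each run (the last run goes to the end)
    return np.add.reduceat(a, starts)

def accum_sum_any_order(accmap, a):
    """Sum by label for non-negative integer labels in any order (e.g. 1212323)."""
    # bincount returns one slot per label from 0 to max(accmap), unused labels get 0
    return np.bincount(np.asarray(accmap), weights=np.asarray(a))
```

For accmap 222111333 and a = 0..8, accum_sum_contiguous returns one sum per run in order of appearance (3, 12, 21), while accum_sum_any_order indexes its result by the label value itself.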

Benchmarking shows that accum_np is more than 18x faster than the previous solution:

accmap = np.repeat(np.arange(100000), 20)
a = np.random.randn(accmap.size)

timeit accum_py(accmap, a)
>>> 1 loops, best of 3: 16.7 s per loop

timeit accum_np(accmap, a)
>>> 1 loops, best of 3: 887 ms per loop

For completeness, here are the timings with Octave:

accmap = repmat(1:100000, 20, 1)(:);
a = randn([numel(accmap), 1]);
tic; accumarray(accmap, a); toc
>>> Elapsed time is 0.05152 seconds.

That actually makes me think about bringing out some bigger guns for the problem now.

So after some days of hacking around with scipy.weave now, I’m down to this:

timeit accum(accmap, a)
1 loops, best of 3: 27 ms per loop

This seems pretty reasonable now compared with Octave.

The new implementation comes with fast implementations written in C for the most common functions (sum, prod, min, max, mean, std, …), and falls back to a pure numpy solution for everything less common. It comes with a complete test suite, but if you run into any issues with it, please let me know!


This blog post is quite outdated by now. In collaboration with @d1mansion, all this developed further into a nifty little Python package called numpy-groupies, available on PyPI and on GitHub. For more info on this topic, usage details and benchmarks, see the project page on GitHub.

Wikivoyage parser with heuristics

While travelling through Vietnam, I was using Wikivoyage quite intensely as a travel guide, and so I started to contribute to some articles there myself. When editing an article, cleaning up and structuring long semi-formatted lists of hotels and restaurants was especially annoying, but given the semi-structured shape of the lists, it’s not straightforward to automate the formatting.

Being annoyed enough by the editing, I took it as a challenge and wrote a parser that uses a bunch of heuristic rules to classify the list entries, split them into chunks, apply formatting rules to the chunks, and merge them together again into a nicely formatted list entry. So an ugly unstructured listing like

* '''Birmingham Buddhist Centre''', 11 Park Rd, Moseley (''#1, #35 or #50 bus''), ''+44 121'' 449 5279 (''[]''), []. A centre run by the Friends of the Western Buddhist Order'' .

* '''Hotel Indah Manila''' 350 A J Villegas St. Tel: ''+63 2'' 5361188, 5362288. [] Rates start at ₱2000 for this modest 76-room hotel. Facilities include Café Indah and conference and function rooms. Airport and city transfers, tour assistance, and laundry service are available.

becomes nicely formatted into

* {{vCard| type=sight| subtype=religious| name=Birmingham Buddhist Centre| address=11 Park Rd, Moseley| directions=#1, #35 or #50 bus| phone=+44 121 449 5279|| url=| description=A centre run by the Friends of the Western Buddhist Order.}}

* {{vCard| type=hotel| subtype=hotel| name=Hotel Indah Manila| address=350 A J Villegas St| phone=+63 2 5361188, 5362288| url=| price=Rates start at ₱2000 for this modest 76-room hotel| description=Facilities include Café Indah and conference and function rooms. Airport and city transfers, tour assistance, and laundry service are available.}}
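To give a rough feeling for the classify, split, format and merge steps, here is a toy sketch with just three hard-coded heuristics. It is not the actual parser: the function names parse_listing and to_vcard, the regular expressions and the keyword list are all assumptions made up for this illustration, and it handles far fewer cases than the examples above.

```python
import re

def parse_listing(line):
    """Split one semi-structured listing line into named fields (toy heuristics)."""
    fields = {"type": "other", "name": "", "address": "", "phone": "", "description": ""}
    text = line.strip().lstrip("*").strip()

    # Heuristic 1: bold wiki markup at the start of the entry is the name
    m = re.match(r"'''(.+?)'''", text)
    if m:
        fields["name"] = m.group(1).strip()
        text = text[m.end():]
    text = text.replace("''", "")  # drop the remaining italic markup

    # Heuristic 2: classify the entry by keywords in its name (hypothetical keyword list)
    if re.search(r"\b(hotel|hostel|inn)\b", fields["name"], re.IGNORECASE):
        fields["type"] = "hotel"

    # Heuristic 3: a chunk starting with "+" and digits is the phone number;
    # whatever comes before it is the address, whatever follows is the description
    m = re.search(r"(?:Tel:\s*)?(\+\d[\d ,]*\d)", text)
    if m:
        fields["phone"] = m.group(1)
        fields["address"] = text[:m.start()].strip(" .,")
        fields["description"] = text[m.end():].strip(" .,")
    else:
        fields["description"] = text.strip(" .,")
    return fields

def to_vcard(fields):
    """Merge the non-empty fields back into one vCard template line."""
    body = "| ".join(f"{key}={value}" for key, value in fields.items() if value)
    return "* {{vCard| " + body + "}}"
```

Calling to_vcard(parse_listing(...)) on a shortened version of the hotel entry above yields a single vCard line with type, name, address, phone and description. The real difficulty lies in all the entries that do not follow such a tidy pattern, which is where the longer list of heuristics comes in.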

I wrote it as a library and gave it a web frontend that runs either via CGI or as a standalone version using bottle. After using Python intensely for several years, it’s actually the first time that I used it instead of PHP to display some web content, and I was a bit surprised how straightforward it was. So give it a try and let me know what you think about it! The source code is available on GitHub.

University internships

I recently stumbled upon my archive of university files and decided that some of them would be worth putting online, at least as a memory for myself, like my first schedule at university. There were some quite interesting projects that we did there, sometimes alone, sometimes in small groups. In one of the first terms I had an internship at the Hermann-Gutmann-Werke in Weißenburg. The main idea was to learn how to work with metal, especially steel and aluminium. I did a lot of drilling, milling, sawing and welding, and in the end I had built a bunch of nice thingies.

This experience came in quite handy for our next project. We had a competition within our semester to build a robot for a small, well-defined task, which we had to accomplish in groups of about ten students. In previous years the task had often been a route-finding problem, where robots had to find their way through a labyrinth. But as students had already started to reuse code from earlier years, we got something new. The problem we had to solve was to grab two pieces of metal from their defined start location and transfer them to a target area. The target area was a black ring printed on white ground, which could be moved around within the white area. So the black circle had to be detected and the metal piece had to be dropped exactly in its center. The criteria for the competition were the weight of the whole construction, the precision of finding the circle center and the total execution time.

When constructing it, it felt like half of the time was spent in endless and pointless discussions about the design, but in the end we achieved quite a good result; we actually even won the competition. The key to that was our design of the mechanism for detecting the circle center. The other groups tried to build a precise construction with a gantry robot. In principle they moved a photodiode once in the x and once in the y direction over the target area, found four spots on the ring, and calculated the center from that data. Then they moved the robot there and dropped the load.
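The calculation behind that scanning approach can be sketched in a few lines. This is only my reconstruction of the geometry, not any group's actual code: it assumes an ideal circle, one scan parallel to each axis, and exactly two detected crossings per scan. The function name circle_center_from_scans is made up for this example.

```python
def circle_center_from_scans(x_crossings, y_crossings):
    """Estimate the circle center from one scan along x and one scan along y.

    x_crossings: the two x positions where the scan in x direction hit the line
    y_crossings: the two y positions where the scan in y direction hit the line
    """
    x1, x2 = x_crossings
    y1, y2 = y_crossings
    # Each scan cuts a chord out of the circle, and the perpendicular bisector
    # of a chord passes through the center, so each midpoint gives one coordinate
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
```

For a circle of radius 5 around (3, 4), a scan along the line y = 1 crosses at x = -1 and x = 7, and a scan along x = 0 crosses at y = 0 and y = 8, so circle_center_from_scans((-1, 7), (0, 8)) gives (3.0, 4.0). Any error in one of the four detected positions goes straight into the result.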

The problem with that approach is that the uncertainties add up over the whole construction, and with only one photodiode as a sensor the edge detection is very poor. We avoided these issues by using the fact that the circle had a fixed diameter. So our robot arm looked like a fork with three tips. On the left and right tips, we had a whole line of photodiodes perpendicular to the circle line. The middle tip was holding the metal piece. So the arm moved into the target area, always following the black lines. As soon as the left and right tips were over the middle of the black line of the circle, no further movement was required and the piece just had to be dropped. It was one of the lightest constructions in the field and still the most precise.

An internship that had quite some impact on my later university life was PEMSY, an internship for programming embedded devices. You get an Atmel ATmega32 microcontroller and put it on a board, which is empty at first. You start by putting some LEDs there and switching them on and off, add a switch to trigger some action, then add an LCD, a PS/2 keyboard and an RS232 serial connection. Finally we were talking over the RS232 connection to an old mobile phone and could trigger dialing and sending messages and maintain a phonebook with numbers, controlling everything with the keyboard and the LCD. Two very intense weeks, in which I probably learned more practical knowledge to apply later than in a whole year of university before. As I really liked the internship, I ended up becoming its tutor as a part-time job and stayed with it nearly to the end of my studies.

While studying I was doing side jobs most of the time, mainly at Siemens, which has the headquarters of its medical branch in Erlangen. So as we had to do some more mandatory internships, it was a good choice for me to just stay at Siemens and explore some other departments, which resulted in this first and second report.

I think it was in the fifth term that we had a seminar about picking a technical topic and creating a one-hour presentation about it. Everybody put quite some effort into that, which for me resulted in these picture-rich slides about flow cytometry.

A bit later I went to China, to Tongji University in Shanghai, for half a year and became a member of the local RoboCup team. Working with Sony’s Aibos, I learned a lot about object recognition and wrote a tool in Qt to visualize the individual recognition steps on the fly, make manual adjustments and calibrate the recognition. Quite messy programming, as I had to work with parts of the robot’s code where the class structure looked more like a bunch of connected neurons than like a hierarchy!

Anyway, respect for the team! Unlike many other teams there, they had written their code from scratch totally on their own, and had not just reused the open-sourced software of the GermanTeam. In the end we came 2nd in the Chinese national competition. My impressions of this half year are collected in my Shanghai Blog. Here are the slides of my final presentation about it.

After all this experience with embedded systems, doing my diploma thesis in the same area was the obvious choice. I built a real-time measuring system for laser cutting and welding processes, which detects the distance between the laser source and the workpiece in order to adjust this distance for optimal welding results. I always hated writing up reports or papers. So doing the actual experiments and the implementation was fine, but writing it all up was quite a torture, though I’m still quite happy with the final result and this presentation. At that time I was even thinking about starting to work for one of the German laser manufacturers; nevertheless, my first and only try with an engineering job afterwards was in the car industry. But that’s another story.