While working on the rendering part of Plié–my terminal user interface library–I decided test the time it took for various implementations to run. Normally I don’t worry about timing things in Python, as that seems to be the general advice, but as this would be the core of the library and called many times, it seemed wise to at least look a little into how different implementations compare.
The rendering engine converts a dictionary of (x,y) coordinates representing screen space cells, into a long string (with terminal signals to change lines) that can be printed out to the terminal. That way for compositing and layout I can use screen space coordinates, but still use Blessed for handling fullscreen and terminal signals.
This is a stripped down version of the code I used, with the three different implementation styles.
- not_join() uses the += operator to concatenate strings, and does this for each cell in the dictionary. This method involves a string copy for each cell.
- using_join_poorly() uses the .join() operator to concatenate an iterable–in this case a line/row–together, and then adds that line to the output with the += operator. This method involves a string copy for each line.
- join_all_at_once() creates a list, and uses primarily list operations, until the very end, where it does one .join() to make the list into an output string. This method creates the output string without using any string copies.
For timing I used timeit.timeit() from the standard library with the number of runs set to ten thousand, then recorded results from each run-through to a CSV and ran the total test 34 times giving me the data for the line graph above. Each test uses an empty dictionary to try to minimalize performance differences from anything dictionary key retrieval related. Theres a lot of noise, and the slowest join_all_at_once() runtimes are slower than the fastest not_join() runtimes. So the performance gains are fairly minor.
Calculating the medians on all the runtimes for each type gives me (note these numbers are essentially unit-less):
- not_join() median runtime: 4.510
- using_join_poorly() median runtime: 4.204
- join_all_at_once() median runtime: 4.070
Which means join_all_at_once() is about 10% faster than not_join(), and only about 3% faster than using_join_poorly().
There may be ways to further optimize this, but I’m not sure if it is worth it. join_all_at_once() already sacrifices some readability for speed, which is a quality I don’t really like in my Python. So far the differences are so minor, and the time seems so much more depedent on other factors, that it does not seem worth pursuing additional optimization at this point.
In this process though, I learned a great deal about processing data with Pandas and rendering with matplotlib and Seaborn . Things I’ve done a little of before, but it was an enjoyable diversion to work on trying to get various kinds of graphs to display the data.