At Spotify, we have deployed Python software in Debian packages for a fairly long time. To build them, our continuous integration platform uses sbuild to build and upload the packages automatically from each commit. For deploying software, this system works fairly well when augmented with Puppet. There are, however, some drawbacks on the developer side.
The first drawback is the state of the Debian Python packages. Given that we run Debian stable, the packages are often outdated and missing features, even at the time the stable version is released. This leads to the need to backport newer packages into our internal Debian repositories.
The second, bigger problem is that when you hire a proficient Python developer, you can safely assume the developer knows how to use Python packaging. However, that won’t fly with our system. In order to push out the software the dev has been working on, she needs to know how to package Python inside Debian packages in the first place. And Debian packaging has a somewhat steep learning curve.
How does the rest of the world do it?
If you look at practically any existing Python project in the outside world, they basically follow the same pattern: setup.py or requirements.txt defines all the installation requirements, and all of them are available for installation from the Python Package Index, PyPI. All of this is then installed into a dedicated virtualenv, which makes sure system libraries won’t affect your code.
This is also what developers are accustomed to, and the slap in the face is pretty harsh when they see our systems for the first time.
Combining the two views: dh-virtualenv
Debian packaging has one significant advantage over plain Python packaging: it has the ability to define dependencies on system libraries. Using it, you can say that installing lxml requires libxml on the target machine. So, in order to have the cake and eat it too, we decided to combine the two great packaging systems, and dh-virtualenv was born!
The inspiration behind dh-virtualenv lies in the awesome blog post (and a series of conference talks) by Hynek Schlawack. While Hynek’s solution uses fpm for packaging, we decided that it would be great to be able to keep using our current sbuild build infrastructure.
How does dh-virtualenv work?
dh-virtualenv works by registering itself in the debhelper build sequence, in much the same way as the current Debian Python packaging (dh_python2) does. This way, using dh-virtualenv for your package is as easy as build-depending on dh-virtualenv and writing a debian/rules file containing:
%:
	dh $@ --with python-virtualenv
This will bundle all the requirements of your software (defined in a requirements.txt) into a virtualenv, do some shebang manipulation and instruct the Debian package to drop the virtualenv into some suitable location on the target machine. By default this is /usr/share/python/<package-name>, but it can of course be customized.
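To make the shebang manipulation concrete: scripts installed into a virtualenv start with a line pointing at the interpreter inside that virtualenv, so if the environment is built in one directory and installed to another, that line has to be rewritten. The following is only a rough Python sketch of that idea, not the actual dh-virtualenv code. The function name `rewrite_shebang` and the build path are made up for illustration.

```python
import os
import tempfile


def rewrite_shebang(script_path, build_env, final_env):
    """Point a script's shebang at final_env instead of build_env.

    Hypothetical helper for illustration only. Returns True if the
    file was changed, False if there was nothing to rewrite.
    """
    with open(script_path, "rb") as f:
        lines = f.readlines()

    # Only the first line can be a shebang.
    if not lines or not lines[0].startswith(b"#!"):
        return False

    old = build_env.encode()
    new = final_env.encode()
    if old not in lines[0]:
        return False

    lines[0] = lines[0].replace(old, new, 1)
    with open(script_path, "wb") as f:
        f.writelines(lines)
    return True


if __name__ == "__main__":
    # Assumed paths: the build location is invented, the final location
    # follows the /usr/share/python/<package-name> default.
    build_env = "/tmp/build/debian/mypkg/usr/share/python/mypkg"
    final_env = "/usr/share/python/mypkg"

    script = os.path.join(tempfile.mkdtemp(), "mytool")
    with open(script, "wb") as f:
        f.write(("#!" + build_env + "/bin/python\n").encode())
        f.write(b"print('hello')\n")

    rewrite_shebang(script, build_env, final_env)
    with open(script, "rb") as f:
        print(f.readline().decode().strip())
```

Only the first line is touched, and files without a matching shebang are left alone, so running the helper over every file in the environment's bin directory would be safe.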
Simplifying deployment with dh-virtualenv: Sentry
At Spotify, we have recently started testing the Sentry (http://getsentry.com) event logging platform for integrating our system logs into our development workflow. In Python, Java, and PHP services, we can add a few lines of code and all of our log messages (and any uncaught errors that we might have missed) are sent to the Sentry service, where they are aggregated. Because the messages are sent from within the programming language, rather than scraped from syslog output, Sentry is able to intelligently collapse messages that have the same formatting string, making it possible for us to better track unexpected behaviour or incidents.
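The point about formatting strings is easier to see in code. The sketch below uses only Python's standard logging module and is not how Sentry itself is implemented. The `GroupingHandler` class and the log message are invented for illustration. It shows that a handler running inside the program can see the unformatted template (`record.msg`) as well as the final text, which is information a syslog scraper never gets.

```python
import logging
from collections import Counter


class GroupingHandler(logging.Handler):
    """Count log records by message template and by final text.

    Illustrative only: a toy stand-in for what an in-process
    error aggregator can do.
    """

    def __init__(self):
        super().__init__()
        self.by_template = Counter()
        self.by_text = Counter()

    def emit(self, record):
        # record.msg is the format string as written in the source code;
        # record.getMessage() is the text after the arguments are merged in.
        self.by_template[record.msg] += 1
        self.by_text[record.getMessage()] += 1


def demo():
    logger = logging.getLogger("grouping_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo quiet on stderr

    handler = GroupingHandler()
    logger.addHandler(handler)
    for user_id in (1, 2, 3):
        logger.error("Failed to load playlist for user %s", user_id)
    logger.removeHandler(handler)
    return handler


if __name__ == "__main__":
    handler = demo()
    print(len(handler.by_template), "template,", len(handler.by_text), "distinct texts")
```

Grouped by template, the three errors collapse into a single entry with a count of three. Grouped by rendered text, as a log scraper would have to do, they look like three unrelated messages.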
sentry[postgres]==6.2.0
eventlet==0.13.0
hiredis==0.1.1
django-auth-ldap==1.1.4
The debian/control file lets us add any of the native dependencies that are needed for building/testing/executing the Sentry server. We just use this to make sure the dh-virtualenv package is available for the build, along with the other required packages.
Build-Depends: python (>= 2.6.6-3~), debhelper (>= 8), dh-virtualenv, python-dev, libpq-dev, libldap2-dev, libsasl2-dev
Standards-Version: 3.9.3
X-Python-Version: >= 2.6
%:
	dh $@ --with python-virtualenv

override_dh_virtualenv:
	dh_virtualenv --index-url='http://localhost/simple'
Sentry is a great piece of software, but it can be a bit difficult to deploy without using pip as the deploy mechanism. Using dh-virtualenv, we’re able to build a simple, focused Debian package that ensures everything is available in a nicely packaged and sandboxed way.