Debian archive rebuilds on Amazon Web Services

I like to think that archive rebuilds play an important role in Debian Quality Assurance and Release Management efforts. By trying to rebuild every Debian package from source, one can identify packages that do not build anymore due to changes in other packages (compilers, interpreters, libraries, …). It is also a good way to stress-test all packages that are involved in building other packages.

Since 2007, I had been running Debian archive rebuilds on the Grid’5000 testbed, a research infrastructure for performing experiments on distributed systems – HPC/Grid/Cloud/P2P. I filed more than 6000 release-critical bugs in the process.

Late last year, Amazon kindly offered us a grant to run such QA tests on Amazon Web Services. Together with Sébastien Badia, I ported the rebuild infrastructure to AWS (scripts), and several rebuilds have already been carried out there.

On the technical level, 50 to 100 EC2 spot instances are started, then controlled from a master instance using SSH. On build instances, a classic sbuild setup is used. Logs are retrieved to the master node after rebuilds, and build instances are simply shut down when there are no more tasks to process. Several tasks are processed simultaneously on each instance; when a task fails, it is retried with no other concurrent build on the same instance, to eliminate random failures caused by load or timing issues. All the scripts are designed to support other kinds of QA tests, not just rebuilds.
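
To make the retry strategy concrete, here is a minimal Python sketch of the idea, not the actual scripts: a first pass runs several builds at once, and anything that fails is retried alone afterwards. The names `rebuild` and `make_fake_build` are hypothetical, and the build function is a stand-in for what would really be an sbuild run on a build instance.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def rebuild(packages: Iterable[str],
            build: Callable[[str], bool],
            concurrency: int = 4) -> Dict[str, str]:
    """Build packages concurrently, then retry the failures one at a time."""
    packages = list(packages)
    results: Dict[str, str] = {}

    # First pass: several builds run simultaneously on the instance.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = list(pool.map(build, packages))

    failed = []
    for pkg, ok in zip(packages, outcomes):
        if ok:
            results[pkg] = "ok"
        else:
            failed.append(pkg)

    # Second pass: retry each failure with no other build running,
    # to rule out failures caused by load or timing.
    for pkg in failed:
        results[pkg] = "ok-on-retry" if build(pkg) else "failed"
    return results


def make_fake_build() -> Callable[[str], bool]:
    """Hypothetical stand-in for a real build (assumption, for the demo only)."""
    attempts: Dict[str, int] = {}
    lock = threading.Lock()

    def fake_build(pkg: str) -> bool:
        with lock:
            attempts[pkg] = attempts.get(pkg, 0) + 1
            attempt = attempts[pkg]
        if pkg == "broken":
            return False          # genuinely fails to build
        if pkg == "flaky":
            return attempt > 1    # fails only on its first attempt
        return True

    return fake_build


if __name__ == "__main__":
    print(rebuild(["hello", "flaky", "broken"], make_fake_build(), concurrency=2))
```

Only packages that fail both passes would be worth looking at as real build failures; the ones that succeed on retry point at load or timing sensitivity instead.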

Moving to Amazon Web Services will facilitate sharing the human workload of doing those tests. It is now possible for developers interested in custom tests to do them themselves (hint hint).

8 thoughts on “Debian archive rebuilds on Amazon Web Services”

  1. Can you provide some EC2 statistics? What type of instances do you start, and how long does a rebuild take? That would let one estimate roughly how much it costs to run.

    Personally, I would be interested in using this with juju to spin up the rebuild cluster on EC2 and/or on OpenStack on my own servers, and in limiting the rebuilds to a particular subset of packages.

    PS: I bet Ubuntu is interested in this.

  2. Now I understand how spot instancing is useful and it is really cool.

    If I understand the readme, it would cost something like $33/hr on EC2 for the 50 small + 50 medium nodes, and I should think they run for many hours. But I guess Amazon probably sponsors the Debian archive QA rebuilds?

    For the rest of us, maybe OpenStack, server auctions and/or spare machines we have lying around could be put to good use for testing new build-essential stuff or doing other architecture-specific rebuilds to help with porting.

  3. @Riku:
    The full rebuild takes as long as the package that takes the most time to build (usually libreoffice). All other packages can be built on the other nodes during the same timespan if you use enough instances. That means 8-10h.

    @Dmitrijs:
    I used both m1.medium and m2.xlarge: m2.xlarge for the “large” packages (libreoffice, linux-2.6, gcc-*, etc.) and m1.medium for everything else.
    Regarding cost, it’s hard to say. Probably less than $100.

    @Steven:
    Yes, Amazon sponsored us.
    Note that a trick to reduce costs is to request spot instances at the price of normal instances, instead of requesting normal instances. Most of the time, they will cost much less than that. But if something really bad happens and the price of spot instances rises above the price of normal instances, your instances get terminated.

  4. Hi Lucas, I just uploaded euca2ools 2.0.2 to Unstable. It is a Free implementation of the EC2 API that can replace the ec2-api-tools with a very similar, if not identical, syntax. Perhaps you can give it a try on your next run?
