I like to think that archive rebuilds play an important role in Debian Quality Assurance and Release Management efforts. By trying to rebuild every Debian package from source, one can identify packages that do not build anymore due to changes in other packages (compilers, interpreters, libraries, …). It is also a good way to stress-test all packages that are involved in building other packages.
Since 2007, I had been running Debian archive rebuilds on the Grid’5000 testbed, a research infrastructure for performing experiments on distributed systems – HPC/Grid/Cloud/P2P. I filed more than 6000 release-critical bugs in the process.
Late last year, Amazon kindly offered us a grant to allow us to run such QA tests on Amazon Web Services. With Sébastien Badia, we ported the rebuild infrastructure to AWS (scripts), and several rebuilds have already been carried out on AWS.
On the technical level, 50 to 100 EC2 spot instances are started, and then controlled from a master instance using SSH. On build instances, a classic sbuild setup is used. Logs are retrieved to the master node after rebuilds, and build instances are simply shut down when there are no more tasks to process. Several tasks are processed simultaneously on each instance, and when they fail, they are retried again with no other concurrent build on the same instance, to eliminate random failures caused by load or timing issues. All the scripts are designed to support other kind of QA tests, not just rebuilds.
Moving to Amazon Web Services will facilitate sharing the human workload of doing those tests. It is now possible for developers interested in custom tests to do them themselves (hint hint).