La France, terre d’accueil

Un graphique intéressant, trouvé sur le blog de Jean-Marc Manach (le journaliste qui a interviewé Samuel Hocevar aux RMLL pour Le Monde):

Je sais pas vous, mais je trouve ça assez triste, moi, tous ces gens prêts à énormément de sacrifices pour venir dans notre pays, qu’on rejette comme de la merde.

A propos, avez-vous lu le rapport de la CIMADE sur la régularisation des familles étrangères d’enfants scolarisés ? Après l’avoir lu, on se rend mieux compte de la chance qu’on a d’avoir le droit d’être en France…

J’ai une idée qui traine depuis très longtemps. Il s’agirait de créer un site mettant en relation des informaticiens et des associations, afin de permettre aux associations de trouver facilement des informaticiens prêts à donner un coup de main bénévolement (pour réaliser des sites web, administrer quelques machines, etc). Mais je ne sais pas s’il y a toujours un besoin: le besoin existait probablement il y a 3 ou 4 ans, mais maintenant, ça doit être beaucoup plus facile de trouver quelqu’un.

Collaborative Maintenance

Stefano Zacchiroli blogs his thoughts about collaborative maintenance. He identifies two arguments/religions about it:

  1. it is good because duties are shared among co-maintainers
  2. it is bad because no one feels responsible for the co-maintained packages

And he says he stand for the first one.

I’m not sure that those two positions really contradict.

Sure, collaborative maintenance is a good thing, because it allows to share the load between co-maintainers, which often results in bugs being fixed faster. But collaborative maintenance creates the problem of the dilution of responsabilities, and the dilution of knowledge. In many cases, a single maintainer will have a better knowledge of a package than the sum of the knowledge of all co-maintainers. Also, sometimes, teams don’t work very well, and everybody start thinking that a specific bug is another co-maintainer’s problem.

In the pkg-ruby-extras team, we have been trying different organizations over the past two years. We have now settled with the following:

  • each package has a “main responsible person”, listed in the Maintainer field
  • the team address is listed in the Uploaders field, as well as all the team members that are willing to help with that package (people add themselves to Uploaders manually. Another variant is pkg-kde and pkg-gnome’s automatic generation of the Uploaders field based on the last X entries of debian/changelog (pkg-kde variant, pkg-gnome variant).)
    Interestingly, we discovered that for several packages, nobody was really willing to help, so I’m wondering how other teams with (nb of packages) >> (nb of active team members) work.

Stefano also raises the point of infrastructure issues caused by the switch to the team model. I ran into one recently. My “Automated mails to maintainers of packages with serious problems” are currently only sent to the maintainers of the packages with problems, unless someone has problems in both maintained and co-maintained packages (in that case, all problems are mentioned). I thought for a while about sending mails to co-maintainers as well, but that would mean sending more mails… I might try that and see if I get flamed :-)

Easy migration of a service to another, totally different host with iptables

I’m tired of googling for this every time I need it, so I’m blogging about it.

Q: How can one redirect all connections to hostA:portA to hostB:portB, where hostA and hostB and in totally different parts of the Internet?

echo 1 > /proc/sys/net/ipv4/ip_forward
$IPT -t nat -A PREROUTING -p tcp --dport portA -j DNAT --to hostB:portB
$IPT -A FORWARD -i eth0 -o eth0 -d hostB -p tcp --dport portB -j ACCEPT
$IPT -A FORWARD -i eth0 -o eth0 -s hostB -p tcp --sport portB -j ACCEPT
$IPT -t nat -A POSTROUTING -p tcp -d hostB --dport portB -j SNAT --to-source hostA

Connections are masqueraded, that means that, for hostB, all connections are coming from hostA. So be careful.

How do archive rebuilds and piuparts tests on Grid’5000 work?

With the development of rebuildd and the fact that several people are interested in re-using my scripts, I feel the need to explain how this stuff works.


First, Grid’5000 is a research platform used to study computer grids. It’s not really a grid (it doesn’t use all the classic grid middleware such as Globus). Grid’5000 is composed of 9 sites, each hosting from 1 to 3 clusters. Inside clusters, nodes are usually connected using gigabit ethernet, and sometimes another high speed network (Myrinet, Infiniband, etc). Clusters are connected using a dedicated 1OG ethernet network. Grid’5000 is in a big private network (you access it through special gateways), and one can access each node from any other node directly (no need for complex tunnelling).

Using Grid’5000 nodes

When you want to use some nodes on Grid’5000, you have to use a resource manager to say “I’d like to use 50 nodes for 10 hours”. Then your job starts. At that point, you can use a tool called KaDeploy to install your own system on all the nodes (think of it as “FAI for large clusters”). When KaDeploy finishes, the nodes are rebooted in your environment, and you can connect as root. At that point, you can basically break the nodes the way you want, since they will be restored at the end of your job.

Running Debian QA tasks on Grid’5000

None of that was Debian-specific. I will now try to explain how QA tasks are run on Grid’5000. The scripts mentioned below are in the debcluster directory of the collab-qa SVN repository.

When the nodes are ready, the first node is chosen to play a special role (it’s called the master node from now on). A script is run on the master node to prepare it. This consists in mounting a shared NFS directory, and run another script located on this shared NFS directory to install a few packages, configure some stuff, and start a script (masternode.rb) that will schedule the tasks on all the other nodes.

masternode.rb is also responsible for preparing the other nodes. Which consists in mounting the same shared NFS directory, and executing a script (preparenode.rb) that installs a few packages and configures some stuff. After the nodes have been prepared, they are ready to execute tasks.

To execute a task, masternode.rb connects to the node using ssh and executes a script in the shared directory. Those scripts are basically wrappers around lower-level tools. Examples are buildpackage.rb, and piuparts.rb.

Now, some specific details:

  • masternode.rb schedules tasks, not builds. Tasks are commands. So it is possible, in a single Grid’5000 job, to mix a piuparts test on Ubuntu and an archive rebuild on Debian. When another QA task is created, I just have to write another wrapper.
  • Tasks are scheduled using “longest job first”. This doesn’t matter with piuparts tests (which are usually quite short) but is important for archive rebuilds: some packages take a very long time to build. If I want to rebuild all packages in about 10 hours, has to be the first build to start, since building takes about 10 hours itself… So one node will only build, and the other nodes will build the other packages.
  • I use sbuild to build packages, not pbuilder. pbuilder’s algorithm to resolve build-dependencies is a bit broken (#141888, #215065). sbuild’s is broken as well (#395271, #422879, #272955, #403246), but at least it’s broken in the same way as the buildds’, so something that doesn’t build on sbuild won’t build on the buildds, and you can file bugs.
  • I use schroot with “file” chroots. The tarballs are stored on the NFS directory. Which looks ineffective, but actually works very well and is very flexible. A tarball of a build environment is not that big, and this allows for a lot of flexibility and garantees that my build environment is always clean. If I want to build with a different dpkg-dev, I just have to:
    • cp sid32.tgz sid32-new.tgz
    • add the chroot to schroot.conf
    • tell buildpackage.rb to use sid32-new instead of sid32
  • Logs and (if needed) resulting packages are written to the NFS directory.

Comments and questions welcomed :)