Debian is changing

I’ve been playing with snapshot.d.o, which is a fantastic resource to look at how Debian has been changing over the years.
Below are some preliminary results about various aspects of source packages: format, packaging helpers, and patch systems.

For clarification:

  • “modif-files” indicates packages that do not use a patch system, but modify files outside debian/.
  • “no-modif-files” indicates packages that do not use a patch system, and do not modify files outside debian/.
  • “other” is “everything else”, such as packages using dpatch but still modifying some files outside debian/, or using both dpatch and quilt (yes, some packages do that).
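
The labels above amount to a small decision rule. Here is a minimal sketch of it in Python. It assumes each package is summarized by the set of patch systems it uses and a flag saying whether it modifies files outside debian/. Both inputs and the function name are hypothetical, and this is not necessarily how the actual analysis was implemented.

```python
def classify(patch_systems, modifies_outside_debian):
    """Return a category label for one source package (illustrative only).

    patch_systems: iterable of patch system names, e.g. {"quilt"} (assumed input)
    modifies_outside_debian: True if files outside debian/ are changed directly
    """
    systems = set(patch_systems)
    if not systems:
        # No patch system at all: the label only depends on direct modifications.
        return "modif-files" if modifies_outside_debian else "no-modif-files"
    if len(systems) == 1 and not modifies_outside_debian:
        # Exactly one patch system, used cleanly: label by its name.
        return next(iter(systems))
    # Several patch systems at once, or a patch system plus direct modifications.
    return "other"
```

For instance, a package using both dpatch and quilt, or using dpatch while also editing upstream files directly, ends up in “other”.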

Random comments:

  • People have been moving away from simple-patchsys and dpatch to quilt since 01/2009 (their usage has been decreasing since then). It seems that discussing the introduction of the new package formats was enough to make people move away from dpatch, even if the 3.0 (quilt) format was not usable yet. We now have fewer than 1000 packages still using dpatch or simple-patchsys. Can we get rid of them all for wheezy? It would be nice to standardize on quilt.
  • The growth of the 3.0 (quilt) format is impressive. Its adoption is much faster than the adoption of dh7. That surprises me a lot, since I had the impression that more people were unhappy about 3.0 (quilt) than about dh7.
  • The number of packages with no patch system and no modified files outside debian/ is dropping. Either that means that we are patching upstream more and more (unlikely), or it means that people are switching to 3.0 (quilt) even if they have no patches, probably due to the other benefits! Could this mean that we could standardize on 3.0 (quilt) and 3.0 (native) for wheezy or wheezy+1?
  • The number of packages that modify files outside debian/ without using a patch system has been constantly dropping since 2005.
  • dh7 is not a cdbs killer. While it seems to be the cause for a huge drop in “classic debhelper” packaging, the number of packages using cdbs is still progressing.

If you have other ideas of things to investigate using snapshot.d.o, don’t hesitate to ask in comments.

Debian Spring Cleanup

We have many orphaned packages in Debian. Those are packages that no longer have a maintainer willing to care for them. They are a problem for several reasons. Often, there are better alternatives (that’s often why the maintainer gave up maintenance), but users keep installing them because they show up in apt-cache search. Or, they might be broken, without anybody noticing because they only have a few users. Also, developers need to take them into account. They might slow down transitions, have RC bugs that need fixing, etc.

So, there are good reasons to remove some of the orphaned packages from the Debian archive. Not all of them, but those that have been orphaned for a long time without anybody stepping up to take over the maintenance, that have very few users according to popcon, etc. Removing some of them also makes it easier to expose the ones that should really find an adopter.

Also, that removal is not final. The good reason to do this now is that we are at the beginning of a release cycle, and that it will be easy to reintroduce packages that are really needed during the next two years. Also, packages (both source and binaries) are still available on snapshot.debian.org, so users can still install them from there, and re-uploading them to Debian is very easy if someone wants to adopt them.

So, how do I proceed? I use bapase, and go through specific sets of packages (using a custom version of bapase to restrict the list to those packages). For example, orphaned packages that have a popcon lower than 50, have been orphaned for more than 500 days, and whose orphaning bugs have not been modified in the last 100 days. Then, I read the bug log, and decide if there’s a reason to keep the package. If there’s not, I request the removal.
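
The example selection above can be written as a simple filter. This is only a sketch in Python, not bapase’s actual code. The OrphanedPackage record, its field names, and the sample package names are all made up for illustration. Only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class OrphanedPackage:
    # Hypothetical record; bapase's real data model may look nothing like this.
    name: str
    popcon: int                   # number of popcon installations
    days_orphaned: int            # days since the package was orphaned
    days_since_bug_activity: int  # days since the orphaning bug was last modified


def removal_candidates(packages, max_popcon=50, min_days_orphaned=500,
                       min_days_quiet=100):
    """Keep packages that match all three criteria from the text."""
    return [
        p for p in packages
        if p.popcon < max_popcon
        and p.days_orphaned > min_days_orphaned
        and p.days_since_bug_activity > min_days_quiet
    ]


# Made-up sample data, only here to show how the filter is called.
sample = [
    OrphanedPackage("foo-old", 12, 800, 300),
    OrphanedPackage("bar-popular", 4000, 900, 400),
    OrphanedPackage("baz-recent", 5, 120, 110),
    OrphanedPackage("qux-active", 20, 700, 10),
]

for pkg in removal_candidates(sample):
    print(pkg.name)
```

The filter only produces a list of candidates. Reading the bug log and deciding whether to request removal remains a manual step.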

As of today, there are 471 orphaned packages in Debian. The goal is not to get down to zero, but it’s certainly possible to get down to 200 or 250.

How can you help? Start from the list, and find some easy targets. You can take a look at bug #483252 if you want to re-use my template. Also, one thing that is currently missing in the process is that orphaned packages that have reverse-dependencies should not be removed without also removing their reverse-dependencies (if appropriate). An easy and efficient way to know if a given package has reverse (build-)?dependencies would be great, so I could integrate this info into bapase and exclude those packages from the list, or mark them somehow.
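
To make the reverse-dependency check concrete, here is a rough sketch in Python. It assumes the dependency data is already available as a plain mapping from each package to the packages it depends on or build-depends on. How to get that mapping efficiently is precisely the open question, and it is not addressed here. All function and package names are hypothetical.

```python
from collections import defaultdict


def reverse_dependencies(depends):
    """Invert a {package: [dependencies]} mapping into {package: {dependents}}."""
    rdeps = defaultdict(set)
    for pkg, deps in depends.items():
        for dep in deps:
            rdeps[dep].add(pkg)
    return rdeps


def split_by_reverse_deps(orphaned, depends):
    """Split orphaned packages into those without and with reverse dependencies.

    Returns (free, blocked): `free` is a sorted list of names, and `blocked`
    maps each remaining package to the sorted list of packages that need it.
    """
    rdeps = reverse_dependencies(depends)
    free, blocked = [], {}
    for pkg in sorted(orphaned):
        dependents = rdeps.get(pkg, set())
        if dependents:
            blocked[pkg] = sorted(dependents)
        else:
            free.append(pkg)
    return free, blocked


# Made-up example: "libfoo" is still needed by "bar", "baz" is needed by nobody.
depends = {"bar": ["libfoo", "libc6"], "baz": ["libc6"]}
free, blocked = split_by_reverse_deps(["libfoo", "baz"], depends)
print(free, blocked)
```

Packages in the second group could then be excluded from the list, or marked with the names of the packages that still need them.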

Introducing the Debian packaging tutorial

One of the common complaints about Debian packaging is that it’s hard to learn because, while there is quite a lot of high-quality documentation, it is often written more as a reference manual than as a tutorial: it’s great if you already know everything and want to check some detail, but not so great if you want to learn everything from scratch.

I have been volunteered (i.e., someone decided I volunteered) for a “Debian packaging” tutorial at work, so I decided to try tackling this issue. I also volunteered (voluntarily this time) for a similar talk at RMLL 2011 to make sure I would be forced to do the work and prepare the actual tutorial. I’m also considering teaching this next year in Licence Pro ASRALL, but I haven’t made up my mind about it yet.

The result is a work in progress (hey, I still have a lot of time), but in the release-early-release-often tradition, I’m making it public now in the hope that someone will pick up the idea and do all the work for me (you never know).

I’ve decided to create a set of slides using LaTeX Beamer. The current version can be found here. The sources are available in a git repository, and all contributions are welcome (including plain comments or suggestions). The last slide is the current TODO list.

Re: “please send a patch”

It seems that Matthew Palmer misread my blog post as a complaint against developers asking for patches in exchange for pet feature requests. He really should pay more attention, since I gave “pet feature requests” as an example of a case where it would be appropriate to ask for a patch:

Of course, there are cases where it’s perfectly reasonable to ask for a patch: when the task is expected to take hours, or when the result is of limited interest to everybody except the demander.

But even then, it’s not clear. This morning I got an email from someone involved in PHP package maintenance, who said that Bugs Search @ UDD was a great tool, but that he would love to have a way to list all bugs affecting packages with the implemented-in::php debtag.
To produce a working patch for this would probably take him at least an hour. You need to set up a copy of the CGI on alioth, understand the DB structure, dig into the code, etc. If you don’t understand SQL and Ruby, it could be a really difficult process. Also, it’s probably quite uninteresting for him to do that, since he is unlikely to stick around developing UDD.
Instead, it didn’t take me more than 5 minutes to produce a one-liner.
The net result for Debian in that case? 55 minutes saved by a developer.

Update:
Torsten Werner wrote an angry reply to my post. It’s true that yesterday’s episode triggered my blog post, because I felt quite frustrated to have to provide a patch for something that simple, and would have preferred to use the time for a Debian task where I would be more efficient. But I was not particularly angry at that episode, since that’s something I’ve seen on several occasions. That’s also why I did not mention any team in particular.
The feature request I was making was reasonable, and cannot really be considered a pet feature request (though I might be biased with my QA hat on): mentioning in the dak templates used for bug closure that packages removed from Debian can still be found on snapshot.d.o. The fact that he thinks that addressing this himself turns him into a slave raises interesting questions.

“Please send a patch”

There’s a frequent pattern in the Debian community where someone would suggest an improvement to some package or service, and the person responsible for it would reply with “please send a patch”.

Of course, there are cases where it’s perfectly reasonable to ask for a patch: when the task is expected to take hours, or when the result is of limited interest to everybody except the demander.
But often, it would be much more efficient for the person responsible for the service to take 5 or 10 minutes to make the change, than for the demander to spend half an hour learning how to contribute to this service, writing an untestable patch, and sending it back to the person responsible for the service for integration.

It is important to see this problem as the issue of maximizing the usefulness of the resources we have. Often, we are in a situation where we could either:

  • spend 30 minutes on a service we don’t know, just to push one change
  • spend 30 minutes making three or more changes that others would like to see in a service we know

It would be much better if, instead of saying “Please send a patch”, people would say “Are you interested in contributing a patch, or should I make the change myself?”. Remember that each time you ask someone to take some time to contribute a patch in an area where he is not an efficient contributor, you take some time away from him that could be used to contribute something in an area where he is efficient.