Call for help: graphing Debian trends

Various discussions have raised how difficult it is to make large-scale changes in Debian.

I think that one part of the problem is that we are not very good at tracking those large-scale changes, and I’d like to change that. A long time ago, I did some graphs about Debian (first in 2011, then in 2013, then again in 2015). An example from 2015 is given below, showing the market share of packaging helpers.

Those were generated using a custom script. Since then, classification tags were added to lintian, and I’d like to institutionalize that a bit, to make it easier to track more trends in Debian, and maybe motivate people to switch to new packaging standards. This could include stuff like VCS used, salsa migration, debhelper compat levels, patch systems and source formats, but also stuff like systemd unit files vs traditional init scripts, hardening features, etc. The process would look like:

  1. Add classification tags to lintian for relevant stuff (maybe starting with being able to regenerate the graphs from 2015).
  2. Use lintian to scan all packages on snapshot.debian.org, which stores all packages ever uploaded to Debian (well, since 2005), and generate a dataset.
  3. Generate nice graphs.
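
Step 3 could start from something as small as the sketch below. It assumes a hypothetical dataset in which each record is a (snapshot date, source package, helper) tuple derived from lintian classification tags; the record layout, the helper names and the `helper_share` function are illustrative assumptions, not lintian's actual output format.

```python
from collections import Counter, defaultdict


def helper_share(records):
    """Return {date: {helper: percentage}} from (date, package, helper) records."""
    counts = defaultdict(Counter)
    for date, _package, helper in records:
        counts[date][helper] += 1
    shares = {}
    for date, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[date] = {h: 100.0 * n / total for h, n in counter.items()}
    return shares


# Hypothetical sample data, for illustration only.
SAMPLE = [
    ("2011-01-01", "foo", "classic-debhelper"),
    ("2011-01-01", "bar", "cdbs"),
    ("2015-01-01", "foo", "dh"),
    ("2015-01-01", "bar", "dh"),
    ("2015-01-01", "baz", "cdbs"),
    ("2015-01-01", "qux", "dh"),
]

if __name__ == "__main__":
    for date, share in helper_share(SAMPLE).items():
        print(date, share)
```

The per-date percentages can then be fed to whatever plotting tool ends up being used.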

Given my limited time available for Debian, I would totally welcome some help. I can probably take care of the second step (I actually did it recently on a subset of packages to check feasibility), but I would need:

  • The help of someone with Perl knowledge, willing to modify lintian to add additional classification tags. There’s no need to be a Debian Developer, and lintian has an extensive test suite, which should make it quite fun to hack on. The code could either be integrated in lintian, or live in a lintian fork that would only be used to generate this data.
  • Ideally (but that’s less important at this stage), the help of someone with web skills to generate a nice website.

Let me know if you are interested.

On Debian frustrations

Michael Stapelberg writes about his frustrations with Debian, resulting in him reducing his involvement in the project. That’s sad: over the years, Michael has made a lot of great contributions to Debian, addressing hard problems in interesting, disruptive ways.

He makes a lot of good points about Debian, with which I’m generally in agreement. An interesting exercise would be to rank those issues: what are, today, the biggest issues to solve in Debian? I’m nowadays not following Debian closely enough to be able to do that exercise, but I would love to read others’ thoughts (bonus points if it’s in a DPL platform, given that it seems that we have a pretty quiet DPL election this year!)

Most of Michael’s points are about the need for modernization of Debian’s infrastructure and workflows, and I agree that it’s sad that we have made little progress in that area over the last decade. And I think that it’s important to realize that providing alternatives to developers has a cost, and that when a large proportion of developers or packages have switched to doing something (using git, using dh, not using 1.0-based patch systems such as dpatch, …), there are huge advantages in standardizing and pushing this on everybody.

There are a few reasons why this is harder than it sounds, though.

First, there’s Debian’s culture of stability and technical excellence. “Above all, do no harm” could also apply to the mindset of many Debian Developers. On one hand, that’s great, because this focus on not breaking things probably contributes a lot to our ability to produce something that works as well as Debian. But on the other hand, it means that we often seek solutions that limit short-term damage or disruption, but are far from optimal in the long term.
An example is our packaging software stack. I wrote most of the introduction to Debian packaging found in the packaging-tutorial package (which is translated into six languages now), but am still amazed by all the unjustified complexity. We tend to fix problems by adding additional layers of software on top of existing layers, rather than by fixing/refactoring the existing layers. For example, the standard way to package software today is using dh. However, dh stands on dh_* commands (even if it does not call them directly, contrary to what CDBS did), and all the documentation on dh is still structured around those commands: if you want to install an additional file in a package, probably the simplest way to do that is to add it to debian/packagename.install, but this is documented in the manpage for dh_install, which you are not going to actually call because dh abstracts that away for you! I realize that this could be better explained in packaging-tutorial… (patches welcome)

There’s also the fact that Debian is very large, very diverse, and hard to test. It’s very easy to break things silently in Debian, because many of our packages are niche packages, or don’t have proper test suites (because not everything can be easily tested automatically). I don’t see how the workflows for large-scale changes that Michael describes could work in Debian without first getting much better at detecting regressions.

Still, there’s a lot of innovation going on inside packaging teams, with the development of language-specific packaging helpers (listed on the AutomaticPackagingTools wiki page). However, this silo-ed organization tends to fragment the expertise of the project about what works and what doesn’t: because packaging teams don’t talk much to each other, they often solve the same problems in slightly different ways. We probably need more ways to discuss interesting stuff going on in teams, and to consolidate what can be shared between teams. The fact that many people have stopped following debian-devel@ nowadays is probably not helping…

The addition of salsa.debian.org is probably the best thing that happened to Debian recently. How much this ends up being used for improving our workflows remains to be seen:

  • We could use GitLab merge requests to track patches, rather than attachments in the BTS. Some tooling to provide an overview of open MRs in various dashboards is probably needed (and unfortunately GitLab’s API is very slow when dealing with a large number of projects).
  • We could probably have a way to move the package upload to a gitlab-ci job (for example, by committing the signed changes file in a specific branch, similar to what pristine-tar does, but there might be a better way)
  • I would love to see a team experiment with a monorepo approach (instead of the “one git repo per package + mr to track them all” approach). For teams with lots of small packages there is probably a lot to gain from such an organization.


Sending mail from mutt with queueing and multiple accounts/profiles?

I’m looking into upgrading my email setup. I use mutt, and need two features:

  • Local queueing of emails, so that I can write emails when offline, queue them, and send them later when I’m online.
  • Routing of emails through several remote SMTP servers, ideally depending on the From header, so that emails pass SPF/DKIM checks.

I currently use nullmailer, which does queueing just fine, but apparently cannot handle several remote SMTP servers.

There’s also msmtp, which can handle several “accounts” (remote SMTPs), but apparently not when queueing with msmtpq.
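
For what it’s worth, the routing half of the problem is small enough to sketch. The snippet below picks an SMTP relay from the From header of a queued message; the account table, the host names and the `pick_relay` helper are all hypothetical, and the actual queueing and delivery are left out.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical accounts: sender address -> (SMTP host, port).
RELAYS = {
    "me@example.org": ("smtp.example.org", 587),
    "me@work.example.com": ("smtp.work.example.com", 587),
}
# Hypothetical fallback used when the From address is unknown or missing.
DEFAULT_RELAY = ("smtp.example.org", 587)


def pick_relay(raw_message):
    """Return the (host, port) relay matching the message's From address."""
    msg = message_from_string(raw_message)
    _name, address = parseaddr(msg.get("From", ""))
    return RELAYS.get(address.lower(), DEFAULT_RELAY)
```

A queue runner would call `pick_relay()` on each queued file once the laptop is back online, and hand the message to the matching server.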

What are you using yourself?

systemd services, and queue management?

I’ve been increasingly using systemd timers as a replacement for cron jobs. The fact that you get free logging is great, and also the fact that you don’t have to care about multiple instances running simultaneously.

However, sometimes I would be interested in more complex scenarios, such as:

  • I’d like to trigger a full run of the service unit: if the service is not running, it should be started immediately. If it’s currently running, it should be started again when it terminates.
  • Same as the above, but with queue coalescing: If I do the above multiple times in a row, I only want the guarantee that there’s one full run of the service after the last time I triggered it (typical scenario: each run processes all pending events, so there’s no point in running multiple times).

Is this doable with systemd? If not, how do people do that outside of systemd?
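
Outside of systemd, the second scenario boils down to a “running” flag plus a “pending” flag. Here is a minimal in-process sketch of those semantics; the class and method names are made up for illustration, and `trigger()` runs the job in the calling thread rather than in a separate service.

```python
import threading


class CoalescingRunner:
    """Guarantee one full run of `job` after the last trigger,
    merging triggers that arrive while a run is in progress."""

    def __init__(self, job):
        self._job = job
        self._lock = threading.Lock()
        self._running = False
        self._pending = False

    def trigger(self):
        with self._lock:
            if self._running:
                # A run is in progress: remember that one more is needed.
                self._pending = True
                return
            self._running = True
        self._run_loop()

    def _run_loop(self):
        while True:
            self._job()
            with self._lock:
                if not self._pending:
                    self._running = False
                    return
                # Triggers arrived during the run: do exactly one more.
                self._pending = False
```

Any number of triggers received during a run collapse into a single follow-up run, which is enough when each run processes all pending events.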

Implementing “right to disconnect” by delaying outgoing email?

France passed a law about “right to disconnect” (more info here or here). The idea of not sending professional emails when people are not supposed to read them, in order to protect their private lives, is a pretty good one, especially when hierarchy is involved. However, I tend to do email at random times, and I would rather continue doing that, but just delay the actual sending of the email to the appropriate time (e.g., when I do email in the evening, it would actually be sent the following morning at 9am).

I wonder how I could make this fit into my email workflow. I write email using mutt on my laptop, then push it locally to nullmailer, which then relays it, over an SSH tunnel, to a remote server (running Exim4).

Of course the fallback solution would be to use mutt’s postponing feature. Or to draft the email in a text editor. But that’s not really nice, because it requires going back to the email at the appropriate time. I would like a solution where I would write the email, add a header (or maybe manually add a Date: header — in all cases that header should reflect the time the mail was sent, not the time it was written), send the email, and have nullmailer or the remote server queue it until the appropriate time is reached (e.g., delaying while “current_time < Date header in email”). I don’t want to do that for all emails: e.g. personal emails can go out immediately.

Any ideas on how to implement that? I’m attached to mutt and relaying using SSH, but not attached to nullmailer or exim4. Ideally the delaying would happen on my remote server, so that my laptop doesn’t need to be online at the appropriate time.
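
Whatever component ends up doing the queueing, the “appropriate time” rule itself is simple. Here is a sketch, assuming working hours of 9:00 to 18:00 from Monday to Friday (my assumption; public holidays are ignored) and a hypothetical `release_time` helper:

```python
from datetime import datetime, time, timedelta


def release_time(written_at, start=time(9, 0), end=time(18, 0)):
    """Return when a mail written at `written_at` should actually be sent.

    Inside working hours (Mon-Fri, start <= t < end) the mail goes out
    immediately; otherwise it is held until `start` on the next working day.
    """
    day = written_at.date()
    if written_at.weekday() < 5:
        if start <= written_at.time() < end:
            return written_at
        if written_at.time() >= end:
            day += timedelta(days=1)
    while day.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
        day += timedelta(days=1)
    return datetime.combine(day, start)
```

The queue would then hold each tagged message until `release_time()` is reached, and let untagged (personal) messages through immediately.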

Update: mutt does not allow setting the Date: field manually (if you enable the edit_headers option and edit it manually, its value gets overwritten). I did not find the relevant code yet, but that behaviour is mentioned in that bug.

Update 2: ah, it’s this code in sendlib.c (and there’s no way to configure that behaviour):

 /* mutt_write_rfc822_header() only writes out a Date: header with
 * mode == 0, i.e. _not_ postponment; so write out one ourself */
 if (post)
   fprintf (msg->fp, "%s", mutt_make_date (buf, sizeof (buf)));

The Linux 2.5, Ruby 1.9 and Python 3 release management anti-pattern

There’s a pattern that comes up from time to time in the release management of free software projects.

To allow for big, disruptive changes, a new development branch is created. Most of the developers’ focus moves to the development branch. However at the same time, the users’ focus stays on the stable branch.

As a result:

  • The development branch lacks user testing, and tends to make slower progress towards stabilization.
  • Since users continue to use the stable branch, it is tempting for developers to spend time backporting new features to the stable branch instead of improving the development branch to get it stable.

This situation can grow into a quasi-deadlock, with people questioning whether it was a good idea to do such a massive fork in the first place, and whether it is a good idea to even spend time switching to the new branch.

To make matters more confusing, the development branch is often declared “stable” by its developers before most of the libraries or applications have been ported to it.

This has happened at least three times.

First, in the Linux 2.4 / 2.5 era. Wikipedia describes the situation like this:

Before the 2.6 series, there was a stable branch (2.4) where only relatively minor and safe changes were merged, and an unstable branch (2.5), where bigger changes and cleanups were allowed. Both of these branches had been maintained by the same set of people, led by Torvalds. This meant that users would always have a well-tested 2.4 version with the latest security and bug fixes to use, though they would have to wait for the features which went into the 2.5 branch. The downside of this was that the “stable” kernel ended up so far behind that it no longer supported recent hardware and lacked needed features. In the late 2.5 kernel series, some maintainers elected to try backporting of their changes to the stable kernel series, which resulted in bugs being introduced into the 2.4 kernel series. The 2.5 branch was then eventually declared stable and renamed to 2.6. But instead of opening an unstable 2.7 branch, the kernel developers decided to continue putting major changes into the 2.6 branch, which would then be released at a pace faster than 2.4.x but slower than 2.5.x. This had the desirable effect of making new features more quickly available and getting more testing of the new code, which was added in smaller batches and easier to test.

Then, in the Ruby community. In 2007, Ruby 1.8.6 was the stable version of Ruby. Ruby 1.9.0 was released on 2007-12-26, without being declared stable, as a snapshot from Ruby’s trunk branch, and most of the development’s attention moved to 1.9.x. On 2009-01-31, Ruby 1.9.1 was the first release of the 1.9 branch to be declared stable. But at the same time, the disruptive changes introduced in Ruby 1.9 made users stay with Ruby 1.8, as many libraries (gems) remained incompatible with Ruby 1.9.x. Debian provided packages for both branches of Ruby in Squeeze (2011) but only changed the default to 1.9 in 2012 (in a stable release with Wheezy – 2013).

Finally, in the Python community. Similarly to what happened with Ruby 1.9, Python 3.0 was released in December 2008. Releases from the 3.x branch have been shipped in Debian Squeeze (3.1), Wheezy (3.2), Jessie (3.4). But the ‘python’ command still points to 2.7 (I don’t think that there are plans to make it point to 3.x, making python 3.x essentially a different language), and there are talks about really getting rid of Python 2.7 in Buster (Stretch+1, Jessie+2).

In retrospect, and looking at what those projects have been doing in recent years, it is probably a better idea to break early, break often, and fix a constant stream of breakages, on a regular basis, even if that means temporarily exposing breakage to users, and spending more time seeking strategies to limit the damage caused by introducing breakage. What also changed since the time those branches were introduced is the increased popularity of automated testing and continuous integration, which makes it easier to measure breakage caused by disruptive changes. Distributions are in a good position to help here, by being able to provide early feedback to upstream projects about potentially disruptive changes. And distributions also have good motivations to help here, because it is usually not a great solution to ship two incompatible branches of the same project.

(I wonder if there are other occurrences of the same pattern?)

Update: There’s a discussion about this post on HN

Re: Sysadmin Skills and University Degrees

Russell Coker wrote about Sysadmin Skills and University Degrees. I couldn’t agree more that a major deficiency in Computer Science degrees is the lack of sysadmin training. It seems like most sysadmins learned most of what they know from experience. It’s very hard to recruit young engineers (freshly out of university) for sysadmin jobs, and the job interviews are often a bit depressing. Sysadmin jobs are also not very popular with this audience, probably because university curriculums fail to emphasize what’s exciting about those jobs.

However, I think I disagree rather deeply with Russell’s detailed analysis.

First, Version Control. Well, I think that it’s pretty well covered in university curriculums nowadays. From my point of view, teaching CS at Université de Lorraine (France), mostly in Licence Professionnelle Administration de Systèmes, Réseaux et Applications à base de Logiciels Libres (warning: French), a BSc degree focusing on Linux systems administration, it’s not unusual to see student projects with a mandatory use of Git. And it doesn’t seem to be a major problem for students (which always surprises me). However, I wouldn’t rate Version Control as the most important thing that is required for a sysadmin. Similarly, Dependencies and Backups are things that should be covered, but probably not as first-class citizens.

I think that there are several pillars in the typical sysadmin knowledge.

First and foremost, sysadmins need a good understanding of the inner workings of an operating system. I sometimes feel that many Operating Systems Design courses are a bit too much focused on the “Design” side of things. Yes, it’s useful to understand the low-level mechanisms, and be able to (mentally) recreate an OS from scratch. But it’s also interesting to know how real systems are actually built, and what trade-offs are involved. I very much enjoyed reading Brendan Gregg’s Systems Performance: Enterprise and the Cloud because each chapter starts with a great overview of how things are in the real world, with a very good level of detail. Also, addressing OS design from the point of view of performance could be a way to turn those courses into something more attractive for students: many people like to measure, benchmark, optimize things, and it’s quite easy to demonstrate how different designs, or different configurations, make a big difference in terms of performance in the context of OS design. It’s possible to be a sysadmin and ignore, say, the existence of the VFS, but there’s a large class of problems that you will never be able to solve. It can be a good trade-off for a curriculum (e.g. at the BSc level) to decide to ignore most of the low-level stuff, but it’s important to be aware of it.

Students also need to learn how to design a proper infrastructure (that meets requirements in terms of scalability, availability, security, and maybe elasticity). Yes, backups are important. But monitoring is, too. As well as high availability. In order to scale, it’s important to be able to automate stuff. Russell writes that Sysadmins need some programming skills, but that’s mostly scripting and basic debugging. Well, when you design an infrastructure, or when you use configuration management tools such as Puppet, in some sense, you are programming, and in terms of the need to abstract things, it’s actually similar to doing object-oriented programming, with similar choices (should I use that off-the-shelf Puppet module, or re-develop my own? How should everything fit together?). Also, when debugging, it’s often useful to be able to dig into code, understand what the developer was trying to do, and whether the expected behavior actually matches what you are seeing. It often results in spending a lot of time to create a one-line fix, and it requires very advanced programming skills. Again, it’s possible to be a sysadmin with only limited software development knowledge, but there’s a large class of things that you are unlikely to address properly.

I think that what makes sysadmin jobs both very interesting and very challenging is that they require a very wide range of knowledge. There’s often the opportunity to learn about new stuff (much more than in software development jobs). Of course, the difficult question is where to draw the line. What is the sysadmin knowledge that every CS graduate should have, even in curriculums not targeting sysadmin jobs? What is the sysadmin knowledge for a sysadmin BSc degree? For a sysadmin MSc degree?

Debian packages with /outdated/ packaging style

(This is just a copy of this debian-devel@ email)

Following my blog post yesterday with graphs about Debian packaging evolution, I prepared lists of packages for each kind of outdatedness. Of course not all practices highlighted below are deprecated, and there are good reasons to continue to do some of them. But still, given that they all represent a clear minority of packages, I thought that it would be useful to list the related packages. (I honestly didn’t know if some of my packages would show up in the lists!)

The lists are available at https://people.debian.org/~lucas/qa-20151226/

I also pushed them to alioth, so you can either do:
ssh people.debian.org 'grep -A 10 YOURNAME ~lucas/public_html/qa-20151226/*ddlist'
or:
ssh alioth.debian.org 'grep -A 10 YOURNAME ~lucas/qa-20151226/*ddlist'

The meaning of the lists is:

  • qa-comaint_but_no_vcs.txt (275 packages): Based on the content of Maintainer/Uploaders, the package is co-maintained, but there are no Vcs-* fields.
  • qa-format_10.txt (3153 packages): The package is still using format 1.0.
  • qa-helper_classic_debhelper.txt (3647 packages): The package is still using “classic” debhelper (no dh, no CDBS).
  • qa-helper_not_debhelper.txt (144 packages): The package is not using debhelper (nor dh, nor CDBS).
  • qa-patch_dpatch.txt (170 packages): The package is using dpatch.
  • qa-patch_modified-files-outside-debian.txt (1156 packages): The package has modified files outside the debian/ directory (not tracked using patches).
  • qa-patch_more_than_one.txt (201 packages): The package uses more than one “patch system”. In most cases, it means that the package uses a patch system, but also has files modified directly outside of debian/.
  • qa-patch_other.txt (51 packages): The package has patches, but using an unidentified/unknown patch system.
  • qa-patch_quilt.txt (445 packages): The package uses quilt (with 1.0 format, not 3.0 format).
  • qa-patch_simple-patchsys.txt (129 packages): The package uses simple-patchsys.
  • qa-vcs_but_not_git_or_svn.txt (290 packages): The package is maintained using a VCS, which is neither Git nor SVN.
  • qa-vcs_more_than_one_declared_vcs.txt (1 package): The package declares more than one VCS.

If you don’t understand why your package is listed, you can have a look at allpackages-20151226.yaml that provides more details. If you still don’t understand, just ask me.

Excluding duplicates, a total of 5469 packages are listed. The dd-list output for the merged list is also available (which isn’t very useful, except to know if you are listed).

Debian is still changing

Here is an update to the usual graphs generated from snapshot.d.o. See my previous blog post for the background info.

In all graphs, it’s easy to see the effect of the Jessie freeze (and the previous freezes since 2005, too).

Team maintenance

comaint-2015


It’s interesting to see that, while the number of team-maintained packages increases, the number of packages that aren’t co-maintained is very stable.

Maintenance using a VCS

vcs-2015

Git is the clear winner now, with the migration rate increasing recently.

Packaging helpers

helpers-2015

As usual, the number of packages using CDBS is quite stable. The number of packages using traditional debhelper might soon be lower than those using CDBS.

Patch systems and packaging formats

formats-patches-2015


3.0 is the clear winner, even if we still have 3000+ packages using 1.0, and ~1000 of those modifying files directly. The other patch systems have basically disappeared.

So, all those graphs are kind-of boring now. Any good ideas of additional things to track, that can be identified reliably by looking at source packages?

For those interested, below are links to the graphs with percentages of packages.

comaint-percent-2015    formats-patches-percent-2015    vcs-percent-2015    helpers-percent-2015

DebConf’15

I attended DebConf’15 last week. After being on semi-vacation from Debian for the last few months, recovering after the end of my second DPL term, it was great to be active again, talk to many people, and go back to doing technical work. Unfortunately, I caught the debbug quite early in the week, so I was not able to make it as intense as I wanted, but it was great nevertheless.

I still managed to do quite a lot:

  • I rewrote a core part of UDD, which will make it easier to monitor data importer scripts and reduce the cron-spam
  • with DSA members, I worked on finding a suitable workaround for the storage performance issues that have been plaguing UDD for the last few months. fsync() calls will no longer hang for 15 minutes, yay!
  • I added a DUCK importer to UDD, and added that information to the Debian Maintainer Dashboard
  • I worked a bit on cleaning up the status of my packages, including digging into a strange texlive issue (that showed up in developers-reference), that is now fixed in unstable
  • I worked a bit on improving git-buildpackage documentation (more to come in that area)
  • Last but not least, I played Mao for the first time in years, and it was a lot of fun. (even if my brain is still slowly recovering)

DC15 was a great DebConf, probably one of the two best I’ve attended so far. I’m now looking forward to DC16 in Cape Town!