Living in France? Not an April member? You are WRONG.

I’ve been a member of April, the French association for the promotion and defense of Free Software, for a bit more than a year, and I often regret not joining earlier. (I felt so guilty and ashamed about not being a member that I actually postponed becoming one.)

Stop feeling guilty and shameful, become an April member today!

Why is becoming an April member so important?

  • Clearly, April doesn’t address the same problems as your local LUG. April is a country-wide organization, and it works on country-wide problems. It’s the only group able to work on such problems at this scale (I’m not sure of the situation in other countries, but I think the CCC plays a similar role in Germany, for example).
  • Each time I talk to people really involved in April (which I’m not), I’m amazed by how powerful they have become. They are able to talk to French or European deputies or ministers’ cabinets, and are considered important. They are doing a fantastic job of bringing what matters to us to the legislative and executive powers in France and Europe.

Some of the things they worked on recently (off the top of my head):

  • Lobbying on:
    • OOXML
    • General announcements about politics (Plan France Numérique 2012, aka Plan Besson).
    • European telecom package and the HADOPI law (French graduated response), through Quadrature du Net. (OK, it doesn’t have anything to do with April, but most of the people involved in Quadrature du Net are also involved in April :-)
    • vente liée (tied selling): the fact that it’s not possible to buy a computer without a Windows license. It’s illegal under French law, but still the de facto situation almost everywhere.
  • Organization of a campaign where candidates in French elections were asked questions, or asked to sign a declaration about Free Software. In 2007, 8 out of the 12 candidates in the French presidential election answered April’s questions.

So, really, become a member today. It’s only 10 EUR, and you already know they will be well used. April is trying to reach 5000 members by the end of 2008.

(Apparently, if you use that address, April will know that you came from me. No benefit for me at all.)

Developer status: problems and solutions

For a minute, I’d like to think about The Announcement in terms of problems and solutions.

Problem: We don’t have enough non-developing contributors in Debian
Solution in The Announcement: Give them special, official, statuses in the Debian community, so they are rewarded for their work.
Alternate solution: Are they really looking forward to this status? Aren’t we just assuming that they are as power-hungry as we are? Last time I checked, the Debian community wasn’t really welcoming to non-developers. It isn’t a problem of official status. We have to work on ourselves to make Debian a better place to be for non-developing contributors. Also, giving those people a second-class citizenship that won’t be widely recognized (“Debian Member” instead of “Debian Developer”, or even worse, “Debian Contributor”, which doesn’t give any rights except an email address) isn’t going to help.

Problem: We have a trust and security problem. It’s difficult to give upload rights and access to Debian machines to more than 1000 people, with some of them not being active anymore, or not having much experience with security (in the case of non-developing contributors).
Solution in The Announcement: Remove rights based on classes of developers, so the group of people having full access stays within manageable boundaries.
Alternate solution: Have fine-grained control over who can do what.

Problem: DMs can’t get access to Debian resources because their keys aren’t managed by the Debian keyring team, but by another team.
Solution in The Announcement: Keyring managers will take care of the Debian Maintainers keyring too. In the past, the keyring managers were the cause of huge delays in the NM process. That has improved a lot recently, but it doesn’t mean that it will always be like that.
Alternate solution: Make the DM keyring team, the keyring managers, and DSA work together on a solution. It doesn’t sound impossible. If necessary, merge the DM keyring team and the keyring managers.

Problem: DM was done without the blessing of most of the members of the loosely defined group of powerful DDs (sometimes described as “cabal”).
Solution in The Announcement: Drop it, replace it with something very similar, but originating from the cabal.
Alternate solution: Why not keep it?

Problem: Clueless DDs advocate clueless contributors for DMs. Then those DMs upload crap to the archive. (Note: I don’t necessarily agree with that problem, that’s just a problem that the announcement tries to solve)
Solution in The Announcement: Require answering a few questions before becoming a DM. Remove the right for every DD to decide which DM can upload which package. Give that control only to the NM committee.
Alternate solution: If we can’t trust one DD to make the right decision about advocating someone for DM, we could require that each person be advocated by two different DDs. Or three, four, five. Asking the DM to answer a few questions, whose answers they can probably get by asking around on IRC, is not going to improve the average level of our DMs or DDs.


I believe that there are real problems that need to be solved, but that the decisions announced don’t solve the real underlying problems. Here is what I’d like to see:

  • Fine-grained access control for each DD or DM or …:
    This would allow people to have specific rights inside Debian, like:

    • Right to log in to each debian.org host (a per-host switch)
    • Right to upload packages for which the person is Maintainer or Uploader
    • Right to upload any package
    • Right to vote

    All rights would default to “NO”. DDs would be allowed to change their rights for everything, while DMs would only be allowed to change specific rights. Of course, we need a secure way to change rights (HTTPS+confirmation using GPG, like the current “sudo password” thing?). But that would help with the security problem (you wouldn’t have access to all Debian hosts without enabling it). In fact, we already have that: many hosts require a switch to be enabled in LDAP before you can log in, and we have tons of Unix groups to restrict access to specific areas of our infrastructure. I’m just proposing to extend that to basically everything, and to allow people to grant themselves some rights without going through DSA.

  • Modified NM process for non-developing contributors, which would still include many P&P and T&S questions, but would give “Debian Developer” status without any upload rights. That way, non-developing contributors would have an official status, and would still be able to write “Debian Developer” on their resume. Giving a different status to those people just because we don’t think that they should have the right to upload any package doesn’t feel right.
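To make the fine-grained access control idea a bit more concrete, here is a minimal sketch of the “everything defaults to NO” model. All names are hypothetical, and the set of rights a DM may enable on their own is a placeholder, since that policy is not decided here.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the rights listed above; per-host login
# rights are written "login:<hostname>".
UPLOAD_OWN = "upload-own"
UPLOAD_ANY = "upload-any"
VOTE = "vote"

# Assumption: which rights a DM may switch on is a policy decision that
# is left open; this set is only a placeholder.
DM_SELF_SERVICE = {UPLOAD_OWN}


@dataclass
class Account:
    name: str
    status: str  # "DD" or "DM"
    enabled: set = field(default_factory=set)  # empty set: every right is "NO"

    def has(self, right):
        """Return True only if the right was explicitly switched on."""
        return right in self.enabled

    def may_change(self, right):
        """DDs may change any right, DMs only the self-service ones."""
        if self.status == "DD":
            return True
        return self.status == "DM" and right in DM_SELF_SERVICE

    def enable(self, right):
        """Switch a right on. A real system would first check a
        GPG-signed confirmation; that step is not modelled here."""
        if not self.may_change(right):
            raise PermissionError(f"{self.status} cannot enable {right!r}")
        self.enabled.add(right)
```

A DD who never enables “login:master.debian.org” simply has no access to that host, which is the point of defaulting to “NO”.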

Is that enough? Probably not. We need to be more welcoming towards non-developing contributors, which is a social problem, not a technical one. But we are not going to solve that in a GR.

Hype

It’s really funny to see how popular things can get, and then totally disappear. Some time ago, there was a show on French TV called “Un an de +”, which talked about what had been in the news one year before. It was really interesting to see how quickly everybody can forget stuff that looks so important now.

For example, who remembers Second Life? Apparently, it has been on the decline for some time already, according to Google Trends.

Will Facebook be the next Second Life?

Importing bugs into UDD, faster.

Importing bugs from the Debian Bug Tracking System into UDD (the Ultimate Debian Database) in a reasonable time is challenging. The BTS uses flat files to store bug information, and importing a bug typically requires reading a ‘summary’ file and a file containing the versioning information, both of a few hundred bytes. That looks easy, but when you multiply it by ~70000 unarchived bugs, it takes a lot of time (about 40 minutes) to read those ~100k files, because the import process will block on every I/O. The problem is not the amount of data to read (19.8 MB for summary files, 7.4 MB for versioning files), but the number of files (69612 summary files, 17507 versioning files).

The obvious solution is to preload all the files into the page cache, so they are there when you need them. But you can’t simply do that with find /org/bugs.debian.org/versions/pkg -type f -exec cat {} \+ &>/dev/null, because that wouldn’t fix anything: you would still block on each file, and prevent the I/O scheduler from reordering and optimizing the reads (it’s called an elevator for a reason). So, how do I tell the kernel “I’m going to read that in the future, please preload it”? readahead(2) is blocking, so it’s not helpful. The right solution is to use posix_fadvise(2), which lets you declare an access pattern. Using fadvise to preload all the files takes less than 5 minutes, and importing the bugs after that takes less than 8 minutes, so it’s really a big win.
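For illustration, the fadvise-based preloading can be sketched in a few lines of Python. This is not the script used for the UDD import, only a minimal example of the technique; the function names are made up, and it assumes a Unix system where os.posix_fadvise is available. Note that opening each file can still block briefly on metadata lookups.

```python
import os
import sys


def preload(paths):
    """Ask the kernel to start reading each file into the page cache.

    Returns the number of files for which the hint was issued.
    """
    hinted = 0
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # unreadable or vanished file: skip it
        try:
            # offset=0 and len=0 mean "the whole file"; WILLNEED queues
            # the reads without waiting for them to complete.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            hinted += 1
        except OSError:
            pass  # the hint is only advisory, so failures are ignored
        finally:
            os.close(fd)
    return hinted


def walk_files(root):
    """Yield the path of every file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Usage: pass one or more directories to preload as arguments.
    for root in sys.argv[1:]:
        preload(walk_files(root))
```

Because the hint returns immediately, the kernel ends up with many outstanding reads at once and is free to reorder them, which is exactly what the cat-based approach prevents.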

Does someone know if there’s already an fadvise-based tool that can preload a list of files? That’s something that I could use in other contexts as well.

Cool stats about Debian bugs

Now that bug #500000 has been reported, let’s have a look at all our other bugs, using UDD.

Number of archived bugs:

select count(*) from archived_bugs;
 count  
--------
 402826

Number of unarchived bugs marked done:

select count(*) from bugs where status = 'done';
 count 
-------
  8267

Status of unarchived bugs (“pending” doesn’t mean “tagged pending” here):

select status, count(*) from bugs group by status;
    status     | count 
---------------+-------
 pending       | 53587
 pending-fixed |  1195
 forwarded     |  6778
 done          |  8267
 fixed         |   167

The sum isn’t even close to 500000. That’s because quite a lot of bugs disappeared:

select id from bugs union select id from archived_bugs order by id limit 10;
 id  
-----
 563
 660
 710
 725
 740
 773
 775
 783
 817
 819
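To put a number on that gap, here is a quick sum of the counts returned by the queries above. It assumes that bug numbers run from 1 to 500000 without reuse, which is a simplification.

```python
# Counts copied from the query results above.
archived = 402826
unarchived_by_status = {
    "pending": 53587,
    "pending-fixed": 1195,
    "forwarded": 6778,
    "done": 8267,
    "fixed": 167,
}

unarchived = sum(unarchived_by_status.values())
known = archived + unarchived
# Assumption: every number from 1 to 500000 was assigned exactly once.
missing = 500000 - known

print(unarchived, known, missing)  # 69994 472820 27180
```

So roughly 27000 bug numbers are in neither table.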

Now, let’s look at our open bugs.
Oldest open bugs:

select id, package, title, arrival from bugs where status != 'done' order by id limit 10;
  id  |    package     |                                   title                                    |       arrival       
------+----------------+----------------------------------------------------------------------------+---------------------
  825 | trn            | trn warning messages corrupt thread selector display                       | 1995-04-22 18:33:01
 1555 | dselect        | dselect per-screen-half focus request                                      | 1995-10-06 15:48:04
 2297 | xterm          | xterm: xterm sometimes gets mouse-paste and RETURN keypress in wrong order | 1996-02-07 21:33:01
 2298 | trn            | trn bug with shell escaping                                                | 1996-02-07 21:48:01
 3175 | xonix          | xonix colors bad for colorblind                                            | 1996-05-31 23:18:04
 3180 | linuxdoc-tools | linuxdoc-sgml semantics and formatting problems                            | 1996-06-02 05:18:03
 3251 | acct           | accounting file corruption                                                 | 1996-06-12 17:44:10
 3773 | xless          | xless default window too thin and won't go away when asked nicely          | 1996-07-14 00:03:09
 4073 | make           | make pattern rules delete intermediate files                               | 1996-08-08 20:18:01
 4448 | dselect        | [PERF] dselect performance gripe (disk method doing dpkg -iGROEB)          | 1996-09-09 03:33:05

Breakdown by severity:

select severity, count(*) from bugs where status != 'done' group by severity;
 severity  | count 
-----------+-------
 normal    | 27680
 important |  7606
 minor     |  6921
 wishlist  | 18898
 critical  |    29
 grave     |   209
 serious   |   384

Top 10 submitters for open bugs:

select submitter, count(*) from bugs where status != 'done' group by submitter order by count desc limit 10;
      submitter      | count 
---------------------+-------
 Dan Jacobson        |  1455
 martin f krafft     |   667
 Raphael Geissert    |   422
 Joey Hess           |   392
 Marc Haber          |   368
 Julien Danjou       |   342
 Josh Triplett       |   331
 Vincent Lefevre     |   296
 jidanni@jidanni.org |   260
 Justin Pryzby       |   245

Top bug reporters ever:

select submitter, count(*) from (select * from bugs union select * from archived_bugs) as all_bugs
group by submitter order by count desc limit 10;
     submitter     | count 
-------------------+-------
 Martin Michlmayr  |  4279
 Dan Jacobson      |  3652
 Daniel Schepler   |  3045
 Joey Hess         |  2836
 Lucas Nussbaum    |  2701
 Andreas Jochens   |  2605
 Matthias Klose    |  2442
 Christian Perrier |  2302
 James Troup       |  2198
 Matt Zimmerman    |  2027

You want more data? Connect to UDD (from master.d.o or alioth.d.o, more info here), run your own queries, and post them with the results in the comments!

Looking for cliques in the GPG signatures graph

The strongly connected set of the GPG keys graph contains a bit more than 40000 keys now (yes, that’s a lot of geeks!). I wondered what the biggest clique (complete subgraph) in that graph was, and also of course the biggest clique I was in.

It’s easy to grab the whole web of trust there. Finding the maximum clique in a graph is NP-hard, but there are algorithms that work quite well for small instances (and you don’t need to consider all 40000 keys: to be in a clique of n keys, a key must have at least n-1 signatures, so it’s easy to simplify the graph. If you find a clique with 20 keys, you can remove all keys that have fewer than 19 signatures).
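The simplification step can be sketched as follows. This is only an illustration, not the code I used: the function name is made up, and it assumes the graph has already been reduced to mutual signatures, stored as a dict mapping each key to the set of keys it is connected to. It also goes one step further than the single pass described above by repeating the removal until nothing changes, since removing a key can push its neighbours below the bound.

```python
def prune_for_clique(graph, n):
    """Drop keys that cannot belong to a clique of n keys.

    graph maps each key to the set of keys it is mutually signed with.
    A member of an n-clique needs at least n - 1 neighbours.
    """
    graph = {k: set(v) for k, v in graph.items()}  # work on a copy
    queue = [k for k, v in graph.items() if len(v) < n - 1]
    while queue:
        key = queue.pop()
        if key not in graph:
            continue  # already removed through another path
        for other in graph.pop(key):
            neighbours = graph.get(other)
            if neighbours is None:
                continue
            neighbours.discard(key)
            if len(neighbours) < n - 1:
                queue.append(other)
    return graph


# Toy example: a triangle A-B-C, with a chain D-E hanging off A.
web = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
print(sorted(prune_for_clique(web, 3)))  # ['A', 'B', 'C']
```

The pruned graph is what gets handed to the clique solver; on the real web of trust, most of the 40000 keys disappear at this stage.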

My first googling result pointed to Ashay Dharwadker’s solver implementation (which also proves P=NP ;). Googling further led me to the solver provided with the DIMACS benchmarks. It’s clearly not the state of the art, but it was enough in my case (it found the result almost immediately).

The biggest clique contains 47 keys. However, it looks like someone had fun, and injected a lot of bogus keys in the keyring. See the clique. So I ignored those keys, and re-ran the solver. And guess what’s the size of the biggest “real” clique? Yes. 42. Here are the winners:

CF3401A9 Elmar Hoffmann 
AF260AB1 Florian Zumbiehl 
454C864C Moritz Lapp 
E6AB2957 Tilman Koschnick 
A0ED982D Christian Brueffer 
5A35FD42 Christoph Ulrich Scholler 
514B3E7C Florian Ernst 
AB0CB8C0 Frank Mohr 
797EBFAB Enrico Zini 
A521F8B5 Manuel Zeise 
57E19B02 Thomas Glanzmann 
3096372C Michael Fladerer 
E63CD6D6 Daniel Hess 
A244C858 Torsten Marek 
82FB4EAD Timo Weingärtner
1EEF26F4 Christoph Ulrich Scholler 
AAE6022E Karlheinz Geyer 
EA2D2C41 Mattia Dongili 
FCC5040F Stephan Beyer 
6B79D401 Giunchedi Filippo 
74B11360 Frank Mohr 
94C09C7F Peter Palfrader
2274C4DA Andreas Priesz 
3B443922 Mathias Rachor 
C54BD798 Helmut Grohne 
9DE1EEB1 Marc Brockschmidt 
41CF0322 Christoph Reeg 
218D18D7 Robert Schiele 
0DCB0431 Daniel Hess 
B84EF12A Mathias Rachor 
FD6A8D9D Andreas Madsack 
67007C30 Bernd Paysan 
9978AF86 Christoph Probst 
BD8B050D Roland Rosenfeld 
E3DB4EA7 Christian Barth 
E263FCD4 Kurt Gramlich 
0E6D09CE Mathias Rachor 
2A623F72 Christoph Probst 
E05C21AF Sebastian Inacker 
5D64F870 Martin Zobel-Helas 
248AEB73 Rene Engelhard 
9C67CD96 Torsten Veller

It’s likely that this happened thanks to a very successful key signing party somewhere in Germany (judging by the email addresses). [Update: It was the LinuxTag 2005 KSP.] It might be a nice challenge to beat that clique at the next DebConf ;)

And the biggest clique I’m in contains 23 keys. Not too bad.

tool to mirror a website locally?

Dear lazyweb,

I need a tool to mirror a website locally (so I can browse it offline). Requirements:
– not GUI-based (I want to run it in a script)
– support recursive retrieval and include/exclude lists (like wget)
– no output when everything is fine, but still output errors (not possible with wget, which still outputs “basic information” when running with --no-verbose, and doesn’t output errors when running with --quiet)
– understands timestamps, and retransfers files if timestamps or sizes don’t match
– not too over-engineered, not too badly maintained, etc…

Thank you.

New Debian Developers!

We got a lot of (>= 10) new Debian developers recently. I’m really happy to see that the bottlenecks in the New Maintainer process were (at least partially) solved. My first NM (actually my second, my first one is on hold) also became a DD today.

So, how long does it take to become a DD? Let’s take 2 examples. Both are very active and skilled new contributors, who were probably quite close to being as fast as you can be through NM:

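    Name     |  Applied   | AM assigned | Approved by AM | Account created
-------------+------------+-------------+----------------+-----------------
 Chris Lamb  | 2008-05-01 | 2008-06-12  | 2008-07-22     | 2008-09-16
 Sandro Tosi | 2008-03-24 | 2008-05-06  | 2008-06-22     | 2008-09-16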

We have the proof: provided you have all the required skills, you can become a DD in less than 6 months!
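For the record, the durations behind that claim are easy to check from the dates in the table. A small sketch (the helper name is made up):

```python
from datetime import date

# (applied, account created) pairs, copied from the table above.
applications = {
    "Chris Lamb": (date(2008, 5, 1), date(2008, 9, 16)),
    "Sandro Tosi": (date(2008, 3, 24), date(2008, 9, 16)),
}


def days_in_nm(applied, account_created):
    """Number of days between applying and getting an account."""
    return (account_created - applied).days


for name, (applied, created) in applications.items():
    print(name, days_in_nm(applied, created), "days")
# Chris Lamb 138 days
# Sandro Tosi 176 days
```

Both are under six months, with the second one only just.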

Of course, some things are not perfect yet:

  • A lot of very good contributors are waiting for an AM, because not enough DDs volunteer to be AMs.
  • Some NMs still take too long to answer questions, tying up AMs who could probably be mentoring faster NMs. If your AM is waiting for you, feel guilty now!
  • Front Desk and DAM are still managed by a small set of very active (and very busy elsewhere) DDs. Many of the new DDs were FD-approved and DAM-approved by the same person, which is not so great if we want to keep this two-step check.

Is Mozilla the new XFree86? Could Ubuntu actually help?

All the recent moves of Mozilla make me feel that they are really taking the XFree86 path. Reading the Launchpad bug log about the EULA shows that most of the posters agree on who is on the wrong side, and favor switching to IceWeasel or Epiphany+Webkit.

Even if Mozilla is apparently going to back off on the EULA story, it looks like the harm was done. If they want to fix that, they will have to start listening to other players in the Free Software community. Or just watch Webkit eat their market share.

Since Ubuntu leaders are apparently talking to Mozilla about that, I really hope that they are aiming for a solution that will help the Free Software community as a whole, and are not looking for a work-around that will “fix” the problem for Ubuntu.

There has been a lot of noise about the lack of “giving back” to the community by Ubuntu. Using Ubuntu’s user base to weigh in and solve such issues in a way that benefits the whole community would probably be seen as a much more valuable contribution than another bunch of patches.