UDD and Buildstat

Ultimate Debian Database (UDD) is a GSoC project (now finished) that aimed to import all the different data sources we have in Debian into a single SQL database, to make it easy to combine that data. Currently, we import information about source and binary packages, all bugs (both archived and unarchived), lintian, carnivore, popcon, history of uploads, history of migrations to testing, and orphaned packages. The goal is really to have that data available, without a specific use case in mind: there will be lots of use cases.

Buildstat is a project by Gonéri Le Bouder that provides a framework for running QA tests (rebuilds and lintian currently, but buildstat is built in a very extensible way) on packages, using both packages in the archive and packages in the VCS repositories of teams. This is pretty cool: it allows teams to get an overview of the status of their packages, not using the archive as reference, but using their VCS. Buildstat schedules and runs the tests, stores the data in an SQL database, and lets you browse the data using a web interface. Buildstat also imports some data from other sources (only the BTS currently, using the LDAP dump) to display it on the web interface.

Since both projects are using an SQL DB, people have been asking why we don’t simply merge them. The big advantage would be that the data is synchronized: buildstat would display more up-to-date info about the data sources it doesn’t generate locally (like bugs), and UDD would get fresh buildstat data. We have been talking a lot with Gonéri, considering the different possibilities. But I don’t think it’s a good idea.

I think that both projects should try to do one thing, but do it very well, instead of trying to fix the world. UDD focuses on importing data that exists elsewhere. Sometimes it means doing some complex processing. But data should not be generated by UDD. Merging the projects would mean having a very big piece of software that does everything (or tries to do everything).

Both databases were designed differently, with different goals. UDD tries to stay close to the data it imports. There might be some inconsistencies in the data sources, but that’s fine: one of the goals of UDD is to make it easy to find them (and fix them), so we need them in the DB. In buildstat, since the goal of importing data is to display it on the web interface, with a strict use case, you can freely “simplify” data if it helps. Another big difference is that UDD is designed to be easy to use (i.e. to write and run queries) by a human user: UDD uses multi-column primary keys, while buildstat uses surrogate keys (integer “id” keys), which ORM tools usually require.
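To make the key difference concrete, here is a minimal sketch using Python's sqlite3 module. All table and column names below are hypothetical, invented for illustration: they are neither UDD's nor buildstat's real schema. It only shows that with natural keys a human can query by package name directly, while with surrogate keys the same question needs a join through an integer id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical "natural key" layout: rows are identified by real-world values.
CREATE TABLE udd_sources (
    package TEXT, version TEXT, suite TEXT,
    PRIMARY KEY (package, version, suite)
);
CREATE TABLE udd_bugs (id INTEGER PRIMARY KEY, package TEXT, severity TEXT);

-- Hypothetical "surrogate key" layout: every row gets an integer id.
CREATE TABLE bs_package (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE bs_bug (
    id INTEGER PRIMARY KEY,
    package_id INTEGER REFERENCES bs_package(id),
    severity TEXT
);
""")

# Invented sample data: one package with one bug, stored in both layouts.
conn.execute("INSERT INTO udd_sources VALUES ('hello', '2.2-2', 'sid')")
conn.execute("INSERT INTO udd_bugs VALUES (1001, 'hello', 'serious')")
conn.execute("INSERT INTO bs_package VALUES (1, 'hello')")
conn.execute("INSERT INTO bs_bug VALUES (1001, 1, 'serious')")

# Natural keys: filter on the package name, no join needed.
natural = conn.execute(
    "SELECT id FROM udd_bugs WHERE package = 'hello'"
).fetchall()

# Surrogate keys: the name lives in another table, so a join is required.
surrogate = conn.execute(
    "SELECT b.id FROM bs_bug b "
    "JOIN bs_package p ON p.id = b.package_id "
    "WHERE p.name = 'hello'"
).fetchall()
```

Both queries return the same bug; the second one is just less pleasant to type by hand, which matters when the main user is a person at a SQL prompt.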

There are also more technical concerns: currently UDD makes a compromise, for each data source, on what it imports: it tries to import the data that is useful, not all the data available. Merging buildstat and UDD would mean increasing the DB size significantly, by adding all the “private” data that buildstat needs. Another problem is the stable API problem: if buildstat and UDD are merged, it means that buildstat cannot change its DB schema without making sure that it wouldn’t break what UDD users are doing.

So, what should we do, from my POV, instead of merging?

– Continue to talk, and get Gonéri into the UDD “team”. He gathered a lot of experience working on buildstat, and he probably would be able to help a lot.

– Data that is not generated by buildstat (bugs data) should be imported from UDD. Doing SQL->SQL will probably make things easier there.

– A summary of buildstat’s data should be imported into UDD.

There’s also the issue of providing the data through a web interface, to the DDs, which buildstat tries to address partially. The Debian Developer Packages Overview (DDPO)’s main limitations are:

– lack of knowledge about VCS (which buildstat solves)

– lack of knowledge about complex organizations (you can’t get an arbitrary list of packages, or a list of packages maintained by teams not using a consistent Maintainer/Uploaders scheme, or a list of packages from tasks, etc)

– poor handling of large numbers of packages. In teams with lots of packages, it’s useful to have restricted views, such as “packages outdated compared to upstream”, “packages which have bugs/RC bugs”, “packages which are newer in the VCS than in the archive”. The perl team’s work on PET clearly shows the kind of things that are needed.

At this point, I think that DDPO would benefit from a full rewrite (using the existing code as a source of inspiration, and making sure that there are no regressions, of course).

Using UDD as the data source, it should be easy to get something done (even if UDD still lacks some of the data DDPO has currently). But it still requires web developer skills, which I don’t have. If you are interested, contact me!

tiling terminals manager

I tried terminator (thanks go to Nicolas Valcarcel for asking me to sponsor a Debian upload, thus forcing me to try it, and Asheesh Laroia for doing a lightning talk at debconf about it), but I’m not convinced.
– More keybindings are clearly missing. You can only switch terminals using Previous/Next keybindings.
– More features would be great, like the ability to switch the position of two terminals (so you could reorganize them).
– It has some small usability problems, like the fact that the config is text-based, not using gconf, that it’s not possible to change the config without restarting it, that the title bar doesn’t display anything useful most of the time, since it prefixes the current terminal’s title with “Terminator: ”, etc.

So, is there any other tiling terminal manager I should try, before filing tons of feature requests on terminator? My other requirement is that it mustn’t reinvent the wheel, but use the gnome-terminal widget.

Thank you.

Debian’s Freeze

Debian’s freeze sounds like a technical hack to address a social problem, and that disturbs me a bit.

The social problem is: At some point, we need everybody in Debian to make only non-disruptive changes, so everything can converge very fast into a releasable state.

The “solution” we are using is that we are blocking all packages from migrating to testing, and requiring manual review from someone on the release team. Consequences are:
– many people feel that you need to be very convincing to fix a small, non-RC bug, even if fixing that bug definitely increases your package’s quality.
– the release team is completely overwhelmed by unblock requests during the freeze
– many people just stop trying to fix things during the freeze (which definitely doesn’t improve Debian’s quality), both because they think it’s hard to get a fix in, and because they don’t want to bother the release team

I wonder if we really need such a strict policy. Are there other Free Software projects that use such a technical measure to prevent software from disrupting stable releases? I am under the impression that most other projects rely on social pressure instead of technical measures for that, except maybe during the last few hours before the release.

Couldn’t we act on the social level? We could default to allowing everyone’s packages to migrate to testing, and, when someone fucks up and uploads something that should not have been uploaded, block all his packages (switching to manual review mode) until the release. Of course, that requires the release team to make decisions about _people_, which is harder than making decisions about _packages_. But if the rules are clearly stated, couldn’t this work?

Code for Debian versions comparison?

Do you know of code that compares versions of Debian packages (for example, that knows whether 2:23.2.3~rc1-1 is lower than 2:23.2.3-2), besides dpkg --compare-versions? If yes, please write a comment to this blog post, preferably with a link to the code.

Also, did someone already write a test suite for that? Who would be interested in such a test suite?

I’m considering writing a function in PL/SQL to compare debian versions (for the Ultimate Debian Database project). If someone already wrote that, I’m interested as well.
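To show what such a function has to do, here is a rough Python sketch of the comparison rules as I understand them from the Debian Policy Manual: optional epoch compared as a number, then upstream version, then Debian revision, each compared as alternating non-digit and digit runs, with "~" sorting before everything (even the end of the string) and letters sorting before other characters. The function names are mine, it assumes well-formed ASCII version strings, and it is not a replacement for dpkg --compare-versions.

```python
def _order(ch):
    """Sort weight of one non-digit character."""
    if ch == "~":
        return -1            # tilde sorts before everything, even end of string
    if ch.isalpha():
        return ord(ch)       # letters sort before other characters
    return ord(ch) + 256     # '.', '+', '-', ':' and friends sort last


def _compare_part(a, b):
    """Compare two upstream-version or revision strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character (end of string counts as 0).
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            ac = _order(a[i]) if i < len(a) and not a[i].isdigit() else 0
            bc = _order(b[j]) if j < len(b) and not b[j].isdigit() else 0
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Digit run: compare numerically, an empty run counts as 0.
        start_i, start_j = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        na, nb = int(a[start_i:i] or "0"), int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def parse_version(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(v1, v2):
    """Return -1 if v1 < v2, 0 if they are equal, 1 if v1 > v2."""
    e1, u1, r1 = parse_version(v1)
    e2, u2, r2 = parse_version(v2)
    if e1 != e2:
        return -1 if e1 < e2 else 1
    return _compare_part(u1, u2) or _compare_part(r1, r2)


print(compare_versions("2:23.2.3~rc1-1", "2:23.2.3-2"))  # -1
```

A test suite would mostly be a long list of (version, version, expected result) triples fed to a function like compare_versions and to dpkg itself, checking that both agree.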

Of popular packages removed from testing, and the Ultimate Debian Database GSOC project

Some time ago, there was some flamewars^H^Hdebate about the Release Team’s removals of RC-buggy packages from testing. Basically, some people claimed that popular packages shouldn’t be removed, even if RC-buggy.

But, do we really miss popular packages in testing?

It’s difficult to know. You could get the popcon data, and compare it with the Packages files for testing and unstable. Or work with source packages (which removes a lot of noise), but then, you have to convert the popcon data (which uses binary package names) to source packages. Not completely trivial.
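Done by hand, the conversion could look roughly like the Python sketch below. It assumes you have already parsed the Packages/Sources files into a binary-to-source mapping and the popcon data into a binary-to-insts mapping (both are plain dicts here, filled with invented sample data), and it scores a source package by its most-installed binary, which is only one possible choice.

```python
def popcon_by_source(binary_to_source, binary_insts):
    """Score each source package by the insts of its most popular binary."""
    by_source = {}
    for binary, insts in binary_insts.items():
        source = binary_to_source.get(binary)
        if source is None:
            continue  # binary not built from any known source package
        by_source[source] = max(by_source.get(source, 0), insts)
    return by_source


def missing_from_testing(unstable_sources, testing_sources, by_source):
    """Source packages in unstable but not in testing, most popular first."""
    missing = set(unstable_sources) - set(testing_sources)
    scored = [(src, by_source.get(src, 0)) for src in missing]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Invented sample data, for illustration only.
binary_to_source = {"libfoo1": "foo", "foo-utils": "foo", "bar": "bar"}
binary_insts = {"libfoo1": 500, "foo-utils": 120, "bar": 900, "unknown": 5}

scores = popcon_by_source(binary_to_source, binary_insts)
report = missing_from_testing(["foo", "bar", "baz"], ["bar"], scores)
```

The real work is in producing the two input mappings for each suite, which is exactly the tedious part.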

That’s where the Ultimate Debian Database GSOC project comes to the rescue. The goal of Christian von Essen’s project is to gather data from various sources in Debian into a single SQL DB, so queries that combine all those data sources can easily be written.

For example, here is the query that lists the source packages that are in unstable, but not in testing, sorted by their popcon (using the number of insts of the most popular binary package of the source package as value for the source package):

-- Source packages in unstable (sid) but not in testing (lenny), by popcon insts
SELECT DISTINCT unstable.package, insts
FROM (SELECT DISTINCT package FROM sources
      WHERE distribution = 'debian' AND release = 'sid') AS unstable,
     popcon_src
WHERE unstable.package NOT IN (
    SELECT package FROM sources
    WHERE distribution = 'debian' AND release = 'lenny')
  AND popcon_src.source = unstable.package
ORDER BY insts DESC;

And the results are available on the web!

Top packages (> 1000 insts):

lzo	64962
gnome-cups-manager	32346
db4.6	20708
ffmpeg-debian	12908
freetype1	10569
flashplugin-nonfree	7116
perlftlib	6769
nvidia-graphics-drivers	3864
wxwindows2.4	3640
dvi2tty	2239
kdebase-runtime	1725
easytag	1717
g-wrap	1582
yaird	1507
slocate	1499
youtube-dl	1390
hugin	1275
w3c-libwww	1058

Interested in UDD? Join #debian-qa or debian-qa@lists.d.o (or talk to me @DebConf!)

Exporting logs from Suunto X6HR watches on Linux

I’m the happy owner of a nice geeky toy: a Suunto X6HR watch, which includes an altimeter and a heart rate monitor, and which I use mainly for mountain biking and hiking.

During outings, the watch can log the altitude and heart rate every 2, 10 or 60 seconds, and the data can be transferred to a PC using a serial interface. The problem is that Suunto only provides software for Windows. I got tired of using virtualbox to connect to the watch (qemu doesn’t work, Suunto Activity Manager apparently does strange things with the serial port), so I reverse-engineered the protocol (using skimanager and Jérome Kieffer’s work as a basis) and implemented a script to fetch the logs, and export them in a format suitable for gnuplot.

Of course, Suuntux is publicly available. I’d be happy to hear from you if it works for you too. Also, if you own a Suunto X6 (similar watch, without HRM), I’d be interested in supporting it too (if it’s not supported already).

Below is an example graph, from a short mountain bike ride just before leaving for Debconf.

example suuntux output

3G Internet access using mobile phone + laptop in France

In France, we only have 3 mobile network operators: Orange, SFR, and Bouygues Telecom. They usually discuss their rates together to make sure that none of them breaks the market by offering rates that are too attractive (that’s why it would be so cool if Free.fr could get a 3G license, and why the government can tell them “do what we want, or we will give Free a 3G license” – source).

I’ve long been interested in using my mobile phone with my laptop to access the Internet. With the 3 operators, you can get a contract for a 3G USB key, but that’s not really interesting: it’s expensive, and there’s a limit on volume of data, so you have to monitor your bandwidth. Also, I only need to use my mobile phone as modem a few times a month. So I’m not really interested in paying an expensive monthly fee for that.

What do French MNOs offer?

Most of them have rather interesting offers when you want to get unlimited internet access from your mobile, but that doesn’t cover the case of using your phone as a modem, to access the internet on another device (e.g. a laptop). I’m not sure how they can see the difference, but apparently they do. (It seems that there are hacks making use of those unlimited rates, using your phone as a router between the Wifi network (but your phone has to support Wifi) and the 3G network, using apps like WMWiFiRouter. Another hack is to use the same HTTP proxy on the laptop as on the phone, which seems to give unlimited HTTP access.)

But offers for using your phone as a modem are clearly less interesting (probably because they want to push their USB 3G cards offers):

Bouygues:
Apparently, you are billed only based on the volume (see page 46 of this document), with a progressive rate. 5 MB/month costs 9 EUR, 100 MB costs 20 EUR, and over 100 MB, you pay 1 EUR/MB (!).

SFR:
See pages 40-41 of this document. You pay 0.50 EUR for a 30-minute session, with 2 MB of data included. After those 2 MB of data, you pay 1 EUR/MB.

Those rates are really crazy. Using a mobile phone on an HSDPA network (called “3G+” in France), you can easily reach 1 Mbps (and I did see this rate while fetching files using rsync, so I was not particularly aiming at performance!). At 1 Mbps, the 1 EUR/MB rate translates to 7.5 EUR/min, or 12.5 cts/sec!
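For those who want to check the arithmetic, here it is as a few lines of Python. The only assumptions are decimal units (1 MB = 8 Mbit) and a link that stays saturated at 1 Mbps the whole time; the function name is mine.

```python
def cost_per_minute(rate_mbps, eur_per_mb):
    """EUR spent per minute when downloading flat out at rate_mbps."""
    mb_per_second = rate_mbps / 8          # 8 bits per byte
    return mb_per_second * 60 * eur_per_mb


per_minute = cost_per_minute(1.0, 1.0)     # 7.5 EUR/min at 1 EUR/MB
cents_per_second = per_minute / 60 * 100   # 12.5 cts/sec

# With SFR, the 2 MB included in a session last this many seconds at 1 Mbps:
seconds_until_overage = 2 / (1.0 / 8)      # 16 seconds
```

So at full speed, the data included in an SFR session is gone after 16 seconds, and everything after that is billed at the per-MB rate.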

Surprisingly, Orange (ex-France Telecom), which is usually not the most innovative MNO and ISP in France, saves us. With Orange, you pay 0.5 EUR for a 20-minute session, with unlimited volume (Orange calls these sessions “session multimédia”). I was so surprised that I called customer support to check. And after trying it yesterday, I was a bit anxious when checking my online bill this morning. But it works!

Important note before you try (from Orange website):
The “session multimédia” is a pricing scheme available to Orange mobile plan customers (excluding Orange plans for iPhone, Mobicarte and prepaid cards). For Classique, Intense, Pro, Click le forfait and Initial customers who subscribed or renewed their commitment before 14 June 2007, for capped-plan (“forfait bloqué”) customers who subscribed or renewed their commitment before 16 August 2007, and for customers on a BlackBerry offer, the subscriber’s express agreement to use the “session multimédia” is required. Other customers are billed by volume.

Ubuntu information on the Debian Package Tracking System and the Developer Packages Overview

Users of Debian derivatives sometimes report bugs that are not reported in the
Debian BTS, but that also affect Debian. It already happened a few times that
looking at the Ubuntu bugs for my packages allowed me to fix an unreported bug
in my Debian packages.

But it’s difficult to keep track of the status of our packages in Ubuntu, since
Launchpad doesn’t provide a per-Debian-maintainer summary. Since it’s always fun to abuse proprietary software, I fetched all the bug data from Launchpad and inserted it in an SQLite DB (takes about 30 mins at 1200 HTTP requests/minute — it would be so much easier if the Launchpad devs added a text export of all bugs).

The result is that there’s now an “Ubuntu” box on the Package Tracking System, giving the current version in Ubuntu, a link to the Ubuntu patch (if any), and the number of open bugs. An Ubuntu column has also been added to the Debian Developer Packages Overview by Christoph Berg, with the current version in Ubuntu and the number of open bugs. It’s hidden by default: click on Display Configuration to enable it (then it’s stored in a cookie).

I hope that this will help Debian maintainers to track what has been reported/fixed in Ubuntu. Also, if other Debian derivatives want to export the same kind of information, don’t hesitate to contact us.

See for example:

PS: the data might be slightly outdated, as it is processed on merkel.d.o, which was offline until recently. Expect it to be up-to-date in the next 24 hours.