Peter Eisentraut played with Ultimate Debian Database, and wanted to create a “maintenance effort” metric by multiplying each package’s installed size by its popcon. His query is:
SELECT rank() OVER (ORDER BY score DESC), source,
sum(installed_size::numeric * insts) AS score
FROM packages JOIN popcon USING (package)
WHERE distribution = 'debian' AND release = 'sid'
AND component = 'main' AND architecture IN ('all', 'i386')
GROUP BY source, version ORDER BY score DESC LIMIT 30;
Besides all the interesting things that I learnt by looking at his query (rank(), and a bug in UDD because installed_size should really be numeric to avoid the conversion), Peter had a problem with his query: linux-2.6 is missing from the results, while it should obviously have a large popcon and a large install size.
The problem is that the binary packages for linux-2.6 often change, so they don’t get very high in popcon. The unstable kernel package gets a ridiculous popcon score:
select package, insts from popcon
where package in (select package from packages where source ='linux-2.6' and release='sid')
order by insts desc limit 30;
package | insts ----------------------------------+------- linux-libc-dev | 38703 linux-source-2.6.29 | 614 linux-headers-2.6.29-2-common | 256 linux-image-2.6.29-2-amd64 | 239
A solution could be to change the metric to be: MAX(insts over all binary packages from this source package) * SUM(installed_size)
The good thing is that UDD already offers a popcon_src view, that gives the popcon score for a source package. So the query becomes:
SELECT rank() OVER (ORDER BY score DESC), source,
sum(installed_size::numeric * insts) AS score
FROM packages JOIN popcon_src USING (source)
WHERE distribution = 'debian' AND release = 'sid'
AND component = 'main' AND architecture IN ('all', 'i386')
GROUP BY source ORDER BY score DESC LIMIT 30;
rank | source | score ------+---------------+------------- 1 | openoffice.org | 92177633504 2 | qt4-x11 | 18503941620 3 | linux-2.6 | 16036201020 4 | gcc-4.3 | 14369300376 5 | mesa | 12962475968 6 | eglibc | 12581290240 7 | gcc-4.4 | 11411296672 8 | samba | 10021083072 9 | xulrunner | 9037295424 10 | mysql-dfsg-5.0 | 8348333532
This time, linux-2.6 shows up near the top of the list.