{"id":344,"date":"2009-05-13T23:29:25","date_gmt":"2009-05-13T21:29:25","guid":{"rendered":"http:\/\/www.lucas-nussbaum.net\/blog\/?p=344"},"modified":"2009-05-13T23:29:25","modified_gmt":"2009-05-13T21:29:25","slug":"udd-and-packages-metrics","status":"publish","type":"post","link":"https:\/\/www.lucas-nussbaum.net\/blog\/?p=344","title":{"rendered":"UDD and packages metrics"},"content":{"rendered":"<p><a href=\"http:\/\/petereisentraut.blogspot.com\/2009\/05\/big-shots.html\">Peter Eisentraut played with Ultimate Debian Database<\/a>, and wanted to create a &#8220;maintenance effort&#8221; metric by multiplying each package&#8217;s installed size by its popcon. His query is:<br \/>\n<code>SELECT rank() OVER (ORDER BY score DESC), source,<br \/>\nsum(installed_size::numeric * insts) AS score<br \/>\nFROM packages JOIN popcon USING (package)<br \/>\nWHERE distribution = 'debian' AND release = 'sid'<br \/>\nAND component = 'main' AND architecture IN ('all', 'i386')<br \/>\nGROUP BY source, version ORDER BY score DESC LIMIT 30;<\/code><\/p>\n<p>Besides all the interesting things that I learnt by looking at his query (rank(), and a bug in UDD because installed_size should really be numeric to avoid the conversion), Peter had a problem with his query: linux-2.6 is missing from the results, while it should obviously have a large popcon and a large install size.<\/p>\n<p>The problem is that the binary packages for linux-2.6 often change, so they don&#8217;t get very high in popcon. The unstable kernel package gets a ridiculous popcon score:<\/p>\n<p><code>select package, insts from popcon<br \/>\nwhere package in (select package from packages where source ='linux-2.6' and release='sid')<br \/>\norder by insts desc limit 30;<\/code><\/p>\n<pre>package                           | insts\r\n----------------------------------+-------\r\nlinux-libc-dev                    | 38703\r\nlinux-source-2.6.29               |   614\r\nlinux-headers-2.6.29-2-common     |   256\r\nlinux-image-2.6.29-2-amd64        |   239<\/pre>\n<p>A solution could be to change the metric to be: MAX(insts over all binary packages from this source package) * SUM(installed_size)<br \/>\nThe good thing is that UDD already offers a popcon_src view, that gives the popcon score for a source package. So the query becomes:<br \/>\n<code>SELECT rank() OVER (ORDER BY score DESC), source,<br \/>\nsum(installed_size::numeric * insts) AS score<br \/>\nFROM packages JOIN popcon_src USING (source)<br \/>\nWHERE distribution = 'debian' AND release = 'sid'<br \/>\nAND component = 'main' AND architecture IN ('all', 'i386')<br \/>\nGROUP BY source ORDER BY score DESC LIMIT 30;<\/code><\/p>\n<pre> rank |     source    |    score\r\n------+---------------+-------------\r\n1 | openoffice.org    | 92177633504\r\n2 | qt4-x11           | 18503941620\r\n3 | linux-2.6         | 16036201020\r\n4 | gcc-4.3           | 14369300376\r\n5 | mesa              | 12962475968\r\n6 | eglibc            | 12581290240\r\n7 | gcc-4.4           | 11411296672\r\n8 | samba             | 10021083072\r\n9 | xulrunner         |  9037295424\r\n10 | mysql-dfsg-5.0   |  8348333532<\/pre>\n<p>This time, linux-2.6 shows up near the top of the list.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Peter Eisentraut played with Ultimate Debian Database, and wanted to create a &#8220;maintenance effort&#8221; metric by multiplying each package&#8217;s installed size by its popcon. His query is: SELECT rank() OVER (ORDER BY score DESC), source, sum(installed_size::numeric * insts) AS score FROM packages JOIN popcon USING (package) WHERE distribution = &#8216;debian&#8217; AND release = &#8216;sid&#8217; AND [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"0","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[],"class_list":["post-344","post","type-post","status-publish","format-standard","hentry","category-planetdebian"],"_links":{"self":[{"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/344","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=344"}],"version-history":[{"count":2,"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/344\/revisions"}],"predecessor-version":[{"id":346,"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/344\/revisions\/346"}],"wp:attachment":[{"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=344"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=344"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lucas-nussbaum.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=344"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}