16 January 2007 10:00

A cute introduction to Debtags

The Debian archive keeps growing, and the software in it is becoming more diverse and complex. Organising software in the archive is difficult, and the existing section system, designed to cope with a much smaller number of packages, is no longer sufficient. The goal of Debtags is to provide a working alternative for categorising software that can cope with our numbers.

The core idea of Debtags is to adapt the technique of Faceted Classification to our packages. Faceted Classification is a 70-year-old library science technique that is being rediscovered and embraced by modern Information Architects.

Debtags attaches categories (we call them tags) to packages, creating a new set of useful structured metadata that can be used to implement more advanced ways of presenting, searching, maintaining and navigating the package archive. Example uses of Debtags include searching for software, browsing the archive, and filtering out unwanted groups of packages.

The Debtags effort needs to face three major problems:

1. Creating a suitable vocabulary of categories.
2. Categorising the vast array of packages.
3. Having applications make use of Debtags data.

All three issues are being actively addressed with good results:

* Debtags has already acquired a large set of tags, although the set is in continuous need of refinement;
* a large part of our package archive has been at least partially categorised, and there is a tool called debtags-edit that every developer and user can use to categorise the packages they know best;
* a new library called libapt-front is being developed as a smart front-end to libapt which can also access other data sources, such as Debtags, popularity contest (popcon) results, debram metadata and more.

This paper gives a broad technical overview of the Debtags project, its theoretical foundations, and the tools currently available for it. It also offers some practical tutorials on how to do all sorts of nice Debtags tricks.
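To make the idea of tags as structured metadata a little more concrete, here is a minimal Python sketch of searching and filtering packages by tag. It is only an illustration: the package-to-tag assignments are made-up sample data, the function names (`parse_tags`, `search`, `facets`) are hypothetical, and it does not use libapt-front or any other real Debtags interface. It assumes tags are written as facet::value and listed per package on one line.

```python
from collections import defaultdict

# Illustrative sample data only: "package: tag, tag, ..." lines, with tags
# written as facet::value. These assignments are assumptions for the example,
# not real Debtags data.
SAMPLE = """\
vim: role::program, interface::text-mode, use::editing
gedit: role::program, interface::x11, use::editing
mutt: role::program, interface::text-mode, works-with::mail
"""


def parse_tags(text):
    """Map each package name to its set of tags."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first ':' only, so the '::' inside tags is preserved.
        name, _, tags = line.partition(":")
        packages[name.strip()] = {t.strip() for t in tags.split(",") if t.strip()}
    return packages


def search(packages, wanted=(), unwanted=()):
    """Return sorted names of packages with all wanted tags and no unwanted ones."""
    wanted, unwanted = set(wanted), set(unwanted)
    return sorted(
        name
        for name, tags in packages.items()
        if wanted <= tags and not (unwanted & tags)
    )


def facets(packages):
    """Group the tag values in use by facet (the part before '::')."""
    by_facet = defaultdict(set)
    for tags in packages.values():
        for tag in tags:
            facet, _, value = tag.partition("::")
            by_facet[facet].add(value)
    return dict(by_facet)


if __name__ == "__main__":
    pkgs = parse_tags(SAMPLE)
    # Searching: every package tagged as an editing tool.
    print(search(pkgs, wanted=["use::editing"]))
    # Filtering: the same search, with X11 programs filtered out.
    print(search(pkgs, wanted=["use::editing"], unwanted=["interface::x11"]))
    # Browsing: the facets and values present in this small sample.
    print(facets(pkgs))
```

Because each facet describes an independent aspect of a package (its interface, its purpose, what it works with), a search can combine tags from different facets freely, which is what a single flat section per package cannot offer.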