About Tyrannioware

Tyrannioware is named after Atticus' library organizer, Tyrannio. Marcus Tullius Cicero says in his Letters to Atticus: "You will find that Tyrannio has made a wonderful job of arranging my books."¹ And later: "Now that Tyrannio has put my books straight, my house seems to have woken to life. Your Dionysius and Menophilus have worked wonders over that. Those shelves of yours are the last word in elegance, now that the labels have brightened up the volumes."² The name is especially appropriate because the library supports the work of my two very similar-looking cats, Tully and Cicero.

At first I thought the software should have some Borgesian name, but then I realized I'd probably end up cataloging books consisting of nothing but "M C V" repeated endlessly, according to the code of the Bibliographic Institute of Brussels (see The Library of Babel and The Analytical Language of John Wilkins). So far, I've used the package to catalog 2222 bibliographic entries (some of which are for multivolume works).

Tyrannioware (written in Python) retrieves MARC data via Z39.50 from a host (I've tested against the Library of Congress and the National Library of Canada/Bibliothèque Nationale du Canada), parses it, stores it in a PostgreSQL database, and makes it available via the Lucien front end, a set of CGIs (tested with Apache). (MARC is a standard for bibliographic data, with additional information not usually available in online displays. This information varies in utility from indicating leading articles in titles, which enables correct sorting without language-dependent tables, to indicating whether a work is a festschrift. There's the usual metadata tradeoff of richness against speed of description and accuracy.) Tyrannio uses my PyZ3950 toolkit, which you'll need to install first. You can download the latest Tyrannio tarball (currently 0.8) here.
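To make the leading-article point concrete: in MARC, the second indicator of the title field (245) gives the number of "nonfiling" characters to skip when sorting. The following is a minimal sketch in modern Python, not Tyrannio's actual code; the function name and the sample list are made up for illustration, and the indicator values are supplied by hand rather than read from real records.

```python
def title_sort_key(title, nonfiling_chars):
    """Return a case-insensitive sort key that skips the leading article.

    nonfiling_chars is the count a cataloger records in the second
    indicator of MARC field 245 (0 when there is no article to skip).
    """
    return title[nonfiling_chars:].casefold()


# (title, nonfiling character count), typed in by hand for this example
titles = [
    ("The Library of Babel", 4),   # skip "The "
    ("Der Zauberberg", 4),         # skip "Der "
    ("A Scanner Darkly", 2),       # skip "A "
    ("Ficciones", 0),              # nothing to skip
]

ordered = sorted(titles, key=lambda entry: title_sort_key(*entry))
for title, _ in ordered:
    print(title)
```

Because the count travels with the record, the sort needs no per-language list of articles: "Der" is skipped just as "The" is.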

Using Tyrannioware

To use, run tyrannio.py. When you see "Query: ", you can either scan the Bookland-format barcode (currently only the :CueCat scanner is supported) or type a CCL query. The Bookland barcode is recognizable because it begins with 978. It's usually on the back of the book, but if the barcode on the back is a UPC, sometimes there's a Bookland barcode on the inside front cover.

Quickly and oversimplifiedly: CCL queries are of the form <qualifier>=<value>, where supported qualifiers are "AU" for author, "TI" for title, "ISBN" for ISBN, and "LCCN" for Library of Congress Control/Card Number. <value> should be quoted unless the query is unambiguous otherwise. CCL queries can also be combined with AND or OR, e.g. ti="Life of Python" and au="Perry, George", or ti=Life of Python and au=Perry, George and isbn=0316700150. ISBNs should be formatted without hyphens. LCCNs should be typed without hyphens, and the part after the hyphen 0-padded on the left: e.g. 89-45711 should be entered as 89045711.

You'll see a list of bibliographic records, with "Record #", and comparatively raw data for MARC titles, edition statement, publisher info, physical description, and ISBN. If one matches, type the record #. If it's close enough (that is, another edition), prefixing the record # with "x" will import the record, deleting the edition/publisher information. If none match, type "0" (and possibly do another search). If more records are retrieved than can be displayed, type "n" to go forward and "p" to go back.

(In the future, I plan to add more sophisticated editing capabilities, since it would be useful to edit publisher/edition information, fill in physical description for CIP-cataloged books, harmonize practice among various sources, and edit the original cataloging to correct errors, especially because of MARC's richness, as alluded to above. See Arlene Taylor's Cataloging with Copy for a fuller discussion.)
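The LCCN and barcode conventions above are mechanical enough to sketch in a few lines. This is an illustration in modern Python, not what tyrannio.py does internally: both function names are hypothetical, the six-digit padding width is inferred from the 89-45711 example, and the barcode function assumes it is handed an already-decoded 13-digit string (it does not deal with the scanner's raw output or verify the barcode's own check digit).

```python
def normalize_lccn(lccn):
    """Turn a hyphenated LCCN such as '89-45711' into '89045711'.

    The part after the hyphen is zero-padded on the left to six digits
    (an assumption based on the example in the text).
    """
    if "-" not in lccn:
        return lccn
    prefix, serial = lccn.split("-", 1)
    return prefix + serial.zfill(6)


def bookland_to_isbn10(ean):
    """Convert a 13-digit Bookland barcode (prefix 978) to a 10-digit ISBN."""
    if len(ean) != 13 or not ean.isdigit() or not ean.startswith("978"):
        raise ValueError("not a 978-prefixed 13-digit Bookland code: %r" % ean)
    body = ean[3:12]  # drop the 978 prefix and the barcode's check digit
    # ISBN-10 check digit: weights 10 down to 2 over the nine body digits
    total = sum((10 - i) * int(digit) for i, digit in enumerate(body))
    check = (11 - total % 11) % 11
    return body + ("X" if check == 10 else str(check))


print(normalize_lccn("89-45711"))
print(bookland_to_isbn10("9780316700153"))
```

The second call uses the barcode form of the ISBN from the query example above, so its result could be pasted straight into an isbn= query.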

There's also an experimental GUI editor, which you can run as ./guiedit.py (yeah, an awful name), which should be documented in a future release, and requires pygtk. Here's a screenshot.

Once the record is imported, Tyrannio will print the call number(s) LC has assigned. (In general, I file my books under the first call number, unless it's a PZ call number, in which case I prefer the appropriate alternate class.) At this point I copy the call number onto a little label and tape it onto the book. It would be possible to use a printer to print the labels out, keeping track of which positions had already been used, but it's not worth it for my typical book-buying habits, and most of my books had already been manually LC-classified. Note: many of the labels I had used at first fell off the books after a year or so, leaving some residue behind. You may wish to be careful.

When done with a cataloging session, you can just hit return to exit.

Files used: the records selected are placed in the libcode_acc directory, named as <LC internal control>{.marc|.isbn|.edit}. The .marc files contain the MARC data; the .isbn files contain the ISBN, if the search was by ISBN; and the .edit files contain instructions for removing edition data, if necessary. WARNING: I consider the MARC data to be primary. Future versions will, in all probability, have different database schemas, and you'll need to re-import the MARC data into the new database. (Re-importing isn't particularly difficult: all this means is that any editing should be of the MARC data, not the database, and that the seqnos and bibnos which appear in URLs aren't permanent identifiers.) The MARC data is currently standard MARC (earlier versions used a lossless transformation which is a little easier to read and edit manually, but I don't yet provide a conversion utility).
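Since the files in libcode_acc are the primary copy, it can be handy to see what is there before a re-import. The sketch below groups the files by control number using only the naming scheme described above; it is an illustration in modern Python, the function names are hypothetical, and it does not read or interpret the file contents.

```python
from collections import defaultdict
from pathlib import Path

KNOWN_SUFFIXES = (".marc", ".isbn", ".edit")


def accession_files(directory="libcode_acc"):
    """Map each control number to {suffix: path} for the files found."""
    groups = defaultdict(dict)
    for path in Path(directory).iterdir():
        if path.suffix in KNOWN_SUFFIXES:
            groups[path.stem][path.suffix] = path
    return dict(groups)


def records_with_edits(groups):
    """Control numbers that also have an .edit file (edition data removed)."""
    return sorted(key for key, files in groups.items() if ".edit" in files)


if __name__ == "__main__" and Path("libcode_acc").is_dir():
    found = accession_files()
    print(len(found), "records,", len(records_with_edits(found)), "with edits")
```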

Using Lucien

With any luck, Lucien should be self-explanatory. The initial screen (index.py, by default installed in cgi-bin/lucien/index.py) allows you to search by title, personal author, corporate author (no discorporate authors, alas), meeting author, call number, series, subject, publisher, physical description, notes, ISBN, LC Control Number, or electronic link text (for example, tables of contents).

The ordering imposed on call numbers is a little nonintuitive at first: valid call numbers come first, in the standard order, followed by invalid ones in alphabetical order. So, for example, a search for "IN" will pick up valid call numbers subsequent to "IN" (in my case, starting with "JC"), but a search for "IN " will pick up "IN PROGRESS". The ordering for Physical Description searches is by number of pages. All other orderings are just case-insensitive orderings (ignoring non-collating articles in titles and series titles). Lucien also reports various statistics, including a list of the most frequent personal authors and a breakdown by LC call number. You can access these by invoking "stats.py?stat=0"³ and "stats.py?stat=1", respectively. I've put together a de-dynamicized (since I don't have access to web hosting supporting Python CGIs and PostgreSQL) version of Lucien's output for a search by title in the cookbook section of my library. (Currently, many of the searches are a little slow: I'm trying to figure out how to persuade PostgreSQL to use the indices I've defined.)
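The "valid first, then invalid alphabetically" rule can be expressed as a two-part sort key. The sketch below is a simplification in modern Python and not Lucien's implementation: the regular expression is only an assumed approximation of what counts as a valid LC call number (one to three class letters followed by a class number), everything after the class number is compared as plain text, and the sample call numbers are made up.

```python
import re

# Assumed shape of a valid call number: class letters, class number, the rest.
LC_PATTERN = re.compile(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)\s*(.*)$")


def call_number_key(call_number):
    """Sort key: valid call numbers first, invalid ones after, alphabetically."""
    text = call_number.strip().upper()
    match = LC_PATTERN.match(text)
    if match:
        letters, number, rest = match.groups()
        # The class number compares numerically, so PS374 sorts before PS3554.
        return (0, letters, float(number), rest)
    return (1, text)


shelf = ["IN PROGRESS", "PS3554.I3 U2", "JC571 .B6", "QA76.73.P98 L8", "PS374.S35 D4"]
for call_number in sorted(shelf, key=call_number_key):
    print(call_number)
```

The leading 0 or 1 in the key is what keeps a placeholder such as "IN PROGRESS" after every valid call number, even though it would fall between "H" and "J" alphabetically.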

Lucien accesses the underlying database directly, instead of using Z39.50. This is somewhat less open, but saves me the work of interfacing a Z39.50 server to my database, and allows it to be stateless. (I've always been vaguely annoyed by timeouts on session-oriented library interfaces.)

All the HTML-generating and parameter-extracting code is in a single module (merv.py), so a non-web-based user interface should be relatively easy to implement.

Installation instructions

Once you have all the supporting software (Python 2.1 and PostgreSQL 7.1 or later), type make install as a user authorized to write to the cgi-bin directory. Type make dbinstall as a postgres user authorized to create other users. Make sure that the pgsql libraries are accessible from the Apache environment (I stuck SetEnv LD_LIBRARY_PATH /usr/local/pgsql/lib in my httpd.conf). By default, Lucien is installed to /usr/local/apache/cgi-bin/lucien. If you want a different location, override the makefile.

Security

There is no Z39.50 encryption: be aware that you're theoretically exposing the fact of doing a lookup on a particular book. If your system is ever connected to the Internet while Apache is running and you're at all concerned about who might access your collection, you probably want to use Apache's security system to limit access. If you allow remote users to access your library and don't require https, remember that data they retrieve is transmitted in the clear.

Related Work

In general, you might be interested in oss4lib, a site devoted to Open Source and libraries. Specific programs (this list may be out of date, since I haven't updated it in a while):

The first four have much simpler data models than I find useful (for example, a single "subject" field per-book instead of, in Tyrannio, a subject authority table, a bibliographic entity table, and a many-to-many mapping between the two). Pytheas preserves the MARC structure in the database, which makes writing correct importers easier and code which works with the data a little harder. ReaderWare is a shareware program which used to have some of the limitations of the first four, but has been upgraded to be much more useful, and, according to its web pages, has a much spiffier user interface than Tyrannio.
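To show what the three-table arrangement buys over a single "subject" field, here is a toy version using Python's built-in sqlite3 module as a stand-in for the real database. The table and column names, the sample titles, and the headings are all hypothetical; this is not Tyrannio's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bib (bibno INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE subject (subjno INTEGER PRIMARY KEY, heading TEXT NOT NULL UNIQUE);
-- many-to-many mapping between bibliographic entities and subject headings
CREATE TABLE bib_subject (
    bibno INTEGER NOT NULL REFERENCES bib(bibno),
    subjno INTEGER NOT NULL REFERENCES subject(subjno),
    PRIMARY KEY (bibno, subjno)
);
""")

# Made-up sample rows: one book carries two headings, one heading covers two books.
conn.executemany("INSERT INTO bib VALUES (?, ?)",
                 [(1, "Book A"), (2, "Book B")])
conn.executemany("INSERT INTO subject VALUES (?, ?)",
                 [(1, "Cookery"), (2, "Science fiction")])
conn.executemany("INSERT INTO bib_subject VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])


def titles_for_subject(connection, heading):
    """Return the titles filed under one authority heading, in title order."""
    rows = connection.execute(
        "SELECT b.title FROM bib b "
        "JOIN bib_subject bs ON bs.bibno = b.bibno "
        "JOIN subject s ON s.subjno = bs.subjno "
        "WHERE s.heading = ? ORDER BY b.title",
        (heading,),
    )
    return [row[0] for row in rows]


print(titles_for_subject(conn, "Science fiction"))
```

Each heading is stored once in the authority table, so correcting a heading touches one row no matter how many books use it.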

There are also a number of programs more focused on providing scholarly bibliographies, such as BookWare, EndNote, or BibTeX (and allied programs using the BibTeX format). (The first two programs are commercial, and neither is available for Linux.)

Licensing

Tyrannioware is licensed under the GPL, except for the z3950 component, which is licensed under the X license.

Misc

I'd like to thank Marty Busse and Eric Fischer for random encouragement, Bill Maddex for specifically enabling my book habit, and the Seminary Coop and Powell's Books of Chicago, among others, for helping to create the need for this program. If you have any bug reports, suggestions, or whatever, send me, Aaron Lav, email.

Some book distributors sell MARC data to libraries along with the books. It'd be spiffy if retail bookstores could also do so, either by emailing you the data or by providing a URL (perhaps keyed to something on the receipt to discourage free riding).

Footnotes

1. "offendes dissignationem Tyrannionis mirificam librorum meorum ....", Letter IV.4a (SB 78), Antium, ca. 20 June 56. All text from the Loeb 1999 edition, edited by Shackleton Bailey. I am eliding the footnotes about textual variants.

2. "postea vero quam Tyrannio mihi libros disposuit, mens addita videtur meis aedibus. qua quidem in re mirifica opera Dionysi et Menophili tui fuit. nihil venustius quam illa tua pegmata, postquam sittybae libros illustrarunt." Letter IV.8 (SB 79), Antium, shortly after letter IV.4a.

3. The top 9 authors being Philip K. Dick, Robert Silverberg, Stanislaw Lem, Samuel R. Delany, a two-way tie between Tom Stoppard and Salman Rushdie, and then a three-way tie between Raymond Chandler, Roger Zelazny, and Neil Gaiman (note that Sandman is cataloged as only one 10-volume work).

Aaron's home page