Libraries are sneaky, crafty places.  If you walk into one, things may never look the same when you walk out.

Libraries are dangerous places.  If you open your mind in one, you may be forever changed.

And, more mundanely, university libraries are places that employ a lot of work-study students.  I was one of them at Ganser Library at Millersville University.  Although I’ve always been a bookish lad, when I started as a reference shelver at Ganser I wasn’t thinking of the job as anything more than a way to pay the rent while I pursued a degree in mathematics.  And, of course, there were decided limits to how much fascination I found in filing updated pages in a set of the loose-leaf CCH tax codes.  While some of the cases I skimmed were interesting, I can safely say that a career in tax accountancy was not in my future, either then or now.

Did I mention that libraries are crafty?  Naturally, much of the blame for that attaches to the librarians. As time passed, I ended up working in just about every department of the library, from circulation to serials to systems, as if there were a plot to have me learn to love every nook and cranny of that building and the folks who made it live.  By the time I graduated, math degree in hand, I had accepted a job with an ILS vendor, directly on the strength of the work I had done to help the library migrate to the (at the time) hot new ILS.

While writing this post, it has hit me hard how great a debt of gratitude I owe to my mentors at Ganser.  To name some of them, Scott Anderson, Krista Higham, Barbara Hunsberger, Sally Levit, Marilyn Parrish, Elaine Pease, Leo Shelley, Marjorie Warmkessel, and David Zubatsky have each taught me much, professionally and personally.  To be counted among them as a member of the library profession is an honor.

Today I have an opportunity to toot my horn a bit, having been named one of the “Movers and Shakers” this year by Library Journal.  I am grateful for the recognition, as well as the opportunity to sneak a penguin into the pages of LJ.

Original image by Larry Ewing
Why a penguin? In part, simply because that’s how my whimsy runs. But there’s also a serious side to my choice, and I’m happy that the photographer and editors ran with it. Tux the penguin is a symbol of the open source Linux project, and moreover is a symbol that the Linux community rallies behind. Why have I emphasized community? Because it’s the strength of the library open source communities, particularly those of the Koha and Evergreen projects, that inspires me from day to day. Not that it’s all sunshine and kittens: any strong community will have its share of disappointments and conflicts. However, I deeply believe that open source software is a necessary part of librarians (I use that term broadly) building their own tools with which to share knowledge (and I use that term very broadly) with the wider communities we serve.

The recognition that LJ has given me for my work for Koha and Evergreen is very flattering, but for me it is at heart an opportunity to reflect, and to thank the many friends and mentors in libraryland I have met over the years.

Thanks, and may the work we share ever continue.

I wasn’t one of the people viscerally affected by Google’s announcement of the forthcoming shutdown of Google Reader, since so far I’ve relied on a combination of standalone RSS clients and antediluvian hit-the-refresh-button-repeatedly habits.  I am dinosaur: hear me roar!

However, the announcement prompted me to take another look at RSS readers.  Nowadays, I rotate among my PC, phone, tablet, and laptop frequently, so going back to a purely standalone reader like NetNewsWire wasn’t appealing.  The online services like NewsBlur would fit my needs better than a standalone reader (and I’m willing, even happy, to pay for the hope of longevity), but suffer from two disadvantages.  They’re not open source, and as a consequence, it’s hard to dig through their guts.  And I like digging!

I first found out about Tiny Tiny RSS from one of Ed Corrado’s tweets, and like a lot of people, visited the website and saw… a whole lotta nothing at first.  Google really ought to give their open source competitors a little more warning of service cancellation announcements so that they can beef up their web hosting in advance!

Fortunately, I persevered and installed it.  Tiny Tiny RSS, which is primarily written and maintained by Andrew Dolgov, is a web-based feed reader that you can install on your web server.  It’s licensed under version 2 of the GPL.  Installation is very simple if you have a VPS, and it looks fairly easy on shared hosting as long as the host provides MySQL or PostgreSQL databases.  Once you have it running, you can easily import an OPML file or manually subscribe to feeds.  The web interface for reading and managing feeds is clean and responsive. It also has an API and there are at least two Android clients. I couldn’t find an iOS client, but I suspect that somebody will scratch that itch soon.

So far, I’m quite happy with it.  Thanks, Ed!  Thanks, Andrew and all of the other contributors!

In his column in American Libraries today, Will Manley makes a good point that librarians should think twice about agreeing to projects that — no matter how useful — don’t add to the library’s mission. In fact, librarians can even say “no” every now and again. Unfortunately, I found that the column has a few too many cheap shots, detracting from Manley’s message.

Manley’s target? A proposal floated by the U.S. Postal Service to offer retail postal services via partner libraries. It’s understandable that the idea should raise eyebrows among librarians. After all, the IRS program to distribute tax forms through libraries has been a perfect example of an unfunded federal mandate from the point of view of libraries that find themselves turning into ad hoc tax advice services every spring. (And as far as I know, nobody’s offering a joint MLS/tax accountancy degree.) While providing tax forms is a useful service, it’s not clear that it’s one that libraries need to be involved in, or that being involved furthers library aims.

Where Manley goes too far is in a series of lazy clichés about the USPS:

After going billions of dollars into debt and being almost aced out of business by the double whammy of email and private-sector carriers that actually deliver your letters and packages on time and in good condition, the USPS is finally thinking outside of the post office box: The agency has hatched the concept of putting post office kiosks in libraries.

Aced out of business by private competition? There’s no doubt that the environment has drastically changed for the USPS, but it doesn’t follow that the shift from letters to email has made it a dinosaur. A (to say the least) challenging oversight structure and uniquely onerous pension funding requirements imposed on the USPS by Congress have handicapped its ability to react. The USPS covers more territory at cheaper rates than postal systems in many other countries.  Also, it covers rural areas that private firms either would not serve at all or only at exorbitant rates.

Suffice it to say, I generally like the USPS — a stint living in Alaska tends to do that to one. The USPS also has a mandate that is very consonant with library values: universal service.

Of course, whether or not the USPS is fairly treated by Manley doesn’t speak to whether a library should agree to start selling stamps and collecting mail. It’s certainly a stretch from traditional services. But a little digging turned up a big difference from the IRS program: it’s not an unfunded mandate. The “Village Post Office” program, as it’s called, does offer compensation to the small businesses (and libraries!) that operate them. For a struggling library in a rural community whose post office has recently closed or reduced hours, starting a VPO could be a net gain.

Indeed, librarians should know how to say “no”. But they also should know to do their due diligence before deciding.

Both Koha and Evergreen use memcached to cache user sessions and data that would be expensive to continually fetch and refetch from the database. For example, Koha uses memcached to cache MARC frameworks, while Evergreen caches search results, bibliographic added content, search suggestions, and other data.

Even though the data that gets cached is transitory, at times it can be useful to look at it. For example, you may need to check to see if some stale data is present in the cache, or you may want to capture some statistics about user sessions that would otherwise be lost when the cache expires.

The libMemcached library includes several command-line utilities for interrogating a memcached server. We’ll look at memcdump and memccat.

memcdump prints a list of keys that are (or were, since the data may have expired) stored in a memcached server. Here’s an example of the sorts of keys you might see in an Evergreen system:

memcdump --servers 127.0.0.1:11211
oils_AS_21a5dc5cd2aa42ee7c0ecc239dcb25b5
ac.toc.html.0531301990
open-ils.search_9fd0c6c3553e6979fc63aa634a78b362_facets
open-ils.search_9fd0c6c3553e6979fc63aa634a78b362
oils_auth_8682b1017b7b27035576fecbfc7715c4

The --servers 127.0.0.1:11211 bit tells memcdump to check memcached running on the local server.
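On a busy server that list can run to thousands of lines, so it can help to group keys by family before reading them. Here is a minimal Python sketch that does so for text in the format memcdump prints. The function names and the rule for spotting the variable part of a key (a 32-character hex digest or a trailing run of digits) are my own assumptions based on the sample above, not anything defined by Evergreen or libMemcached.

```python
import re
from collections import Counter

# Sample keys copied from the memcdump output above.
SAMPLE_KEYS = """\
oils_AS_21a5dc5cd2aa42ee7c0ecc239dcb25b5
ac.toc.html.0531301990
open-ils.search_9fd0c6c3553e6979fc63aa634a78b362_facets
open-ils.search_9fd0c6c3553e6979fc63aa634a78b362
oils_auth_8682b1017b7b27035576fecbfc7715c4
"""

# Assumption: the variable part of a key is either a 32-character hex digest
# or a run of digits at the very end of the key.
HEX_DIGEST = re.compile(r"[0-9a-f]{32}")
TRAILING_DIGITS = re.compile(r"\d+$")


def key_family(key):
    """Collapse the variable part of a key so that similar keys group together."""
    key = HEX_DIGEST.sub("*", key)
    return TRAILING_DIGITS.sub("*", key)


def summarize(dump_text):
    """Count keys per family in text formatted like memcdump output."""
    return Counter(
        key_family(line.strip())
        for line in dump_text.splitlines()
        if line.strip()
    )


for family, count in sorted(summarize(SAMPLE_KEYS).items()):
    print(f"{count:4d}  {family}")
```

To use it against a live server, you would feed the captured output of memcdump to summarize() instead of the sample string.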

A list of keys, however, doesn’t tell you much. To see the value that’s stored under that key, use memccat. Here’s an example of looking at a user session record in Koha (assuming you’ve set the SessionStorage system preference to use memcached):

memccat --servers 127.0.0.1:11211 KOHA78c879b9942dee326710ce8e046acede
---
_SESSION_ATIME: '1363060711'
_SESSION_CTIME: '1363060711'
_SESSION_ID: 78c879b9942dee326710ce8e046acede
_SESSION_REMOTE_ADDR: 192.168.1.16
branch: CPL
branchname: Centerville
cardnumber: cat
emailaddress: ''
firstname: ''
flags: 1
id: cat
ip: 192.168.1.16
lasttime: '1363060711'
number: 51
surname: cat

And here’s an example of an Evergreen user session cached object:

memccat --servers 127.0.0.1:11211 oils_auth_8682b1017b7b27035576fecbfc7715c4
{"authtime":420,"userobj":{"__c":"au","__p":[null,null,null,null,null,null,null,null,null,"186",null,"t",null,"f",119284,38997,0,0,"2011-05-31T11:17:16-0400","0.00","1-888-555-1234","1923-01-01T00:00:00-0500","user@example.org",null,"2015-10-29T00:00:00-0400","User","Test",186,654440,3,null,null,null,"1358890660.7173220299.6945940294",119284,"f",1,null,"",null,null,10,null,1,null,"t",654440,"user",null,"f","2013-01-22T16:37:40-0500",null,"f"]}}

We’ll let the YAMLites and JSONistas square off outside, and take a look at a final example. This is an excerpt of a cached catalog search result in Evergreen:

memccat --servers 127.0.0.1:11211 open-ils.search_4b81a8a59544e8c7e9fdcda357d7b05f
{"0":{"summary":{"checked":630,"visible":"546","excluded":84,"deleted":0,"total":630,"complex_query":1},"results":[["74093"],["130197"], ...., ["880940"],["574457"]]}}
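Once you have a value like that in hand, a few lines of Python will pull out the interesting bits. This is only a sketch: the function name is mine, the sample is the excerpt above with the elided middle of the results list simply dropped so that it parses, and I am taking the structure at face value rather than from any Evergreen documentation.

```python
import json

# The excerpt above, minus the elided middle of the "results" list.
CACHED = (
    '{"0":{"summary":{"checked":630,"visible":"546","excluded":84,'
    '"deleted":0,"total":630,"complex_query":1},'
    '"results":[["74093"],["130197"],["880940"],["574457"]]}}'
)


def read_search_cache(raw):
    """Return (summary, record_ids) for the entry stored under key "0"."""
    entry = json.loads(raw)["0"]
    # Some counts arrive as strings (note "visible"), so coerce them all to int.
    summary = {name: int(value) for name, value in entry["summary"].items()}
    # Each result row is a one-element list holding a record ID as a string.
    record_ids = [int(row[0]) for row in entry["results"]]
    return summary, record_ids


summary, record_ids = read_search_cache(CACHED)
print(summary["visible"], "visible of", summary["total"], "total")
print("first record IDs:", record_ids[:2])
```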

There are other tools that let you manipulate the cache, including memcrm to remove keys and memccp to load key/value pairs into memcached.

For a complete list of the command-line tools provided by libMemcached, check out its documentation. To install them on Debian or Ubuntu, run apt-get install libmemcached-tools. Note that the Debian package renames the tools from ‘memdump’ to ‘memcdump’, ‘memcat’ to ‘memccat’, etc., to avoid a naming conflict with another package.

This is the first part in an occasional series on how good data can go bad.

Consider the following snippets of a MARC21 record for the Spanish edition of the fourth Harry Potter book.

00998nam  2200313 c 4500
...
240 10 $a Harry Potter and the goblet of fire $l Español
245 10 $a Harry Potter y el cáliz de fuego / $c J.K. Rowling ; [traducción, Adolfo Muñoz García y Nieves Martín Azofra]

The original record uses the Unicode character set with the UTF-8 character encoding. However, if you load this record into a modern ILS, e.g. Koha or Evergreen, the title is likely to end up displayed as:

Harry Potter y el c©Łliz de fuego / J.K. Rowling ; [traducci©đn, Adolfo Mu©łoz Garc©Ưa y Nieves Mart©Ưn Azofra]

Too much copyright! This isn’t an electronic course reserves blog!

What happened? Look at the 9th position of the leader (counting from zero), and you’ll see that it is blank. In MARC21, blank means that the record uses the MARC-8 character set, while ‘a’ means that it uses Unicode. Many, if not most, modern MARC tools will go by the Leader/09 to decide if a character conversion is needed. If the leader position is wrong, poor, defenseless diacritics will get mangled.
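The check a MARC tool makes can be sketched in a few lines of Python. The function names here are hypothetical, and real tools look at far more than this one byte, but it shows where the decision gets made:

```python
def declared_encoding(leader):
    """Report the character coding scheme a MARC21 leader claims to use."""
    if len(leader) != 24:
        raise ValueError("a MARC21 leader is exactly 24 characters long")
    flag = leader[9]  # Leader/09, counting from zero
    if flag == " ":
        return "MARC-8"
    if flag == "a":
        return "Unicode"
    raise ValueError(f"unexpected Leader/09 value: {flag!r}")


def mark_as_unicode(leader):
    """Return a copy of the leader with Leader/09 set to 'a'."""
    return leader[:9] + "a" + leader[10:]


leader = "00998nam  2200313 c 4500"  # the leader from the record above
print(declared_encoding(leader))
print(declared_encoding(mark_as_unicode(leader)))
```

Flipping the flag is only the right fix when the record's bytes really are UTF-8, as they are in this example; setting it on a record that truly is MARC-8 would just create the opposite problem.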

Why are there so many copyright signs in the mistreated title? As it happens, the UTF-8 representation of many common characters with Western European diacritics starts with byte 195 (or C3 in hexadecimal). What does C3 mean in the MARC-8 character encoding? You’ve guessed it: the copyright symbol.
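You can reproduce the damage with a toy converter. The Python sketch below encodes text as UTF-8 and then reads each byte as though it were a MARC-8 character. The lookup table is a deliberately tiny subset that I filled in by hand, covering only the five high bytes this title needs. A real MARC-8 converter also has to handle combining diacritics and escape sequences, so treat this as an illustration rather than a conversion tool.

```python
# A hand-picked subset of the MARC-8 extended Latin set: just the bytes that
# turn up in this example. Anything else above 0x7F is shown as "?".
MARC8_SAMPLE = {
    0xC3: "©",  # copyright mark
    0xA1: "Ł",  # uppercase Polish L
    0xB1: "ł",  # lowercase Polish l
    0xB3: "đ",  # lowercase d with crossbar
    0xAD: "Ư",  # uppercase U-hook
}


def misread_as_marc8(text):
    """Encode text as UTF-8, then interpret each byte as a MARC-8 character."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            out.append(chr(byte))  # plain ASCII comes through unharmed
        else:
            out.append(MARC8_SAMPLE.get(byte, "?"))
    return "".join(out)


# "\u00e1" is a-acute, which UTF-8 encodes as the two bytes C3 A1.
print(misread_as_marc8("Harry Potter y el c\u00e1liz de fuego"))
```

Each accented letter in the title becomes two UTF-8 bytes, the first of which is C3, which is why every mangled character starts with a copyright sign.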

There are a couple lessons to draw from this. First, using a good character encoding isn’t enough; you must also say what you’re up to. Second, if you look at enough bad data, you will start to recognize patterns on sight. If you deal with a lot of data, that “second sight” is an arcane but useful skill to develop.

CC-BY image of a woodcut of a viper courtesy of the Penn Provenance Project.

As has been noted all over, Anne McCaffrey has left us.

How can one mark the passage of an author? For me, a stranger to her, there’s really only one way: she lives on in her books, and so shall I reread. My wife calls for all who love Pern to read Dragonflight again in memory and honor, but for my part, I will travel to the stars with Helva as she serenades the void.

My local library uses OverDrive, so this evening I went ahead and tried to check out a couple ebooks for my Kindle (well, Kindle app). The steps required were pretty simple: library website to OverDrive catalog to title to checkout page. After I checked it out, I got dropped into Amazon’s website, where I finished by specifying which Kindle app to send the book to.

Of course, Amazon then gave me plenty of opportunity to buy more Kindle books:

One thing that’s not on that page is a link back to the library. It would be nice for the library to be acknowledged, although of course there could be privacy implications if OverDrive is sending Amazon enough information that they could construct such a link.

But suppose I were to purchase one of Amazon’s recommendations. Who benefits? Amazon, obviously. Who else? Is anybody collecting referral fees? And if somebody is collecting referral fees, can the library who paid OverDrive to lend the book that inspired the recommendations in the first place get a piece of the action? What about libraries who have signed themselves up as Amazon affiliates?

There’s a lot to discuss about the announcement, including concerns about patron privacy, Amazon’s DRM policies, and whether and how this will benefit libraries in the long run (in the short run, it at least means that librarians don’t have to answer the question of why they can’t lend books to patrons’ Kindles). But one thing seems pretty clear to me: libraries are about to see their OverDrive hold queues lengthen significantly, which will mean pressure to send more money to OverDrive to meet patron demand. But that doesn’t mean that the libraries can just stop buying physical books, so how is a library to deal with a potentially significant shift in their acquisitions budget?

Bringing this full circle back to the title of this post: can libraries get a piece of the action? Should they?

“It’s just politics.”

This is a common enough phrase, and the usual implication of it is dismissive: if it’s just politics, it’s not about anything really important. It’s grandstanding, it’s just more sound and fury, it’s a sausage factory. At best, it’s the domain of the politicians; let them worry about it. There’s a long post in me about how the attitude behind “it’s just politics” contributes to poor participation in democracy and bad policymaking.

This is not that post.

The inspiration for the post before you was somebody making a comment to more or less that effect the other day in regards to the past and ongoing controversy regarding Koha, its licensing, its trademarks, and its forks. My position on the matter should come as no surprise. If you want Koha, go to http://koha-community.org/. If you’re a librarian using it, please contribute back where you can and participate in the Koha community. If you’re a vendor supporting, implementing, or hacking it, know that it is not just yours: give back, obey both the letter and the spirit of the GPL, and be a good community member. And don’t worry: you can do all that and still make money! Look Ma! No monopoly!

But dragging myself back on topic, one thing to clear up first: this post is not about the comment that inspired it. I am going after a generality here, not any particular throwaway comment.

What can “it’s just politics” mean when talking about a dispute concerning an open source project and its licensing? Quite a few things:

  1. (Re)opening this can of worms is going to derail any discussion of the software itself for weeks. This can be a very real concern: disputes about the license or the direction of the project can take years to resolve, can become very acrimonious, and frankly can be terribly boring. I, for one, personally don’t find license disputes inherently interesting, and I strongly suspect that most participants in F/OSS projects don’t either. But bitter experience has shown me that sometimes it is necessary to participate anyway and not leave it just to the language lawyers. What can make resolving disputes even more difficult is that email and IRC as communication media have weaknesses that can exacerbate conflict.
  2. Less talk, more code! What doesn’t get done if you’ve just spent an hour fisking the latest volley in the GPL2+ vs. AGPL3 debate? There’s an opportunity cost — that hour wasn’t spent writing some code, or testing, or proofreading the latest documentation edits. That opportunity cost can compound — if you don’t get the kudos for the results of that fisking and miss the warm feeling you get seeing a longstanding bug get closed because of your patch, you may end up disengaging.
  3. Can’t we all get along? It can be very unpleasant being in the middle of an important dispute. While I do think that the Koha community has come out of this stronger than ever, I also mourn the opportunities for human connection and friendships that have been permanently sundered as a result of the conflict.
  4. Newbie here. What is going on?!? It can be very disorienting checking out the mailing list of a F/OSS project you’re considering using only to find that everybody apparently hates each other. It can be even worse if you find yourself making an innocent statement that gets interpreted as firmly putting yourself in one camp or another. Tying this back to the previous point, is the Koha community stronger? Yes. Has it also developed a few shibboleths that cause project regulars to sometimes come down a little too hard on new members of the community? Unfortunately, yes.
  5. From the point of view of an external observer, it’s hard to make sense of what’s going on. It’s all too easy to lose the thread of what is being disputed, and the definitive histories of the war tend to come out only after the last bodies have been buried. On the other hand, particularly if you’re an external observer who has some external or self-imposed reason to make judgements about the dispute, do your research: a snap conclusion will almost certainly be the wrong one, or at least lack important nuance.
  6. The noise is getting in the way of figuring out if this software is useful. Fair enough, and if you’re a librarian evaluating ILSs, obviously a key part of your decision should be based on the answer to the following question: will a given ILS solve my problem, either now or in the realistically foreseeable future? But read on, since that isn’t the only question to be answered.

The outcome of a big dispute in a F/OSS project can be positive, but there’s no question that it can be tremendously messy and painful. But just like in the realm of the elephants, donkeys, and greens, politics informs policy. And policy consequences matter. And there’s no royal road to success:

  • Software doesn’t write itself. People are always involved, and unless you’ve just fired-and-forgotten some code into the wild, any F/OSS project worth thinking about involves more than just one person.
  • The invisible hand is still here. The economics of a F/OSS project may not be based on cash money (though there’s a place for both money and passion), but the fundamental economic problem of resource allocation and human motivation is inescapable.
  • Communities don’t build themselves, and they certainly don’t maintain themselves without effort. In the case of library-related F/OSS projects, there are special considerations: both the library profession and F/OSS hackerdom value sharing. However, there are significant differences in the ways that libraries and hackers tend to communicate and collaborate, and those differences can’t be negotiated without a lot of communication.
  • Regardless of whether you fall more on the “F” side or the “OS” side of the divide in the acronym, F/OSS works for a combination of baldly pragmatic and ethical reasons. But as the very structure of the “F/OSS” acronym implies, there are many disagreements and differences of emphasis among F/OSS contributors and users.

What’s missing from these bullet points? The One True Path to Open Source Success. Why is it missing? Because it doesn’t exist: free software has been around long enough that there’s a good body of recommendations on how to do it well, but there’s no consensus.

And if there’s no consensus, then what? It has to be found — or created — or not found, leading to a hopefully amicable parting of the ways. But that can’t happen without discussion, conflict, and resolution. While it certainly isn’t necessary for everybody to participate in every single debate, (constructive!) engagement with the discussion can be a valuable contribution in its own right. If you can help improve how the discussion takes place, even better.

If you or your institution has a stake in the outcome, participating is part of both the duty and promise of F/OSS for libraries: owning our tools, without which our collections will just gather dust.

Put another way, politics, in its broadest and most noble meaning, can’t be avoided, even if engaging means spending some time away from the code. You may as well embrace it.

By the way, I suspect that if you did manage to get software to write itself, you still couldn’t escape politics. I doubt that an artificial intelligence creative enough to code can be built without incorporating a sufficient degree of complexity that it would be able to avoid all moments of indecision. AI politics may well end up looking rather bizarre to humans, but they’d still be faced with the basic political problem of resolving conflict and allocating resources.

I had a great time at the Evergreen conference this year. Some of the highlights for me were:

  • I got to see a lot of friends, both old and new.
  • We started signing the fiscal sponsorship agreement (PDF) with the Software Freedom Conservancy. After the document crosses the country and back and gets the last few signatures, Evergreen will officially become the latest member project of the SFC.
  • We made great progress setting up for the project’s move from SVN to Git. (If you’re a developer, please weigh in now on the vote taking place on open-ils-dev.)
  • I talked a lot.
  • Every restaurant I ate at was good, without exception.
  • I learned how to play Munchkin.
  • My wife and I found a house to rent.
  • I even managed to get some sleep one night.

Many thanks to the conference committee for organizing a wonderful event. I’ll close with this image from the presentation by Rogan Hamby and Shasta Brewer:

[image of kitten shouting huzzah!]

There are a lot of good changes coming in Koha 3.4.0, which will be released tomorrow. Check out the current draft of the release notes. But this release of Koha includes some major architectural changes, and although the upgrade process is simple, it definitely pays to read the instructions first.

In particular, there are two upgrade steps that should not be missed:

Install Template::Toolkit

Koha 3.4.0 uses the Template::Toolkit Perl module instead of HTML::Template::Pro for the OPAC and staff interface templates.  Template::Toolkit must be installed before trying to run the web updater, as the web installer itself now uses TT.  If you run Koha on Debian or Ubuntu, run apt-get install libtemplate-perl. On other Linux and Unix platforms, install the packaged version of TT if available; if a packaged version isn’t available, run cpan Template.

Note that if you’re following the instructions, running ./koha_perl_deps.pl -u -m will catch the TT dependency requirement. Just don’t forget to actually install it.

Run scripts to update your bib records

Koha 3.4.0 will no longer store copies of the item record data as MARC fields in the bibliographic records. This resolves a long-standing performance issue where changing an item record (even just to change its status when it is checked out) required that Koha update the bibliographic record as well. However, this means that during upgrade it is necessary to touch all of the bib records in order to remove the item tags. To do this, run the following steps:

misc/maintenance/remove_items_from_biblioitems.pl --run
misc/migration_tools/rebuild_zebra.pl -b -r

This can take several hours on a large database, so plan accordingly.