Preserving the usefulness of the Hugo Awards as a selection tool for libraries

The Hugo Awards have been awarded by the World Science Fiction Convention for decades, and serve to recognize the works of authors, editors, directors – fans and professionals – in the genres of science fiction and fantasy.  The Hugos are unique in being a fan-driven award that has as much process – if not more – as juried awards.

That process has two main steps.  First, there’s a nomination period where members of Worldcon select works to appear on the final ballot. Second, members of the upcoming Worldcon vote on the final ballot and the awards are given out at the convention.

Typically, rather more folks vote on the final ballot than nominate – and that means that small, organized groups of people can unduly influence the nominations.  However, there’s been surprisingly few attempts to actually do that.

Until this year.

Many of the nominations this year match the slates of two groups, the “Sad Puppies” and the “Rabid Puppies.”  Not only that, some of the categories contain nothing but Puppy nominations.

The s.f. news site File 770 has a comprehensive collection of back-and-forth about the matter, but suffice it so say that the Puppy slates are have a primarily political motivation – and one, in the interests of full disclosure, that I personally despise.

There are a lot of people saying smart things about the situation, so I’ll content myself with the following observation:

Slate nominations and voting destroy the utility of the Hugo Award lists for librarians who select science fiction and fantasy.

Why? Ideally, the Hugo process ascertains the preferences of thousands of Worldcon members to arrive at a general consensus of science fiction and fantasy that is both good and generally appealing.  As it happens, that’s a pretty useful starting point for librarians trying to round out collections or find new authors that their patrons might like – particularly for those librarians who are not themselves fans of the genre.

However, should slate voting become a successful tactic, the Hugo Awards are in danger of ending up simply reflecting which factions in fandom are best able to game the system.  The results of that… are unlikely to be all that useful for librarians.

Here’s my suggestion for librarians who are fans of science fiction and fantasy and who want to help preserve a collection development tool: get involved.  In particular:

  1. Join Worldcon. A $40 supporting membership suffices to get voting privileges.
  2. Vote on the Hugos this year. I won’t tell you who the vote for, but if you agree with me that slate nominations are a problem, consider voting accordingly.
  3. Next year, participate in the nomination process. Don’t participate in nomination slates; instead, nominate those works that you think are worthy of a Hugo – full stop.

Three tales regarding a decrease in the number of catalogers

Discussions on Twitter today – see the timelines of @cm_harlow and @erinaleach for entry points – got me thinking.

In 1991, the Library of Congress had 745 staff in its Cataloging Directorate. By the end of FY 2004, the LC Bibliographic Access Divisions had between 5061 and 5612 staff.

What about now? As of 2014, the Acquisitions and Bibliographic Access unit has 238 staff3.

While I’m sure one could quibble about the details (counting FTE vs. counting humans, accounting for the reorganizations, and so forth), the trend is clear: there has been a precipitous drop in the number of cataloging staff employed by the Library of Congress.

I’ll blithely ignore factors such as shifts in the political climate in the U.S. and how they affect civil service. Instead, I’ll focus on library technology, and spin three tales.

The tale of the library technologists

The decrease in the number of cataloging staff are one consequence of a triumph of library automation. The tools that we library technologists have written allow catalogers to work more efficiently. Sure, there are fewer of them, but that’s mostly been due to retirements. Not only that, the ones who are left are now free to work on more intellectually interesting tasks.

If we, the library technologists, can but slip the bonds of legacy cruft like the MARC record, we can make further gains in the expressiveness of our tools and the efficiencies they can achieve. We will be able to take advantage of metadata produced by other institutions and people for their own ends, enabling library metadata specialists to concern themselves with larger-scale issues.

Moreover, once our data is out there – who knows what others, including our patrons, can achieve with it?

This will of course be pretty disruptive, but as traditional library catalogers retire, we’ll reach buy-in. The library administrators have been pushing us to make more efficient systems, though we wish that they would invest more money in the systems departments.

We find that the catalogers are quite nice to work with one-on-one, but we don’t understand why they seem so attached to an ancient format that was only meant for record interchange.

The tale of the catalogers

The decrease in the number of cataloging staff reflects a success of library administration in their efforts to save money – but why is it always at our expense? We firmly believe that our work with the library catalog/metadata services counts as a public service, and we wish more of our public services colleagues knew how to use the catalog better.  We know for a fact that what doesn’t get catalogued may as well not exist in the library.

We also know that what gets catalogued badly or inconsistently can cause real problems for patrons trying to use the library’s collection.  We’ve seen what vendor cataloging can be like – and while sometimes it’s very good, often it’s terrible.

We are not just a cost center. We desperately want better tools, but we also don’t think that it’s possible to completely remove humans from the process of building and improving our metadata. 

We find that the library technologists are quite nice to work with one-on-one – but it is quite rare that we get to actually speak with a programmer.  We wish that the ILS vendors would listen to us more.

The tale of the library directors

The decrease in the number of cataloging staff at the Library of Congress is only partially relevant to the libraries we run, but hopefully somebody has figured out how to do cataloging more cheaply. We’re trying to make do with the money we’re allocated. Sometimes we’re fortunate enough to get a library funding initiative passed, but more often we’re trying to make do with less: sometimes to the point where flu season makes us super-nervous about our ability to keep all of the branches open.

We’re concerned not only with how much of our budgets are going into electronic resources, but with how nigh-impossible it is to predict increases in fees for ejournal subscriptions/ fees for ebook services.

We find that the catalogers and the library technologists are pleasant enough to talk to, but we’re not sure how well they see the big picture – and we dearly wish they could clearly articulate how yet another cataloging standard / yet another systems migration will make our budgets any more manageable.

Each of these tales is true. Each of these tales is a lie. Many other tales could be told. Fuzziness abounds.

However, there is one thing that seems clear: conversations about the future of library data and library systems involve people with radically different points of view. These differences do not mean that any of the people engaged in the conversations are villains, or do not care about library users, or are unwilling to learn new things.

The differences do mean that it can be all too easy for conversations to fall apart or get derailed.

We need to practice listening.

1. From testmony by the president of the Library of Congress Professional Guild to Congress on 6 March 2015.
2. From the BA FY 2004 report. This including 32 staff from the Cataloging Distribution Service, which had been merged into BA and had not been part of the Cataloging Directorate.
3. From testmony by the president of the Library of Congress Professional Guild to Congress on 6 March 2015.

Unsolved problems

I saw a lot of pain yesterday. I will see more pain today.

Pain from women saying that it’s back to the whisper network for them. Pain from women acknowledging the many faults of whisper networks.

Pain from women who do not want to be chilled — and who yet find themselves in the far north, with the wolves circling.

Pain from women who have seen their colleagues fail them before, and before, and before — and who have less hope now that the future of libraries will be any better.

Pain from women who fear that licenses were issued yesterday — licenses to maintain the status quo, licenses to grind away the hopes and dreams of those women in libraries who want to change the world (or who simply want to catalog books in peace and go home at the end of the day).

Above all, pain from women whose words are now constrained by the full force of the law — and who are now the target of every passerby who has much time and little empathy.

I will speak plainly: Lisa Rabey and nina de jesus did a brave thing, a thing that could never have rebounded to their personal advantage no matter the outcome of the lawsuit. I respect them, and I wish them whatever peace they can find after this.

I will speak bluntly to men in the library profession: regardless of what you think of the case that ended yesterday — regardless of what you think of Joe Murphy’s actions or of the actions of Team Harpy — sexual harassment in our profession is real; the pain our colleagues experience due to it is real.

It remains an unsolved problem.

It remains our unsolved problem.

We must do our part to fix it.

Not sure how? Neither am I. But at least as librarians and library workers, we have access to plenty of tools to learn, to listen.

Time to roll up our sleeves.

Briefly: information-seeking in the age of Google juice

Just now I read a comment on a blog to the effect that although the commenter was mildly interested in something, she wasn’t going to Google it.

The something in question was a racist cultural practice that used to be common, but nowadays is condemned by most.

Why not search to learn more about it? She wasn’t particularly concerned that somebody would go through her web search history and think she endorsed the practice. Rather, she didn’t want to contribute even so much as a click to making the practice have higher visibility on the web.

Google searches ≠ endorsement, of course – but I find it interesting that nowadays some find it necessary to say so explicitly.

Henriette Avram versus the world: Is COBOL capable of processing MARC?

Is the COBOL programming language capable of processing MARC records?

A computer programmer in 2015 could be excused for thinking to herself, what kind of question is that!?! Surely it’s obvious that any programming language capable of receiving input can parse a simple, antique record format?

In 1968, it apparently wasn’t so obvious. I turned up an article by Henriette Avram and a colleague, MARC II and COBOL, that was evidently written in response to a review article by a Hillis Griffin where he stated

Users will require programmers skilled in languages other than FORTRAN or COBOL to take advantage of MARC records.

Avram responded to Griffin’s concern in the most direct way possible: by describing COBOL programs developed by the Library of Congress to process MARC records and generate printed catalogs. Her article even include source code, in case there were any remaining doubts!

I haven’t yet turned up any evidence that Henriette Avram and Grace Hopper ever met, but it was nice to find a close, albeit indirect connection between the two of them via COBOL.

Is the debate between Avram and Griffen in 1968 regarding COBOL and MARC anything more than a curiosity? I think it is — many of the discussions she participated in are reminiscent of debates that are taking place now. To fair to Griffin, I don’t know enough about the computing environment of the late sixties to be able to definitely say that his statement was patently ill-informed at the time — but given that by 1962 IBM had announced that they were standardizing on COBOL, it seems hardly surprising that Avram and her group would be writing MARC processing code in COBOL on an IBM/360 by 1968. To me, the concerns that Griffin raised seem on par with objections to Library Linked Data that assume that each library catalog request would necessarily mean firing off a dozen requests to RDF providers — objections that have rejoinders that are obvious to programmers, but perhaps not so obvious to others.

Plus ça change, plus c’est la même chose?

Notes on making my WordPress blog HTTPS-only

The other day I made this blog,, HTTPS-only.  In other words, if Eve want to sniff what Bob is reading on my blog, she’ll need to do more than just capture packets between my blog and Bob’s computer to do so.

This is not bulletproof: perhaps Eve is in possession of truly spectacular computing capabilities or a breakthrough in cryptography and can break the ciphers. Perhaps she works for any of the sites that host external images, fonts, or analytics for my blog and has access to their server logs containing referrer headers information.  Currently these sites are Flickr (images), Gravatar (more images), Google (fonts) or WordPress (site stats – I will be changing this soon, however). Or perhaps she’s installed a keylogger on Bob’s computer, in which case anything I do to protect Bob is moot.

Or perhaps I am Eve and I’ve set up a dastardly plan to entrap people by recording when they read about MARC records, then showing up at Linked Data conferences and disclosing that activity.  Or vice versa. (Note: I will not actually do this.)

So, yes – protecting the privacy of one’s website visitors is hard; often the best we can do is be better at it than we were yesterday.

To that end, here are some notes on how I made my blog require HTTPS.


I got my SSL certificate from Why them?  Their price was OK, I already register my domains through them, and I like their corporate philosophy: they support a number of free and open source software projects; they’re not annoying about up-selling, and they have never (to my knowledge) run sexist advertising, unlikely some of their larger and more well-known competitors. But there are, of course, plenty of options for getting SSL certificates, and once Let’s Encrypt is in production, it should be both cheaper and easier for me to replace the certs next year.

I have three subdomains of that I wanted a certificate for, so I decided to get a multi-domain certificate.  I consulted this tutorial by rtCamp to generate the CSR.

After following the tutorial to create a modified version of openssl.conf specifying the subjectAltName values I needed, I generated a new private key and a certificate-signing request as follows:

The openssl command asked me a few questions; the most important of which being the value to set the common name (CN) field; I used “” for that, as that’s the primary domain that the certificate protects.

I then entered the text of the CSR into a form and paid the cost of the certificate.  Since I am a library techie, not a bank, I purchased a domain-validated certificate.  That means that all I had to prove to the certificate’s issuer that I had control of the three domains that the cert should cover.  That validation could have been done via email to an address at or by inserting a special TXT field to the DNS zone file for I ended up choosing to go the route of placing a file on the web server whose contents and location were specified by the issuer; once they (or rather, their software) downloaded the test files, they had some assurance that I had control of the domain.

In due course, I got the certificate.  I put it and the intermediate cert specified by Gandi in the /etc/ssl/certs directory on my server and the private key in /etc/private/.

Operating System and Apache configuration

Various vulnerabilities in the OpenSSL library or in HTTPS itself have been identified and mitigated over the years: suffice it to say that it is a BEASTly CRIME to make a POODLE suffer a HeartBleed — or something like that.

To avoid the known problems, I wanted to ensure that I had a recent enough version of OpenSSL on the web server and had configured Apache to disable insecure protocols (e.g., SSLv3) and eschew bad ciphers.

The server in question is running Debian Squeeze LTS, but since OpenSSL 1.0.x is not currently packaged for that release, I ended up adding Wheezy to the APT repositories list and upgrading the openssl and apache2 packages.

For the latter, after some Googling I ended up adapting the recommended Apache SSL virtualhost configuration from this blog post by Tim Janik.  Here’s what I ended up with:

I also wanted to make sure that folks coming in via old HTTP links would get permanently redirected to the HTTPS site:

Checking my work

I’m a big fan of the Qualsys SSL Labs server test tool, which does a number of things to test how well a given website implements HTTPS:

  • Identifying issues with the certificate chain
  • Whether it supports vulnerable protocol versions such as SSLv3
  • Whether it supports – and request – use of sufficiently strong ciphers.
  • Whether it is vulnerable to common attacks.

Suffice it to say that I required a couple iterations to get the Apache configuration just right.


To be fully protected, all of the content embedded on a web page served via HTTPS must also be served via HTTPS.  In other words, this means that image URLs should require HTTPS – and the redirects in the Apache config are not enough.  Here is the sledgehammer I used to update image links in the blog posts:


I also needed to tweak a couple plugins to use HTTPS rather than HTTP to embed their icons or fetch JavaScript.

Finishing touches

In the course of testing, I discovered a couple more things to tweak:

  • The web sever had been using Apache’s mod_php5filter – I no longer remember why – and that was causing some issues when attempting to load the WordPress dashboard.  Switching to mod_php5 resolved that.
  • My domain ownership proof on failed after the switch to HTTPS.  I eventually tracked that down to the fact that doesn’t have a bunch of intermediate certificates in its certificate store that many browsers do. I resolved this by adding a cross-signed intermediate certificate to the file referenced by SSLCertificateChainFile in the Apache config above.

My blog now has an A+ score from SSL Labs. Yay!  Of course, it’s important to remember that this is not a static state of affairs – another big OpenSSL or HTTPS protocol vulnerability could turn that grade to an F.  In other words, it’s a good idea to test one’s website periodically.

The Vanilla Password Reflex, or libraries and security education by example

At the first face-to-face meeting of the LITA Patron Privacy Technologies Interest Group at Midwinter, one of the attendees mentioned that they had sent out an RFP last year for library databases. One of the questions on the RFP asked how user passwords were stored — and a number of vendors responded that their systems stored passwords in plain text.

Here’s what I tweeted about that, and here is Dorothea Salo’s reply:

This is a repeatable response, by the way — much like the way a hammer strike to the patellar ligament instigates a reflexive kick, mention of plain-text password storage will trigger an instinctual wail from programmers, sysadmins, and privacy and security geeks of all stripes.

Call it the Vanilla Password Reflex?

I’m not suggesting that you should whisper “plain text passwords” into the ear of your favorite system designer, but if you are the sort to indulge in low and base amusements…

A recent blog post by Eric Hellman discusses the problems with storing passwords in plain text in detail. The upshot is that it’s bad practice — if a system’s password list is somehow leaked, and if the passwords are stored in plain text, it’s trivially easy for a cracker to use those passwords to get into all sorts of mischief.

This matters, even “just” for library reference databases. If we take the right to reader privacy seriously, it has to extend to the databases offered by the library — particularly since many of them have features to store citations and search results in a user’s account.

As Eric mentions, the common solution is to use a one-way cryptographic hash function to transform the user’s password into a bunch of gobbledegook.

For example, “p@ssw05d” might be stored as the following hash:


To make it more secure, I might add some random salt and end up with the following salted hash:


To log in, the user has to prove that they know the password by supplying it, but rather than compare the password directly, the result of the one-way function applied to the password is compared with the stored hash.

How is this more secure? If a hacker gets the list of password hashes, they won’t be able to deduce the passwords, assuming that the hash function is good enough. What counts as good enough? Well, relatively few programmers are experts in cryptography, but suffice it to say that there does exist a consensus on techniques for managing passwords and authentication.

The idea of one-way functions to encrypt passwords is not new; in fact, it dates back to the 1960s. Nowadays, any programmer who wants to be considered a professional really has no excuse for writing a system that stores passwords in plain text.

Back to the “Vanilla Password Reflex”. It is, of course, not actually a reflex in the sense of an instinctual response to a stimulus — programmers and the like get taught, one way or another, about why storing plain text passwords is a bad idea.

Where does this put the public services librarian? Particularly the one who has no particular reason to be well versed in security issues?

At one level, it just changes the script. If a system is well-designed, if a user asks what their password is, it should be impossible to get an answer to the question. How to respond to a patron who informs you that they’ve forgotten their password? Let them know that you can change it for them. If they respond by wondering why you can’t just tell them, if they’re actually interested in the answer, tell them about one-way functions — or just blame the computer, that’s fine too if time is short.

However, libraries and librarians can have a broader role in educating patrons about online security and privacy practices: leading by example. If we insist that the online services we recommend follow good security design; if we use HTTPS appropriately; if we show that we’re serious about protecting reader privacy, it can only buttress programming that the library may offer about (say) using password managers or avoiding phishing and other scams.

There’s also a direct practical benefit: human nature being what it is, many people use the same password for everything. If you crack an ILS’s password list, you’ve undoubtedly obtained a non-negligible set of people’s online banking passwords.

I’ll end this with a few questions. Many public services librarians have found themselves, like it or not, in the role of providing technical support for e-readers, smartphones, and laptops. How often does online security come up during such interactions? How often to patrons come to the library seeking help against the online bestiary of spammers, phishers, and worse? What works in discussing online security with patrons, who of course can be found at all levels of computer savvy? And what doesn’t?

I invite discussion — not just in the comments section, but also on the mailing list of the Patron Privacy IG.

Ogres, hippogriffs, and authorized Koha service providers

What do ogres, hippogriffs, and authorized Koha service providers have in common?

Each of them is an imaginary creature.

20070522 Madrid: hippogriff -- image by Larry Wentzel on Flickr (CC-BY)

20070522 Madrid: hippogriff — image by Larry Wentzel on Flickr (CC-BY)

Am I saying that Koha service providers are imaginary creatures? Not at all — at the moment, there are 54 paid support providers listed on the Koha project’s website.

But not a one of them is “authorized”.

I bring this up because a friend of mine in India (full disclosure: who himself offers Koha consulting services) ran across this flyer by Avior Technologies:

Avior information sheet

The bit that I’ve highlighted is puffery at best, misleading at worst. The Koha website’s directory of paid support providers is one thing, and one thing only: a directory. The Koha project does not endorse any vendors listed there — and neither the project nor the Horowhenua Library Trust in New Zealand (which holds various Koha trademarks) authorizes any firm to offer Koha services.

If you want your firm to get included in the directory, you need only do a few things:

  1. Have a website that contains an offer of services for Koha.
  2. Ensure that your page that offers services links back to
  3. Make a public request to be added to the directory.

That’s it.

Not included on this list of criteria:

  • Being good at offering services for Koha libraries.
  • Contributing code, documentation, or anything else to the Koha project.
  • Having any current customers who are willing to vouch for you.
  • Being alive at present (although eventually, your listing will get pulled for lack of response to inquiries from Koha’s webmasters).

What does this mean for folks interested in getting paid support services?  There is no shortcut to doing your due diligence — it is on you to evaluate whether a provider you might hire is competent and able to keep their customers reasonably happy. The directory on the Koha website exists as a convenience for folks starting a search for a provider, but beyond that: caveat emptor.

I know nothing about Avior Technologies. They may be good at what they do; they may be terrible — I make no representation either way.

But I do know this: while there are some open source projects where the notion of an “authorized” or “preferred” support provider may make some degree of sense, Koha isn’t such a project.

And that’s generally to the good of all: if you have Koha expertise or can gain it, you don’t need to ask anybody’s permission to start helping libraries run Koha — and get paid for it.  You can fill niches in the market that other Koha support providers cannot or do not fill.

You can in time become the best Koha vendor in your niche, however you choose to define it.

But authority? It will never be bestowed upon you. It is up to you to earn it by how well you support your customers, and by how much you contribute to the global Koha project.


Putting my time where my mouth is on the patron privacy front

Shortly after it came to light that Adobe Digital Editions was transmitting information about ebook reading activity in the clear, for anybody to snoop upon, I asked a loaded question: does ALA have a role in helping to verify that the software libraries use protect the privacy of readers?

As with any loaded question, I had an answer in mind: I do think that ALA and LITA, by virtue of their institutional heft and influence with librarians, can provide significant assistance in securing library software.

I waited a bit, wondering how the powers that be at ALA would respond. Then I remembered something: an institution like ALA is not, in fact, a faceless, inscrutable organism. Like Soylent Green, ALA is people!

Well, maybe not so much like Soylent Green. My point is that despite ALA’s reputation for being a heavily bureaucratic, procedure-bound organization, it does offer ways for members to take up and idea an run with it.

And that’s what I did — I floated a petition to form a new interest group within LITA, the Patron Privacy Technologies IG. Quite a few people signed it… and it now lives!

Here’s the charge of the IG:

The LITA Patron Privacy Technologies Interest Group will promote the design and implementation of library software and hardware that protects the privacy of library users and maximizes user ability to make informed decisions about the use of personally identifiable information by the library and its vendors.

Under this remit, activities of the Interest Group would include, but are not necessarily limited to:

  1. Publishing recommendations on data security practices for library software.
  2. Publishing tutorials on tools for libraries to use to check that library software is handling patron information responsibly.
  3. Organizing efforts to test commercially available software that handle patron information.
  4. Providing a conduit for responsible disclosure of defects in software that could lead to exposure of library patron information.
  5. Providing sample publicity materials for libraries to use with their patrons in explaining the library’s privacy practices.

I am fortunate to have two great co-chairs, Emily Morton-Owens of the Seattle Public Library and Matt Beckstrom of the Lewis and Clark Library, and I’m happy to announce that the IG’s first face-to-face meeting will at ALA Midwinter 2015 — specifically  tomorrow, at 8:30 a.m. Central Time in the Ballroom 1 of the Sheraton in Chicago.

We have two great speakers lined up — Alison Macrina of the Library Freedom Project and Gary Price of INFODocket, and I’m very much looking forward to it.

But I’m also looking forward to the rest of the meeting: this is when the IG will, as a whole, decide how far to reach.  We have a lot of interest and the ability to do things that will teach library staff and our patrons how to better protect privacy, teach library programmers how to design and code for privacy, and verify that our tools match our ideals.

Despite the title of this blog post… it’s by no means my effort alone that will get us anywhere. Many people are already engaging in issues of privacy and technology in libraries, but I do hope that the IG will provide one more point of focus for our efforts.

I look forward to the conversation tomorrow.

Our move, by the numbers

We’ve landed in Atlanta, having completed our move from Seattle driving cross-country.  Here are some numbers to ponder, with apologies to Harper’s magazine.

  • Humans: 2
  • Cats: 3
  • Miles as the car rolls: 3,600
  • Miles per gallon: 42.1
  • Average speed of the car: 174,720 furlongs per fortnight
  • Seconds spent pondering whether to use furlongs or smoots for the previous measure: 15
  • Cracked windshields: 1
  • Cats who forgot that if the tail is visible, the cat is visible: 1
    2014-12-02 21.25.56
  • Mornings that the cats were foiled by platform beds: 5
  • Mornings that the cats were foiled by an air mattress: 2
  • Mornings that the humans were foiled by a bed with an underneath: 2
  • Number of cats disappointed that said beds turned out to be moveable: 3
  • Hours spent experiencing the thrills of Los Angeles rush hour traffic: 3
  • Calls from a credit card fraud monitoring department: 1
  • Hotel hot tubs dipped into: 2
  • Restaurant restrooms with disconcerting signs: 1
    restroom sign
  • Progress of feline excavation to China: no report
  • Fueling stops: 10
  • Net timezone difference: +3.0
  • Number of moving company staff involved: 9
  • Host cats consternated by the arrival of three interlopers: 4
  • Cats who decided to spend a few hours under the covers to bring down the number of whelms: 1
  • Tweets sent using the #SEAtoATL hashtag, including this post’s tweet: 23
  • Nights spent in California: 2
  • Nights spent in Texas: 3
  • Humans and cats happy to have arrived: 5