Category Archives: Libraries

Continuing the lesson

The other day, school librarian and author Jennifer Iacopelli tweeted about her experience helping a student whose English paper had been vandalized by some boys. After the student left the Google Doc open in the library computer lab when she went home, the boys inserted some “inappropriate” stuff. When the student and her mom went to work on it later that evening, mom saw the insertions, was appalled, and grounded the student. Iacopelli, using security camera footage from the library’s computer lab, was able to demonstrate that the boys were responsible, with the result that the grounding was lifted and the boys suspended.

This story has gotten retweeted 1,300 times as of this writing and earned Iacopelli a mention as a “badass librarian” in HuffPo.

Before I continue, I want to acknowledge that there isn’t much to complain about regarding the outcome: justice was served, and mayhap the boys in question will think thrice before attacking the reputation of another or vandalizing their work.

Nonetheless, I do not count this as an unqualified feel-good story.

I have questions.

Was there no session management software running on the lab computers that would have closed off access to the document when she left at the end of the class period? If not, the school should consider installing some. On the other hand, I don’t want to hang too much on this pin; it’s possible that some was running but that a timeout hadn’t been reached before the boys got to the computer.

How long is security camera footage from the library computer lab retained? Based on the story, it sounds like it is kept at least 24 hours. Who, besides Iacopelli, can access it? Are there procedures in place to control access to it?

More fundamentally: is there a limit to how far student use of computers in that lab is monitored? Again, I do not fault the outcome in this case—but neither am I comfortable with Iacopelli’s embrace of surveillance.

Let’s consider some of the lessons learned. The victim learned that adults in a position of authority can go to bat for her and seek and acquire justice; maybe she will be inspired to help others in a similar position in the future. She may have learned a bit about version control.

She also learned that surveillance can protect her.

And well, yes. It can.

But I hope that the teaching continues—and not the hard way. Because there are other lessons to learn.

Surveillance can harm her. It can cause injustice, against her and others. Security camera footage sometimes doesn’t catch the truth. Logs can be falsified. Innocent actions can be misconstrued.

Her thoughts are her own.

And truly badass librarians will protect that.

How to build an evil library catalog

Consider a catalog for a small public library that features a way to sort search results by popularity. There are several ways to measure “popularity” of a book: circulations, hold requests, click-throughs in the catalog, downloads, patron-supplied ratings, place on bestseller lists, and so forth.

But let’s do a little thought experiment: let’s use a random number generator to calculate popularity.

However, the results will need to be plausible. It won’t do to have the catalog assert that the latest J.D. Robb book is gathering dust in the stacks. Conversely, the copy of the 1959 edition of The geology and paleontology of the Elk Mountain and Tabernacle Butte area, Wyoming that was given to the library right after the last weeding is never going to be a doorbuster.

So let’s be clever and ensure that the 500 most circulated titles in the collection retain their expected popularity rating. Let’s also leave books that have never circulated alone in their dark corners, as well as those that have no cover images available. The rest, we leave to the tender mercies of the RNG.
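
To make the thought experiment concrete, here is a minimal Python sketch of that scoring rule. The `Title` record, its field names, and the way random scores are capped are all hypothetical, invented for illustration; no real catalog's data model is implied, and titles are assumed to have unique names.

```python
import random
from dataclasses import dataclass


@dataclass
class Title:
    """Hypothetical catalog record; the field names are assumptions."""
    name: str
    circulations: int
    has_cover: bool


def popularity_scores(titles, protected=500, rng=None):
    """Return {name: score}, randomizing only the middle of the collection."""
    if rng is None:
        rng = random.Random()
    ranked = sorted(titles, key=lambda t: t.circulations, reverse=True)
    top = ranked[:protected]
    top_names = {t.name for t in top}
    # Cap random scores at the lowest protected score so that the
    # most-circulated titles still sort to the top (an assumption of this sketch).
    ceiling = top[-1].circulations if top else 0
    scores = {}
    for t in titles:
        honest = t.name in top_names or t.circulations == 0 or not t.has_cover
        if honest:
            # Plausibility guard: top titles, never-circulated titles, and
            # titles without cover images keep their real circulation count.
            scores[t.name] = float(t.circulations)
        else:
            # Everything else is left to the RNG.
            scores[t.name] = rng.uniform(0, ceiling)
    return scores
```

Capping the random scores at the lowest protected score is one way to keep the top of the list looking right; it is a design choice of this sketch, not something the thought experiment specifies.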

What will happen? If patrons use the catalog’s popularity rankings, if they trust them — or at least are more likely to look at whatever shows up near the top of search results — we might expect that the titles with an artificial bump from the random number generator will circulate just a bit more often.

Of course, testing that hypothesis by letting an RNG skew search results in a real library catalog would be unethical.

But if one were clever enough to be subtle in one’s use of the RNG, the patrons would have a hard time figuring out that something was amiss.  From the user’s point of view, a sufficiently advanced search engine is indistinguishable from a black box.

This suggests some interesting possibilities for the Evil Librarian of Evil:

  • Some manual tweaks: after all, everybody really ought to read $BESTBOOK. (We won’t mention that it was written by the ELE’s nephew.)
  • Automatic personalization of search results. Does geolocation show that the patron’s IP address is on the wrong side of the tracks? Titles with a lower reading level just got more popular!
  • Has the patron logged in to the catalog? Personalization just got better! Let’s check the patron’s gender and tune accordingly!

Don’t be the ELE.

But as you work to improve library catalogs… take care not to become the ELE by accident.

What makes the annual Code4Lib conference special?

There’s now a group of people taking a look at whether and how to set up some sort of ongoing fiscal entity for the annual Code4Lib conference.  Of course, one question that comes to mind is why go to the effort? What makes the annual Code4Lib conference so special?

There are a lot of narratives out there about how the Code4Lib conference and the general Code4Lib community has helped people, but for this post I want to focus on the conference itself. What does the conference do that is unique or uncommon? Is there anything that it does that would be hard to replicate under another banner? Or to put it another way, what makes Code4Lib a good bet for a potential fiscal host — or something worth going to the effort of forming a new non-profit organization?

A few things that stand out to me as distinctive practices:

  • The majority of presentations are directly voted upon by the people who plan to attend (or who are at least invested enough in Code4Lib as a concept to go to the trouble of voting).
  • Similarly, keynote speakers are nominated and voted upon by the potential attendees.
  • Each year potential attendees vote on bids by one or more local groups for the privilege of hosting the conference.
  • In principle, most any aspect of the structure of the conference is open to discussion by the broader Code4Lib community — at any time.
  • Historically, any surplus from a conference has been given to the following year’s host.
  • Any group of people wanting to go to the effort can convene a local or regional Code4Lib meetup — and need not ask permission of anybody to do so.

Some practices are not unique to Code4Lib, but are highly valued:

  • The process for proposing a presentation or a preconference is intentionally light-weight.
  • The conference is single-track; for the most part, participants are expected to spend most of each day in the same room.
  • Preconferences are inexpensive.

Of course, some aspects of Code4Lib aren’t unique. The topic area certainly isn’t; library technology is not suffering any particular lack of conferences. While I believe that Code4Lib was one of the first libtech conferences to carve out time for lightning talks, many conferences do that nowadays. Code4Lib’s dependence on volunteer labor certainly isn’t unique, although (putting aside keynote speakers) Code4Lib may be unique in having zero paid staff.

Code4Lib’s practice of requiring local hosts to bootstrap their fiscal operations from ground zero might be unique, as is the fact that its planning window does not extend much past 18 months. Of course, those are both arguably misfeatures that having fiscal continuity could alleviate.

Overall, the result has been a success by many measures. Code4Lib can reliably attract at least 400 or 500 attendees. Given the notorious registration rush each fall, it could very likely be larger. With its growth, however, come substantially higher expectations placed on the local hosts, and rather larger budgets — which circles us right back to the question of fiscal continuity.

I’ll close with a question: what have I missed? What makes Code4Lib qua annual conference special?

Update 2016-06-29: While at ALA Annual, I spoke with someone who mentioned another distinctive aspect of the conference: the local host is afforded broad latitude to run things as they see fit; while there is a set of lore about running the event and several people who have been involved in multiple conferences, there is no central group that dictates arrangements.  For example, while a couple recent conferences have employed a professional conference organizer, there’s nothing stopping a motivated group from doing all of the work on their own.

Cataloging and coding as applied empathy: a Mashcat discussion prompt

Consider the phrase “Cataloging and coding as applied empathy”.  Here are some implications of those six words:

  • Catalogers and coders share something: what we build is mainly for use by other people, not ourselves. (Yes, programmers often try to eat our own dogfood, and catalogers tend to be library users, but that’s mostly not what we’re paid for.)
  • Consideration of the needs of our users is needed to do our jobs well, and to do right by our users.
  • However: we cannot rely on our users to always tell us what to do:
    • sometimes they don’t know what it is possible to want;
    • sometimes they can’t articulate what they want in a way that lends itself to direct translation to code or taxonomy;
    • it is rarely their paid job to tell us what they want, and how to build it.
  • Waiting for users to tell us exactly what to do can be a decision… to do nothing. Sometimes doing nothing is the best thing to do; often it’s not.
  • Therefore, catalogers and coders need to develop empathy.
  • Applied empathy: our catalogs and our software in some sense embody our empathy (or lack thereof).
  • Applied empathy: empathy can be a learned skill.

Is “applied empathy” a useful framework for discussing how to serve our users? I don’t know, so I’d like to chat about it.  I will be moderating a Mashcat Twitter chat on Thursday, 12 May 2016, at 20:30 UTC (time converter). Do you have questions to suggest? Please add them to the Google doc for this week’s chat.

Natural and unnatural problems in the domain of library software

I offer up two tendentious lists. First, some problems in the domain of library software that are natural to work on, and in the hopeful future, solve:

  • Helping people find stuff. On the one hand, this surely comes off as simplistic; on the other hand, it is the core problem we face, and has been the core problem of library technology from the very moment that a library’s catalog grew too large to stay in the head of one librarian.  There are of course a number of interesting sub-problems under this heading:
    • Helping people produce and maintain useful metadata.
    • Usefully aggregating metadata.
    • Helping robots find stuff (presumably with the ultimate purpose of helping people to find stuff).
    • Artificial intelligence. By this I’m not suggesting that library coders should be aiming to have an ILS kick off the Singularity, but there’s plenty of room for (e.g.) natural language processing to assist in the overall task of helping people find stuff.
  • Helping people evaluate stuff. “Too much information, little knowledge, less wisdom” is one way of describing the glut of bits infesting the Information Age. Libraries can help and should help—even though pitfalls abound.
  • Helping people navigate software and information resources. This includes UX for library software, but also a lot of other software that librarians, like it or not, find themselves helping patrons use. There are some areas of software engineering where the programmer can assume that the user is expert in the task that the software assists with; library software isn’t one of them.
  • Sharing stuff. What is Evergreen if not a decade-long project in figuring out ways to better share library materials among more users? Sharing stuff is not a solved problem even for digital stuff.
  • Keeping stuff around. This is an increasingly difficult problem. Time was, you could leave a pile of books sitting around and reasonably expect that at least a few would still exist five hundred years hence. Digital stuff never rewards that sort of carelessness.
  • Protecting patron privacy. This nearly ended up in the unnatural list—a problem can be unnatural but nonetheless crucial to work on. However, since there’s no reason to expect that people will stop being nosy about what other people are reading—and for that nosiness to sometimes turn into persecution—here we are.
  • Authentication. If the library keeps any transaction information on behalf of a patron so that they can get to it later, the software had better be trying to make sure that only the correct patron can see it. Of course, one could argue that library software should never store such information in the first place (after, say, a loan is returned), but I think there can be an honest conflict with patrons’ desires to keep track of what they used in the past.

Second, some distinctly unnatural problems that library technologists all too often must work on:

  • Digital rights management. If Ambrose Bierce were alive, I would like to think that he might define DRM in a library context thus: “Something that is ineffective in its stated purpose—and cannot possibly be effective—but which serves to compromise libraries’ commitment to patron privacy in the pursuit of a misunderstanding about what will keep libraries relevant.”
  • Walled garden maintenance. Consider EZproxy. It takes the back of a very small envelope to realize that hundreds of thousands of person-hours have been expended fiddling with EZproxy configuration files for the sake of bolstering the balance sheets of Big Journal. Is this characterization unfair? Perhaps. Then consider this alternative formulation: the opportunity cost imposed by time spent maintaining or working around barriers to the free exchange of academic publications is huge—and unlike DRM for public library ebooks, there isn’t even a case (good, bad, or indifferent) to be made that the effort results in any concrete financial compensation to the academics who wrote the journal articles that are being so carefully protected.
  • Authorization. It’s one thing to authenticate a patron so that they can get at whatever information the library is storing on their behalf. It’s another thing to spend time coding authentication and authorization systems as part of maintaining the walled gardens.

The common element among the problems I’m calling unnatural? Copyright; in particular, the current copyright regime that enforces the erection of barriers to sharing—and which we can imagine, if perhaps wistfully, changing to the point where DRM and walled garden maintenance need not occupy the attention of the library programmer, who then might find more time to work on some of the natural problems.

Why is this on my mind? I would like to give a shout-out to (and blow a raspberry at) an anonymous publisher who had this to say in a recent article about Sci-Hub:

And for all the researchers at Western universities who use Sci-Hub instead, the anonymous publisher lays the blame on librarians for not making their online systems easier to use and educating their researchers. “I don’t think the issue is access—it’s the perception that access is difficult,” he says.

I know lots of library technologists who would love to have more time to make library software easier to use. Want to help, Dear Anonymous Publisher? Tell your bosses to stop building walls.

Wherein I complain about Pearson’s storage of passwords in plaintext and footnote my snark

From a security alert[1] from Langara College:

Langara was recently notified of a cyber security risk with Pearson online learning which you may be using in your classes. Pearson does not encrypt user names or passwords for the services we use, which puts you at risk. Please note that they are an external vendor; therefore, this security flaw has no direct impact on Langara systems.

This has been a problem since at least 2011[2]; it is cold comfort that at least one Pearson service has a password recovery page that outright says that the user’s password will be emailed to them in clear text[3].

There have been numerous tweets, blog posts, and forum posts about this issue over the years. In at least one case[4], somebody complained to Pearson and ended up getting what reads like a canned email stating:

Pearson must strike a reasonable balance between support methods that are accessible to all users, and the risk of unauthorized access to information in our learning applications. Allowing customers to retrieve passwords via email was an industry standard for non-financial applications.

In response to the changing landscape, we are developing new user rights management protocols as part of a broader commitment to tighten security and safeguard customer accounts, information, and product access. Passwords will no longer be retrievable; customers will be able to reset passwords through secure processes.

This is a risible response for many reasons; I can only hope that they actually follow through with their plan to improve the situation in a timely fashion. Achieving the industry standard for password storage as of 1968 might be a good start[5].
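
The 1968-era idea is to store a one-way hash of the password rather than the password itself. Here is a minimal sketch of that using Python's standard library, with a random salt and PBKDF2. The function names and the iteration count are illustrative assumptions, not a description of any vendor's system or a tuned recommendation.

```python
import hashlib
import hmac
import os

# Assumed work factor, for illustration only; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest from the candidate password and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, digest)
```

Because only the salt and digest are kept, there is nothing to email back to the user; a forgotten password has to be reset rather than retrieved.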

In the meantime, I’m curious whether there are any libraries who are directly involved in the acquisition of Pearson services on behalf of their school or college. If so, might you have a word with your Pearson rep?

Adapted from an email I sent to the LITA Patron Privacy Interest Group’s mailing list. I encourage folks interested in library patron privacy to subscribe; you do not have to be a member of ALA to do so.


1. Pearson Cyber Security Risk
2. Report on Plain Text Offenders
3. Pearson account recovery page
4. Pearson On Password Security
5. Wilkes, M V. Time-sharing Computer Systems. New York: American Elsevier Pub. Co, 1968. Print. It was in this book that Roger Needham first proposed hashing passwords.

Absent friends

Gratitude to Cecily Walker and Kelly McElroy for calling us together for LIS Mental Health Week 2016.

Pondering my bona fides. I will say this: the black dog is my constant companion. I cannot imagine life without that weight.

I am afraid to say more too openly.

I will deflect, then, but in a way that I hope is useful to others.

Consider this: I am certain, as much as I am certain of anything, that my profession has killed at least three men of my acquaintance.

A mentor. A friend. A colleague who I did not know as well as I would have liked, but who I respected.

All of whom were loved. All of whom had the respect of their colleagues — and the customers they served.

All of whom cared, deeply. Too much? I cannot say.

I have been working in library automation long enough to have become a member of that strange group of folks who have their own lore of long nights, of impossible demands and dilemmas, of being at once part of and separate from the overall profession of librarianship. Long enough to have seen friends and colleagues pass away, and to know that my list of the departed will only lengthen.

But these men? All I know is that they left us, or were taken, too soon — and that I can all too easily imagine circumstances where they could have stayed longer. (But please, please don’t take this as an expression of blame.)

I am haunted by the others whom I don’t know, and never will.

I cannot reconcile myself to this. If this blog post were a letter, it would be spotted by my tears.

But I can make a plea.

The relationship between librarians and their vendors is difficult and fraught. It is all too easy to demonize vendors — but sometimes, enmity is warranted; more often, adversariality at least is; and accountability: always. Thus do the strictures of the systems we live in constrain us and alienate us from one another.

At times, circumstances may not permit warmth or even much kindness. But please remember this, if not for me, for the memory of my absent friends: humans occupy both ends of the library/vendor relationship. Humans.

Securing Z39.50 traffic from Koha and Evergreen Z39.50 servers using YAZ and TLS

There’s often more than one way to search a library catalog; or to put it another way, not all users come in via the front door.  For example, ensuring that your public catalog supports HTTPS can help prevent bad actors from snooping on patrons’ searches — but if one of your users happens to use a tool that searches your catalog over Z39.50, by default they have less protection.

Consider this extract from a tcpdump of a Z39.50 session:

No, MARC is not a cipher; it just isn’t.

How to improve this state of affairs? There was some discussion back in 2000 of bundling SSL or TLS into the Z39.50 protocol, although it doesn’t seem like it went anywhere. Of course, SSH tunnels and stunnel are options, but it turns out that there can be an easier way.

As is usually the case with anything involving Z39.50, we can thank the folks at IndexData for being on top of things: it turns out that TLS support is easily enabled in YAZ. Here’s how this can be applied to Evergreen and Koha.

The first step is to create an SSL certificate; a self-signed one probably suffices. The certificate and its private key should be concatenated into a single PEM file, like this:
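
The concatenation step can be sketched in Python, assuming the certificate and private key already exist as separate PEM files. The function name and the file names in the usage line are hypothetical.

```python
import os
from pathlib import Path


def combine_pem(cert_path, key_path, out_path):
    """Write the certificate followed by its private key into one PEM file."""
    parts = []
    for source in (cert_path, key_path):
        # Normalize trailing whitespace so each block ends with one newline.
        text = Path(source).read_text(encoding="ascii").strip()
        parts.append(text + "\n")
    # Create the file readable by its owner only, since it holds the private key.
    fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="ascii") as out:
        out.write("".join(parts))
    return out_path
```

For example, `combine_pem("cert.pem", "key.pem", "z3950.pem")` would produce a single file to hand to the Z39.50 server.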


Evergreen’s Z39.50 server can be told to require SSL via a <listen> element in /openils/conf/oils_yaz.xml, like this:

To supply the path to the certificate, a change to will do the trick:

For Koha, a <listen> element should be added to koha-conf.xml, e.g.,

zebrasrv will also need to know how to find the SSL certificate:

And with that, we can test: yaz-client ssl:localhost:4210/CONS or yaz-client ssl:localhost:4210/biblios. Et voila!

Of course, not every Z39.50 client will know how to use TLS… but lots will, as YAZ is the basis for many of them.

Books and articles thud so nicely: a response to a lazy post about gender in library technology

The sort of blog post that jumbles together a few almost randomly-chosen bits on a topic, caps them off with an inflammatory title, then ends with “let’s discuss!” has always struck me as one of the lazier options in the blogger’s toolbox.  Sure, if the blog has an established community, gently tweaking the noses of the commentariat may provide some weekend fun and a breather for the blogger. If the blog doesn’t have such a community, however, a post that invites random commenters to tussle is better if the blogger takes the effort to put together a coherent argument for folks to respond to.  Otherwise, the assertion-jumble approach can result in the post becoming so bad that it’s not even wrong.

Case in point: Jorge Perez’s post on the LITA blog yesterday, Is Technology Bringing in More Skillful Male Librarians?

It’s a short read, but here’s a representative quote:

[…] I was appalled to read that the few male librarians in our profession are negatively stereotyped into being unable to handle a real career and the male dominated technology field infers that more skillful males will join the profession in the future.

Are we supposed to weep for the plight of the male librarian, particularly the one in library technology? On reflection, I think I’ll just follow the lead of the scrivener Bartleby and move on. I do worry about many things in library technology: how money spent on library software tends to be badly allocated; how few libraries (especially public ones) are able to hire technology staff in the first place; how technology projects all too often get oversold; the state of relations between library technologists and other sorts of library workers; and yes, a collective lack of self-confidence that library technology is worth doing as a distinct branch of library work (as opposed to giving the game up and leaving it to our commercial, Google-ish “betters”).

I am also worried about gender balance (and balance on all axes) among those who work in library technology — but the last thing I worry about in that respect is the ability of men (particularly men who look like me) to secure employment and promotions building software for libraries.  For example, consider Melissa Lamont’s article in 2009, Gender, Technology, and Libraries. With men accounting for about 65% of heads of library systems department positions and about 65% of authorship in various library technology journals… in a profession that is predominantly comprised of women… no, I’m not worried that I’m a member of an underrepresented class. Exactly the opposite.  And to call out the particular pasture of library tech I mostly play in: the contributor base of most large library open source software projects, Koha and Evergreen included, continues to skew heavily male.

I do think that library technology does better at gender balance than Silicon Valley as a whole.

That previous statement is, of course, damning with faint praise (although I suppose there could be some small hope that efforts in library technology to do better might spill over into IT as a whole).

Back to Perez’s post. Some other things that I raise my eyebrow at: an infographic of a study of stereotypes of male librarians from 23 years ago. Still relevant? An infographic without a complete legend (leaving me free to conclude that 79.5% of folks in ALA-accredited library schools wear red socks ALL THE TIME).  And, to top it off, a sentence that all too easily could be read as a homophobic joke — or perhaps as a self-deprecating joke where the deprecation comes from imputed effemination, which is no improvement. Playing around with stereotypes can be useful, but it requires effort to do well, which this post lacks.

Of course, by this point I’ve written over 500 words regarding Perez’s post, so I suppose the “let’s discuss!” prompt worked on me.  I do think that LITA should be tackling difficult topics, but… I am disappointed.

LITA, you can do better. (And as a LITA member, perhaps I should put it this way: we can do better.)

I promised stuff to make satisfying thuds with.  Sadly, what with the epublishing revolution, most of the thuds will be virtual, but we shall persevere nonetheless: there are plenty of people around with smart things to say about gender in library technology.  Here are some links:

I hope LITA will reach out to some of them.

Update 2015-10-26:

Update 2015-10-28:

  • Swapped in a more direct link to Lisa Rabey’s post.

Update 2015-11-06:

Perez has posted a follow-up on the LITA blog. I am underwhelmed by the response — if in fact it’s actually a response as such. Perez states that “I wanted to present information I found while reading”, but ultimately missed an opportunity to more directly let Deborah Hicks’ work speak for itself. Karen Schneider picked up that task, got a copy of Hicks’ book, and posted about it on LITA-L.

I agree with Karen Schneider’s assessment that Hicks’ book is worth reading by folks interested in gender and librarianship (and it is on my to-be-read pile), but I am not on board with her suggestion that the matter be viewed as just the publication of a very awkward blog post from which a reference to a good book can be extracted (although I acknowledge her generosity in that viewpoint). It’s one thing to write an infelicitously-composed post that provides a technical tip of interest to systems librarians; it’s another thing to be careless when writing about gender in library technology.

In his follow-up, Perez expresses concerns about how certain stereotypes about librarianship can affect others’ perceptions of librarianship — and consequently, salaries and access to perceived authority. He also alludes to (if I understand him correctly) how being a Latino and a librarian has affected perceptions of him and his work. Should the experiences of Latino librarians be discussed? Of course! Is librarianship and how that interacts with the performance of masculinity worthy of study? Of course! But until women in library technology (and in technology fields in general) can count on getting a fair shake, and until the glass escalator is shattered, failing to acknowledge that the glass escalator is still operating when writing about gender in library technology can transform awkwardness into a source of pain.