Author Archives: Galen Charlton

Wherein I complain about Pearson’s storage of passwords in plaintext and footnote my snark

From a security alert 1 from Langara College:

Langara was recently notified of a cyber security risk with Pearson online learning which you may be using in your classes. Pearson does not encrypt user names or passwords for the services we use, which puts you at risk. Please note that they are an external vendor; therefore, this security flaw has no direct impact on Langara systems.

This has been a problem since at least 20112; it is cold comfort that at least one Pearson service has a password recovery page that outright says that the user’s password will be emailed to them in clear text3.

There have been numerous tweets, blog posts, and forum posts about this issue over the years. In at least one case4, somebody complained to Pearson and ended up getting what reads like a canned email stating:

Pearson must strike a reasonable balance between support methods that are accessible to all users, and the risk of unauthorized access to information in our learning applications. Allowing customers to retrieve passwords via email was an industry standard for non-financial applications.

In response to the changing landscape, we are developing new user rights management protocols as part of a broader commitment to tighten security and safeguard customer accounts, information, and product access. Passwords will no longer be retrievable; customers will be able to reset passwords through secure processes.

This is a risible response for many reasons; I can only hope that they actually follow through with their plan to improve the situation in a timely fashion. Achieving the industry standard for password storage as of 1968 might be a good start5.

In the meantime, I’m curious whether there are any libraries who are directly involved in the acquisition of Pearson services on behalf of their school or college. If so, might you have a word with your Pearson rep?

Adapted from an email I sent to the LITA Patron Privacy Interest Group’s mailing list. I encourage folks interested in library patron privacy to subscribe; you do not have to be a member of ALA to do so.


1. Pearson Cyber Security Risk
2. Report on Plain Text Offenders
3. Pearson account recovery page
4. Pearson On Password Security
5. Wilkes, M V. Time-sharing Computer Systems. New York: American Elsevier Pub. Co, 1968. Print.. It was in this book that Roger Needham first proposed hashing passwords.

Absent friends

Gratitude to Cecily Walker and Kelly McElroy for calling us together for LIS Mental Health Week 2016.

Pondering my bona fides. I will say this: the black dog is my constant companion. I cannot imagine life without that weight.

I am afraid to say more too openly.

I will deflect, then, but in a way that I hope is useful to others.

Consider this: I am certain, as much as I am certain of anything, that my profession has killed at least three men of my acquaintance.

A mentor. A friend. A colleague who I did not know as well as I would have liked, but who I respected.

All of whom were loved. All of whom had the respect of their colleagues — and the customers they served.

All of whom cared, deeply. Too much? I cannot say.

I have been working in library automation long enough to have become a member of that strange group of folks who have their own lore of long nights, of impossible demands and dilemmas, of being at once part of and separate from the overall profession of librarianship. Long enough to have seen friends and colleagues pass away, and to know that my list of the departed will only lengthen.

But these men? All I know is that they left us, or were taken, too soon — and that I can all too easily imagine circumstances where they could have stayed longer. (But please, please don’t take this as an expression of blame.)

I am haunted by the others whom I don’t know, and never will.

I cannot reconcile myself to this. If this blog post were a letter, it would be spotted by my tears.

But I can make a plea.

The relationship between librarians and their vendors is difficult and fraught. It is all to easy to demonize vendors — but sometimes, enmity is warranted; more often, adversariality at least is; and accountability: always. Thus do the strictures of the systems we live in constrain us and alienate us from one another.

At times, circumstances may not permit warmth or even much kindness. But please remember this, if not for me, for the memory of my absent friends: humans occupy both ends of the library/vendor relationship. Humans.

Whence library technology innovation?

Rob McGee has been moderating the “View from the Top” presidents [of library technology companies] seminar for 26 years. As an exercise in grilling executives, its value to librarians varies; while CEOs, presidents, senior VPs and the like show up, the discussion is usually constrained. Needless to say, it’s not common for concerns to be voiced openly by the panelists, and this year was no different. The trend of consolidation in the library automation industry continued to nobody’s surprise; that a good 40 minutes of the panel was spent discussing the nuts and bolts of who bought whom for how much did not result in any scintillating disclosures.

But McGee sometimes mixes it up. I was present to watch the panel, but ended up letting myself get plucked from the audience to make a couple comments.

One of the topics discussed during the latter half of the panel was patron privacy, and I ended up in the happy position of getting the last word in, to the effect that for 2016, patron privacy is a technology trend. With the ongoing good work of the Library Freedom Project and Eric Hellmann, the release of the NISO Privacy Principles, the launch of Let’s Encrypt, and various efforts by groups within ALA doing educational and policy work related to patron privacy, lots of progress is being made in turning our values into working code.

However, the reason I ended up on the panel was that McGee wanted to stir the pot about where innovation in library technology comes from. The gist of my response: it comes from the libraries themselves and from free and open source projects initiated by libraries.

This statement requires some justification.

First, here are some things that I don’t believe:

  • The big vendors don’t innovate. Wrong: if innovation is an idea plus the ability to implement it plus the ability to convince others that the idea is good in the first place, well, the big firms do have plenty of resources to apply to solving problems. So do, of course, the likes of OCLC and, in particular, OCLC Research. On the other hand, big firms do have constraints that limit the sorts of risks they can take. It’s one thing for a library project to fail or for a startup to go bust; it’s another thing for a firm employing hundreds of people and (often) answering to venture capital to take certain kinds of technology risks: nobody is running Taos or Horizon 8, and nobody wants to be the one to propose the next big failure.
  • Libraries are the only source of innovative new ideas. Nope; lots of good ideas come from outside of libraries (although that’s no reason to think that they only originate from outside). Also, automation vendors can attain a perspective that few librarians enjoy: I submit that there are very few professional librarians outside of vendor employees who have broad experience with school libraries and public libraries and academic libraries and special libraries and national libraries. A vendor librarian who works as an implementation project manager can gain that breadth of experience in the space of three years.
  • Only developers who work exclusively in free or open source projects come up with good ideas. Or only developers who work exclusively for proprietary vendors come up with good ideas. No: technical judgment and good design sense doesn’t distribute itself that way.
  • Every idea for an improvement to library software is an innovation. Librarians are not less prone to bikeshedding than anybody else (nor are they necessarily more prone to it). However, there is undoubtedly a lot of time and money spent on local tweaks, or small tweaks, or small and local tweaks (for both proprietary and F/LOSS projects) that would be better redirected to new things that better serve libraries and their users.

That out of the way, here’s what I do believe:

  • Libraries have initiated a large number of software and technology projects that achieved success, and continue to do so. Geac, anybody? NOTIS? VTLS? ALEPH. Many ILSs had their roots in library projects that later were commercialized. For that matter, from one point of view both Koha and Evergreen are also examples of ILSs initiated by libraries that got commercialized; it’s just that the free software model provides a better way of doing it as opposed to spinning off a proprietary firm.
  • Free and open source software models provide a way for libraries to experiment and more readily get others to contribute to the experiments than was the case previously.
  • And finally, libraries have different incentives that affect not just how they innovate, but to what end. It still matters that the starting point of most library projects is better serving the needs of the library, their users, or both, not seeking a large profit in three years time.

But about that last point and the period of three years to profit—I didn’t pull that number out of my hat; it came from a fellow panelist who was describing the timeframe that venture capital firms care about. (So maybe that nuts-and-bolts discussion about mergers and acquisitions was useful after all).

Libraries can afford to take a longer view. More time, in turn, can contribute to innovations that last.

Securing Z39.50 traffic from Koha and Evergreen Z39.50 servers using YAZ and TLS

There’s often more than way to search a library catalog; or to put it another way, not all users come in via the front door.  For example, ensuring that your public catalog supports HTTPS can help prevent bad actors from snooping on patron’s searches — but if one of your users happens to use a tool that searches your catalog over Z39.50, by default they have less protection.

Consider this extract from a tcpdump of a Z39.50 session:

No, MARC is not a cipher; it just isn’t.

How to improve this state of affairs? There was some discussion back in 2000 of bundling SSL or TLS into the Z39.50 protocol, although it doesn’t seem like it went anywhere. Of course, SSH tunnels and stunnel are options, but it turns out that there can be an easier way.

As is usually the case with anything involving Z39.50, we can thank the folks at IndexData for being on top of things: it turns out that TLS support is easily enabled in YAZ. Here’s how this can be applied to Evergreen and Koha.

The first step is to create an SSL certificate; a self-signed one probably suffices. The certificate and its private key should be concatenated into a single PEM file, like this:


Evergreen’s Z39.50 server can be told to require SSL via a <listen> element in /openils/conf/oils_yaz.xml, like this:

To supply the path to the certificate, a change to will do the trick:

For Koha, a <listen> element should be added to koha-conf.xml, e.g.,

zebrasrv will also need to know how to find the SSL certificate:

And with that, we can test: yaz-client ssl:localhost:4210/CONS or yaz-client ssl:localhost:4210/biblios. Et voila!

Of course, not every Z39.50 client will know how to use TLS… but lots will, as YAZ is the basis for many of them.

Books and articles thud so nicely: a response to a lazy post about gender in library technology

The sort of blog post that jumbles together a few almost randomly-chosen bits on a topic, caps them off with an inflammatory title, then ends with “let’s discuss!” has always struck me as one of the lazier options in the blogger’s toolbox.  Sure, if the blog has an established community, gently tweaking the noses of the commentariat may provide some weekend fun and a breather for the blogger. If the blog doesn’t have such a community, however, a post that invites random commenters to tussle is better if the blogger takes the effort to put together a coherent argument for folks to respond to.  Otherwise, the assertion-jumble approach can result in the post becoming so bad that it’s not even wrong.

Case in point: Jorge Perez’s post on the LITA blog yesterday, Is Technology Bringing in More Skillful Male Librarians?

It’s a short read, but here’s a representative quote:

[…] I was appalled to read that the few male librarians in our profession are negatively stereotyped into being unable to handle a real career and the male dominated technology field infers that more skillful males will join the profession in the future.

Are we supposed to weep for the plight of the male librarian, particularly the one in library technology? On reflection, I think I’ll just follow the lead of the scrivener Bartleby and move on. I do worry about many things in library technology: how money spent on library software tends to be badly allocated; how few libraries (especially public ones) are able to hire technology staff in the first place; how technology projects all too often get oversold; the state of relations between library technologists and other sorts of library workers; and yes, a collective lack of self-confidence that library technology is worth doing as a distinct branch of library work (as opposed to giving the game up and leaving it to our commercial, Google-ish “betters”).

I am also worried about gender balance (and balance on all axes) among those who work in library technology — but the last thing I worry about in that respect is the ability of men (particularly men who look like me) to secure employment and promotions building software for libraries.  For example, consider Melissa Lamont’s article in 2009, Gender, Technology, and Libraries. With men accounting for about 65% of heads of library systems department positions and about 65% of authorship in various library technology journals… in a profession that is predominantly comprised of women… no, I’m not worried that I’m a member of an underrepresented class. Exactly the opposite.  And to call out the particular pasture of library tech I mostly play in: the contributor base of most large library open source software projects, Koha and Evergreen included, continue to skew heavily male.

I do think that library technology does better at gender balance than Silicon Valley as a whole.

That previous statement is, of course, damning with faint praise (although I suppose there could be some small hope that efforts in library technology to do better might spill over into IT as whole).

Back to Perez’s post. Some other things that I raise my eyebrow at: an infographic of a study of stereotypes of male librarians from 23 years ago. Still relevant? An infographic without a complete legend (leading free me to conclude that 79.5% of folks in ALA-accredited library schools wear red socks ALL THE TIME).  And, to top it off, a sentence that all too easily could be read as a homophobic joke — or perhaps as a self-deprecating joke where the deprecation comes from imputed effemination, which is no improvement. Playing around with stereotypes can be useful, but it requires effort to do well, which this post lacks.

Of course, by this point I’ve written over 500 words regarding Perez’s post, so I suppose the “let’s discuss!” prompt worked on me.  I do think think that LITA should be tackling difficult topics, but… I am disappointed.

LITA, you can do better. (And as a LITA member, perhaps I should put it this way: we can do better.)

I promised stuff to make satisfying thuds with.  Sadly, what with the epublishing revolution, most of the thuds will be virtual, but we shall persevere nonetheless: there are plenty of people around with smart things to say about gender in library technology.  Here some links:

I hope LITA will reach out to some of them.

Update 2015-10-26:

Update 2015-10-28:

  • Swapped in a more direct link to Lisa Rabey’s post.
Update 2015-11-06:

Perez has posted follow-up on the LITA blog. I am underwhelmed by the response — if in fact it’s actually a response as such. Perez states that “I wanted to present information I found while reading”, but ultimately missed an opportunity to more directly let Deborah Hicks’ work speak for itself. Karen Schneider picked up that task, got a copy of Hicks’ book, and posted about it on LITA-L.

I agree with Karen Schneider’s assessment that Hicks’ book is worth reading by folks interested in gender and librarianship (and it is on my to-be-read pile), but I am not on board with her suggestion that the matter be viewed as just the publication of a very awkward blog post from which a reference to a good book can be extracted (although I acknowledge her generosity in that viewpoint). It’s one thing to write an infelicitously-composed post that provides a technical tip of interest to systems librarians; it’s another thing to be careless when writing about gender in library technology.

In his follow-up, Perez expresses concerns how certain stereotypes about librarianship can affect others’ perceptions of librarianship — and consequently, salaries and access to perceived authority. He also alludes to (if I understand him correctly) how being a Latino and a librarian has affected perceptions of him and his work. Should the experiences of Latino librarians be discussed? Of course! Is librarianship and how that interacts with the performance of masculinity worthy of study? Of course! But until women in library technology (and in technology fields in general) can count on getting a fair shake, and until the glass escalator is shattered, failing to acknowledge that the glass escalator is still operating when writing about gender in library technology can transform awkwardness into a source of pain.

Ada Lovelace Day, during which I call out some folk for awesomeness

Today is Ada Lovelace Day, a celebration of the work and achievements of women in science, technology, engineering, and math.

And library technology, whose place in STEM is not to be denied.

Here are a few (and I should emphasize that this is a very incomplete list) of the women I have had the privilege to collaborate with and learn from:

  • Ruth Bavousett: Ruth is a Perl monger, contributor of many patches to Koha, has served as Koha’s translation manager, and is an author for
  • Katrin Fischer: Katrin has contributed over 500 patches to Koha and has served many terms as Koha’s quality assurance manager. QA Manager is not an easy position to occupy, and never comes with enough thanks, but Katrin has succeeded at it. Thanks, Katrin!
  • Christina Harlow (@cm_harlow): Christina walks the boundary between library metadata and library software and bridges it. In her blog’s title, she gives herself the sobriquet of “metadata lackey” — but to me that seems far too modest. She’s been instrumental in the revival of Mashcat this year.
  • Kathy Lussier: Kathy has contributed both code and documentation to the Evergreen project and has served in many roles on the project, including on its oversight board and its web team. She has spearheaded various initiatives to make the Evergreen project more inclusive and is a strong advocate for universal, accesible design.

Henriette Avram [image via Wikipedia]

Henriette Avram [image via Wikipedia]

Although she is no longer with us, Henriette Avram, the creator of the MARC format, deserves a callout today as well: it is not every programmer who ships, and moreover, ships something that remains in use 50 years later. I am sure that Avram, were she still alive and working, would be heavily involved in libraries’ efforts to adopt Linked Open Data.

Evergreen 2.9: now with fewer zombies

While looking to see what made it into the upcoming 2.9 beta release of Evergreen, I had a suspicion that something unprecedented had happened. I ran some numbers, and it turns out I was right.

Evergreen 2.9 will feature fewer zombies.

Considering that I’m sitting in a hotel room taking a break from Sasquan, the 2015 World Science Fiction Convention, zombies may be an appropriate theme.

But to put it more mundanely, and to reveal the unprecedented bit: more files were deleted in the course of developing Evergreen 2.9 (as compared to the previous stable version) than entirely new files were added.

To reiterate: Evergreen 2.9 will ship with fewer files, even though it includes numerous improvements, including a big chunk of the cataloging section of the web staff client.

Here’s a table counting the number of new files, deleted files, and files that were renamed or moved from the last release in a stable series to the first release in the next series.

Between release… … and release Entirely new files Files deleted Files renamed
rel_1_6_2_3 rel_2_0_0 1159 75 145
rel_2_0_12 rel_2_1_0 201 75 176
rel_2_1_6 rel_2_2_0 519 61 120
rel_2_2_9 rel_2_3_0 215 137 2
rel_2_3_12 rel_2_4_0 125 30 8
rel_2_4_6 rel_2_5_0 143 14 1
rel_2_5_9 rel_2_6_0 83 31 4
rel_2_6_7 rel_2_7_0 239 51 4
rel_2_7_7 rel_2_8_0 84 30 15
rel_2_8_2 master 99 277 0

The counts were made using git diff --summary --find-rename FROM..TO | awk '{print $1}' | sort | uniq -c and ignoring file mode changes. For example, to get the counts between release 2.8.2 and the master branch as of this post, I did:

Why am I so excited about this? It means that we’ve made significant progress in getting rid of old code that used to serve a purpose, but no longer does. Dead code may not seem so bad — it just sits there, right? — but like a zombie, it has a way of going after developers’ brains. Want to add a feature or fix a bug? Zombies in the code base can sometimes look like they’re still alive — but time spent fixing bugs in dead code is, of course, wasted. For that matter, time spent double-checking whether a section of code is a zombie or not is time wasted.

Best for the zombies to go away — and kudos to Bill Erickson, Jeff Godin, and Jason Stephenson in particular for removing the remnants of Craftsman, script-based circulation rules, and JSPac from Evergreen 2.9.

ALA Annual 2015 schedule, with bonus mod_proxy hackery

My ALA Annual this year is going to focus on five hashtags: #mashcat, #privacy, #nisoprivacy, #kohails, and #evgils.

#mashcat is for Mashcat, which an effort to build links between library systems and library metadata folks. We’ve had some recent success with Twitter chats, and I’ve made up some badge ribbons. If you’d like one, tweet at me (@gmcharlt)!

#privacy and #nisoprivacy are for patron privacy. My particular interest in using our technology to better protect it. I’ll be running the LITA Patron Privacy Technologies Interest Group meeting on Saturday, (where I look forward to Alison Macrina’s update on Let’s Encrypt). I’ll also be participating in the face-to-face meeting on Monday and Tuesday for the NISO project to create a consensus framework for patron privacy in digital library and information systems.

#kohails and #evgils are for Koha and Evergreen, both of which I hack on and which MPOW supports – so one of the things I’ll also be doing is wearing my vendor hat while boothing and meeting.

Here’s my conference schedule so far, although I hope to squeeze in a Linked Data program as well:

In the title of the post, I promised mod_proxy hackery. Not typical for an ALA schedule post? Well, the ALA scheduler website allows you to choose you make your schedule public. If you do that, you can embed the schedule in a blog post using an iframe.

Here’s the HTML that the scheduler suggests:

There’s a little problem with that suggestion, though: my blog is HTTPS-only. As a consequence, an HTTP iframe won’t be rendered by the browser.

What if I change the embedded URL to “”? Still doesn’t work, as the SSL certificate returned is for, which doesn’t match *cough*

Rather than do something simple, such as using copy-and-paste, I ended up configuring Apache to set up a reverse proxy. That way, my webserver can request my schedule from ALA’s webserver (as well as associated CSS), then present it to the web browser over HTTPS. Here’s the configuration I ended up with, with a bit of help from Stack Overflow:

This is a bit ugly (and I’ll be disabling the reverse proxy after the conference is over)… but it works for the moment, and also demonstrates how one might make a resolutely HTTP-only service on your intranet accessible over HTTPS publicly.

Onward! I look forward to meeting friends old and new in San Francisco!

Exercises involving a MARC record for an imaginary book, inspired by a recent AUTOCAT thread

Consider the following record, inspired by the discussion on AUTOCAT that was kicked off by this query:

100 1_ ‡a Smith, June, ‡d 1977-
245 00 ‡a Regarding events in Ferguson / ‡c June Smith.
260 _1 ‡a New York : ‡b Hope Press, ‡c 2017.
300 __ ‡a 371 p. : ‡b ill. ; ‡c 10 x 27 cm
336 __ ‡a text ‡2 rdacontent
337 __ ‡a unmediated ‡2 rdamedia
338 __ ‡a volume ‡2 rdacarrier
650 _0 ‡a United States ‡x History ‡y Civil War, 1861-1865.
650 _0 ‡a Police brutality ‡z Missouri ‡z Ferguson ‡y 2014.
650 _0 ‡a Ferguson (Mo.) Riot, 2015.
650 _0 ‡a Reconstruction (U.S. history, 1865-).
650 _0 ‡a Race riots ‡z Missouri ‡z Ferguson ‡y 2014.
650 _0 ‡a Demonstrations ‡z Missouri ‡z Ferguson ‡y 2014.
653 20 ‡a #BlackLivesMatter
651 _0 ‡a Ferguson (Mo.) ‡x History ‡y 21st century.
650 _0 ‡a Political violence ‡z Missouri ‡z Ferguson ǂx History 
       ‡y 21st century.
650 _0 ‡a Social conflict ‡z Missouri ‡z Ferguson.
650 _0 ‡a Civil rights demonstrations ‡z Missouri ‡z Saint Louis County.
650 _0 ‡a Social conflict ‡z Missouri ‡z Saint Louis.
650 _0 ‡a Protest movements ‡z Missouri ‡z Ferguson.
650 _0 ‡a Protest movements ‡z Missouri ‡z Saint Louis.
650 _0 ‡a Militarization of police ‡z Missouri ‡z Ferguson.
650 _0 ‡a Militarization of police ‡z United States.
651 _0 ‡a Ferguson (Mo.) ‡z Race relations.
653 _0 ‡a 2014 Ferguson unrest
650 _0 ‡a African Americans ‡x Civil rights ‡x History.
650 _0 ‡a African Americans ‡x Crimes against ‡x History.
650 _0 ‡a Police brutality ‡z United States.
650 _0 ‡a Police ‡x Complaints against ‡z Missouri ‡z Ferguson.
650 _0 ‡a Police-community relations ‡z Missouri ‡z Ferguson.
650 _0 ‡a Discrimination in criminal justice administration ‡z Missouri
       ‡z Ferguson.
650 _0 ‡a United States ‡x Race relations ‡x History.

Some exercises for the reader:

  1. Identify the subject headings that detract from the neutrality of this record. Show your work.
  2. Identify the subject headings whose absence lessens the accuracy or neutrality of this record. Show your work.
  3. Of the headings that detract from the neutrality of this record, identify the ones that are inaccurate. Show your work.
  4. Adopt the perspective of someone born in 1841 and repeat exercises 1-3. Show your work.
  5. Adopt the perspective of someone born in 2245 and repeat exercises 1-3. Show your work.
  6. Repeat exercises 1-3 in the form of a video broadcast over the Internet.
  7. Repeat exercises 1-3 as a presentation to your local library board.

I acknowledge with gratitude the participants in the AUTOCAT thread who grappled with the question; many of them suggested subject headings used in this record.