Securing Z39.50 traffic from Koha and Evergreen Z39.50 servers using YAZ and TLS

There’s often more than one way to search a library catalog; or to put it another way, not all users come in via the front door. For example, ensuring that your public catalog supports HTTPS can help prevent bad actors from snooping on patrons’ searches — but if one of your users happens to use a tool that searches your catalog over Z39.50, by default they have less protection.

Consider this extract from a tcpdump of a Z39.50 session:
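(A sketch of how to reproduce such a capture yourself, assuming the Z39.50 server is listening on the standard port 210:)

    # print packet payloads as ASCII; any unencrypted MARC data
    # traveling over the wire shows up in the clear
    sudo tcpdump -A -s 0 'tcp port 210'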

No, MARC is not a cipher; it just isn’t.

How to improve this state of affairs? There was some discussion back in 2000 of bundling SSL or TLS into the Z39.50 protocol, although it doesn’t seem like it went anywhere. Of course, SSH tunnels and stunnel are options, but it turns out that there can be an easier way.

As is usually the case with anything involving Z39.50, we can thank the folks at IndexData for being on top of things: it turns out that TLS support is easily enabled in YAZ. Here’s how this can be applied to Evergreen and Koha.

The first step is to create an SSL certificate; a self-signed one probably suffices. The certificate and its private key should be concatenated into a single PEM file, like this:
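(A sketch, assuming OpenSSL; the file names are illustrative:)

    # generate a self-signed certificate and key, good for one year
    openssl req -newkey rsa:2048 -nodes -keyout z3950.key \
        -x509 -days 365 -out z3950.crt

    # concatenate the certificate and key into a single PEM file
    cat z3950.crt z3950.key > z3950.pem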


Evergreen’s Z39.50 server can be told to require SSL via a <listen> element in /openils/conf/oils_yaz.xml, like this:
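(A sketch; the id attribute is illustrative, and the port is chosen to match the test commands at the end of this post:)

    <listen id="public">ssl:@:4210</listen>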

To supply the path to the certificate, a change to the script that starts the Z39.50 server will do the trick:
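(A sketch, assuming a stock setup in which Evergreen’s Z39.50 server is simple2zoom, a YAZ-based frontend server; -C is the YAZ frontend server option for supplying a PEM certificate file, and the path is illustrative:)

    # in the startup script, add the certificate option to the server invocation
    simple2zoom -C /openils/conf/z3950.pem /openils/conf/oils_yaz.xml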

For Koha, a <listen> element should be added to koha-conf.xml, e.g.,
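(A sketch, using the same illustrative port; Koha’s koha-conf.xml already uses <listen> elements of this shape for its other servers:)

    <listen id="publicserver">ssl:@:4210</listen>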

zebrasrv will also need to know how to find the SSL certificate:
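(A sketch; zebrasrv is built on the same YAZ frontend server code, so the same -C option should apply. The paths are illustrative:)

    zebrasrv -C /etc/koha/z3950.pem -f /etc/koha/koha-conf.xml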

And with that, we can test: yaz-client ssl:localhost:4210/CONS or yaz-client ssl:localhost:4210/biblios. Et voilà!

Of course, not every Z39.50 client will know how to use TLS… but lots will, as YAZ is the basis for many of them.

Books and articles thud so nicely: a response to a lazy post about gender in library technology

The sort of blog post that jumbles together a few almost randomly-chosen bits on a topic, caps them off with an inflammatory title, then ends with “let’s discuss!” has always struck me as one of the lazier options in the blogger’s toolbox. Sure, if the blog has an established community, gently tweaking the noses of the commentariat may provide some weekend fun and a breather for the blogger. If the blog doesn’t have such a community, however, a post that invites random commenters to tussle works better if the blogger takes the effort to put together a coherent argument for folks to respond to. Otherwise, the assertion-jumble approach can result in the post becoming so bad that it’s not even wrong.

Case in point: Jorge Perez’s post on the LITA blog yesterday, Is Technology Bringing in More Skillful Male Librarians?

It’s a short read, but here’s a representative quote:

[…] I was appalled to read that the few male librarians in our profession are negatively stereotyped into being unable to handle a real career and the male dominated technology field infers that more skillful males will join the profession in the future.

Are we supposed to weep for the plight of the male librarian, particularly the one in library technology? On reflection, I think I’ll just follow the lead of the scrivener Bartleby and move on. I do worry about many things in library technology: how money spent on library software tends to be badly allocated; how few libraries (especially public ones) are able to hire technology staff in the first place; how technology projects all too often get oversold; the state of relations between library technologists and other sorts of library workers; and yes, a collective lack of self-confidence that library technology is worth doing as a distinct branch of library work (as opposed to giving the game up and leaving it to our commercial, Google-ish “betters”).

I am also worried about gender balance (and balance on all axes) among those who work in library technology — but the last thing I worry about in that respect is the ability of men (particularly men who look like me) to secure employment and promotions building software for libraries.  For example, consider Melissa Lamont’s 2009 article, Gender, Technology, and Libraries. With men accounting for about 65% of heads of library systems department positions and about 65% of authorship in various library technology journals… in a profession that is predominantly comprised of women… no, I’m not worried that I’m a member of an underrepresented class. Exactly the opposite.  And to call out the particular pasture of library tech I mostly play in: the contributor bases of most large library open source software projects, Koha and Evergreen included, continue to skew heavily male.

I do think that library technology does better at gender balance than Silicon Valley as a whole.

That previous statement is, of course, damning with faint praise (although I suppose there could be some small hope that efforts in library technology to do better might spill over into IT as a whole).

Back to Perez’s post. Some other things that raised my eyebrow: an infographic from a study of stereotypes of male librarians conducted 23 years ago. Still relevant? An infographic without a complete legend (leaving me free to conclude that 79.5% of folks in ALA-accredited library schools wear red socks ALL THE TIME).  And, to top it off, a sentence that all too easily could be read as a homophobic joke — or perhaps as a self-deprecating joke where the deprecation comes from imputed effemination, which is no improvement. Playing around with stereotypes can be useful, but it requires effort to do well, and that effort is lacking here.

Of course, by this point I’ve written over 500 words regarding Perez’s post, so I suppose the “let’s discuss!” prompt worked on me.  I do think that LITA should be tackling difficult topics, but… I am disappointed.

LITA, you can do better. (And as a LITA member, perhaps I should put it this way: we can do better.)

I promised stuff to make satisfying thuds with.  Sadly, what with the epublishing revolution, most of the thuds will be virtual, but we shall persevere nonetheless: there are plenty of people around with smart things to say about gender in library technology.  Here are some links:

I hope LITA will reach out to some of them.

Update 2015-10-26:

Update 2015-10-28:

  • Swapped in a more direct link to Lisa Rabey’s post.
Update 2015-11-06:

Perez has posted a follow-up on the LITA blog. I am underwhelmed by the response — if in fact it’s actually a response as such. Perez states that “I wanted to present information I found while reading”, but ultimately missed an opportunity to more directly let Deborah Hicks’ work speak for itself. Karen Schneider picked up that task, got a copy of Hicks’ book, and posted about it on LITA-L.

I agree with Karen Schneider’s assessment that Hicks’ book is worth reading by folks interested in gender and librarianship (and it is on my to-be-read pile), but I am not on board with her suggestion that the matter be viewed as just the publication of a very awkward blog post from which a reference to a good book can be extracted (although I acknowledge her generosity in that viewpoint). It’s one thing to write an infelicitously-composed post that provides a technical tip of interest to systems librarians; it’s another thing to be careless when writing about gender in library technology.

In his follow-up, Perez expresses concerns about how certain stereotypes of librarianship can affect others’ perceptions of it — and consequently, salaries and access to perceived authority. He also alludes to (if I understand him correctly) how being a Latino and a librarian has affected perceptions of him and his work. Should the experiences of Latino librarians be discussed? Of course! Is librarianship, and how it interacts with the performance of masculinity, worthy of study? Of course! But until women in library technology (and in technology fields in general) can count on getting a fair shake, and until the glass escalator is shattered, failing to acknowledge that the glass escalator is still operating when writing about gender in library technology can transform awkwardness into a source of pain.

Ada Lovelace Day, during which I call out some folk for awesomeness

Today is Ada Lovelace Day, a celebration of the work and achievements of women in science, technology, engineering, and math.

And library technology, whose place in STEM is not to be denied.

Here are a few (and I should emphasize that this is a very incomplete list) of the women I have had the privilege to collaborate with and learn from:

  • Ruth Bavousett: Ruth is a Perl monger, a contributor of many patches to Koha, has served as Koha’s translation manager, and is a published author.
  • Katrin Fischer: Katrin has contributed over 500 patches to Koha and has served many terms as Koha’s quality assurance manager. QA Manager is not an easy position to occupy, and never comes with enough thanks, but Katrin has succeeded at it. Thanks, Katrin!
  • Christina Harlow (@cm_harlow): Christina walks the boundary between library metadata and library software and bridges it. In her blog’s title, she gives herself the sobriquet of “metadata lackey” — but to me that seems far too modest. She’s been instrumental in the revival of Mashcat this year.
  • Kathy Lussier: Kathy has contributed both code and documentation to the Evergreen project and has served in many roles on the project, including on its oversight board and its web team. She has spearheaded various initiatives to make the Evergreen project more inclusive and is a strong advocate for universal, accessible design.

Henriette Avram [image via Wikipedia]

Although she is no longer with us, Henriette Avram, the creator of the MARC format, deserves a callout today as well: it is not every programmer who ships, and moreover, ships something that remains in use 50 years later. I am sure that Avram, were she still alive and working, would be heavily involved in libraries’ efforts to adopt Linked Open Data.

Evergreen 2.9: now with fewer zombies

While looking to see what made it into the upcoming 2.9 beta release of Evergreen, I had a suspicion that something unprecedented had happened. I ran some numbers, and it turns out I was right.

Evergreen 2.9 will feature fewer zombies.

Considering that I’m sitting in a hotel room taking a break from Sasquan, the 2015 World Science Fiction Convention, zombies may be an appropriate theme.

But to put it more mundanely, and to reveal the unprecedented bit: more files were deleted in the course of developing Evergreen 2.9 (as compared to the previous stable version) than entirely new files were added.

To reiterate: Evergreen 2.9 will ship with fewer files, even though it includes numerous improvements, including a big chunk of the cataloging section of the web staff client.

Here’s a table counting the number of new files, deleted files, and files that were renamed or moved from the last release in a stable series to the first release in the next series.

From release   To release   Entirely new files   Files deleted   Files renamed
rel_1_6_2_3    rel_2_0_0                  1159              75             145
rel_2_0_12     rel_2_1_0                   201              75             176
rel_2_1_6      rel_2_2_0                   519              61             120
rel_2_2_9      rel_2_3_0                   215             137               2
rel_2_3_12     rel_2_4_0                   125              30               8
rel_2_4_6      rel_2_5_0                   143              14               1
rel_2_5_9      rel_2_6_0                    83              31               4
rel_2_6_7      rel_2_7_0                   239              51               4
rel_2_7_7      rel_2_8_0                    84              30              15
rel_2_8_2      master                       99             277               0

The counts were made using git diff --summary --find-renames FROM..TO | awk '{print $1}' | sort | uniq -c and ignoring file mode changes. For example, to get the counts between release 2.8.2 and the master branch as of this post, I did:
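(Substituting the tag and branch names from the table into the command above:)

    git diff --summary --find-renames rel_2_8_2..master | awk '{print $1}' | sort | uniq -c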

Why am I so excited about this? It means that we’ve made significant progress in getting rid of old code that used to serve a purpose, but no longer does. Dead code may not seem so bad — it just sits there, right? — but like a zombie, it has a way of going after developers’ brains. Want to add a feature or fix a bug? Zombies in the code base can sometimes look like they’re still alive — but time spent fixing bugs in dead code is, of course, wasted. For that matter, time spent double-checking whether a section of code is a zombie or not is time wasted.

Best for the zombies to go away — and kudos to Bill Erickson, Jeff Godin, and Jason Stephenson in particular for removing the remnants of Craftsman, script-based circulation rules, and JSPac from Evergreen 2.9.

ALA Annual 2015 schedule, with bonus mod_proxy hackery

My ALA Annual this year is going to focus on five hashtags: #mashcat, #privacy, #nisoprivacy, #kohails, and #evgils.

#mashcat is for Mashcat, which is an effort to build links between library systems and library metadata folks. We’ve had some recent success with Twitter chats, and I’ve made up some badge ribbons. If you’d like one, tweet at me (@gmcharlt)!

#privacy and #nisoprivacy are for patron privacy; my particular interest is in using our technology to better protect it. I’ll be running the LITA Patron Privacy Technologies Interest Group meeting on Saturday (where I look forward to Alison Macrina’s update on Let’s Encrypt). I’ll also be participating in the face-to-face meeting on Monday and Tuesday for the NISO project to create a consensus framework for patron privacy in digital library and information systems.

#kohails and #evgils are for Koha and Evergreen, both of which I hack on and which MPOW supports – so one of the things I’ll also be doing is wearing my vendor hat while boothing and meeting.

Here’s my conference schedule so far, although I hope to squeeze in a Linked Data program as well:

In the title of the post, I promised mod_proxy hackery. Not typical for an ALA schedule post? Well, the ALA scheduler website allows you to choose to make your schedule public. If you do that, you can embed the schedule in a blog post using an iframe.

Here’s the HTML that the scheduler suggests:
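(A sketch, with a placeholder host standing in for the scheduler’s real address:)

    <!-- the src host below is a placeholder, not ALA's actual scheduler URL -->
    <iframe src="http://ala-scheduler.example.org/my-schedule"
            width="600" height="1000" frameborder="0"></iframe>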

There’s a little problem with that suggestion, though: my blog is HTTPS-only. As a consequence, an HTTP iframe won’t be rendered by the browser.

What if I change the embedded URL to use https instead? Still doesn’t work, as the SSL certificate returned is for a different hostname, which doesn’t match *cough*

Rather than do something simple, such as using copy-and-paste, I ended up configuring Apache to set up a reverse proxy. That way, my webserver can request my schedule from ALA’s webserver (as well as associated CSS), then present it to the web browser over HTTPS. Here’s the configuration I ended up with, with a bit of help from Stack Overflow:
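(A sketch, with placeholder host and paths standing in for the real ones:)

    # relay my public schedule and its stylesheets from ALA's scheduler,
    # so the browser fetches everything from this server over HTTPS
    ProxyRequests Off
    ProxyPass        /ala_schedule http://ala-scheduler.example.org/my-schedule
    ProxyPassReverse /ala_schedule http://ala-scheduler.example.org/my-schedule
    ProxyPass        /ala_css      http://ala-scheduler.example.org/css/
    ProxyPassReverse /ala_css      http://ala-scheduler.example.org/css/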

This is a bit ugly (and I’ll be disabling the reverse proxy after the conference is over)… but it works for the moment, and also demonstrates how one might make a resolutely HTTP-only service on your intranet accessible over HTTPS publicly.

Onward! I look forward to meeting friends old and new in San Francisco!

Exercises involving a MARC record for an imaginary book, inspired by a recent AUTOCAT thread

Consider the following record, inspired by the discussion on AUTOCAT that was kicked off by this query:

100 1_ ‡a Smith, June, ‡d 1977-
245 00 ‡a Regarding events in Ferguson / ‡c June Smith.
260 _1 ‡a New York : ‡b Hope Press, ‡c 2017.
300 __ ‡a 371 p. : ‡b ill. ; ‡c 10 x 27 cm
336 __ ‡a text ‡2 rdacontent
337 __ ‡a unmediated ‡2 rdamedia
338 __ ‡a volume ‡2 rdacarrier
650 _0 ‡a United States ‡x History ‡y Civil War, 1861-1865.
650 _0 ‡a Police brutality ‡z Missouri ‡z Ferguson ‡y 2014.
650 _0 ‡a Ferguson (Mo.) Riot, 2015.
650 _0 ‡a Reconstruction (U.S. history, 1865-).
650 _0 ‡a Race riots ‡z Missouri ‡z Ferguson ‡y 2014.
650 _0 ‡a Demonstrations ‡z Missouri ‡z Ferguson ‡y 2014.
653 20 ‡a #BlackLivesMatter
651 _0 ‡a Ferguson (Mo.) ‡x History ‡y 21st century.
650 _0 ‡a Political violence ‡z Missouri ‡z Ferguson ‡x History 
       ‡y 21st century.
650 _0 ‡a Social conflict ‡z Missouri ‡z Ferguson.
650 _0 ‡a Civil rights demonstrations ‡z Missouri ‡z Saint Louis County.
650 _0 ‡a Social conflict ‡z Missouri ‡z Saint Louis.
650 _0 ‡a Protest movements ‡z Missouri ‡z Ferguson.
650 _0 ‡a Protest movements ‡z Missouri ‡z Saint Louis.
650 _0 ‡a Militarization of police ‡z Missouri ‡z Ferguson.
650 _0 ‡a Militarization of police ‡z United States.
651 _0 ‡a Ferguson (Mo.) ‡z Race relations.
653 _0 ‡a 2014 Ferguson unrest
650 _0 ‡a African Americans ‡x Civil rights ‡x History.
650 _0 ‡a African Americans ‡x Crimes against ‡x History.
650 _0 ‡a Police brutality ‡z United States.
650 _0 ‡a Police ‡x Complaints against ‡z Missouri ‡z Ferguson.
650 _0 ‡a Police-community relations ‡z Missouri ‡z Ferguson.
650 _0 ‡a Discrimination in criminal justice administration ‡z Missouri
       ‡z Ferguson.
650 _0 ‡a United States ‡x Race relations ‡x History.

Some exercises for the reader:

  1. Identify the subject headings that detract from the neutrality of this record. Show your work.
  2. Identify the subject headings whose absence lessens the accuracy or neutrality of this record. Show your work.
  3. Of the headings that detract from the neutrality of this record, identify the ones that are inaccurate. Show your work.
  4. Adopt the perspective of someone born in 1841 and repeat exercises 1-3. Show your work.
  5. Adopt the perspective of someone born in 2245 and repeat exercises 1-3. Show your work.
  6. Repeat exercises 1-3 in the form of a video broadcast over the Internet.
  7. Repeat exercises 1-3 as a presentation to your local library board.

I acknowledge with gratitude the participants in the AUTOCAT thread who grappled with the question; many of them suggested subject headings used in this record.

Desiderata for the next Librarian of Congress

The current Librarian of Congress, James Billington, has announced that he will retire on 1 January 2016.  I wish him well – but I also think it’s past time for a change at LC.  Here are my thoughts on how that change should be embodied by Billington’s successor.

The next Librarian of Congress should embrace a vision of LC as the general national library of the United States and advocate for it being funded accordingly.  At present LC’s mission is expressed as:

The Library’s mission is to support the Congress in fulfilling its constitutional duties and to further the progress of knowledge and creativity for the benefit of the American people.

Of course, Congress should continue to have access to the best research resources available, and I think it important that LC qua research library remain grounded by serving that unique patron population – but LC’s mission should emphasize its services to everybody who finds themselves in the U.S.:

The Library’s mission is to further the progress of knowledge and creativity for the benefit of the American people, present and future, and to support the Congress in fulfilling its constitutional duties.

Having LC be unapologetically and completely committed to being a national library first is risky.  For one thing, it means asking for more funding in a political climate that does not encourage such requests. By removing the fallback excuse of “LC is ultimately just Congress’ research library”, it also means that LC perforce cannot evade its leadership responsibilities in the national and international library communities.

However, there are opportunities for a Library of Congress that sees its patron base as consisting of all who find themselves on U.S soil: even broader support than it enjoys now and the ability to act as a library of last resort when other institutions fail our memory.

The next Librarian of Congress should be willing and able to put LC’s technology programs back on track. This does not require that the next Librarian be a technologist. It certainly doesn’t require that they be uncritically enthusiastic about technology – but they must be informed, able to pick a good CIO, and able to see past puffery to envision where and how technology can support LC’s mission.

In particular, research and development in library and information technology is an area where the Library of Congress is uniquely able to marshal federal government resources, both to support its own collections and to provide tools that other libraries can use and build upon.

I wonder what the past 20 years or so would have been like if LC had considered technology and R&D worthy of strong leadership and investment. Would Linked Open Data – or even something better – have taken off ten years ago? Would there be more clarity in library software? What would things have been like had LC technologists been more free to experiment and take risks?

I hope that LC under Billington’s successor will give us a taste of what could have been, then surpass it.

The next Librarian of Congress should be a trained librarian or archivist. This isn’t about credentials per se – see Daniel Ransom’s piece on the “Real Librarians” of Congress – although possession of an MLS or an archivist’s certificate wouldn’t hurt.  Rather, I’d like to see candidates who are already participating in the professional discourse and who have informed opinions on library technology and libraries as community nuclei (and let’s shoot for the moon: who can speak intelligently on metadata issues!).

Of possibly more import: I hope to see candidates who embody library values, and who will help LC to resist the enclosure of the information commons.

What I would prefer not to see is the appointment of somebody whose sole professional credential is an MBA: the Library of Congress is not just another business to be run by a creature of the cult of the gormless general-purpose manager.  I think it would also be a mistake to appoint somebody who is only a scholar, no matter how distinguished: unlike the Poet Laureate, the Librarian of Congress has to see to the running of a large organization.

Finally, the next Librarian of Congress should not attain that position via the glass elevator.  There are plenty of folks who are not white men who can meet all of my desiderata – or any other reasonable set of desiderata short of walking on water – and I hope that the President will keep the demographics of the library profession (and those we serve!) in mind when making a choice.

Forth to Hood River!

Tomorrow I’m flying out to Hood River, Oregon, for the 2015 Evergreen International Conference.

I’ve learned my lesson from last year — too many presentations at one conference make Galen a dull boy — but I will be speaking a few times:

Hiding Deep in the Woods: Reader Privacy and Evergreen (Thursday at 4:45)

Protecting the privacy of our patrons and their reading and information seeking is a core library value – but one that can be achieved only through constant vigilance. We’ll discuss techniques for keeping an Evergreen system secure from leaks of patron data; policies on how much personally identifying information to keep, and for how long; and how to integrate Evergreen with other software securely.

Angling for a new Staff Interface (Friday at 2:30)

The forthcoming web-based staff interface for Evergreen uses a JavaScript framework called AngularJS. AngularJS offers a number of ways to ease putting new interfaces together quickly, such as tight integration of promises/deferred objects, extending HTML via local directives, and an integrated test framework – and can help make Evergreen UI development (even more) fun. During this presentation, which will include some hands-on exercises, Bill, Mike and Galen will give an introduction to AngularJS with a focus on how it’s used in Evergreen. By the end of the session, attendees will have gained knowledge that they can immediately apply to working on Evergreen’s web staff interface. To perform the exercises, attendees are expected to be familiar with JavaScript.

Jane in the Forest: Starting to do Linked Data with Evergreen (Saturday at 10:30)

Linked Data has been on the radar of librarians for years, but unless one is already working with RDF triple-stores and the like, it can be a little hard to see what the Linked Data future will look like for ILSs. Adapting some of the ideas of the original Jane-athon session at ALA Midwinter 2015 in Chicago, we will go through an exercise of putting together small sets of RDA metadata as RDF… then seeing how that data can be used in Evergreen. By the end, attendees will have learned a bit not just about the theory of Linked Data, but about how working with it can play out in practice.

I’m looking forward to hearing other presentations and the keynote by Joseph Janes, but more than that, I’m looking forward to having a chance to catch up with friends and colleagues in the Evergreen community.

How long does it take to change the data, part I: confidence

A few days ago, I asked the following question in the Mashcat Slack: “if you’re a library data person, what questions do you have to ask of library systems people and library programmers?”

Here is a question that Alison Hitchens asked based on that prompt:

I’m not sure it is a question, but a need for understanding what types of data manipulations etc. are easy peasy and would take under hour of developer time and what types of things are tricky — I guess an understanding of the resourcing scope of the things we are asking for, if that makes sense

That’s an excellent question – and one whose answer heavily depends on the particulars of the data change needed, the people requesting it, the people who are to implement it, and tools that are available.  I cannot offer a magic box that, when fed specifics and given a few turns of its crank, spits out a reliable time estimate.

However, I can offer up a point of view: asking somebody how long it takes to change some data is asking them to take the measure of their confidence and of their constraints.

In this post I’ll focus on the matter of confidence.  If you, a library data person, are asking me, a library systems person (or team, or department, or service provider), to change a pile of data, I may be perfectly confident in my ability to do so.  Perhaps it’s a routine record load that for whatever reason cannot be run directly by the catalogers but for which tools and procedures already exist.  In that case, answering the question of how long it would take might be easy (ignoring, for the moment, the matter of fitting the work onto the calendar).

But when asked to do something new, my confidence could start out being quite low.  Here are some of the questions I might be asking myself:

Am I confident that I’m getting the request from the right person?  Am I confident that the requester has done their homework?

Ideally, the requester has the authority to ask for the change, knows why the change is wanted, has consulted with the right data experts within the organization to verify that the request makes sense, and has ensured that all of the relevant stakeholders have signed off on the request.

If not, then it will take me time to either get the requester to line up the political ducks or to do so myself.

Am I confident that I understand the reason for the change?

If I know the reason for the change – which presumably is rooted in some expected benefit to the library’s users or staff – I may be able to suggest better approaches.  After all, sometimes the best way to do a data change is to change no data at all, and instead change displays or software configuration options.  If data does need to be changed, knowing why can make it easier for me to suss out some of the details or ask smarter questions.

If the reason for the change isn’t apparent, it will take me time to work with the requester and other experts and stakeholders until I have enough understanding of the big picture to proceed (or to be told to do it because the requester said so – but that has its own problems).

Am I confident that I understand the details of the requested change?

Computers are stupid and precise, so ultimately any process and program I write or use to effect the change has to be stupid and precise.

Humans are smart and fuzzy, so to bring a request down to the level of the computer, I have to analyze the problem until I’m confident that I’ve broken it down enough. Whatever design and development process I follow to do the analysis – waterfall, agile, or otherwise – it will take time.

Am I confident in the data that I am to change?

Is the data to be changed nice, clean and consistent?  Great! It’s easier to move a clean data set from one consistent state to another consistent state than it is to clean up a messy batch of data.

The messier the data, the more edge cases there are to consider, the more possible exceptions to worry about – the longer the data change will take.

Am I confident that I have the technical knowledge to implement the change?

Relevant technical knowledge can include knowledge of any update tools provided by the software, knowledge of programming languages that can use system APIs, knowledge of data manipulation and access languages such as SQL and XSLT, knowledge of the underlying DBMS, and so forth.

If I’m confident in my knowledge of the tools, I’ll need less time to figure out how to put them together to deal with the data change.  If not, I’ll need time to teach myself, enlist the aid of colleagues who do have the relevant knowledge, or find contractors to do the work.

Am I confident in my ability to predict any side-effects of the change?

Library data lives in complicated silos. Sometimes, a seemingly small change can have unexpected consequences.  As a very small example, Evergreen actually cares about the values of indicators in the MARC21 856 field; get them wrong, and your electronic resource URLs disappear from public catalog display.
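(For instance, a sketch of a well-formed field, assuming a stock configuration that displays links from 856 fields with first indicator 4, for HTTP, and an appropriate second indicator:)

    856 40 ‡u http://example.org/ebook ‡y Read online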

If I’m familiar with the systems that store and use the data to be changed and am confident that side-effects of the change will be minimal, great! If not, it may take me some time to investigate the possible consequences of the change.

Am I confident in my ability to back out of the change if something goes wrong?

Is the data change difficult or awkward to undo if something is amiss?  If so, it presents an operational risk, and mitigating that risk means taking more time for planning and test runs.

Am I confident that I know how often requests for similar data changes will be made in the future?

If the request is a one-off, great! If the request is the harbinger of many more like it – or looks that way – I may be better off writing a tool that I can use to make the data change repeatedly.  I may be even better off writing a tool that the requester can use.

It may take more time to write such a tool than it would to just handle the request as a one-off, in which case it will take time to decide which direction to take.

Am I confident in the organization?

Do I work for a library that can handle mistakes well?  Where if the data change turns out to be misguided, is able to roll with the punches?  Or do I work for an unhealthy organization where a mistake means months of recriminations? Or where the catalog is just one of the fronts in a war between the public and technical services departments?

Can I expect to get compensated for performing the data change successfully? Or am I effectively being treated as if I were the stupid, over-precise computer?

If the organization is unhealthy, I may need to spend more time than ought to be necessary to protect my back – or I may end up spending a lot of time not just implementing data changes, but data oscillations.

The pattern should be clear: part of the process of estimating how long it might take to effect a data change is estimating how much confidence I have about the change.  Generally speaking, higher confidence means less time would be needed to make the change – but of course, confidence is a quality that cannot be separated from the people and organizations who might work on the change.

In the extreme – but common – case, if I start from a state of very low confidence, it will take me time to reach a sufficient degree of confidence to make any time estimate at all.  This is why I like a comment that Owen Stephens made in the Slack:

Perhaps this is part of the answer to [Alison]: Q: Always ask how long it will take to investigate and get an idea of how difficult it is.

In the next post, I discuss how various constraints can affect time estimates.

Preserving the usefulness of the Hugo Awards as a selection tool for libraries

The Hugo Awards have been awarded by the World Science Fiction Convention for decades, and serve to recognize the works of authors, editors, directors – fans and professionals – in the genres of science fiction and fantasy.  The Hugos are unique in being a fan-driven award that has as much process – if not more – as juried awards.

That process has two main steps.  First, there’s a nomination period where members of Worldcon select works to appear on the final ballot. Second, members of the upcoming Worldcon vote on the final ballot and the awards are given out at the convention.

Typically, rather more folks vote on the final ballot than nominate – and that means that small, organized groups of people can unduly influence the nominations.  However, there have been surprisingly few attempts to actually do that.

Until this year.

Many of the nominations this year match the slates of two groups, the “Sad Puppies” and the “Rabid Puppies.”  Not only that, some of the categories contain nothing but Puppy nominations.

The s.f. news site File 770 has a comprehensive collection of back-and-forth about the matter, but suffice it to say that the Puppy slates have a primarily political motivation – and one, in the interests of full disclosure, that I personally despise.

There are a lot of people saying smart things about the situation, so I’ll content myself with the following observation:

Slate nominations and voting destroy the utility of the Hugo Award lists for librarians who select science fiction and fantasy.

Why? Ideally, the Hugo process ascertains the preferences of thousands of Worldcon members to arrive at a general consensus of science fiction and fantasy that is both good and generally appealing.  As it happens, that’s a pretty useful starting point for librarians trying to round out collections or find new authors that their patrons might like – particularly for those librarians who are not themselves fans of the genre.

However, should slate voting become a successful tactic, the Hugo Awards are in danger of ending up simply reflecting which factions in fandom are best able to game the system.  The results of that… are unlikely to be all that useful for librarians.

Here’s my suggestion for librarians who are fans of science fiction and fantasy and who want to help preserve a collection development tool: get involved.  In particular:

  1. Join Worldcon. A $40 supporting membership suffices to get voting privileges.
  2. Vote on the Hugos this year. I won’t tell you who to vote for, but if you agree with me that slate nominations are a problem, consider voting accordingly.
  3. Next year, participate in the nomination process. Don’t participate in nomination slates; instead, nominate those works that you think are worthy of a Hugo – full stop.