
Libraries, the Ada Initiative, and a challenge

I am a firm believer in the power of open source to help libraries build the tools we need to help our patrons and our communities.

Our tools focus our effort. Our effort, of course, does not spring out of thin air; it’s rooted in people.

One of the many currencies that motivates people to contribute to free and open source projects is acknowledgment.

Here are some of the women I’d like to acknowledge for their contributions, direct or indirect, to projects I have been part of. Some of them I know personally, others I admire from afar.

  • Henriette Avram – Love it or hate it, where would we be without the MARC format? For all that we’ve learned about new and better ways to manage metadata, Avram’s work at the LC started the profession’s proud tradition of sharing its metadata in electronic format.
  • Ruth Bavousett – Ruth has been involved in Koha for years and served as QA team member and translation manager. She is also one of the most courageous women I have the privilege of knowing.
  • Karen Coyle – Along with Diane Hillmann, I look to Karen for leadership in revamping our metadata practices.
  • Nicole Engard – Nicole has also been involved in Koha for years as documentation manager. Besides writing most of Koha’s manual, she is consistently helpful to new users.
  • Katrin Fischer – Katrin is Koha’s current QA manager, and has performed, and continues to perform, a very difficult job with grace and less thanks than she deserves.
  • Ruth Frasur – Ruth is director of the Hagerstown Jefferson Township Public Library in Indiana, which is a member of Evergreen Indiana. Ruth is one of the very few library administrators I know who not only understands open source, but actively contributes to some of the nitty-gritty work of keeping the software documented.
  • Diane Hillmann – Another leader in library metadata.
  • Kathy Lussier – As the Evergreen project coordinator at MassLNC, Kathy has helped to guide that consortium’s many development contributions to Evergreen.  As a participant in the project and a member of the Evergreen Oversight Board, Kathy has also supplied much-needed organizational help – and a fierce determination to help more women succeed in open source.
  • Liz Rea – Liz has been running Koha systems for years, writing patches, maintaining the project’s website, and injecting humor when most needed – a true jill of all trades.

However, there are unknowns that haunt me. Who has tried to contribute to Koha or Evergreen, only to be turned away by a knee-jerk “RTFM” or simply silence? Who might have been interested, only to rightly judge that they didn’t have time for the flak they’d get? Who never got a chance to go to a Code4Lib conference while her male colleague’s funding request got approved three years in a row?

What have we lost? How many lines of code, pages of documentation, hours of help have not gone into the tools that help us help our patrons?

The ideals of free and open source software projects are necessary, but they’re not sufficient to ensure equal access and participation.

The Ada Initiative can help. It was formed to support women in open technology and culture; it runs workshops, assists communities in setting up and enforcing codes of conduct, and works to ensure that women have access to positions of influence in open culture projects.

Why is the Ada Initiative’s work important to me? For many reasons, but I’ll mention three. First, because making sure that everybody who wants to work and play in the field of open technology has a real choice to do so is only fair. Second, because open source projects that are truly welcoming to women are much more likely to be welcoming to everybody – and happier, because of the effort spent on taking care of the community. Third, because I know that I don’t know everything – or all that much, really – and I need exposure to multiple points of view to be effective building tools for libraries.

Right now, folks in the library and archives communities are banding together to raise money for the Ada Initiative. I’ve donated, and I encourage others to do the same. Even better, several folks, including Bess Sadler, Andromeda Yelton, Chris Bourg, and Mark Matienzo, are providing matching donations up to a total of $5,120.

Go ahead, make a donation by clicking below, then come back. I’ll wait.

Donate to the Ada Initiative

Money talks – but whether any given open source community is welcoming, both of new people and of new ideas, depends on its current members.

Therefore, I would also like to extend a challenge to men (including myself — accountability matters!) working in open source software projects in libraries. It’s a simple challenge, summarized in a few words: “listen, look, lift up, and learn.”

Listen. Listening is hard. A coder in a library open source project has to listen to other coders, to librarians, to users – and it is all too easy to ignore or dismiss approaches that are unfamiliar. It can be very difficult to learn that something you’ve poured a lot of effort into may not work well for librarians – and it can be even harder to hear that you are stepping on somebody’s toes or thoughtlessly stomping on their ideas.

What to do? Pay attention to how you communicate while handling bugs and project correspondence. Do you prioritize bugs filed by men? Do you have a subtle tendency to think to yourself, “oh, she’s just not seeing the obvious thing right in front of her!” if a woman asks a question on the mailing list about functionality she’s having trouble with? If so, make an effort to be even-handed.

Are you receiving criticism? Count to ten, let your hackles down, and try to look at it from your critic’s point of view.

Be careful about nitpicking.  Many a good idea has died after too much bikeshedding – and while that happens to everybody, I have a gut feeling that it’s more likely to happen if the idea is proposed by a woman.

Is a woman colleague confiding in you about concerns she has with community or workplace dynamics? Listen.

Look. Look around you – around your office, the forums, the IRC channels, and Stack Exchanges you frequent. Do you mostly see men who look like yourself? If so, do what you can to broaden your perspective and your employer’s perspective. Do you have hiring authority? Do you participate in interview panels? You can help shape who surrounds you.

Remember that I’m talking about library technology here – even if 70% of the employees of the library you work for are women, if the systems department only employs men, you’re missing other points of view.

Do you have no hiring authority whatsoever? Look around the open source communities you participate in. Are there proportionally far more men participating openly than the gender ratio in librarianship as a whole?  If so, you can help change that by how you choose to participate in the community.

Lift up. This can take many forms. In some cases, you can help lift up women in library technology by getting out of the way – in other words, by removing or not supporting barriers to participation such as sexist language on the mailing list or by calling out exclusionary behavior by other men (or yourself!).

Sometimes, you can offer active assistance – but ask first! Perhaps a woman is ready to assume a project leadership role or is ready to grow into it. Encourage her – and be ready to support her publicly. Or perhaps you may have an opportunity to mentor a student – go for it, but know that mentoring is hard work.

But note — I’m not an authority on ways to support women in technology.  One of the things that the Ada Initiative does is run Ally Skills workshops that teach simple techniques for supporting women in the workplace and online.  In fact, if you’re coming to Atlanta this October for the DLF Forum, one is being offered there.

Learn. Something I’m still learning is just the sheer amount of crap that women in technology put up with. Have you ever gotten a death threat or a rape threat for something you said online about the software industry? If you’re a guy, probably not. If you’re Anita Sarkeesian or Quinn Norton, it’s a different story entirely.

If you’re thinking to yourself that “we’re librarians, not gamers, and nobody has ever gotten a death threat during a professional dispute with the possible exception of the MARC format” – that’s not good enough. Or if you think that no librarian has ever harassed another over gender – that’s simply not true. It doesn’t take a death threat to convince a woman that library technology is too hostile for her; a long string of micro-aggressions can suffice. Do you think that librarians are too progressive or simply too darn nice for harassment to be an issue? Read Ingrid Henny Abrams’ posts about the results of her survey on code of conduct violations at ALA.

This is why the Ada Initiative’s anti-harassment work is so important – and to learn more, including links to sample policies, a good starting point is their own conference policies page. (Which, by the way, was quite useful when the Evergreen Project adopted its own code of conduct.) Another good starting point is the Geek Feminism wiki.

And, of course, you could do worse than to go to one of the ally skills workshops.

If you choose to take up the challenge, make a note to come back in a year and write down what you’ve learned, what you’ve listened to and seen, and how you’ve helped to lift up others. It doesn’t have to be public – though that would be nice – but the important thing is to be mindful.

Finally, don’t just take my word for it – remember that I’m not an authority on supporting women in technology. Listen to the women who are.

Update: other #libs4ada posts

Taking the ALA Statement of Appropriate Conduct seriously

After I got back from this year’s ALA Annual Conference (held in the City of It’s Just a Dry Heat), I saw some feedback regarding B.J. Novak’s presentation at the closing session, where he reportedly marred a talk that by many accounts was quite inspiring with a tired joke alluding to a sexual (and sexist) stereotype about librarians.

Let’s suppose a non-invited speaker, panel participant, or committee member had made a similar joke. Depending on the circumstances, it may or may not have constituted “unwelcome sexual attention” per the ALA Statement of Appropriate Conduct at ALA Conferences, but, regardless, it certainly would not have been in the spirit of the statement’s request that “[s]peakers … frame discussions as openly and inclusively as possible and to be aware of how language or images may be perceived by others.” Any audience member would have been entitled to call out such behavior on the spot or raise the issue with ALA conference services.

The statement of appropriate conduct is for the benefit of all participants: “… members and other attendees, speakers, exhibitors, staff and volunteers…”. It does not explicitly exclude any group from being aware of it and governing their behavior accordingly.

Where does Novak fit in? I had the following exchange with @alaannual on Twitter:

A key aspect of many of the anti-harassment policies and codes of conduct that have been adopted by conferences and conventions recently is that the policy applies to all event participants. There is no reason to expect that an invited keynote speaker or celebrity will automatically not cross lines — and there have been several incidents where conference headliners have erred (or worse).

I am disappointed that when the Statement of Appropriate Conduct was adopted in late 2013, it apparently was not accompanied by changes to conference procedures to ensure that invited speakers would be made aware of it. There’s always room for process improvement, however, so for what it’s worth, here are my suggestions to ALA for improving the implementation of the Statement of Appropriate Conduct:

  • Update procedures to ensure that all conference speakers are made aware of the Statement.
  • Update speaker agreements for invited speakers to require that they read and abide by the Statement.
  • Make available a point of contact for all speakers who can answer questions regarding the Statement and how it applies to presentations. This should not be construed as a request that ALA review the content of presentations beforehand, just that ALA provide an individual who can help speakers interpret the Statement in case of doubt.
  • Ensure that the exhibitors’ manual and the exhibitors’ portal (exhibitors.ala.org) prominently link to the Statement. This does not appear to be the case at the moment, although this may be due to the exhibitors’ portal switching over to the upcoming Midwinter.
  • If this is not already the case, ensure that exhibitor agreements incorporate the Statement.
  • Discuss the Statement periodically and adjust it based on feedback from conference attendees and emerging best practices from other conferences. Speaking of feedback, the Magpie Librarian is conducting a survey (that closes today) to gather information about code of conduct violations at past ALA conferences.

I invite feedback, either here or directly to ALA.

Duplicate holidays, and a question

One can, in fact, have too many holidays.

Koha uses the DateTime::Set Perl module when (among other things) calculating the next day the library is open. Unfortunately, the more special holidays you have in a Koha database, the more time DateTime::Set takes to initialize itself — and the time appears to grow faster than linearly with the number of holidays.

Jonathan Druart partially addressed this with his patch for bug 11112 by implementing some lazy initialization and caching for Koha::Calendar, but that doesn’t make DateTime::Set’s constructor itself any faster.

Today I happened to be working on a Koha database that turned out to have duplicate rows in the special_holidays table. In other words, for a given library, there might be four rows all expressing that the library is closed on 15 August 2014. That database contains hundreds of duplicates, which results in an extra 1-3 seconds per circulation operation.

The duplication is not apparent in the calendar editor, alas.

So here’s my first question: has anybody else seen this in their Koha database? The following query will turn up duplicates:
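As a rough sketch of what such a query can look like (not necessarily the exact one used here), the snippet below groups holiday rows by library and date and keeps only the groups that occur more than once. It assumes that special_holidays has branchcode, year, month, day, and isexception columns; the database name in the comment is a placeholder.

```shell
# Sketch only: the column names are an assumption about the special_holidays table.
dup_query='SELECT branchcode, year, month, day, isexception, COUNT(*) AS copies
FROM special_holidays
GROUP BY branchcode, year, month, day, isexception
HAVING COUNT(*) > 1
ORDER BY copies DESC;'

# Print the query. To run it, pipe it to your database client, for example:
#   printf '%s\n' "$dup_query" | mysql name_of_your_koha_database
printf '%s\n' "$dup_query"
```

Each row that comes back is one library and date combination, with the copies column saying how many identical entries exist for it.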

And my second question: assuming that this somehow came about during normal operation of Koha (as opposed to duplicate rows getting directly loaded into the database), does anybody have any ideas how this happened?

Conservation of enthusiasm

One of the tightropes I must walk on as the current release manager for Koha is held taut by the tension between the necessity of maintaining boundaries with the code and the necessity of acknowledging that the code is not the first concern.

Boundaries matter. Not all code is equal: some is better, some is worse, none is perfect. Some code belongs in Koha. Some code belongs in Koha for lack of a better alternative at the time. Some code does not belong in Koha. Some code will stand the test of time; some code will test our time and energy for years.

The code is not primary. It is no great insight to point out that the code does not write itself; it certainly does not document itself nor pay its own way. Nor does it get to partake in that moment of fleeting joy when things just work, when the code gets out of the way of the librarian and the patron.

What is primary? People and their energy.

Enthusiasm is boundless. It has kept some folks working on Koha for years, beyond the impetus of mere paycheck or even approbation.

Enthusiasm is limited. Anybody volunteering passion for a free software project has a question to answer: is there something better to do with my time? If the answer turns into “yes”… well, there are many ways in this world to contribute to happiness, personal or shared.

Caviling can be costly — possibly, beyond measure. One “RTFM” can eliminate an entire manual’s worth of help down the road.

On the other hand, the impulse to tweak, to provide feedback, to tune a new idea, can come from the best of intentions. Passion is not enough by itself; experience matters, can guide new effort.

It’s a tightrope we all walk. But the people must come first.

My meditation: what ways of interacting among ourselves conserve enthusiasm, and thereby grow it? And how do we avoid destroying it needlessly?

Koha Dev Tidbit #2: Pinpoint the difference

This morning I reviewed and pushed the patch for Koha bug 11174. The patch, by Zeno Tajoli, removes one character each from two files.

One character? That should be easy to eyeball, right?

Not quite — the character in question was part of a parameter name in a very long URL. I don’t know about you, but it can take me a while to spot such a difference.

Here is an example. Can you spot the exact difference in less than 2 seconds?

$ git diff --color

diff --git a/INSTALL b/INSTALL
index ffe69ae..e92b1a3 100644
--- a/INSTALL
+++ b/INSTALL
@@ -94,7 +94,7 @@ Use the packaged version or install from CPAN
  sudo make upgrade
 
 Koha 3.4.x or later  no longer stores items in biblio records.
-If you are upgrading from an older version ou will need to do the
+If you are upgrading from an older version you will need to do the
 following two steps, they can take a long time (several hours) to
 complete for large databases

Now imagine doing this if the change occurs in the 100th character of a line that is 150 characters long.

Fortunately, git diff, as well as other commands like git show that display diffs, accepts several switches that let you display the differences in terms of words, not lines. These switches include --word-diff and --color-words. For example:

$ git diff --color-words

diff --git a/INSTALL b/INSTALL
index ffe69ae..e92b1a3 100644
--- a/INSTALL
+++ b/INSTALL
@@ -94,7 +94,7 @@ Use the packaged version or install from CPAN
  sudo make upgrade
 
Koha 3.4.x or later  no longer stores items in biblio records.
If you are upgrading from an older version ouyou will need to do the
following two steps, they can take a long time (several hours) to
complete for large databases

The difference is much easier to see now — at least if you’re not red-green color-blind. You can change the colors or not use colors at all:

$ git diff --word-diff

diff --git a/INSTALL b/INSTALL
index ffe69ae..e92b1a3 100644
--- a/INSTALL
+++ b/INSTALL
@@ -94,7 +94,7 @@ Use the packaged version or install from CPAN
 sudo make upgrade

Koha 3.4.x or later  no longer stores items in biblio records.
If you are upgrading from an older version [-ou-]{+you+} will need to do the
following two steps, they can take a long time (several hours) to
complete for large databases
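To try this without touching a real checkout, here is a small self-contained sketch. It builds a throwaway repository (the temporary directory, the file contents, and the demo committer identity are all made up for illustration) and reproduces the one-character change:

```shell
#!/bin/sh
# Build a throwaway repository; nothing here touches an existing checkout.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Commit the line with the typo ("ou" instead of "you").
printf 'If you are upgrading from an older version ou will need to do the\n' > INSTALL
git add INSTALL
git -c user.name=demo -c user.email=demo@example.org commit -q -m 'Add INSTALL'

# Fix the typo in the working tree, but do not commit it.
printf 'If you are upgrading from an older version you will need to do the\n' > INSTALL

# Word-level diff with plain-text markers: removed words appear as [-...-],
# added words as {+...+}.
word_diff=$(git diff --word-diff INSTALL)
printf '%s\n' "$word_diff"
```

The changed line should show up with the old and new word side by side, as in the [-ou-]{+you+} output above.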

Going back to the bug I mentioned, --word-diff wasn’t quite enough, though. By default, Git considers words to be delimited by whitespace, but the patch in question removed a character from the middle of a very long URL. To make the change pop out, I had to tell Git to highlight single-character changes. One way to do this is with the --word-diff-regex switch; another is to pass the regex directly to --color-words. Here’s the final example:

$ git diff --color-words=.

diff --git a/INSTALL b/INSTALL
index ffe69ae..e92b1a3 100644
--- a/INSTALL
+++ b/INSTALL
@@ -94,7 +94,7 @@ Use the packaged version or install from CPAN
  sudo make upgrade
 
Koha 3.4.x or later  no longer stores items in biblio records.
If you are upgrading from an older version you will need to do the
following two steps, they can take a long time (several hours) to
complete for large databases

And there we have it — the difference, pinpointed.
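If colors are not an option (in a log file, say, or for a color-blind reader), the same per-character regex can be combined with the plain-text markers. A minimal sketch, using a made-up throwaway repository and demo identity:

```shell
#!/bin/sh
# Throwaway repository with a one-character change, for illustration only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
printf 'If you are upgrading from an older version ou will need to do the\n' > INSTALL
git add INSTALL
git -c user.name=demo -c user.email=demo@example.org commit -q -m 'Add INSTALL'
printf 'If you are upgrading from an older version you will need to do the\n' > INSTALL

# --word-diff-regex=. treats every character as its own "word", so only the
# characters that actually changed are wrapped in [-...-] or {+...+}.
char_diff=$(git diff --word-diff=plain --word-diff-regex=. INSTALL)
printf '%s\n' "$char_diff"
```

Here only the inserted letter should be marked, along the lines of {+y+}ou, rather than the whole word.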

Notes from Code4Lib BC: gluing together an ugly string to emit RDF

This afternoon I’m sitting in the new bibliographic environment breakout session at Code4Lib BC. After taking a look at Mark Jordan’s easyLOD, I decided to play around with putting together a web service for Koha that emits RDF when fed a bib ID. Unlike Magnus Enger’s semantikoha prototype, which uses a Ruby library to convert MARC to RDF, I was trying for an approach that used only Perl (plus XS).

There were a lot of building blocks available. Putting them together turned out to be a tick more convoluted than I expected.

The Library of Congress has published an XSL stylesheet for converting MODS to RDF. Converting MARC(XML) to MODS is readily done using other stylesheets, also published by LC.

The path seemed clear for a quick-and-dirty prototype – copy svc/bib to opac/svc/bib, take out the bits for doing updates (we’re not quite ready to make cataloging that collaborative!), and write a few lines to apply two XSLT transformations.

The code was quickly written — but it didn’t work. XML::LibXSLT, which Koha uses to handle XSLT, complained about the modsrdf.xsl stylesheet. Too new! That stylesheet is written in XSLT 2.0, but libxslt, the C library that XML::LibXSLT is based on, only supports XSLT 1.

As it turns out, Perl modules that can handle XSLT 2.0 are rather thin on the ground. What I ended up doing was:

Installing XML::Saxon::XSLT2, which required…

Installing Saxon-HE, a Java XML and XSLT processor that supports XSLT 2.0, which required…

Installing Inline::Java, which required…

Installing a JDK (I happened to choose OpenJDK).

After all that (and a quick tweak to the modsrdf.xsl stylesheet), I ended up with the following code that did the trick:

This works… but is not satisfying. Making Koha require a JDK just for XSLT 2.0 support is a bit much, for one thing, and it would likely be rather slow if used in production. It’s a pity that there’s still no broad support for XSLT 2.0.
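For comparison, the same two-step chain can be sketched on the command line, outside of Perl. This is not the code from the prototype: it swaps in xsltproc (libxslt’s command-line tool) for the XSLT 1.0 step and Saxon-HE’s command-line interface for the XSLT 2.0 step. The stylesheet file names, the jar path, and the function name are assumptions for illustration.

```shell
# Hypothetical helper: MARCXML file in, RDF/XML file out.
# Assumed local files: MARC21slim2MODS3-4.xsl, modsrdf.xsl, and a Saxon-HE jar.
SAXON_JAR=${SAXON_JAR:-saxon9he.jar}

marcxml_to_rdf() {
    marcxml=$1
    rdf_out=$2
    mods_tmp=$(mktemp) || return 1

    # Step 1: MARCXML -> MODS with an XSLT 1.0 stylesheet (libxslt is enough).
    xsltproc -o "$mods_tmp" MARC21slim2MODS3-4.xsl "$marcxml" || {
        rm -f "$mods_tmp"
        return 1
    }

    # Step 2: MODS -> RDF needs an XSLT 2.0 processor, hence Saxon on a JVM.
    java -jar "$SAXON_JAR" -s:"$mods_tmp" -xsl:modsrdf.xsl -o:"$rdf_out"
    status=$?
    rm -f "$mods_tmp"
    return $status
}

# Example call (file names are placeholders):
#   marcxml_to_rdf record.marcxml record.rdf
```

The JVM dependency is just as visible here as in the Perl version, which is the heart of the problem.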

A dead end, most likely, but instructive nonetheless.

Notes from Code4Lib BC: Accessibility

Peace Arch. Photo by Daniel Means. Licensed under CC-BY-SA and available at http://www.flickr.com/photos/supa_pedro/389603266.


There is nothing quite like the sense of sheer glee you get when you’re waiting at the border… and have been waiting at the border for a while… and then a new customs inspection lane is opened up. Zoom!

Marlene and I left Seattle this morning to go to the Code4Lib BC conference in Vancouver. Leaving in the morning meant that we missed the lightning talks, and arrived after the breakout sessions had started. Fortunately, folks were quick to welcome us, and I soon fell into the accessibility session.

Accessibility has been on my mind lately, but it’s an area where I’m starting mostly from scratch. I knew that designing accessible systems is a Good Idea, I knew about the existence of some of the jargon and standards, and I knew that I didn’t know much else – certainly none of the specifics.

Cynthia Ng very kindly shared some pointers with me. For example, it is helpful to know that the Section 508 guidelines are essentially a subset of WCAG 1.0. This is exactly the sort of shortcut (through an apparently intimidating forest) that an expert can effortlessly give to a newbie – and having opportunities to learn from the experts is one of the reasons why I like going to conferences.

The accessibility breakout session charged itself with putting together a list of resources and best practices for accessibility and universal design. As I mentioned above, we arrived in the middle of the breakout session time, but a couple hours was more than enough time to get initial exposure to a lot of ideas and resources. It was exhilarating.

In no particular order, here is a list of various things that I’ll be following up on:

  • The Accessibility Project
  • Guerilla testing
  • The 5 second test
  • Swim lane diagrams
  • The Paciello Group Blog
  • Be careful about putting things in the right sidebar of a three-column layout — a lot of users have been trained by web advertising to completely ignore that region.  Similarly, a graphic with moving parts can get ignored if it looks too much like an ad.
  • The Code4Lib BC accessibility group’s notes
  • Having consistency of branding and look and feel can improve usability — but that can be a challenge when integrating a lot of separate systems (particularly if a library and a vendor have different ideas about whose branding should be foremost).
  • Integrating one’s content strategy with one’s accessibility strategy.  To paraphrase a point that Cynthia made a few times, putting out too much text is a problem for any user.
  • As with so much of software design, iterate early and often. The time to start thinking about accessibility is when you’re 20% of the way through a project, not when you’re 80% done.
  • Standards can help, but only up to a point.  A website could pass an automated WCAG compliance test with flying colors but not actually be usable by anyone.

And there’s another day of conference yet!  I’m quite happy we made the drive up.

Here’s a general question to the world: what reading material do you recommend for folks like me who want to learn more about writing accessible web software?

Koha Dev Tidbit #1: control of space

Space doesn’t matter, except when it does.

The other day Koha bug 11308 was filed reporting a problem with the public catalog search RSS feed. This affected just the new Bootstrap theme.

The bug report noted that when clicking on the RSS feed icon, the page rendered “not like an rss feed should”. That means different things to different web browsers, but we can use an RSS feed validation service like validator.w3.org to see what feed parsers are likely to think.

Before the bug was fixed, the W3C validator reported this:

This feed does not validate.
line 2, column 0: XML parsing error: :2:0: XML or text declaration not at start of entity [help]
<?xml version='1.0' encoding='utf-8' ?>

Of course, at first glance, the XML declaration looks just fine — the key bit is that it is starting at the second line of the response.

Space matters — XML requires that if an XML declaration is present, it must be the very first thing in the document.
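A quick way to check for this from the command line is to look at the very first characters of the response. The helper below is a hypothetical sketch (the function name is made up, and it is far cruder than a real feed validator): it succeeds only when its input begins with <?xml.

```shell
# Succeeds (exit status 0) only if standard input starts with "<?xml".
starts_with_xml_decl() {
    first_line=
    # read fails at end-of-file without a newline, but still fills the variable.
    IFS= read -r first_line || [ -n "$first_line" ] || return 1
    case $first_line in
        '<?xml'*) return 0 ;;
        *)        return 1 ;;
    esac
}

# A response with a stray blank line before the declaration, as in the bug:
if printf '\n<?xml version="1.0" encoding="utf-8" ?>\n<rss/>\n' | starts_with_xml_decl; then
    echo "declaration is at the start"
else
    echo "declaration is not at the start"
fi
```

Against a live catalog, the input could come from something like curl -s piped into the function, with whatever feed URL your OPAC exposes.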

Let’s take a look at the patch, written by Chris Cormack, that fixes the bug:

This patch moves the [% USE Koha %] Template Toolkit directive from before the XML declaration to after it. [% USE Koha %] loads a custom Template Toolkit module called “Koha”; further down in the template there is a use of Koha.Preference() to check the value of a system preference.

But why should importing a TT module add a blank line? By default, Template Toolkit will include all of the whitespace present in the template. Since there is a newline after the [% USE Koha %] directive, that newline is included in the response.

Awkward, when spaces matter.

However, Template Toolkit does have a way to chomp whitespace before or after template directives.

This means that an alternative fix could be something like this:

Adding a single hyphen just before the closing %] of the directive, as in [% USE Koha -%], tells Template Toolkit to chomp the whitespace after the directive – in other words, not to include it in the response.

Most of the time, extra whitespace doesn’t matter for the HTML emitted by Koha. But when space matters… you can use TT to control it.

How to participate in the KohaCon 2013 hackfest from afar

The next few days will be pretty intense for me, as I’ll be joining friends old and new for the hackfest of the 2013 Koha Conference. Hackfest will be an opportunity for folks to learn things, including how to work with Koha’s code, how and why librarians do the things they do — and how and why developers do the things they do. Stuff will be broken, stuff will be built up again, new features will be added, bugs will be fixed, and along the way, I will be cutting another alpha release of Koha 3.14.

Unfortunately, not everybody will be able to be sitting inside the conference room in Reno for the next three days. How can one participate from afar? Lots of ways:

  • Read the koha-devel mailing list and join the conversation. I will, at minimum, post a summary to koha-devel each day.
  • Follow the #kohacon13 hashtag on Twitter. Tweet to us using that hashtag if you have a question or request.
  • Look for blog posts from hackfest.
  • Join the #koha IRC channel.
  • Keep an eye on changes on the Koha wiki, particularly the roundtable notes and hackfest wishlist pages. If you’ve got additions, corrections, or clarifications to offer, please feel free to let us know or to edit the wiki pages directly.
  • Watch the Koha dashboard for patches to test and to see the progress made during hackfest.
  • Test and sign off on patches. BibLibre’s sandboxes make that super-duper simple.

Hackfest isn’t just for folks who know their way around the code — if you know about library practice, or have time to test things, or can write up documentation, you can help too!

We may also try setting up a Google Hangout. Because a Hangout has a limit on the number of simultaneous users, if you’re interested in joining one, please let me know. If you have suggestions for other ways that folks can participate remotely, please let us know that as well.

Happy hacking!

Introducing the next new MARC editor, vim

Sometimes an idea that’s been staring you in the face has to jump up and down and wave its hands to get attention.

I was working with Katrin Fischer, Koha’s QA manager, who had just finished putting together a fresh Koha testing environment on her laptop so that she can do patch review during KohaCon’s hackfest. She mentioned wishing that something like MarcEdit were on her laptop so that she could quickly edit some records for testing. While MarcEdit could be run under WINE or Mono or in a Windows virtual machine, inspiration struck me: with a little help, vim makes a perfectly good basic MARC editor.

Here’s how — if you start with a file of MARC records, you can convert them to a text file using yaz-marcdump:

The resulting text file will look something like this:

To edit the records on the command line, you can use vim (or whatever your favorite text editor is). When you’re done, to convert them back to MARC, use

To avoid mangling special characters, it’s helpful to use UTF-8 as the character encoding. yaz-marcdump can also be used to convert a MARC file to UTF-8. For example, if the original MARC file uses the MARC-8 encoding, you could do:
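Since the individual commands are easy to forget, here is a sketch of the whole round trip as three small shell functions. It assumes yaz-marcdump from the YAZ toolkit is installed; the function names and file names are placeholders, and the -l 9=97 leader tweak (which marks the converted records as Unicode in leader position 9) is an optional extra rather than something the conversion strictly requires.

```shell
# Assumption: yaz-marcdump is on the PATH (override with YAZ_MARCDUMP if not).
YAZ_MARCDUMP=${YAZ_MARCDUMP:-yaz-marcdump}

# Binary MARC (ISO 2709) -> editable line-mode text.
marc_to_text() {
    "$YAZ_MARCDUMP" -i marc -o line "$1" > "$2"
}

# Line-mode text -> binary MARC.
text_to_marc() {
    "$YAZ_MARCDUMP" -i line -o marc "$1" > "$2"
}

# MARC-8 -> UTF-8, also setting leader position 9 to "a" (decimal 97).
marc8_to_utf8() {
    "$YAZ_MARCDUMP" -f MARC-8 -t UTF-8 -l 9=97 -o marc "$1" > "$2"
}

# Typical session (file names are placeholders):
#   marc8_to_utf8 original.mrc utf8.mrc
#   marc_to_text  utf8.mrc records.txt
#   vim records.txt
#   text_to_marc  records.txt edited.mrc
```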

Not particularly profound, perhaps — and the title of this post is a bit tongue-in-cheek — but I know that this technique will save me a bit of time.