A cat who has decided to take up more space in the world.

Sixteen years is long enough, surely, to get to know a cat.

Nope.

Amelia had always been her mother’s child. She had a father and sister too, but LaZorra was the one Mellie always cuddled up to and followed around. Humans were of dubious purpose, save for our feet: from the scent we trod back home Mellie seemed to learn all she needed of the outside world.

Her father, Erasmus, left us several years ago; while Mellie’s sister mourned, I’m not sure Rasi’s absence made much of an impression on our clown princess — after all, LaZorra remained, to provide orders and guidance and a mattress.

Where Zorri went, Mellie followed — and thus a cat who had little use for humans slept on our bed anyway.

Recently, we lost both LaZorra and Sophia, and we were afraid: afraid that Amelia’s world would close in on her. We were afraid that she would become a lost cat, waiting alone for comfort that would never return.

The first couple days after LaZorra’s passing seemed to bear our fears out. Amelia kept to her routine and food, but was isolated. Then, some things became evident.

Our bed was, in fact, hers. Hers to stretch out in, space for my legs be damned.

Our feet turned out not to suffice; our hands were required too. For that matter, for the first time in her life, she started letting us brush her.

And she enjoyed it!

Then she decided that we needed correction — so she began vocalizing, loudly and often.

And now we have a cat anew: talkative and demanding of our time and attention, confident in our love.

Sixteen years is not long enough to get to know a cat.

A picture is worth a thousand words:

Downloads of Koha Debian packages in past 52 weeks

This represents the approximate geographic distribution of downloads of the Koha Debian packages over the past year. Data was taken from the Apache logs from debian.koha-community.org, which MPOW hosts. I counted only completed downloads of the koha-common package, of which there were over 25,000.
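A minimal sketch of how such a count could be made, not the script actually used for the map. It assumes the logs are in Apache's "combined" format, treats a GET answered with status 200 as a completed download, and matches a hypothetical package path pattern; the function name and sample lines are invented for illustration.

```python
import re
from collections import Counter

# Assumed Apache "combined" log layout; only the fields needed here are captured.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical pattern for the koha-common package file name.
PACKAGE = re.compile(r'/koha-common_[^/]+\.deb$')


def count_downloads(lines):
    """Return a Counter mapping client IP to its number of successful package GETs."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not look like combined-format entries
        if m.group('method') != 'GET' or m.group('status') != '200':
            continue  # assumption: only a 200 response counts as a completed download
        if PACKAGE.search(m.group('path')):
            counts[m.group('ip')] += 1
    return counts


# Invented sample lines using documentation-reserved IP addresses.
sample = [
    '192.0.2.10 - - [01/Jun/2016:10:00:00 +0000] "GET /koha/pool/main/k/koha/koha-common_3.22.7_all.deb HTTP/1.1" 200 51200000 "-" "Debian APT-HTTP/1.3"',
    '192.0.2.10 - - [02/Jun/2016:10:00:00 +0000] "GET /koha/pool/main/k/koha/koha-common_3.22.7_all.deb HTTP/1.1" 304 - "-" "Debian APT-HTTP/1.3"',
    '198.51.100.7 - - [03/Jun/2016:11:00:00 +0000] "GET /koha/dists/stable/Release HTTP/1.1" 200 1500 "-" "Debian APT-HTTP/1.3"',
    '198.51.100.7 - - [03/Jun/2016:11:00:05 +0000] "GET /koha/pool/main/k/koha/koha-common_3.22.7_all.deb HTTP/1.1" 200 51200000 "-" "Debian APT-HTTP/1.3"',
]
downloads = count_downloads(sample)
```

A stricter definition of "completed" could compare the bytes sent against the size of the package file; the status check is only the simplest approximation.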

Making the map turned out to be an opportunity for me to learn some Python. I first adapted a Python script I found on Stack Overflow to query freegeoip.net and get the latitude and longitude corresponding to each of the 9,432 distinct IP addresses that had downloaded the package.

I then fed the results to OpenHeatMap. While that service is easy to use and is written with GPL3 code, I didn’t quite like the fact that the result is delivered via an Adobe Flash embed.  Consequently, I turned my attention to Plotly, and after some work, was able to write a Python script that does the following:

  1. Fetch the CSV file containing the coordinates and number of downloads.
  2. Exclude as outliers rows where a given IP address made more than 100 downloads of the package during the past year — there were seven of these.
  3. Truncate the latitude and longitude to one decimal place — we need not pester corn farmers in Kansas for bugfixes.
  4. Submit the dataset to Plotly to generate a bubble map.

Here’s the code:

#!/usr/bin/python

# adapted from example found at https://plot.ly/python/bubble-maps/

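# note: in plotly 4 and later this module lives in the separate
# chart_studio package (import chart_studio.plotly as py)
import plotly.plotly as py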
import pandas as pd

df = pd.read_csv('http://example.org/koha-with-loc.csv')
df.head()

# scale factor for the size of the bubbles
scale = 3

# filter out rows where an IP address did more than
# one hundred downloads
df = df[df['value'] <= 100]

# reduce latitude and longitude to one decimal
# place (rounding, and keeping the columns numeric)
df['lat'] = df['lat'].round(1)
df['lon'] = df['lon'].round(1)

# sum up the 'value' column as 'total_downloads', grouping by
# the one-decimal coordinates (named aggregation needs
# pandas >= 0.25; the nested-dict form was later removed)
df_sub = df.groupby(['lat', 'lon'], as_index=False).agg(
    total_downloads=('value', 'sum')
)


coords = []
pt = dict(
    type = 'scattergeo',
    lon = df_sub['lon'],
    lat = df_sub['lat'],
    # convert the counts to strings before concatenating
    text = 'Downloads: ' + df_sub['total_downloads'].astype(str),
    marker = dict(
        size = df_sub['total_downloads'] * scale,
        color = 'rgb(91,173,63)', # Koha green
        line = dict(width=0.5, color='rgb(40,40,40)'),
        sizemode = 'area'
    ),
    name = '')
coords.append(pt)

layout = dict(
        title = 'Koha Debian package downloads',
        showlegend = True,
        geo = dict(
            scope='world',
            projection=dict( type='eckert4' ),
            showland = True,
            landcolor = 'rgb(217, 217, 217)',
            subunitwidth=1,
            countrywidth=1,
            subunitcolor="rgb(255, 255, 255)",
            countrycolor="rgb(255, 255, 255)"
        ),
    )

fig = dict( data=coords, layout=layout )
py.iplot( fig, validate=False, filename='koha-debian-downloads' )

An interactive version of the bubble map is also available on Plotly.

The tragedy of keeping house with cats is that their lives are so short in comparison to our own.

On Friday, Marlene and I put Sophie to rest; today, LaZorra. Four years ago, we lost Erasmus; before that, Scheherazade and Jennyfur. At the moment, we have just one, Amelia. It was a relief that she got a clean bill of health on Saturday… but she is nonetheless sixteen years old. The inexorability of time weighs heavily on me today.

I have no belief that there is any continuation of thought or spirit or soul after the cessation of life; the only persistence I know of for our cats is in the realm of story. And it is not enough: I am not good enough with words to capture and pin down the moment of a cat sleeping and purring on my chest or how the limbs of our little feline family would knot and jumble together.

Words are not nothing, however, so I shall share some stories about the latest to depart.


LaZorra was named after the white “Z” on her back, as if some bravo had decided to mark her before she entered this world. LaZorra was a cat of great brain, while her brother Erasmus was not. We would joke that LaZorra had claimed not only her brain cells, but those of her daughters Sophia and Amelia. (Who were also Erasmus’ children; suffice it to say that I thought I had more time to spay LaZorra than was actually the case.)

Although she was a young mother, LaZorra was a good one. Scheherazade was alive at the time and also proved to be a good auntie-cat.

Very early on, a pattern was set: Sophie would cuddle with her father Rasi; Mellie with her mother Zorrie. LaZorra would cuddle with me; as would Erasmus; per the transitive property, I ended up squished.

But really, it took only one cat to train me. For a while LaZorra had a toy that she would drag to me when she wanted me to play with her. I always did; morning, afternoon, evening, at 2 in the morning…

“NO!”

Well, that was Marlene reminding me that once I taught a cat that I could be trained to play with her at two a.m., there would be no end of it (nor any rest for us), so I did not end up being perfectly accommodating.

But I came close. LaZorra knew that she was due love and affection; that her remit included unlimited interference with keyboards and screens. And in the end, assistance when she could no longer make even the slight jump to the kitchen chair.

Sophia

When we lost Erasmus to cancer, Marlene and I were afraid that Sophie would inevitably follow. For her, Rasi was her sun, moon, and stars. We had Erasmus euthanized at home so that the others would know that, unlike the many trips for chemo, this time he was not coming back. Nonetheless, Sophie would often sit at the door, waiting for her daddy to come back home.

She never stopped doing that until we moved.

It was by brush and comb, little by little as she camped out on the back of the couch, that I showed her that humans might just possibly be good for something (though not as a substitute for her daddy-cat). It is such a little thing, but I hold it as one of my personal accomplishments that I helped her look outward again.

Eventually those little scritches on the back of the couch became her expected due: we learned that we were to pay the Sophie-toll every time we passed by her.

Both LaZorra and Sophie were full of personality—and thus, they were often the subjects of my “Dear Cat” tweets. I’ll close with a few of them.

Butter to LaZorra was as mushrooms to hobbits:

At times, she was a little too clever for her own good:

Sophie was the only cat I’ve known to like popping bubblewrap:

Sophie apparently enjoyed the taste of cables:

LaZorra was the suitcase-inspector-in-chief:

And, of course, they could be counted on to help with computation:

They both departed this world with pieces of our hearts in their claws.

There’s now a group of people taking a look at whether and how to set up some sort of ongoing fiscal entity for the annual Code4Lib conference.  Of course, one question that comes to mind is why go to the effort? What makes the annual Code4Lib conference so special?

There are a lot of narratives out there about how the Code4Lib conference and the general Code4Lib community have helped people, but for this post I want to focus on the conference itself. What does the conference do that is unique or uncommon? Is there anything that it does that would be hard to replicate under another banner? Or to put it another way, what makes Code4Lib a good bet for a potential fiscal host, or something worth going to the effort of forming a new non-profit organization?

A few things that stand out to me as distinctive practices:

  • The majority of presentations are directly voted upon by the people who plan to attend (or who are at least invested enough in Code4Lib as a concept to go to the trouble of voting).
  • Similarly, keynote speakers are nominated and voted upon by the potential attendees.
  • Each year potential attendees vote on bids by one or more local groups for the privilege of hosting the conference.
  • In principle, most any aspect of the structure of the conference is open to discussion by the broader Code4Lib community — at any time.
  • Historically, any surplus from a conference has been given to the following year’s host.
  • Any group of people wanting to go to the effort can convene a local or regional Code4Lib meetup — and need not ask permission of anybody to do so.

Some practices are not unique to Code4Lib, but are highly valued:

  • The process for proposing a presentation or a preconference is intentionally light-weight.
  • The conference is single-track; for the most part, participants are expected to spend most of each day in the same room.
  • Preconferences are inexpensive.

Of course, some aspects of Code4Lib aren’t unique. The topic area certainly isn’t; library technology is not suffering any particular lack of conferences. While I believe that Code4Lib was one of the first libtech conferences to carve out time for lightning talks, many conferences do that nowadays. Code4Lib’s dependence on volunteer labor certainly isn’t unique, although (putting aside keynote speakers) Code4Lib may be unique in having zero paid staff.

Code4Lib’s practice of requiring local hosts to bootstrap their fiscal operations from ground zero might be unique, as is the fact that its planning window does not extend much past 18 months. Of course, those are both arguably misfeatures that having fiscal continuity could alleviate.

Overall, the result has been a success by many measures. Code4Lib can reliably attract at least 400 or 500 attendees. Given the notorious registration rush each fall, it could very likely be larger. With its growth, however, come substantially higher expectations placed on the local hosts, and rather larger budgets — which circles us right back to the question of fiscal continuity.

I’ll close with a question: what have I missed? What makes Code4Lib qua annual conference special?

Update 2016-06-29: While at ALA Annual, I spoke with someone who mentioned another distinctive aspect of the conference: the local host is afforded broad latitude to run things as they see fit; while there is a set of lore about running the event and several people who have been involved in multiple conferences, there is no central group that dictates arrangements.  For example, while a couple recent conferences have employed a professional conference organizer, there’s nothing stopping a motivated group from doing all of the work on their own.

Tomorrow we will drive to Orlando, as next week I’m attending two conferences: the Perl Conference (YAPC::NA) and the American Library Association’s Annual 2016 conference.

A professional concern shared by my colleagues in software development and libraries is the difficult problem of naming. Naming things, naming concepts, naming people (or better yet, using the names they tell us to use).

Names have power; names can be misused.

In light of what happened in Orlando on 12 June, the very least we can do is to choose what names we use carefully. What did happen? That morning, a man chose to kill 49 people and injure 53 others at a gay bar called the Pulse. A gay bar that was holding a Latin Night. Most of those killed were Latinx; queer people of color, killed in a spot that for many felt like home. The dead have names.

Names are not magic spells, however. There is no one word we can utter that will undo what happened at the Pulse nor immediately construct a perfect bulwark against the tide of hate. The software and library professions may be able to help reduce hate in the long run… but I offer no platitudes today.

Sometimes what is called for is blood, or cold hard cash. If you are attending YAPC::NA or ALA Annual and want to help via some means identified by those conferences, here are options:

I will close with this: many of our LGBT colleagues will feel pain from the shooting at a level more visceral than those of us who are not LGBT — or Latinx — or people of color. Don’t be silent about the atrocity, but first, listen to them; listen to the folks in Orlando who know what specifically will help the most.

The question of what Code4Lib wants to be when it grows up seems to be perennial, and the latest iteration of the discussion is upon us. Quoting Christina Salazar:

… I really do think it’s time to reopen the question of formalizing Code4Lib IF ONLY FOR THE PURPOSES OF BEING THE FIDUCIARY AGENT for the annual conference.

I agree — we need to discuss this. The annual main conference has grown from a hundred or so in 2006 to 440 in 2016. Given the notorious rush of folks racing to register to attend each fall, it is not unreasonable to think that a conference in the right location that offered 750 seats — or even 1,000 — would still sell out. There are also over a dozen regional Code4Lib groups that have held events over the years.

With more attendees comes greater responsibilities — and greater financial commitments. Furthermore, over the years the bar has (appropriately) been raised on what is counted as the minimum responsibilities of the conference organizers. It is no longer enough to arrange to keep the bandwidth high, the latency low, and the beer flowing. A conference host that does not consider accessibility and representation is not living up to what Code4Lib qua group of thoughtful GLAM tech people should be; a host that does not take attendee safety and the code of conduct seriously is being dangerously irresponsible.

Running a conference or meetup that’s larger than what can fit in your employer’s conference room takes money, and the costs scale faster than linearly. For recent Code4Lib conferences, the budgets have been in the low to middle six figures.

That’s a lot of money, and a lot of antacids consumed until the hotel and/or convention center minimums are met. The Code4Lib community has been incredibly lucky that a number of people have voluntarily chosen to take this stress on, and that a number of institutions have chosen to act as fiscal hosts and incur the risk of large payouts if a conference were to collapse.

To disclose: I am a member of the committee that worked on the erstwhile bid to host the 2017 conference in Chattanooga. I think we made the right decision to suspend our work; circumstances are such that many attendees would be faced with the prospect of traveling to a state whose legislature is actively trying to make it more dangerous to be there.

However, the question of building or finding a long-term fiscal host for the annual Code4Lib conference must be considered separately from the fate of the 2017 Chattanooga bid. Indeed, it should have been discussed before conference hosts found themselves transferring five-figure sums to the next year’s host.

Of course, one option is to scale back and cease attempting to organize a big international conference unless some big-enough institution happens to have the itch to backstop one. There is a lot of life in the regional meetings, and, of course, many, many people who will never get funding to attend a national conference but who could attend a regional one.

But I find stepping back like that unsatisfying. Collectively, the Code4Lib community has built an annual tradition of excellent conferences. Furthermore, those conferences have gotten better (and bigger) over the years without losing one of the essences of Code4Lib: that any person who cares to share something neat about GLAM technology can have the respectful attention of their peers. In fact, the Code4Lib community has gotten better, by doing a lot of hard work, about truly meaning “any person.”

Is Code4Lib a “do-ocracy”? Loaded question, that. But this go-around, there seem to be a number of people who are interested in doing something to keep the conference going in the long run. I feel we should not let vague concerns about “too much formality” or (gasp! horrors!) “too much library organization” stop the folks who are interested from making a serious go of it.

We may find out that forming a new non-profit is too much uncompensated effort. We may find out that we can’t find a suitable umbrella organization to join. Or we may find out that we can keep the conference going on a sounder fiscal basis by doing the leg-work — and thereby free up some people’s time to hack on cool stuff without having to pop a bunch of Maalox every winter.

But there’s one argument against “formalizing” in particular that I object to. Quoting Eric Lease Morgan:

In the spirit of open source software and open access publishing, I suggest we
earnestly try to practice DIY — do it yourself — before other types of
formalization be put into place.

In the spirit of open source? OK, clearly that means that we should immediately form a non-profit foundation that can sustain nearly USD 16 million in annual expenses. Too ambitious?  Let’s settle for just about a million in annual expenses.

I’m not, of course, seriously suggesting that Code4Lib aim to form a foundation that’s remotely in the same league as the Apache Software Foundation or the Mozilla Foundation. Nor do I think Code4Lib needs to become another LITA — we’ve already got one of those (though I am proud, and privileged, to count myself a member of both).  For that matter, I do think it is possible for a project or group effort to prematurely spend too much time adopting the trappings of formal organizational structure and thus forget to actually do something.

But the sort of “DIY” (and have fun unpacking that!) mode that Morgan is suggesting is not the only viable method of “open source” organization. Sometimes open source projects get bigger. When that happens, the organizational structure always changes; it’s better if that change is done openly.

The Code4Lib community doesn’t have to grow larger; it doesn’t have to keep running a big annual conference. But if we do choose to do that — let’s do it right.

Consider the phrase “Cataloging and coding as applied empathy”.  Here are some implications of those six words:

  • Catalogers and coders share something: what we build is mainly for use by other people, not ourselves. (Yes, programmers often try to eat our own dogfood, and catalogers tend to be library users, but that’s mostly not what we’re paid for.)
  • Consideration of the needs of our users is needed to do our jobs well, and to do right by our users.
  • However: we cannot rely on our users to always tell us what to do:
    • sometimes they don’t know what it is possible to want;
    • sometimes they can’t articulate what they want in a way that lends itself to direct translation to code or taxonomy;
    • it is rarely their paid job to tell us what they want, and how to build it.
  • Waiting for users to tell us exactly what to do can be a decision… to do nothing. Sometimes doing nothing is the best thing to do; often it’s not.
  • Therefore, catalogers and coders need to develop empathy.
  • Applied empathy: our catalogs and our software in some sense embody our empathy (or lack thereof).
  • Applied empathy: empathy can be a learned skill.

Is “applied empathy” a useful framework for discussing how to serve our users? I don’t know, so I’d like to chat about it.  I will be moderating a Mashcat Twitter chat on Thursday, 12 May 2016, at 20:30 UTC (time converter). Do you have questions to suggest? Please add them to the Google doc for this week’s chat.

I offer up two tendentious lists. First, some problems in the domain of library software that are natural to work on, and in the hopeful future, solve:

  • Helping people find stuff. On the one hand, this surely comes off as simplistic; on the other hand, it is the core problem we face, and has been the core problem of library technology from the very moment that a library’s catalog grew too large to stay in the head of one librarian.  There are of course a number of interesting sub-problems under this heading:
    • Helping people produce and maintain useful metadata.
    • Usefully aggregating metadata.
    • Helping robots find stuff (presumably with the ultimate purpose of helping people to find stuff).
    • Artificial intelligence. By this I’m not suggesting that library coders should be aiming to have an ILS kick off the Singularity, but there’s plenty of room for (e.g.) natural language processing to assist in the overall task of helping people find stuff.
  • Helping people evaluate stuff. “Too much information, little knowledge, less wisdom” is one way of describing the glut of bits infesting the Information Age. Libraries can help and should help—even though pitfalls abound.
  • Helping people navigate software and information resources. This includes UX for library software, but also a lot of other software that librarians, like it or not, find themselves helping patrons use. There are some areas of software engineering where the programmer can assume that the user is expert in the task that the software assists with; library software isn’t one of them.
  • Sharing stuff. What is Evergreen if not a decade-long project in figuring out ways to better share library materials among more users? Sharing stuff is not a solved problem even for digital stuff.
  • Keeping stuff around. This is an increasingly difficult problem. Time was, you could leave a pile of books sitting around and reasonably expect that at least a few would still exist five hundred years hence. Digital stuff never rewards that sort of carelessness.
  • Protecting patron privacy. This nearly ended up in the unnatural list—a problem can be unnatural but nonetheless crucial to work on. However, since there’s no reason to expect that people will stop being nosy about what other people are reading—and for that nosiness to sometimes turn into persecution—here we are.
  • Authentication. If the library keeps any transaction information on behalf of a patron so that they can get to it later, the software had better be trying to make sure that only the correct patron can see it. Of course, one could argue that library software should never store such information in the first place (after, say, a loan is returned), but I think there can be an honest conflict with patrons’ desires to keep track of what they used in the past.

Second, some distinctly unnatural problems that library technologists all too often must work on:

  • Digital rights management. If Ambrose Bierce were alive, I would like to think that he might define DRM in a library context thus: “Something that is ineffective in its stated purpose (and cannot possibly be effective) but which serves to compromise libraries’ commitment to patron privacy in the pursuit of a misunderstanding about what will keep libraries relevant.”
  • Walled garden maintenance. Consider EZproxy. It takes the back of a very small envelope to realize that hundreds of thousands of person-hours have been expended fiddling with EZproxy configuration files for the sake of bolstering the balance sheets of Big Journal. Is this characterization unfair? Perhaps. Then consider this alternative formulation: the opportunity cost imposed by time spent maintaining or working around barriers to the free exchange of academic publications is huge—and unlike DRM for public library ebooks, there isn’t even a case (good, bad, or indifferent) to be made that the effort results in any concrete financial compensation to the academics who wrote the journal articles that are being so carefully protected.
  • Authorization. It’s one thing to authenticate a patron so that they can get at whatever information the library is storing on their behalf. It’s another thing to spend time coding authentication and authorization systems as part of maintaining the walled gardens.

The common element among the problems I’m calling unnatural? Copyright; in particular, the current copyright regime that enforces the erection of barriers to sharing, and which we can imagine, if perhaps wistfully, changing to the point where DRM and walled garden maintenance need not occupy the attention of the library programmer, who then might find more time to work on some of the natural problems.

Why is this on my mind? I would like to give a shout-out to (and blow a raspberry at) an anonymous publisher who had this to say in a recent article about Sci-Hub:

And for all the researchers at Western universities who use Sci-Hub instead, the anonymous publisher lays the blame on librarians for not making their online systems easier to use and educating their researchers. “I don’t think the issue is access—it’s the perception that access is difficult,” he says.

I know lots of library technologists who would love to have more time to make library software easier to use. Want to help, Dear Anonymous Publisher? Tell your bosses to stop building walls.

From a security alert[1] from Langara College:

Langara was recently notified of a cyber security risk with Pearson online learning which you may be using in your classes. Pearson does not encrypt user names or passwords for the services we use, which puts you at risk. Please note that they are an external vendor; therefore, this security flaw has no direct impact on Langara systems.

This has been a problem since at least 2011[2]; it is cold comfort that at least one Pearson service has a password recovery page that outright says that the user’s password will be emailed to them in clear text[3].

There have been numerous tweets, blog posts, and forum posts about this issue over the years. In at least one case[4], somebody complained to Pearson and ended up getting what reads like a canned email stating:

Pearson must strike a reasonable balance between support methods that are accessible to all users, and the risk of unauthorized access to information in our learning applications. Allowing customers to retrieve passwords via email was an industry standard for non-financial applications.

In response to the changing landscape, we are developing new user rights management protocols as part of a broader commitment to tighten security and safeguard customer accounts, information, and product access. Passwords will no longer be retrievable; customers will be able to reset passwords through secure processes.

This is a risible response for many reasons; I can only hope that they actually follow through with their plan to improve the situation in a timely fashion. Achieving the industry standard for password storage as of 1968 might be a good start[5].
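For illustration, here is a minimal sketch of what storing a salted, deliberately slow hash instead of the password itself can look like. It uses PBKDF2 from Python's standard library, which is a modern stand-in and not the 1968 scheme; the function names and the iteration count are assumptions chosen for the example, and nothing here describes how Pearson's systems work.

```python
import hashlib
import hmac
import os

# Assumed work factor for this example; a real deployment would tune it.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each account
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest from the candidate password and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected_digest)


salt, stored = hash_password('correct horse battery staple')
ok = verify_password('correct horse battery staple', salt, stored)
bad = verify_password('Tr0ub4dor&3', salt, stored)
```

Because only the salt and digest are kept, there is nothing to email back to a user in clear text; a forgotten password can only be reset, which is the behavior the canned response promises.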

In the meantime, I’m curious whether there are any libraries who are directly involved in the acquisition of Pearson services on behalf of their school or college. If so, might you have a word with your Pearson rep?

Adapted from an email I sent to the LITA Patron Privacy Interest Group’s mailing list. I encourage folks interested in library patron privacy to subscribe; you do not have to be a member of ALA to do so.

Footnotes

1. Pearson Cyber Security Risk
2. Report on Plain Text Offenders
3. Pearson account recovery page
4. Pearson On Password Security
5. Wilkes, M V. Time-sharing Computer Systems. New York: American Elsevier Pub. Co, 1968. Print. It was in this book that Roger Needham first proposed hashing passwords.