{"id":712,"date":"2013-11-29T15:21:05","date_gmt":"2013-11-29T23:21:05","guid":{"rendered":"http:\/\/galencharlton.com\/blog\/?p=712"},"modified":"2013-11-29T15:21:05","modified_gmt":"2013-11-29T23:21:05","slug":"notes-from-code4lib-bc-gluing-together-an-ugly-string-to-emit-rdf","status":"publish","type":"post","link":"https:\/\/galencharlton.com\/blog\/2013\/11\/notes-from-code4lib-bc-gluing-together-an-ugly-string-to-emit-rdf\/","title":{"rendered":"Notes from Code4Lib BC: gluing together an ugly string to emit RDF"},"content":{"rendered":"<p>This afternoon I&#8217;m sitting in the new bibliographic environment breakout session at Code4Lib BC.  After taking a look at Mark Jordan&#8217;s <a href=\"https:\/\/github.com\/mjordan\/easyLOD\">easyLOD<\/a>, I decided to play around with putting together a web service for Koha that emits RDF when fed a bib ID.  Unlike Magnus Enger&#8217;s <a href=\"https:\/\/github.com\/MagnusEnger\/semantikoha\">semantikoha<\/a> prototype, which uses a Ruby library to convert MARC to RDF, I was trying for an approach that used only Perl (plus XS).<\/p>\n<p>There were a number of building blocks available.  Putting them together turned out to be a tick more convoluted than I expected.<\/p>\n<p>The Library of Congress has published an <a href=\"http:\/\/www.loc.gov\/standards\/mods\/modsrdf\/xsl-files\/modsrdf.xsl\">XSL stylesheet<\/a> for converting MODS to RDF.  Converting MARC(XML) to MODS is readily done using <a href=\"http:\/\/www.loc.gov\/standards\/mods\/mods-conversions.html\">other stylesheets<\/a>, also published by LC.<\/p>\n<p>The path seemed clear for a quick-and-dirty prototype &#8212; copy <code>svc\/bib<\/code> to <code>opac\/svc\/bib<\/code>, take out the bits for doing updates (we&#8217;re not quite ready to make cataloging <strong>that<\/strong> collaborative!), and write a few lines to apply two XSLT transformations.<\/p>\n<p>The code was quickly written &#8212; but it didn&#8217;t work.  
<code>XML::LibXSLT<\/code>, which Koha uses to handle XSLT, complained about the <code>modsrdf.xsl<\/code> stylesheet.  Too new!  That stylesheet is written in XSLT 2.0, but <code>libxslt<\/code>, the C library that <code>XML::LibXSLT<\/code> is based on, only supports XSLT 1.0.<\/p>\n<p>As it turns out, Perl modules that can handle XSLT 2.0 are rather thin on the ground.  What I ended up doing was:<\/p>\n<p>Installing <a href=\"https:\/\/metacpan.org\/pod\/XML::Saxon::XSLT2\">XML::Saxon::XSLT2<\/a>, which required&#8230;<\/p>\n<p>Installing <a href=\"http:\/\/sourceforge.net\/projects\/saxon\/files\/Saxon-HE\/\">Saxon-HE<\/a>, a Java XML and XSLT processor that supports XSLT 2.0, which required&#8230;<\/p>\n<p>Installing <a href=\"https:\/\/metacpan.org\/pod\/release\/PATL\/Inline-Java-0.53\/Java.pod\">Inline::Java<\/a>, which required&#8230;<\/p>\n<p>Installing a JDK (I happened to choose <a href=\"http:\/\/openjdk.java.net\/\">OpenJDK<\/a>).<\/p>\n<p>After all that (and a quick tweak to the <code>modsrdf.xsl<\/code> stylesheet), I ended up with the following code that did the trick:<\/p>\n<pre class=\"lang:perl\">\r\n#!\/usr\/bin\/perl\r\n\r\n# Inline (used by Inline::Java) needs a writable directory for its build artifacts\r\nBEGIN {\r\n    $ENV{'PERL_INLINE_DIRECTORY'} = '\/tmp\/inline';\r\n}\r\n\r\nuse Modern::Perl;\r\n\r\nuse CGI;\r\nuse C4::Context;\r\nuse C4::Biblio;\r\nuse C4::Templates;\r\nuse XML::Saxon::XSLT2;\r\n\r\nmy $query = CGI->new;\r\nbinmode STDOUT, ':encoding(UTF-8)';\r\n\r\n# do initial validation: the path must be a bare numeric bib ID\r\nmy $path_info = $query->path_info();\r\n\r\nmy $biblionumber = undef;\r\nif ($path_info =~ m!^\/(\\d+)$!) 
{\r\n    $biblionumber = $1;\r\n} else {\r\n    # no usable bib ID, so report the error and stop here\r\n    print $query->header(-type => 'text\/xml', -status => '400 Bad Request');\r\n    exit 0;\r\n}\r\n\r\n# this OPAC copy is read-only, so only GET is handled\r\nif ($query->request_method eq \"GET\") {\r\n    fetch_rdf($query, $biblionumber);\r\n} else {\r\n    print $query->header(-type => 'text\/xml', -status => '405 Method Not Allowed');\r\n}\r\n\r\nexit 0;\r\n\r\n# fetch the MARC record and emit it as RDF, going from MARCXML to MODS to RDF\r\nsub fetch_rdf {\r\n    my $query = shift;\r\n    my $biblionumber = shift;\r\n    my $record = GetMarcBiblio($biblionumber);\r\n    if (defined $record) {\r\n        print $query->header(-type => 'text\/xml');\r\n        my $xml = $record->as_xml_record();\r\n        my $base = join('\/',\r\n                        C4::Context->config('opachtdocs'),\r\n                        C4::Context->preference('opacthemes'),\r\n                        C4::Templates::_current_language()\r\n                       );\r\n        $xml = transform($xml, \"$base\/xslt\/MARC21slim2MODS3-3.xsl\");\r\n        $xml = transform($xml, \"$base\/xslt\/modsrdf.xsl\");\r\n        print $xml;\r\n    } else {\r\n        print $query->header(-type => 'text\/xml', -status => '404 Not Found');\r\n    }\r\n}\r\n\r\n# apply one stylesheet to an XML string using Saxon\r\nsub transform {\r\n    my $xmlrecord = shift;\r\n    my $xslfilename = shift;\r\n\r\n    open my $fh, '<', $xslfilename\r\n        or die \"Cannot open stylesheet $xslfilename: $!\";\r\n    my $trans = XML::Saxon::XSLT2->new($fh);\r\n    return $trans->transform($xmlrecord);\r\n}\r\n<\/pre>\n<p>This works&#8230; but is not satisfying.  Making Koha require a JDK just for XSLT 2.0 support is a bit much, for one thing, and it would likely be rather slow if used in production.  
It&#8217;s a pity that there&#8217;s still no broad support for XSLT 2.0.<\/p>\n<p>A dead end, most likely, but instructive nonetheless.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This afternoon I&#8217;m sitting in the new bibliographic environment breakout session at Code4Lib BC. After taking a look at Mark Jordan&#8217;s easyLOD, I decided to&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"Notes from Code4Lib BC: gluing together an ugly string to emit RDF #c4lbc","jetpack_is_tweetstorm":false},"categories":[4,6],"tags":[48],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3gJ9y-bu","_links":{"self":[{"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/posts\/712"}],"collection":[{"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/comments?post=712"}],"version-history":[{"count":8,"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/posts\/712\/revisions"}],"predecessor-version":[{"id":720,"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/posts\/712\/revisions\/720"}],"wp:attachment":[{"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/media?parent=712"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/categories?post=712"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/galencharlton.com\/blog\/wp-json\/wp\/v2\/tags?post=712"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}