Linking Libraries to the Web: Linked Data and the Future of the Bibliographic Record

Brighid M. Gonzales

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2014

ABSTRACT

The ideas behind Linked Data and the Semantic Web have recently gained ground and shown the potential to redefine the world of the web. Linked Data could conceivably create a huge database out of the Internet linked by relationships understandable by both humans and machines. The benefits of Linked Data to libraries and their users are potentially great, but so are the many challenges to its implementation. The BIBFRAME Initiative provides the possible framework that will link library resources with the web, bringing them out of their information silos and making them accessible to all users.

INTRODUCTION

For many years now the MARC (MAchine-Readable Cataloging) format has been the focus of rampant criticisms across library-related literature, and though an increasing number of diverse metadata formats for libraries, archives, and museums have been developed, no framework has shown the potential to be a viable replacement for the long-established and widely used bibliographic format. Over the past decade, web technologies have been advancing at a progressively rapid pace, outpacing MARC’s ability to keep up with the potential these technologies can offer to libraries. Standing by the MARC format leaves libraries in danger of not being adequately prepared to meet the needs of modern users in the information environments they currently frequent (increasingly, search engines such as Google).

New technological developments such as the ideas behind Linked Data and the Semantic Web have the potential to bring a host of benefits to libraries and other cultural institutions by allowing libraries and their carefully cultivated resources to connect with users on the web. Though there remains a host of obstacles to their implementation, Linked Data has much to offer libraries if they can find ways to leverage this technology for their own uses. Libraries are slowly finding ways to take advantage of the opportunities Linked Data presents, including initiatives such as the Bibliographic Framework Initiative, known as BIBFRAME, which may have the potential to be the bibliographic replacement for MARC that the information community has long needed. Such a change may help libraries not only to stay current with the modern information world and stay relevant in the minds of users, but also reciprocally create a richer world of data available to information seekers on the web.

Brighid Gonzales (brighidmgonzales@gmail.com), a recent MLIS recipient from the School of Library and Information Science, San Jose State University, is winner of the 2014 LITA/Ex Libris Student Writing Award.

The Limitations of MARC

Much has been written over the years about the issues and shortcomings of the MARC format. Nonetheless, MARC formatting has been widely used by libraries around the world since the 1960s, when it was first created. This long-established and ubiquitous usage has resulted in countless legacy bibliographic records that currently exist in the MARC format.
To  lose  this   carefully  crafted  data  or  to  expend  the  finances,  time,  and  manual  effort  required  to  convert  all  of   this  legacy  data  into  a  new  format  may  be  a  cause  for  reservation  in  the  community.   But  the  fact  remains  that  in  spite  of  its  widespread  use,  there  are  many  issues  with  the  MARC   format  that  make  it  a  candidate  for  replacement  in  the  world  of  bibliographic  data.   Andresen  describes  several  different  versions  of  MARC  that  have  largely  been  wrapped  together  in   the  community’s  mind,  reminding  us  that  “although  MARC21  is  often  described  as  an  international   standard,  it  is  only  used  in  a  limited  number  of  countries.”1  In  actuality,  what  we  often  refer  to   simply  as  MARC  could  be  MARC21,  UKMARC,  UNIMARC  or  even  danMARC2.2  This  lack  of  a  unified   standard  has  long  been  an  issue  with  this  particular  format.   Then  there  is  MARC’s  notorious  inflexibility.  Originally  created  for  the  description  of  printed   materials,  MARC’s  rigidly  defined  standards  can  make  it  unsuited  for  the  description  of  digital,   visual,  or  multimedia  resources.  Andresen  writes  that  “the  lack  of  flexibility  means  that  local   additions  might  hinder  exchange  between  local  systems  and  union  catalogue  systems.”3  Tennant   has  also  expressed  frustration  with  MARC’s  inflexibility,  particularly  its  inability  to  express   hierarchical  relationships.  
Tennant  posits  that  where  the  MARC  format  is  “flat,”  expressing   relationships  involving  hierarchy,  such  as  in  a  table  of  contents,  “would  be  a  breeze  in  XML,”  which   is  the  format  he  recommends  moving  toward  for  its  greater  extensibility.4  MARC’s  rigidity  may  also   be  a  reason  why  the  format  is  not  generally  used  outside  of  the  library  environment;  thus   information  contained  in  MARC  format  cannot  be  exchanged  with  information  from  nonlibrary   environments.5   Inconsistencies,  errors,  and  localized  practices  are  also  issues  frequently  cited  in  detailing  MARC’s   inherent  shortcomings.  With  shared  cataloging,  inconsistencies  may  be  less  common,  but  there   remains  the  fact  that  with  any  number  of  individual  catalogers  creating  records,  the  potential  for   error  is  still  great.  And  any  localized  changes  can  also  create  inconsistency  in  records  from  library   to  library.  Tennant  gives  as  an  example  recording  the  editor  of  a  book,  which  “should  be  encoded  in   a  700  field,  with  a  $e  subfield  that  specifies  the  person  is  the  editor.  But  the  $e  subfield  is   frequently  not  encoded,  thus  leaving  one  to  guess  the  role  of  the  person  encoded  in  the  700  field.”6   When  it  comes  to  issues  with  MARC  in  the  modern  computing  environment,  however,  one  of  the   biggest  and  seemingly  insurmountable  problems  is  its  inability  to  express  the  relationships   between  entities.  Andresen  points  out  that  it  is  “difficult  to  handle  relations  between  data  that  are   described  in  different  fields,”7  while  Tennant  writes  that  “relationships  among  related  titles  are   problematic.”8  Alemu  et  al.  
also write of MARC’s “document-centric” structure, which prevents it from recognizing relationships between entities that might be possible in a more “actionable data-centric format.”9

Though Tennant advocates the embrace of XML-based formats as a way to transition from MARC, Breeding writes that even MARCXML “cannot fully make intelligible the quirky MARC coding in terms of semantic relationships.”10 Alemu et al. also note that MARC may continue to be widely used mainly because alternatives, including XML, have not yet been found to be an adequate replacement.11

It is clear that if libraries and their carefully crafted bibliographic records are to remain relevant and viable in today’s modern computing world, a more modern metadata format that addresses these issues will be required. Clearly needed is a more flexible and extensible format that allows for the expression of relationships between points of data and the ability to link that data to other related information outside of the presently insular library catalog.

Linked Data and the Semantic Web

Linked Data works as the framework behind the Semantic Web, an idea by World Wide Web inventor Tim Berners-Lee, which would turn the Internet into something closer to one large database rather than simply a disparate collection of documents. Since the Internet is often the first place users turn to for information, libraries should take advantage of the concepts behind Linked Data to both put their resources out on the web, where they can be found by users, and in turn bring those users back to the library through the lure of authoritative, high-quality resources.

In the world of Linked Data, the relationships between data, not just the documents in which they are contained, are made explicit and readable by both humans and machines. With the ability to “understand” and interpret these semantically explicit connections, computers will have the power to lead users to a web of related data based on a single information search. Underpinning the Semantic Web are the web-specific standards XML and RDF (Resource Description Framework). These work as universal languages for semantically labeling data in such a way that both a person and a computer can interpret their meaning and then distinguish the relationships between the various data sources.

These relationships are expressed using RDF, “a flexible standard proposed by the W3C to characterize semantically both resources and the relationships which hold between them.”12 Baker notes that RDF supports “the process of connecting dots—of creating ‘knowledge’—by providing a linguistic basis for expressing and linking data.”13 RDF is organized into triples, expressing meaning as subject, verb, and object and detailing the relationships between them. An example is “The Catcher in the Rye is written by J. D. Salinger,” where The Catcher in the Rye acts as the subject, J. D. Salinger is the object, and the “verb” (the predicate, in RDF terms) is written by, which expresses the semantic relationship between the two, naming J. D. Salinger as the author of The Catcher in the Rye. By using this framework, computers can link to other RDF-encoded data, leading users to other works written by J. D.
Salinger, other adaptations of The Catcher in the Rye, and other related data sources from around the web.

RDF gives machines the ability to “understand” the semantic meaning of things on the web and the nature of the relationships between them. In this way it can make connections for people, leading them to related information they may not have otherwise found. The use of XML allows developers to create their own tags, adding an explicit semantic structure to their documents that they can exploit using RDF.

The Semantic Web is based on four rules explicated by Berners-Lee, commonly known as the Linked Data principles:

1. Use URIs (uniform resource identifiers) as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs so that they can discover more things.14

URIs act as a permanent signpost for things, both on and off the web. Using consistent URIs allows data to be linked between and back to certain places on the web without the worry of broken or dead links. RDF triples map the relationships between each thing, which can then be linked to more things, opening up a wide world of interrelated data for users.

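The triple structure and the role of URIs described above can be made concrete with a small sketch. It is only an illustration in plain Python, not an RDF implementation: the `http://example.org/` URIs, the `isWrittenBy` predicate, and the helper function names are all assumptions invented for this example, and a real system would use an RDF library and published vocabularies instead.

```python
# Hypothetical HTTP URIs used as names for things (assumed for illustration only).
EX = "http://example.org/"
CATCHER = EX + "work/catcher-in-the-rye"
FRANNY = EX + "work/franny-and-zooey"
SALINGER = EX + "person/jd-salinger"
WRITTEN_BY = EX + "vocab/isWrittenBy"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (CATCHER, WRITTEN_BY, SALINGER),
    (FRANNY, WRITTEN_BY, SALINGER),
]


def objects(graph, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def subjects(graph, predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return [s for s, p, o in graph if p == predicate and o == obj]


def other_works_by_same_author(graph, work):
    """Follow work -> author -> works to find related works."""
    related = []
    for author in objects(graph, work, WRITTEN_BY):
        for other in subjects(graph, WRITTEN_BY, author):
            if other != work and other not in related:
                related.append(other)
    return related


print(other_works_by_same_author(triples, CATCHER))
```

Because both works point at the same author URI, a machine can move from one book to the other without any record explicitly stating that the two are related; this is the kind of link-following the fourth rule is meant to enable.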
The concept behind Linked Data would allow for the integration of library data and data from other resources, whether from “scientific research, government data, commercial information, or even data that has been crowd-sourced.”15 However, to create an open web of data facilitated by Linked Data theories, open standards such as RDF must be used, making data interoperable with resources from various communities. This interoperability is key to being able to mix library resources with those from other parts of the web.

Interoperability helps to make “data accessible and available, so that they can be processed by machines to allow their integration and their reuse in different applications.”16 In this way, machines would be able to understand the relationships and connections between data contained within documents and thus lead users to related data they may not have otherwise found. Using Linked Data would bring carefully crafted and curated library data out of the information silos in which they have long been enclosed and connect them with the rest of the web where users can more easily find them.

Benefits for Libraries

Libraries and their users have much to gain from participation in the Linked Data movement. In an age when Google is often the first place users turn when searching for information, freeing library data from their insulated databases and getting them out onto the web where the users are can help make library resources both relevant and available for users who may not make the library the first place they look for information.
This  can  lead  not  only  to  increased  use  by  library  patrons   and  nonpatrons  (who  would  now  be  potential  library  patrons)  alike,  but  also  to  increased  visibility   for  the  library.  Creating  and  using  Linked  Data  technologies  also  opens  the  door  for  libraries  to   share  metadata  and  other  information  in  a  way  previously  limited  by  MARC.  Libraries  also  have   the  potential  to  add  to  the  richness  of  data  that  is  available  on  the  web,  creating  a  reciprocal   benefit  with  the  Semantic  Web  itself.   Coyle  writes  that  “every  minute  an  untold  number  of  new  resources  is  added  to  our  digital  culture,   and  none  of  these  is  under  the  bibliographic  control  of  the  library.”17  Indeed,  the  World  Wide  Web   is  a  participatory  environment  where  anyone  can  create,  edit  or  manipulate  information  resources.   Libraries  still  consider  themselves  the  province  of  quality,  reliable  information,  but  users  don’t   necessarily  go  to  libraries  when  searching  and  don’t  necessarily  have  the  Internet  acumen  to   distinguish  between  authoritative  information  and  questionable  resources.  Coyle  also  notes  that   “the  push  to  move  libraries  in  the  direction  of  linked  data  is  not  just  a  desire  to  modernize  the   library  catalog;  it  represents  the  necessity  to  transform  the  library  catalog  from  a  separate,  closed   database  to  an  integration  with  the  technology  that  people  use  for  research.”18  Using  Linked  Data,   libraries  can  still  create  the  rich,  reliable,  authoritative  data  they  are  known  for  while  also  making   it  available  on  the  web,  where  potentially  anyone  can  find  it.   Much  has  been  written  about  libraries’  information  silos,  and  many  researchers  are  finding  in   Linked  Data  the  possibility  to  free  this  information.  
For  the  information  contained  in  the  library   catalog  to  be  significantly  more  usable  it  “must  be  integrated  into  the  web,  queryable  from  it,  able   to  speak  and  to  understand  the  language  of  the  web.”19  Alemu  et  al.  write  that  linking  library  data   to  the  web  “would  allow  users  to  navigate  seamlessly  between  disparate  library  databases  and   external  information  providers  such  as  other  libraries,  and  search  engines.”20  Users  are  likely  to   find  the  world  of  Linked  Data  immeasurably  more  useful  than  individually  searching  library   databases  one-­‐by-­‐one  or  relying  on  Google  search  results  for  the  information  they  need.   Linked  Data  also  allows  for  the  possibility  of  serendipity  in  information  searching,  of  finding   information  one  didn't  even  know  they  were  looking  for,  something  akin  to  browsing  the  library   shelves.21  Linked  Data  “allows  for  the  richer  contextualization  of  sources  by  making  connections   not  only  within  collections  but  also  to  relevant  outside  sources.”22  Tillett  adds  that  Linked  Data   would  allow  for  “mashups  and  pathways  to  related  information  that  may  be  of  interest  to  the  Web   searcher—either  through  showing  them  added  facets  they  may  wish  to  consider  to  refine  their   search  or  suggesting  new  directions  or  related  resources  they  may  also  like  to  see.”23   The  use  of  Linked  Data  is  not  just  beneficial  to  users  though.  Libraries  are  also  likely  to  see   increased  benefits  in  the  sharing  of  metadata  and  other  resources.  Alemu  et  al.  
write that “making library metadata available for re-use would eliminate unnecessary duplication of data that is already available elsewhere, through reliable sources.”24 Tillett also writes about the reduced cost to libraries for storage and data in a linked data environment where “libraries do not need to replicate the same data over and over, but instead share it mutually with each other and with others using the Web,” reducing costs and expanding information accessibility.25 Byrne and Goddard also note that “having a common format for all data would be a huge boon for interoperability and the integration of all kinds of systems.”26

In addition to the reduced cost of shared resources, something with which libraries are already very familiar, the linking of data from libraries to one another and to the web would also allow for an increased richness in overall data. From metadata that may need to be changed or updated periodically to user-generated metadata that is more likely to include current, up-to-date terminology, the “mixed metadata” approach allowed by Linked Data would be “better situated to provide a richer and more complete” description of various resources that could more accurately provide for the variety of interpretation and terminology possible in their description.27

A New Bibliographic Framework

One of the most important ways libraries are moving toward the world of Linked Data is with the Bibliographic Framework Initiative, known as BIBFRAME, which was announced by the Library of Congress in 2011.
Since  then,  though  BIBFRAME  is  still  in  development,  rapid  progress  has  been   made  that  suggests  that  BIBFRAME  may  be  the  long-­‐awaited  replacement  for  the  MARC  format   that  could  free  library  bibliographic  information  from  its  information  silos  and  allow  it  to  be   integrated  with  the  wider  web  of  data.   The  BIBFRAME  model  comprises  four  classes:  Creative  Work,  Instance,  Authority,  and  Annotation.   In  this  model,  Creative  Work  represents  the  “conceptual  essence”  of  the  item.  Instance  is  the   “material  embodiment”  of  the  Creative  Work.  Authority  is  a  resource  that  defines  relationships   reflected  by  the  Creative  Work  and  Instance,  such  as  People,  Places,  Topics,  and  Organizations.   Annotation  relates  the  Creative  Work  with  other  information  resources,  which  could  be  library   holdings  information,  cover  art,  or  reviews.28  These  are  similar  in  a  way  to  the  FRBR  (Functional   Requirements  for  Bibliographic  Records)  model,  which  uses  Work,  Expression,  Manifestation,  and   Item.29  Indeed,  BIBFRAME  is  built  with  RDA  (Resource  Description  and  Access)  as  an  important   source  for  content,  which  was  in  turn  built  around  the  principles  in  FRBR.  
Despite this, BIBFRAME “aims to be independent of any particular set of cataloging rules.”30

Realizing the vast amount of information that is still recorded in MARC format, the BIBFRAME initiative is also working on a variety of tools that will help to transform legacy MARC records into BIBFRAME resources.31 These tools will be essential as “the conversion of MARC records to useable Linked Data is a complicated process.”32 Where MARC allowed for libraries to share bibliographic records without each having to constantly reinvent the wheel, BIBFRAME will allow library metadata to be “shared and reused without being transported and replicated.”33

BIBFRAME would support the Linked Data model while also incorporating emerging content standards such as FRBR and RDA.34 The BIBFRAME initiative is committed to compatibility with existing MARC records but would eventually replace MARC as a bibliographic framework “agnostic to cataloging rules”35 rather than intertwined with them as MARC was with AACR2. Also unlike MARC, which is rigidly structured and not amenable to incorporation with web standards, BIBFRAME would enable library metadata to be found on the web, freeing it from the information silos that have contained it for decades. Whereas MARC is not very web-compatible, “BIBFRAME is built on XML and RDF, both ‘native’ schemas for the internet.
The  web-­‐friendly  nature  of  these   schemas  allows  for  the  widest  possible  indexing  and  exposure  for  the  resources  held  in   libraries.”36   Backed  by  the  Library  of  Congress,  BIBFRAME  already  has  a  great  deal  of  support  throughout  the   information  community,  though  it  is  not  yet  at  the  stage  of  implementation  for  most  libraries.   However,  half  a  dozen  libraries  and  other  institutions  are  acting  as  “Early  Experimenters”  working   to  implement  and  experiment  with  BIBFRAME  to  assist  in  the  development  process  and  get  the   framework  library  ready.  Participating  institutions  include  the  British  Library,  George  Washington   University,  Princeton  University,  Deutsche  National  Bibliothek,  National  Library  of  Medicine,  OCLC,   and  the  Library  of  Congress.37  Though  not  yet  fully  realized,  BIBFRAME  seems  to  offer  a   substantial  step  toward  the  implementation  of  Linked  Data  to  connect  library  bibliographic   materials  with  other  resources  on  the  web.   The  Challenges  Ahead   The  road  to  widespread  use  of  the  Semantic  Web,  Linked  Data,  and  even  possible  implementations   such  as  BIBFRAME  is  not  without  obstacles.  For  one,  knowledge  and  awareness  is  a  major  concern,   as  well  as  the  intimidating  thought  of  transitioning  away  from  MARC,  a  standard  that  has  been  in   widespread  use  for  as  long  as  many  of  the  professionals  using  it  have  been  alive.  There  is  also  the   challenge  and  significant  resources  required  for  converting  huge  stores  of  legacy  data  from  MARC   format  to  a  new  standard.  
In  addition,  Linked  Data  has  its  own  set  of  specific  concerns,  such  as   legality  and  copyright  issues  involved  in  the  sharing  of  information  resources,  as  well  as  the   willingness  of  institutions  to  share  metadata  that  they  may  have  invested  a  great  deal  of  time  and   money  in  creating.   Many  organizations  may  be  hesitant  to  make  the  move  toward  Linked  Data  without  a  clear  sign  of   success  from  other  institutions.  Chudnov  writes  that  “a  new  era  of  information  access  where   library-­‐provided  resources  and  services  rose  swiftly  to  the  top  of  ambient  search  engines’  results   and  stayed  there”  is  what  may  be  necessary,  as  well  as  “tools  and  techniques  that  make  it  easier  to   put  content  online  and  keep  it  there.”38  Byrne  and  Goddard  also  note  that  “Linked  Data  becomes   more  powerful  the  more  of  it  there  is.  Until  there  is  enough  linking  between  collections  and   imaginative  uses  of  data  collections  there  is  a  danger  librarians  will  see  linked  data  as  simply   another  metadata  standard,  rather  than  the  powerful  discovery  tool  it  will  underpin.”39    Alemu  et  al.  concur  that  making  Linked  Data  easy  to  create  and  put  online  is  necessary  before   potential  implementers  will  begin  to  use  it.  “It  is  imperative  that  the  said  technologies  be  made   relatively  easy  to  learn  and  use,  analogous  to  the  simplicity  of  creating  HTML  pages  during  the   early  days  of  the  web.”40  The  potential  learning  curve  involved  in  Linked  Data  may  be  a  great   barrier  to  its  potential  use.  
Tennant writes in an article about moving away from MARC to a more modern bibliographic framework that users “must dramatically expand our understanding of what it means to have a modern bibliographic infrastructure, which will clearly require sweeping professional learning and retooling.”41

Even without considering ease-of-use difficulties or the challenges in teaching practitioners an entirely new bibliographic system, the fact remains that transitioning away from MARC toward any new bibliographic infrastructure system will require a great deal of resources, time, and effort. “There are literally billions of records in MARC formats; an attempt at making the slightest move away from it would have huge implications in terms of resources.”42 Breeding also writes of the potential trauma involved in shifting away from MARC, which is currently integral to many library automation systems.43 A shift to anything else would require not just the cooperation of libraries but also of vendors, who may see no reason to create systems compatible with anything other than MARC. As Tennant writes, “Anyone who has ever been involved with migrating from one integrated library system to another knows, even moving from one system based on MARC/AACR2 to another can be daunting.”44 Moving from a MARC/AACR2-based system to one based on an entirely new framework may be more of a challenge than many libraries would like to take on.

A move to something such as BIBFRAME may be fraught with even more difficulty, though it is impossible to say before such an implementation has been fully realized.
Library  system  software   is  not  yet  compatible  with  BIBFRAME,  and  as  Kroeger  writes,  “Most  libraries  will  not  be  able  to   implement  BIBFRAME  because  their  systems  do  not  support  it,  and  software  vendors  have  little   incentive  to  develop  BIBFRAME  integrated  library  systems  without  reasonable  certainty  of  library   implementation  of  BIBFRAME.”45  This  catch-­‐22  situation  may  be  difficult  to  remedy  without  a   large  cooperative  effort  between  libraries,  vendors,  and  the  entire  information  community.   Another  potential  obstacle  to  BIBFRAME  implementation  that  Kroeger  suggests  is  the  possible   difficulty  in  providing  interoperability  with  all  of  the  many  other  metadata  standards  currently  in   existence.46  This  is  an  issue  that  Tennant  also  considers  in  his  recommendations  that  a  new   bibliographic  infrastructure  compatible  with  modern  library  and  information  needs  must  be   versatile,  extensible,  and  especially  interoperable  with  other  metadata  schemes  currently  in  use.47   XML  has  proven  to  be  useful  for  a  wide  variety  of  metadata  schemas,  but  BIBFRAME  would  need  to   be  able  to  make  library  data  held  in  a  huge  variety  of  metadata  standards  available  for  use  on  the   web.   Another  issue,  cited  by  Byrne  and  Goddard,  is  that  of  privacy.  “Librarians,  with  their  long  tradition   of  protecting  the  privacy  of  patrons,  will  have  to  take  an  active  role  in  linked  data  development  to   ensure  rights  are  protected.”48  Issues  of  copyright  and  ownership,  something  libraries  already   grapple  with  in  the  licensing  of  various  library  journals,  databases,  and  other  electronic  resources,   may  be  insurmountable.  “Libraries  no  longer  own  much  of  the  content  they  provide  to  users;   rather  it  is  subscribed  to  from  a  variety  of  vendors.  
Not only does that mean that vendors will have to make their data available in linked data formats for improvements to federated search to happen, but a mix of licensed and free content in a linked data environment would be extremely difficult to manage.”49 Again, overcoming obstacles such as these would require intense negotiation and cooperation between libraries and vendors. A sustainable and viable move to a Linked Data environment would need to be a cooperative effort between all involved parties and would have to have the full support and commitment of everyone involved before it could begin to move forward.

Moving Libraries toward Linked Data

Making the move toward the use of Linked Data and modern bibliographic implementations such as BIBFRAME will require a great deal of cooperation, sharing, learning, and investigation, but libraries are already starting to look toward a linked future and what it will take to get there. Libraries will need to begin incorporating the principles of Linked Open Data in their own catalogs and online resources as well as publishing and sharing as much data as possible. Libraries also need to put forth a concerted effort to encourage vendors to move toward library systems that can accommodate a linked data environment.

Alemu et al. write that cooperation and collaboration between all of the involved stakeholders will be a crucial piece to the transfer of library metadata from catalog to web. In the process, and as part of this cooperative effort, libraries will have to wholeheartedly adopt the RDF/XML format, something Alemu et al.
deem “mandatory.”50 This would support the “conceptual shift from perceiving library metadata as a document or record to what Coyle (2010) terms as actionable metadata, i.e., one that is machine-readable, mash-able and re-combinable metadata.”51

Chudnov adds that libraries will need to follow “steady URL patterns” for as many of their resources as possible, one of the key rules of Linked Data.52 He also notes that we will know we have made progress on the implementation of Linked Data when “link hubs at smaller libraries (aka catalogs and discovery systems) cross link between local holdings, authorities, these national authority files, and peer libraries that hold related items,” though the real breakthrough will come when “the big national hubs add reciprocal links back out to smaller hub sites.”53 Before this can happen, however, libraries must make sure that all of their own holdings link to each other, from the catalog to items in online exhibits.
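The ideas of steady URL patterns and of holdings that link to each other can be illustrated with a minimal sketch. It is not drawn from Chudnov or from any particular library system: the base URL, the resource kinds, the local identifiers, and the external authority URI are all hypothetical placeholders, and the statements are written as simple N-Triples lines rather than RDF/XML. The two predicates are the existing `rdfs:seeAlso` and `dcterms:subject` properties.

```python
from urllib.parse import quote

# Hypothetical base URL for a library's public resources (an assumption).
BASE = "https://library.example.org"

# Real, widely used predicates from RDF Schema and Dublin Core Terms.
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"


def resource_uri(kind: str, local_id: str) -> str:
    """Build a URI from one fixed pattern, BASE/kind/id, so links stay stable."""
    return f"{BASE}/{quote(kind, safe='')}/{quote(local_id, safe='')}"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement with a URI object as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical local holdings: a catalog record and an online exhibit item.
catalog_record = resource_uri("catalog", "b1234567")
exhibit_item = resource_uri("exhibits", "item-42")

# Hypothetical external authority URI standing in for a national authority file.
external_authority = "https://authority.example.org/subjects/s0001"

links = [
    # Local holdings link to each other in both directions.
    triple(catalog_record, RDFS_SEE_ALSO, exhibit_item),
    triple(exhibit_item, RDFS_SEE_ALSO, catalog_record),
    # The catalog record also links out to an external URI.
    triple(catalog_record, DCTERMS_SUBJECT, external_authority),
]

for line in links:
    print(line)
```

Because every URI is produced by the same function, a record's address can be predicted from its identifier alone, which is what allows other sites to link to it reliably.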
Chudnov also advocates adding user-generated knowledge into the mix by allowing users to make new connections between resources when and where they can.54

Borst, Fingerle, and Neubert, in their conference report from 2009, write that libraries and projects using linked data need to regard the catalog as a network, publish their data as Linked Data using the Semantic Web standards laid out by Tim Berners-Lee, and link to external URIs.55 They also suggest libraries use and help to further develop open standards that are already available rather than rely on in-house developments.56 In their final recommendation, they write that while libraries need to publish their data as open Linked Data on the web, they should also try to do so with the “least possible restrictions imposed by licences in order to ensure widest re-usability.”57

CONCLUSION

The theories behind Linked Data and the Semantic Web are still in the process of being drawn out, but it is clear that at this point they are more than hypotheticals. Linked Data is the possible future of the web and how information will be organized, searched for, discovered, and retrieved. As search algorithms continue to improve and users continue to turn to them first (and sometimes entirely) for their information needs, libraries will need to make major changes to ensure the data they have painstakingly created and curated over the decades remains relevant and reachable to users on the web.
Linked Data provides the opportunity for libraries to integrate their authoritative data with user-generated data from the web, creating a rich network of reliable, current, far-reaching resources that will meet users’ needs wherever they are.

Libraries have always been known to embrace technology to stay at the forefront of user needs and provide unique and irreplaceable user services. To stay current with shifts in modern technology and user behavior, libraries need to be a driving force in the implementation of Linked Data, embrace Semantic Web standards, and take full advantage of the benefits and opportunities they present. Ultimately, libraries can leverage the advantages created by Linked Data to construct a better information experience for users, keeping libraries a relevant and more highly valued part of information retrieval in the twenty-first century.

REFERENCES

1. Leif Andresen, “After MARC—What Then?” Library Hi Tech 22, no. 1 (2004): 41.
2. Ibid., 40-51.
3. Ibid., 43.
4. Roy Tennant, “MARC Must Die,” Library Journal 127, no. 17 (2002): 26–28, http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die/#_.
5. Andresen, “After MARC—What Then?”
6. Tennant, “MARC Must Die.”
7. Andresen, “After MARC—What Then?,” 43.
8. Tennant, “MARC Must Die.”
9. Getaneh Alemu et al., “Linked Data for Libraries: Benefits of a Conceptual Shift From Library-Specific Record Structures to RDF-based Data Models,” New Library World 113, no. 11/12 (2012): 549-570, http://dx.doi.org/10.1108/03074801211282920.
10. Marshall Breeding, “Linked Data: The Next Big Wave or Another Tech Fad?,” Computers in Libraries 33, no. 3 (2013): 20-22, http://www.infotoday.com/cilmag/.
11. Alemu et al., “Linked Data for Libraries.”
12. Mauro Guerrini and Tiziana Possemato, “Linked Data: A New Alphabet for the Semantic Web,” Italian Journal of Library & Information Science 4, no. 1 (2013): 79-80, http://dx.doi.org/10.4403/jlis.it-6305.
13. Tom Baker, “Designing Data for the Open World of the Web,” Italian Journal of Library & Information Science 4, no. 1 (2013): 64, http://dx.doi.org/10.4403/jlis.it-6308.
14. Tim Berners-Lee, “Linked Data,” W3.org, last modified June 18, 2009, http://www.w3.org/DesignIssues/LinkedData.html.
15. Karen Coyle, “Library Linked Data: An Evolution,” Italian Journal of Library & Information Science 4, no. 1 (2013): 58, http://dx.doi.org/10.4403/jlis.it-5443.
16. Gianfranco Crupi, “Beyond the Pillars of Hercules: Linked Data and Cultural Heritage,” Italian Journal of Library & Information Science 4, no. 1 (2013): 36, http://dx.doi.org/10.4403/jlis.it-8587.
17. Coyle, “Library Linked Data: An Evolution,” 56.
18. Ibid., 56-57.
19. Crupi, “Beyond the Pillars of Hercules,” 35.
20. Alemu et al., “Linked Data for Libraries,” 562.
21. Ibid.
22. Thea Lindquist et al., “Using Linked Open Data to Enhance Subject Access in Online Primary Sources,” Cataloging & Classification Quarterly 51 (2013): 913-928, http://dx.doi.org/10.1080/01639374.2013.823583.
23. Barbara Tillett, “RDA and the Semantic Web, Linked Data Environment,” Italian Journal of Library & Information Science 4, no. 1 (2013): 140, http://dx.doi.org/10.4403/jlis.it-6303.
24. Alemu et al., “Linked Data for Libraries.”
25. Tillett, “RDA and the Semantic Web, Linked Data Environment,” 140.
26. Gillian Byrne and Lisa Goddard, “The Strongest Link: Libraries and Linked Data,” D-Lib Magazine 16, no. 11/12 (2010), http://dx.doi.org/10.1045/november2010-byrne.
27. Alemu et al., “Linked Data for Libraries,” 560.
28. Library of Congress, Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services (Washington, DC: Library of Congress, November 21, 2012), http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf.
29. Barbara Tillett, “What is FRBR? A Conceptual Model for the Bibliographic Universe,” Library of Congress, 2003, http://www.loc.gov/cds/downloads/FRBR.PDF.
30. “BIBFRAME Frequently Asked Questions,” Library of Congress, http://www.loc.gov/bibframe/faqs/#q04.
31. Ibid.
32. Lindquist et al., “Using Linked Open Data to Enhance Subject Access in Online Primary Sources,” 923.
33. Alan Danskin, “Linked and Open Data: RDA and Bibliographic Control,” Italian Journal of Library & Information Science 4, no. 1 (2013): 157, http://dx.doi.org/10.4403/jlis.it-5463.
34. Erik T. Mitchell, “Three Case Studies in Linked Open Data,” Library Technology Reports 49, no. 5 (2013): 26-43, http://www.alatechsource.org/taxonomy/term/106.
35. Angela Kroeger, “The Road to BIBFRAME: The Evolution of the Idea of Bibliographic Transition into a Post MARC Future,” Cataloging & Classification Quarterly 51 (2013): 881, http://dx.doi.org/10.1080/01639374.2013.823584.
36. Jason W. Dean, “Charles A. Cutter and Edward Tufte: Coming to a Library near You, via BIBFRAME,” In the Library with the Lead Pipe, December 4, 2013, http://www.inthelibrarywiththeleadpipe.org/2013/charles-a-cutter-and-edward-tufte-coming-to-a-library-near-you-via-bibframe/.
37. “BIBFRAME Frequently Asked Questions,” Library of Congress, http://www.loc.gov/bibframe/faqs/#q04.
38. Daniel Chudnov, “What Linked Data Is Missing,” Computers in Libraries 31, no. 8 (2011): 35-36, http://www.infotoday.com/cilmag.
39. Byrne and Goddard, “The Strongest Link: Libraries and Linked Data.”
40. Alemu et al., “Linked Data for Libraries,” 557.
41. Roy Tennant, “A Bibliographic Metadata Infrastructure for the Twenty-First Century,” Library Hi Tech 22, no. 2 (2004): 175-181, http://dx.doi.org/10.1108/07378830410524602.
42. Alemu et al., “Linked Data for Libraries,” 556.
43. Breeding, “Linked Data.”
44. Tennant, “A Bibliographic Metadata Infrastructure for the Twenty-First Century.”
45. Kroeger, “The Road to BIBFRAME,” 884-885.
46. Ibid.
47. Tennant, “A Bibliographic Metadata Infrastructure for the Twenty-First Century.”
48. Byrne and Goddard, “The Strongest Link: Libraries and Linked Data.”
49. Ibid.
50. Alemu et al., “Linked Data for Libraries.”
51. Ibid., 563.
52. Chudnov, “What Linked Data Is Missing.”
53. Ibid.
54. Ibid.
55. Timo Borst, Birgit Fingerle, and Joachim Neubert, “How Do Libraries Find Their Way onto the Semantic Web?” Liber Quarterly 19, no. 3/4 (2010): 336–43, http://liber.library.uu.nl/index.php/lq/article/view/7970/8271.
56. Ibid.
57. Ibid., 342-343.