Archive for the ‘Metadata’ Category

MARCOut to MARCIn(gest)

Thursday, September 20th, 2007

We’ve been working with DPL to batch ingest more than 120,000 archival digital images (TIFFs) from Western History and Genealogy (WHG) into the ADR, along with the associated MARC files extracted from DPL’s CARL ILS…

Keith has written a batch ingest utility that transforms raw MARC to MARC XML and then generates “sidecars” of cross-walked metadata (MODS, DC, etc.) to be ingested along with the TIFF into Fedora and then indexed in Fez. The utility also reports orphaned records and files, along with any malformed data. Up next is broadening it to other schemas and formats…
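To give a feel for the orphan reporting, here is a minimal Python sketch. It is not Keith's utility; it assumes (hypothetically) that each image file is named after its record identifier, and the function name `find_orphans` and the sample identifiers are made up for illustration.

```python
from pathlib import PurePath


def find_orphans(record_ids, file_names):
    """Return (records_without_files, files_without_records).

    Assumption for this sketch: a file "a001.tif" belongs to record "a001".
    """
    ids = set(record_ids)
    # Map each file's stem (name without extension) back to the full name.
    stems = {PurePath(name).stem: name for name in file_names}
    records_without_files = sorted(ids - set(stems))
    files_without_records = sorted(
        name for stem, name in stems.items() if stem not in ids
    )
    return records_without_files, files_without_records


# Made-up identifiers, purely for illustration.
missing_files, missing_records = find_orphans(
    ["a001", "a002"], ["a001.tif", "a003.tif"]
)
print("Records with no image:", missing_files)
print("Images with no record:", missing_records)
```

A real run would pull the identifiers from the MARC records and the file names from the directory being ingested; the set comparison itself stays the same.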

Challenges to date:

  • How Fez handles displaying and editing multiple repeating fields that do not include attributes (e.g. <title> vs. <title type="alternative">)
  • Estimating real-time processing speeds (i.e. how long, per image, it takes to get from CD at DPL to published object in ADR)
  • How to handle the lack of mapping of the local call number field (099) to MODS in the LC crosswalk
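
On the 099 question, one possible workaround is to carry the local call number across ourselves after the crosswalk has run. The Python sketch below is only an illustration of that idea, not what the utility does: putting the value in a MODS <classification> element is an assumption (another element may suit local practice better), the function names are hypothetical, and the sample call number is made up.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
MODS_NS = "http://www.loc.gov/mods/v3"


def local_call_numbers(marc_record):
    """Collect the subfield $a values of every 099 field in a MARC XML record."""
    numbers = []
    for field in marc_record.findall(f"{{{MARC_NS}}}datafield[@tag='099']"):
        parts = [
            sf.text.strip()
            for sf in field.findall(f"{{{MARC_NS}}}subfield[@code='a']")
            if sf.text
        ]
        if parts:
            numbers.append(" ".join(parts))
    return numbers


def add_call_numbers(mods_root, marc_record):
    """Append one <classification> per 099 field (an assumed mapping)."""
    for number in local_call_numbers(marc_record):
        element = ET.SubElement(mods_root, f"{{{MODS_NS}}}classification")
        element.text = number
    return mods_root


# Made-up record for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="099" ind1=" " ind2=" ">
    <subfield code="a">X-12345</subfield>
  </datafield>
</record>"""

mods = ET.Element(f"{{{MODS_NS}}}mods")
add_call_numbers(mods, ET.fromstring(SAMPLE))
print(ET.tostring(mods, encoding="unicode"))
```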

We’ve got over 1,500 objects ingested, indexed, and access-controlled in our production environment at the moment… a little over 1%… but it’s a start!
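
For the processing-speed question, the arithmetic is simple once a per-object time is known. The Python sketch below uses the round figures from this post (1,500 done out of 120,000); the 60 seconds per object is a placeholder, not a measurement, and `ingest_progress` is a hypothetical helper.

```python
def ingest_progress(done, total, seconds_per_object):
    """Return (percent complete, estimated hours remaining)."""
    remaining = total - done
    percent_done = 100.0 * done / total
    hours_remaining = remaining * seconds_per_object / 3600.0
    return percent_done, hours_remaining


# 60 seconds per object is an assumed placeholder, not a measured value.
percent, hours = ingest_progress(done=1500, total=120000, seconds_per_object=60)
print(f"{percent:.2f}% done, roughly {hours:.0f} hours of processing to go")
```

With those round numbers the progress works out to 1.25%, which matches the “little over 1%” above; the hours figure is only as good as the per-object time plugged in.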