Bagging Meeting Notes 2016-04-19

Present: Andrew Diamond, Tim DiLauro, Jamie Little, Steve Morris, Mike Marttila, Terry Brady (remote), Nathan Tallman

  • Is it possible for APTrust to use reverse domain naming for institution identifiers (Java package style)?
    • Would allow for semantic bag naming, identifying the bag source within the institution
      • e.g. edu.jhu.library.collection.id
    • Most institutions use domain.edu as their institution identifier, but not all
      • Andrew is looking into identifier conventions; they can be changed
  • When restoring bags, it's not always clear which tag-files were added by the institution and which were added by APTrust.
    • An institution is interested in using tag files to reference encryption keys for encrypted content (but not to hold the actual keys).
    • Andrew cautioned about a single point of failure if the key or tag file is deleted or not included.
    • Nathan also asked if original owners/file permissions are retained and restored. Andrew will look into it.
  • APTrust bag specification maintenance
    • Andrew updates the wiki whenever there are new features or changes. (Andrew has been making changes when several members request a change or bring up the same issue.)
      • It would be good to have a larger group weigh in on proposed changes. Who is the appropriate group?
    • Our bag specification is more restrictive than the Library of Congress BagIt specification. We should probably avoid this.
  • Restrictions to POSIX naming conventions for files and directories in bags are problematic for some members.
    • Forces institutions to rename files for long-term preservation -- sometimes at odds with the mission
    • Issue stems from encoded URL entities for Fluctus -- it might be possible to double-encode URL entities.
      • Who makes this call?
  • When bag access level is set to private, to whom is it private? What is the difference between private and institution access levels?
    • Andrew wasn't sure; perhaps private to the institutional admin? He will look into it.
  • At what level are institutions bagging their content? Item/work level, collection level, or platform level?
    • Fluctus should be able to handle it all. (Though platform-level Fedora content presents the POSIX issue, and there are currently performance issues with bags containing many individual files.)
    • Each institution might define item/work, collection, and platform differently.
    • Would be good to document pros/cons of each approach, get the 20,000-foot view of what the bag looks like, and document "gotcha" concerns.
  • Other bagging practices where it would be helpful to know how others are handling them
    • Unprocessed digital archives
    • What's included
      • Master files only? Master and access files?
    • Metadata
      • Inside the data directory? As a tag file?
      • How to update metadata only? (It's possible to upload a partial bag containing only metadata, which updates the current bag's metadata without having to re-transfer the content itself.)
    • Namespacing of bags
    • Partner tools -- how are people using them?
  • Nathan will take the lead in drafting a survey of bagging practices, incorporating the above points.
  • Nathan was appointed working group chair.
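
Illustration of the reverse domain naming idea from the first item: a minimal Python sketch, not an APTrust convention or tool. The function name reverse_domain_bag_name and its parameters are hypothetical, and it assumes a bag name is simply the institution's domain labels in reverse order followed by any institution-chosen parts (such as a collection and an item id).

```python
def reverse_domain_bag_name(domain, *parts):
    """Build a reverse-domain-style bag name (hypothetical helper).

    domain: an institutional domain such as "library.jhu.edu"
    parts:  optional institution-chosen suffixes, e.g. collection and item id
    """
    # Split the domain into labels, lowercase them, and drop empty labels
    # (e.g. from a trailing dot).
    labels = [label for label in domain.lower().split(".") if label]
    # Reverse the labels (Java package style) and append the suffix parts.
    return ".".join(list(reversed(labels)) + [str(p) for p in parts])


if __name__ == "__main__":
    # Reproduces the pattern from the notes: edu.jhu.library.collection.id
    print(reverse_domain_bag_name("library.jhu.edu", "collection", "id"))
```

With a name built this way, the leading labels identify the institution and the remaining labels identify the bag's source within it, which is the semantic naming benefit raised in the discussion.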