DSpace / DS-466

Add ability to export/import entire Community/Collection/Item structure (for easier backups, migrations, etc.)

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.7.0
    • Component/s: DSpace API
    • Labels: None
    • Attachments: 0
    • Comments: 4
    • Documentation Status: In Comments

      Description

      This comes out of a requirement for DSpace integration with DuraCloud (http://www.duraspace.org/duracloud.php). One of these requirements is to be able to essentially "backup" local DSpace contents into the cloud (as a type of offsite backup), and "restore" those contents at a later time.

      Essentially, we'd like a way to be able to export the entire hierarchy (i.e. bitstreams, metadata and relationships between Communities/Collections/Items) into a relatively standard format (e.g. METS or similar structured packaging format). This entire hierarchy should also be able to be re-imported into DSpace in the same format, to allow for "roundtripping" of that content (essentially a restore of that content in the same or different DSpace installation).
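
      As a rough illustration of the "roundtripping" idea above, here is a minimal sketch. It is not the AIP prototype's packager code and does not use any DSpace API: the `Node` class and the `export_package` / `import_package` functions are hypothetical names, and JSON with base64-encoded bitstreams stands in for a real packaging format such as METS. It only shows the property we are after, that exporting a hierarchy and re-importing it yields the same structure, metadata and bitstreams.

```python
import base64
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Community, Collection, or Item."""
    kind: str                                        # "community", "collection", or "item"
    name: str
    metadata: dict = field(default_factory=dict)     # e.g. {"dc.title": "..."}
    bitstreams: dict = field(default_factory=dict)   # filename -> raw bytes
    children: list = field(default_factory=list)     # nested Node objects


def export_package(node):
    """Serialize a node and everything beneath it into one JSON string."""
    def to_dict(n):
        return {
            "kind": n.kind,
            "name": n.name,
            "metadata": n.metadata,
            # Bytes are not valid JSON, so encode each bitstream as base64 text.
            "bitstreams": {
                filename: base64.b64encode(data).decode("ascii")
                for filename, data in n.bitstreams.items()
            },
            "children": [to_dict(child) for child in n.children],
        }
    return json.dumps(to_dict(node), sort_keys=True)


def import_package(package):
    """Rebuild the hierarchy from a string produced by export_package."""
    def from_dict(d):
        return Node(
            kind=d["kind"],
            name=d["name"],
            metadata=d["metadata"],
            bitstreams={
                filename: base64.b64decode(encoded)
                for filename, encoded in d["bitstreams"].items()
            },
            children=[from_dict(child) for child in d["children"]],
        )
    return from_dict(json.loads(package))


# Build a tiny Community -> Collection -> Item hierarchy and roundtrip it.
item = Node("item", "Thesis", {"dc.title": "Thesis"}, {"thesis.pdf": b"%PDF-1.4 ..."})
collection = Node("collection", "Theses", children=[item])
community = Node("community", "Library", children=[collection])

package = export_package(community)
restored = import_package(package)
```

      Note that this sketch deliberately leaves out Groups, EPeople and permissions, and a real package would also need stable identifiers so that a restore into a different DSpace installation can be handled sensibly.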

      Perceived benefits to DSpace community:

      • Would allow folks to more easily move entire Communities or Collections between DSpace instances.
      • Would allow for a potentially more consistent backup of this hierarchy (e.g. to DuraCloud, or just to your own local backup system), rather than relying on synchronizing a backup of your DB (metadata/relationships) and assetstore (bitstreams).
      • Would provide a way for people to more easily get their data out of DSpace (whatever the purpose may be).
      • Would provide a relatively standard format for people to migrate entire hierarchies (Communities/Collections) into DSpace (from another system).

      Known Issues:

      • Exporting/Importing the Community/Collection/Item hierarchy technically doesn't cover all the "content" held in DSpace. There are also Groups, EPeople and permissions/rights (which would get you closer to a full export/import of all DSpace content). However, concentrating on just the hierarchy of Community/Collection/Item seems like a good first step.

      This is related to (and a partial subset of) MIT's AIP Prototype (http://jira.dspace.org/jira/browse/DS-465). However, the AIP prototype currently does not make it very easy to re-import the exported AIPs for Communities or Collections. So, this feature would extend the AIP prototype's current packagers/crosswalks to allow for a full export and import of an entire DSpace hierarchy, or just a set of Communities, Collections or Items.

      My current plan is to build on the subset of the AIP prototype (essentially the packagers, crosswalks and related changes) which begins to allow for this roundtripping of Communities and Collections. I'll be adding a new SVN sandbox area for this work (so that others can help out, if it interests them). If anyone has comments, suggestions or feedback on this idea, or would like to be involved in this project, please let me know (or add comments to this issue).

      This work is being prototyped in the SVN Sandbox at:
      http://scm.dspace.org/svn/repo/sandbox/aip-external-1_6-prototype/

      More details on this project available on the Wiki at:
      http://wiki.dspace.org/confluence/display/DSPACE/AipBackupRestorePrototype

              People

              • Assignee: Tim Donohue (tdonohue)
              • Reporter: Tim Donohue (tdonohue)
              • Votes: 1
              • Watchers: 1