DS-2411: DSpace AIP ingester is "memory intensive" for large batch ingest sets

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.3, 3.3, 4.2, 5.0
    • Fix Version/s: 5.0
    • Component/s: DSpace API
    • Labels: None
    • Attachments: 0
    • Comments: 2
    • Documentation Status: Not Required

      Description

      When performing an AIP ingest on a large batch set of AIPs, each created/restored DSpaceObject is kept in memory. This causes the ingest process to be very memory intensive for large restorations.

      The offending class is AbstractPackageIngester, which stores a DSpaceObject for every object that it successfully restores/replaces/ingests:

      https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/content/packager/AbstractPackageIngester.java#L81

      Past versions of DSpace also had this problem, for example DSpace 4.x:
      https://github.com/DSpace/DSpace/blob/dspace-4_x/dspace-api/src/main/java/org/dspace/content/packager/AbstractPackageIngester.java#L63

      Since DSpaceObjects can be large, memory use grows with the number of objects successfully ingested, so larger batches need more memory to complete.

      A PR is coming shortly to fix this issue by storing only each successfully ingested object's identifier (i.e. its Handle) rather than the entire object.
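
      The idea of tracking identifiers instead of objects could be sketched roughly as follows. This is only an illustration, not the actual DSpace code or the eventual PR: FakeObject, IngestTracker, recordIngested and getIngestedHandles are hypothetical names, and the "123456789/..." handles are made-up sample values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: none of these types are real DSpace classes.
class HandleTrackingSketch {

    /** Hypothetical stand-in for a DSpaceObject: a handle plus a large payload. */
    static final class FakeObject {
        final String handle;
        final byte[] payload; // simulates the memory a full object would hold

        FakeObject(String handle, int payloadBytes) {
            this.handle = handle;
            this.payload = new byte[payloadBytes];
        }
    }

    /** Hypothetical tracker that remembers only identifiers, never the objects. */
    static final class IngestTracker {
        private final List<String> ingestedHandles = new ArrayList<>();

        /** Record a successful ingest by copying out the handle string only. */
        void recordIngested(FakeObject obj) {
            ingestedHandles.add(obj.handle);
        }

        /** Read-only view of the handles recorded so far. */
        List<String> getIngestedHandles() {
            return Collections.unmodifiableList(ingestedHandles);
        }
    }

    public static void main(String[] args) {
        IngestTracker tracker = new IngestTracker();
        for (int i = 0; i < 1000; i++) {
            // Each simulated object carries 64 KiB of payload.
            FakeObject obj = new FakeObject("123456789/" + i, 64 * 1024);
            tracker.recordIngested(obj);
            // No reference to obj survives this iteration, so the payload
            // becomes eligible for garbage collection.
        }
        System.out.println("Handles recorded: " + tracker.getIngestedHandles().size());
    }
}
```

      With this shape, the tracker's footprint grows with the number and length of the handle strings rather than with the size of the ingested objects. Code that later needs a full object would have to look it up again by its handle.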

            People

            • Assignee: Tim Donohue (tdonohue)
            • Reporter: Tim Donohue (tdonohue)
            • Votes: 0
            • Watchers: 1