DSpace / DS-3286

Slow batch operations due to Hibernate caching

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.1, 6.2
    • Component/s: None
    • Labels: None
    • Attachments: 0
    • Comments: 8
    • Documentation Status: Needed

      Description

      As Tom Desair noted in DS-3086, all batch operations in DSpace (those which employ long-running database transactions) are susceptible to memory over-use due to Hibernate caching. This can manifest itself as a large slowdown after processing a few hundred items in batch operations.

      I have confirmed that this is the case with the dspace ingest command, which ingests a directory of items in Simple Archive Format. The command currently does all of its work in a single large database transaction. After a couple hundred items, processing slows down very noticeably, to the point where ingesting several thousand items would be impractical.

      I have also confirmed that the problem goes away for ingest when it uses the new Context.enableBatchMode, Context.getCacheSize, and Context.commit methods (commit clears the Hibernate cache as well as committing the underlying db connection).
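
      A minimal sketch of the commit-when-the-cache-grows pattern described above. It is not the actual ingest code: BatchContext is a hypothetical stand-in that only borrows the three method names mentioned in this issue (the real signatures on org.dspace.core.Context may differ, for example in checked exceptions), and FakeContext, ingestAll, and the threshold of 100 are illustrative assumptions so the example runs without DSpace or a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchIngestSketch {

    /** Hypothetical stand-in for the three Context methods named in this issue. */
    interface BatchContext {
        void enableBatchMode(boolean enabled);
        long getCacheSize();
        void commit();
    }

    /** In-memory fake: counts "cached" entities and commits, nothing more. */
    static class FakeContext implements BatchContext {
        long cached = 0;
        int commits = 0;
        boolean batchMode = false;

        public void enableBatchMode(boolean enabled) { batchMode = enabled; }
        public long getCacheSize() { return cached; }
        // Assumption taken from the issue text: commit also clears the cache.
        public void commit() { commits++; cached = 0; }
        // Simulates one entity entering the session cache.
        void load() { cached++; }
    }

    /**
     * Processes every item, committing whenever the cache reaches maxCacheSize
     * so that the session never holds the whole batch at once.
     */
    static int ingestAll(BatchContext ctx, List<String> items,
                         Consumer<String> worker, long maxCacheSize) {
        ctx.enableBatchMode(true);
        int processed = 0;
        try {
            for (String item : items) {
                worker.accept(item);          // placeholder for the real ingest work
                processed++;
                if (ctx.getCacheSize() >= maxCacheSize) {
                    ctx.commit();             // flush work so far and empty the cache
                }
            }
            ctx.commit();                     // commit whatever is left at the end
        } finally {
            ctx.enableBatchMode(false);
        }
        return processed;
    }

    public static void main(String[] args) {
        FakeContext ctx = new FakeContext();
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            items.add("item-" + i);
        }
        int n = ingestAll(ctx, items, item -> ctx.load(), 100);
        System.out.println(n + " items, " + ctx.commits + " commits");
    }
}
```

      The trade-off in this sketch is that a failure part-way through leaves earlier chunks committed, unlike the single-transaction approach, so a real batch command would need to decide how to report or resume from a partial run.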

      It is very likely that this problem also exists for several, if not all, of the following:

      • Discovery reindexing (except the -i option, which already employs the new methods to avoid memory over-use; see the PR for DS-3271)
      • Curation tasks
      • Item import/export
      • CSV batch export and modification
      • Possibly others

              People

              • Assignee: Unassigned
              • Reporter: Chris Wilper (cwilper)
              • Votes: 0
              • Watchers: 3
