DSpace / DS-3672

Optimizations of database access for batch operations are fragile

    Details

    • Type: Improvement
    • Status: Volunteer Needed
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 6.0, 6.1, 6.2, 7.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Attachments: 0
    • Comments: 3
    • Documentation Status: Needed

      Description

      Recent issues have uncovered deeper problems with the way DSpace interacts with Hibernate, particularly in the optimization of batch operations.  Various losses of cache coherence have been seen, typically ending in "no-update UPDATE" exceptions (StaleStateException) or "no Session" complaints.

      DSpace tries to manage Hibernate's caches to avoid building up huge inventories of objects that are fully processed but not yet committed.  But we've tried to do this close to the Session and far from the code that understands how batches are processed.  We may be better off letting the persistence layer manage its caches in detail, and having the batch code simply limit the amount of work done before each commit, breaking the work into multiple commits if necessary.  Many blog posts and StackOverflow articles point to this pattern.
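
A minimal sketch of the "limit the work per commit" pattern described above, not DSpace's actual code. `UnitOfWork` and `ChunkedBatch` are hypothetical names invented for illustration; `UnitOfWork.commit()` stands in for whatever ends the current transaction in the real system (for example, a commit on the DSpace Context or the Hibernate transaction). The batch code only counts items and commits at chunk boundaries, and leaves cache management to the persistence layer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for whatever owns the transaction. */
interface UnitOfWork {
    /** Flush pending changes and end the current transaction. */
    void commit();
}

/** Hypothetical helper: processes items in chunks, committing after each chunk. */
final class ChunkedBatch {
    private ChunkedBatch() {}

    /**
     * Applies work to every item, committing after each full chunk of
     * chunkSize items and once more at the end if a partial chunk remains.
     *
     * @return the number of commits issued
     */
    static <T> int process(Iterable<T> items, int chunkSize,
                           Consumer<T> work, UnitOfWork uow) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int pending = 0;  // items processed since the last commit
        int commits = 0;
        for (T item : items) {
            work.accept(item);
            pending++;
            if (pending == chunkSize) {
                uow.commit();  // bound the amount of uncommitted work
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {     // commit the final, partial chunk
            uow.commit();
            commits++;
        }
        return commits;
    }
}
```

With seven items and a chunk size of three, this issues three commits: two for the full chunks and one for the remaining item. The sketch deliberately says nothing about objects loaded before a commit; whether they remain usable afterwards depends on how the real Session is scoped, so the `work` callback is assumed to load whatever it needs for each item.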

      Some resources:

       

              People

              Assignee:
              Unassigned
              Reporter:
              mwood Mark H. Wood
              Votes:
              0
              Watchers:
              3

                Dates

                Created:
                Updated: