  1. Islandora
  2. ISLANDORA-1751

Retroactive checksum application times out on very large collections

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.x-1.8
    • Component/s: Checksum
    • Labels:
      None

      Description

      Although Islandora Checksum nominally uses the Drupal Batch API to apply checksums retroactively, it does not actually process objects in chunks: it gathers the complete list of objects that need checksums and applies them all in a single batch definition. On exceptionally large collections, this causes PHP to hit its execution time limit or its memory limit.

      Steps to Reproduce
      With checksumming enabled, retroactively apply checksums to a collection with a very large number of objects. If applying the checksums takes longer than PHP's maximum execution time (max_execution_time), the batch times out and fails. If the batch definition array is larger than PHP's memory limit (memory_limit) allows, the batch can also be expected to fail.

      The fix
      To fix this, we should apply checksums in batch chunks of a small fixed size (say, 10 objects per pass), so that no single pass has to work through the whole collection. We should also create a Drush script that uses this functionality; that way, large collections can be processed from the command line, taking advantage of Drush's Batch API handling. We should also complete ISLANDORA-1660 in conjunction with this: derivative generation and hook firing add extra time to each checksum, which lengthens the retroactive batch.
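
The chunking described above can be sketched roughly as follows. This is a Python stand-in for the pattern, not the module's PHP code: `apply_checksum`, `checksum_operation`, `run_batch` and the `fetch_page` callback are hypothetical names introduced for illustration. The `context` dictionary loosely mirrors the `sandbox`, `results` and `finished` keys that a Drupal 7 batch operation receives, and the chunk size of 10 is the value suggested in this ticket.

```python
from typing import Callable, Dict, List, Tuple

CHUNK_SIZE = 10  # chunk size suggested in the ticket ("say, 10")

# Hypothetical callback type: (offset, limit) -> list of object identifiers.
FetchPage = Callable[[int, int], List[str]]


def apply_checksum(pid: str) -> str:
    """Hypothetical stand-in for checksumming one object's datastreams."""
    return f"checksummed:{pid}"


def checksum_operation(fetch_page: FetchPage, total: int, context: Dict) -> None:
    """Process at most CHUNK_SIZE objects, then report progress.

    State that must survive between passes lives in context["sandbox"],
    so each pass only ever holds one chunk of identifiers in memory.
    """
    sandbox = context.setdefault("sandbox", {})
    if "offset" not in sandbox:
        # First pass: initialise the persistent state.
        sandbox["offset"] = 0
        context.setdefault("results", [])

    if total == 0:
        context["finished"] = 1.0
        return

    # Fetch only the next chunk instead of the complete object list.
    pids = fetch_page(sandbox["offset"], CHUNK_SIZE)
    for pid in pids:
        context["results"].append(apply_checksum(pid))
    sandbox["offset"] += len(pids)

    # A value below 1.0 asks the runner to call this operation again.
    if not pids:
        context["finished"] = 1.0
    else:
        context["finished"] = min(1.0, sandbox["offset"] / total)


def run_batch(fetch_page: FetchPage, total: int) -> Tuple[List[str], int]:
    """Minimal runner: repeat the operation until it reports completion."""
    context: Dict = {"finished": 0.0}
    passes = 0
    while context["finished"] < 1.0:
        checksum_operation(fetch_page, total, context)
        passes += 1
    return context["results"], passes
```

With a 25-object collection and a `fetch_page` that slices a list, `run_batch` completes in three passes (10, 10 and 5 objects). In a real batch each pass would run as its own request or Drush invocation, so neither the time limit nor the memory limit has to cover the whole collection at once.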

            People

            • Assignee:
              daitken Daniel Aitken
            • Reporter:
              daitken Daniel Aitken