  1. DSpace
  2. DS-2823

Discovery doesn't let you LIMIT the size of full text indexed

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 4.0, 4.1, 4.2, 4.3, 5.0, 5.1, 5.2, 5.3
    • Fix Version/s: None
    • Component/s: Discovery, DSpace API, Solr
    • Labels: None
    • Attachments: 0
    • Comments: 7
    • Documentation Status: Not Required

      Description

      This bug was discovered in a DSpaceDirect production site.

      Discovery provides no way to limit the amount of full text that a site indexes for search/browse. The legacy Lucene search engine supported the "search.maxfieldlength" setting in dspace.cfg:
      https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace/config/dspace.cfg#L297

      However, Discovery ignores the "search.maxfieldlength" setting and always indexes the full text of the files in the TEXT bundle:
      https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java#L1373

      This means that if a site stores large text-based files in DSpace, the Solr index may grow rapidly. This could cause memory issues if many of these documents are loaded into memory at once, although DS-2832 has already resolved some of those issues.

      Nonetheless, Discovery should provide a way to limit the size of the full text that is indexed, for sites that need this capability or wish to better control the size of their Solr Index.
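
      As a rough illustration of the kind of limit requested here, the sketch below caps the number of characters read from an extracted-text stream before it would be handed to the indexer. This is not DSpace code: the class FullTextLimiter, the method readLimited, and the convention that a negative limit means "unlimited" are all assumptions made for the example. The legacy "search.maxfieldlength" setting is described above as a field length limit for Lucene, so a character cap like this one is only an approximation of that behavior.

```java
import java.io.IOException;
import java.io.Reader;

/**
 * Hypothetical helper (not part of DSpace): caps the amount of text
 * read from a stream before it is passed on for indexing.
 */
public final class FullTextLimiter {

    private FullTextLimiter() { }

    /**
     * Reads at most maxChars characters from the reader.
     * A negative maxChars means "no limit" (an assumed convention for this sketch).
     */
    public static String readLimited(Reader reader, int maxChars) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        while (maxChars < 0 || out.length() < maxChars) {
            // Never request more characters than the remaining budget allows.
            int want = maxChars < 0
                    ? buffer.length
                    : Math.min(buffer.length, maxChars - out.length());
            int read = reader.read(buffer, 0, want);
            if (read == -1) {
                break; // end of stream reached before the limit
            }
            out.append(buffer, 0, read);
        }
        return out.toString();
    }
}
```

      For example, readLimited(new StringReader("the quick brown fox"), 9) returns "the quick". The limit would presumably come from a configuration property, and the rest of the stream is simply never read, which keeps both memory use and index size bounded.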

              People

              • Assignee: Tim Donohue (tdonohue)
              • Reporter: Tim Donohue (tdonohue)
              • Votes: 0
              • Watchers: 2
