DS-849

Create a non-Porter-stemming analyzer for DSpace

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.6.2, 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: Documentation, DSpace API
    • Labels: None
    • Attachments: 3
    • Comments: 4
    • Documentation Status: Needed

      Description

      For some DSpace use cases, the index built by the standard search analyzer (org.dspace.search.DSAnalyzer) yields imprecise search results, because stemming lets a query match any word that reduces to the same stem. An alternate analyzer that omits PorterStemFilter will help in those cases. See these threads for more of the backstory:

      http://comments.gmane.org/gmane.comp.db.dspace.user/13404
      http://comments.gmane.org/gmane.comp.db.dspace.user/13407
      http://comments.gmane.org/gmane.comp.db.dspace.user/13420
      http://comments.gmane.org/gmane.comp.db.dspace.user/13427

      I'm attaching a patch, but it's more of a kit. You must first copy [dspace-src]/dspace-api/src/main/java/org/dspace/search/DSAnalyzer.java to [dspace-src]/dspace-api/src/main/java/org/dspace/search/DSNonStemmingAnalyzer.java; only then can you apply the patch.
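
The copy step might be scripted roughly as follows. This is a minimal sketch, not part of the attached patch: the function name `copy_analyzer` is hypothetical, and its argument stands in for [dspace-src]. Applying the patch itself is left out because the command depends on the attachment's file name and path prefix.

```shell
#!/bin/sh
# Sketch: create the file the patch expects by copying DSAnalyzer.java.
# $1 is the DSpace source root, i.e. the [dspace-src] placeholder in the ticket.
copy_analyzer() {
    dir="$1/dspace-api/src/main/java/org/dspace/search"

    # The stock analyzer must be present to copy from.
    if [ ! -f "$dir/DSAnalyzer.java" ]; then
        echo "DSAnalyzer.java not found under $dir" >&2
        return 1
    fi

    # Refuse to clobber a copy that may already have been patched.
    if [ -e "$dir/DSNonStemmingAnalyzer.java" ]; then
        echo "DSNonStemmingAnalyzer.java already exists; not overwriting" >&2
        return 1
    fi

    cp "$dir/DSAnalyzer.java" "$dir/DSNonStemmingAnalyzer.java"
}
```

Calling `copy_analyzer /path/to/dspace-src` and then applying the patch from the same source tree should reproduce the manual procedure described above.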

      After patching, you must alter your dspace.cfg file, uncommenting and changing the search.analyzer line so that it reads:

      search.analyzer = org.dspace.search.DSNonStemmingAnalyzer
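
For those who prefer to script the configuration change, the following is one possible sketch. The function name `set_nonstemming_analyzer` and the `.bak` backup suffix are assumptions made for illustration. Rather than uncommenting an existing line, it rewrites an active `search.analyzer` setting if one exists and otherwise appends a new one, leaving commented-out lines untouched; the resulting active setting is the same.

```shell
#!/bin/sh
# Sketch: point search.analyzer at the non-stemming analyzer.
# $1 is the path to dspace.cfg. A backup is kept as "$1.bak".
set_nonstemming_analyzer() {
    cfg="$1"
    value="search.analyzer = org.dspace.search.DSNonStemmingAnalyzer"

    [ -f "$cfg" ] || return 1
    cp "$cfg" "$cfg.bak" || return 1

    if grep -Eq '^[[:space:]]*search\.analyzer[[:space:]]*=' "$cfg.bak"; then
        # An active (uncommented) setting exists: rewrite it in place.
        sed -E "s|^[[:space:]]*search\.analyzer[[:space:]]*=.*|$value|" \
            "$cfg.bak" > "$cfg"
    else
        # Only commented-out settings (or none): append an active one.
        # The bare echo guards against a file with no trailing newline.
        { cat "$cfg.bak"; echo; echo "$value"; } > "$cfg"
    fi
}
```

Usage would be along the lines of `set_nonstemming_analyzer [dspace]/config/dspace.cfg`, where the config path is an assumption about a typical install layout.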

      Then, do the following:

      • stop Tomcat (taking down your DSpace instance)
      • re-index all content in your DSpace by running:
        [dspace]/bin/dspace index-init
      • start Tomcat
      • test
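
The stop, re-index, start sequence above could be wrapped in a small script. This is a sketch under assumptions: it presumes Tomcat is controlled through the `shutdown.sh` and `startup.sh` scripts in its `bin` directory (many installations use a service manager instead), and the function name `reindex_dspace` is hypothetical. Note that `shutdown.sh` can return before Tomcat has fully exited, so a real script may need to wait for the process to stop.

```shell
#!/bin/sh
# Sketch: stop Tomcat, rebuild the search index, start Tomcat again.
# $1 is the Tomcat home directory, $2 is the DSpace install directory ([dspace]).
reindex_dspace() {
    catalina_home="$1"
    dspace_dir="$2"

    # Take DSpace offline; abort if Tomcat cannot be stopped.
    "$catalina_home/bin/shutdown.sh" || return 1

    # Rebuild the index with the newly configured analyzer.
    "$dspace_dir/bin/dspace" index-init || return 1

    # Bring DSpace back up.
    "$catalina_home/bin/startup.sh"
}
```

The final "test" step stays manual: search for a term whose stemmed form previously produced unwanted matches and confirm they are gone.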

      All credit for this work goes to Tim Donohue and Stuart Yeates; I just put the pieces together into this patch and ticket.

            People

            • Assignee: Tim Donohue (tdonohue)
            • Reporter: Hardy Pottinger (hardyoyo)