  1. DSpace
  2. DS-2442

Extracted TEXT is always publicly searchable, even if the source file is restricted

    Details

    • Type: Bug
    • Status: Volunteer Needed
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.2, 5.0
    • Fix Version/s: None
    • Component/s: Discovery, DSpace API
    • Labels: None
    • Attachments: 0
    • Comments: 5
    • Documentation Status: Not Required

      Description

      When the "media-filter" runs on an Item with an access-restricted bitstream (file), the extracted TEXT bitstream is itself access-restricted properly. HOWEVER, Discovery still indexes the extracted TEXT as publicly searchable.

      This means that when you search content within DSpace anonymously, you'll sometimes get results back from access-restricted files.

      See this dspace-tech thread for more info:
      http://dspace.2283337.n4.nabble.com/Possible-bug-in-restricted-PDFs-extracted-text-indexing-tc4676153.html

      It seems that Discovery is not paying attention to access restrictions on Bitstreams at all; it only checks access restrictions on Items.
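
      The difference between the two checks can be sketched roughly as follows. This is a simplified, hypothetical model and not the real DSpace or Discovery code: the class `BitstreamPolicySketch`, its nested `SourceFile` and `Item` types, and both method names are assumptions made up for illustration. The real Discovery implementation may apply access filtering at query time rather than at index time; the sketch only shows which policy is consulted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model for illustration; NOT the real DSpace API.
class BitstreamPolicySketch {

    /** A source file plus the text that media-filter extracted from it. */
    static final class SourceFile {
        final String name;
        final boolean anonymousCanRead; // bitstream-level READ policy
        final String extractedText;

        SourceFile(String name, boolean anonymousCanRead, String extractedText) {
            this.name = name;
            this.anonymousCanRead = anonymousCanRead;
            this.extractedText = extractedText;
        }
    }

    /** An item with its own READ policy and a list of files. */
    static final class Item {
        final boolean anonymousCanRead; // item-level READ policy
        final List<SourceFile> files;

        Item(boolean anonymousCanRead, List<SourceFile> files) {
            this.anonymousCanRead = anonymousCanRead;
            this.files = files;
        }
    }

    /** Behaviour described in this report: only the item-level policy is checked. */
    static List<String> publicTextItemCheckOnly(Item item) {
        List<String> searchable = new ArrayList<>();
        if (!item.anonymousCanRead) {
            return searchable; // restricted item: nothing is publicly searchable
        }
        for (SourceFile file : item.files) {
            searchable.add(file.extractedText); // restricted files leak here
        }
        return searchable;
    }

    /** One possible fix (an assumption): also check each file's own policy. */
    static List<String> publicTextWithBitstreamCheck(Item item) {
        List<String> searchable = new ArrayList<>();
        if (!item.anonymousCanRead) {
            return searchable;
        }
        for (SourceFile file : item.files) {
            if (file.anonymousCanRead) {
                searchable.add(file.extractedText);
            }
        }
        return searchable;
    }
}
```

      With a public Item that holds one open file and one restricted file, the first method exposes the extracted text of both files, while the second exposes only the text of the open file.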

      This bug is also briefly mentioned in DS-2256, which is only loosely related.

            People

            • Assignee: Unassigned
            • Reporter: Tim Donohue (tdonohue)
            • Votes: 0
            • Watchers: 6
