  1. VIVO
  2. VIVO-15

Trap for characters that cause search indexing to abort

    Details

    • Attachments:
      1
    • Comments:
      7

      Description

      UF had trouble completing a full reindex with VIVO 1.5.1 and eventually traced the failure to a form feed character (U+000C) that had been pasted into VIVO in 2011 from a PDF. The error can no longer be triggered by pasting into an editing form, but the bad data can be reintroduced by uploading an N-Triples file, and it still causes the indexing error.

      This error may well predate 1.5: while still running 1.4.1, UF had noticed for some time that merged organizations were not being removed from the search index, so the index may not have been updating successfully even then.

      The bad data has been removed, but to avoid this problem in the future it would be helpful to trap characters that break the indexing, whether the underlying bug is in VIVO, in Jena, or in Solr. If VIVO could at least catch the exception and skip the offending record rather than abort the whole indexing process, that would be a big improvement.
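
      The two requests above (trap the bad characters, and skip a failing record instead of aborting) could be sketched roughly as follows. This is only an illustration, not VIVO's actual indexing code: the class IndexTextSanitizer, the RecordIndexer interface, and the indexAll method are hypothetical names invented here. The ticket does not say which layer rejects the form feed, so the sketch assumes that filtering out code points that are not legal in XML 1.0 (form feed, U+000C, is one of them) would be a reasonable trap.

```java
/** Hypothetical helper for illustration; not part of VIVO, Jena, or Solr. */
final class IndexTextSanitizer {

    private IndexTextSanitizer() {}

    /** Returns true if the code point is a legal XML 1.0 character. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every code point that is not legal in XML 1.0 with a space. */
    static String sanitize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isXml10Char(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(' ');
            }
        });
        return out.toString();
    }

    /** Assumed stand-in for whatever sends a single record to the search index. */
    interface RecordIndexer {
        void index(String uri, String text) throws Exception;
    }

    /**
     * Indexes every record, sanitizing its text first. A record that still
     * fails is skipped rather than aborting the run; the URIs of the failed
     * records are returned so they can be logged and inspected later.
     */
    static java.util.List<String> indexAll(java.util.Map<String, String> records,
                                           RecordIndexer indexer) {
        java.util.List<String> failed = new java.util.ArrayList<>();
        for (java.util.Map.Entry<String, String> entry : records.entrySet()) {
            try {
                indexer.index(entry.getKey(), sanitize(entry.getValue()));
            } catch (Exception e) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }
}
```

      Replacing the character with a space instead of deleting it keeps words that were separated only by the control character from running together. Returning the failed URIs instead of silently swallowing the exception leaves a trail for cleaning up the underlying data.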

            People

            • Assignee:
              bdc34 Brian Caruso
              Reporter:
              jc55 Jon Corson-Rikert
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                1w 1d
                Remaining:
                1w 1h
                Logged:
                7h