  1. DSpace
  2. DS-3673

Robots/Crawlers: Pull latest botlist from COUNTER github instead of managing our own list

    Details

    • Type: Improvement
    • Status: More Details Needed
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Attachments:
      0
    • Comments:
      3
    • Documentation Status:
      Needed

      Description

      We have been working together with COUNTER to move the management of their botlist to GitHub. The result is available at: https://github.com/atmire/COUNTER-Robots

      We are proposing to get rid of the list that we manage in the DSpace codebase at:

      https://github.com/DSpace/DSpace/blob/master/dspace/config/spiders/agents/example 

      and rely on the COUNTER one instead.

      A few questions that we should decide on together:

      Specific version vs. master of the list

      If we pull the master of the list, we always get the latest changes. However, it can be hard to track exactly which version of the list is in use.

      If we lock down on a specific tag, we have a clear spec of which version of the list we're using, but we risk not having the latest bots.

      I think this should depend on what COUNTER advises. That is not 100% clear at this point, but they will likely recommend a specific version/tag of the list and update that recommendation at a set frequency (annually?).
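
To make the master-versus-tag trade-off concrete, here is a minimal Python sketch (not DSpace code). It assumes the list is published as a plain-text file with one regular expression per line, and that it can be fetched through GitHub's raw-content URL scheme. The file path `generated/COUNTER_Robots_list.txt`, the tag name `v1.0`, and the function names are hypothetical placeholders; only the repository name comes from this issue.

```python
import re

# Repository named in this issue; the file path below is an ASSUMPTION.
RAW_BASE = "https://raw.githubusercontent.com/atmire/COUNTER-Robots"
LIST_PATH = "generated/COUNTER_Robots_list.txt"  # hypothetical file name


def list_url(ref="master"):
    """Build the raw URL for a branch or tag (e.g. "master" or a release tag)."""
    return f"{RAW_BASE}/{ref}/{LIST_PATH}"


def parse_patterns(text):
    """Compile one regex per non-empty line; lines starting with '#' are skipped.

    The one-pattern-per-line format is an assumption for this sketch.
    """
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def is_bot(user_agent, patterns):
    """Return True if any pattern matches somewhere in the user agent string."""
    return any(p.search(user_agent) for p in patterns)
```

With this shape, switching between "track master" and "pin a tag" is a one-value configuration change: `list_url("master")` versus `list_url("v1.0")`. Pinning makes the list version explicit and reproducible; tracking master trades that for freshness.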

      Update at build, at Tomcat restart, or during Tomcat runtime

      If we decide to track master, we could opt to:

      • Load the list at build time
      • Load the list at Tomcat restart
      • Have Tomcat check every X hours or days for updates while it's running
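
The second and third options above can be combined. The following Python sketch is a rough illustration under stated assumptions, not a proposal for the actual DSpace implementation: the class name `RefreshingBotList`, the injected `fetch` callable, and the `max_age_seconds` parameter are all hypothetical. The list is loaded once at startup (the "restart" option) and reloaded lazily once it is older than the configured age (the "every X hours" option). If a refresh fails, the previously loaded list is kept.

```python
import time


class RefreshingBotList:
    """Holds a bot list that is loaded at startup and refreshed when stale."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        # `fetch` is any zero-argument callable returning the current list;
        # it is injected so the download mechanism stays out of this sketch.
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._patterns = fetch()  # initial load; an error here propagates
        self._loaded_at = clock()

    def patterns(self):
        """Return the list, refreshing it first if it is older than max_age."""
        if self._clock() - self._loaded_at >= self._max_age:
            try:
                self._patterns = self._fetch()
            except Exception:
                pass  # refresh failed: keep serving the previous list
            # Reset the timer either way so a failing source is not hit
            # on every call.
            self._loaded_at = self._clock()
        return self._patterns
```

Loading at build time would instead mean fetching the list once in the build and shipping it as a static file, which needs no runtime network access but only picks up new bots on the next build.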

              People

              • Assignee:
                Unassigned
                Reporter:
                Bram Luyten (Atmire)
              • Votes:
                0
                Watchers:
                3