365+  Bots-Spiders-Scrapers

#1
stryder
Since some people think it's alright to scrape, I'm just going to point out that there are actually conditions here for those who intend to use Bots, Spiders and/or Scrapers.

Any Bots, Spiders or Scrapers that want to legitimately access the site are required to identify themselves by their USER_AGENT. This is so they can be placed into the Spider/bot group and be seen as accessing the site.
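For anyone unsure what identifying by USER_AGENT looks like in practice, here is a minimal sketch in Python using only the standard library. The bot name and info URL are made-up placeholders, not anything this site requires; the point is just that the header names the bot honestly instead of copying a browser string.

```python
import urllib.request

# Hypothetical bot name and contact page; substitute your own details.
USER_AGENT = "ExampleBot/1.0 (+https://example.com/bot-info)"


def build_request(url: str) -> urllib.request.Request:
    """Return a request that announces the bot in its User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

Passing the returned object to `urllib.request.urlopen` sends the header with the request.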

There are subtle differences between how legitimate and non-legitimate bots are treated. Legitimate bots get a cut-down version of the site's Theme, which removes some links that bots otherwise sometimes get lost in loops over. (No point following a bunch of self-inceptive navigation links that go nowhere.)

Bots that use a USER_AGENT string to identify themselves can be found listed HERE!

Any Bots, Spiders or Scrapers that attempt to camouflage themselves as Human users will likely find themselves Throttled or Banned outright.
Any Bots, Spiders or Scrapers that don't pay attention to the site's robots.txt and/or attempt to pull more than 10 pages a minute will likely be Throttled or Banned outright.
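As a rough illustration of the two rules above, the sketch below checks a URL against robots.txt rules and spaces requests at least 6 seconds apart, which works out to about 10 pages a minute. It is only one way to do it. The names `Throttle` and `allowed` are invented for this example, and the robots.txt lines are passed in directly so the snippet runs without touching the network.

```python
import time
import urllib.robotparser

MAX_PAGES_PER_MINUTE = 10                    # limit stated in this post
MIN_INTERVAL = 60.0 / MAX_PAGES_PER_MINUTE   # 6 seconds between requests


class Throttle:
    """Hypothetical helper: blocks until MIN_INTERVAL has passed since the last call."""

    def __init__(self, min_interval=MIN_INTERVAL, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock    # injectable so the behaviour can be tested without waiting
        self.sleep = sleep
        self.last = None      # time of the previous request, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now


def allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines permit user_agent to fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)
```

A crawler loop would call `allowed(...)` before each fetch and `Throttle().wait()` between fetches. In real use the robots.txt lines would come from the site's own file rather than a hard-coded list.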



