ArchiveBot

Last updated 1 hour ago.

Non-Compliant Crawler

What is ArchiveBot?

About

ArchiveBot is an IRC bot designed to automate the archival of smaller websites (e.g. up to a few hundred thousand URLs). You give it a URL to start at, and it grabs all content under that URL, records it in a WARC file, and then uploads that WARC to ArchiveTeam servers for eventual injection into the Internet Archive's Wayback Machine (or other archive sites). NOTE: This bot is NOT run by the Internet Archive! Learn more: https://github.com/ArchiveTeam/ArchiveBot

See how often ArchiveBot visits your website by setting up Spyglasses analytics.

Did you find ArchiveTeam ArchiveBot/20170106.02 (wpull 2.0.2) in your logs?

If ArchiveTeam ArchiveBot/20170106.02 (wpull 2.0.2) appears in your website logs, ArchiveBot has visited your site. This user agent string is one of the known identifiers for the bot.

ArchiveTeam ArchiveBot/20170106.02 (wpull 2.0.2)
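
To check logs for this identifier, something like the following could work. It is a minimal sketch, assuming access logs in the common "combined" format, where the user agent is the last double-quoted field; the function names and the sample log lines are hypothetical, and the case-insensitive match is an assumption rather than documented bot behavior.

```python
import re
from collections import Counter

# Token taken from this page; matching case-insensitively is an assumption.
ARCHIVEBOT_TOKEN = re.compile(r"ArchiveBot", re.IGNORECASE)

# In the combined log format the user agent is the last double-quoted field.
QUOTED_FIELD = re.compile(r'"([^"]*)"')


def extract_user_agent(log_line):
    """Return the last double-quoted field of a log line, or None if absent."""
    fields = QUOTED_FIELD.findall(log_line)
    return fields[-1] if fields else None


def count_archivebot_hits(log_lines):
    """Count requests per distinct user agent string containing the token."""
    hits = Counter()
    for line in log_lines:
        user_agent = extract_user_agent(line)
        if user_agent and ARCHIVEBOT_TOKEN.search(user_agent):
            hits[user_agent] += 1
    return hits


# Hypothetical sample lines; real logs would be read from a file instead.
sample_log = [
    '203.0.113.7 - - [06/Jan/2017:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "ArchiveTeam ArchiveBot/20170106.02 (wpull 2.0.2)"',
    '198.51.100.4 - - [06/Jan/2017:10:00:01 +0000] "GET /about HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0"',
]

for user_agent, count in count_archivebot_hits(sample_log).items():
    print(f"{count} request(s) from: {user_agent}")
```

Grouping by the full user agent string rather than by the token alone keeps different ArchiveBot or wpull versions visible as separate entries.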

Track and manage ArchiveBot visits to your website with Spyglasses' real-time bot detection.

Expected Behavior

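Web crawlers typically visit websites on a regular schedule to index content for search engines or other services, follow a consistent crawling pattern, and respect robots.txt directives. ArchiveBot may differ on both counts: as described above, a crawl starts when someone gives the bot a URL, so visits need not follow a regular schedule, and the bot is marked as non-compliant.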

Should I Block ArchiveBot?

This bot is marked as non-compliant, which may mean it doesn't respect robots.txt or engages in aggressive crawling behavior. You may want to consider blocking it if it's causing issues for your site.
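
If the bot does cause problems and ignores robots.txt, one option is to refuse its requests on the server. The sketch below is a hedged illustration, not a recommended production setup: it assumes a Python WSGI application, and `block_archivebot`, `hello_app`, and `status_for` are hypothetical names introduced for this example. It matches the `ArchiveBot` token anywhere in the User-Agent header, case-insensitively, which is an assumption.

```python
def block_archivebot(app, token="ArchiveBot"):
    """Wrap a WSGI app so requests whose User-Agent contains `token` get a 403."""
    needle = token.lower()

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if needle in user_agent.lower():
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Any other client is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware


def hello_app(environ, start_response):
    """Hypothetical stand-in for the real site."""
    body = b"Hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def status_for(app, user_agent):
    """Call a WSGI app with a minimal environ and return its status line."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/", "HTTP_USER_AGENT": user_agent}
    app(environ, start_response)
    return captured["status"]


protected = block_archivebot(hello_app)
print(status_for(protected, "ArchiveTeam ArchiveBot/20170106.02 (wpull 2.0.2)"))
print(status_for(protected, "Mozilla/5.0"))
```

A user agent header can be changed by the client, so a check like this only stops requests that identify themselves with the token.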

Recommended Solution

Instead of manually managing robots.txt rules, use Spyglasses to automatically detect and manage ArchiveBot traffic with real-time analytics and flexible blocking rules.

How Do I Block ArchiveBot?

You can block this bot or limit its access by adding rules for its user agent token to your website's robots.txt file. Because the bot is marked as non-compliant, it may not honor those rules, so use Spyglasses analytics to check whether it's actually following them.

User Agent Tokens

ArchiveBot: should match instances of this bot

robots.txt

# robots.txt
# This should block ArchiveBot

User-agent: ArchiveBot
Disallow: /
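
Before deploying a rule like this, it can help to check how a standards-following parser reads it. The sketch below uses Python's standard `urllib.robotparser`; the `is_allowed` helper and the `example.com` URL are hypothetical. It shows only how Python's parser interprets the rule, not whether ArchiveBot itself obeys it.

```python
from urllib.robotparser import RobotFileParser

# The rule from this page, as it would appear in robots.txt.
ROBOTS_TXT = """\
# robots.txt
# This should block ArchiveBot

User-agent: ArchiveBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


full_user_agent = "ArchiveTeam ArchiveBot/20170106.02 (wpull 2.0.2)"
page = "https://example.com/page"  # placeholder URL

print("ArchiveBot token allowed:", is_allowed(ROBOTS_TXT, "ArchiveBot", page))
print("Full user agent allowed:", is_allowed(ROBOTS_TXT, full_user_agent, page))
print("Other crawler allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Under this parser the rule disallows both the bare token and the full user agent string, while crawlers that no rule names stay allowed.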

Instead of doing this manually, use Spyglasses to keep your rules updated automatically with the latest AI agents and crawlers.

Manage ArchiveBot Traffic with Spyglasses

Get real-time alerts when bots visit your site, automatically generate robots.txt rules, and integrate bot traffic data with your existing analytics tools.