Nutch

Last updated about 4 hours ago.

Non-Compliant

What is Nutch?

About

Nutch is a highly extensible, highly scalable, mature, production-ready Web crawler that enables fine-grained configuration and accommodates a wide variety of data acquisition tasks.

Operator

Apache Software Foundation

Documentation

nutch.apache.org

Did you find Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/605.1.16 (KHTML, like Gecko; compatible; Friendly_Crawler/2.0) Chrome/120.0.6099.217 Safari/605.1.15/Nutch-1.20-SNAPSHOT in your logs?

If you've seen Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/605.1.16 (KHTML, like Gecko; compatible; Friendly_Crawler/2.0) Chrome/120.0.6099.217 Safari/605.1.15/Nutch-1.20-SNAPSHOT in your website logs, it indicates that Nutch has been visiting your site. This agent string is one of the known identifiers for this bot.

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/605.1.16 (KHTML, like Gecko; compatible; Friendly_Crawler/2.0) Chrome/120.0.6099.217 Safari/605.1.15/Nutch-1.20-SNAPSHOT

Track and manage Nutch visits to your website with Spyglasses' real-time bot detection. Start tracking

Did you find NutchCVS/0.7.1 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org) in your logs?

If you've seen NutchCVS/0.7.1 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org) in your website logs, it indicates that Nutch has been visiting your site. This agent string is one of the known identifiers for this bot.

NutchCVS/0.7.1 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)

Did you find istellabot-nutch/Nutch-1.10 in your logs?

If you've seen istellabot-nutch/Nutch-1.10 in your website logs, it indicates that Nutch has been visiting your site. This agent string is one of the known identifiers for this bot.

istellabot-nutch/Nutch-1.10
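
All three agent strings listed above contain the token Nutch, in varying case and position. As a rough sketch, assuming you have already extracted the User-Agent field from each log line, a case-insensitive substring match can flag candidate Nutch visits. The function name `find_nutch_agents` is hypothetical, and the match is only a heuristic: anyone running Nutch can change the agent string, and an unrelated client could include the same word.

```python
import re

# Heuristic pattern (an assumption, not an official identifier list):
# every agent string shown above contains "nutch" in some letter case.
NUTCH_PATTERN = re.compile(r"nutch", re.IGNORECASE)


def find_nutch_agents(user_agents):
    """Return the user-agent strings that mention Nutch, in input order."""
    return [ua for ua in user_agents if NUTCH_PATTERN.search(ua)]


# Sample input: two of the agent strings from this page plus an ordinary
# browser string that should not match.
sample_agents = [
    "NutchCVS/0.7.1 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-agent@lucene.apache.org)",
    "istellabot-nutch/Nutch-1.10",
    "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/121.0",
]

for agent in find_nutch_agents(sample_agents):
    print(agent)
```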

Expected Behavior

Nutch is open-source crawler software, so a visit may come from any operator running it, for purposes such as content analysis, data collection, or other automated tasks. Its behavior depends on how that operator has configured it.

Should I Block Nutch?

This bot is marked as non-compliant, which may mean it doesn't respect robots.txt or engages in aggressive crawling behavior. You may want to consider blocking it if it's causing issues for your site.

Recommended Solution

Instead of manually managing robots.txt rules, use Spyglasses to automatically detect and manage Nutch traffic with real-time analytics and flexible blocking rules.

How Do I Block Nutch?

You can block this bot or limit its access by setting user agent token rules in your website's robots.txt file. Use Spyglasses analytics to check whether it's actually following your rules.

User Agent Tokens

Nutch: Should match instances of this bot

robots.txt

# robots.txt
# This should block Nutch

User-agent: Nutch
Disallow: /
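
Before deploying the rules, it can help to confirm that they parse the way you expect. The sketch below uses Python's standard `urllib.robotparser` to check what the rules above say about a given agent token. The helper name `is_blocked` is hypothetical. It evaluates only the rules themselves and says nothing about whether a particular crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from this page, inlined so the check runs offline.
ROBOTS_TXT = """\
# robots.txt
# This should block Nutch

User-agent: Nutch
Disallow: /
"""


def is_blocked(robots_txt: str, agent_token: str, path: str = "/") -> bool:
    """Return True if the given rules disallow `path` for `agent_token`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent_token, path)


print(is_blocked(ROBOTS_TXT, "Nutch"))      # token named in the rules
print(is_blocked(ROBOTS_TXT, "Googlebot"))  # token not named in the rules
```

Python's parser compares the rule against the part of the agent string before the first `/`, so pass the short token rather than a full browser-style string when testing.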

Instead of doing this manually, use Spyglasses to keep your rules updated automatically with the latest AI agents and crawlers.

Manage Nutch Traffic with Spyglasses

Get real-time alerts when bots visit your site, automatically generate robots.txt rules, and integrate bot traffic data with your existing analytics tools.
