About SpyglassesBot

Technical documentation for the Spyglasses web crawler — what it does, how to identify it, and how to control access.

Overview

SpyglassesBot is the web crawler operated by Spyglasses, an AI visibility analytics platform. It fetches publicly available web content to help site owners understand how their pages appear to AI assistants like ChatGPT, Claude, and Perplexity.

SpyglassesBot respects robots.txt directives and is designed to be a well-behaved crawler. It supports Cloudflare Web Bot Auth for cryptographic request verification.

What SpyglassesBot Does

SpyglassesBot accesses web pages on behalf of Spyglasses users for the following purposes:

AI Visibility Reports

Crawls a sample of pages from a site to assess how visible the content is to AI assistants. This includes checking whether pages return meaningful content (vs. JavaScript-only rendering), whether Cloudflare or other bot protection is blocking access, and how well the content is structured for AI consumption.

AI Access Checking

Fetches a site's robots.txt to determine which AI crawlers are allowed or blocked. Also checks whether Cloudflare bot protection is preventing access to the site entirely.

Sitemap Import

Fetches and parses sitemaps (typically at /sitemap.xml or as referenced in robots.txt) to discover pages that belong to a property registered with Spyglasses.

AI Readiness Audits

Crawls pages to evaluate technical SEO factors that affect AI visibility — content structure, metadata, schema markup, and accessibility to headless fetchers.

FAQ Generation

Fetches page content to generate FAQ suggestions based on the site's existing content and AI assistant behavior.

Brand Consistency Monitoring

Fetches pages from competitor sites that the user has specified, to monitor how brands are represented and cited across the web.

Citation Resolution

Resolves redirect chains in URLs cited by AI assistants to determine the final destination and verify that citations point to the correct content.

How to Identify SpyglassesBot

SpyglassesBot identifies itself via the User-Agent HTTP header. All user-agent strings follow a consistent format:

Mozilla/5.0 (compatible; SpyglassesBot/1.0; +https://spyglasses.io/docs/help/spyglasses-bot) <Purpose>

The specific user agents are:

User Agent                                             Purpose
...SpyglassesBot/1.0;...) AI Visibility Report         AI Visibility Reports
...SpyglassesBot/1.0;...) AI Access Checker            robots.txt and AI access analysis
...SpyglassesBot/1.0;...) Sitemap Fetcher              Sitemap discovery and import
...SpyglassesBot/1.0;...) AI Readiness Audit           AI Readiness Audits
...SpyglassesBot/1.0;...) FAQ Generation               FAQ generation
...SpyglassesBot/1.0;...) Brand Consistency Checker    Brand monitoring
...SpyglassesBot/1.0;...) Citation Resolver            Citation redirect resolution

All SpyglassesBot user agents can be matched with a single substring check for SpyglassesBot. The robots.txt token is SpyglassesBot.
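
A server-side check for that substring might look like the following minimal sketch. The helper names (is_spyglasses_bot, parse_purpose) and the regular expression are illustrative assumptions inferred from the format shown above, not part of any Spyglasses API. Because any client can set the User-Agent header, a match is best treated as a hint rather than proof of identity.

```python
import re
from typing import Optional

# robots.txt token and substring named in this documentation.
BOT_TOKEN = "SpyglassesBot"

# Assumed pattern: the purpose is whatever follows the closing parenthesis
# of the "(compatible; SpyglassesBot/<version>; +<url>)" group.
_UA_PATTERN = re.compile(
    r"SpyglassesBot/(?P<version>[\d.]+);[^)]*\)\s*(?P<purpose>.*)$"
)


def is_spyglasses_bot(user_agent: str) -> bool:
    """Return True if the header value contains the SpyglassesBot token."""
    return BOT_TOKEN in user_agent


def parse_purpose(user_agent: str) -> Optional[str]:
    """Return the trailing purpose label, or None if the format does not match."""
    match = _UA_PATTERN.search(user_agent)
    if match is None:
        return None
    return match.group("purpose").strip() or None


if __name__ == "__main__":
    ua = (
        "Mozilla/5.0 (compatible; SpyglassesBot/1.0; "
        "+https://spyglasses.io/docs/help/spyglasses-bot) Sitemap Fetcher"
    )
    print(is_spyglasses_bot(ua), parse_purpose(ua))
```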

Crawling Behavior

  • Respects robots.txt: SpyglassesBot honors all User-Agent: * and User-Agent: SpyglassesBot directives in robots.txt, including Disallow, Allow, and Crawl-delay.
  • On-demand only: SpyglassesBot does not continuously crawl the web. It only fetches pages when a Spyglasses user requests an analysis of their own site or a site they are monitoring.
  • Rate limited: Requests are rate-limited and use timeouts to avoid overloading target servers.
  • No indexing: SpyglassesBot does not build a search index. Content is fetched, analyzed, and the analysis results are stored — the raw HTML is not retained.

Controlling Access

Allow SpyglassesBot

No special configuration is needed if your robots.txt allows all crawlers (User-Agent: * with no relevant Disallow rules).

Block SpyglassesBot

Add the following to your robots.txt:

User-agent: SpyglassesBot
Disallow: /
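
To confirm that a robots.txt file has the intended effect on the SpyglassesBot token, the rules can be evaluated locally before deploying them. This is a minimal sketch using Python's standard urllib.robotparser module. The helper name spyglasses_access and the example.com URLs are illustrative assumptions, and the standard-library parser may not match SpyglassesBot's own rule matching in every edge case.

```python
from typing import Optional, Tuple
from urllib.robotparser import RobotFileParser

# robots.txt token named in this documentation.
BOT_TOKEN = "SpyglassesBot"


def spyglasses_access(robots_txt: str, url: str) -> Tuple[bool, Optional[int]]:
    """Evaluate robots.txt text for the SpyglassesBot token.

    Returns (allowed, crawl_delay). crawl_delay is None when the
    applicable group sets no Crawl-delay.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(BOT_TOKEN, url)
    delay = parser.crawl_delay(BOT_TOKEN)
    return allowed, delay


if __name__ == "__main__":
    block_rule = "User-agent: SpyglassesBot\nDisallow: /\n"
    print(spyglasses_access(block_rule, "https://example.com/page"))
```

Passing the bare token rather than the full user-agent string matters here: urllib.robotparser compares group names against the text before the first "/" in the agent it is given, so a full "Mozilla/5.0 ..." string would only match the wildcard group.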

Allow SpyglassesBot through Cloudflare

If your site uses Cloudflare bot protection, SpyglassesBot supports Web Bot Auth — a cryptographic verification protocol that proves requests genuinely come from Spyglasses. This means Cloudflare can verify our identity without relying on user-agent strings alone.

Our public signing key is hosted at:

https://spyglasses.io/.well-known/http-message-signatures-directory

If SpyglassesBot is registered as a Cloudflare verified bot, requests will be automatically allowed through Cloudflare's bot protection without any configuration on your part.

To manually allow SpyglassesBot in Cloudflare:

  1. Go to Security → WAF → Custom Rules
  2. Create a rule matching User Agent contains "SpyglassesBot"
  3. Set the action to Skip → All remaining custom rules

For more details, see Cloudflare Is Blocking Access to Your Site.

Verification

SpyglassesBot requests originate from Spyglasses infrastructure. You can verify the bot's identity through:

  1. User-Agent header: All requests include a user-agent string containing SpyglassesBot.
  2. Web Bot Auth signatures: When communicating with Cloudflare-protected sites, requests include Signature, Signature-Input, and Signature-Agent headers per RFC 9421.

Our Web Bot Auth key directory is at:

https://spyglasses.io/.well-known/http-message-signatures-directory
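
As a first-pass filter before any cryptographic check, an origin server could confirm that the three signature headers are present and that Signature-Agent points at spyglasses.io. The sketch below is only that pre-check: it does not verify the signature, which requires fetching the key directory and validating the request per RFC 9421. The function name has_web_bot_auth_headers, and the assumption that Signature-Agent carries a quoted https URL on the spyglasses.io host, are illustrative and not confirmed by this documentation.

```python
from typing import Mapping
from urllib.parse import urlsplit

# Header names listed in this documentation.
REQUIRED_HEADERS = ("Signature", "Signature-Input", "Signature-Agent")

# Assumption: Signature-Agent references the host serving the key directory.
EXPECTED_AGENT_HOST = "spyglasses.io"


def has_web_bot_auth_headers(headers: Mapping[str, str]) -> bool:
    """Presence check only; this does NOT validate the signature itself."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if any(not lowered.get(name.lower()) for name in REQUIRED_HEADERS):
        return False
    # Assumed format: a URL, possibly wrapped in double quotes.
    agent = lowered["signature-agent"].strip().strip('"')
    return urlsplit(agent).hostname == EXPECTED_AGENT_HOST


if __name__ == "__main__":
    sample = {
        "Signature": "sig1=:placeholder:",
        "Signature-Input": 'sig1=("@authority" "signature-agent")',
        "Signature-Agent": '"https://spyglasses.io"',
    }
    print(has_web_bot_auth_headers(sample))
```

A request that passes this check still needs full signature validation, or validation by a proxy such as Cloudflare, before it can be trusted.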

Contact

If you have questions or concerns about SpyglassesBot, or if you believe it is behaving unexpectedly: