# About SpyglassesBot

> Technical documentation for the Spyglasses web crawler — what it does, how to identify it, and how to control access.

## Overview

**SpyglassesBot** is the web crawler operated by [Spyglasses](https://spyglasses.io), an AI visibility analytics platform. It fetches publicly available web content to help site owners understand how their pages appear to AI assistants like ChatGPT, Claude, and Perplexity.

SpyglassesBot respects `robots.txt` directives and is designed to be a well-behaved crawler. It supports [Cloudflare Web Bot Auth](https://developers.cloudflare.com/bots/reference/bot-verification/web-bot-auth/) for cryptographic request verification.

<a href="https://webbotauth.net/check?domain=www.spyglasses.io" title="Verify www.spyglasses.io on WebBotAuth.net" target="_blank" rel="noopener"><img src="https://webbotauth.net/badge/www.spyglasses.io.svg" alt="Web Bot Auth: Grade A, verified by WebBotAuth.net" height="28" /></a>

## What SpyglassesBot Does

SpyglassesBot accesses web pages on behalf of Spyglasses users for the following purposes:

### AI Visibility Reports

Crawls a sample of pages from a site to assess how visible the content is to AI assistants. This includes checking whether pages return meaningful content (vs. JavaScript-only rendering), whether Cloudflare or other bot protection is blocking access, and how well the content is structured for AI consumption.

### AI Access Checking

Fetches a site's `robots.txt` to determine which AI crawlers are allowed or blocked. Also checks whether Cloudflare bot protection is preventing access to the site entirely.

### Sitemap Import

Fetches and parses sitemaps (typically at `/sitemap.xml` or as referenced in `robots.txt`) to discover pages that belong to a property registered with Spyglasses.
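As an illustration of the sitemap parsing described above, the following minimal sketch extracts page URLs from a sitemap document using only the Python standard library. The function name `extract_sitemap_urls` and the sample URLs are hypothetical. This is not Spyglasses' actual implementation.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by the sitemaps.org protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(sitemap_xml: bytes) -> List[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(sitemap_xml)
    # ".//sm:loc" matches <loc> under both <urlset> and <sitemapindex>
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text and loc.text.strip()
    ]


# Hypothetical sitemap used only for demonstration
EXAMPLE_SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>
"""

urls = extract_sitemap_urls(EXAMPLE_SITEMAP)
print(urls)  # ['https://example.com/', 'https://example.com/pricing']
```

A sitemap index returns the URLs of child sitemaps rather than pages, so a complete importer would fetch and parse those in turn.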

### AI Readiness Audits

Crawls pages to evaluate technical SEO factors that affect AI visibility — content structure, metadata, schema markup, and accessibility to headless fetchers.

### FAQ Generation

Fetches page content to generate FAQ suggestions based on the site's existing content and AI assistant behavior.

### Brand Consistency Monitoring

Fetches pages from competitor sites (that the user has specified) to monitor how brands are being represented and cited across the web.

### Citation Resolution

Resolves redirect chains in URLs cited by AI assistants to determine the final destination and verify that citations point to the correct content.
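To make the idea of resolving a redirect chain concrete, here is a minimal sketch under stated assumptions. The `fetch` callable, the `max_hops` limit, and the function name are all hypothetical; the sketch shows the general technique (follow `Location` headers hop by hop, with loop detection) rather than how SpyglassesBot is implemented.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

# A fetcher returns (status code, Location header or None) for one URL,
# without following redirects itself. Supplied by the caller.
Fetcher = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def resolve_redirect_chain(url: str, fetch: Fetcher, max_hops: int = 10) -> List[str]:
    """Return every URL visited, ending with the final destination."""
    chain = [url]
    seen = {url}
    while True:
        status, location = fetch(chain[-1])
        if status not in REDIRECT_STATUSES or not location:
            return chain  # final destination reached
        if len(chain) > max_hops:
            raise ValueError(f"more than {max_hops} redirects starting from {url}")
        next_url = urljoin(chain[-1], location)  # Location may be relative
        if next_url in seen:
            raise ValueError(f"redirect loop at {next_url}")
        chain.append(next_url)
        seen.add(next_url)


# Simulated responses so the example runs without network access
FAKE_RESPONSES = {
    "https://a.example/x": (301, "/y"),
    "https://a.example/y": (302, "https://b.example/final"),
    "https://b.example/final": (200, None),
}

chain = resolve_redirect_chain("https://a.example/x", FAKE_RESPONSES.__getitem__)
print(chain[-1])  # https://b.example/final
```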

## How to Identify SpyglassesBot

SpyglassesBot identifies itself via the `User-Agent` HTTP header. All user-agent strings follow a consistent format:

```
Mozilla/5.0 (compatible; SpyglassesBot/1.0; +https://spyglasses.io/docs/help/spyglasses-bot) <Purpose>
```

The specific user agents are:

| User Agent | Purpose |
|---|---|
| `...SpyglassesBot/1.0;...) AI Visibility Report` | AI Visibility Reports |
| `...SpyglassesBot/1.0;...) AI Access Checker` | robots.txt and AI access analysis |
| `...SpyglassesBot/1.0;...) Sitemap Fetcher` | Sitemap discovery and import |
| `...SpyglassesBot/1.0;...) AI Readiness Audit` | AI Readiness Audits |
| `...SpyglassesBot/1.0;...) FAQ Generation` | FAQ generation |
| `...SpyglassesBot/1.0;...) Brand Consistency Checker` | Brand monitoring |
| `...SpyglassesBot/1.0;...) Citation Resolver` | Citation redirect resolution |

All SpyglassesBot user agents can be matched with a single substring check for **`SpyglassesBot`**. The `robots.txt` token is **`SpyglassesBot`**.
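For log analysis, the substring check and the documented format can be combined into a small helper. The sketch below is illustrative: the function name `parse_spyglasses_user_agent` and the regular expression are assumptions derived from the format shown above, not an official parser.

```python
import re
from typing import Dict, Optional

# Pattern derived from the documented format:
# Mozilla/5.0 (compatible; SpyglassesBot/<version>; +<url>) <Purpose>
_SPYGLASSES_UA = re.compile(
    r"SpyglassesBot/(?P<version>[\d.]+);[^)]*\)\s*(?P<purpose>.*)$"
)


def parse_spyglasses_user_agent(user_agent: str) -> Optional[Dict[str, Optional[str]]]:
    """Return version and purpose if the header names SpyglassesBot, else None."""
    if "SpyglassesBot" not in user_agent:
        return None
    match = _SPYGLASSES_UA.search(user_agent)
    if match is None:
        # Token present but format unexpected: still treat it as the bot
        return {"version": None, "purpose": None}
    purpose = match.group("purpose").strip()
    return {"version": match.group("version"), "purpose": purpose or None}


ua = (
    "Mozilla/5.0 (compatible; SpyglassesBot/1.0; "
    "+https://spyglasses.io/docs/help/spyglasses-bot) Sitemap Fetcher"
)
print(parse_spyglasses_user_agent(ua))
# {'version': '1.0', 'purpose': 'Sitemap Fetcher'}
```

Because any client can send an arbitrary `User-Agent` header, a match here is a hint rather than proof of identity.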

## Crawling Behavior

- **Respects robots.txt**: SpyglassesBot honors all `User-Agent: *` and `User-Agent: SpyglassesBot` directives in `robots.txt`, including `Disallow`, `Allow`, and `Crawl-delay`.
- **On-demand only**: SpyglassesBot does not continuously crawl the web. It only fetches pages when a Spyglasses user requests an analysis of their own site or a site they are monitoring.
- **Rate limited**: Requests are rate-limited and use timeouts to avoid overloading target servers.
- **No indexing**: SpyglassesBot does not build a search index. Content is fetched and analyzed, and only the analysis results are stored; the raw HTML is not retained.

## Controlling Access

### Allow SpyglassesBot

No special configuration is needed if your `robots.txt` allows all crawlers (`User-Agent: *` with no relevant `Disallow` rules).

### Block SpyglassesBot

Add the following to your `robots.txt`:

```
User-agent: SpyglassesBot
Disallow: /
```
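Before deploying a rule, you can check how a standards-following parser reads it. The sketch below uses Python's standard-library `urllib.robotparser` against a hypothetical `robots.txt` that blocks only one directory and sets a crawl delay. It shows how that library interprets the rules and is not a guarantee of how any particular crawler behaves in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one directory for SpyglassesBot
# and request a 5-second delay between requests
EXAMPLE_ROBOTS_TXT = """\
User-agent: SpyglassesBot
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow: /admin/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(EXAMPLE_ROBOTS_TXT)

# Pass the robots.txt token, not the full User-Agent header value
private_ok = rules.can_fetch("SpyglassesBot", "https://example.com/private/report.html")
blog_ok = rules.can_fetch("SpyglassesBot", "https://example.com/blog/")
delay = rules.crawl_delay("SpyglassesBot")
print(private_ok, blog_ok, delay)  # False True 5
```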

### Allow SpyglassesBot through Cloudflare

If your site uses Cloudflare bot protection, SpyglassesBot supports **Web Bot Auth** — a cryptographic verification protocol that proves requests genuinely come from Spyglasses. This means Cloudflare can verify our identity without relying on user-agent strings alone.

Our public signing key is hosted at:

```
https://spyglasses.io/.well-known/http-message-signatures-directory
```

If SpyglassesBot is registered as a Cloudflare verified bot, requests will be automatically allowed through Cloudflare's bot protection without any configuration on your part.

To manually allow SpyglassesBot in Cloudflare:

1. Go to **Security** → **WAF** → **Custom Rules**
2. Create a rule matching `User Agent contains "SpyglassesBot"`
3. Set the action to **Skip** → All remaining custom rules

For more details, see [Cloudflare Is Blocking Access to Your Site](/docs/help/cloudflare-is-blocking-access-to-your-site).

## Verification

SpyglassesBot requests originate from Spyglasses infrastructure. You can verify the bot's identity through:

1. **User-Agent header**: All requests include a user-agent string containing `SpyglassesBot`.
2. **Web Bot Auth signatures**: When communicating with Cloudflare-protected sites, requests include the `Signature` and `Signature-Input` headers defined in [RFC 9421](https://www.rfc-editor.org/rfc/rfc9421), along with the `Signature-Agent` header used by Web Bot Auth.

Our Web Bot Auth key directory is at:

```
https://spyglasses.io/.well-known/http-message-signatures-directory
```
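As a first-pass filter before full verification, a server can check that the expected signature headers are present and that `Signature-Agent` points at the Spyglasses domain. The sketch below is an assumption-laden illustration: the function names are hypothetical, the expected domain is inferred from the key directory URL above, and the header values in the example are placeholders. It does **not** verify the cryptographic signature, which requires an RFC 9421 implementation and the keys published in the directory.

```python
import re
from typing import Mapping, Optional
from urllib.parse import urlparse

REQUIRED_HEADERS = ("signature", "signature-input", "signature-agent")
# Assumption: inferred from the key directory URL in this document
EXPECTED_AGENT_DOMAIN = "spyglasses.io"


def signature_agent_host(value: str) -> Optional[str]:
    """Pull the hostname out of the first https URL in a Signature-Agent value."""
    match = re.search(r'https://[^\s";,]+', value)
    return urlparse(match.group(0)).hostname if match else None


def has_spyglasses_signature_headers(headers: Mapping[str, str]) -> bool:
    """Presence and domain check only; no signature verification is performed."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if any(name not in lowered for name in REQUIRED_HEADERS):
        return False
    host = signature_agent_host(lowered["signature-agent"])
    if host is None:
        return False
    return host == EXPECTED_AGENT_DOMAIN or host.endswith("." + EXPECTED_AGENT_DOMAIN)


# Placeholder header values for demonstration only
example_headers = {
    "Signature": "sig1=:placeholder:",
    "Signature-Input": 'sig1=("@authority" "signature-agent");keyid="placeholder"',
    "Signature-Agent": '"https://spyglasses.io"',
}
print(has_spyglasses_signature_headers(example_headers))  # True
```

A request that passes this check should still have its signature verified against the published key directory before being trusted.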

We also received an A grade from [WebBotAuth.net](https://webbotauth.net), a neutral site that evaluates bots for compliance with the Web Bot Auth specification.

## Contact

If you have questions or concerns about SpyglassesBot, or if you believe it is behaving unexpectedly:

- **Email**: [support@spyglasses.io](mailto:support@spyglasses.io)
- **Website**: [https://spyglasses.io](https://spyglasses.io)
- **Documentation**: [https://spyglasses.io/docs](https://spyglasses.io/docs)
