Technical SEO for AI

A guide to optimizing your technical infrastructure for AI assistant discoverability and citability.

Technical SEO for AI helps AI assistants such as ChatGPT, Claude, and Perplexity discover, access, and accurately cite your content. This guide covers the technical requirements and best practices.

Core Requirements

Before optimizing for AI, ensure these fundamental requirements are met:

1. Server-Side Rendering (SSR)

Why it matters: Most AI assistant fetchers do not execute JavaScript. Your content must be in the initial HTML response.

Check if you have SSR:

curl -s https://yoursite.com | grep "your content text"

If you can find your content in the raw HTML, you have SSR. If not, see Is Your Site Visible to AI Assistants?
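The same check can be scripted. The sketch below is a minimal, offline version: it takes HTML you have already downloaded and reports whether a phrase appears in the text outside <script> and <style> blocks, which approximates what a client that does not run JavaScript would see. The function name content_in_raw_html is illustrative, not part of any tool.

```python
import re
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_in_raw_html(html: str, phrase: str) -> bool:
    """Return True if phrase occurs in the non-script text of html (case-insensitive)."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so line breaks in the markup do not hide a match.
    text = re.sub(r"\s+", " ", " ".join(parser.chunks))
    return phrase.lower() in text.lower()
```

To run it against a live page, pass in the decoded body returned by urllib.request.urlopen. A phrase that only appears inside a script payload counts as not visible.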

Solutions by framework:

  • React: Next.js, Remix, or React Server Components
  • Vue: Nuxt.js
  • Angular: Angular SSR (formerly Angular Universal)
  • Svelte: SvelteKit

2. Allow AI Crawlers

Why it matters: Most AI crawlers honor robots.txt. If you block them, they can't access your content.

Check your robots.txt:

curl https://yoursite.com/robots.txt

Allow these user agents:

  • ChatGPT-User (ChatGPT)
  • Claude-User (Claude)
  • Perplexity-User (Perplexity)
  • Google-Extended (Gemini; a robots.txt control token rather than a separate crawler)

See robots.txt Configuration for details.
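As a rough way to automate this check, the sketch below parses robots.txt text with Python's standard urllib.robotparser and lists which of the user agents above would be blocked from a given URL. The helper name blocked_agents is an assumption for illustration. Python's parser applies the first matching rule within a group, so its answer may differ from a particular crawler's own parser in edge cases.

```python
from urllib.robotparser import RobotFileParser

# User agents taken from the list above.
AI_USER_AGENTS = ("ChatGPT-User", "Claude-User", "Perplexity-User", "Google-Extended")


def blocked_agents(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> list[str]:
    """Return the agents that robots_txt does not allow to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

An empty list means none of the listed agents is blocked for that URL. Check a few representative pages, not only the homepage.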

3. Fast Load Times

Why it matters: AI crawlers have timeout limits. Slow pages may be abandoned before content loads.

Target metrics:

  • Time to First Byte (TTFB): < 600ms
  • First Contentful Paint (FCP): < 1.8s
  • Largest Contentful Paint (LCP): < 2.5s
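If you collect these metrics from your own monitoring, a small budget check makes regressions easy to spot. This is a minimal sketch: the key names (ttfb_ms, fcp_ms, lcp_ms) and the function over_budget are illustrative, and the limits are simply the targets listed above expressed in milliseconds.

```python
# Targets from the list above, in milliseconds. A metric passes only if it is below its target.
TARGETS = {"ttfb_ms": 600, "fcp_ms": 1800, "lcp_ms": 2500}


def over_budget(measured: dict) -> dict:
    """Return {metric: (measured_value, target)} for each metric at or above its target."""
    return {
        name: (measured[name], limit)
        for name, limit in TARGETS.items()
        if name in measured and measured[name] >= limit
    }
```

For example, over_budget({"ttfb_ms": 420, "fcp_ms": 2100}) reports only fcp_ms, because 2100 is above the 1800 ms target.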

Quick wins:

  • Enable compression (gzip/brotli)
  • Optimize images (WebP, proper sizing)
  • Minimize render-blocking resources
  • Use CDN for static assets

4. Clean HTML Structure

Why it matters: AI assistants parse HTML to understand content hierarchy and meaning.

Best practices:

  • Use semantic HTML5 elements (<article>, <section>, <header>)
  • Proper heading hierarchy (h1 → h2 → h3)
  • Descriptive link text (not "click here")
  • Alt text for all images
  • Clear content structure

Structured Data Implementation

Structured data (schema.org markup) helps AI assistants understand your content context.

Priority: Organization Schema

Add to your homepage to define your brand:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Company Name",
  "url": "https://yourcompany.com",
  "logo": "https://yourcompany.com/logo.png",
  "description": "Clear one-sentence description of what you do",
  "foundingDate": "2020-01-01",
  "sameAs": [
    "https://twitter.com/yourcompany",
    "https://linkedin.com/company/yourcompany"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "telephone": "+1-555-555-5555",
    "contactType": "Customer Service"
  }
}
</script>

What to include:

  • Exact brand name (as you want AI to use it)
  • Primary domain URL
  • High-quality logo (Google's logo guidelines call for at least 112x112px)
  • One-sentence description of your category
  • Social media profiles
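If your pages are rendered from templates, generating the markup from one source of truth helps keep the brand name and description consistent. The sketch below builds the Organization script tag from plain values. The function name organization_jsonld is hypothetical, and it covers only the fields listed above.

```python
import json


def organization_jsonld(name, url, logo, description, same_as=()):
    """Return a <script type="application/ld+json"> tag describing an Organization."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "description": description,
    }
    if same_as:
        data["sameAs"] = list(same_as)
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape for "/", so parsers still read the original text.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Render the returned string into the homepage <head> or <body> on the server, so it is present in the initial HTML response.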

Priority: Product/Service Schema

Add to product or service pages:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Product Name",
  "applicationCategory": "BusinessApplication",
  "description": "Clear description of what this product does",
  "offers": {
    "@type": "Offer",
    "price": "29.00",
    "priceCurrency": "USD",
    "priceValidUntil": "2025-12-31",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "reviewCount": "127"
  },
  "featureList": [
    "Key feature 1",
    "Key feature 2",
    "Key feature 3"
  ]
}
</script>

What to include:

  • Product/service name
  • Clear category classification
  • Current pricing (if public)
  • Key features list
  • Ratings/reviews if available

Priority: FAQ Schema

Add to pages with Q&A content:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is [your product]?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Clear, concise answer in 50-70 words."
      }
    },
    {
      "@type": "Question",
      "name": "How much does it cost?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Pricing information in plain language."
      }
    }
  ]
}
</script>

Best practices:

  • 50-70 word answers
  • Educational tone (not promotional)
  • Plain text (no HTML in answer text)
  • Natural question phrasing

Other Useful Schema Types

Article Schema (for blog posts):

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article Title",
  "author": {
    "@type": "Person",
    "name": "Author Name"
  },
  "datePublished": "2025-01-15",
  "dateModified": "2025-01-20",
  "description": "Article summary"
}

Breadcrumb Schema (for navigation):

{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://example.com"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Products",
      "item": "https://example.com/products"
    }
  ]
}

Local Business Schema (for location-based businesses):

{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Business Name",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main St",
    "addressLocality": "City",
    "addressRegion": "ST",
    "postalCode": "12345"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": "40.7128",
    "longitude": "-74.0060"
  },
  "openingHours": "Mo-Fr 09:00-17:00"
}

Content Formatting for AI

How you format content affects AI comprehension and citability.

Use Semantic HTML

AI assistants understand HTML semantics:

<!-- Good: Semantic structure -->
<article>
  <header>
    <h1>Main Topic</h1>
  </header>
  <section>
    <h2>Subtopic</h2>
    <p>Content organized logically...</p>
  </section>
</article>
 
<!-- Bad: Generic divs -->
<div class="article">
  <div class="title">Main Topic</div>
  <div class="content">Content...</div>
</div>

Heading Hierarchy

Use proper heading levels:

<!-- Good: Logical hierarchy -->
<h1>Product Overview</h1>
  <h2>Key Features</h2>
    <h3>Feature 1</h3>
    <h3>Feature 2</h3>
  <h2>Pricing</h2>
    <h3>Starter Plan</h3>
    <h3>Pro Plan</h3>
 
<!-- Bad: Skipping levels -->
<h1>Product Overview</h1>
  <h4>Key Features</h4>
  <h2>Pricing</h2>
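Skipped levels are straightforward to detect automatically. This sketch uses Python's standard html.parser to collect heading levels in document order and reports every place where the level jumps down by more than one step. The names HeadingAudit and skipped_levels are illustrative.

```python
import re
from html.parser import HTMLParser


class HeadingAudit(HTMLParser):
    """Records the level of each h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def skipped_levels(html: str) -> list[tuple[int, int]]:
    """Return (previous_level, level) pairs where a heading skips a level.

    A previous_level of 0 means the document did not start with an h1.
    """
    audit = HeadingAudit()
    audit.feed(html)
    audit.close()
    problems = []
    previous = 0
    for level in audit.levels:
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems
```

Moving back up the hierarchy (for example from h3 to h2) is not flagged, since that is how a new section normally starts.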

Lists and Tables

Use structured elements for organized information:

<!-- Features list -->
<ul>
  <li>Feature with clear description</li>
  <li>Another feature explained simply</li>
</ul>
 
<!-- Pricing table -->
<table>
  <thead>
    <tr>
      <th>Plan</th>
      <th>Price</th>
      <th>Features</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Starter</td>
      <td>$29/mo</td>
      <td>Basic features</td>
    </tr>
  </tbody>
</table>

Descriptive Link Text

Use meaningful link text:

<!-- Good: Descriptive -->
<a href="/pricing">View pricing plans and features</a>
 
<!-- Bad: Generic -->
<a href="/pricing">Click here</a>

Image Alt Text

Provide context for images:

<!-- Good: Descriptive alt text -->
<img src="dashboard.png" 
     alt="Product dashboard showing analytics overview with graphs and metrics" />
 
<!-- Bad: Generic or missing -->
<img src="dashboard.png" alt="screenshot" />
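To find images that need attention across a page, a short audit script can list every img element whose alt text is missing or generic. This is a sketch under assumptions: the GENERIC_ALT word list is illustrative and should be adapted, and an intentionally empty alt on a purely decorative image will also be flagged for review.

```python
from html.parser import HTMLParser

# Illustrative list of alt values that describe nothing; extend as needed.
GENERIC_ALT = {"image", "photo", "picture", "screenshot", "graphic"}


class AltAudit(HTMLParser):
    """Collects the src of each <img> whose alt text is missing or generic."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = (attributes.get("alt") or "").strip()
        if not alt or alt.lower() in GENERIC_ALT:
            self.flagged.append(attributes.get("src") or "")


def images_needing_alt(html: str) -> list[str]:
    """Return the src values of images that need better alt text."""
    audit = AltAudit()
    audit.feed(html)
    audit.close()
    return audit.flagged
```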

Content Writing for AI Citability

Clear, Concise Language

AI assistants prefer straightforward content:

Good:

"Spyglasses analyzes how AI assistants like ChatGPT discover and cite your content. Our reports show your brand consistency score, competitive positioning, and technical accessibility."

Avoid:

"Spyglasses is a revolutionary, cutting-edge solution that empowers businesses to leverage next-generation AI-powered insights for optimizing their digital presence in the emerging landscape of artificial intelligence."

Answer Questions Directly

Structure content around common questions:

<section>
  <h2>What is Spyglasses?</h2>
  <p>Spyglasses is an AI visibility analytics platform. We show you how 
     AI assistants like ChatGPT, Claude, and Perplexity discover, understand, 
     and cite your content.</p>
</section>
 
<section>
  <h2>How does it work?</h2>
  <p>We crawl your website, query major AI platforms with relevant questions, 
     and analyze how accurately AI assistants represent your brand.</p>
</section>

Specific Over General

Provide concrete information:

Good:

"Plans start at $49/month for up to 3 properties. Enterprise plans available for larger teams."

Avoid:

"Flexible pricing to fit your needs. Contact us to learn more."

Update Dates and Version Info

Help AI understand content freshness:

<article>
  <time datetime="2025-01-24">Last updated: January 24, 2025</time>
  <p>Current pricing as of January 2025...</p>
</article>

Technical Best Practices

robots.txt Configuration

Allow AI crawlers while protecting sensitive areas:

# Allow AI assistants
User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

# Protect admin areas
User-agent: *
Disallow: /admin/
Disallow: /api/private/
Allow: /

See robots.txt Configuration for the complete guide.

Sitemap Optimization

Help AI crawlers discover your content:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-01-24</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/product</loc>
    <lastmod>2025-01-24</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>

Best practices:

  • Include all important pages
  • Keep lastmod dates current
  • Set appropriate priorities
  • Update when content changes
  • Submit to search engines
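To keep lastmod dates honest, you can periodically scan the sitemap for entries that have not been touched in a while. The sketch below parses sitemap XML with the standard library and returns URLs whose lastmod is missing or older than a threshold. The function name stale_urls and the 180-day default are assumptions, and it expects lastmod values that begin with a full YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace used by the sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 180) -> list[str]:
    """Return <loc> values whose <lastmod> is missing or older than max_age_days."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not lastmod:
            stale.append(loc)
            continue
        # Keep only the date part so full timestamps are accepted as well.
        modified = date.fromisoformat(lastmod[:10])
        if (today - modified).days > max_age_days:
            stale.append(loc)
    return stale
```

Pass date.today() for the today argument in normal use. A stale entry is a prompt to review the page, not proof that its content is outdated.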

Canonical URLs

Prevent duplicate content issues:

<link rel="canonical" href="https://example.com/original-page" />

Use canonical tags when:

  • Same content exists on multiple URLs
  • You have pagination
  • You have parameter-based URLs
  • You syndicate content
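A quick way to verify canonical tags is to extract them from the served HTML and compare them with the URL you expect. The sketch below is a minimal version of that check. The names CanonicalFinder and canonical_status, and the four status strings, are illustrative, and the comparison is an exact string match.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects href values from <link rel="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def canonical_status(html: str, expected: str) -> str:
    """Return "missing", "conflicting", "mismatch", or "ok" for a page's canonical tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    if not finder.hrefs:
        return "missing"
    if len(set(finder.hrefs)) > 1:
        return "conflicting"
    return "ok" if finder.hrefs[0] == expected else "mismatch"
```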

Meta Tags

While less important than structured data, good meta tags help:

<head>
  <title>Concise Page Title | Brand Name</title>
  <meta name="description" content="Clear 150-160 character summary of page content" />
  <meta name="author" content="Company Name" />
  
  <!-- Open Graph for social sharing -->
  <meta property="og:title" content="Page Title" />
  <meta property="og:description" content="Page summary" />
  <meta property="og:image" content="https://example.com/image.png" />
  <meta property="og:type" content="website" />
</head>

Content Security and Access

Ensure AI assistants can access content:

Don't:

  • Hide content behind login walls (for public pages)
  • Use aggressive anti-bot measures
  • Require JavaScript for content visibility
  • Use CAPTCHA to protect entire public pages
  • Rate-limit too aggressively

Do:

  • Use authentication only for truly private content
  • Allow reasonable crawler access
  • Implement server-side rendering
  • Trust verified bots (user-agent strings can be spoofed, so verify against the operator's published IP ranges where available)
  • Monitor crawler activity
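Monitoring crawler activity can start with a simple count of AI user agents in your access logs. This sketch scans log lines for the user-agent tokens named earlier in this guide. The function name count_ai_hits is illustrative. Google-Extended is left out because it is a robots.txt token and does not appear as a request user agent. Since user-agent strings can be spoofed, treat the counts as an indication only.

```python
from collections import Counter

# Tokens expected to appear inside the User-Agent field of a log line.
AI_AGENT_TOKENS = ("ChatGPT-User", "Claude-User", "Perplexity-User")


def count_ai_hits(log_lines) -> Counter:
    """Count log lines per AI user-agent token (case-insensitive substring match)."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_AGENT_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # Count each line at most once.
    return counts
```

The function accepts any iterable of strings, so an open log file can be passed directly.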

Performance Optimization

Image Optimization

Optimize images without sacrificing AI understanding:

<!-- Use modern formats with fallbacks -->
<picture>
  <source srcset="image.webp" type="image/webp" />
  <source srcset="image.jpg" type="image/jpeg" />
  <img src="image.jpg" 
       alt="Descriptive alt text for AI understanding" 
       loading="lazy"
       width="800"
       height="600" />
</picture>

Best practices:

  • WebP format for modern browsers
  • Proper dimensions (no oversized images)
  • Lazy loading for below-the-fold content
  • Descriptive alt text (critical for AI)
  • Compress without losing clarity

Font Loading

Optimize fonts to improve load times:

<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&display=swap" rel="stylesheet" />

Or use system fonts:

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
}

Minimize JavaScript

Reduce JavaScript that blocks content:

Good:

  • Defer non-critical JS
  • Async load third-party scripts
  • Code splitting
  • Server-side rendering

Avoid:

  • Blocking scripts in <head>
  • Large JavaScript bundles
  • Unnecessary third-party scripts
  • Client-only rendering

Monitoring and Testing

Test AI Accessibility

Verify AI assistants can access your content:

# Test with curl using a ChatGPT-User user agent (approximates a fetch triggered from a ChatGPT chat)
curl -A "ChatGPT-User" https://yoursite.com
 
# Test page load time
curl -o /dev/null -s -w "Total time: %{time_total}s\n" https://yoursite.com
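The two curl checks can be combined into one script that records status, size, and elapsed time for each user agent. The sketch below uses only the standard library. The function names build_bot_request and fetch_as_bot are illustrative, and sending a bare token such as "ChatGPT-User" only approximates the full user-agent string a real fetcher sends.

```python
import time
import urllib.error
import urllib.request


def build_bot_request(url: str, user_agent: str = "ChatGPT-User") -> urllib.request.Request:
    """Build a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_bot(url: str, user_agent: str = "ChatGPT-User", timeout: float = 10.0) -> dict:
    """Fetch url and return its HTTP status, body size in bytes, and elapsed seconds."""
    request = build_bot_request(url, user_agent)
    started = time.perf_counter()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status, body = response.status, response.read()
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses (for example a 403 challenge page) raise HTTPError.
        status, body = error.code, error.read()
    return {
        "status": status,
        "bytes": len(body),
        "seconds": time.perf_counter() - started,
    }
```

A 403 status or a very small body for a bot user agent, alongside a normal response for a browser user agent, suggests that a firewall or bot-protection rule is in the way.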

Validate Structured Data

Use these validation tools:

  1. Rich Results Test (Google)
  2. Schema Markup Validator (schema.org)

Regular Audits

Schedule quarterly reviews:

  • Verify AI crawlers can access key pages
  • Test structured data implementation
  • Check brand consistency across pages
  • Measure page load times
  • Review robots.txt configuration
  • Update stale content
  • Fix broken links
  • Validate schema markup

Common Issues and Fixes

Issue: Content Not Accessible

Symptoms:

  • AI assistants don't cite your pages
  • curl shows empty or minimal HTML

Fixes:

  1. Implement server-side rendering
  2. Remove JavaScript dependency for core content
  3. Pre-render static pages

See Is Your Site Visible?

Issue: Cloudflare Blocking

Symptoms:

  • curl returns a challenge page
  • AI crawlers blocked in logs

Fixes:

  1. Whitelist verified bots in Cloudflare
  2. Adjust Bot Fight Mode settings
  3. Create custom rules for AI crawlers

See Cloudflare Configuration

Issue: robots.txt Blocking

Symptoms:

  • robots.txt disallows AI bots
  • Zero AI crawler traffic

Fixes:

  1. Allow specific AI bot user agents
  2. Remove wildcard disallow rules
  3. Keep only necessary restrictions

See robots.txt Guide

Issue: Slow Load Times

Symptoms:

  • AI assistants time out
  • Partial content in responses

Fixes:

  1. Optimize images and assets
  2. Enable compression
  3. Use CDN for static files
  4. Minimize render-blocking resources

Issue: Inconsistent Brand Info

Symptoms:

  • AI describes your brand incorrectly
  • Outdated information in AI responses

Fixes:

  1. Audit all pages for consistency
  2. Update product descriptions
  3. Ensure pricing is current
  4. Add/update Organization schema

Next Steps

1. Run an AI Visibility Report

Get your free AI Visibility Report to see:

  • Technical accessibility issues
  • Brand consistency score
  • Structured data coverage
  • Competitive positioning

2. Implement Priority Fixes

Start with highest-impact changes:

  1. Fix technical blockers (SSR, robots.txt, Cloudflare)
  2. Add Organization schema to homepage
  3. Add Product/Service schema to key pages
  4. Optimize page load times

3. Content Optimization

Improve AI understanding:

  1. Audit brand consistency
  2. Add FAQ schema
  3. Improve content structure
  4. Update stale information

4. Monitor and Iterate

Track progress over time:

  1. Set up crawler monitoring
  2. Test key queries monthly
  3. Run quarterly AI visibility reports
  4. Measure traffic from AI referrers

Additional Resources