# AI Search Citation Optimizer Methodology

## What the Citation Optimizer Measures

When someone asks ChatGPT a research question, the model does not answer from memory. It runs a search, ranks the results, reads the top pages, and assembles an answer from the passages it trusts. Each of those steps is a decision: search or not, this route or that one, keep this page or drop it, lift this passage or skip it.

The AI Search Citation Optimizer models that path as a sequence of **gates**. A gate is a single pass, warn, or fail decision the pipeline makes between a user's question and a final citation. Because each gate maps to a known or approximable algorithm, we can run a piece of content through every gate and show you exactly where it would survive and where it would fall out.

The result is a stoplight: green where your content clears a gate, yellow where it is at risk, red where it fails. Every warn and fail comes with a specific, actionable recommendation. This is the difference between a tool that tells you where you stand and one that tells you what to do.
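
As a rough illustration of that model, the sketch below represents one gate outcome and maps it to a stoplight color. The names `Status`, `GateResult`, `stoplight`, and `missing_recommendations` are hypothetical and chosen only for this example; they are not the optimizer's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Status(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"


# Stoplight colors as described in the text: green, yellow, red.
STOPLIGHT = {Status.PASS: "green", Status.WARN: "yellow", Status.FAIL: "red"}


@dataclass
class GateResult:
    """Hypothetical record for one gate decision."""
    gate: int
    name: str
    status: Status
    recommendation: str = ""  # expected to be filled on every warn and fail


def stoplight(results: List[GateResult]) -> List[Tuple[int, str]]:
    """Return (gate number, color) pairs in the order the gates ran."""
    return [(r.gate, STOPLIGHT[r.status]) for r in results]


def missing_recommendations(results: List[GateResult]) -> List[int]:
    """List gates that warned or failed without an attached recommendation."""
    return [
        r.gate
        for r in results
        if r.status is not Status.PASS and not r.recommendation
    ]
```

A check like `missing_recommendations` is one way to enforce the rule that every warn and fail carries a recommendation.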

This first release models the **ChatGPT** pipeline for B2B-style research queries. Other pipelines are on the way; see [Other Pipelines](#other-pipelines-on-the-roadmap) below.

---

## Why a Pipeline, Not a Prompt Tracker

Most AI visibility tools run prompts against a model and report what came back. That tells you your current standing, but it is built on a moving target. AI responses are nondeterministic; they vary by session, by user, and by personalization. Running more prompts reduces the noise but never removes it, and it never tells you what to change.

The retrieval pipeline is different. For research queries, ChatGPT uses retrieval-augmented generation (RAG): it searches the web, ranks results, reads pages, and scores passages for relevance. Unlike the final wording of an answer, these retrieval steps follow repeatable algorithms that are either published or can be closely approximated. They are measurable, and you can optimize for them. That is what the Citation Optimizer does.

---

## The Eight Gates

A run starts from a **fan-out** — a real search query the platform's own pipeline generated, captured from your AI Visibility report — and a page (your live URL, a draft, or an earned-media fragment). The gates run in order. The first three describe the query the run is scored against; the last five score your content.
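
The ordering can be sketched as a simple loop. This is a minimal sketch, assuming each gate is a function that returns `"pass"`, `"warn"`, or `"fail"`, and that only a gate marked hard stops the run when it fails. The `Gate` tuple and `run_gates` are hypothetical names for illustration.

```python
from typing import Callable, List, Tuple

# (gate number, is_hard, check). The check receives the query and the page
# text and returns "pass", "warn", or "fail". This shape is an assumption.
Gate = Tuple[int, bool, Callable[[str, str], str]]


def run_gates(gates: List[Gate], query: str, content: str) -> List[Tuple[int, str]]:
    """Run gates in order; stop only when a hard gate fails."""
    results = []
    for number, is_hard, check in gates:
        status = check(query, content)
        results.append((number, status))
        if is_hard and status == "fail":
            # Nothing left to score, for example when the page is unreachable.
            break
    return results
```

Non-blocking gates record their warning and let the run continue, which is why a page can still receive content recommendations after an early warn.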

### Gate 1 — Does the query trigger a search?

Some questions ChatGPT answers straight from training data, with no search and therefore no citation opportunity. This gate confirms the query is one that triggers retrieval. For a tracked fan-out, the answer is already known: the query exists *because* the platform searched for it. For a free-text query you type in yourself, a lightweight model judges search-likelihood. This gate warns; it never blocks.

### Gate 2 — Does the query route to the research pipeline?

ChatGPT runs different pipelines for different intents. A shopping query, a news query, and a B2B research query are handled differently. This release models the research-text pipeline, so this gate checks the query routes there. For a tracked fan-out the route is confirmed; for free-text, a lightweight model classifies the intent. This gate warns rather than blocks, so you still get content guidance even on an off-route query.

### Gate 3 — Fan-out confirmation

This gate is informational. It shows the real, observed fan-out the run scores against, where it came from, and the competitor pages already ranking for it. We deliberately do **not** invent queries or generate semantic variants. A run is only as trustworthy as the query behind it, so we use the platform's own observed fan-outs verbatim. A free-text query is always labeled clearly as unverified.

### Gate 4 — Initial candidate filter (the SERP)

Before ChatGPT can read your page, your page has to be in the pool of results its search returns. This gate fetches the organic search results for the fan-out and checks whether your domain ranks in the top 30. A gap here is not a dead end; it is usually your most important finding. So this gate is honest but non-blocking: if you do not rank yet, it shows you the pages that do rank (the model to study) and your closest existing page (the starting point to improve), and the pipeline keeps scoring your content so you still get a full set of recommendations.
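
The top-30 check itself is simple to sketch. The example below assumes the organic results are already available as an ordered list of URLs; `serp_gate`, `host_of`, and the returned field names are hypothetical, and reporting a ranking gap as `"warn"` is an assumption based on the gate being non-blocking.

```python
from typing import Dict, List
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Lowercased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def serp_gate(result_urls: List[str], your_domain: str, depth: int = 30) -> Dict:
    """Check whether your domain (or a subdomain) appears in the top results."""
    target = host_of("https://" + your_domain)
    top = result_urls[:depth]
    rank = None
    for position, url in enumerate(top, start=1):
        host = host_of(url)
        if host == target or host.endswith("." + target):
            rank = position
            break
    return {
        # A gap is reported but never blocks the rest of the run.
        "status": "pass" if rank is not None else "warn",
        "rank": rank,
        # On a gap, the ranking pages are the ones to study.
        "pages_to_study": [] if rank is not None else top,
    }
```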

### Gate 5 — Fetch and chunk

ChatGPT reads pages in small passages, not whole documents. This gate fetches your page, converts it to clean text, and splits it into token windows to check that coherent, citable passages exist. Pages that are unreachable, blocked, or too thin to chunk fail here, because there is nothing for the model to read. This is a hard gate.
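
To make the chunking step concrete, here is a minimal sketch that packs whole sentences into windows of roughly 200 tokens. It makes two simplifying assumptions: whitespace-separated words stand in for real tokenizer tokens, and a regular expression stands in for proper sentence and structural boundary detection. `chunk_text` is a hypothetical name.

```python
import re
from typing import List


def chunk_text(text: str, target_tokens: int = 200) -> List[str]:
    """Split text into windows of about target_tokens, never mid-sentence."""
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())  # word count as a stand-in for tokens
        if current and count + n > target_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A page that yields no chunks at all would be the "too thin to chunk" case. A single sentence longer than the target is kept whole here rather than cut, which is a design choice of this sketch.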

### Gate 6 — Semantic embedding and neural rerank

Retrieval does not match on keywords; it matches on meaning. This gate embeds your best passage and the query into the same vector space and measures their similarity, then runs an **open-source neural reranker that approximates the cross-encoder ChatGPT uses** to make a sharper, side-by-side relevance judgment. When you have a ranking gap, your passage is reranked against the competitor pages that *do* rank, so the comparison is real. Very low relevance fails here.
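
The embedding half of this gate comes down to a similarity measure between two vectors. The sketch below uses cosine similarity, a common choice that we assume here for illustration, on plain Python lists. It does not include an embedding model or the reranker; `cosine_similarity` and `best_passage` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def best_passage(query_vec: Sequence[float],
                 passage_vecs: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, similarity) of the passage closest to the query."""
    scores = [cosine_similarity(query_vec, v) for v in passage_vecs]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]
```

In a real run the vectors would come from an embedding model, and the highest-scoring passages would then go to the reranker for the side-by-side judgment.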

### Gate 7 — Deep-read audition

The pages that survive reranking are auditioned more closely. This gate combines passage relevance, publisher authority (drawn from the same data behind the [AI Placement Value Score](/docs/methodology/ai-placement-value-score)), and content quality into a composite judgment of whether your page earns a place among the finalists.
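
One simple way to form such a composite is a weighted average. The sketch below is illustrative only: the weights, the 0-to-1 input scale, and the pass and warn cutoffs are all assumptions made for this example, not the optimizer's calibrated values.

```python
from typing import Tuple


def audition_score(relevance: float, authority: float, quality: float,
                   weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted average of three signals, each expected in the range 0 to 1."""
    for value in (relevance, authority, quality):
        if not 0.0 <= value <= 1.0:
            raise ValueError("signals must be between 0 and 1")
    w_rel, w_auth, w_qual = weights  # hypothetical weights
    total = w_rel + w_auth + w_qual
    return (w_rel * relevance + w_auth * authority + w_qual * quality) / total


def audition_status(score: float, pass_at: float = 0.7, warn_at: float = 0.5) -> str:
    """Map a composite score to pass, warn, or fail using assumed cutoffs."""
    if score >= pass_at:
        return "pass"
    if score >= warn_at:
        return "warn"
    return "fail"
```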

### Gate 8 — Synthesis readiness

Finally, the model assembles its answer from the winning passages. This gate checks whether your passage is ready to be lifted: is it self-contained enough to stand on its own, does it answer the question first instead of burying the lede, and is it dense with the relevant entities? Self-containedness is judged on the core passage; answer-first and entity density are judged on the wider window the model actually reads.
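
Two of these checks can be approximated with very small heuristics. The sketch below is a simplification under stated assumptions: a passage counts as self-contained if it does not open with a word that leans on earlier context, and entity density is the number of entity mentions per word. The opener list and both function names are hypothetical, and the optimizer's own checks may work differently.

```python
import re
from typing import Iterable

# Openers that usually depend on a previous sentence (illustrative list).
DANGLING_OPENERS = {"it", "this", "that", "these", "those", "they",
                    "however", "also"}


def is_self_contained(passage: str) -> bool:
    """False when the passage is empty or opens with a context-dependent word."""
    match = re.search(r"[A-Za-z']+", passage)
    if not match:
        return False
    return match.group(0).lower() not in DANGLING_OPENERS


def entity_density(passage: str, entities: Iterable[str]) -> float:
    """Entity mentions per whitespace-separated word (case-insensitive)."""
    words = passage.split()
    if not words:
        return 0.0
    text = passage.lower()
    mentions = sum(text.count(entity.lower()) for entity in entities)
    return mentions / len(words)
```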

---

## What's Research-Backed, and Where We Approximate

The Citation Optimizer is built on published research into how AI citation pipelines behave, and on forensic analysis of real ChatGPT retrieval traffic. We are explicit about which parts are grounded and which are estimates.

**Grounded in research:**

- **The gate sequence itself.** The search → route → retrieve → rerank → read → synthesize path is documented across primary sources and reproduced in network-log analysis of live retrievals.
- **The chunk model.** We split content into **~200-token windows snapped to sentence and structural boundaries**, the target used by OpenAI's documented retrieval tooling, and we audition each passage together with its neighbors (the same sentence-window approach used in the [AI Placement Quality Score](/docs/methodology/ai-placement-quality-score)). We deliberately do not strip boilerplate before chunking, because the real pipeline does not either.
- **Position and answer-first bias.** Heavily cited content front-loads its answer; passages buried late in a page are cited far less often. The synthesis gate reflects this.

**Where we approximate:**

- **The reranker.** ChatGPT's exact cross-encoder is proprietary. We use an open-source neural reranker that approximates it. The relative ordering it produces is the signal to rely on; the absolute scores are our model's, not ChatGPT's.
- **The query gates.** For free-text queries we use a lightweight model to judge search-likelihood and intent. These are estimates and warn rather than block. Tracked fan-outs skip the guessing entirely.
- **Thresholds.** The exact cutoffs any platform uses are unknown. Ours are calibrated against observed behavior and will move as the pipelines evolve. Every run records the model version it was scored under, so old and new scores stay interpretable.

---

## From Score to Rewrite

Scoring tells you where content falls out. The optimizer then closes the loop.

For any warn or fail, the optimizer produces a recommendation grounded in the content template for your page type (product, homepage, informational, press release, or general). You can apply those recommendations yourself, or have the optimizer generate a **rewrite**: a revised draft that front-loads the answer, tightens passages, and restructures for citation, along with a revised title, meta description, and page-appropriate structured data (JSON-LD). The rewrite never invents facts, statistics, or credentials; it works with what your content already supports.

Every rewrite includes unlimited re-scoring, so you can revise and re-check until the page is ready. The optimizer tracks improvement across iterations and tells you when the content is publish-ready or has plateaued, so you know when to stop.
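
A stop rule of this kind can be sketched in a few lines. The example assumes each iteration produces a single score between 0 and 1; the readiness threshold, the minimum gain, and the comparison window are hypothetical values, as is the `readiness` function name.

```python
from typing import List


def readiness(scores: List[float], ready_at: float = 0.8,
              min_gain: float = 0.02, window: int = 2) -> str:
    """Classify a run history as publish-ready, plateaued, or keep-revising."""
    if not scores:
        raise ValueError("at least one score is required")
    latest = scores[-1]
    if latest >= ready_at:
        return "publish-ready"
    # Plateau: too little improvement over the last `window` iterations.
    if len(scores) > window and latest - scores[-1 - window] < min_gain:
        return "plateaued"
    return "keep-revising"
```

For instance, `readiness([0.5, 0.6, 0.605, 0.61])` reports a plateau, because the last two iterations added only about 0.01 to the score.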

---

## Driving the Loop from an AI Assistant (MCP)

The whole score → rewrite → re-score loop is available to a connected AI assistant through the **Spyglasses MCP server**, so you can optimize content in a conversation instead of clicking through the dashboard.

The server lives at `https://www.spyglasses.io/api/mcp` and authenticates with your Spyglasses account over OAuth. Its citation tools form a loop the assistant runs on your behalf:

- **`list_tracked_fanouts`** — the entry point: the real fan-outs your property is tracked for, highest-impact first, each flagged as a ranking gap or not.
- **`match_pages_for_fanout`**, **`list_property_pages`**, **`list_placements`** / **`get_placement`** — pick the page, draft, or earned placement to optimize.
- **`score_citation_pipeline`** — score it; returns a run id to poll.
- **`get_pipeline_run`** — read the gate results, recommendations, and a **readiness verdict** that tells the assistant whether to revise again or stop.
- **`revise_content`** / **`get_revision`** — generate and read a rewrite.
- **`rescore_revision`** — re-score the revised content, closing the loop.

Scoring and revising run as background jobs, so the assistant fires the work and polls for results. Every tool is scoped to your own properties, and nothing is ever published on your behalf. For setup, see [Chat with your reports](/docs/ai-visibility-guides/chat-with-your-reports).
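
The fire-and-poll loop might look like the sketch below from the assistant's side. The tool names come from the list above, but everything else is an assumption made for illustration: the `call_tool` callable stands in for whatever MCP client you use, and the argument names (`fanout_id`, `page_id`, `run_id`, `revision_id`) and response fields (`status`, `verdict`) are hypothetical, not the server's documented schema.

```python
import time
from typing import Any, Callable, Dict

# Stand-in for an MCP client call: (tool name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def poll(call_tool: ToolCall, tool: str, args: Dict[str, Any],
         interval: float = 2.0, max_polls: int = 30,
         sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call a read tool until the background job reports a final status."""
    for _ in range(max_polls):
        result = call_tool(tool, args)
        if result.get("status") in ("complete", "failed"):  # assumed values
            return result
        sleep(interval)
    raise TimeoutError(f"{tool} did not finish after {max_polls} polls")


def optimize(call_tool: ToolCall, fanout_id: str, page_id: str,
             max_rounds: int = 3,
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Score, then revise and re-score until the verdict says to stop."""
    job = call_tool("score_citation_pipeline",
                    {"fanout_id": fanout_id, "page_id": page_id})
    run = poll(call_tool, "get_pipeline_run", {"run_id": job["run_id"]}, sleep=sleep)
    for _ in range(max_rounds):
        if run.get("verdict") != "revise":  # hypothetical verdict value
            break
        revision_job = call_tool("revise_content", {"run_id": run["run_id"]})
        revision = poll(call_tool, "get_revision",
                        {"revision_id": revision_job["revision_id"]}, sleep=sleep)
        job = call_tool("rescore_revision",
                        {"revision_id": revision["revision_id"]})
        run = poll(call_tool, "get_pipeline_run", {"run_id": job["run_id"]}, sleep=sleep)
    return run
```

Passing `call_tool` and `sleep` in as parameters keeps the loop independent of any particular client library and makes it easy to exercise with a stub.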

---

## Other Pipelines on the Roadmap

This release models the ChatGPT research pipeline because it is the best documented and its gates are the most reusable. The same gate framework extends to other assistants, and the following pipelines are planned for **Q3 2026**:

- **Google AI Overviews**
- **ChatGPT Business, Pro, and Deep Research**
- **Claude**
- **Google AI Mode and Gemini**

Each will reuse the gate-scoring approach; many gates share services with the ChatGPT pipeline. Until a pipeline is modeled, the optimizer scores against ChatGPT only and labels other platforms as not yet scoreable.

---

## What This Tool Is — and What It's Not

**What it is:**

- A **research-backed model** of a real retrieval pipeline, scored gate by gate, with a specific recommendation for every weakness.
- **Honest about gaps.** A page that does not rank yet still gets a full set of content recommendations and a view of the pages it has to beat.
- **Reproducible.** Every run records the query it scored, where the query came from, and the model version it ran under.

**What it's not:**

- **A guarantee of citation.** Clearing every gate means the structural factors are in place. It does not guarantee any platform will cite your page for any specific query on any given day; AI answers remain nondeterministic.
- **A keyword generator.** The optimizer scores against real, observed fan-outs, not invented queries. A free-text query is always allowed, but always labeled unverified.
- **A static metric.** As the pipelines change, the gates, thresholds, and reranker will change with them. The model version on every run keeps historical scores interpretable.
