AI Placement Quality Score Methodology

What the AI Placement Quality Score Measures

The AI Placement Quality Score (PQS) answers a simple question: given a specific brand mention on a specific page, how likely is an AI assistant to surface it as a citation?

Where the AI Placement Value Score (AIPVS) rates a publisher's overall value for a placement — treating the whole domain as a single unit — the PQS zooms in on a single placement on a single page. A Tier 1 publisher is only valuable if your mention is actually positioned well within the article. A passing mention at the bottom of a page delivers a fraction of the citation value of a dedicated lead paragraph.

The PQS is expressed on a 0–100 scale. When it's combined with a publisher's AIPVS, the resulting Total Placement Value gives you a single number that answers both "is this domain worth it?" and "is this placement well-positioned?" at the same time.


How AI Systems Actually Pick Citations

Before getting into the scoring formula, it helps to understand what AI citation pipelines actually do with a page of content. The PQS is calibrated against three well-documented structural biases in how large language models extract citations.

Document position matters — a lot

A study by Kevin Indig of 1.2 million ChatGPT answers and 18,012 verified citations found that 44.2% of citations come from the first 30% of a document, 31.1% from the middle third, and just 24.7% from the final third. Independent analysis of the open-source GPT-OSS-20B architecture has shown why: the first ~50 tokens of a page determine which expert processing pathway the content takes, and this routing decision is effectively locked in by the fifth attention layer. Content in the first 10% of a page sits in what's informally called the "peak citation zone" — attention sink effects, early token routing, and summarization heuristics all concentrate here.

The PQS position-score curve follows this distribution directly.

Content is processed in fixed-size chunks

AI systems process documents in fixed-size token windows — typically 128 tokens, roughly 500 characters. When a brand mention is fully contained within a coherent chunk (a chunk that starts and ends at structural boundaries, contains at least one complete sentence, and doesn't straddle paragraph breaks), the mention is easily "lifted" wholesale for citation. When it spans a chunk boundary or sits in the middle of a fragmented block, citation probability drops sharply.

Our chunking is an estimate — we don't know the exact windowing that any given AI platform uses. So for every mention, we evaluate the containing chunk plus the preceding and following chunk (the "halo"). If any part of that halo is coherent, citation is likely.
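The windowing and halo selection can be sketched minimally. The 128-token chunk size comes from the description above; the function name and the clamping of the halo to chunks that actually exist are illustrative assumptions, not the production implementation:

```typescript
// Split a tokenized document into fixed 128-token windows and return the
// halo (previous, containing, next chunk indices) around a mention.
const CHUNK_SIZE = 128;

function haloFor(tokenCount: number, mentionIndex: number): number[] {
  const containing = Math.floor(mentionIndex / CHUNK_SIZE);
  const lastChunk = Math.floor((tokenCount - 1) / CHUNK_SIZE);
  // Clamp the halo to chunks that actually exist in the document.
  return [containing - 1, containing, containing + 1].filter(
    (i) => i >= 0 && i <= lastChunk,
  );
}
```

For a 1,000-token article with a mention at token 130, the mention falls in chunk 1 and the halo is chunks 0, 1, and 2.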

Sentiment around the mention is load-bearing

Research shows that heavily cited content clusters around a subjectivity score of about 0.47 on a 0–1 scale — analyst-style commentary that blends fact with moderate interpretation. Models trained on balanced journalistic prose prefer this register for citations. Sharply positive content and sharply negative content both fall outside the citation sweet spot, though for different reasons.

Negative sentiment is especially consequential: a scathing review on a high-authority site can actively damage a brand's AI representation. The model learns the criticism and repeats it. For this reason, sentiment operates in the PQS as an asymmetric multiplier — a strongly negative mention can collapse the score to near zero regardless of other factors, while a strongly positive mention caps at 1.3×.


The Five Factors

The PQS is a weighted composite of five factors, each normalized to a 0–1 sub-score.

Document Position (30% weight)

Where the brand mention sits in the document, as a fraction of total tokens. Bucket-mapped to the citation-probability curve:

Bucket               Range     Score
First 10%            0–10%     1.00
First 30%            10–30%    0.85
Middle third         30–70%    0.60
Final third          70–90%    0.40
Final 10% / footer   90–100%   0.20

Position is determined by walking the tokenized markdown of the main article content looking for the brand name or any of its aliases, longest alias first. If no mention is found, the placement fails scoring with an explicit error — the PQS is not computed on pages where the brand isn't actually mentioned.
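The bucket mapping and the longest-alias-first match order can be sketched as follows. The bucket boundaries come from the table above; the function names and the treatment of exact boundary values (a fraction of exactly 0.10 falls into the next bucket) are illustrative assumptions:

```typescript
// Map the mention's document position (token index / total tokens)
// to the bucketed citation-probability score.
function positionScore(fraction: number): number {
  if (fraction < 0.10) return 1.0;  // peak citation zone
  if (fraction < 0.30) return 0.85;
  if (fraction < 0.70) return 0.60;
  if (fraction < 0.90) return 0.40;
  return 0.20;                      // final 10% / footer
}

// Aliases are tried longest-first so that e.g. "Acme Analytics"
// matches before the shorter "Acme".
function orderedAliases(aliases: string[]): string[] {
  return [...aliases].sort((a, b) => b.length - a.length);
}
```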

Chunk Containment (20% weight)

A "halo" check around the containing 128-token chunk of the brand mention. For each of the containing, preceding, and following chunks, we evaluate three structural conditions: does it start at a sentence or heading boundary, does it end at a sentence or heading boundary, and does it contain at least one complete sentence? Combined with a check for whether the brand mention straddles the outer chunk boundary:

State                 Score   Condition
Full containment      1.00    Containing chunk is self-contained, at least one neighbor is also self-contained, and the mention is not within 8 tokens of the containing chunk's outer boundary
Partial containment   0.70    Containing chunk is self-contained or partial, but the halo as a whole doesn't fully qualify
Fragmented            0.40    Containing chunk is fragmented, OR the mention sits within 8 tokens of the chunk's outer boundary (straddling the window)

The halo model is deliberately more lenient than requiring a single chunk to be perfect, because real-world AI citation pipelines pull more than one window into their final response.
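The three-state logic above can be sketched like this. `selfContained` stands for a chunk that starts and ends at structural boundaries and holds at least one complete sentence, and `partial` for a chunk meeting some but not all of those conditions; the type and function names are illustrative, not the real API:

```typescript
type Chunk = { selfContained: boolean; partial: boolean };

// Score the containment state of the mention's halo.
// A chunk that is neither selfContained nor partial is fragmented.
function chunkScore(
  containing: Chunk,
  neighbors: Chunk[],
  tokensFromOuterBoundary: number,
): number {
  const straddles = tokensFromOuterBoundary < 8;
  if (straddles || (!containing.selfContained && !containing.partial)) {
    return 0.4; // fragmented, or the mention straddles the window
  }
  if (containing.selfContained && neighbors.some((n) => n.selfContained)) {
    return 1.0; // full containment
  }
  return 0.7; // partial containment
}
```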

Placement Type (20% weight)

How the brand is featured in the article — a dedicated article is treated very differently from a passing mention or a roundup listing. Classified via a Claude Haiku model given the three-chunk window around the mention, the article title, and the article token count.

Placement Type                       Score
Dedicated article or feature story   1.00
Roundup — position #1                0.85
Roundup — top third                  0.75
Roundup — middle third               0.60
Roundup — bottom third               0.45
Passing mention                      0.40
Sidebar / widget / bio box           0.30
Quote only (no link)                 0.25

Editorial Control (15% weight — prospective mode only)

How much control the brand has over how the mention is written. Full authorship scores highest; unpredictable UGC scores lowest.

Editorial Control                 Score
Full authorship                   1.00
Collaborative / approval rights   0.85
Quoted with context               0.70
Mentioned, no input               0.50
Unpredictable / UGC               0.35

Editorial control is not scored in retrospective mode. The decision has already been made — the placement exists, whoever had editorial control exercised it, and the score should reflect what the AI reader sees, not how the placement was produced. The 0.15 weight is redistributed proportionally across the other four signals so retrospective and prospective scores remain on the same 0–100 scale.

Link Attribute (15% weight)

The rel attribute of any link pointing at the brand's domain, detected from the raw HTML. When multiple brand-pointing links exist, the highest-scoring one is used — if the article contains even one dofollow brand link, the brand gets credit for that strongest signal.

Link Attribute           Score
dofollow (standard)      1.00
nofollow                 0.60
sponsored                0.50
ugc (user-generated)     0.40
No link (mention only)   0.30

Sentiment Multiplier (0.0–1.3×)

Sentiment is applied after the weighted sum of the five factors, as a multiplier. A Claude Haiku model analyzes the three-chunk halo around the brand mention (not the whole article — sentiment around the mention is what matters) and returns a polarity score from 0 to 1. The polarity is then mapped to one of five categories:

Category             Polarity Range   Multiplier
Strongly negative    0.00–0.15        0.0× → 0.2× (linear interpolation)
Mildly negative      0.15–0.35        0.5×
Neutral / balanced   0.35–0.55        1.0×
Mildly positive      0.55–0.75        1.15×
Strongly positive    0.75–1.00        1.3×

The asymmetry is deliberate: negative sentiment can cancel an otherwise-strong placement because the downside risk of a damaging mention on a high-authority site (where an AI model might internalize the criticism) is significantly greater than the upside of a glowing recommendation (which largely reinforces what the brand already claims). When the sentiment multiplier is less than 1.0×, the PQS panel displays both the raw weighted score and the sentiment-adjusted score so you can see exactly what the multiplier did.
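The polarity-to-multiplier mapping, including the linear interpolation in the strongly-negative band, can be sketched directly from the table; the function name and exact boundary handling are illustrative assumptions:

```typescript
// Map the 0–1 polarity score to the sentiment multiplier.
// The strongly-negative band interpolates linearly from 0.0× to 0.2×.
function sentimentMultiplier(polarity: number): number {
  if (polarity < 0.15) return (polarity / 0.15) * 0.2;
  if (polarity < 0.35) return 0.5;
  if (polarity < 0.55) return 1.0;
  if (polarity < 0.75) return 1.15;
  return 1.3;
}
```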


Composite Formula

PQS = 100 × (
  w_position × position_score +
  w_chunk    × chunk_score    +
  w_type     × type_score     +
  w_editorial × editorial_score +  // zero in retrospective mode
  w_link     × link_score
) × sentiment_multiplier

Default weights:

  • Prospective mode: position 0.30, chunk 0.20, type 0.20, editorial 0.15, link 0.15
  • Retrospective mode: position 0.353, chunk 0.235, type 0.235, editorial 0.00, link 0.176 (the 0.15 editorial weight redistributed proportionally across the other four)

The effective weights are snapshotted on every scored placement so the score is reproducible even if the defaults change later. The composite score is clamped to 0–100.
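The composite, including the proportional redistribution of the editorial weight in retrospective mode, can be sketched as follows. The weights and clamp come from the text above; the function and type names are illustrative, not the production code:

```typescript
interface SubScores {
  position: number;
  chunk: number;
  type: number;
  editorial?: number; // omitted in retrospective mode
  link: number;
}

const PROSPECTIVE = { position: 0.3, chunk: 0.2, type: 0.2, editorial: 0.15, link: 0.15 };

// In retrospective mode, divide the remaining weights by 0.85 so they
// sum to 1 and the score stays on the same 0–100 scale.
function effectiveWeights(retrospective: boolean) {
  if (!retrospective) return { ...PROSPECTIVE };
  const remaining = 1 - PROSPECTIVE.editorial; // 0.85
  return {
    position: PROSPECTIVE.position / remaining,
    chunk: PROSPECTIVE.chunk / remaining,
    type: PROSPECTIVE.type / remaining,
    editorial: 0,
    link: PROSPECTIVE.link / remaining,
  };
}

function pqs(s: SubScores, sentiment: number, retrospective = false): number {
  const w = effectiveWeights(retrospective);
  const weighted =
    w.position * s.position +
    w.chunk * s.chunk +
    w.type * s.type +
    w.editorial * (s.editorial ?? 0) +
    w.link * s.link;
  return Math.min(100, Math.max(0, 100 * weighted * sentiment));
}
```

Note that 0.30 / 0.85 ≈ 0.353 and 0.15 / 0.85 ≈ 0.176, matching the retrospective defaults listed above.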


Combining with AIPVS: Total Placement Value

When a placement has a known publisher and that publisher has been enriched with AIPVS data, the PQS panel displays a Total Placement Value:

Total Placement Value = AIPVS × (PQS / 100)

This single number answers the combined question: is this domain worth it and is this placement well-positioned? A Tier 1 publisher (AIPVS 85) with a mediocre middle-of-article passing mention (PQS 55) delivers a Total Placement Value of 47 — solidly Tier 3, despite the premium domain. A Tier 2 publisher (AIPVS 65) with an excellent dedicated-article lead (PQS 90) delivers 59 — higher than the premium domain, because the placement quality compensates.
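As a one-line sketch (the rounding to a whole number is an illustrative assumption), the worked examples above compute as:

```typescript
// Total Placement Value = AIPVS × (PQS / 100), rounded for display.
function totalPlacementValue(aipvs: number, pqs: number): number {
  return Math.round(aipvs * (pqs / 100));
}
```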

The tier labels for the combined score follow the same 0–100 mapping as the component metrics (Premium 75+, Strong 50–74, Moderate 25–49, Limited 0–24).


How Retrospective Scoring Works

When a user submits a live URL, the scoring pipeline runs as a background job:

  1. Fetch the page using Spyglasses' static fetcher (with Cloudflare fallback for authenticated users)
  2. Extract main content and convert to markdown (Defuddle + Turndown, same pipeline as the AI Readiness Audit)
  3. Tokenize the markdown using the gpt-4o encoding
  4. Locate the brand mention by walking the tokens for any of the property's name + aliases (longest match first)
  5. Detect the link attribute by parsing the raw HTML with JSDOM and checking rel on any anchor pointing at the brand domain
  6. Classify the placement type via Claude Haiku, given the three-chunk halo + article title + article length
  7. Analyze sentiment via Claude Haiku, given the three-chunk halo
  8. Run key message pull-through — embed each of the property's key messages and each sentence in the placement, score with cosine similarity, flag above-threshold matches
  9. Compute the composite score and persist with effective weights + all intermediate signals

A retrospective score typically takes 15–30 seconds. If the brand mention cannot be located in the content, the placement is marked as failed with an explicit error so the user can correct the URL or add an alias.


Key Message Pull-Through (Informational)

For properties that have configured Key Messages, retrospective placements also run a semantic similarity check. Each key message is embedded alongside each sentence in the placement content using OpenAI's text-embedding-3-small model, then scored pairwise with cosine similarity. Sentences with scores above the threshold are treated as a pull-through match.

Pull-through is informational only. It does not affect the PQS score. It exists to answer a different question: did your messaging actually make it into the coverage? This is valuable for tracking whether a PR campaign delivered on its narrative goals, independent of whether the coverage is well-positioned for AI citation.

The threshold is calibrated for text-embedding-3-small, which runs noticeably cooler than larger embedding models — even near-duplicate phrasing typically scores in the 0.5–0.7 range, and same-topic content lands around 0.38–0.55. We default to a 40% threshold, which catches real signal while keeping false positives low. Every key message is displayed in the side panel regardless of threshold, so you can always see the closest-matching sentence and its score even when it doesn't qualify as a full match.
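The pairwise check can be sketched as below. The 0.4 threshold comes from the text; the vectors stand in for text-embedding-3-small embeddings, and the function names are illustrative:

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const THRESHOLD = 0.4; // the 40% default described above

// A sentence counts as pull-through when its embedding clears
// the threshold against the key message's embedding.
function isPullThrough(msgVec: number[], sentenceVec: number[]): boolean {
  return cosine(msgVec, sentenceVec) >= THRESHOLD;
}
```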


What This Score Is — and What It's Not

What it is:

  • A directional signal for comparing placement opportunities and auditing existing placements.
  • Research-backed. The position bias, chunk containment thresholds, and sentiment asymmetry are drawn from published research on LLM citation behavior.
  • Reproducible. Every scored placement snapshots the effective weights and all intermediate signals, so the score can always be explained after the fact.
  • Complementary to AIPVS. PQS handles page-level quality; AIPVS handles domain-level value. Use both together.

What it's not:

  • A guarantee of AI citation. A high PQS means the structural factors are in place. It doesn't guarantee any specific AI platform will cite a placement for any specific query.
  • A substitute for editorial judgment. Placement type, sentiment, and link attribute classifications come from an LLM and are occasionally wrong, particularly on ambiguous content. Users can correct any auto-detected value in the UI after scoring completes.
  • A static metric. As AI citation pipelines evolve, we expect the weights and thresholds to change. Every scored placement carries a weight snapshot so historical scores remain interpretable.