The GEO Crash Test scores any public URL on how well it's optimized to be cited by generative AI systems. It runs a six-category analysis, returns a composite score from 0 to 100, and flags the specific issues holding the page back. This page explains the methodology behind the score: what it measures, how the categories combine, and what the results mean in practice.
The tool was built to solve a specific problem: traditional SEO audits don't measure citation potential in AI Overviews, ChatGPT, Perplexity, or Claude. Those systems don't rank pages — they extract claims. A different kind of audit is needed to see whether a page is positioned to be one of the sources a language model pulls from.
What the tool analyzes
When you submit a URL, the GEO Crash Test fetches the live page and runs it through a pipeline that examines the same signals AI systems use when deciding what to cite. The analysis covers structural, semantic, and entity-level properties of the page.
The pipeline reads the page's HTML, parses its structured data, evaluates the content for extractability patterns, and checks the surrounding signals that influence whether a model treats the source as authoritative. The result is a category-by-category breakdown rather than a single opaque number, so you can see exactly where the page is strong and where it's leaking citation potential.
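As a rough illustration, the fetch-and-parse stage of such a pipeline could look like the Python sketch below. The function name, return shape, and library choices (requests, beautifulsoup4) are assumptions for the example, not the tool's actual implementation.

```python
# A minimal sketch of the fetch-and-parse stage, under the assumptions above.
# analyze_url() is a hypothetical entry point, not the tool's real API.
import json

import requests
from bs4 import BeautifulSoup

def analyze_url(url: str) -> dict:
    """Fetch a live page and collect the raw inputs for category scoring."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    # Collect every JSON-LD block for the structured-data and entity checks.
    schemas = []
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            schemas.append(json.loads(tag.string or ""))
        except json.JSONDecodeError:
            pass  # a malformed block is itself a scoring signal

    # Visible paragraphs feed the extractability heuristics.
    paragraphs = [t for t in (p.get_text(" ", strip=True) for p in soup.find_all("p")) if t]

    return {"html": response.text, "schemas": schemas, "paragraphs": paragraphs}
```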
The six scoring categories
The score is built from six categories, each measuring a distinct property of citability.
Extractability
Extractability evaluates whether a language model can lift a self-contained claim from the page without surrounding context. The tool examines sentence structure, paragraph length, lead definitions, and whether individual passages remain intelligible when quoted on their own.
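To make that concrete, a simplified extractability heuristic might look like the sketch below. The thresholds are illustrative assumptions, not the tool's published values.

```python
# Illustrative extractability heuristic: shorter sentences and modest
# paragraph lengths stand in for the richer signals described above.
import re

def extractability_score(paragraphs: list[str]) -> float:
    """Return 0-100: the share of paragraphs that lift cleanly out of context."""
    if not paragraphs:
        return 0.0
    good = 0
    for para in paragraphs:
        sentences = re.split(r"(?<=[.!?])\s+", para)
        avg_words = sum(len(s.split()) for s in sentences) / len(sentences)
        # Assumed thresholds: short sentences, paragraph under ~120 words.
        if avg_words <= 25 and len(para.split()) <= 120:
            good += 1
    return 100.0 * good / len(paragraphs)
```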
Entity authority
Entity authority evaluates whether the page is attached to a recognizable, structured entity. The tool checks for Organization and Person schema, cross-platform identity links, author attribution, and publisher signals that help AI systems answer the question "who is this source?"
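A minimal version of that check could walk the page's JSON-LD for Organization and Person types, sameAs identity links, and author attribution, as in the hypothetical sketch below.

```python
# Sketch of an entity-signal check over the schemas collected at fetch time.
def entity_signals(schemas: list[dict]) -> dict:
    """Report which entity-level signals appear anywhere in the page's JSON-LD."""
    found = {"Organization": False, "Person": False, "sameAs": False, "author": False}

    def walk(node):
        if isinstance(node, dict):
            declared = node.get("@type", "")
            types = declared if isinstance(declared, list) else [declared]
            for t in types:
                if t in ("Organization", "Person"):
                    found[t] = True
            if node.get("sameAs"):
                found["sameAs"] = True  # cross-platform identity links
            if node.get("author"):
                found["author"] = True
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for schema in schemas:
        walk(schema)
    return found
```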
Topical depth
Topical depth evaluates whether the page sits inside a coherent cluster of related content. The tool looks at internal linking, related content signals, and the surrounding site structure to determine whether the page reads as a single accidental article or as part of an authority neighborhood.
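One of those signals, internal linking, is easy to sketch: collect the page's same-domain links and treat them as a rough proxy for cluster membership. The helper below is illustrative only.

```python
# Illustrative: same-domain links as a rough proxy for topical clustering.
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup

def internal_links(html: str, page_url: str) -> list[str]:
    """Return absolute URLs of same-domain links found on the page."""
    domain = urlparse(page_url).netloc
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        target = urljoin(page_url, anchor["href"])
        if urlparse(target).netloc == domain:
            links.append(target)
    return links
```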
Structured data coverage
Structured data coverage evaluates the completeness and validity of JSON-LD schema markup. The tool checks for Article, FAQPage, BreadcrumbList, and entity schemas, and flags missing or malformed structures that reduce machine-readability.
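A simplified coverage check could map each expected @type against what the page actually declares; the set of expected types below is an assumption based on the ones named above.

```python
# Sketch: which of the expected schema types does the page declare?
EXPECTED_TYPES = {"Article", "FAQPage", "BreadcrumbList", "Organization", "Person"}

def schema_coverage(schemas: list[dict]) -> dict[str, bool]:
    """Map each expected schema type to whether it appears in the JSON-LD."""
    present: set[str] = set()

    def collect(node):
        if isinstance(node, dict):
            declared = node.get("@type", "")
            types = declared if isinstance(declared, list) else [declared]
            present.update(t for t in types if isinstance(t, str))
            for value in node.values():
                collect(value)
        elif isinstance(node, list):
            for item in node:
                collect(item)

    for schema in schemas:
        collect(schema)
    return {t: t in present for t in EXPECTED_TYPES}
```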
Freshness signals
The freshness signals category evaluates recency indicators, including dateModified, references to current information, and the cadence of surrounding content updates. AI systems filter out stale pages in favor of maintained ones, and this category captures that risk.
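As an illustration, a freshness check might pull the most recent dateModified out of the page's JSON-LD and measure how stale it is. The sketch below is hypothetical; in particular, how a missing date is penalized is the tool's call and isn't shown here.

```python
# Sketch: days since the most recent dateModified declared in JSON-LD.
from datetime import datetime, timezone

def days_since_modified(schemas: list[dict]) -> int | None:
    """Return days since the newest dateModified, or None if none is declared."""
    dates = []

    def collect(node):
        if isinstance(node, dict):
            if "dateModified" in node:
                try:
                    raw = str(node["dateModified"]).replace("Z", "+00:00")
                    parsed = datetime.fromisoformat(raw)
                    if parsed.tzinfo is None:
                        parsed = parsed.replace(tzinfo=timezone.utc)
                    dates.append(parsed)
                except ValueError:
                    pass  # an unparseable date is itself a freshness red flag
            for value in node.values():
                collect(value)
        elif isinstance(node, list):
            for item in node:
                collect(item)

    for schema in schemas:
        collect(schema)
    if not dates:
        return None
    return (datetime.now(timezone.utc) - max(dates)).days
```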
Originality of frame
Originality of frame evaluates whether the page contributes something the model can't synthesize from other sources — a named framework, original data, a documented methodology, or a defined term. This is the hardest category to score quantitatively and the most durable signal over time.
How the categories combine
The six categories don't contribute equally to the final score. Extractability and entity authority do the heaviest lifting because they're gating factors — a page that fails on either is unlikely to be cited regardless of how it performs elsewhere.
Structured data coverage and topical depth are the next tier. These determine whether a page survives the model's filtering when multiple candidate sources exist for the same query.
Freshness and originality of frame are the long-tail factors. They matter less for any single page in isolation but matter enormously over time and at the site level. Originality of frame in particular is what separates a citable site from a forgettable one across longer horizons.
The composite score reflects this weighting. Two pages can have similar scores while having very different underlying profiles, which is why the category breakdown matters as much as the headline number.
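A weighted sum is the simplest way to express that kind of tiering. The weights in the sketch below are assumptions chosen only to mirror the tiers described above; the tool's actual values aren't published.

```python
# Assumed weights mirroring the tiers above: gating factors heaviest,
# filtering factors next, long-tail factors lightest. Not the tool's values.
WEIGHTS = {
    "extractability": 0.25,
    "entity_authority": 0.25,
    "structured_data": 0.15,
    "topical_depth": 0.15,
    "freshness": 0.10,
    "originality": 0.10,
}

def composite_score(category_scores: dict[str, float]) -> float:
    """Combine per-category scores (each 0-100) into one weighted composite."""
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
```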
What the score means in practice
A score in the 0–39 range means the page is effectively invisible to AI systems. The structural signals that drive citation aren't there, and the page is unlikely to appear as a source in generative answers.
A score in the 40–59 range means the page is occasionally surfacing. Some signals are working, but enough gaps exist that citation behavior is inconsistent.
A score in the 60–79 range is the working zone for most professional content. The page is structurally sound, entity-attributed, and citable across multiple AI systems.
A score of 80 or above indicates a page that combines all six factors and contributes original frames the model has no alternative for. These pages are difficult to displace once established.
Most pages on most sites score in the 30–55 range on a first run. The gap between current state and citable state is usually closable in a single optimization pass.
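For reference, those bands reduce to a simple lookup; the labels below paraphrase the ranges described in this section.

```python
# The interpretation bands above, paraphrased as a lookup.
def score_band(score: float) -> str:
    """Map a composite score to the interpretation band it falls in."""
    if score >= 80:
        return "original frames; difficult to displace"
    if score >= 60:
        return "working zone: structurally sound and citable"
    if score >= 40:
        return "occasionally surfacing; inconsistent citation"
    return "effectively invisible to AI systems"
```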
What the tool doesn't do
A few things are worth naming explicitly.
The tool doesn't crawl your entire site — it scores one URL at a time. Authority signals from the surrounding site are inferred from what's linked from the analyzed page rather than measured across the full domain.
The tool doesn't predict ranking in traditional search. Traditional SEO and GEO measure different things, and a page can perform well in one and poorly in the other. The score is specifically about citation potential in AI-generated answers.
The tool doesn't guarantee citation. It measures whether the structural conditions for citation are in place. Whether a specific AI system cites a specific page on a specific query also depends on the query itself, the competitive set of other available sources, and the model's own selection logic.
Why the methodology is public
The thinking behind the score is public because the value of the tool is in the analysis, not in the secrecy of the rubric. Anyone can read this page and understand what's being measured. What makes the tool useful is running the measurement consistently across hundreds of URLs and surfacing the specific, actionable issues on each one.
If you want to see how a page on your site scores against these six categories, run it through the GEO Crash Test. For the broader context on how AI citation works, see How AI Overviews Decide What to Cite. For a definitional reference on what a GEO score measures, see What is GEO Score?.
FAQ
How does the GEO Crash Test work?
The tool fetches a public URL and runs it through a six-category analysis examining extractability, entity authority, topical depth, structured data coverage, freshness signals, and originality of frame. It returns a composite score from 0 to 100 along with a category-by-category breakdown of specific issues.
What does the GEO Crash Test measure?
The tool measures citation potential in AI-generated answers from systems like Google's AI Overviews, ChatGPT, Perplexity, and Claude. It evaluates the structural, semantic, and entity-level signals that influence whether a language model will pull from the page when synthesizing an answer.
How accurate is the GEO Crash Test score?
The score reflects the structural conditions for citation rather than guaranteeing citation behavior. It measures the same signals AI systems use, but actual citation depends on the query, the competitive set of available sources, and the model's selection logic.
Does the GEO Crash Test crawl my entire website?
No. The tool scores one URL at a time. Authority signals from the surrounding site are inferred from what's linked from the analyzed page rather than measured across the full domain. For a multi-page audit, run each URL individually.
Can the GEO Crash Test predict my Google search rankings?
No. Traditional SEO and GEO measure different things. A page can rank well in Google and score poorly on the GEO Crash Test, or vice versa. The score is specifically about citation potential in AI-generated answers, not ranking position in traditional search.