Five AI Engines, Five Content Diets: A Q2 2026 Citation-Type Breakdown
Across 1,430 structurally classified citations from five production AI search engines, ChatGPT cites the vendor's own first-party site 68% of the time. The other four engines run 46 to 52%. Only Perplexity cites video meaningfully (9.7%, almost entirely YouTube). Community discussion (Reddit, Quora) appears in zero ChatGPT, Claude, or Perplexity citations, and only 1 to 2% of Gemini and Google AI Overview citations. Each engine is running on a different slice of the web.
Methodology
Citations in the Q2 2026 Citation Benchmark dataset (375 buyer-intent responses across 75 prompts × 5 engines, 2,583 cited URLs total) were tagged with one of fourteen structural domain categories defined in state/research/domain-taxonomy.json. The taxonomy spans vendor first-party (the brand's own canonical site), niche publisher hub (Healthline, Sleep Foundation, OutdoorGearLab), listicle / content-farm (emailvendorselection.com, insiderone.com), business press, tech press, lifestyle media, institutional (.gov, .edu, nonprofit .org), video (YouTube), community UGC (Reddit, Quora, HN), review aggregator (G2, Capterra), marketplace (Amazon, Chewy), personal blog / Medium, developer platform (GitHub, npm), and search-engine cache URL. Hand-classification covered the top-volume domains; long-tail citations are left unclassified and reported separately. We report 1,430 classified citations (55.4% of the total). All per-engine percentages use that engine's classified subset as the denominator, and the underlying counts are reported for transparency. The aggregation script is published alongside this report.
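The tagging step described above can be pictured as a domain lookup. The sketch below is illustrative only and is not the published aggregation script: `TAXONOMY` is a hypothetical stand-in for state/research/domain-taxonomy.json (whose real schema is not shown in this report), seeded with a few domains the report names, and `classify_url` and `classified_shares` are assumed helper names.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-in for domain-taxonomy.json; the real file's schema is
# not shown in this report. Domains and categories are ones named in the text.
TAXONOMY = {
    "hubspot.com": "Vendor first-party",
    "healthline.com": "Niche publisher hub",
    "pcmag.com": "Tech press",
    "youtube.com": "Video",
    "reddit.com": "Community discussion",
}


def classify_url(url: str) -> Optional[str]:
    """Return the category of a cited URL, or None for an unclassified domain."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return TAXONOMY.get(host)


def classified_shares(urls):
    """Share of each category, using classified citations as the denominator."""
    labels = [classify_url(u) for u in urls]
    classified = [label for label in labels if label is not None]
    counts = Counter(classified)
    shares = {cat: n / len(classified) for cat, n in counts.items()}
    return shares, len(classified), len(labels)


# Made-up URLs for illustration only; they are not rows from the dataset.
sample = [
    "https://www.hubspot.com/products/crm",
    "https://hubspot.com/pricing",
    "https://www.healthline.com/health/sleep",
    "https://www.youtube.com/watch?v=example",
    "https://long-tail-blog.example/post",
]
shares, n_classified, n_total = classified_shares(sample)
print(f"{n_classified} of {n_total} classified")
for category, share in sorted(shares.items()):
    print(f"{category}: {share:.1%}")
```

Unclassified URLs stay out of the denominator, which mirrors how the per-engine percentages in this report are computed.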
Five engines, five content diets
Days 1 through 5 of this research series measured the cross-engine citation graph at the domain level and the vertical level. The Q2 2026 dataset supports another lens the series has not yet used: what kind of content each engine reaches for. Each cited URL on a hand-classified domain was tagged with one of fourteen structural categories. The patterns by engine turn out to be striking.
ChatGPT cites the vendor's own first-party site 68% of the time. The other four engines run between 46% and 52%. ChatGPT also cites zero video, zero community discussion, zero personal blogs, and zero review aggregators. Claude cites zero video, zero community discussion, and zero tech press. Perplexity is the only engine in the five that cites YouTube at scale, and Gemini is the only engine that cites Reddit at any meaningful rate. The engines are running on visibly different slices of the web even when they're answering the same buyer-intent question.
Population (n = 5 engines × 75 buyer-intent prompts)
| Metric | Value |
|---|---|
| Responses sampled | 375 |
| Citations extracted | 2,583 |
| Citations classified | 1,430 (55.4%) |
| Categories in taxonomy | 14 |
| Verticals covered | 25 |
| Source dataset | Q2 2026 benchmark |
The per-engine summary table
Each engine returned 75 responses, but the number of citations per response varies by a factor of nearly three. Google AI Overview leads at 9.31 citations per response, followed by Gemini at 9.25, Perplexity at 7.27, Claude at 5.33, and ChatGPT at 3.28. Because volumes differ this much, each engine's own classified subset is the right denominator for any structural comparison.
| Engine | Citations / response | Classified | Classified % | Top category |
|---|---|---|---|---|
| ChatGPT | 3.28 | 150 / 246 | 61.0% | Vendor first-party (68.0%) |
| Claude | 5.33 | 205 / 400 | 51.3% | Vendor first-party (52.2%) |
| Gemini | 9.25 | 370 / 694 | 53.3% | Vendor first-party (45.7%) |
| Google AI Overview | 9.31 | 385 / 698 | 55.2% | Vendor first-party (48.8%) |
| Perplexity | 7.27 | 320 / 545 | 58.7% | Vendor first-party (47.8%) |
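The derived columns in the table can be recomputed from the raw counts. The following is a minimal sketch, assuming only the classified and total counts shown above and 75 responses per engine; `ratio` and `summarize` are hypothetical helper names, not part of the published aggregation script.

```python
from decimal import Decimal, ROUND_HALF_UP

RESPONSES_PER_ENGINE = 75

# (classified, total) citation counts per engine, copied from the table above.
COUNTS = {
    "ChatGPT": (150, 246),
    "Claude": (205, 400),
    "Gemini": (370, 694),
    "Google AI Overview": (385, 698),
    "Perplexity": (320, 545),
}


def ratio(numerator, denominator, places):
    """Divide and round half-up to the given quantum, e.g. '0.01'."""
    value = Decimal(numerator) / Decimal(denominator)
    return value.quantize(Decimal(places), rounding=ROUND_HALF_UP)


def summarize(counts, responses=RESPONSES_PER_ENGINE):
    """Citations per response and classified share for each engine."""
    rows = {}
    for engine, (classified, total) in counts.items():
        rows[engine] = {
            "citations_per_response": ratio(total, responses, "0.01"),
            "classified_pct": ratio(100 * classified, total, "0.1"),
        }
    return rows


summary = summarize(COUNTS)
total_classified = sum(classified for classified, _ in COUNTS.values())
total_cited = sum(total for _, total in COUNTS.values())
overall_pct = ratio(100 * total_classified, total_cited, "0.1")

for engine, row in summary.items():
    print(f"{engine}: {row['citations_per_response']} per response, "
          f"{row['classified_pct']}% classified")
print(f"Overall: {total_classified} / {total_cited} = {overall_pct}%")
```

`Decimal` with half-up rounding is a deliberate choice here: Claude's share is exactly 51.25%, which Python's built-in `round` would take to 51.2 (round-half-to-even) rather than the 51.3% shown in the table.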
The full per-engine, per-category matrix
Each cell is the share of the engine's classified citations going to that category. The categories are ordered by total classified citation volume across all five engines. Cells where the engine cites zero of a category are shown as 0%, which is what they are. There are eleven such zeros in the matrix: five in the ChatGPT column, five in the Claude column, and one in the Perplexity column.
| Content type | ChatGPT | Claude | Gemini | AIO | Perplexity |
|---|---|---|---|---|---|
| Vendor first-party | 68.0% | 52.2% | 45.7% | 48.8% | 47.8% |
| Niche publisher hub | 11.3% | 23.9% | 19.7% | 18.4% | 21.6% |
| Listicle / content-farm | 2.0% | 11.2% | 9.5% | 10.1% | 10.0% |
| Business press | 2.7% | 3.9% | 5.4% | 5.2% | 1.6% |
| Video (mostly YouTube) | 0.0% | 0.0% | 3.0% | 2.6% | 9.7% |
| Institutional (.gov / .edu / nonprofit) | 6.7% | 3.4% | 1.9% | 1.8% | 3.1% |
| Personal blog / Medium | 0.0% | 1.5% | 4.1% | 3.1% | 1.3% |
| Tech press | 4.7% | 0.0% | 2.7% | 3.6% | 0.3% |
| Marketplace | 2.7% | 2.4% | 1.9% | 1.3% | 2.2% |
| Review aggregator | 0.0% | 0.5% | 1.6% | 2.1% | 1.3% |
| Community discussion (Reddit, Quora, HN) | 0.0% | 0.0% | 2.4% | 1.0% | 0.0% |
| Lifestyle media | 1.3% | 0.0% | 1.4% | 1.0% | 0.3% |
| Developer platform | 0.7% | 0.0% | 0.5% | 0.5% | 0.3% |
| Search-engine cache URL | 0.0% | 1.0% | 0.3% | 0.3% | 0.6% |
Coverage: how broad each engine's citation diet is
The fourteen classified categories are not equally represented across engines. Gemini and Google AI Overview cite from all fourteen. Perplexity covers thirteen (the only gap is community discussion). ChatGPT and Claude each cover nine of the fourteen. The five categories ChatGPT skips entirely are video, personal blog, review aggregator, community discussion, and search-engine cache URLs. The five Claude skips are video, tech press, community discussion, lifestyle media, and developer platform.
| Engine | Categories cited | Categories absent |
|---|---|---|
| ChatGPT | 9 / 14 | Video (mostly YouTube), Personal blog / Medium, Review aggregator, Community discussion (Reddit, Quora, HN), Search-engine cache URL |
| Claude | 9 / 14 | Video (mostly YouTube), Tech press, Community discussion (Reddit, Quora, HN), Lifestyle media, Developer platform |
| Gemini | 14 / 14 | (all covered) |
| Google AI Overview | 14 / 14 | (all covered) |
| Perplexity | 13 / 14 | Community discussion (Reddit, Quora, HN) |
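The coverage table follows mechanically from the per-engine matrix. As a sketch, with the percentages transcribed from that matrix and a hypothetical helper named `coverage`, a category counts as absent when its cell is exactly 0.0%:

```python
ENGINES = ["ChatGPT", "Claude", "Gemini", "Google AI Overview", "Perplexity"]

# Percent of each engine's classified citations, transcribed from the matrix.
# Column order matches ENGINES.
MATRIX = {
    "Vendor first-party": [68.0, 52.2, 45.7, 48.8, 47.8],
    "Niche publisher hub": [11.3, 23.9, 19.7, 18.4, 21.6],
    "Listicle / content-farm": [2.0, 11.2, 9.5, 10.1, 10.0],
    "Business press": [2.7, 3.9, 5.4, 5.2, 1.6],
    "Video (mostly YouTube)": [0.0, 0.0, 3.0, 2.6, 9.7],
    "Institutional (.gov / .edu / nonprofit)": [6.7, 3.4, 1.9, 1.8, 3.1],
    "Personal blog / Medium": [0.0, 1.5, 4.1, 3.1, 1.3],
    "Tech press": [4.7, 0.0, 2.7, 3.6, 0.3],
    "Marketplace": [2.7, 2.4, 1.9, 1.3, 2.2],
    "Review aggregator": [0.0, 0.5, 1.6, 2.1, 1.3],
    "Community discussion (Reddit, Quora, HN)": [0.0, 0.0, 2.4, 1.0, 0.0],
    "Lifestyle media": [1.3, 0.0, 1.4, 1.0, 0.3],
    "Developer platform": [0.7, 0.0, 0.5, 0.5, 0.3],
    "Search-engine cache URL": [0.0, 1.0, 0.3, 0.3, 0.6],
}


def coverage(matrix, engines):
    """For each engine, count categories cited and list those with a 0.0% cell."""
    report = {}
    for i, engine in enumerate(engines):
        absent = [cat for cat, row in matrix.items() if row[i] == 0.0]
        report[engine] = {"cited": len(matrix) - len(absent), "absent": absent}
    return report


cov = coverage(MATRIX, ENGINES)
total_zeros = sum(len(entry["absent"]) for entry in cov.values())

for engine, entry in cov.items():
    print(f"{engine}: {entry['cited']} / {len(MATRIX)}")
print(f"Zero cells in the matrix: {total_zeros}")
```

Treating a displayed 0.0% as a true zero is safe for this matrix because even a single citation in the largest classified subset (1 of 385) would round to 0.3%, not 0.0%.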
Five engine signatures
Read down each engine's column in the matrix and a personality emerges for each one. Below is the simplest version of that read.
ChatGPT
The vendor-first-party engine. 68% of its classified citations go to the brand's own .com. Zero citations to community discussion, video, personal blogs, or review aggregators.
Top vendor first-party citations
- hubspot.com (3)
- semrush.com (2)
- salesforce.com (2)
- zoho.com (2)
- atlassian.com (2)
Claude
The niche-authority engine. 52% vendor first-party, 24% niche publisher hub. Cites Healthline, BetterTrail, Sleep Doctor, The Good Trade. Zero video, zero community.
Top niche publisher hubs
- healthline.com (6)
- bettertrail.com (3)
- sleepdoctor.com (3)
- thegoodtrade.com (3)
- upgradedpoints.com (2)
Gemini
The omnivore. Cites all fourteen categories. Accounts for 69% of all community-UGC citations in the dataset. Closest engine to a general web view.
Top community UGC citations
- reddit.com (8)
- quora.com (1)
Google AI Overview
Also covers all fourteen categories, with the highest tech-press share of any engine except ChatGPT (3.6%) and a video share (2.6%) well behind Perplexity's. Cites 9.3 sources per response, the most of the five engines.
Top tech press citations
- pcmag.com (5)
- cnet.com (5)
- techradar.com (2)
- zdnet.com (1)
- tomsguide.com (1)
Perplexity
The video-grounded engine. 9.7% of its classified citations are video, almost all YouTube. Mid-pack on everything else.
Top video citations
- youtube.com (31)
The vertical overlay: SaaS is vendor-first, CPG is publisher-first
Collapse the 25 verticals into three super-categories: tech SaaS, consumer services, and CPG / retail. The engine-level patterns above are real, but they also partly reflect the mix of verticals in the prompt set. Viewed through the vertical lens directly, the gap between SaaS and CPG / retail is much larger than the gap between any two engines.
| Content type | Tech SaaS | Consumer services | CPG / retail |
|---|---|---|---|
| Vendor first-party | 76.0% | 58.7% | 9.4% |
| Niche publisher hub | 0.6% | 15.7% | 48.5% |
| Listicle / content-farm | 12.4% | 2.1% | 8.2% |
| Business press | 1.0% | 3.7% | 8.4% |
| Video (mostly YouTube) | 2.0% | 3.3% | 6.1% |
| Institutional (.gov / .edu / nonprofit) | 0.0% | 5.0% | 5.9% |
| Personal blog / Medium | 1.7% | 2.1% | 3.5% |
| Tech press | 1.9% | 4.1% | 1.8% |
| Marketplace | 0.3% | 0.4% | 5.1% |
| Review aggregator | 2.6% | 0.4% | 0.0% |
| Community discussion (Reddit, Quora, HN) | 0.7% | 1.7% | 0.8% |
| Lifestyle media | 0.0% | 1.2% | 1.8% |
| Developer platform | 0.9% | 0.0% | 0.0% |
| Search-engine cache URL | 0.0% | 1.7% | 0.4% |
Tech SaaS responses cite the vendor's own first-party site 76% of the time and a niche publisher hub 0.6% of the time. CPG / retail responses cite the vendor's first-party site 9.4% of the time and a niche publisher hub 48.5% of the time. The buyer asks "best CRM software" and gets salesforce.com. The buyer asks "best mattress" and gets sleepfoundation.org. These are effectively two different AI-search products, depending on the vertical.
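The claim that the vertical gap dwarfs the engine gap can be checked with simple arithmetic on the two tables. A minimal sketch, using the vendor first-party and niche publisher hub rows as published; `spread` is a hypothetical helper that returns the largest pairwise gap in percentage points:

```python
# Shares (percent of classified citations) copied from the engine matrix
# and the vertical table in this report.
VENDOR_BY_ENGINE = {
    "ChatGPT": 68.0, "Claude": 52.2, "Gemini": 45.7,
    "Google AI Overview": 48.8, "Perplexity": 47.8,
}
VENDOR_BY_VERTICAL = {
    "Tech SaaS": 76.0, "Consumer services": 58.7, "CPG / retail": 9.4,
}
NICHE_BY_ENGINE = {
    "ChatGPT": 11.3, "Claude": 23.9, "Gemini": 19.7,
    "Google AI Overview": 18.4, "Perplexity": 21.6,
}
NICHE_BY_VERTICAL = {
    "Tech SaaS": 0.6, "Consumer services": 15.7, "CPG / retail": 48.5,
}


def spread(shares):
    """Largest gap between any two groups, in percentage points (max - min)."""
    return round(max(shares.values()) - min(shares.values()), 1)


for label, by_engine, by_vertical in [
    ("Vendor first-party", VENDOR_BY_ENGINE, VENDOR_BY_VERTICAL),
    ("Niche publisher hub", NICHE_BY_ENGINE, NICHE_BY_VERTICAL),
]:
    print(f"{label}: engine spread {spread(by_engine)} pts, "
          f"vertical spread {spread(by_vertical)} pts")
```

For both rows the spread across verticals is roughly three to four times the spread across engines, which is the comparison the paragraph above is making.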
The single engine-exclusive category
The taxonomy includes one category that is effectively a one-engine signal in Q2 2026:
- Community discussion (Reddit, Quora, HN): Gemini accounts for 69.2% of all 13 citations to this category in the dataset.
Community discussion is a Gemini-shaped signal in this dataset. Google AI Overview picks up some of it (4 of 13 community citations). ChatGPT, Claude, and Perplexity pick up none of it. A brand or publisher investing in Reddit visibility specifically as an AI-citation lever should be honest with themselves that the payoff is concentrated in two of the five major engines and zero in the other three. The total citation volume to community UGC across the entire dataset was only 13 URLs (0.9% of classified citations).
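The concentration figures for community discussion reduce to a few divisions. A short sketch, using the per-engine community counts reported in this document (9 for Gemini, 4 for Google AI Overview, 0 elsewhere) and the 1,430 classified total; the variable names are illustrative:

```python
# Community-discussion citations per engine, as reported in this document.
COMMUNITY_CITATIONS = {
    "ChatGPT": 0,
    "Claude": 0,
    "Gemini": 9,
    "Google AI Overview": 4,
    "Perplexity": 0,
}
TOTAL_CLASSIFIED = 1430

community_total = sum(COMMUNITY_CITATIONS.values())

# Each engine's share of all community-discussion citations, in percent.
engine_share = {
    engine: round(100 * n / community_total, 1)
    for engine, n in COMMUNITY_CITATIONS.items()
}

# Community discussion as a share of every classified citation, in percent.
dataset_share = round(100 * community_total / TOTAL_CLASSIFIED, 1)

engines_with_any = [e for e, n in COMMUNITY_CITATIONS.items() if n > 0]

print(f"Community citations: {community_total}")
print(f"Gemini share: {engine_share['Gemini']}%")
print(f"Share of classified citations: {dataset_share}%")
print(f"Engines citing community at all: {engines_with_any}")
```

With only 13 citations in the category, a handful of additional Reddit citations on any one engine would move these shares noticeably, so the percentages are best read alongside the raw counts.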
What this means for a publisher
Three implications for an operator who cares about AI-search visibility across more than one engine.
One. Optimize for vendor first-party first. The brand's own .com is the single largest content type on every engine (ChatGPT 68%, Claude 52%, Google AI Overview 49%, Perplexity 48%, Gemini 46%). The single highest-leverage investment is making the product pages, resource centers, glossary entries, and FAQs deep enough that an answer engine wants to lift from them. This dovetails with the AEO Readiness finding that 29.6% of scanned sites ship no structured data at all.
Two. Earn niche-publisher-hub citations for consumer-facing categories. If the product sits in CPG, retail, or health, the citation lever shifts from first-party to a category authority publisher: Sleep Foundation for mattresses, Healthline for health, OutdoorGearLab for outdoor gear, The Good Trade for sustainable consumer. Across all engines combined, these niche publisher hubs are the second-largest content type after vendor first-party.
Three. Treat video and community discussion as engine-specific bets, not universal ones. YouTube is a Perplexity bet, with a weak echo on Gemini and Google AI Overview, and nothing on ChatGPT or Claude. Reddit / Quora / HN is a Gemini bet with a weak echo on AI Overview, and nothing on ChatGPT, Claude, or Perplexity. Invest in these channels only if the engines that cite them are part of the target visibility set.
Other reports in this series
This is Day 7 of an ongoing research cadence. Days 1 through 5 draw from the Q2 2026 AI Search Citation Benchmark and measure the engine side of citation. Day 6 measures the publisher side: how technically prepared the typical scanned domain is to be cited. This Day 7 report slices the engine side by content type, the lens Day 2 collapsed into a binary aggregator-vs-vendor split.
- AEO readiness across 311 websites. Median AEO 46 vs median SEO 86. The 40-point publisher-side gap.
- Top 100 most-cited domains in AI search (Q2 2026). Only 12 of 1,119 domains are cited by all five engines.
- ChatGPT vs. Google AI Overview: the same prompt, two different webs. 4.1% Jaccard, 64% zero-overlap.
- Buyer intent reshapes AI citations. Discovery, shortlist, and variation cite different domain sets.
- When AI engines cite the reviewer vs. the brand. The 70-point vendor-vs-aggregator gap by vertical.
- AI Search Citation Benchmark, Q2 2026. The underlying dataset.
Frequently Asked Questions
Why is the unclassified portion 44.6% of all citations, and is that a problem?
The 14-category taxonomy was applied to the high-volume domains that account for the structurally meaningful share of citations. Long-tail domains, each cited once or twice across the dataset, were left unclassified to keep the classification accurate rather than guessed. Every per-engine percentage in this report uses the classified subset as the denominator, so the unclassified portion is not double-counted or hidden. The 1,430 classified citations are still the largest hand-classified AI-citation sample we are aware of for Q2 2026, and the patterns visible in the classified subset are large enough that even pessimistic assumptions about the unclassified long tail would not flip the comparative engine rankings.
ChatGPT cites the vendor first-party 68% of the time. Does that mean ChatGPT is the easiest engine to win on?
Easier to influence, harder to monitor. Vendor first-party concentration means that the brand's own .com pages have the most direct lever on ChatGPT visibility, so investments in first-party content quality (product pages, resource centers, glossary, FAQs) translate the most cleanly into ChatGPT citations. But it also means a smaller surface area for diversification: when a brand has no organic citations from independent publishers, AI-search visibility becomes brittle. The strongest playbook on ChatGPT is to ship first-party content that is dense enough that an answer-engine wants to lift from it, while in parallel earning citations on the niche publisher hubs and institutional sources that other engines reach for.
Why does Perplexity cite YouTube when the other engines barely do?
Perplexity is the only one of the five engines that consistently surfaces YouTube as an inline source. ChatGPT and Claude cite zero video. Gemini and Google AI Overview cite YouTube, but at low rates (3.0% and 2.6% of classified citations, against 9.7% for Perplexity). A plausible explanation is that Perplexity's grounding treats YouTube video transcripts as first-class web content and is willing to show the YouTube URL in the citation panel, whereas the other engines either don't index transcripts heavily or don't surface video links as canonical citations. The practical implication for a publisher is that YouTube content has a meaningful AI-search payoff only if Perplexity is part of the target engine set; the citation lift from YouTube is engine-specific, not universal.
Community discussion (Reddit, Quora, HN) is 0% on ChatGPT, Claude, and Perplexity. What does that say about the user-discussion-as-AI-signal thesis?
The popular thesis is that getting your brand discussed on Reddit translates into AI search visibility because models train on Reddit and engines cite Reddit. In this Q2 2026 sample that is true for Gemini and very weakly true for Google AI Overview, and not at all true for ChatGPT, Claude, or Perplexity. Out of 13 community-UGC citations in the entire classified set, 9 came from Gemini, 4 from Google AI Overview, and 0 from the other three engines. Reddit discussion is a real lever inside a narrow subset of the engine universe, but framing it as a general AI-search lever overstates how broadly the signal propagates.
How does this connect to the rest of the Q2 2026 benchmark series?
Day 1 measured cross-engine fragmentation by domain. Day 2 collapsed the 14 categories into a binary aggregator-vs-vendor lens and broke it down by vertical. Day 3 broke citations down by buyer intent. Day 4 paired ChatGPT and Google AI Overview specifically. Day 5 published the top-100 most-cited domains. Day 6 pivoted to the publisher side (AEO readiness across 311 scanned sites). This Day 7 report goes back to the engine side and reads the 14 categories per engine, which is the lens Day 2 collapsed away. The five engines turn out to have markedly different citation diets, which is the engine-level counterpart to Day 2's vertical-level finding.