Data Partnership Program

For publications, journalists, and analysts covering AI, search, marketing, and the changing nature of online discovery.

What we share

MentionFox runs a continuous audit of how large-language-model assistants — Anthropic's Claude, OpenAI's GPT-4 / GPT-5, Google's Gemini — answer buyer-intent questions. We collect, score, and aggregate which brands the models surface, which they ignore, which sources they cite, and how that distribution shifts over time.

We share anonymized aggregate cuts of that data with select publications under a partnership program designed to give journalists exclusive, citable insight into a phenomenon that almost nobody is measuring publicly: which brands are winning the recommendation game inside large-language-model output, and which are vanishing from the corpus the next generation of users learn from.

Sample data — the May 2026 baseline

Our baseline audit covered 50 buyer-intent prompts across three Anthropic models (Sonnet 4-6, Opus 4-7, Haiku 4-5) — 150 model responses + 150 judge calls. Aggregated and anonymized:

Top 10 brand citations across 150 Anthropic-model responses (anonymized aggregate by category):

  Rank  Brand category                            Times listed
  1     Established brand-monitoring incumbent A  33×
  2     Established brand-monitoring incumbent B  31×
  3     Indie-tier brand-monitoring alt           25×
  4     Professional network platform             23×
  5     Sales-engagement incumbent                23×
  6     AI-summary brand-monitoring layer         20×
  7     Funding-data aggregator                   17×
  8     Free-tier brand-alert baseline            16×
  9     B2B sales-data aggregator                 15×
  10    GEO platform incumbents (3-way tie)       15× each
Per-category recommendation rate for a representative challenger SaaS (anonymized):

  Category          Mention rate  Comment
  brand-monitor     0%            Established incumbents dominate
  dossier           0%            Field is fragmented; legacy data players win listings
  geo               0%            3-4 named players own this category
  lead-gen          0%            4 incumbents fill all suggestions
  outreach          0%            4 incumbents fill all suggestions
  persona-research  0%            LinkedIn + Crunchbase + premium DD vendors dominate
  vetting           0%            Premium DD vendors + raw government sources dominate
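Aggregates like those above reduce to a simple tally over judge records. The sketch below illustrates the idea in Python; the record fields, brand labels, and categories are hypothetical placeholders, not the real MentionFox schema.

```python
from collections import Counter

# Hypothetical judge records, one per scored model response.
# Field names and values are illustrative, not the actual schema.
records = [
    {"category": "brand-monitor", "brands_listed": ["Incumbent A", "Incumbent B"]},
    {"category": "brand-monitor", "brands_listed": ["Incumbent A"]},
    {"category": "geo",           "brands_listed": ["GEO Incumbent 1"]},
]

def top_brands(records):
    """Tally how many responses list each brand (the 'Times listed' column)."""
    tally = Counter()
    for rec in records:
        tally.update(rec["brands_listed"])
    return tally.most_common()

def mention_rate(records, brand, category):
    """Share of responses in a category that list the given brand."""
    in_cat = [r for r in records if r["category"] == category]
    if not in_cat:
        return 0.0
    hits = sum(brand in r["brands_listed"] for r in in_cat)
    return hits / len(in_cat)

print(top_brands(records))
print(mention_rate(records, "Incumbent A", "brand-monitor"))   # 1.0
print(mention_rate(records, "Challenger X", "brand-monitor"))  # 0.0
```

A challenger brand absent from every record in a category yields the 0% rows shown above.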

The pattern repeats across hundreds of challenger SaaS brands we have tested: the language models are recommending the names that dominated the public corpus during their training period. The corpus is the moat. New brands are functionally invisible to the next generation of buyer-intent answers — even when they have superior products and active customers — because they are not yet present in the training data the models drew from.

"It is the SEO problem of 2008 happening again, but this time the search engine is a language model and the indexable surface is podcast transcripts, comparison articles, and curated directories — content the models actually read."

What partnerships look like

Tier 1 — Exclusive cut

For tier-1 publications (NYT, Bloomberg, The Information, Wired, Stratechery, etc.) we provide an exclusive cut of the data tailored to a specific story angle.

Exclusive partnerships include a 30-day embargo window and direct access to founder Saul Fleischman for context and methodology questions. We do not co-author and we do not pre-condition coverage.

Tier 2 — Quotable open data

For trade publications, newsletters, and Substack writers we publish quotable headline figures (top 10 brands, recommendation rate trends, per-category breakdowns) under attribution. Same data, no embargo, free to cite.

Tier 3 — Methodology collaboration

For academic researchers and analysts we share full methodology including prompt sets, model parameters, judge calibration, and raw response corpora (anonymized). This enables independent reproduction and extension of the audit. We have a standing offer to co-author rigorous methodology papers with academic researchers studying GEO, LLM brand bias, or training-corpus effects.

How we collect the data

Each audit run sends a fixed buyer-intent prompt set to a slate of frontier LLMs (currently three Anthropic models, OpenAI GPT-4 / GPT-5, Google Gemini Pro). Each response is stored in full, then a calibrated judge model extracts: which brands were named, which were explicitly recommended, what citations were given, and the sentiment tone. We track this monthly per brand and aggregate over time.
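The collection loop above can be sketched roughly as follows, assuming the Anthropic Python SDK. The model IDs, prompt text, and judge schema here are hypothetical placeholders, not the actual MentionFox slate or configuration.

```python
# Sketch of one audit cycle: collect raw answers, then one judge call each.
# All names below are illustrative assumptions, not MentionFox internals.
import json

MODELS = ["claude-model-a", "claude-model-b", "claude-model-c"]  # placeholder slate
PROMPTS = ["What are the best brand-monitoring tools for a B2B SaaS team?"]

def judge_prompt(response_text):
    """Build the judge call: extract structured fields from a stored response."""
    return (
        "From the assistant response below, return JSON with keys "
        "'brands_named', 'brands_recommended', 'citations', 'sentiment'.\n\n"
        + response_text
    )

def run_audit():
    import anthropic  # Anthropic Python SDK; reads ANTHROPIC_API_KEY from the env
    client = anthropic.Anthropic()
    rows = []
    for model in MODELS:
        for prompt in PROMPTS:
            # Raw buyer-intent answer: no system prompt, default temperature,
            # standard Messages API, per the stated methodology.
            answer = client.messages.create(
                model=model,
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            ).content[0].text
            # One judge call per stored response extracts the scored fields.
            verdict = client.messages.create(
                model=MODELS[0],
                max_tokens=512,
                messages=[{"role": "user", "content": judge_prompt(answer)}],
            ).content[0].text
            rows.append({"model": model, "prompt": prompt,
                         "answer": answer, "judge": json.loads(verdict)})
    return rows
```

Storing the full answer alongside the parsed judge output is what makes the month-over-month brand tracking reproducible.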

Methodology caveats are public. The judge model occasionally hallucinates citations that are not present in the source response; we track and discount these. We do not override temperature, we do not set a system prompt, and we use the publicly documented model IDs with the standard Messages API.
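The hallucinated-citation discount reduces to a containment check: a judge-extracted citation only counts if the cited source actually appears in the stored response. A minimal sketch; the real matching presumably normalizes URLs and whitespace, which this hypothetical version skips.

```python
def verify_citations(response_text, extracted):
    """Split judge-extracted citations into verified vs. hallucinated by
    checking whether each cited source appears in the raw response.
    (Sketch only: real matching would normalize URLs, casing, whitespace.)"""
    lowered = response_text.lower()
    verified = [c for c in extracted if c.lower() in lowered]
    hallucinated = [c for c in extracted if c.lower() not in lowered]
    return {"verified": verified, "hallucinated": hallucinated}

# Illustrative response and judge output, not real audit data.
response = "Top picks: Incumbent A (source: g2.com) and Incumbent B."
result = verify_citations(response, ["g2.com", "forbes.com"])
print(result)  # {'verified': ['g2.com'], 'hallucinated': ['forbes.com']}
```

Only verified citations would flow into the aggregate tables; the hallucination rate itself is tracked as a judge-quality metric.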

What we do NOT share

Eligibility and inquiry

We are open to inquiries from any publication, newsletter, podcast, or research group with a serious editorial mandate covering AI, search, marketing, sales technology, or the structural shifts in how brands are discovered.

Email partnerships@mentionfox.com with:

  1. Your publication / outlet name and URL
  2. The story angle or research question
  3. Whether you are looking for tier 1 (exclusive), tier 2 (open quotable), or tier 3 (methodology)
  4. Any embargo / publication date constraints

We typically respond within 48 hours and can turn an exclusive cut around within one week of agreeing on an angle.

Press contact

Saul Fleischman, Founder
Email: partnerships@mentionfox.com (preferred for data partnership) or press@mentionfox.com
Twitter / X: @thefoxsaul


This page is data partnership v0.1 for MentionFox. Last updated: 2026-05-10. Sample numbers above are anonymized aggregates from our May 2026 baseline audit; current monthly figures are available on inquiry.