Directional data is better than no data
The measurement framework we use at Minuttia when no single metric tells the full story.
Hey, I'm George Chasiotis. Welcome to GrowthWaves, your weekly dose of B2B growth insights, featuring powerful case studies, emerging trends, and unconventional strategies you won't find anywhere else.
This note is brought to you by SERP Conf.
I spoke at SERP Conf. last year, and I'm happy to support them again this year.
If you're serious about organic growth and where search is heading in an AI-dominated world, this is one of the European conferences worth paying attention to.
The 2026 edition in Vienna this November brings together a super strong lineup of professionals who're in the trenches doing the work and not just preaching.
I'm supporting the event as an ambassador because I've seen the work the team behind it does.
These folks are world-class.
And I mean it.
If you want to attend, there's a 20% discount using the "george-20" code at checkout.
I got asked a question on a client call last month that I keep coming back to.
"If prompt tracking is unreliable, what should we actually be measuring?"
Fair question. And one I did not have a clean answer to for a while.
I have written about the problems with prompt tracking and the variance that makes point-in-time snapshots unreliable. That note covered what is broken (to a certain extent).
This one covers what to do instead.
I recently put together a list of 10 metrics we track for AI search performance. Not because any single metric tells the full story.
Because together, triangulated across multiple data sources, they start to paint a picture that is directionally useful.
This list will change! Probably within a year.
The tools will mature, the data will get cleaner and some of these metrics will be replaced by better ones.
But right now, in mid-2026, this is the framework we are working with at Minuttia.
Why a single metric will mislead you
Before I get into the list, I need to explain why measurement for AI search cannot be reduced to one number.
In the AEO survey Kevin Indig and Minuttia ran, 40.6% of 599 respondents said their single biggest challenge is a lack of reliable measurement tools and attribution. That was the number one response.
The SparkToro study I covered in my note on the free AI search audit showed that ChatGPT returns the same brand list less than 1% of the time for identical prompts. That variance alone should tell you that any single visibility metric is noise.
But there is a deeper problem:
AI search does not work like Google search. No stable SERP exists for you to point to.
No rank 1, rank 2, rank 3 persists between queries. The output changes based on the user, the session, the phrasing, and who knows how many other parameters.
That means the measurement approach has to be different, too.
You cannot port your SEO reporting model into AI search and expect it to work. You need multiple signals, from multiple sources, cross-referenced against each other.
Here is what that looks like.
The 10 metrics
Here is the (non-exhaustive) list of metrics you can use to measure your AI search performance:
1. Referral traffic from AI search engines
What it is: sessions landing on your site from ChatGPT, Perplexity, Gemini, Claude, and other AI engines. Tracked in GA4 under referral sources.
Why it matters: this is the most direct signal that AI search is sending buyers to you. If someone asks ChatGPT "what is the best tool for X" and ChatGPT links to your site, that registers in your analytics.
The catch: a big chunk of AI search traffic currently lands masked as Direct in GA4 because AI engines do not always pass referrer data. You need custom channel groupings and UTM parameters where possible to catch what you can. Even then, you are undercounting.
Author's Note: Self-reported attribution is highly recommended and can definitely help here.
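For anyone working from exported session data rather than the GA4 interface, the grouping logic can be sketched in a few lines of Python. This is a minimal illustration, not a GA4 feature: the function name and the "Direct"/"Other" labels are my own assumptions, and the host list uses only the referrers named later in this note, so extend it for other engines.

```python
from urllib.parse import urlparse

# Referrer hosts named later in this note. The mapping is illustrative;
# add other engines as you confirm their referrer domains.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "copilot.microsoft.com": "Copilot",
}


def classify_referrer(referrer: str) -> str:
    """Return an AI engine name, 'Direct' for an empty referrer, else 'Other'."""
    if not referrer:
        return "Direct"
    # hostname is None when the string has no scheme, so fall back to "".
    host = (urlparse(referrer).hostname or "").lower()
    for domain, engine in AI_REFERRER_HOSTS.items():
        # Match the domain itself or any subdomain (e.g. www.perplexity.ai).
        if host == domain or host.endswith("." + domain):
            return engine
    return "Other"
```

Sessions with an empty referrer still come back as "Direct", which is exactly the undercounting problem described above. The sketch only catches the traffic that passes referrer data.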
2. Brand mentions and visibility
What it is: how frequently your brand appears in AI-generated responses to relevant prompts. Tracked through prompt tracking tools or manual audits.
Why it matters: even when AI engines do not link to you, mentioning your brand by name influences buyer perception. A recommendation from ChatGPT carries weight regardless of whether the user clicks through.
The catch: this is where the SparkToro variance becomes relevant. Point-in-time snapshots are not reliable. You need large sample sizes run over time to see directional trends. One measurement tells you nothing.
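To make the sample-size point concrete, here is a small Python sketch that puts an uncertainty range around a mention rate. It uses a Wilson score interval, which is my choice for illustration and not something taken from any prompt tracking tool. It also assumes each prompt run is an independent yes/no observation, which is a simplification.

```python
import math


def mention_rate_interval(mentions: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for the share of runs that mention the brand.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = mentions / runs
    denom = 1 + z**2 / runs
    center = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    # Clamp to [0, 1] to absorb floating-point drift at the edges.
    return max(0.0, center - half), min(1.0, center + half)
```

With one run and one mention, the range is roughly 21% to 100%, which is another way of saying that one measurement tells you nothing. With 40 mentions across 100 runs, it narrows to roughly 31% to 50%.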
3. Share of voice
What it is: how often your brand appears relative to competitors for the same set of prompts. Expressed as a percentage.
Why it matters: knowing you appear in 40% of responses to a category prompt while your main competitor appears in 70% gives you a competitive benchmark. Tracking this over time shows whether your efforts are closing or widening the gap.
The catch: share of voice depends entirely on which prompts you choose to track. If your prompt list is biased (and it probably is), your share of voice calculation will be biased too. This metric is only as good as the prompt selection behind it.
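As a rough illustration of the arithmetic, the Python sketch below computes the percentage of responses in which each brand appears, matching the 40% versus 70% framing above. The input format (one set of brand names per response) and the brand names in the usage note are hypothetical. Some tools define share of voice as a share of total mentions instead, so treat this as one plausible reading.

```python
from collections import Counter


def appearance_rates(responses: list[set[str]]) -> dict[str, float]:
    """Percentage of responses in which each brand appears at least once."""
    if not responses:
        return {}
    # Each response is a set, so a brand counts once per response.
    counts = Counter(brand for brands in responses for brand in brands)
    return {brand: 100 * n / len(responses) for brand, n in counts.items()}
```

For four responses `[{"Acme", "Globex"}, {"Globex"}, {"Globex", "Initech"}, {"Acme"}]`, this returns 50.0 for Acme, 75.0 for Globex, and 25.0 for Initech. Change the prompt list behind those responses and every number moves, which is the bias problem in practice.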
4. Average position in AI responses
What it is: when your brand appears in a list of recommendations, where does it tend to land? First, second, fifth?
Why it matters: position bias exists in AI responses just as it does in search results. Brands mentioned higher get more attention. Tracking your average position over time shows whether you are moving up or down in AI engines' preference hierarchy.
The catch: many responses do not follow a numbered list format. Some are conversational. Some mention brands in passing without ranking them.
Author's Note: This metric only applies to a subset of AI responses, and interpreting it requires context about the response format.
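One way to respect that caveat is to report how many responses the average is based on. The Python sketch below assumes a hypothetical input where each response is either an ordered list of brand names or `None` when the response was not in list format. How you detect list format is up to your tooling and is not covered here.

```python
from typing import Optional


def average_position(
    responses: list[Optional[list[str]]], brand: str
) -> tuple[Optional[float], int]:
    """Return (mean 1-based position, number of ranked responses used)."""
    positions = [
        ranked.index(brand) + 1
        for ranked in responses
        # Skip non-list responses and lists that do not include the brand.
        if ranked is not None and brand in ranked
    ]
    if not positions:
        return None, 0
    return sum(positions) / len(positions), len(positions)
```

For `[["Globex", "Acme"], None, ["Acme", "Globex", "Initech"], ["Globex"]]` and the brand "Acme", this returns `(1.5, 2)`: an average position of 1.5 based on two of four responses. Reporting the second number next to the first keeps the subset problem visible.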
5. Citations and citation share
What it is: how often AI engines cite your content as a source in their responses. Tracked by monitoring which URLs appear in citations across AI-generated answers.
Why it matters: citations are the AI equivalent of backlinks. When Perplexity or ChatGPT with web search cites your page, it signals that your content is considered relevant (among many other things) for that topic. A high citation share means your content is part of the information architecture that AI engines rely on.
The catch: citation behavior varies wildly across AI engines. Perplexity cites aggressively. Google's AI Overviews sometimes show source cards, sometimes do not. You need to track citation share per engine, not across all engines as one number.
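A per-engine breakdown can be sketched like this in Python. The input is a hypothetical export of `(engine, cited_url)` pairs. Real prompt tracking tools will have their own formats, and the engine labels are whatever your data uses.

```python
from collections import defaultdict
from urllib.parse import urlparse


def citation_share_by_engine(
    citations: list[tuple[str, str]], domain: str
) -> dict[str, float]:
    """Percentage of each engine's citations that point at `domain`."""
    totals: dict[str, int] = defaultdict(int)
    ours: dict[str, int] = defaultdict(int)
    for engine, url in citations:
        host = (urlparse(url).hostname or "").lower()
        totals[engine] += 1
        # Count the domain itself and its subdomains (e.g. blog.example.com).
        if host == domain or host.endswith("." + domain):
            ours[engine] += 1
    return {engine: 100 * ours[engine] / total for engine, total in totals.items()}
```

Keeping the result keyed by engine avoids the blended number warned about above: an engine that cites aggressively would otherwise dominate the total.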
6. Conversions and revenue from AI search
What it is: leads, sign-ups, demo requests, or purchases that originated from AI search referral traffic. Tracked in GA4 or your CRM with proper attribution.
Why it matters: this is the metric that connects AI search to business outcomes. Traffic and mentions are nice. Revenue pays the bills.
The catch: attribution is the biggest gap here. Because so much AI search traffic arrives as Direct, you are almost certainly undercounting AI-influenced conversions. Assisted conversion paths and self-reported attribution ("how did you hear about us?") help fill the gap, but neither is perfect.
7. Sentiment
What it is: whether AI engines describe your brand positively, negatively, or neutrally when they mention you. This connects to what I call Perception Deviation.
Why it matters: being mentioned is one thing. Being described accurately and favorably is another. If ChatGPT recommends your product but adds caveats like "however, users have reported issues with pricing" or "this tool is better suited for smaller teams," that sentiment shapes the buyer's perception before they ever visit your site.
The catch: sentiment is hard to quantify at scale. You can audit it manually (which I recommend in my note on the free AI search audit), but automating sentiment tracking for AI responses is still primitive. Most tools in this space are not mature enough to do this reliably.
8. Incrementality (h/t Kevin Indig)
What it is: the revenue or pipeline that AI search generates that would not have existed without it. This is the hardest metric on the list but the most important one for proving the channel's value.
Why it matters: your CFO does not care about share of voice. They care about whether AI search investment generated revenue that would not have come in otherwise. Incrementality is how you answer that question.
The catch: measuring incrementality requires controlled experiments: geo-based holdouts, time-series analysis, matched market tests, or before-after comparisons.
Author's Note: Most companies are not set up to run these for AI search yet. But if you can isolate a cohort of customers who discovered you through AI search (via self-reported attribution) and compare their LTV to other cohorts, you start to get directional answers.
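The cohort comparison in that note can be sketched in Python as a simple group-and-average. The `(self_reported_source, ltv)` input format and the source labels are assumptions for illustration. This is a directional comparison only, not an incrementality test, because the cohorts are self-selected and not randomized.

```python
from collections import defaultdict
from statistics import mean


def mean_ltv_by_source(customers: list[tuple[str, float]]) -> dict[str, float]:
    """Average LTV per self-reported acquisition source."""
    by_source: dict[str, list[float]] = defaultdict(list)
    for source, ltv in customers:
        by_source[source].append(ltv)
    return {source: mean(values) for source, values in by_source.items()}
```

For `[("ai_search", 1200.0), ("ai_search", 800.0), ("organic", 500.0)]`, this returns 1000.0 for ai_search and 500.0 for organic. With cohorts this small the gap means little, so check cohort sizes before reading anything into it.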
9. AI crawler activity
What it is: how frequently AI crawlers (GPTBot, ClaudeBot, PerplexityBot, and others) visit your site and which pages they access. Tracked in server logs.
Why it matters: AI crawlers are the mechanism by which AI engines learn about your content. If GPTBot is crawling your pricing page weekly, that content is likely being ingested and used in responses. If certain pages get zero crawler visits, they probably do not exist in the AI's knowledge of your brand.
The catch: crawler activity tells you about ingestion, not about output. Just because GPTBot crawls your page does not mean ChatGPT will surface that content in responses. But it is a necessary precondition. No crawl means no chance of appearing.
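Counting crawler hits per page from server logs can be sketched as follows. This Python example assumes logs in the common "combined" access log format, where the user agent is the last quoted field. It matches only the three crawler names mentioned above, so adjust the regex and token list for your own setup. User agent strings can be spoofed, so treat the counts as approximate.

```python
import re
from collections import Counter

# Crawler tokens named in this note; extend as needed.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Assumes combined log format: "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def crawler_hits(lines: list[str]) -> Counter:
    """Count requests per (crawler, path) for known AI crawler user agents."""
    hits: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # Skip lines that do not fit the assumed format.
        user_agent = match.group("ua").lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in user_agent:
                hits[(token, match.group("path"))] += 1
                break
    return hits
```

Grouping the same counts by week shows crawl frequency per page over time. Pages that never show up in the output are the ones worth investigating.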
10. AI Overview impressions
What it is: how often your pages appear in Google's AI Overviews (formerly SGE). Tracked in Google Search Console.
Why it matters: AI Overviews are one of the most visible AI search features on the internet right now because they sit at the top of Google search results. If your content is being pulled into AI Overviews, you are getting massive exposure to searchers even if they never click through to your site.
The catch: Google Search Console data for AI Overviews is still limited. You can see impressions (which, of course, are of questionable accuracy), but click-through data is incomplete.
And the relationship between appearing in an AI Overview and appearing in standalone AI search engines (ChatGPT, Perplexity) is unclear. They are related but separate channels.
Triangulating across data sources
Another thing to keep in mind is that no single tool gives you all 10 metrics. You need to pull from multiple data sources and cross-reference them. Here is how we do it at Minuttia:
Prompt tracking tools give you metrics 2, 3, 4, and 5 (brand mentions, share of voice, average position, citations). But remember the variance problem. Use large sample sizes and look for trends over time, not point-in-time snapshots.
GA4 and web analytics give you metrics 1 and 6 (referral traffic, conversions). Set up custom channel groupings to separate AI search traffic from Direct. Add self-reported attribution fields to your lead forms or onboarding experience.
Server logs give you metric 9 (AI crawler activity). Parse your logs for known AI crawler user agents. Track crawl frequency by page to understand which content is being ingested.
Brand mention monitoring tools give you a different angle on metric 2 and metric 7 (mentions and sentiment). These tools scan the web for brand references, including AI-generated content that gets published or shared.
Bing Webmaster Tools gives you data on how Microsoft's AI features (Copilot, Bing Chat) interact with your content. Since Bing's infrastructure powers several AI experiences, this data is more relevant than most people realize.
Google Search Console gives you metric 10 (AI Overview impressions) plus traditional search data that provides context for how AI search performance relates to organic search performance.
Author's Note: No single source is reliable on its own. But when referral traffic is up, crawler activity is increasing, prompt tracking shows improved visibility, and self-reported attribution mentions AI engines more frequently, you have convergent evidence that your AI search efforts are working.
The attribution reality
I want to be direct about the biggest gap in this entire framework: attribution.
AI search traffic is systematically undercounted in most analytics setups.
When someone asks ChatGPT a question and ChatGPT recommends your product, one of three things happens:
The user clicks a link in the response (counted as referral, if the referrer is passed).
Or opens a new tab and types your URL directly (logged as Direct).
Or Googles your brand name (attributed to Organic Search).
In the second and third scenarios, AI search gets zero credit.
This means your GA4 data is almost certainly undercounting AI-influenced traffic by a significant margin.
What you can do about it:
Create custom channel groupings in GA4 that isolate known AI search referrers (chatgpt.com, perplexity.ai, claude.ai, copilot.microsoft.com, and others). This catches the traffic that does pass referrer data.
Author's Note: GA4 recently added a channel that tracks referral traffic from AI search called AI Assistant. So, setting up your own channel may be redundant at this point.
And as I mentioned a couple of times already, you should also add a "how did you hear about us?" field to your lead forms or to your onboarding experience, with AI search as an explicit option.
Self-reported attribution catches the brand searches and direct visits that analytics miss.
Look at assisted conversion paths. Even if AI search is not the last click, it may appear earlier in the journey as a referral or as a brand search that followed an AI recommendation.
Build reports in Looker Studio (formerly Data Studio) or your BI tool of choice that blend your content inventory with GA4 referral data. This lets you see which pages are receiving AI search traffic and correlate that with your prompt tracking data showing which pages AI engines cite.
To be clear:
None of this is perfect. But waiting for perfect attribution is the same as choosing to fly blind. Directional data is better than no data.
Final Thoughts
Measurement for AI search is messy. I will not pretend otherwise.
But the companies that build imperfect-but-directional measurement systems now will have a significant advantage over the ones that wait for the industry to figure it out.
Because by the time clean, standardized measurement exists, the early movers will have months or years of trend data to inform their strategy.
Start with what you can track today. Referral traffic, crawler logs, brand mention audits and self-reported attribution. Layer on prompt tracking data (with appropriate skepticism about its precision!).
And cross-reference everything. The signal is in the convergence, not in any single number.
This list will evolve. I will revisit it as the tools and the data get better.
But for now, these 10 metrics, triangulated across six data sources, are the best framework I have for answering the question: is our AI search effort working?
Thank you for reading today's note, and see you again next week.
Research Disclaimers and Limitations
GrowthWaves and its author are not sponsored by or compensated by any company mentioned in this note. This is independent editorial analysis and does not constitute investment, financial, or legal advice. The author may have relationships with, work with, or hold equity in companies referenced; however, no content in this piece was influenced, commissioned, or incentivized by any such relationship. AI tools were used as a research assistant in the preparation of this piece. All claims are sourced and linked throughout.
Sources
Minuttia x Growth Memo, "The State of AEO: What 599 Marketers Told Us About AI Search"
GrowthWaves, "What's actually wrong with prompt tracking"
GrowthWaves, "The free AI search audit nobody is running"
GrowthWaves, "Perception Deviation: The most important metric you're not tracking"
SparkToro, "NEW Research: AIs are highly inconsistent when recommending brands or products"
George Chasiotis, "2-Day Intensive AEO Course"



