What "Material Finding" Means in an AI Visibility Diagnostic: The Three-Criteria Test

By Jordan Ellison

The Question Behind "What Did You Actually Find?"

Any analytical report a CHRO is asked to circulate to a CMO, a CFO, or a board faces one quiet test before it travels: is this a finding, or is this commentary? The distinction decides whether the document gets forwarded with a one-line endorsement or gets quietly closed after the executive summary.

A hiring-specific AI visibility Diagnostic measures how the four leading AI models -- ChatGPT, Claude, Gemini, and Perplexity -- describe a company when candidates research what it is like to work there. The output is a written report of findings. The word "finding" is doing real work in that sentence, and it is worth defining precisely, because the value of the report is entirely a function of how high the bar sits.

A material finding is an observation in the final report that satisfies all three of the following criteria. If any one of the three fails, the item is an observation, not a finding, and it does not count.

  1. Specific named issue -- the finding identifies a specific query, persona, model, competitor, or citation source.
  2. Data evidence -- the finding cites at least one captured model response or citation, not generic commentary.
  3. Actionable category -- the finding indicates a remediation direction.

Three criteria. All three required. The rest of this post works through each one with a positive and a negative example, then explains why the bar is published rather than held privately.
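For readers who think in code, the all-three-or-nothing rule can be sketched as a small check. This is an illustrative sketch only, not the Diagnostic's actual tooling: the `Item` class, its field names, and both function names are hypothetical, and a non-empty field stands in for the human judgment each criterion really requires.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """One candidate item from the analysis. All field names are hypothetical."""
    statement: str
    # Criterion 1: the named query, persona, model, competitor, or citation source.
    named_issue: Optional[str] = None
    # Criterion 2: captured model responses or citations backing the statement.
    evidence: List[str] = field(default_factory=list)
    # Criterion 3: the remediation category the issue falls into.
    category: Optional[str] = None


def failed_criteria(item: Item) -> List[str]:
    """Return the names of the criteria this item does not satisfy."""
    failures = []
    if not item.named_issue:
        failures.append("specific named issue")
    if not item.evidence:
        failures.append("data evidence")
    if not item.category:
        failures.append("actionable category")
    return failures


def is_material_finding(item: Item) -> bool:
    """An item is a finding only if no criterion fails; otherwise it is an observation."""
    return not failed_criteria(item)
```

The point of returning the list of failures rather than a bare yes or no is that it mirrors how the test is used: an item that misses one criterion is still an observation, but it is clear which gap would have to be closed to promote it.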

Criterion One: A Specific Named Issue

The first test is specificity. A finding has to name the thing. It points at a particular query, a particular candidate persona, a particular model, a particular competitor, or a particular citation source. Vagueness fails this criterion automatically, because a vague statement cannot be acted on and cannot be verified.

Consider a revenue-organization example. A company hires senior account executives and wants to understand its AI visibility for that persona.

Fails criterion one: "The company's AI presence for sales roles could be stronger." This is true of almost every company. It names nothing. There is no query, no model, no source, no persona granularity. A reader cannot act on it and cannot check it.

Passes criterion one: "For the query 'what is the sales culture like at [company]' run against the senior account executive persona, three of the four models returned an answer anchored on a single 2024 review thread, and one model returned no company-specific answer at all." This names the query, the persona, the model behavior, and the source. It can be acted on and it can be verified.

The discipline here is that specificity is not optional polish. It is the load-bearing property that makes everything downstream possible. A report full of true-but-vague statements is a horoscope. A report of specific named issues is an operating document.

Criterion Two: Data Evidence

The second test is evidence. A finding cites at least one captured model response or citation. It does not assert; it shows. The captured response is the proof, and it is what allows a skeptical executive to read the finding and the underlying material side by side.

Consider a healthcare example. A regional hospital system is recruiting experienced nurses and wants to understand how AI frames the system as an employer.

Fails criterion two: "AI tends to describe the system's culture in mixed terms." This is an interpretation with nothing under it. Whose interpretation? Drawn from what? A reader has to take it on faith, and executives do not forward documents they have to take on faith.

Passes criterion two: "When asked 'is [system] a good place to work as a nurse,' two models returned answers that cited staffing-ratio discussions from a named clinician community thread, and one cited a regional news article about a 2023 labor action. The captured responses are reproduced in Appendix B, findings 14 through 16." Now the interpretation rests on captured material the reader can inspect. The finding is falsifiable, which is exactly what makes it trustworthy.

Evidence is also what separates a measurement deliverable from a point of view. Anyone can have a point of view about how AI talks about a company. A finding has to carry the receipts.

Criterion Three: An Actionable Category

The third test is direction. A finding indicates a remediation category -- it tells the reader what kind of thing could be done about it, even if the specific plan is downstream work. A finding without a direction is a diagnosis with no treatment axis: interesting, possibly alarming, and operationally inert.

Common actionable categories include: a citation source is missing, the narrative contradicts across models, a persona is invisible, a competitor is the more cited answer, a cited source is stale. The finding does not have to prescribe the full fix. It has to place the issue in a category that has a known remediation direction.

Consider an engineering-hiring example. A software company is recruiting staff-level engineers.

Fails criterion three: "AI rarely surfaces the company's engineering content." A reader nods and then asks the only question that matters -- so what would we do? -- and the statement has no answer built into it. It does not say whether the content does not exist, exists but is not being cited, or exists and is cited but outranked by a competitor. Three completely different actions hide behind the same sentence.

Passes criterion three: "For staff-engineer discovery queries, the company's functional engineering content exists but is not appearing in AI synthesis; the cited technical surfaces are third-party community threads instead. Category: citation-source mismatch -- the company publishes into surfaces AI is not reading for this persona." Now the reader knows the direction: the issue is not absence of content, it is a mismatch between where the content lives and where AI reads. That is a different project from writing more content, and naming the category is what makes the difference visible.

The Line Between a Finding and an Observation

Put the three criteria together and a clean line appears. An observation is something true about the AI surface. A finding is something specific, evidenced, and actionable. Most of what surfaces during analysis starts life as an observation; the work of the Diagnostic is to push each one across the line or set it aside.

The line matters most at the moment of circulation. When a CHRO forwards a report to a CMO or presents two slides of it to a board, every item on the page is going to be read by someone whose instinct is to test it. An observation invites the response "interesting, but is that actually true, and what would we even do?" A finding has already answered both halves of that question on the page. The report survives the room because each item was built to.

This is also why the count is what it is. The Diagnostic commits to surfacing at least 10 material findings, and the commitment is to findings -- items that clear all three criteria -- not to observations padded into a deck. Every Diagnostic Report includes a Finding Audit Appendix that numbers each finding and enumerates how it satisfies the three criteria, so the buyer never has to interpret the definition or adjudicate the count. The audit is presented in the deliverable, finding by finding.
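The counting logic is simple enough to sketch. What follows is a hypothetical illustration of how an appendix like this could be assembled, not a description of the real one: the dictionary keys, the `build_audit` function, and the shape of its output are all assumptions. Only the threshold of 10 comes from the commitment itself.

```python
from typing import Dict, List, Tuple

MIN_MATERIAL_FINDINGS = 10  # the published commitment: at least 10 material findings

# Hypothetical keys, one per criterion: named issue, captured evidence, remediation category.
CRITERIA_KEYS = ("named_issue", "evidence", "category")


def build_audit(items: List[Dict]) -> Tuple[List[Dict], bool]:
    """Number the items that clear all three criteria and check the count.

    Items missing any criterion are observations and are left out entirely,
    so they can never pad the total.
    """
    findings = [item for item in items if all(item.get(key) for key in CRITERIA_KEYS)]
    appendix = [{"number": n, **finding} for n, finding in enumerate(findings, start=1)]
    commitment_met = len(appendix) >= MIN_MATERIAL_FINDINGS
    return appendix, commitment_met
```

In this sketch an observation cannot inflate the number, because the filter runs before the numbering does. Nine strong findings and any quantity of observations still count as nine.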

Why the Bar Is Published

There is a reason to write the definition down in public rather than hold it as internal craft. Publishing the bar does two things at once.

First, it lets a buyer audit the deliverable against a standard that existed before the report was written. The criteria are not retrofitted to whatever the analysis happened to produce. They are fixed in advance, applied uniformly, and shown in the appendix. A buyer can check the work.

Second, it converts a soft promise into a hard one. The guarantee attached to the Diagnostic is specific: full refund if it surfaces fewer than 10 material findings. That sentence only means something if "material finding" means something, which is why the definition has to be precise, public, and applied the same way every time. A guarantee on a fuzzy noun is marketing. A guarantee on a defined noun is an obligation.

The bar is the product, in other words, as much as the report is. A buyer is not paying for ten statements about AI. A buyer is paying for ten items that each name a specific issue, carry captured evidence, and point at a remediation direction -- and for the discipline that keeps everything below that line out of the document.

That is what "material finding" means. Specific, evidenced, actionable. All three, every time, audited in the appendix, and counted against a number the buyer can hold us to.