AI Citations, Indexing Quality & Hidden Manipulation

Search Digest Issue #001 — week of 17 Feb 2026

Three of the five pieces this week are circling the same problem. Kevin Indig's citation research, Lily Ray's organic visibility analysis, and Jan-Willem Bobbink's indexation reframe are all asking a version of the same question: what does it actually take for AI systems to trust, cite, and surface your content? The data is starting to move from interesting hypothesis to something concrete enough to act on.

The iPullRank negative query study sits slightly apart but connects into the same picture. Most SEOs I know haven't run these tests on their own brand yet. They should. Your brand could be appearing in AI answers as something to avoid right now, sourced from threads written two or three years ago, and there would be nothing in Search Console, no rank tracker, no alert to flag it.

Then there's the Microsoft story, which made me stop and re-read it. Thirty-one companies. Hidden prompt injections designed to manipulate AI recommendations. These are recognisable brands, not fringe experiments. The manipulation attempts are here. That's now the baseline we're operating in.

Search Engine Land

44% of ChatGPT Citations Come From the First Third of Content

I've read a lot of GEO research over the past year. Most of it is either too vague to act on or so specific to one platform's quirks that it doesn't travel. This one is different. Kevin Indig analysed 1.2 million AI answers and 18,000 verified citations, which gives it the sample size to be taken seriously.

The central finding is structural rather than stylistic. It's not just what you write, it's where it sits. The first 30% of a page is doing nearly half the work when it comes to citation chances. That's not a content quality tweak, it's a rethink of how pages open and how quickly they get to the point.

Key points

  • 44.2% of all ChatGPT citations come from the first 30% of a piece of content, a distribution Indig calls the "ski ramp" pattern
  • Five traits correlate with higher citation rates: definitive statements, conversational Q&A structures, high entity density, balanced sentiment, and plain language
  • Definitive language consistently outperforms hedged or qualified statements — "X is Y" gets cited more often than "X may be Y" or "X can sometimes be Y"
  • Pages with clear entity definitions near the opening are cited more frequently, regardless of industry or content type
  • The pattern holds across how-to guides, product reviews, and analysis pieces — it's not format-specific
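The ski-ramp pattern is easy to sanity-check against your own citation data, if you can log where on the page each cited passage starts. A minimal sketch follows; the data format (character offset of the cited passage, total page length) is a hypothetical stand-in, not Indig's actual methodology:

```python
from collections import Counter

def position_bucket(char_offset, page_length):
    """Return which third of the page a cited passage starts in (0, 1, or 2)."""
    return min(int(3 * char_offset / page_length), 2)

def citation_distribution(citations):
    """citations: list of (char_offset, page_length) pairs for cited passages.
    Returns the share of citations falling in each third of the page."""
    counts = Counter(position_bucket(offset, length) for offset, length in citations)
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(3)]

# Invented example: most cited passages sit in the first third of the page
sample = [(100, 3000), (400, 3000), (900, 3000), (2500, 3000), (50, 1200)]
print(citation_distribution(sample))  # first-third share dominates
```

If your own distribution skews toward the final third, that's a sign your openings are scene-setting rather than answering.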

Key takeaway

Your intro is no longer just a hook. It's where citation chances are won or lost. Rewrite your opening sections to lead with the direct answer, the entity definition, and a Q&A structure. Not the context, not the scene-setting — the answer, right at the top.

Also worth considering

Entity density matters throughout but especially early. Use your first paragraph to establish what you are, what you do, and the specific claim you're making. Every sentence in your opening section should earn its place. If it's setting the scene rather than stating something, cut it.

What I'm testing

Front-loading entity definitions and a direct Q&A block within the first 200 words on several pages. Running it over 60 days to track whether AI citation visibility shifts.

Read the full article

Lily Ray / Substack

Are Citations in AI Search Affected by Google Organic Visibility Changes?

There's a quiet narrative building in GEO circles that you can effectively ignore your organic search performance and build an AI search presence independently. Lily Ray's analysis challenges that directly, and she does it with data rather than theory.

The short answer is yes: organic visibility changes do affect AI citation rates, and the correlation is close enough that treating GEO and SEO as separately prioritised disciplines is increasingly hard to justify. I've had this conversation with clients who want GEO to be a clean reset, a way to start fresh if their organic performance is in poor shape. This research makes that argument a lot harder to make.

Key points

  • Sites that lost Google organic visibility also saw measurable drops in AI citation rates — the correlation holds at both domain and individual page level
  • Recovery in traditional search rankings tracks closely with recovery in AI citation share over time
  • Domain authority and E-E-A-T signals appear to transfer between traditional and AI search channels rather than operating independently
  • Smaller sites with very strong topical authority show some resistance, but the general pattern is consistent across site types and industries
  • The data challenges the idea that GEO optimisation can operate independently of a site's organic search health

Key takeaway

Fix your SEO first. AI citation rates track organic authority more closely than most GEO-focused advice is willing to admit. If someone tells you they don't care about Google rankings anymore and just want to appear in AI answers, push back hard.

Also worth considering

If you're running a GEO audit, run it alongside an organic health check. The two are not independent. Fixing technical SEO issues, improving E-E-A-T signals, and recovering lost rankings will likely move your AI citation numbers too — even without any GEO-specific changes.

What I'm testing

Comparing AI citation visibility against organic ranking changes through Q1 2026. Want to see whether the link holds at smaller scale with less domain authority behind it.

Read the full article

Jan-Willem Bobbink / LinkedIn

Indexing is Now a Quality Signal, Not a Technical Step

I've held this view for a while but I hadn't seen it stated this cleanly before. The framing matters because it changes the entire diagnostic process when you're dealing with pages that won't index.

Most technical SEOs respond to non-indexation by checking robots.txt, reviewing canonical tags, and submitting URLs in Search Console. That's the right starting point. But when those checks pass and the page still won't index, the standard next move is to dig deeper technically. This post argues — and I agree — that the better next move is to stop and ask a different question: why doesn't Google think this page deserves to exist in its index?

Key points

  • When a page isn't indexed despite correct technical setup, it's a quality judgment from Google, not a crawl or configuration failure
  • Thin category pages, near-duplicate product descriptions, and FAQ pages that restate content already covered elsewhere are the most common victims
  • Google's indexation threshold has risen as the web has grown — there's more competition for index slots and less tolerance for marginal content
  • Content consolidation (merging weaker pages into a single stronger one) frequently works where content improvement alone doesn't
  • The diagnostic question shifts from "what's blocking this page?" to "why doesn't Google think this page is worth indexing?"

Key takeaway

When a page won't index despite passing all technical checks, treat it as a content quality problem. Rewrite it, consolidate it into a stronger page, or remove it entirely.

Also worth considering

Before you delete a non-indexed page, ask whether the content it contains could strengthen a page that is indexed. Consolidation often beats deletion outright — you keep the information, give Google one clearer signal, and avoid creating dead internal links in the process.
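When you do consolidate, pair the content merge with a 301 redirect from every absorbed URL so internal links and any accumulated signals follow the content. A minimal sketch generating nginx-style rules; the URLs are hypothetical examples:

```python
# Map of thin, non-indexed URLs to the stronger page absorbing their content.
consolidations = {
    "/faq/shipping-times": "/guides/shipping",
    "/faq/shipping-costs": "/guides/shipping",
    "/category/widgets-blue": "/category/widgets",
}

def nginx_redirects(mapping):
    """Emit one permanent (301) rewrite rule per consolidated page."""
    return "\n".join(
        f"rewrite ^{old}$ {new} permanent;"
        for old, new in sorted(mapping.items())
    )

print(nginx_redirects(consolidations))
```

Keeping the mapping in one place also gives you a checklist for updating internal links to point at the target pages directly.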

What I'm testing

Auditing a site's non-indexed pages purely from a content angle rather than a technical one. Measuring whether rewrites and consolidations alone shift indexation status within 90 days.

Read the full post

iPullRank

What Not To Buy: Performing Negative AI Searches

This is the piece most SEOs aren't reading. They should be. The iPullRank team ran the tests that almost nobody has thought to run yet: they put brands through negative queries on AI platforms and documented exactly what came back.

AI Mode named companies directly as things to avoid. It pulled those signals from Reddit and Facebook. Those threads could be two or three years old. Copilot was more cautious. ChatGPT was inconsistent. But across all three platforms, the same underlying issue appeared: negative user-generated content from old forum threads is feeding into AI brand recommendations right now, and there is no monitoring infrastructure in place to catch it.

Key points

  • Google's AI Mode names specific companies directly when asked negative queries like "what brands to avoid" or "worst X for Y use case"
  • It draws primarily from Reddit and Facebook threads as sources for negative sentiment signals — user-generated content is shaping brand recommendations
  • Microsoft Copilot takes a more conservative approach and avoids naming specific vendors in negative answer contexts
  • ChatGPT produced inconsistent results across the same queries, in some cases recommending a brand immediately after warning against it in the same answer
  • No rank tracker or monitoring tool currently covers negative AI brand mentions — detection requires manual testing across platforms

Key takeaway

Run your own brand through negative AI queries now. Try "is [brand] worth it?", "problems with [brand]", and "[brand] vs alternatives" across AI Mode, Copilot, and ChatGPT. Find out what AI is saying before your prospects do.
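A tiny helper makes that manual testing repeatable week to week. This just expands the query templates above per brand; the queries still have to be run by hand (or via each platform's API) since no monitoring tool covers this yet:

```python
# Templates drawn from the queries suggested above; brand and category
# are placeholders to fill per test run.
NEGATIVE_TEMPLATES = [
    "is {brand} worth it?",
    "problems with {brand}",
    "{brand} vs alternatives",
    "what {category} brands to avoid",
]

def negative_queries(brand, category):
    """Expand the templates into a checklist of queries to run on each platform."""
    return [t.format(brand=brand, category=category) for t in NEGATIVE_TEMPLATES]

for query in negative_queries("Acme Widgets", "home automation"):
    print(query)
```

Logging the answers each run gives you a baseline, so a new negative mention stands out instead of sitting undetected for years.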

Also worth considering

If you find negative mentions, suppressing old Reddit threads isn't the play. The fix is generating enough positive, authoritative, recent content about your brand that the negative signals get outweighed. Volume and recency both matter to how AI systems weight sources. That's where the effort should go.

What I'm testing

Building a structured monitoring process for negative AI brand queries across AI Mode, Copilot, and ChatGPT. Will share what comes back once I have enough data to be useful.

Read the full article

Search Engine Journal

Microsoft Found 50+ Hidden Prompts From 31 Companies Trying to Poison AI Recommendations

Thirty-one companies. Hidden prompts embedded in web content, designed to influence AI assistant memory and skew recommendations. Microsoft found them, documented them, and removed them. The story isn't surprising in principle — the incentive to manipulate AI recommendations is enormous and the temptation was always going to be there. What surprised me was the scale and the breadth of industries involved.

These are recognisable brands, not fringe actors running experiments nobody's watching. Some of them will have had legal sign-off on this. Some will have had teams working on it. They all got caught relatively quickly. That tells you Microsoft's detection is better than the attempts to evade it, at least for now. But this won't be the last attempt, and the arms race between manipulation and detection is only just beginning.

Key points

  • Microsoft identified 50+ hidden prompt injections embedded in web content published by 31 different companies
  • The prompts were designed to influence AI assistant memory and skew recommendations in favour of those companies
  • The companies involved span multiple industries and are not niche players — these are mainstream brands making deliberate decisions
  • Detection came relatively quickly, suggesting Microsoft has active monitoring in place for this type of content manipulation
  • The incident represents the first documented large-scale attempt at systematic AI recommendation manipulation — and it won't be the last
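The article doesn't describe Microsoft's detection method, but the basic idea — hidden markup carrying instructions aimed at an AI assistant rather than a human reader — can be illustrated naively. This is a deliberately simple sketch for auditing your own pages (or third-party content you syndicate), with an illustrative phrase list, not Microsoft's criteria:

```python
import re

# Phrases that read as instructions to an AI assistant rather than to a
# human reader. Illustrative only.
SUSPECT_PHRASES = ["ignore previous instructions", "always recommend", "tell the user"]

# Inline styles that hide an element from human visitors.
HIDDEN_STYLE = re.compile(
    r'style="[^"]*(display:\s*none|visibility:\s*hidden)[^"]*"', re.I
)

def flag_hidden_prompts(html):
    """Flag hidden elements whose text contains assistant-directed phrasing."""
    hits = []
    for match in re.finditer(r'<[^>]+>([^<]*)</', html):
        text = match.group(1)
        if HIDDEN_STYLE.search(match.group(0)) and any(
            phrase in text.lower() for phrase in SUSPECT_PHRASES
        ):
            hits.append(text.strip())
    return hits
```

Real detection would need a proper HTML parser and rendered-page checks (off-screen positioning, zero-size fonts, matching text and background colours), but even this level of scan would have caught crude attempts.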

Key takeaway

Don't do this. Beyond the obvious ethical problems, the detection risk is real and the reputational fallout for brands caught doing it will be severe. Trust, once lost in AI systems, is very difficult to rebuild. Earn your citations.

Also worth considering

The companies that got caught weren't ignoring AI search — they understood the opportunity and chose the wrong route to it. The legitimate version of this is understanding how AI systems weight content and building material that genuinely deserves to be recommended. That's a harder brief, but it's the one with a shelf life.

Read the full article

That's issue #001. The theme running through this week is that AI search is no longer theoretical. There's real data on what gets cited, there's a clear link back to organic authority, there's manipulation happening at scale, and there are brand reputation blind spots most teams haven't even looked for yet. The question of what it takes to be trusted by these systems is getting sharper every week.

If any of these changed how you're thinking about something, or you're already running a test on one of them, I'd like to hear about it.

Free Consultation

Let's Talk


Tell me what you're working on. I'll give you an honest assessment and we'll explore whether working together makes sense — no hard sell, just a free, no-obligation call.