SEO Is Not That Hard

Entities Part 6: Fighting AI Fiction: Grounding Your Brand in Reality

Edd Dawson Season 1 Episode 326

Send us a text

Tired of confident AI answers that crumble under scrutiny? We pull back the curtain on why large language models hallucinate—and how to stop the damage by turning your website into a source AIs can safely cite. Instead of treating models like fact vaults, we treat them like brilliant writers who need trustworthy notes. That shift unlocks a practical playbook: ground responses in real documents, use retrieval‑augmented generation, and structure your content around clear, unambiguous entities.

We break down RAG in plain terms: retrieval first, generation second. Think open‑book exam, where you get to choose the book. When your policies, performance metrics, and definitions are precise and easy to retrieve, AIs pick your pages to anchor their answers. That means replacing “unparalleled performance” with “10,000 records per second, 20% faster than the previous version,” mapping canonical names for products and features, and supporting claims with dates, units, and links. We share cautionary tales—from invented airline policies to fake case law—and translate them into concrete steps any team can take to reduce risk and increase trust.

The bigger win is strategic. As organisations build internal copilots and external chat experiences, they’ll prioritise ingesting domains with reliable, machine‑readable knowledge. This authority economy rewards brands that publish clean, verifiable, entity‑rich content. We walk through a simple content audit you can run this week, how to align claims across your site, and why release notes, policy pages, and structured data make you more “retrievable.” By the end, you’ll know how to write for humans and machines at the same time—and how to become the default reference in your niche.

If this helped you think differently about content and AI, follow the show, share it with a teammate, and leave a review. Got a question you want answered on air? Send a voice note from the link in the show notes.

SEO Is Not That Hard is hosted by Edd Dawson and brought to you by KeywordsPeopleUse.com

Help feed the algorithm and leave a review at ratethispodcast.com/seo

You can get your free copy of my 101 Quick SEO Tips at: https://seotips.edddawson.com/101-quick-seo-tips

To get a personal, no-obligation demo of how KeywordsPeopleUse could help you boost your SEO, and a 7 day FREE trial of our Standard Plan, book a demo with me now

See Edd's personal site at edddawson.com

Ask me a question and get on the show: Click here to record a question

Find Edd on LinkedIn, Bluesky & Twitter

Find KeywordsPeopleUse on Twitter @kwds_ppl_use

"Werq" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
http://creativecommons.org/licenses/by/4.0/

SPEAKER_00:

Hello and welcome to SEO Is Not That Hard. I'm your host, Edd Dawson, the founder of the SEO intelligence platform KeywordsPeopleUse.com, where we help you discover the questions people ask online and then how to optimise your content to build traffic and authority. I've been in SEO and online marketing for over 20 years and I'm here to share the wealth of knowledge, hints and tips I've amassed over that time.

Hello and welcome back to SEO Is Not That Hard, it's me here, Edd Dawson, as always. Today we're on to the next episode in our entity series, and this one is about fighting AI fiction: grounding your brand in reality. In the last episode, we looked into how a large language model works, and we learned that an LLM like ChatGPT, Perplexity, Claude or Gemini is not a database of facts but a giant statistical prediction engine. It represents entities and concepts as coordinates on a vast multidimensional map of meaning, and it reasons by finding the most probable path from one point to another depending on the input prompt. Now, this ability to find and extend patterns is what gives LLMs their incredible power for creativity and insight, but it also leads directly to their single biggest, most dangerous flaw, a problem that has resulted in legal sanctions, reputational damage, and some truly bizarre and incorrect answers being presented with perfect confidence.

So today we're going to talk about these AI hallucinations and explore why they happen. More importantly, we'll discuss the strategy that allows you to position your website as a source of truth, one that helps ground those AIs in reality and helps reduce and prevent those hallucinations.

So what exactly is an AI hallucination? A hallucination is when an LLM generates information that sounds completely plausible and is delivered in a really confident tone as a factual statement, but is partially or entirely made up. And it's crucial to understand why this happens. Remember, an LLM's core function is to predict the next most likely word. It's a pattern completer, not a fact checker. So when the model is asked a question and encounters a gap in its training data, a topic it doesn't have really good, robust information on, it doesn't just stop and say "I don't know." Its very nature compels it to fill that gap by generating the most statistically probable sequence of words, regardless of their connection to factual reality.

And this isn't just a quirky little bug; it has serious, real-world consequences. In a now quite famous case, a chatbot for Air Canada completely invented a bereavement fare policy when a customer asked about it. The customer booked a flight based on this information, and when the airline refused to honour the non-existent policy, the customer sued. A court later forced Air Canada to honour the policy that its chatbot had fabricated. Then there's another high-profile incident where two lawyers in New York faced legal sanctions after they submitted a legal brief that cited several entirely fake court cases. Where did they get those cases? They were generated by ChatGPT, complete with convincing-sounding legal citations. These examples highlight the core vulnerability of the technology: without a connection to a verifiable source of truth, an LLM's output can be dangerously unreliable. So how do we fight this? How do the LLMs fight this?
And how do we harness the power of LLMs while mitigating this risk of misinformation? The primary solution is a strategy called grounding, a word you'll hear come up more and more in the realm of LLMs. The idea is to ground the LLM's response in an external, verifiable source of truth. So rather than allowing it to rely solely on its internal, potentially flawed memory and the way it predicts and makes connections, grounding gives it something against which to verify the information it's talking about.

The most important and widely adopted technique for this is called retrieval-augmented generation, or RAG. The best way to understand RAG is to think of it as giving the AI an open-book exam. A standard LLM prompt is like a closed-book exam: you ask it a question and it has to answer based only on what it knows from its training, and this is where the hallucinations can happen. A RAG system is different; it works in two phases, retrieval and then generation.

First is the retrieval phase. When you submit a query, the system doesn't immediately send it to the LLM. Instead, it first uses the query to search through a trusted, up-to-date knowledge base. That could be a company's internal documentation, a specific set of academic papers, or the content of an entire website. It could even be a Google search, where it goes to Google, runs some searches, and pulls information from the websites that rank well for terms around the topic you're asking about. It retrieves the most relevant snippets of information related to your question from those sources.

Second is the generation phase. This is where the system takes those retrieved factual snippets, augments the original prompt you gave it, and essentially says to the LLM: here's the user's question, here are some relevant facts I found from trusted sources; now, using these facts, please generate an answer. This simple, powerful process massively improves things. It forces the AI to base its answers on the provided, verifiable content, dramatically reducing the chance of hallucinations and ensuring the information is current.

And this is where everything we've been talking about in this series comes together for you as a website owner. In a world powered by RAG systems, what is the most valuable asset? It's the book the AI uses for its open-book exam: the trusted external knowledge base. And for your industry, your website has the potential to be that definitive source of truth. This is why factual accuracy and unambiguous language are becoming much more important on the web. Vague marketing claims aren't just unhelpful to users, they're poison for AI systems. An LLM can't reliably parse and understand a statement like "our product offers unparalleled performance." It's subjective, and it contains no verifiable information. But a statement like "our product processes data at 10,000 records per second, a 20% improvement on the previous version" is much, much better. It's clear, it's factual, and it's an unambiguous piece of data that a RAG system can retrieve and use to answer a user's question with much more confidence.
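To make that two-phase flow a little more concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: the sample knowledge base, the simple word-overlap scorer, and the prompt wording are all made up for the example, and a production RAG system would use an embedding index and a real LLM API call, but the retrieve-then-generate shape is the same.

```python
# A minimal, self-contained sketch of the RAG flow described above.
# The documents and the word-overlap scorer are illustrative assumptions;
# a real system would use an embedding index and an actual LLM API call.

KNOWLEDGE_BASE = [
    "Our product processes data at 10,000 records per second, "
    "a 20% improvement on the previous version.",
    "Refund policy: customers may cancel within 14 days of purchase "
    "for a full refund.",
    "We have served over 5,000 customers in our sector since 2015.",
]


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Phase 1 -- retrieval: score each document by word overlap with the
    query and return the best matches."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def build_grounded_prompt(query: str, snippets: list[str]) -> str:
    """Phase 2 -- generation: augment the user's question with the retrieved
    facts so the LLM answers from them rather than from memory."""
    facts = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer the question using ONLY the facts below. "
        "If the facts don't cover it, say you don't know.\n\n"
        f"Facts:\n{facts}\n\nQuestion: {query}\nAnswer:"
    )


if __name__ == "__main__":
    question = "How fast does the product process records?"
    snippets = retrieve(question, KNOWLEDGE_BASE)
    prompt = build_grounded_prompt(question, snippets)
    print(prompt)  # this grounded prompt is what would be sent to the LLM
```

Notice that the clear, factual statement about records per second is exactly the kind of snippet the retriever can pick up and hand to the model; a claim like "unparalleled performance" gives it nothing usable.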
And this is where entities play a really critical role. When a user asks a question, the RAG system uses entity linking to identify the specific concepts that need to be looked up in the knowledge base, ensuring the retrieved information is highly relevant. So a well-structured website, rich with clearly defined entities, makes this retrieval process faster and more accurate.

Now, there are other, more complex strategies that the LLM providers use, like post-processing systems that extract all the factual assertions the LLM made in its output and then compare them against ground-truth sources, in effect doing the same thing in reverse. But the principle is the same: connecting the AI to verifiable sources of facts.

So let's zoom out and look at the bigger picture. This isn't just about optimising for public chatbots like ChatGPT. The widespread challenge of large language model unreliability is creating what some are calling a new authority economy. In this new economy, the most valuable digital assets are no longer just high-traffic websites, but clean, well-structured, verifiable databases of entity-based knowledge. Businesses everywhere are building their own internal and external RAG systems to power their customer service bots, their internal search tools, and things like data analysis pipelines. And when they build those systems, they will point their data ingestion pipelines towards sources they deem authoritative and trustworthy. So a website that meticulously structures its content, ensures factual accuracy and clearly defines its entities will, in effect, transform itself from a simple marketing channel into a premium, machine-readable data source for its specific niche. By optimising for entities and accuracy right now, you're not just improving your visibility in current search engines, you're setting your brand up to become foundational in an AI-driven economy. The brands that are most consistently and authoritatively cited by these systems will become the default go-to places for answers.

So this brings us to what I want you to think about between now and the next episode. I want you to look at the most important pages on your website with your AI fact-checker hat on. It could be your homepage, your about-us page, or your key product or service pages. Just read through them and see if you can find one or two vague, subjective marketing claims. Your task is to take those claims and rewrite them into something more verifiable and data-backed. So instead of saying "we're a leading provider", try saying something like "we've served over 5,000 customers in our sector since 2015". Or instead of saying "our software's incredibly fast", say something much more verifiable, like "our software returns a search query in under 200 milliseconds". This exercise will start training you to think about the language you use in terms of verifiable facts rather than vague, subjective claims. And that's the kind of language that AIs, and customers, are going to start trusting.
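If you also want those verifiable facts to be machine-readable, not just human-readable, structured data is one way to do it (the show notes mention structured data as part of being more "retrievable"). Here is a small, hedged sketch in Python that builds a schema.org JSON-LD block; the product name, figures and URL are hypothetical examples, not real data, and your own markup would carry your real, verifiable numbers.

```python
# A sketch of turning a vague claim into entity-rich, machine-readable
# structured data (schema.org JSON-LD). Name, figures and URL below are
# hypothetical placeholders.
import json

product_markup = {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "name": "ExampleSearch",  # hypothetical product name
    "applicationCategory": "SearchApplication",
    "url": "https://www.example.com/examplesearch",
    "softwareVersion": "4.1",
    "datePublished": "2024-06-01",
    # Instead of "incredibly fast", state a verifiable, specific figure.
    "description": (
        "ExampleSearch returns a search query in under 200 milliseconds "
        "and has served over 5,000 customers since 2015."
    ),
}

# The output would be embedded in the page inside a
# <script type="application/ld+json"> tag.
print(json.dumps(product_markup, indent=2))
```

The design point is simply that every value in the markup is a concrete, dated, checkable fact, the same discipline the rewriting exercise above is training you in.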
So that's the end of the theory part of entities. Next, we need to start getting practical, so in the next episode we'll start with part one of our four-part action plan, which will be a step-by-step guide on how to audit your own entity landscape. That's it for this time.

So until next time, keep optimising, stay curious, and remember: SEO is not that hard when you understand the basics. Thanks for listening, it means a lot to me. This is where I get to remind you where you can connect with me and my SEO tools and services. You can find links to everything I mentioned here in the show notes. Just remember, with all the places where I use my name, the Edd is spelled with two Ds. You can follow me on LinkedIn and Bluesky, just search for Edd Dawson on both. You can record a voice question to get answered on the podcast; the link is in the show notes. You can try my SEO intelligence platform Keywords People Use at keywordspeopleuse.com, where we can help you discover the questions and keywords people are asking online. Finally, connect your Google Search Console account for your sites so we can crawl and understand your actual content, find what keywords you rank for, and then help you optimise and continually refine your content. You can see my personal site at edddawson.com. Bye for now and see you in the next episode of SEO Is Not That Hard.
