TIKTOK

The key to making AI useful is RAG aka retrieval augmented generation #ai #rag

2:50 Jun 08, 2025 291,800 16,800
@askcatgpt
542 words
If you wanna know about AI, you have to know about RAG. LLMs are very powerful prediction engines. They take your prompt and then generate an answer based on a plausible string of follow-on text. They're predicting what is most likely to come next. As humans, we perceive this most-likely continuation as an answer to the question or the prompt. The way they do this is by being trained on every book that's ever been written and the entire corpus of the internet. LLMs are extremely good at predicting what words follow other words, because they've consumed almost every word that's ever been written, in the exact order it was written.

However, they have their quirks. One of those quirks is a cutoff date. GPT-4o, for example, hasn't read anything written after October 2023. And even if it had, it doesn't have access to private information, only things that are out in public, like the internet and public libraries. So if you ever wanted to ask ChatGPT something like, what is my vacation policy at work? It doesn't have access to your employee handbook. It doesn't know. And if you ask it something specific about you, it certainly doesn't know the answer, unless you tell it beforehand. But that takes a lot of effort, and then you'd have to paste the document in every single time you want it to be referenced. I mean, you can do it, but it's just kind of annoying.

So RAG is super simple as a concept. It takes the LLM and combines it with a database of other information. This database can be specific information about you. It can be all of the journal entries you've ever written. It could be all of your text messages. It could be all of the emails for a company. It could be your employee handbook. It could be all your financial data, whatever. It's just more information that the LLM has never previously seen, and therefore has not been trained on.
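The "predicting what words follow other words" idea can be illustrated with a toy sketch. This is not how an LLM actually works internally; it's a simple bigram counter over a made-up ten-word "corpus" that predicts the most frequent follower of a word, just to make the prediction idea concrete:

```python
# Toy next-word predictor: count which word follows each word in a tiny
# "training corpus", then predict the most frequent follower. Real LLMs
# use billions of parameters, but the core task is the same: given the
# text so far, predict what is most likely to come next.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# followers["the"] ends up as Counter({"cat": 2, "mat": 1, "fish": 1})
followers: defaultdict[str, Counter] = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word that most often followed `word` in the corpus."""
    return followers[word].most_common(1)[0][0]

predict_next("the")  # → "cat"
```

The LLM's "training" in the video is, very loosely, this counting step scaled up to the whole internet.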
Essentially, when you provide a text prompt, it doesn't get sent directly to the LLM. Instead, it first gets run as a search against your database, whatever you provided as additional context. Anything potentially relevant to your prompt gets pulled, and that retrieved information gets appended to the end of the prompt you provided. So now you've got one big text string: your initial prompt, plus all the additional information that got pulled out of the database, smooshed all together, and that whole thing gets sent to the LLM. And because the LLM has that additional context to go off of, it can generate a response back to you that makes sense and isn't made up. I know this might sound like crazy science fiction, or something that would be really difficult to implement, but it's actually pretty easy. And if you're interested, I can make another video showing you how to create a RAG pipeline for something as easy as your own email inbox.
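The search-then-append flow described above can be sketched in a few lines. This is a minimal illustration, not a production setup: the retrieval step here is naive keyword overlap (real systems typically use embeddings and a vector database), the handbook snippets are invented, and the final LLM call is left out since it's just "send the combined string to a model API":

```python
# Minimal RAG sketch: search a small "database" of documents for the
# ones most relevant to the prompt, then append them to the prompt
# before it would be sent to the LLM.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query (naive)."""
    query_words = set(query.lower().split())
    return sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Append the retrieved context to the user's original prompt."""
    context_block = "\n".join(context)
    return f"Context:\n{context_block}\n\nQuestion: {query}"

# A stand-in "employee handbook" database (invented example content).
handbook = [
    "Vacation policy: employees accrue 15 days of paid vacation per year.",
    "Expense policy: meals under $50 do not require a receipt.",
    "Remote work: employees may work remotely up to three days per week.",
]

query = "What is my vacation policy?"
prompt = build_prompt(query, retrieve(query, handbook))
# `prompt` now holds the relevant handbook lines plus the question,
# ready to be sent to an LLM, which answers from that context
# instead of making something up.
```

The key design point from the video is visible here: the LLM is never retrained on your documents; the relevant snippets are just looked up and stuffed into the prompt at question time.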
