Giving metacognition to LLMs

7:39 · Jun 08, 2025
@cjtrowbridge
1162 words
I think I figured out how to give LLMs metacognition: the ability to think about how they are choosing to think, the ability to choose how to think. This is something that is currently missing entirely, to the great detriment of efforts to build agents, or any kind of autonomous system, with LLMs. And I think I figured it out.

Okay, so think about this. Everyone has seen the game where you type a few words and then keep hitting the middle button on your phone's keyboard suggestions until it makes up a sentence, a reply. At its core, that is all an LLM is. Beyond that there's tooling, and the tooling is actually where all the magic happens. One piece of that tooling is a parameter called temperature. A large language model, given the input you feed it, outputs a list of everything that could possibly come next. I know it's more complicated than that, but just go with me, okay? Maybe a comma could come next, maybe half of a word, whatever: for every possibility, there's a probability that it should be next. That's all the model gives you. Then the sampler, guided by temperature, randomly picks one of the top suggestions, and that becomes the output.

When you play the middle-button game, there are a lot of moments where you think, well, actually, if I hit this other one, it would be more interesting, right? Theoretically, that is what temperature is doing. The problem is that the choice is totally arbitrary, and the model has no option but to build on what it has already said. So everything a model says becomes a post hoc justification of the words before it, and that creates a bias that can drift toward some claim that doesn't make any sense. But it has to keep going, and then you can get back an answer that is just crazy. If you then ask it, is that a good answer?
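The sampling step described above can be sketched in a few lines. This is a toy, not any real model's API: the token names and scores are made up, and real logits would come from a model's forward pass.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Pick the next token from raw model scores.

    Low temperature sharpens the distribution toward the top choice
    (near-greedy); high temperature flattens it, so the random pick
    ranges much wider over the candidate list.
    """
    scaled = [score / temperature for score in logits.values()]
    peak = max(scaled)
    # Unnormalized softmax weights; random.choices normalizes for us.
    weights = [math.exp(s - peak) for s in scaled]
    tokens = list(logits.keys())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up next-token scores for some prompt; real ones come from the model.
logits = {"mat": 4.0, "sofa": 3.0, "roof": 2.0, "moon": 0.5}

cold_pick = sample_with_temperature(logits, temperature=0.01)  # near-greedy
hot_pick = sample_with_temperature(logits, temperature=5.0)    # far more random
```

At temperature near zero the sampler is effectively deterministic, which is also why the same prompt plus the same random seed reproduces the same answer.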
And it's like, no, and then it will tell you why, and the second answer will actually be a better one. So I've been testing LLMs for a long time; I was working with GPT-3 before ChatGPT came out. I've been thinking about this problem for years. That's why I wrote my graduate thesis in AI on machine cognition: what's wrong with the way we think about it, and how it could be better. I think I figured it out.

When you write a prompt to a model and get a sentence back, there is an enormous space of responses you could have gotten, something like k^n possibilities for a response of n tokens with k plausible choices at each step. But if you ask it the same thing twice with the same random seed, it will give you the same answer. This opens up a world of possibilities for mapping the potential responses. For example, say there are ten candidate tokens for the first position, each with its own probability: maybe one is 40%, another 20%, and so on. Take the ones above some cutoff, say above the median, and create a chain for each of them, then branch again on the candidates for the next token, and the next. You very quickly get a large number of possible responses. Even if you just took the top three at each step, the resource requirements would be higher, but manageable. Top three, then top three for each of those, and top three for each of those, very quickly gives you a list of the first four or five words of every likely response. Then make a list of those candidates and have the model rank how good each response is. You will very quickly see that some of the responses are crazy. That craziness factor is the thing that's missing from the metacognition machines are currently capable of.
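The top-three branching idea might look something like this sketch. The `toy_next` model and its tiny vocabulary are invented for illustration; a real implementation would query an LLM's next-token distribution at each step instead.

```python
def expand_candidates(prefix, next_probs, k=3, depth=4):
    """Enumerate candidate continuations by branching on the top-k
    next tokens at each step (a simple breadth-first beam)."""
    frontier = [(prefix, 1.0)]
    for _ in range(depth):
        new_frontier = []
        for seq, prob in frontier:
            dist = next_probs(seq)
            if not dist:  # nothing follows; carry the sequence forward
                new_frontier.append((seq, prob))
                continue
            top = sorted(dist.items(), key=lambda kv: -kv[1])[:k]
            for token, p in top:
                new_frontier.append((seq + [token], prob * p))
        frontier = new_frontier
    # Best (most probable) candidate chains first.
    return sorted(frontier, key=lambda sp: -sp[1])

# Toy "model": next-token distributions keyed on the last token only.
TOY = {
    "the": {"cat": 0.5, "dog": 0.3, "idea": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"barked": 0.7, "slept": 0.3},
    "idea": {"failed": 0.9, "worked": 0.1},
}

def toy_next(seq):
    return TOY.get(seq[-1], {})

candidates = expand_candidates(["the"], toy_next, k=3, depth=2)
```

With three branches per step, the frontier grows by at most 3x per token, which is why a few steps of lookahead stays affordable even though the full k^n space does not.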
Think about the way that you think about things. Human cognition is a complex set of deep structures in the brain throwing things out into the default mode network. There's this background noise going on: I want a sandwich. How many trees are there? All of these things just popping up, and some of them are crazy. If you have PTSD and intrusive thoughts: what would it feel like to jump in front of the traffic? When you hear a thought like that, you recognize it. That's an intrusive thought. Discard. AI doesn't have that. That's what's missing: the ability to notice what a thought is and say, we don't want that one, we're going with this one. That's metacognition, the process of thinking about thinking, and it's what's currently missing from large language models.

This approach of making a list of the top possible responses very quickly gives you insight not only into what the model could have said and what a good response actually looks like, but also into the diversity of possible good responses, which is the ideal. If you ask a question about a complex social issue, there is more than one good, correct answer; the perspective worth taking depends on the context. I'm not even going to give an example, because that would become what the comments are about. Think of any political or social question: there's probably more than one answer that different groups would say is correct. The model should have those in mind when it chooses the one it's going to use to respond. Developing that list using this method is, I think, an excellent way to give it metacognition about the range of possible good responses, and then about which one is most appropriate for this context, this situation. This is what's missing from LLMs today, and it's what prevents them from being a good tool for agents. I think this is a good approach.
I'm going to be building a demo of this as part of the long-form, big-picture AI guide I'm making a series about on The Other Place, The Red One. Check it out and let me know what you think, and whether you have other thoughts about this. I think this is one of the best ideas I've ever had, and I've been working on it for four or five years. I'm curious what people think.
