ChatGPT leaving watermarks in the text that it ...
ChatGPT has started leaving watermarks in the text it writes, and they survive whether you paste it into Word, Google Docs, or basically anywhere. So let's look at how to find them, how to remove them, and why they're even there.

These new ChatGPT models, specifically o3 and o4-mini, are using almost invisible types of space character. Not the kind you get when you hit the spacebar on your keyboard, but narrow no-break spaces and zero-width spaces. They often seem to appear between numbers, values, percentages, things like that, so they show up a lot if you're writing articles or essays.

To reveal them, you just need to paste your text into this website; I've popped the link in the comments. This is just some random text from some research I've been doing. I pasted it in, and you can see these characters are absolutely all over it. And again, it's just those two types of space: you can see two different codes showing up. One is the narrow no-break space (U+202F), and the other is the zero-width space (U+200B).

To remove them, I take my text, again copied from ChatGPT, and paste it into a code editor. I'm going to use Sublime Text because it's free and really easy to use; it's actually very handy to have in general. I paste the text in, and again I can see those Unicode characters dotted about the place. Let's get rid of the U+202F one first: select it, copy it, go to Find, paste it in, hit Find All, and it instantly selects them all. Hit Backspace and they're all gone. The same goes for the other type of space: find them all, delete them, all gone.

Now, OpenAI have responded to this. They said it's not an intentional watermark, just a quirk of large-scale reinforcement learning, which is pretty plausible to me. I don't see why OpenAI would be incentivized to have people called out for using ChatGPT; quite the opposite, in fact. So I don't think they are intentionally watermarking.
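If you'd rather not do the find-and-delete dance in an editor, the same cleanup can be scripted. Here's a minimal Python sketch (names like `find_hidden` and `strip_hidden` are my own, not from the video) that flags and removes the two characters in question, U+202F (narrow no-break space) and U+200B (zero-width space):

```python
# Detect and strip the two hidden space characters ChatGPT output
# has been seen to contain: U+202F (narrow no-break space) and
# U+200B (zero-width space). Helper names are illustrative.

HIDDEN = {
    "\u202f": "NARROW NO-BREAK SPACE",
    "\u200b": "ZERO WIDTH SPACE",
}

def find_hidden(text: str) -> list[tuple[int, str]]:
    """Return (index, character name) for each hidden space found."""
    return [(i, HIDDEN[ch]) for i, ch in enumerate(text) if ch in HIDDEN]

def strip_hidden(text: str) -> str:
    """Delete both hidden space types, mirroring the editor workflow."""
    for ch in HIDDEN:
        text = text.replace(ch, "")
    return text

sample = "Growth was 12\u202f% in Q3\u200b, up from 9\u202f%."
print(find_hidden(sample))   # three hidden characters found
print(strip_hidden(sample))  # "Growth was 12% in Q3, up from 9%."
```

Note that simply deleting U+202F matches what's done in Sublime Text here, but since it usually sits where a space belongs (e.g. between a number and a unit), you might prefer `text.replace("\u202f", " ")` to keep the text readable.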
I think it's quite plausible that the LLM has just learned that as a way of dividing up tokens, i.e. the words and character chunks it uses as building blocks of language, to make more logical sense of them, perhaps. Who knows? Or maybe somewhere in its training data it just got stuck on that as a good idea, kind of like it did with em dashes, you know, the long hyphens that people now get called out for using anywhere in their text. So anyway, now you know how to find them: go and check your boss's emails and LinkedIn posts.