AI is getting smarter… but is it smart enough t...
Okay, so there's this crazy new AI app, Gemini Live, and it can just talk to you in real time while looking at your screen or your video feed, and give you input on what you're doing on your computer. I think it's really wild, and it worked great for me yesterday. So let's check it out.

I'm going to share my entire second screen over here, just so I can do work over here. Gemini, I know that you're listening to me right now. I'm going to launch ComfyUI (Stable Diffusion), and I would love you to take a look at my command window and let me know if you see any errors or inconsistencies pop up, and also help me with my ComfyUI workflow.

Sure, I can take a look at your command window and your ComfyUI workflow. I see a lot of error messages in the command window regarding module imports and other things. It seems like the program is having trouble loading certain parts of the code. Is there anything specific you would like me to take a look at?

Yeah, that's really interesting, because Comfy has been working fine. But I know that sometimes there are errors in the startup window, so I'd love to hear from you what I can do to resolve these issues.

Okay, it sounds like you're seeing some errors during the startup of ComfyUI, even though it normally works fine. Looking at the command window, I can see a couple of recurring error messages. Many of them relate to missing modules such as pytorch_lightning, PyYAML, segment_anything, and PyOpenGL. This suggests that ComfyUI is not set up with all of the proper dependencies. Perhaps you could try installing those missing modules or dependencies and see if that resolves the issues.

Great. I'm not an expert at coding by any means, so could you give me instructions on how to solve these issues?

Sure, I can help with that. Since you're not an expert in coding, let's go through this step by step.
First, it looks like the error messages you're getting in the command window mention several missing Python modules. These are libraries that ComfyUI needs to function properly, and you'll need to install them using pip. The first module is pytorch_lightning. To install this, open up a command prompt or terminal and type the following command: pip install pytorch-lightning. Once that's installed, you can do the same thing for the other missing modules. In the same command prompt, type pip install pyyaml and press Enter, then pip install segment-anything and press Enter, followed by pip install PyOpenGL and Enter. Let me know if you need any more help with that process.

Great, thanks so much. I will definitely do those later, but since for now my ComfyUI is working fine, I have this very simple Flux workflow, and I would love your input on whether there are any changes I could make that would make it better.

Okay, I understand. You're going to install those missing modules later, but for now you want feedback on your current workflow. Looking at it, I can see that it's relatively simple, using common nodes such as Load Diffusion Model, CLIP Text Encode, VAE Decode, Random Number, and KSampler. One thing that I noticed is that the Random Number node is set to generate only one number, which could mean that every generation will be the same, since the seed will be the same. If you want different images on each generation, you can increment that number or randomize it. However, if you want the same result across multiple runs, then this setup could work well. Do you have any specific goal in mind with this workflow, or is there anything else I could look at?

Well, I was wondering if there are any nodes that I could add to this workflow that might give me more interesting or more polished results from Flux.

That's a great question.
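As an aside, Gemini's dictated install steps can be checked programmatically. Here is a minimal Python sketch that tests which of the flagged modules are importable and prints a matching pip command for any that are missing. The module-to-package mapping is an assumption based on common ComfyUI dependencies; verify the names against the actual error messages in your startup log:

```python
import importlib.util

# Import name -> pip package name (assumed mapping; double-check against
# the module names shown in your ComfyUI command window).
MODULES = {
    "pytorch_lightning": "pytorch-lightning",
    "yaml": "pyyaml",              # PyYAML is imported as "yaml"
    "segment_anything": "segment-anything",
    "OpenGL": "PyOpenGL",          # PyOpenGL is imported as "OpenGL"
}

def missing_install_commands(modules=MODULES):
    """Return 'pip install ...' commands for modules that are not importable."""
    cmds = []
    for import_name, pip_name in modules.items():
        if importlib.util.find_spec(import_name) is None:
            cmds.append(f"pip install {pip_name}")
    return cmds

if __name__ == "__main__":
    for cmd in missing_install_commands():
        print(cmd)
```

Running this inside the same Python environment that ComfyUI uses matters: installing into a different environment won't fix the startup errors.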
There are a few nodes that could definitely add more interesting and polished results to your workflow. One very common node that many people use is the VAE Encode node, which you would place right after your Load VAE node. This node essentially allows you to use the VAE that you loaded, which can greatly improve image quality, especially when combined with a high-quality diffusion model. It might also be useful to add an Image Scale node before the Preview Image node, so you can control the size of the images as well as upscale them. Another node you might want to consider is a ControlNet node. This would allow you to guide the generation using another image as reference; you could input a pose or a depth map to have more creative control over the final result. Do any of those sound interesting?

Yeah, that all sounds great. Thank you so much, Gemini.

All right, so I stopped my stream with Gemini there, and I'm going to close down Comfy as well so I can talk about this for a second. It didn't give me exactly the information I was looking for, but it did clearly show that it has some knowledge of how ComfyUI works, of Flux, and of different nodes. For this specific use case, I think it's really interesting. There will clearly be a lot of trial and error, but if I were using this for almost anything else that it has solid knowledge of, that would be information I could just talk back and forth with Gemini about. I don't know, that seems pretty interesting and pretty exciting to me. The other thing I thought was really good is that it did catch all the errors my command window was showing, and there are some dependency issues in there that I obviously need to fix. So yeah, this tool is super, super good. I'm stoked about it.
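One detail from Gemini's feedback worth a quick illustration is the seed point: a fixed seed really does reproduce identical generations. This is not ComfyUI code, just a generic Python sketch of why seeding works that way:

```python
import random

def generate(seed, n=3):
    """Stand-in for an image generation: a seeded pseudo-random draw."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# A fixed seed reproduces the exact same "generation" every run...
assert generate(42) == generate(42)

# ...while incrementing or randomizing the seed gives a different result.
assert generate(42) != generate(43)
```

The same principle applies to the Random Number node feeding the KSampler: keep the value fixed to reproduce a result you like, or randomize it per run to explore.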
For now, I think you can only use up to 10 minutes per session, but you can just start a new session after that and keep going. Like any other LLM or AI program you might use, it has a specific context window, and I think that after 10 minutes we're probably exhausting that context window for Gemini. I think it's really cool, and everybody should give it a try, test it out, and see what ideas you can come up with for using it. This is part of Google AI Studio: just go to aistudio.google.com/live and choose Stream Realtime. That's the option that lets you either share your screen or share your video and converse with Gemini this way, which is pretty awesome. I hope you liked this little overview. Let me know if you have any questions, or anything you'd like me to test out with Gemini or any other AI apps I haven't reviewed yet.