• 1 Post
  • 10 Comments
Joined 1 year ago
Cake day: September 3rd, 2023



  • Img2img is one of many ways to constrain the AI's efforts to your compositional desires, and it's rad. You can control the amount of “dreaming” the AI does on the base image (the denoising strength setting) to get subtle changes, or a radically different image built from the elements of the previous one (sometimes with trippy cool results, often with horrendous mutations if the desired image is supposed to be humanoid xD).

    Inpainting is another tool; it's like a precise img2img on an area you mask. Hands are often the most garbled thing from the AI, so a brute force technique is to inpaint the hands and regenerate them - but the process works a lot better if you help the AI out and manually fix the hands first. So I'll throw the image into Photoshop, make a list (if I remember :P) of everything I need to fix, address each item directly and then toss it back into Automatic 1111. The shading and overall style are often hard for me to get right by hand, so I'll inpaint over my edits to get the style and shading back.
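For anyone who'd rather script this than click through the UI, here's a rough sketch of the same idea against the Automatic 1111 web API. It assumes the webui was launched with `--api` and is listening on the default local address; the helper names (`encode_image`, `img2img_payload`, `send`) are made up for illustration, and the specific values (0.4 strength, mask blur of 4) are placeholders, not recommendations. Field names can shift between webui versions, so check the `/docs` page on your own install.

```python
import base64
import json
import urllib.request

# Assumption: default local address of a webui started with --api.
API_URL = "http://127.0.0.1:7860"


def encode_image(raw_bytes: bytes) -> str:
    """The API expects images as base64 strings."""
    return base64.b64encode(raw_bytes).decode("ascii")


def img2img_payload(image_bytes, prompt, denoising_strength=0.4, mask_bytes=None):
    """Build a request body for img2img, or for inpainting if a mask is given.

    denoising_strength is the "dreaming" knob: near 0 keeps the input
    almost unchanged, near 1 mostly ignores it.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    payload = {
        "init_images": [encode_image(image_bytes)],
        "prompt": prompt,
        "denoising_strength": denoising_strength,
    }
    if mask_bytes is not None:
        # White areas of the mask are regenerated, the rest is kept.
        payload["mask"] = encode_image(mask_bytes)
        payload["mask_blur"] = 4        # soften the mask edge (placeholder value)
        payload["inpainting_fill"] = 1  # 1 = start from the original pixels
    return payload


def send(payload, base_url=API_URL):
    """POST the payload and return the generated images as base64 strings."""
    request = urllib.request.Request(
        base_url + "/sdapi/v1/img2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["images"]
```

Starting the masked area from the original pixels (`inpainting_fill` of 1) matches the "fix it by hand, then inpaint over the edits" approach, since the hand-painted fix is what the AI restyles.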


  • Thank you! Essentially I'll come in with a visual idea and either some sketches I already have or one I draw with the AI in mind (keeping the lines simple so it doesn't get confused). I generate a batch of images with img2img and cherry-pick the ones that fit closest to the idea or are surprising and wonderful. Rework those for anatomical errors or other things I want to fix or omit -> send it back through img2img if it needs it or to inject detail -> upscale and put it as my desktop/phone wallpaper :P

    (I’m using Automatic 1111 which is a webui for Stable Diffusion btw)


  • I replied to a previous comment about the “assistance” part, which is sorta an abridged version of my workflow (“workflow” is also a term used in ComfyUI, where it means a visual layout that processes the image sequentially through modules). It's super fun, I highly recommend it! Feel free to PM me anytime, I'd be glad to help!

    Really it was looking up terms and areas of Automatic 1111 I was unsure of and finding various sites and guides. Civitai has LOTS of guides, often written by model makers or people with lots of hours in the field - it's also my main resource for LoRAs and models. The most helpful guides were the ones on settings and workflows for actual image generation (I can definitely find some links for you there), which get you quality results without too much “and if I change this, what happens?” But honestly I love poking around like that, so I still spend hours tweaking just to see what happens xD


  • For sure! Often I'll come in with a visual idea already, or I'll iterate on a few with the AI giving inspiration. If I have the idea strongly I'll sketch out the composition and the elements I know I want - sometimes for real tricky poses like fingers I'll take a photo of myself doing them. I throw that into Stable Diffusion with img2img to turn my sketch/photograph into something more fully featured, or into something I hadn't thought of but really like (you can also set how “dreamy” the AI should be, i.e. how much it should vary from the input material).

    There’s a lot of detail I could get into but the “assistance” is fleshing out a composition -> I go in and correct anatomical mistakes or elements I want to change specifically -> run it through again if it needs it.
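To make the “dreamy” setting concrete, here's a small sketch of one way to explore it: run the same sketch through img2img at several denoising strengths with a fixed seed, then cherry-pick. This is an assumption about how you might script it, not a description of my exact process; the helper names and the 0.2 to 0.8 range are illustrative, and the dictionaries follow the Automatic 1111 API field names (`init_images`, `prompt`, `denoising_strength`, `seed`).

```python
def strength_sweep(low=0.2, high=0.8, count=4):
    """Evenly spaced denoising strengths from low to high, inclusive."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (high - low) / (count - 1)
    return [round(low + i * step, 2) for i in range(count)]


def sweep_requests(init_image_b64, prompt, strengths, seed=1234):
    """One img2img request body per strength.

    The seed is fixed so the strength is the only thing that changes
    between images, which makes the comparison easier to read.
    """
    return [
        {
            "init_images": [init_image_b64],
            "prompt": prompt,
            "denoising_strength": strength,
            "seed": seed,
        }
        for strength in strengths
    ]


if __name__ == "__main__":
    # Placeholder string; a real run needs the base64 of the sketch image.
    for body in sweep_requests("<base64 sketch>", "ink drawing of a fox", strength_sweep()):
        print(body["denoising_strength"], body["seed"])
```

Low strengths mostly clean up the sketch, while high ones treat it as a loose suggestion, so a sweep like this shows quickly where the composition starts to drift.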


  • I hear you. When this stuff was blowing up I couldn't shake that it was trained off artists' work that they didn't consent to having in the datasets. Sure, it's similar to how human artists work (for music and art, the prevailing recommendation for me, or any artist, was to consume material relevant to your art; for visual art they really just wanted you to constantly keep your head open for shapes and form), but it felt closer to plagiarism than inspiration. Some generations can be very close to an individual style (especially if the model was trained specifically off that artist), but I found that generations that omitted an artist's name ended up creating something compelling that wasn't tied to one artist specifically - still undoubtedly a conglomeration of the multitudes it was trained on (including photography). It's muddy water for sure, and the angle of AI replacing workers in general is still relevant - but I also think it empowers people like me who have the visual ideas but can use the help making them fully fleshed out.

    The crux, for me, feels like “when you can see whatever you want, what do you want to see?” A lot of our AI woes are reflections of questionable human behavior (racist chat models, AI for war, deepfakes and dishonesty).

    How do you feel about it?