5-1 Let's learn about ControlNet (Stable Diffusion Practical Guide Table of Contents)

Contents

  • Chapter 5 Let's use ControlNet
  • Chapter 6 Let's create and use LoRA
    • 6-1 Let's learn what we can do with additional learning
    • 6-2 Let's create an image using LoRA
    • 6-3 Create your own dedicated painting style LoRA
    • 6-4 Let's create various types of LoRA
    • 6-5 Let's evaluate the learning content
  • Chapter 7 Let's use the image generation AI more
    >>>What is ControlNet?

    ControlNet is a derivative extension of Stable Diffusion, introduced in the February 2023 paper "Adding Conditional Control to Text-to-Image Diffusion Models". With it, you can supply an image or a pose in addition to the prompt to generate complex compositions that cannot be specified by the prompt alone, poses that are difficult to describe in text, and images that reproduce the subject of an original photo.

    This technology was created by lllyasviel (Lvmin Zhang), the main author of the paper, and as of March 2024 lllyasviel has released the details of the technique and the models on GitHub and Hugging Face.

    GitHub - lllyasviel/ControlNet-v1-1-nightly

    https://github.com/lllyasviel/ControlNet-v1-1-nightly

    Additionally, Mikubill has created an extension that makes ControlNet usable in AUTOMATIC1111 and released it as open source.

    GitHub - Mikubill/sd-webui-controlnet

    https://github.com/Mikubill/sd-webui-controlnet

    ControlNet is a neural network architecture that adds spatial conditioning control to Stable Diffusion's diffusion model. It works with several types of 'preprocessors', such as openpose, which extracts a pose from an image, and canny, which extracts outlines; the extracted information is then used as a conditioning control for image generation with txt2img. By choosing the preprocessor suited to your purpose, you can control compositions and poses that were difficult to handle with plain txt2img and create images just as you intended.
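
    The guide itself works in the AUTOMATIC1111 WebUI, but the preprocessor-then-txt2img flow can also be sketched in Python with Hugging Face's diffusers library. The sketch below is only a minimal illustration under assumptions, not code from the guide: the file names, prompt, and Canny thresholds are placeholders, and it assumes the lllyasviel/sd-controlnet-canny weights with a Stable Diffusion 1.5 base model.

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Step 1: the preprocessor -- extract outlines with canny edge detection
image = load_image("input.png")               # placeholder file name
edges = cv2.Canny(np.array(image), 100, 200)  # example thresholds
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

# Step 2: attach the canny ControlNet weights to a Stable Diffusion pipeline
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Step 3: the edge map conditions txt2img generation alongside the prompt
result = pipe(
    "1girl, standing, best quality",          # placeholder prompt
    image=canny_image,
    num_inference_steps=20,
).images[0]
result.save("output.png")
```

    Note that the edge extraction and the conditioned generation are two separate steps, which is exactly the preprocessor-then-txt2img flow described above.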

    >>>Know the difference between img2img and ControlNet

    If you think of generating a new image based on an image and a prompt, img2img may come to mind, but img2img and ControlNet are completely different technologies.

    While img2img picks up the features of the entire input image to generate a new one, ControlNet first analyzes the input image with a preprocessor (for outlines, pose, depth, etc.) and extracts only the features that that preprocessor is responsible for, then uses them as the condition for generation. For example, you can reproduce only the pose from the input image, as shown below.

    Original image (image source: Pexels)

    Pose of the character extracted with the openpose preprocessor
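
    To make the contrast with img2img concrete, the sketch below runs only the preprocessor step, assuming the controlnet_aux annotator package from the diffusers ecosystem; the file names are placeholders. Everything except the pose skeleton is discarded, which is why the generated image can inherit the pose without inheriting the photo's overall appearance.

```python
from controlnet_aux import OpenposeDetector
from diffusers.utils import load_image

# Load the openpose annotator weights published under lllyasviel/Annotators
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")

# The preprocessor keeps only the pose skeleton; colors, clothing,
# and background in the photo are all thrown away
image = load_image("photo.png")   # placeholder file name
pose_image = openpose(image)
pose_image.save("pose.png")
```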
