Friday, March 7, 2025

5-1 Let's learn about ControlNet (Stable Diffusion Practical Guide Table of Contents)

>>>What is ControlNet?

ControlNet is a derivative extension of Stable Diffusion, introduced in the February 2023 paper <Adding Conditional Control to Text-to-Image Diffusion Models>. With it, you can feed an image or pose into the pipeline in addition to the prompt, making it possible to generate complex compositions that cannot be specified by the prompt alone, poses that are difficult to describe in text, and images that reproduce the character in an original photo.

The creator of this technology is lllyasviel (Lvmin Zhang), the main author of the paper, and as of March 2024 lllyasviel has released the details of the technology and the models on GitHub and Hugging Face.

GitHub - lllyasviel / ControlNet-v1-1-nightly

https://github.com/lllyasviel/ControlNet-v1-1-nightly

Additionally, Mikubill has created an extension to allow ControlNet to be used in AUTOMATIC1111 and released it as open source.

GitHub - Mikubill / sd-webui-controlnet

https://github.com/Mikubill/sd-webui-controlnet

ControlNet is an artificial neural network technique that adds spatial conditioning control to Stable Diffusion's diffusion model. It provides several kinds of 'preprocessors', such as openpose, which extracts poses from an image, and canny, which extracts outlines, and the extracted information is then used as the conditioning for txt2img image generation. By choosing the preprocessor that matches your purpose, you can control composition and pose, which are hard to manage with plain txt2img, and create the image you intended.
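This guide works through the AUTOMATIC1111 web UI extension, which hides all of this behind a graphical interface, but the overall flow is easy to see in code. Below is a minimal sketch using Hugging Face's diffusers library, not the web UI: the canny preprocessor extracts outlines from an input image, and that edge map is passed to a txt2img pipeline as the ControlNet condition. The file names, prompt, and thresholds are placeholder values for illustration only.

import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# 1) Preprocess: extract only the outlines from the input image (canny preprocessor)
source = np.array(Image.open("input.png").convert("RGB"))   # placeholder file name
edges = cv2.Canny(source, 100, 200)                         # example thresholds
control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

# 2) Attach a canny-trained ControlNet to a Stable Diffusion 1.5 txt2img pipeline
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# 3) Generate: the prompt describes the content, the edge map controls the composition
result = pipe(
    "1girl, standing in a park, best quality",
    image=control_image,
    num_inference_steps=25,
).images[0]
result.save("output.png")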

>>>Know the difference between img2img and ControlNet

Generating a new image from an image plus a prompt may sound like img2img, but img2img and ControlNet are completely different technologies.

While img2img reads the features of the entire input image to generate a new one, ControlNet first analyzes the input image with a preprocessor (for outlines, pose, depth, and so on) and uses only the specific features that each preprocessor is responsible for when generating the image. For example, you can reproduce only the pose from the input image, as shown below.

Original image (image source: Pexels)

Pose of the character extracted with the openpose preprocessor
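The pose map above is what the openpose preprocessor produces. For reference only (the web UI extension runs the preprocessor for you), a rough sketch of the same extraction step with the controlnet_aux package might look like this; the input file name is a placeholder.

from PIL import Image
from controlnet_aux import OpenposeDetector

# Load the openpose preprocessor (annotator weights published by lllyasviel)
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")

original = Image.open("pexels_photo.png")   # placeholder file name
pose_map = openpose(original)               # keeps only the stick-figure pose, drops everything else
pose_map.save("pose.png")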
