Stable diffusion with LoRA!

What are LoRA models and how to use them in AUTOMATIC1111 - Stable Diffusion Art

LoRA models are small Stable Diffusion models that apply tiny changes to standard checkpoint models. They are usually 10 to 100 times smaller than checkpoint

stable-diffusion-art.com

My translation of the post above, with notes on what I understood.

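A LoRA model is a small Stable Diffusion model that applies small changes to a standard checkpoint.

Because it is roughly 10 to 100 times smaller than the original checkpoint, it is easy for people to use.

(Opinion: how would anyone fine-tune a large model directly? As with ControlNet, the approach seems to be to freeze the large model's weights and fine-tune a small side module attached to it.)

This tutorial covers what LoRA is, how to use it, and some LoRA demos.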

LoRA (Low-Rank Adaptation) is a technique for fine-tuning Stable Diffusion.

We have seen techniques of this kind before in Dreambooth and textual inversion. LoRA differs from them in file size and training power.

Dreambooth is powerful but produces large model files (2 to 7 GB). Textual inversion files are only about 100 KB, but you can't do as much with them (I take this to mean their usefulness is limited).

LoRA files are 2 to 200 MB, and the training power is decent (that is, reasonably good).

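LoRA applies small changes to the model checkpoint file. (LoRA modifies styles by applying small changes to the accompanying model file.)

LoRA modifies the cross-attention layers of SD (the most important module in SD). This is the part where the image and the prompt meet. The authors of the LoRA paper found that tuning this part alone is enough to get good results.

The weights of a cross-attention layer are arranged in matrices. A matrix is just numbers laid out in rows and columns (like an Excel spreadsheet!).

The key point of LoRA (see the figure caption below) is that it drastically reduces the number of parameters by representing a high-dimensional matrix with low-dimensional ones.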

LoRA is breaking a matrix into two smaller (low-rank) matrices. It can store a lot fewer numbers by doing this.
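The caption can be written out as a short formula. This is a sketch in the notation commonly used for low-rank adaptation; the symbols $W$, $\Delta W$, $B$, $A$, $d$, $k$, and $r$ are my own labels and do not appear in the post. The large checkpoint weight stays frozen, and only the two small factors are trained and stored.

```latex
% Assumed notation (not from the post):
%   W  : frozen weight matrix from the checkpoint, d rows by k columns
%   B,A: the two small low-rank matrices stored in the LoRA file
%   r  : the rank, much smaller than d and k
\[
  W' = W + \Delta W, \qquad \Delta W = B A, \qquad
  B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
\]
% Numbers to store: d*k for a full matrix versus r*(d + k) for the two factors.
\[
  \underbrace{d\,k}_{\text{full matrix}} \quad \text{vs.} \quad \underbrace{r\,(d + k)}_{\text{low-rank factors}}
\]
```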

Let’s say the model has a matrix with 1,000 rows and 2,000 columns. That’s 2,000,000 numbers (1,000 x 2,000) to store in the model file. LoRA breaks down the matrix into a 1,000-by-2 matrix and a 2-by-2,000 matrix. That’s only 6,000 numbers (1,000 x 2 + 2 x 2,000), 333 times less. That’s why LoRA files are a lot smaller.
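The arithmetic above can be checked with a few lines of NumPy. This is only a minimal sketch: the function name `lora_param_counts` and the variables `W`, `B`, and `A` are hypothetical names chosen for illustration, the matrices are random stand-ins for real weights, and the step that adds the low-rank product to a frozen base matrix is my assumption about how the small matrices are applied, not something spelled out in the post.

```python
import numpy as np


def lora_param_counts(rows, cols, rank):
    """Return (numbers in a full matrix, numbers in the two low-rank factors)."""
    full = rows * cols
    low_rank = rows * rank + rank * cols
    return full, low_rank


# The example from the text: a 1,000 x 2,000 matrix and rank 2.
rows, cols, rank = 1000, 2000, 2
full, low_rank = lora_param_counts(rows, cols, rank)
print(f"full matrix: {full:,} numbers")
print(f"low-rank factors: {low_rank:,} numbers")
print(f"ratio: about {full // low_rank}x fewer")

# Random stand-ins for a frozen checkpoint weight and the two small matrices.
rng = np.random.default_rng(0)
W = rng.standard_normal((rows, cols))         # frozen base weight (hypothetical)
B = rng.standard_normal((rows, rank)) * 0.01  # 1,000-by-2 factor
A = rng.standard_normal((rank, cols)) * 0.01  # 2-by-2,000 factor

# The product of the two small matrices has the full shape but rank at most 2.
delta = B @ A
W_adapted = W + delta  # assumed way of applying the update to the base weight
print("delta shape:", delta.shape)
```

Only `B` and `A` would need to be saved, which is why the file stays small even though the update covers the whole matrix.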
