1) AttentionReplace
2) Refinement: needed when new words are added to the prompt
3) Reweight
=> The one I'm most curious about first is refinement!!
The two prompts have different token lengths, so how is that handled? (A rough sketch of the idea is below.)
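To my understanding, the official prompt-to-prompt code answers this by building a token-alignment mapper between the two prompts: cross-attention maps are injected only for tokens that exist in both prompts, while newly added tokens keep their own attention. The snippet below is only a minimal sketch of that idea, not the repo's actual implementation; the helper names build_token_mapper / refine_attention are made up here for illustration.

import torch

def build_token_mapper(src_tokens, tgt_tokens):
    """Hypothetical helper: for each position in the target prompt, find the position
    of the same token in the source prompt, or -1 if the token is newly added."""
    mapper = []
    for tok in tgt_tokens:
        mapper.append(src_tokens.index(tok) if tok in src_tokens else -1)
    return torch.tensor(mapper)

def refine_attention(attn_src, attn_tgt, mapper):
    """Copy the source cross-attention map for shared tokens; keep the target's own
    map for tokens that only exist in the edited prompt.
    attn_*: (batch * heads, pixels, tokens)."""
    out = attn_tgt.clone()
    for tgt_idx, src_idx in enumerate(mapper.tolist()):
        if src_idx >= 0:
            out[..., tgt_idx] = attn_src[..., src_idx]
    return out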
Attention Control Options
- cross_replace_steps: specifies the fraction of steps to edit the cross attention maps. Can also be set to a dictionary [str:float] which specifies fractions for different words in the prompt.
- self_replace_steps: specifies the fraction of steps to replace the self attention maps.
- local_blend (optional): LocalBlend object which is used to make local edits. LocalBlend is initialized with the words from each prompt that correspond with the region in the image we want to edit.
- equalizer: used for attention Re-weighting only. A vector of coefficients to multiply each cross-attention weight
fraction: ratio (the proportion of the total diffusion steps). A usage sketch of these options follows below.
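For reference, this is roughly how those options are passed when building a controller in the prompt-to-prompt notebooks. Treat it as a hedged sketch based on my reading of the repo: the AttentionReplace / AttentionReweight classes and the get_equalizer helper are defined in the repo's notebook (not imported from a package), and NUM_DIFFUSION_STEPS and the prompt pair are example values; exact signatures may differ between versions.

# Assumes AttentionReplace / AttentionReweight and get_equalizer from the
# prompt-to-prompt notebook are already in scope.
prompts = ["a photo of a cat riding a bicycle",
           "a photo of a dog riding a bicycle"]
NUM_DIFFUSION_STEPS = 50  # example value

# Replace edit: swap "cat" -> "dog", injecting the source cross-attention maps
# for the first 80% of steps and the self-attention maps for the first 40%.
controller = AttentionReplace(prompts, NUM_DIFFUSION_STEPS,
                              cross_replace_steps=0.8,
                              self_replace_steps=0.4)

# Re-weighting edit: amplify the cross-attention weight of "bicycle" by 2x
# via the equalizer coefficient vector.
equalizer = get_equalizer(prompts[1], word_select=("bicycle",), values=(2.0,))
controller = AttentionReweight(prompts, NUM_DIFFUSION_STEPS,
                               cross_replace_steps=0.8,
                               self_replace_steps=0.4,
                               equalizer=equalizer)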
Errors I hit while running the code (tested on Colab, 2023-06-20)
There seem to be some version issues with diffusers.
Change the diffusers version in requirements.txt:
diffusers==0.14.0
Then change the forward function in ptp_utils.py as follows.
# forward() is the closure created inside register_attention_control / ca_forward in
# ptp_utils.py, so `self` (the CrossAttention module), `to_out`, `controller` and
# `place_in_unet` are captured from the enclosing scope.
def forward(hidden_states, encoder_hidden_states=None, attention_mask=None, **cross_attention_kwargs):
    x = hidden_states
    context = encoder_hidden_states
    mask = attention_mask
    batch_size, sequence_length, dim = x.shape
    h = self.heads
    q = self.to_q(x)
    # cross-attention if a text context is given, otherwise self-attention
    is_cross = context is not None
    context = context if is_cross else x
    k = self.to_k(context)
    v = self.to_v(context)
    q = self.head_to_batch_dim(q)
    k = self.head_to_batch_dim(k)
    v = self.head_to_batch_dim(v)
    sim = torch.einsum("b i d, b j d -> b i j", q, k) * self.scale
    if mask is not None:
        mask = mask.reshape(batch_size, -1)
        max_neg_value = -torch.finfo(sim.dtype).max
        mask = mask[:, None, :].repeat(h, 1, 1)
        sim.masked_fill_(~mask, max_neg_value)
    # attention, what we cannot get enough of
    attn = sim.softmax(dim=-1)
    # hand the attention map to the prompt-to-prompt controller (store / replace / reweight it)
    attn = controller(attn, is_cross, place_in_unet)
    out = torch.einsum("b i j, b j d -> b i d", attn, v)
    out = self.batch_to_head_dim(out)
    return to_out(out)
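For context, this forward is not called directly: register_attention_control in ptp_utils.py walks the UNet and swaps every CrossAttention module's forward for the closure above. The sketch below shows the rough shape of that registration as I read the repo; names like ca_forward and register_recr follow the repo, but treat the details as approximate.

import torch

def register_attention_control(model, controller):
    def ca_forward(self, place_in_unet):
        to_out = self.to_out
        # in newer diffusers, to_out is a ModuleList [Linear, Dropout]; use the Linear
        if isinstance(to_out, torch.nn.ModuleList):
            to_out = self.to_out[0]

        def forward(hidden_states, encoder_hidden_states=None, attention_mask=None, **cross_attention_kwargs):
            ...  # the body shown above, which calls controller(attn, is_cross, place_in_unet)
        return forward

    def register_recr(net_, count, place_in_unet):
        # recursively replace forward() on every CrossAttention module
        if net_.__class__.__name__ == "CrossAttention":
            net_.forward = ca_forward(net_, place_in_unet)
            return count + 1
        if hasattr(net_, "children"):
            for child in net_.children():
                count = register_recr(child, count, place_in_unet)
        return count

    cross_att_count = 0
    for name, net in model.unet.named_children():
        if "down" in name:
            cross_att_count += register_recr(net, 0, "down")
        elif "up" in name:
            cross_att_count += register_recr(net, 0, "up")
        elif "mid" in name:
            cross_att_count += register_recr(net, 0, "mid")
    controller.num_att_layers = cross_att_count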
Reference