INDEX
Explanations
phrases related to fashion choices and the swapping of clothing items
Negative Logits
rud      -0.17
ëijĺ     -0.16
both     -0.15
éϤäºĨ   -0.15
sice     -0.15
both     -0.15
anan     -0.14
akk      -0.14
anza     -0.14
pany     -0.14
Positive Logits
elsewhere  0.23
naopak     0.21
ones       0.20
equally    0.18
else       0.17
others     0.17
Ones       0.16
STANCE     0.16
ylko       0.16
же         0.16
Activations Density 0.302%