Explanations
expressions of admiration or compliments regarding fashion and personal style
Negative Logits
Leather    -0.16
sons       -0.15
homo       -0.15
帽         -0.15
Gil        -0.15
Homo       -0.15
bald       -0.15
owo        -0.15
Wooden     -0.14
ager       -0.14
Positive Logits
skirts     0.28
skirt      0.26
dresses    0.25
hem        0.24
hem        0.24
fro        0.22
skirts     0.22
gown       0.21
chiff      0.21
Dresses    0.21
Activations Density: 0.172%