INDEX
Explanations
references to clothing and personal appearances
Negative Logits
меж      -0.16
fold     -0.14
plat     -0.14
allon    -0.14
ODB      -0.14
ادت      -0.14
_hint    -0.14
abox     -0.14
egin     -0.14
اص       -0.14
Positive Logits
wearing  0.61
wears    0.47
wear     0.46
wore     0.46
dressed  0.45
Wear     0.38
wear     0.36
clad     0.33
dress    0.29
worn     0.29
Activations Density: 0.204%