Explanations
verbs and actions that signify emotional expression or communication
Negative Logits
meiden     -0.17
çª         -0.16
aeper      -0.14
_nth       -0.14
icer       -0.14
osy        -0.14
ifax       -0.14
comprom    -0.14
enza       -0.14
erap       -0.13
Positive Logits
Those      0.16
those      0.16
those      0.16
Those      0.15
ήÏĤ       0.14
Wich       0.14
Hood       0.14
å¤īãĤı     0.14
íĻ©        0.14
éĤ£äºĽ     0.14
Activations Density: 0.009%