Explanations
expressions of excess or extreme quantities
Negative Logits
Token        Logit
sko          -0.18
ullo         -0.17
ninger       -0.17
erotische    -0.16
gest         -0.15
:host        -0.14
nackte       -0.14
Dün          -0.14
inge         -0.14
erland       -0.14
Positive Logits
Token        Logit
terribly     0.20
anymore      0.19
diss         0.19
flash        0.18
/not         0.18
thrilled     0.17
different    0.17
EITHER       0.16
either       0.16
nor          0.16
Activations Density 0.063%