INDEX
Explanations
phrases that indicate a sense of inadequacy or lack
Negative Logits
uspend
-0.17
ανά
-0.15
iq
-0.14
堂
-0.14
SCRI
-0.14
alette
-0.14
ruž
-0.14
ког
-0.14
Trader
-0.14
корист
-0.13
Positive Logits
too
0.47
too
0.43
TOO
0.42
Too
0.42
Too
0.41
-too
0.35
太
0.35
слишком
0.34
_TOO
0.32
demasi
0.30
Activations Density 0.203%