Explanations
phrases indicating absence or lack
Negative Logits
antro    -0.16
alm      -0.16
ctype    -0.15
jon      -0.15
åħ       -0.15
yk       -0.14
vais     -0.14
nowhere  -0.14
vetica   -0.14
lement   -0.13
Positive Logits
743      0.17
puls     0.16
tul      0.16
proper   0.16
626      0.15
Ïĥή      0.15
spath    0.15
rahim    0.14
æ¼       0.14
(defvar  0.14
Activations Density: 0.018%