INDEX
Explanations
phrases indicating recurring experiences or states of being
Negative Logits
Token     Logit
247       -0.15
ellar     -0.15
onium     -0.15
jure      -0.14
ule       -0.14
retire    -0.14
ią        -0.14
£         -0.14
ium       -0.13
amy       -0.13
Positive Logits
Token     Logit
been      0.18
<tag      0.14
forward   0.14
ask       0.14
Been      0.14
Zem       0.14
odie      0.13
обрет     0.13
holm      0.13
graz      0.13
Activations Density 0.025%