INDEX
NEGATIVE LOGITS
be          -0.26
Be          -0.22
be          -0.20
будь        -0.18
(be         -0.18
guilty      -0.17
Be          -0.17
lec         -0.16
STILL       -0.16
بتوان       -0.15
POSITIVE LOGITS
originally   0.25
actic        0.23
previously   0.19
formerly     0.19
invent       0.18
initially    0.18
Originally   0.17
happen       0.17
fare         0.17
antha        0.16
Activations Density 0.066%