Negative Logits
chess        -0.07
Dup          -0.06
Tr           -0.06
invaders     -0.06
upe          -0.06
Kay          -0.06
pud          -0.06
velvet       -0.06
ژن           -0.06
abortions    -0.06
Positive Logits
career       0.08
pošk         0.07
.what        0.07
.goods       0.07
_four        0.06
trajectory   0.06
renched      0.06
}}>↵         0.06
(lang        0.06
üssen        0.06
Activations Density 0.019%