INDEX
NEGATIVE LOGITS
religious      -0.07
inauguration   -0.07
شح             -0.07
notas          -0.07
anyone         -0.07
ନ              -0.07
fertility      -0.07
songs          -0.07
               -0.07
тық            -0.07
POSITIVE LOGITS
privilegi   0.09
ting        0.08
schlä       0.08
kän         0.08
Taco        0.08
킹          0.07
ieden       0.07
            0.07
porcel      0.07
bleed       0.07
Activations Density: 0.003%