INDEX
Negative Logits
ehen          -0.16
eder          -0.16
egov          -0.16
eza           -0.16
edException   -0.16
spender       -0.15
ierz          -0.15
Ñıд           -0.15
å¼ķãģį        -0.15
ingles        -0.15
Positive Logits
ating         0.18
LETTE         0.18
ante          0.17
ancel         0.17
lette         0.17
viol          0.17
ayet          0.16
-viol         0.16
ated          0.16
-force        0.15
Activations Density: 0.009%