INDEX
NEGATIVE LOGITS
bero        -0.06
armor       -0.06
Trademark   -0.06
Ст          -0.06
uples       -0.05
includes    -0.05
(pointer    -0.05
_secure     -0.05
Pastor      -0.05
aturdays    -0.05

POSITIVE LOGITS
repl        0.07
pla         0.07
inexp       0.07
共和        0.07
Compar      0.07
leaning     0.07
guide       0.07
شی          0.07
дя          0.06
loosen      0.06

Activations Density 0.001%