Negative Logits
Token         Logit
itect         -0.74
ĺħ            -0.68
selves        -0.66
harbor        -0.64
warranties    -0.64
¿½            -0.63
Higgins       -0.59
etheless      -0.59
Carbuncle     -0.59
FISA          -0.59
Positive Logits
Token         Logit
esome         1.26
ppe           1.16
ppo           1.14
grim          1.02
enhagen       0.99
ppa           0.99
berman        0.95
nder          0.95
po            0.95
ber           0.94
Activations Density: 0.018%