INDEX
Negative Logits
Candidate     -0.07
paranoia      -0.07
lw            -0.06
posite        -0.06
icts          -0.06
ak            -0.06
L             -0.06
Equality      -0.06
yz            -0.06
Pers          -0.06
POSITIVE LOGITS
_pieces        0.06
Dram           0.06
conservative   0.06
alyzer         0.06
.main          0.06
broadcasters   0.06
Kendrick       0.06
pray           0.06
}}"></         0.06
_dom           0.06
Activations Density 0.025%