NEGATIVE LOGITS
Token        Logit
jot          -0.08
aromatic     -0.08
singly       -0.08
붙           -0.08
Albums       -0.08
cactus       -0.08
Resort       -0.08
'.'          -0.08
Hotel        -0.08
percussion   -0.07
POSITIVE LOGITS
Token        Logit
biases       0.20
Bias         0.18
_bias        0.18
fairness     0.18
bias         0.17
biased       0.17
公平         0.16
Bias         0.16
bias         0.16
biais        0.16
Activations Density: 0.025%