Negative Logits
s         -0.89
asmen     -0.71
cheng     -0.67
Fisher    -0.66
ph        -0.66
sns       -0.63
I         -0.63
Abp       -0.62
sq        -0.61
ません    -0.61
Positive Logits
debate    1.41
Debate    1.34
Debate    1.34
debate    1.20
Debates   1.13
debates   1.11
DEB       1.11
itſelf    1.02
myſelf    1.02
debated   1.00
Activations Density: 0.002%