NEGATIVE LOGITS
WARD        -0.73
href        -0.72
plur        -0.71
atives      -0.70
ebin        -0.69
Gamergate   -0.69
NER         -0.69
ership      -0.69
kus         -0.69
Filename    -0.69
POSITIVE LOGITS
pudding     1.03
cake        0.96
anut        0.93
flavored    0.92
coated      0.89
chip        0.89
chocolate   0.87
syrup       0.87
flav        0.86
butter      0.85
Activations Density 0.016%