NEGATIVE LOGITS
ſelf            -0.91
parsedMessage   -0.88
nahilalakip     -0.79
saurus          -0.77
fillType        -0.77
amental         -0.75
ddelweddau      -0.75
ſelves          -0.71
featureID       -0.70
houſe           -0.70
POSITIVE LOGITS
about           0.49
of              0.42
About           0.40
ABOUT           0.40
benefit         0.34
des             0.34
Were            0.33
vē              0.33
against         0.33
wind            0.33
Activations Density 0.002%