Explanations
government and political terms or entities
punctuation marks, particularly periods
Negative Logits
utherland    -0.79
anyl         -0.79
withd        -0.79
outlet       -0.70
auldron      -0.66
itaire       -0.63
waterfall    -0.62
terday       -0.62
oscope       -0.62
drill        -0.61
Positive Logits
assetsadobe  0.77
Conclusion   0.76
[+           0.76
Mutant       0.69
COUR         0.69
Mk           0.69
AUT          0.67
wikipedia    0.66
Yon          0.66
Coy          0.64
Activations Density: 0.098%