Explanations
mentions of specific countries and political events
punctuation marks, primarily periods
Negative Logits
Token           Logit
carbohyd        -0.80
transition      -0.74
integ           -0.73
complementary   -0.72
isot            -0.72
heavily         -0.70
interpreter     -0.70
extensively     -0.70
proport         -0.69
advoc           -0.69
Positive Logits
Token           Logit
Nope            1.39
Instead         1.31
Surely          1.30
Or              1.24
Maybe           1.22
Alas            1.21
Wouldn          1.20
Perhaps         1.16
Sure            1.15
Somehow         1.15
Activations Density 0.491%