Explanations
mentions of public gatherings or events, particularly demonstrations
mentions of demonstrations or protests
Negative Logits
Token               Logit
saline              -0.76
paste               -0.70
laus                -0.67
chance              -0.67
otto                -0.67
(token not shown)   -0.67
oho                 -0.65
pop                 -0.63
olog                -0.62
nan                 -0.61
Positive Logits
Token               Logit
demonstration       1.03
demonstrations      0.97
GOODMAN             0.91
demonstrators       0.86
emonium             0.81
arily               0.78
antes               0.78
anooga              0.75
stration            0.72
Demon               0.72
Activations Density: 0.014%