INDEX
Explanations
terms related to protests
references to protests
Negative Logits
illac       -0.73
lasses      -0.72
efficients  -0.71
nown        -0.71
estone      -0.67
scribe      -0.64
Wonders     -0.64
theless     -0.63
oiler       -0.63
ewater      -0.61
Positive Logits
ations      1.01
against     0.92
aires       0.85
encamp      0.85
ant         0.82
ors         0.80
ational     0.77
marches     0.76
rally       0.76
march       0.75
Activations Density 0.043%
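
The density figure is not defined on this page. A common reading, assumed here, is the percentage of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 token positions, 4 of them active.
toy_activations = [0.0] * 9996 + [1.3, 0.2, 2.7, 0.9]
density = activation_density(toy_activations)
print(f"Activations Density {density:.3f}%")
```

Under this reading, a density of 0.043% would mean the feature fires on roughly 4 in every 10,000 tokens, which fits a feature tied to a narrow topic such as protests.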