INDEX
Explanations
locations or events related to political protests
references to locations or sites related to political activities
Negative Logits
states        -0.87
Nationwide    -0.74
raft          -0.69
erm           -0.66
model         -0.66
Users         -0.63
orem          -0.63
TBD           -0.62
state         -0.61
Survivor      -0.61
Positive Logits
rir           1.16
ascus         1.00
ée            0.88
unda          0.84
rall          0.84
unta          0.82
adish         0.82
antes         0.80
ashtra        0.79
andom         0.78
Activations Density: 0.011%