INDEX
Explanations
references to protests and demonstrations
Negative Logits
Token            Logit
ChildScrollView  -0.86
Lik              -0.74
Crud             -0.68
Darius           -0.67
disposition      -0.65
Jad              -0.62
vician           -0.62
likeness         -0.61
Lik              -0.61
lap              -0.61
POSITIVE LOGITS
Token            Logit
Protest          1.25
protest          1.18
protest          1.16
protests         1.12
Protest          1.09
protested        1.05
protes           1.04
protesting       0.92
protester        0.91
protesters       0.88
Activations Density: 0.004%