INDEX
Explanations
mentions of strikes and protests
Negative Logits
Token        Logit
UILDER       -0.17
Podesta      -0.16
леч          -0.15
iture        -0.15
рап          -0.15
itures       -0.15
Trap         -0.14
RAP          -0.14
FileAccess   -0.14
-arrow       -0.13
Positive Logits
Token        Logit
strike       0.79
Strike       0.67
strikes      0.66
strike       0.59
Strike       0.58
striking     0.54
Strikes      0.51
stri         0.50
_strike      0.50
struck       0.48
Activations Density 0.091%
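
The density figure above can be read as the share of sampled tokens on which the feature fires. The following is a minimal sketch under that assumption: it treats density as the percentage of tokens whose activation is strictly above a threshold (zero by default). The function name `activation_density` and the sample data are hypothetical and do not come from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens on which the
    feature is active, expressed as a percentage.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 tokens, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.3] * 9
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.091% would mean the feature is active on roughly 9 of every 10,000 tokens, which fits a feature tied to a narrow family of tokens such as the "strike" variants listed above.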