INDEX
Explanations
Mentions of political affiliations and actions, particularly those involving gun rights and lawmakers
Negative Logits
μί          -0.18
(æ°´        -0.17
ustin       -0.15
usta        -0.15
ostÃŃ       -0.14
LARI        -0.14
TRGL        -0.14
SEQUENTIAL  -0.14
rze         -0.14
-bars       -0.14
Positive Logits
opposite    0.18
personally  0.16
øy          0.15
pom         0.15
himself     0.14
okoj        0.14
addTarget   0.14
close       0.14
gent        0.14
ast         0.14
Activations Density 0.036%