INDEX
Explanations
words related to demonstrations or political activism
Neuron Alignment
Index   Value   % of L₁
1618    +0.13   0.5%
1865    +0.13   0.4%
1870    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1865    +0.13      0.03
1142    +0.13      0.03
1618    +0.12      0.03
Negative Logits
curi       -0.83
discogs    -0.82
reebok     -0.81
huma       -0.77
timately   -0.75
reluct     -0.75
parf       -0.72
volunte    -0.72
pessi      -0.72
noel       -0.71
Positive Logits
protest      1.21
protests     1.10
protesting   1.06
protesters   1.06
protest      1.03
Protest      0.99
protested    0.98
Protest      0.95
protestors   0.78
protester    0.71
Activations Density: 0.053%