Explanations
phrases related to political issues and events
Neuron Alignment
Index    Value    % of L₁
453      +0.09    0.3%
2019     +0.09    0.2%
184      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
310      +0.09       0.03
16       +0.09       0.04
453      +0.08       0.03
Negative Logits
incess    -0.82
vitale    -0.77
brille    -0.77
ló        -0.76
poros     -0.74
fusil     -0.73
perche    -0.73
sappi     -0.72
tenda     -0.72
parati    -0.72
Positive Logits
impelled        0.55
relinquish      0.51
endow           0.51
gratify         0.50
quitted         0.49
negroes         0.48
appease         0.48
relinquished    0.48
lingered        0.48
perchance       0.48
Activations Density: 0.256%