INDEX
Explanations
phrases related to political commentary
Neuron Alignment
Index   Value   % of L₁
752     +0.13   0.4%
190     +0.09   0.3%
50      +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
16      +0.13      0.05
752     +0.09      0.04
1207    +0.09      0.03
Negative Logits
<bos>     -0.69
Giugno    -0.60
rensa     -0.51
Luglio    -0.51
vostri    -0.51
mặt       -0.48
Conclu    -0.47
>\<       -0.46
ekyll     -0.46
Làm       -0.46
Positive Logits
fua              0.55
aen              0.55
hematical        0.54
lele             0.53
exportaciones    0.51
haviour          0.50
mme              0.48
tirage           0.48
vne              0.48
cnbc             0.48
Activations Density: 0.223%