INDEX
Explanations
mentions of social media or online actions related to public figures
Neuron Alignment
Index   Value   % of L₁
752     +0.17   0.5%
1828    +0.11   0.3%
31      +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1828    +0.17      0.07
31      +0.11      0.05
2011    +0.09      0.05
Negative Logits
belec          -0.71
commodation    -0.64
(":");         -0.63
Puis           -0.61
cokinetic      -0.61
AndEndTag      -0.61
Même           -0.59
=="")          -0.58
==""){         -0.58
ViewFeatures   -0.58
Positive Logits
délib     1.13
vété      1.11
récon     1.08
récomp    1.04
appui     1.04
mémor     1.04
habile    1.03
dévou     1.03
clô       1.01
dénon     1.00
Activations Density: 0.225%