Explanations
references to political structures and processes
Neuron Alignment
Index   Value   % of L₁
604     +0.18   0.6%
872     +0.17   0.6%
1741    +0.15   0.5%
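The dashboard does not say how the "% of L₁" column is derived. One plausible reading, stated here as an assumption, is that each value is the entry's absolute weight divided by the L₁ norm (the sum of absolute values) of the whole weight vector. A minimal sketch of that reading, using a hypothetical helper `l1_shares` and a toy vector rather than this feature's real weights:

```python
def l1_shares(weights):
    """Return each entry's absolute value as a percentage of the vector's L1 norm.

    Assumption: this mirrors how a "% of L1" column could be computed;
    the dashboard itself does not document the formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        # An all-zero vector has no meaningful shares.
        return [0.0 for _ in weights]
    return [100.0 * abs(w) / l1_norm for w in weights]


# Hypothetical toy vector (not taken from the dashboard).
toy_weights = [0.5, -0.25, 0.25]
toy_shares = l1_shares(toy_weights)
```

Under this reading, the small percentages in the table would mean that no single neuron dominates the feature's weight vector.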
Correlated Neurons
Index   P. Corr.   Cos Sim.
872     +0.18      0.06
1288    +0.17      0.05
921     +0.15      0.02
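"P. Corr." presumably abbreviates Pearson correlation and "Cos Sim." cosine similarity; which vectors the dashboard compares (for example activations versus weights) is not stated, so the inputs below are assumptions. The sketch shows only the two standard formulas, with hypothetical function names and toy data:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Undefined for a zero vector; return 0.0 by convention in this sketch.
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy data (not taken from the dashboard).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 2.0, 1.0]
toy_pearson = pearson_correlation(xs, ys)
toy_cosine = cosine_similarity(xs, ys)
```

The two measures can differ substantially, because Pearson correlation removes each vector's mean before comparing and cosine similarity does not.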
Negative Logits
sovere     -1.02
inev       -1.01
guarante   -0.98
uniqu      -0.97
reluct     -0.95
hcm        -0.93
disagre    -0.92
maneu      -0.91
perfon     -0.91
coö        -0.91
Positive Logits
CROSSTALK          0.63
arată              0.53
APPLAUSE           0.52
argument           0.51
Comentários        0.51
település          0.51
DataPropertyName   0.50
wikipagina         0.49
calciatore         0.49
Πηγή               0.49
Activations Density 0.279%