Explanations
information related to studies, observations, and analysis carried out by scholars, analysts, economists, regulators, and officials
Neuron Alignment
Index   Value   % of L₁
658     +0.10   0.3%
198     +0.09   0.3%
1129    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
658     +0.10      0.08
143     +0.09      0.07
1379    +0.09      0.06
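The "P. Corr." and "Cos Sim." columns presumably stand for Pearson correlation and cosine similarity; the page does not say which vectors are being compared, so the inputs below are placeholders. A minimal sketch of the two standard measures, using only the Python standard library, shows how they differ: Pearson correlation subtracts each vector's mean first, while cosine similarity does not.

```python
import math

def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Hypothetical vectors (not taken from the dashboard): y is x shifted by a
# constant, so the two are perfectly correlated but not perfectly aligned.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]
print(pearson(x, y), cosine(x, y))
```

Because the two measures differ only in the centering step, they can give different values for the same pair of vectors, which is consistent with the table reporting them in separate columns.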
Negative Logits
évé       -0.85
isabel    -0.79
silikon   -0.78
déli      -0.77
écout     -0.76
incroy    -0.74
eiffel    -0.74
lorenzo   -0.70
ecru      -0.68
jacobs    -0.68
Positive Logits
themselves       0.65
themselves       0.65
Kaip             0.64
bufio            0.62
<<<<<<<<<<<<<<   0.62
siempre          0.59
Datuak           0.59
astéro           0.56
Læs              0.55
Життєпис         0.55
Activations Density 0.536%