INDEX
Explanations
information related to technology, social issues, and international events
Neuron Alignment
Index   Value   % of L₁
1870    +0.18   0.5%
678     +0.11   0.3%
845     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1870    +0.18      0.05
227     +0.11      0.08
678     +0.10      0.07
Negative Logits
XmlEnum           -0.83
thinkable         -0.78
LayoutStyle       -0.76
sightly           -0.71
reportWebVitals   -0.70
nakalista         -0.68
relenting         -0.66
PhysRevD          -0.64
desertcart        -0.64
PackageManager    -0.64
Positive Logits
emphat     0.94
Simult     0.80
suspic     0.77
sergio     0.76
valencia   0.76
jorge      0.74
uninten    0.71
attemp     0.70
excru      0.69
increa     0.68
Activations Density: 0.667%