Explanations
proper nouns and locations
Neuron Alignment
Index    Value    % of L₁
1842     +0.14    0.5%
1343     +0.10    0.3%
1150     +0.09    0.3%
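The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is each neuron's absolute weight as a share of the L1 norm of the whole weight vector. A minimal Python sketch under that assumption follows; the `weights` vector is a hypothetical toy example, not data from this dashboard.

```python
def l1_share(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm."""
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1_norm for w in weights]


# Hypothetical toy vector, NOT taken from the dashboard.
weights = [0.5, -0.25, 0.25]
shares = l1_share(weights)

# Print index, signed value, and share in the same column order as the table.
for index, (w, s) in enumerate(zip(weights, shares)):
    print(f"{index}\t{w:+.2f}\t{s:.1%}")
```

Under this reading, small percentages like those in the table would mean that no single neuron dominates the vector.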
Correlated Neurons
Index    P. Corr.    Cos Sim.
1343     +0.14       0.18
1385     +0.10       0.14
565      +0.09       0.12
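Assuming "P. Corr." is the Pearson correlation and "Cos Sim." is cosine similarity (the page does not spell out either abbreviation, nor which vectors are compared), the sketch below shows how the two measures relate. Pearson correlation is cosine similarity applied after subtracting each vector's mean, which is why the two columns can differ. The input lists are hypothetical toy data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy data, NOT taken from the dashboard.
feature_acts = [0.0, 1.0, 2.0, 3.0]
neuron_acts = [1.0, 2.0, 3.0, 4.0]

pearson = pearson_correlation(feature_acts, neuron_acts)
cosine = cosine_similarity(feature_acts, neuron_acts)
print(f"Pearson: {pearson:+.2f}  Cosine: {cosine:.2f}")
```

In this toy case the second list is the first shifted by a constant, so the Pearson correlation is at its maximum while the cosine similarity is slightly lower.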
Negative Logits
encomp       -4.14
intersper    -4.11
depic        -3.90
shenan       -3.90
reluct       -3.76
increa       -3.74
guarante     -3.58
purcha       -3.58
disagre      -3.49
apprehen     -3.49
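Positive Logits
boxShadow            0.99
CURLOPT              0.99
:                    0.97
Dimensiones          0.95
(token text missing) 0.95
(token text missing) 0.93
(token text missing) 0.93
ChatColor            0.93
(token text missing) 0.93
(token text missing) 0.92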
Activations Density 2.705%