INDEX
Explanations
personal names and titles
Neuron Alignment
Index    Value    % of L₁
1577     +0.20    0.7%
50       +0.18    0.6%
1097     +0.15    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
981      +0.20       0.17
1097     +0.18       0.16
1177     +0.15       0.07
Negative Logits
RectangleBorder    -0.59
BorderSide         -0.57
marle              -0.53
película           -0.53
dominal            -0.53
Himo               -0.52
litro              -0.51
Dimensiones        -0.50
ogueira            -0.49
berlina            -0.49
Positive Logits
attemp    1.55
maneu     1.55
emphat    1.53
reluct    1.50
depic     1.49
strick    1.49
resear    1.49
shenan    1.49
alre      1.46
impra     1.46
Activations Density 1.488%