INDEX
Explanations
names of family members, especially siblings
Neuron Alignment
Index   Value   % of L₁
50      +0.14   0.8%
32      +0.11   0.6%
501     +0.11   0.6%
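The columns above can be read as the largest components of the feature's direction vector, each reported with its share of the vector's L₁ norm. The following is a minimal sketch of that reading, not the dashboard's actual code: the function name `neuron_alignment`, the ranking by absolute value, and the toy vector are all assumptions made for illustration.

```python
def neuron_alignment(direction, k=3):
    """Return the k largest-magnitude components as
    (index, value, share of the L1 norm) tuples.

    Ranking by absolute value is an assumption; the source does not
    say how the dashboard orders its rows.
    """
    l1 = sum(abs(v) for v in direction)
    if l1 == 0:
        return []
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1) for i in ranked[:k]]


# Toy vector for illustration only, not data from the dashboard.
toy = [0.5, -0.25, 0.0, 0.25]
rows = neuron_alignment(toy, k=2)
for index, value, share in rows:
    print(f"{index:<6} {value:+.2f}  {share:.1%}")
```

Under this reading, a row such as `50  +0.14  0.8%` would mean that component 50 holds 0.14 of the direction's weight, which is 0.8% of the sum of absolute values over all components.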
Correlated Neurons
Index   P. Corr.   Cos Sim.
694     +0.14      0.02
825     +0.11      0.02
1548    +0.11      0.02
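"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The sketch below shows the two standard formulas side by side to make the difference concrete: Pearson correlation subtracts each vector's mean before comparing, cosine similarity does not. Which pairs of vectors the dashboard actually compares is not stated in the source, so the inputs here are hypothetical toy lists.

```python
import math


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their L2 norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Hypothetical toy data: y is x shifted and scaled.
x = [1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0]
p_corr = pearson_correlation(x, y)
cos_sim = cosine_similarity(x, y)
```

Because the two measures treat offsets differently, they can disagree on the same pair of vectors, so a correlation of +0.14 next to a cosine similarity of 0.02 is not contradictory.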
Negative Logits
<bos>          -2.77
ⓧ              -0.74
               -0.73
/*             -0.70
<?             -0.69
Kontrola       -0.67
oredCriteria   -0.65
intios         -0.63
<?             -0.63
maxcdn         -0.63
Positive Logits
Juf            1.46
affor          1.40
fortn          1.35
increa         1.33
maneu          1.32
unden          1.32
aen            1.29
sovere         1.28
Khart          1.28
stockholm      1.27
Activations Density 0.116%
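An activation density like the one above is usually read as the fraction of tokens on which the feature fires at all. The sketch below computes it that way. The function name `activation_density`, the firing threshold of zero, and the toy activations are assumptions made for illustration, since the source does not say how the figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    A threshold of zero is an assumption; the dashboard's own
    cutoff is not given in the source.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy activations for illustration: the feature fires on 1 of 8 tokens.
toy_activations = [0.0, 0.0, 0.0, 2.5, 0.0, 0.0, 0.0, 0.0]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On this reading, a density of 0.116% corresponds to the feature firing on roughly one token in 860.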