Explanations
references to lists or enumerations
Neuron Alignment
Index    Value    % of L₁
50       +0.27    1.6%
1961     +0.14    0.8%
812      +0.09    0.6%
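The "% of L₁" column appears to express each neuron's weight as a share of the L₁ norm of the full weight vector. The page does not state its formula, so this reading is an assumption. A minimal Python sketch under that assumption, using a hypothetical helper `l1_share` and toy values rather than the real feature weights:

```python
def l1_share(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm.

    Hypothetical helper: the dashboard does not document its formula,
    so this is only one plausible reading of the "% of L1" column.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        # Avoid division by zero for an all-zero vector.
        return [0.0 for _ in weights]
    return [abs(w) / l1_norm for w in weights]


# Toy vector, not the real feature weights.
shares = l1_share([3.0, -1.0, 0.0])
```

Under this reading, a value of +0.27 at 1.6% would imply an L₁ norm of roughly 17 for the full vector, which is broadly consistent with the other two rows after rounding.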
Correlated Neurons
Index    P. Corr.    Cos Sim.
1961     +0.27       0.03
1573     +0.14       0.03
1096     +0.09       0.03
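"P. Corr." presumably denotes the Pearson correlation between activations, and "Cos Sim." the cosine similarity between weight directions. Both readings are assumptions, since the page does not define the columns. The sketch below shows how the two measures are computed, using hypothetical helpers `pearson` and `cosine` on toy data:

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine(xs, ys):
    """Cosine similarity: dot product divided by the product of Euclidean norms."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    return dot / (norm_x * norm_y)


# Toy data, not taken from the dashboard.
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # perfectly linearly related
c = cosine([1.0, 0.0], [0.0, 1.0])             # orthogonal directions
```

Because the two measures compare different things (co-activation versus direction in weight space, under the assumption above), a neuron can show a noticeable correlation while its cosine similarity stays near zero, as in the table.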
Negative Logits
<bos>      -2.76
ⓧ          -1.02
/***       -0.94
<?         -0.87
(blank)    -0.86
//---      -0.74
/**        -0.70
<?         -0.66
///**      -0.66
/**        -0.66
Positive Logits
list       1.09
kani       1.06
Minang     1.05
jawa       1.05
affor      1.03
lists      1.01
jati       1.00
jaya       1.00
Lists      0.98
maneu      0.96
Activations Density 0.053%