INDEX
Explanations
references to parallel processes or systems
Neuron Alignment
Index    Value    % of L₁
545      +0.14    0.7%
1407     +0.14    0.7%
120      +0.13    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1272     +0.14       0.02
1137     +0.14       0.03
1052     +0.13       0.02
Negative Logits
<bos>         -1.68
Гу            -0.68
rungsseite    -0.64
受            -0.64
cookie        -0.63
խ             -0.59
අ             -0.59
pub           -0.58
ઊ             -0.58
تع            -0.58
Positive Logits
suspic     1.63
affor      1.56
desir      1.51
emphat     1.50
madonna    1.49
fta        1.48
perfet     1.48
ftu        1.47
foon       1.46
Juf        1.45
Activations Density: 0.322%