INDEX
Explanations
contrasting or unexpected information within a context
Neuron Alignment
Index   Value   % of L₁
605     +0.12   0.4%
1052    +0.11   0.4%
544     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
442     +0.12      0.03
1052    +0.11      0.03
900     +0.10      0.03
Negative Logits
arken        -0.50
bcryptjs     -0.48
Ken          -0.48
imageshack   -0.46
Ev           -0.46
AK           -0.46
Ping         -0.46
kapat        -0.46
TextAlign    -0.45
KF           -0.45
Positive Logits
intersper    0.98
bandung      0.91
kuli         0.85
Minang       0.84
encomp       0.84
withal       0.83
jaya         0.82
karna        0.81
gaily        0.80
Banten       0.79
Activations Density: 0.080%