INDEX
Explanations
phrases related to various locations and people
Neuron Alignment
Index   Value   % of L₁
1047    +0.14   0.8%
50      +0.13   0.8%
1222    +0.12   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1222    +0.14      0.02
1047    +0.13      0.02
795     +0.12      0.02
Negative Logits
<bos>        -2.84
<?           -0.88
/**          -0.86
ⓧ            -0.80
             -0.78
<?           -0.72
حياته        -0.66
ensure       -0.66
protect      -0.65
дописавши    -0.62
Positive Logits
Van        1.31
Juf        1.30
Van        1.25
VAN        1.22
Minang     1.19
rafra      1.16
délib      1.11
ankara     1.10
lele       1.09
soulign    1.09
Activations Density: 0.072%