INDEX
Explanations
long, specific names and terms, potentially including surnames and locations
Neuron Alignment
Index    Value    % of L₁
168      +0.16    0.6%
1872     +0.15    0.6%
1482     +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
981      +0.16       0.06
1872     +0.15       0.04
168      +0.13       0.04
Negative Logits
epsfig       -0.55
lüğ          -0.55
vsak         -0.52
leyeb        -0.52
ceğim        -0.52
oseb         -0.52
lenmiş       -0.52
ceğini       -0.51
Septembre    -0.51
ceğ          -0.49
Positive Logits
kele       0.89
seksi      0.88
sula       0.88
lele       0.85
antik      0.85
ille       0.84
maksi      0.84
kosme      0.84
silikon    0.84
kollek     0.84
Activations Density 0.298%