INDEX
Explanations
Biblical references and religious terms
Neuron Alignment
Index   Value   % of L₁
50      +0.18   0.8%
1978    +0.08   0.4%
208     +0.07   0.3%
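
The "% of L₁" column is not defined on this page. A plausible reading, treated here as an assumption, is that each entry's absolute value is divided by the L₁ norm (sum of absolute values) of the whole vector. The listed rows are roughly consistent with that reading within rounding: each value divided by its percentage gives an L₁ norm of about 22. The sketch below uses a hypothetical helper name, `l1_share`, and is not taken from the tool itself.

```python
def l1_share(vector, top_k=3):
    """Return (index, value, share of L1 norm) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means abs(value) / sum(abs(all values)).
    """
    l1 = sum(abs(v) for v in vector)
    if l1 == 0:
        return []
    # Rank indices by absolute value, largest first.
    ranked = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    return [(i, vector[i], abs(vector[i]) / l1) for i in ranked[:top_k]]
```

Multiplying the returned share by 100 gives a percentage comparable to the column above.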
Correlated Neurons
Index   P. Corr.   Cos Sim.
1978    +0.18      0.04
274     +0.08      0.03
356     +0.07      0.03
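
The two columns above are abbreviated and not explained on the page. Assuming "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two weight vectors, both can be computed as in this minimal sketch. The function names are illustrative, not part of the tool.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed meaning of 'P. Corr.')."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed meaning of 'Cos Sim.')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Both functions raise `ZeroDivisionError` on constant or all-zero inputs; a production version would need to handle that case explicitly.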
Negative Logits
<bos>           -2.19
ⓧ               -1.23
/**             -0.96
<?              -0.92
[blank]         -0.90
/*              -0.67
reunite         -0.65
stabilize       -0.64
Transcripción   -0.64
revive          -0.61
Positive Logits
lele     1.57
magis    1.36
meis     1.34
ananas   1.30
saar     1.27
maksi    1.25
kafe     1.24
seksi    1.24
wien     1.24
uhr      1.24
Activations Density 0.093%
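
Activation density is commonly read as the fraction of sampled tokens on which the feature fires at all. Under that assumption, 0.093% corresponds to roughly one token in every 1,075. A minimal sketch, with a hypothetical function name and a threshold of zero chosen for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: 'density' counts any activation above zero as firing.
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)
```

Multiply the result by 100 to compare it with the percentage shown above.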