INDEX
Explanations
words related to living together in a community or organization
Neuron Alignment
Index   Value   % of L₁
1741    +0.10   0.3%
1045    +0.09   0.3%
1806    +0.08   0.2%
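The "% of L₁" column is not defined on this page. One plausible reading, assumed here rather than confirmed by the source, is that it reports each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole weight vector. A minimal sketch of that calculation, using a hypothetical helper `l1_share` and a toy vector rather than this feature's real weights:

```python
def l1_share(weights):
    """Return each component's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means 100 * |w_i| / sum_j |w_j|.
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        # An all-zero vector has no meaningful shares; report zeros.
        return [0.0 for _ in weights]
    return [100.0 * abs(w) / total for w in weights]


# Toy example (hypothetical values, not taken from the table above).
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
```

Under this reading, small shares such as 0.2% to 0.3% for the top-ranked neurons would suggest that the weight is spread across many neurons rather than concentrated in a few.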
Correlated Neurons
Index   P. Corr.   Cos Sim.
1806    +0.10      0.05
1265    +0.09      0.04
1045    +0.08      0.03
Negative Logits
lele    -1.37
umo     -1.30
mef     -1.29
Juf     -1.29
fei     -1.29
aben    -1.28
fua     -1.27
meis    -1.26
hej     -1.26
loto    -1.26
Positive Logits
<bos>       1.10
astéro      0.70
eat         0.68
Dział       0.66
Bardzo      0.65
Pře         0.64
noStroke    0.62
learn       0.62
interact    0.62
Když        0.61
Activations Density 0.306%