Explanations
proper nouns related to individuals and organizations
Neuron Alignment
Index   Value   % of L₁
50      +0.22   0.7%
394     +0.16   0.5%
1177    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1887    +0.22      0.10
50      +0.16      0.11
394     +0.11      0.09
Negative Logits
postres           -0.68
postre            -0.63
Wikimédia         -0.62
LLocation         -0.61
maging            -0.59
cuk               -0.58
Kích              -0.58
RectangleBorder   -0.58
centavos          -0.57
bambu             -0.56
Positive Logits
maneu      1.84
shenan     1.79
impra      1.70
reluct     1.62
depic      1.60
affor      1.59
unspeak    1.57
increa     1.56
indestru   1.55
hentai     1.53
Activations Density 2.072%