Explanations
locations or nationalities
Neuron Alignment
Index   Value   % of L₁
50      +0.17   1.1%
605     +0.10   0.6%
1741    +0.09   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1252    +0.17      0.09
143     +0.10      0.08
970     +0.09      0.07
Negative Logits
<bos>            -3.55
ⓧ                -1.24
/**              -1.08
                 -1.01
/*               -0.98
<?               -0.98
ValueGenerated   -0.93
uxxxx            -0.91
HasIndex         -0.91
#                -0.89
Positive Logits
Juf      2.56
increa   2.46
fta      2.35
Augu     2.32
ftu      2.31
affor    2.30
inev     2.29
aen      2.29
reluct   2.26
perfet   2.22
Activations Density: 0.403%