INDEX
Explanations
proper nouns, such as names of individuals and organizations
Neuron Alignment
Index    Value    % of L₁
394      +0.16    0.5%
50       +0.13    0.4%
227      +0.12    0.4%
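The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it is each neuron's absolute value divided by the L₁ norm (the sum of absolute values) of the whole vector. The sketch below illustrates that reading on an invented toy vector; the function name `top_l1_shares` and the input data are hypothetical and are not taken from the dashboard.

```python
def top_l1_shares(weights, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumption: "share of L1" means |w_i| / sum_j |w_j|.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank positions by absolute value, largest first.
    order = sorted(range(len(weights)),
                   key=lambda i: abs(weights[i]),
                   reverse=True)[:k]
    return [(i, weights[i], abs(weights[i]) / l1) for i in order]


# Toy vector invented for illustration (not data from this page).
toy = [0.5, -0.25, 0.125, 0.125]
for idx, value, share in top_l1_shares(toy, k=2):
    print(f"{idx:<8} {value:+.2f}    {share:.1%}")
```

Under this reading, a value of +0.16 at 0.5% would imply a total L₁ norm of roughly 32, which is also consistent with the other two rows (0.13 and 0.12 both round to 0.4%).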
Correlated Neurons
Index    P. Corr.    Cos Sim.
227      +0.16       0.18
1097     +0.13       0.13
964      +0.12       0.10
Negative Logits
mourut             -0.91
eorum              -0.88
ejus               -0.83
trouva             -0.80
RectangleBorder    -0.80
PLWABN             -0.78
vinil              -0.77
LIRE               -0.75
travaillons        -0.75
tantum             -0.75
Positive Logits
has                0.80
attemp             0.78
exemplifies        0.77
is                 0.75
began              0.73
ftw                0.73
definately         0.72
succinctly         0.69
embodies           0.69
occupies           0.69
Activations Density: 2.105%